Autosync Cloud Storage #7038

Open
MayStepanyan opened this issue Feb 7, 2025 · 1 comment
Comments

@MayStepanyan

Is there a way to automatically synchronize source/target cloud storages (e.g. S3) with a Label Studio project? I didn't find any option in the settings, nor did I find any information in the existing issues.

@heidi-humansignal
Collaborator

Hello,

Thank you for contacting Label Studio.

Currently, Label Studio does not include a built-in option to automatically synchronize your source or target cloud storages (such as S3) with your project. Once you configure a storage connection, you need to initiate each sync yourself, either in the UI (by clicking the “Sync” button) or programmatically via the API.
For example, after you set up an S3 bucket as your source storage, Label Studio creates references to the objects in that bucket rather than importing them automatically. This design gives you full control over which data is added to your project. Similarly, for target storages (where annotations are pushed) a manual sync operation is required.
If you need an automated workflow, you could implement your own solution by scheduling periodic API calls to the relevant sync endpoint. This way, a background process (or cron job) would trigger the sync operation for you. For instance, you could use the following basic approach:

    import requests

    LABEL_STUDIO_URL = "https://your-label-studio-instance.com"
    API_TOKEN = "your_api_token"
    PROJECT_ID = "your_project_id"
    STORAGE_TYPE = "s3"  # for your S3 storage
    STORAGE_ID = "your_storage_id"

    # Note: the "export" path segment addresses a target (export) storage.
    sync_url = f"{LABEL_STUDIO_URL}/api/storages/export/{STORAGE_TYPE}/{STORAGE_ID}/sync/?project={PROJECT_ID}"

    # Trigger the sync with a token-authenticated POST request.
    response = requests.post(sync_url, headers={"Authorization": f"Token {API_TOKEN}"})

    if response.status_code == 200:
        print("Sync initiated successfully.")
    else:
        print("Sync failed:", response.text)

This method lets you mimic automatic synchronization by having your system call the sync endpoint on a schedule.
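
As a rough illustration of the scheduling part, here is a minimal sketch of a polling loop that triggers one or more sync URLs at a fixed interval. It is not an official Label Studio feature: the function names (`post_sync`, `sync_once`, `run_periodic_sync`) and the `max_runs` parameter are hypothetical, the sync URLs are assumed to have the same shape as in the snippet above, and the standard library's `urllib` is used here in place of `requests` only so that the sketch has no external dependencies.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def post_sync(url: str, token: str, timeout: float = 30.0) -> int:
    """Send an empty, token-authenticated POST to one sync URL; return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def sync_once(
    urls: Iterable[str],
    token: str,
    post: Callable[[str, str], int] = post_sync,
) -> Dict[str, Optional[int]]:
    """Trigger every sync URL once; map each URL to its status, or None on failure."""
    statuses: Dict[str, Optional[int]] = {}
    for url in urls:
        try:
            statuses[url] = post(url, token)
        except OSError as error:
            # urllib's URLError and HTTPError both derive from OSError, so network
            # problems and 4xx/5xx responses end up here without stopping the loop.
            print(f"Sync request to {url} failed: {error}")
            statuses[url] = None
    return statuses


def run_periodic_sync(
    urls: Iterable[str],
    token: str,
    interval_seconds: float,
    max_runs: Optional[int] = None,  # hypothetical knob: None means run until stopped
    post: Callable[[str, str], int] = post_sync,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Optional[int]]:
    """Call sync_once every interval_seconds; return the statuses of the last round."""
    url_list = list(urls)
    statuses: Dict[str, Optional[int]] = {}
    runs = 0
    while max_runs is None or runs < max_runs:
        statuses = sync_once(url_list, token, post)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return statuses


# Example call (placeholders as in the snippet above), syncing every 10 minutes:
# run_periodic_sync([sync_url], API_TOKEN, interval_seconds=600)
```

A long-running loop like this is only one option. If you prefer, the one-shot script can instead be invoked by cron or another scheduler, which avoids keeping a Python process alive between syncs.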
Let me know if you need further guidance or help setting up a scheduled sync process.

Comment by Oussama Assili
Workflow Run
