I am reaching out to inquire if there are any plans to enhance the external storage synchronization functionality for local storage. Currently, the API call initiates a one-time sync for the entire project. While this works for smaller projects, it becomes challenging for larger ones (e.g., those with more than 5,000 or so tasks), as the sync process often times out before completion.
Although I have implemented a workaround by extending the timeout period (#5890), this approach becomes increasingly impractical as the project grows. For instance, I have a project with over 60,000 tasks, and the synchronization process takes more than 20 minutes to complete.
A potential improvement could be the introduction of multi-threaded synchronization. Additionally, providing users with more granular control over the synchronization process would be highly beneficial. For example, having the ability to sync specific tasks based on criteria could greatly enhance usability. My top two suggestions for such functionality include:
1. Synchronizing tasks in order of priority (e.g., oldest or newest tasks first).
2. Synchronizing tasks by a specified range of task IDs.
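As a rough illustration of the ID-range idea, a full sync could be split into bounded batches so that no single call has to cover the whole project. This is only a sketch under stated assumptions: `id_batches`, `sync_in_batches`, and the `sync_range` callback are hypothetical names made up for this example, not existing Label Studio functions, and the callback stands in for whatever per-range sync operation might be exposed.

```python
from typing import Callable, Iterator, List, Tuple


def id_batches(first_id: int, last_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) task ID ranges covering first_id..last_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield start, end
        start = end + 1


def sync_in_batches(
    first_id: int,
    last_id: int,
    batch_size: int,
    sync_range: Callable[[int, int], None],  # hypothetical per-range sync call
    newest_first: bool = False,
) -> List[Tuple[int, int]]:
    """Call sync_range once per batch, optionally starting from the highest IDs."""
    batches = list(id_batches(first_id, last_id, batch_size))
    if newest_first:
        batches.reverse()
    for start, end in batches:
        sync_range(start, end)
    return batches


if __name__ == "__main__":
    # Stand-in callback that only reports the range it was asked to sync.
    sync_in_batches(1, 60000, 5000, lambda s, e: print(f"sync tasks {s}..{e}"))
```

With batches of this kind, each call would stay well below the timeout, and a failed batch could be retried without repeating the whole sync.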
Would such enhancements be feasible?
Thank you for your time and consideration. I look forward to your thoughts.
Best regards,
Willie
Regarding your question about scaling: Label Studio Enterprise is designed to handle the challenges of having a large number of tasks. It manages synchronization in the background as a separate task, running for several hours to sync a substantial number of tasks. If you require scalability, please consider switching to the enterprise edition. We do not have plans to support scalability in the community edition.
As for granular control: it seems you need to update your tasks over time. Why do you require this capability? What does your workflow look like?