The data extract for the full Coronavirus dataset appears to have hung sometime after March 25, probably either when the shared `/storage` drive ran out of space or when the server had to be restarted after a network outage. TweetSets still reported the task as processing even though no files were being produced. Restarting the task required deleting the pertinent folder in `/storage/full_datasets`.
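As a stopgap until we have proper recovery, a watchdog could flag extract folders that have stopped producing files. This is only a sketch; the directory layout under `full_datasets` and the staleness threshold are assumptions, not existing TweetSets code:

```python
import time
from pathlib import Path


def stale_extract_dirs(base_dir, max_idle_seconds=6 * 3600):
    """Return extract directories whose newest file hasn't changed recently.

    A directory that contains no files at all falls back to its own mtime,
    so an extract that never wrote anything is also flagged once the
    threshold passes.
    """
    now = time.time()
    stale = []
    for d in Path(base_dir).iterdir():
        if not d.is_dir():
            continue
        mtimes = [p.stat().st_mtime for p in d.rglob("*") if p.is_file()]
        newest = max(mtimes, default=d.stat().st_mtime)
        if now - newest > max_idle_seconds:
            stale.append(d)
    return stale
```

Directories returned by this check would be candidates for the manual deletion described above (or for automated cleanup, once we trust the heuristic).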
We need a way to recover gracefully from such errors.
If we continue using Celery, look at the call to `_generate_tasks.AsyncResult(task_id)`, which was returning a `PENDING` status even in the absence of a viable task.
If we are able to use Spark for extracts, consider exposing the Spark jobs UI from the container (for monitoring and disabling of jobs).
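On the Celery point: `AsyncResult(task_id).state` reports `PENDING` both for tasks that are genuinely queued and for task ids the result backend has never seen, so `PENDING` alone cannot distinguish a queued extract from one lost to a restart. One workaround is to record dispatched task ids ourselves and treat `PENDING` on an unrecorded id as lost. A minimal sketch, independent of the TweetSets code (the registry of dispatched ids and the `LOST` label are assumptions for illustration):

```python
def classify_task(state, task_id, dispatched_ids):
    """Interpret a Celery-style task state defensively.

    Celery reports PENDING for unknown task ids as well as for queued
    tasks, so PENDING on an id we never recorded (or whose record was
    wiped by a restart) is treated as lost rather than in progress.
    """
    if state != "PENDING":
        # STARTED, SUCCESS, FAILURE, etc. come from the result backend
        # and can be taken at face value.
        return state
    return "PENDING" if task_id in dispatched_ids else "LOST"
```

A `LOST` result would then trigger the cleanup-and-restart path instead of leaving the UI stuck on "processing".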