Dask executor doesn't get correct response if worker restarted during task running #45608
Replies: 4 comments
-
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
-
I haven't found the DaskExecutor tag in the airflow-providers list.
-
Yes. We no longer publish or fix the Dask executor; it has been removed from maintenance as a result of this lazy consensus https://lists.apache.org/thread/fxv44cqqljrrhll3fdpdgc9h9fz5ghcy, where you will find the detailed discussion on why we decided to do it. But if someone would like to fork the code of the last Dask executor, look at any problems there, and release a new version - likely as a 3rd-party managed Dask provider - they are entirely free to do so.
-
Converting to a discussion as it's at most that.
-
Apache Airflow Provider(s)
standard
Versions of Apache Airflow Providers
apache-airflow-providers-daskexecutor==1.1.1
apache-airflow-providers-fab==1.4.1
apache-airflow-providers-ftp==3.11.1
apache-airflow-providers-http==4.13.1
apache-airflow-providers-imap==3.7.0
apache-airflow-providers-smtp==1.8.0
apache-airflow-providers-sqlite==3.9.0
apache-airflow-providers-common-compat==1.2.1
apache-airflow-providers-common-io==1.4.2
apache-airflow-providers-common-sql==1.18.0
Apache Airflow version
2.10.2
Operating System
Debian GNU/Linux 12 (bookworm)
Deployment
Other Docker-based deployment
Deployment details
Kubernetes: v1.28
GoVersion:"go1.15.9"
What happened
As shown in the logs above, the Dask worker triggered a graceful restart while an Airflow task was still running.
After that, the task's status in the executor stayed 'pending' and never changed.
The running task is actually a shell command started as [airflow run ...]; it kept running and marked itself as success in Airflow's metadata DB.
(The dates in the logs are actually the same value rendered in different time zones.)
After that, the task's status in the Airflow DB was success, so it would no longer be scheduled. Meanwhile, its status in the Dask executor (inside the Airflow scheduler process) remained 'pending' until the scheduler was restarted. This leaked state reduces the value of executor_slots_available; once executor_slots_available reaches zero, the Airflow scheduler stops working. A debug dump from the DaskExecutor shows that no task was actually running at that time:
What you think should happen instead
Obviously, the status above is incorrect; this looks like a memory-leak bug.
How to reproduce
Maybe Airflow should sync the task status held by the Dask executor with the metadata DB and clean up succeeded tasks.
Or just make sure the Dask executor receives the same status as the command that is actually running.
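The first suggestion could look roughly like the sketch below. This is a hypothetical reconciliation pass, not Airflow's actual API: `sync_with_db`, the `running` dict, and the `db_state_for` callback are all illustrative names.

```python
# Hypothetical cleanup pass: drop executor-tracked tasks that the
# metadata DB already reports as finished. All names are illustrative.

FINISHED = {"success", "failed"}


def sync_with_db(running: dict, db_state_for) -> dict:
    """Return a copy of `running` with DB-finished tasks removed.

    running      -- task_key -> future-like handle held by the executor
    db_state_for -- callable mapping a task_key to its metadata-DB state
    """
    return {
        key: fut
        for key, fut in running.items()
        if db_state_for(key) not in FINISHED
    }


# Usage: simulate one leaked task whose DB state is already "success".
states = {"task_a": "success", "task_b": "running"}
running = {"task_a": object(), "task_b": object()}
running = sync_with_db(running, states.get)
print(sorted(running))  # ['task_b']
```

Run periodically from the scheduler, a pass like this would reclaim the leaked slots even when the worker's completion callback was lost.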
Anything else
No response
Are you willing to submit PR?
Code of Conduct