-
From what I see in the code - KPO should use the delete_namespaced_pod method. Depending on the parameters you pass, different grace periods can be set: 0 for immediate deletion, the default coming from your pod template, or any other value you choose. Also, in case the pod cannot be deleted for whatever reason, you should see a log message. One scenario where I can see this happening is when the grace period is long enough and your running task also needs to respond to the various signals (likely TERM/HUP) that K8S uses to kill your task. If you have long-running, badly written C code that does not respond to signals, or you do not handle signal propagation in your container (for example by using bash as the entrypoint, see https://www.kaggle.com/code/residentmario/best-practices-for-propagating-signals-on-docker), the initial signals that K8S sends might be swallowed and the pod will continue running until the grace period expires. That's the theory. In theory, either forcing the grace period to 0 or properly handling signals and terminating your in-container process should do the job. But maybe there are other problems I am not aware of.
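For illustration, a minimal sketch of what forcing the grace period to 0 could look like on the operator - task id, name and image below are placeholders, and depending on your cncf.kubernetes provider version the import path may still be airflow.providers.cncf.kubernetes.operators.kubernetes_pod:

    from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

    long_task = KubernetesPodOperator(
        task_id="long_task",                  # placeholder
        name="long-task",                     # placeholder
        image="my-registry/long-job:latest",  # placeholder
        # Ask Kubernetes to delete the pod immediately (grace period 0) when
        # Airflow kills the task, instead of waiting for the pod's default
        # terminationGracePeriodSeconds.
        termination_grace_period=0,
        dag=dag,
    )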
-
Thanks @potiuk for the quick reply! I was actually able to reproduce this behavior with a PythonOperator that just waits for 30s. If you mark it as failed from the UI, it is marked red, but it keeps running in the background, and once you refresh the frontend after 30s the task is shown as green. For reference, this is the DAG:

    from datetime import datetime
    import time

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def wait_callable():
        # Sleep in one-second steps so progress shows up in the task log.
        for i in range(1, 30):
            print(f'sleep #{i}')
            time.sleep(1)

    default_args = {
        'owner': 'airflow',
        'start_date': datetime(2020, 3, 4)
    }

    dag = DAG("wait_dag_python",
              default_args=default_args,
              schedule=None)

    wait_task = PythonOperator(
        task_id='wait_task',
        python_callable=wait_callable,
        dag=dag
    )

The log file shows a printed line every 1s regardless of whether you fail the task or not.
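To check whether the signal theory from the reply above applies, one option is to instrument the callable to log (and exit on) any TERM it receives - a minimal sketch, with the caveat that this replaces any handler Airflow itself installs in the task process, and that whether a SIGTERM is delivered at all when you mark the task failed depends on the executor/runner setup:

    import signal
    import sys
    import time

    def wait_callable():
        def on_term(signum, frame):
            # If this line never shows up in the task log after marking the
            # task failed, no signal is reaching the process at all.
            print(f'received signal {signum}, stopping')
            sys.exit(1)

        # Note: this replaces whatever SIGTERM handler the task runner may
        # have installed for this process.
        signal.signal(signal.SIGTERM, on_term)

        for i in range(1, 30):
            print(f'sleep #{i}')
            time.sleep(1)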
-
We have a long-running task implemented as a KubernetesPodOperator.
When we mark the task as failed, it is marked red immediately in the UI and the rest of the pipeline finishes accordingly. However, Airflow does not kill the pod, and when the pod finishes, the task turns green even though the DAG finished long ago.
Is there any way to tell Airflow to kill the pod in Kubernetes when we mark the task as failed?
Tested with both versions 2.7.3 and 2.10.4.
Looks similar to #37263, but in our case is_delete_operator_pod does not have any effect.
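For reference, a simplified sketch of how the operator is configured (task id, name and image are placeholders). On newer cncf.kubernetes provider versions the deprecated is_delete_operator_pod flag is, as far as I know, spelled on_finish_action instead:

    from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

    long_pod_task = KubernetesPodOperator(
        task_id="long_pod_task",            # placeholder
        name="long-pod-task",               # placeholder
        image="our-registry/long-job:1.0",  # placeholder
        is_delete_operator_pod=True,        # deprecated spelling
        # on_finish_action="delete_pod",    # equivalent on newer providers
        dag=dag,
    )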