Templating (like {{ ds }} ) stopped working in papermill after upgrade from 2.3.x to 2.4.x #28977
Replies: 21 comments 4 replies
-
It looks like all templates are not working, because I also have problems getting ds_nodash to work. Templating of the output notebook, however, works. Also, I can't understand how templating works in Airflow. Before 2.3.x there was a dedicated function in the base operator, but that got removed. How does templating work now? I can see that the execute function already contains partially templated data, but I can't see where that function is called.
-
All the rendering functions are still there; some of them were moved up to AbstractOperator (the parent class).
-
I copy-pasted the entire BaseOperator code from older versions into the new one and it still didn't work. Thanks to the new downgrade command I was able to downgrade to 2.3.0, which was the last stable version in my environment (with small Celery Kubernetes executor script fixes), and now templating works correctly. I couldn't debug the source of the issue. I had problems with k8s pod overrides in 2.3.4, so I went back to 2.3.0.
-
Same here, upgraded to 2.4.1 and now my parameters in papermill are not executed. Here is my code:
-
In case you want to downgrade to an older version:
That's it |
-
I don’t have a setup to run Papermill, but a test with `BashOperator(..., env={"FOO": "{{ ds }}"})` indicates that the template rendering mechanism is running correctly, so this is likely specific to the Papermill operator. But the operator is very, very straightforward and I can’t see how it can be problematic at all. We’ll need someone who can reproduce this to debug the issue, I’m afraid 😞
-
I don't think this is a bug in the operator. We even have coverage for templating in the tests (airflow/tests/providers/papermill/operators/test_papermill.py, lines 61 to 73 in d9db89a), so I doubt the issue is in the operator itself.
I suspect your issue is not a bug in the operator but wrong usage in how you pull from XCom or extract values from conf?
-
I downgraded to version 2.3.4 and it's working perfectly fine. I don't see how the problem can be in my script and not in the operator, since it works on the older version.
-
Hi @marvinfretly, can you please test the same templating code you use on other operators?
-
Tried this and it's actually working
Tried this minimal task and it's still not working.
-
I checked old logs yesterday and I found out something interesting. So it seems the only problem was with |
-
Cool. So this narrows the issue down to this class (airflow/airflow/providers/papermill/operators/papermill.py, lines 33 to 39 in 79354dd). This is the difference between
As for the cause of the bug, it's still a mystery to me. @uranusjr, given the above information, maybe you can better speculate?
-
Hmm, I can see why this doesn’t work, but the problem is, I can’t understand how it ever worked before 2.4! The template rendering code should never have been able to go into a nested class like this. I reduced the code to this and the result corroborates that (I tested on main, 2.3.3, and 2.2.2):

    from airflow.providers.papermill.operators.papermill import PapermillOperator

    context = {"ds": "2022-10-13"}
    pp = PapermillOperator(task_id="pp", input_nb="my_nb", parameters={"x": "{{ ds }}"})
    pp.render_template_fields(context)
    # This passes because 'parameters' is templated.
    assert pp.parameters["x"] == "2022-10-13"
    # This does not work because the 'NoteBook' class is not renderable.
    assert pp.inlets[-1].parameters["x"] == "2022-10-13"

We can fix the apparent issue here (by making NoteBook renderable), but I’m uneasy about this since we still seem to be obviously missing something.
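To make the behaviour in the reduced case above easier to follow without an Airflow install, here is a minimal pure-Python sketch. Everything in it is a stand-in: `ToyOperator`, the toy `NoteBook`, and the string-replacing `render` helper are hypothetical and only mimic the shape of the real classes, and it assumes that rendering rebinds each templated attribute to a newly built value rather than mutating the old one in place.

```python
class NoteBook:
    """Toy stand-in for the nested entity; it is not a templated field."""

    def __init__(self, url, parameters=None):
        self.url = url
        self.parameters = parameters


def render(text, context):
    # Hypothetical stand-in for Jinja rendering: plain string substitution.
    for key, value in context.items():
        text = text.replace("{{ " + key + " }}", value)
    return text


class ToyOperator:
    # Only attributes named here are visited by the renderer.
    template_fields = ("parameters",)

    def __init__(self, input_nb, parameters):
        self.parameters = parameters
        # The nested object keeps a reference to the unrendered dict.
        self.inlets = [NoteBook(url=input_nb, parameters=self.parameters)]

    def render_template_fields(self, context):
        for name in self.template_fields:
            original = getattr(self, name)
            # Assumption: a new dict is built and assigned to the attribute,
            # so objects holding the old dict are left untouched.
            rendered = {k: render(v, context) for k, v in original.items()}
            setattr(self, name, rendered)


op = ToyOperator(input_nb="my_nb", parameters={"x": "{{ ds }}"})
op.render_template_fields({"ds": "2022-10-13"})
print(op.parameters["x"])             # 2022-10-13
print(op.inlets[-1].parameters["x"])  # {{ ds }}
```

Under these assumptions the operator's own attribute is rendered, while the object created in the constructor still carries the raw template string, which matches the two assertions in the reduced case.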
-
Is there any update on this? I'm facing the same issue regarding the parameters right now. |
-
For anyone interested, I worked around this issue with the PythonOperator, where I get in **kwargs and with e.g. |
-
@uranusjr Could you please provide a hint on how to do it as a dirty fix? This issue prevents anyone using the Papermill operator from upgrading Airflow.
-
You can override |
-
Thanks! I did it but there was no change, I added some console prints in AbstractOperator and it seems that EDIT: Here is the output calls from
Still, the |
-
I found the issue in the Papermill operator code (or Airflow). The render_template method apparently runs after the __init__ method but before the execute() method. Because the Papermill operator initializes self.inlets in the constructor from self.parameters, self.inlets holds a NoteBook object with the pre-render parameters, which never gets updated. I will visualize the flow with the code here:
The problem is fixed if you modify the execute method like this, by updating the self.inlets and self.outlets attributes inside the execute function, after the render function has been executed.
I am currently on Airflow 2.5.0, and I am looking at the Airflow ti now to see if they changed the rendering.
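The ordering described in this comment (constructor, then rendering, then execute) and the suggested fix can be sketched in plain Python. This is not the provider's code: `PatchedToyOperator`, the toy `NoteBook`, and the `render` helper are hypothetical stand-ins, and the string substitution only imitates Jinja. The point is only that objects built inside `execute` see the rendered values.

```python
class NoteBook:
    """Toy stand-in for the notebook entity attached to inlets/outlets."""

    def __init__(self, url, parameters=None):
        self.url = url
        self.parameters = parameters


def render(value, context):
    # Hypothetical stand-in for template rendering; handles str and dict.
    if isinstance(value, dict):
        return {k: render(v, context) for k, v in value.items()}
    if isinstance(value, str):
        for key, replacement in context.items():
            value = value.replace("{{ " + key + " }}", replacement)
    return value


class PatchedToyOperator:
    template_fields = ("input_nb", "output_nb", "parameters")

    def __init__(self, input_nb, output_nb, parameters):
        self.input_nb = input_nb
        self.output_nb = output_nb
        self.parameters = parameters
        # Nothing is captured here; values are still unrendered at this point.
        self.inlets = []
        self.outlets = []

    def render_template_fields(self, context):
        for name in self.template_fields:
            setattr(self, name, render(getattr(self, name), context))

    def execute(self):
        # Build the nested objects only now, after rendering has run.
        self.inlets = [NoteBook(self.input_nb, self.parameters)]
        self.outlets = [NoteBook(self.output_nb)]
        return self.inlets[-1].parameters


op = PatchedToyOperator("in.ipynb", "out-{{ ds }}.ipynb", {"x": "{{ ds }}"})
op.render_template_fields({"ds": "2022-10-13"})  # happens before execute
params = op.execute()
print(params)  # {'x': '2022-10-13'}
```

With a real operator the same idea would presumably be applied in a subclass whose execute rebuilds inlets and outlets before delegating to the parent, but the exact attributes to rebuild depend on the provider version in use.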
-
Rendering has always happened after |
-
Converted to discussion. It is not at all clear whether this is an Airflow issue, and it seems we know how it should work in modern Airflow. How it worked before is unclear and, if anything, it was accidental.
-
Apache Airflow version
2.4.1
What happened
I am using CeleryKubernetesExecutor with a Celery worker, and I am running a Papermill task.
Suddenly, after upgrading from Airflow 2.3.4 to 2.4.1, the {{ ds }} template stopped being recognized on the worker in Papermill notebooks. I can see it being rendered properly in the UI, but on the worker there is only {{ ds }} in the parameters, and the operator fails with the error ValueError: time data '{{ ds }}' does not match format '%Y-%m-%d'
Now all Papermill tasks using {{ ds }} are broken.
What you think should happen instead
{{ ds }} should be properly templated.
How to reproduce
Run an Airflow instance with a Celery worker; both should be on 2.4.1. Try running PapermillOperator on the worker with {{ ds }} as a parameter for the notebook. Check that the parameter in the output notebook contains {{ ds }} instead of the proper value.
Operating System
docker image nvidia/cuda:10.1-cudnn8-devel-ubuntu18.04 (Ubuntu 18.04)
Versions of Apache Airflow Providers
apache-airflow-providers-apache-beam 3.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-cassandra 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-hive 2.0.2 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-spark 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-celery 2.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-cncf-kubernetes 2.0.2 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-ftp 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-google 5.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-http 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-imap 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-jdbc 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-mysql 2.1.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-papermill 3.0.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-postgres 2.2.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-sftp 2.1.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-sqlite 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-ssh 2.1.1 pyhd8ed1ab_0 conda-forge
Deployment
Other Docker-based deployment
Deployment details
Google Kubernetes Engine for Airflow, regular Docker for celery worker
Anything else
It happened when I upgraded from airflow 2.3.4 to 2.4.1, no other libraries were changed. Tried both apache-airflow-providers-papermill 3.0.0 and apache-airflow-providers-papermill 2.2.3
Are you willing to submit PR?
Code of Conduct