Databricks Extra Links See Job Run does not work with custom S3 backend #45240

Open
1 of 2 tasks
mamdouhtawfik opened this issue Dec 27, 2024 · 4 comments · May be fixed by #45334

Comments

@mamdouhtawfik

Apache Airflow Provider(s)

databricks

Versions of Apache Airflow Providers

6.0.0

Apache Airflow version

2.8.1

Operating System

Amazon

Deployment

Amazon (AWS) MWAA

Deployment details

No response

What happened

When I click the link to see the job run details, it directs to https://(S3 path of my XCom JSON file) instead of the Databricks workspace.

What you think should happen instead

I expect the link to work and direct to the job run details in our team's Databricks workspace.

How to reproduce

  1. Create and use a custom XCom backend that writes to S3
  2. Run a Databricks job
  3. Click the link to go to the job run details
  4. Airflow returns a 404
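
For step 1, a minimal sketch of the kind of backend involved may help. It is not the reporter's actual class: `S3XComBackendSketch`, `StoredXCom`, `FAKE_BUCKET`, the bucket name, and the example URL are all hypothetical, the bucket is an in-memory dict so the snippet runs without AWS or Airflow, and the `run_page_url` key is assumed to be the one the Databricks operator pushes.

```python
import json
from dataclasses import dataclass

# In-memory stand-in for an S3 bucket so the sketch runs without AWS access.
FAKE_BUCKET = {}


@dataclass
class StoredXCom:
    """Hypothetical stand-in for the database row that holds an XCom."""
    key: str
    value: bytes


class S3XComBackendSketch:
    """Hypothetical backend that offloads every value, regardless of size."""

    PREFIX = "s3://hypothetical-bucket/xcom/"

    @staticmethod
    def serialize_value(value, *, key: str) -> bytes:
        # Write the real payload to "S3" and keep only its path in the row.
        path = f"{S3XComBackendSketch.PREFIX}{key}.json"
        FAKE_BUCKET[path] = json.dumps(value)
        return json.dumps(path).encode("utf-8")

    @staticmethod
    def deserialize_value(row: StoredXCom):
        # Resolve the stored path back to the payload.
        path = json.loads(row.value.decode("utf-8"))
        return json.loads(FAKE_BUCKET[path])


url = "https://example.invalid/#job/1/run/2"  # made-up run page URL
row = StoredXCom(
    key="run_page_url",
    value=S3XComBackendSketch.serialize_value(url, key="run_page_url"),
)

# What the row holds if it is decoded without the custom backend:
raw_reference = json.loads(row.value.decode("utf-8"))
print(raw_reference)
# What the custom backend returns:
print(S3XComBackendSketch.deserialize_value(row))
```

With a backend shaped like this, any code that decodes the row without going through the custom `deserialize_value` sees the S3 path rather than the run page URL, which matches the behavior described under "What happened".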

Anything else

Our custom XCom backend currently stores every value in S3, not only large values as other XCom backends do. Still, the behavior of the extra link should not depend on that.
I had a look at https://airflow.apache.org/docs/apache-airflow-providers-databricks/6.0.0/_modules/airflow/providers/databricks/operators/databricks.html#DatabricksJobRunLink.get_link and https://github.com/apache/airflow/blob/2.8.1/airflow/models/xcom.py#L873-L876, but I doubt that TYPE_CHECKING is true at runtime, because if it were, the provider would not use our custom class.
I also tried overriding the orm_deserialize_value method, hoping this would fix the issue, but I ran into the problem I described in this comment #44232 (comment) (contributions to that discussion are very welcome as well).
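
On the TYPE_CHECKING doubt above: `typing.TYPE_CHECKING` is documented as `False` at runtime, so only the `else` branch of such a guard executes. The sketch below assumes the linked lines follow the usual pattern of binding a name to a base class for type checkers and to a resolved class otherwise; `BaseBackend`, `CustomS3Backend`, and `resolve_backend` are hypothetical stand-ins, not Airflow's code.

```python
from typing import TYPE_CHECKING


class BaseBackend:
    """Hypothetical stand-in for the default XCom class."""


class CustomS3Backend(BaseBackend):
    """Hypothetical stand-in for a user-configured XCom backend."""


def resolve_backend() -> type:
    """Hypothetical resolver; a real one would read the configured class."""
    return CustomS3Backend


if TYPE_CHECKING:
    # Seen by static type checkers only; never executed by the interpreter.
    Backend = BaseBackend
else:
    # This branch runs at runtime, so the configured class is bound.
    Backend = resolve_backend()

print(TYPE_CHECKING)     # False
print(Backend.__name__)  # CustomS3Backend
```

If the linked lines do follow this pattern, the guard itself would not stop the custom class from being used, which points the investigation elsewhere.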

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@mamdouhtawfik mamdouhtawfik added the area:providers, kind:bug, and needs-triage labels Dec 27, 2024

boring-cyborg bot commented Dec 27, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@potiuk potiuk added the good first issue label and removed the needs-triage label Dec 27, 2024
@Prab-27
Contributor

Prab-27 commented Dec 27, 2024

@potiuk, I'd like to work on this.

@mohamedmeqlad99

I have submitted a PR to address this issue: PR #45328.
The PR resolves the problem where the "See Job Run" link does not work with a custom S3 XCom backend. Feedback is welcome!

@mohamedmeqlad99 mohamedmeqlad99 linked a pull request Jan 2, 2025 that will close this issue
4 tasks
@mamdouhtawfik
Author

I had a deeper look into the issue, and I think it is related to a "bug" in the xcom.py file. It was resolved in this PR https://github.com/apache/airflow/pull/37058/files#diff-3f95f161bd9ef4bb455611e0c58583899769360afc53f755cd1577cf194553c5R423, which calls the XCom variable to deserialize the value instead of the BaseXCom class. I am not sure there is an easy way to address this within Airflow 2.8.1.
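
To illustrate the difference described above, here is a small sketch of hard-coding the base class versus dispatching through the configured class. It does not reproduce the code in xcom.py or in the linked PR: `BaseBackend`, `CustomBackend`, `ConfiguredBackend`, `STORE`, and both helper functions are hypothetical, and the URL and bucket path are made up.

```python
# Hypothetical external store: reference -> real payload.
STORE = {
    "s3://hypothetical-bucket/xcom/run_page_url.json":
        "https://example.invalid/#job/1/run/2",
}


class BaseBackend:
    @staticmethod
    def deserialize_value(stored: str) -> str:
        # Default behavior: the stored value is already the payload.
        return stored


class CustomBackend(BaseBackend):
    @staticmethod
    def deserialize_value(stored: str) -> str:
        # Custom behavior: the stored value is a reference to resolve.
        return STORE[stored]


# Stand-in for the class a resolver would return from configuration.
ConfiguredBackend = CustomBackend


def link_via_base_class(stored: str) -> str:
    # Hard-codes the base class, so the custom override is bypassed.
    return BaseBackend.deserialize_value(stored)


def link_via_configured_class(stored: str) -> str:
    # Dispatches through the configured class, so the override applies.
    return ConfiguredBackend.deserialize_value(stored)


stored = "s3://hypothetical-bucket/xcom/run_page_url.json"
print(link_via_base_class(stored))        # the S3 reference
print(link_via_configured_class(stored))  # the resolved URL
```

Dispatching through the configured class keeps the lookup backend-agnostic, so no storage-specific handling is needed in the code that builds the link.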

@mohamedmeqlad99 thanks for spending time on this, but I am not sure your PR addresses the issue here. In general, I don't think we should take an S3-specific path; we should rather rely on the deserialization of the custom XCom backend (which is what was "fixed" in the PR I shared).

4 participants