Replies: 2 comments
-
I think this is likely a question for the Snowflake team - @mik-laj - maybe it can pique the interest of someone at Snowflake?
-
can you enable developer logs for
-
Hello.
We are experiencing a very strange issue with SnowflakeOperator. We have a DAG that uses SnowflakeOperator to run a fairly simple query of the following form:
INSERT INTO table_a ( col1, col2 ) SELECT col1, col2 FROM table_b;
From the logs we can see that, besides the provided query, two extra queries against the database's information_schema are executed asynchronously:
SELECT table_schema, table_name, column_name, ordinal_position, data_type FROM <DATABASE_NAME>.information_schema.columns WHERE ( table_schema = '<DATABASE_SCHEMA>' AND table_name IN ('table_a') );
And another, very similar query:
SELECT table_schema, table_name, column_name, ordinal_position, data_type FROM <DATABASE_NAME>.information_schema.columns WHERE ( table_schema = '<DATABASE_SCHEMA>' AND table_name IN ('table_b') );
where <DATABASE_NAME> and <DATABASE_SCHEMA> are the values of the parameters passed to the SnowflakeOperator() constructor.
Roughly once every 2 to 5 executions, the second query to information_schema appears to hang, which we infer from the absence of any follow-up lines in the logs.
For a successful run, the relevant part of the log looks like this:
[2022-10-13T11:11:39.799+0000] {cursor.py:696} INFO - query: [SELECT table_schema, table_name, column_name, ordinal_position, data_type FROM D...]
[2022-10-13T11:11:40.498+0000] {cursor.py:720} INFO - query execution done
[2022-10-13T11:11:40.499+0000] {connection.py:509} INFO - closed
For an unsuccessful run, the last two lines are missing, which suggests that the query never finished and the connection was never closed.
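To make the difference between the two kinds of runs easier to spot across many task logs, a small script along these lines could flag queries that were logged as started but never as done. This is only a rough sketch: the function name `find_unfinished_queries` is made up for illustration, the patterns are derived solely from the log excerpt above, and pairing each "query execution done" line with the oldest pending query is an assumption that may not hold when several queries run concurrently.

```python
import re

# Patterns derived from the log excerpt above. The source line number inside
# {cursor.py:NNN} is matched loosely because it can differ between versions.
QUERY_START = re.compile(r"\{cursor\.py:\d+\} INFO - query: \[(?P<sql>.*)\]")
QUERY_DONE = re.compile(r"\{cursor\.py:\d+\} INFO - query execution done")


def find_unfinished_queries(log_lines):
    """Return the logged SQL snippets of queries that started but never finished.

    Assumption: each "query execution done" line closes the oldest pending
    query. The log itself carries no query id, so this pairing is a heuristic.
    """
    pending = []
    for line in log_lines:
        start = QUERY_START.search(line)
        if start:
            pending.append(start.group("sql"))
        elif QUERY_DONE.search(line) and pending:
            pending.pop(0)
    return pending


if __name__ == "__main__":
    import sys

    # Usage: python check_log.py path/to/task.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for sql in find_unfinished_queries(handle):
            print("no completion logged for:", sql)
```

Running it over the log of a hanging task should list the truncated information_schema query, while the log of a successful task should produce no output.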
The Snowflake query history does not show any hanging or failed queries against information_schema, so either the queries actually finished correctly but no response reached SnowflakeOperator, or the query was in fact never launched.
The task does not finish and does not respond to SIGTERM. As a result, it is killed by the SIGKILL sent when the DAG execution timeout is reached.
Execution environment is Astronomer 6.0.2 with Airflow 2.4.1 on Azure Kubernetes Service in Central US region.
We do not know how to tackle the issue or what can be done to prevent it.