-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactoring-external-materialization #332
Changes from 1 commit
7f9111b
1014c82
d94042e
7e4b533
46b1f06
d4d8887
0dd9053
405b929
e2685a1
f93a126
489b020
9e50e7f
0115f62
8aba866
0b304c0
a064f9c
9b32e17
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,11 +7,14 @@ | |
{%- set read_location = adapter.external_read_location(location, rendered_options) -%} | ||
|
||
-- set language - python or sql | ||
-- i have to learn general about python models | ||
{%- set language = model['language'] -%} | ||
|
||
{%- set target_relation = this.incorporate(type='view') %} | ||
|
||
-- Continue as normal materialization | ||
-- Why do we have temp and intermediate relation? | ||
-- What does load_cached_relation do? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Built-in dbt method; dbt will do the equivalent of caching the DB catalog metadata it loaded during the run so as to minimize the number of round-trips it does to the database (less relevant for DuckDB, more relevant for e.g. MotherDuck.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will have to take a look into MotherDuck here because I still don't understand how this fits here and what the places are where I have to be careful not to overlook the functionality. So please, if you see something that is against MotherDuck logic, make me aware of that. |
||
{%- set existing_relation = load_cached_relation(this) -%} | ||
{%- set temp_relation = make_intermediate_relation(this.incorporate(type='table'), suffix='__dbt_tmp') -%} | ||
{%- set intermediate_relation = make_intermediate_relation(target_relation, suffix='__dbt_int') -%} | ||
|
@@ -27,6 +30,8 @@ | |
{%- set backup_relation = make_backup_relation(target_relation, backup_relation_type) -%} | ||
-- as above, the backup_relation should not already exist | ||
{%- set preexisting_backup_relation = load_cached_relation(backup_relation) -%} | ||
|
||
--What is grants here? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. GRANTS in the context of e.g. Snowflake or another cloud DWH. DuckDB doesn't have an equivalent concept yet, though it wouldn't surprise me to see it in the context of MotherDuck or Iceberg/Delta tables. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will skip it for now to simplify, but the access grants here are an exciting topic. We want to export from the current context/duckdb into external parquet, delta, iceberg, and register a view over this location so that the table can be referenced in the next steps. |
||
-- grab current tables grants config for comparision later on | ||
{% set grant_config = config.get('grants') %} | ||
|
||
|
@@ -62,10 +67,11 @@ | |
{{ adapter.rename_relation(intermediate_relation, target_relation) }} | ||
|
||
{{ run_hooks(post_hooks, inside_transaction=True) }} | ||
|
||
-- What is should_revoke? | ||
milicevica23 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{% set should_revoke = should_revoke(existing_relation, full_refresh_mode=True) %} | ||
{% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %} | ||
|
||
--What does that do? | ||
milicevica23 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{% do persist_docs(target_relation, model) %} | ||
|
||
-- `COMMIT` happens here | ||
|
@@ -76,9 +82,10 @@ | |
{{ drop_relation_if_exists(temp_relation) }} | ||
|
||
-- register table into glue | ||
--I dont know glue so can you explain me a bit about that? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's a lot of history here-- the ability to register an external table with the AWS Glue catalog (and thus make it accessible to e.g. AWS Athena) actually predates the concept of dbt-duckdb plugins (in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice, I see; I will need you to support me from the requirements point of view because I am unsure if I can imagine all possible use cases implemented until now. I don't have so much experience in the AWS world so more azure, google cloud but happy to learn about glue and athena |
||
{%- set plugin_name = config.get('plugin') -%} | ||
{%- set glue_register = config.get('glue_register', default=false) -%} | ||
{%- set partition_columns = config.get('partition_columns', []) -%} | ||
{%- set partition_columns = config.get('partition_columns', []) -%} -- this one is never used? | ||
milicevica23 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{% if plugin_name is not none or glue_register is true %} | ||
{% if glue_register %} | ||
{# legacy hack to set the glue database name, deprecate this #} | ||
|
@@ -89,6 +96,6 @@ | |
|
||
{{ run_hooks(post_hooks, inside_transaction=False) }} | ||
|
||
{{ return({'relations': [target_relation]}) }} | ||
{{ return({'relations': [target_relation]}) }} -- to what i return? | ||
milicevica23 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
{% endmaterialization %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to not destroy an existing relation that has the same name as this one (if it exists in the catalog) in case something fails during the construction of the relation defined by this model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i understand; we prepare everything and do all the transformation, and at the very end, we exchange the names
I understand how it fits into the context but am not sure what it means for me right now in the implementation, so make sure that i don't overlook that one in the implementation.