I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
It would be great to have the option to customize the format for event_time_start and event_time_end when filtering a source table for microbatch models.
As I'm working with BigQuery, the scenarios shown below apply specifically to BigQuery.
Case 1: Source Table Partitioned by a Non-TIMESTAMP Column
With this update, the source table’s event_time column is now cast to TIMESTAMP as shown below:
However, if the event_time column in the source table is not originally of TIMESTAMP type and the table is partitioned by event_time, this casting in the WHERE clause does not reduce the amount of data scanned.
This issue can be resolved by formatting event_time_start and event_time_end:
where
  event_time >= '2025-02-23' and event_time < '2025-02-24'
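A minimal sketch of how such a formatted filter could be rendered, assuming the batch bounds are available as Python datetime objects. The function name `render_event_time_filter` and its `fmt` parameter are hypothetical and not part of dbt's API; the point is only that the bounds become plain date literals and the partition column itself is left uncast.

```python
from datetime import datetime, timezone


def render_event_time_filter(column, start, end, fmt="%Y-%m-%d"):
    """Build a half-open range filter [start, end) on `column`.

    Hypothetical helper: `fmt` is a strftime pattern applied to both bounds,
    so the column is compared against literals instead of being cast.
    """
    lower = start.strftime(fmt)
    upper = end.strftime(fmt)
    return f"{column} >= '{lower}' and {column} < '{upper}'"


# Example bounds for a one-day batch (values taken from the filter above).
batch_start = datetime(2025, 2, 23, tzinfo=timezone.utc)
batch_end = datetime(2025, 2, 24, tzinfo=timezone.utc)

print(render_event_time_filter("event_time", batch_start, batch_end))
# prints: event_time >= '2025-02-23' and event_time < '2025-02-24'
```

With the default pattern the sketch only suits batches whose bounds fall on day boundaries; a finer batch size would need a pattern that keeps the time component.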
Case 2: Sharded Tables
For example, GA4's BigQuery Export tables are sharded by date (formatted as YYYYMMDD).
In these cases, filtering should be performed using the _table_suffix pseudo column:
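As an illustration only (not the snippet from the original report), the sketch below renders a `_table_suffix` filter for a date-sharded table. It assumes day-aligned batch bounds and a wildcard table whose suffix is exactly the YYYYMMDD shard date; the helper name `render_table_suffix_filter` is hypothetical and not part of dbt's API.

```python
from datetime import datetime, timezone


def render_table_suffix_filter(start, end):
    """Build a half-open [start, end) filter on the _table_suffix pseudo column.

    Hypothetical helper. Fixed-width YYYYMMDD strings sort in chronological
    order, so plain string comparison selects the intended shards.
    """
    lower = start.strftime("%Y%m%d")
    upper = end.strftime("%Y%m%d")
    return f"_table_suffix >= '{lower}' and _table_suffix < '{upper}'"


# Example bounds for a one-day batch (assumed to fall on day boundaries).
batch_start = datetime(2025, 2, 23, tzinfo=timezone.utc)
batch_end = datetime(2025, 2, 24, tzinfo=timezone.utc)

print(render_table_suffix_filter(batch_start, batch_end))
# prints: _table_suffix >= '20250223' and _table_suffix < '20250224'
```

If the wildcard also matches other shard families, the suffix would carry an extra prefix and the comparison would need adjusting.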
Is this your first time submitting a feature request?
Describe alternatives you've considered
Using event_time_start and event_time_end of the batch object to manually filter source tables.
Who will this benefit?
Are you interested in contributing this feature?
Yes, but I'm not very familiar with the codebase.
Anything else?
No response