Improve performance of delete+insert incremental strategy with null equality check changes
#834
base: main
Code lgtm, but let's address those failing tests.
Overall this is much cleaner. Assuming tests are happy then this looks good.
delete from {{ target }}
where ({{ unique_key_str }}) in (
    select distinct {{ unique_key_str }}
I think the `distinct` here is unnecessary and would make it take longer since it needs to reduce dupes. I don't have proof of that, but I generally don't use `distinct` when using a `where <x> in <y>` clause.
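To illustrate the point about correctness (not performance), here is a minimal sketch using SQLite from Python as a stand-in for the warehouse. The table names `target` and `staging` and the single-column key `id` are hypothetical and not taken from the macro. It only shows that an `in` subquery deletes the same rows with or without `distinct`; whether dropping `distinct` is actually faster depends on the engine's planner and is not demonstrated here.

```python
import sqlite3


def remaining_after_delete(use_distinct: bool) -> list:
    """Run the delete with or without `distinct` and return the rows left in target."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table target (id integer, val text);
        create table staging (id integer, val text);
        insert into target values (1, 'a'), (2, 'b'), (3, 'c');
        -- staging deliberately contains a duplicate key (id = 1)
        insert into staging values (1, 'x'), (1, 'y'), (3, 'z');
        """
    )
    if use_distinct:
        subquery = "select distinct id from staging"
    else:
        subquery = "select id from staging"
    # `in` is a membership test, so duplicate keys in the subquery do not change the result
    conn.execute(f"delete from target where id in ({subquery})")
    rows = conn.execute("select id, val from target order by id").fetchall()
    conn.close()
    return rows


with_distinct = remaining_after_delete(True)
without_distinct = remaining_after_delete(False)
print(with_distinct == without_distinct)
```

Since the two variants leave the same rows behind, removing `distinct` should be safe from a correctness standpoint; any timing difference would need to be measured on the real adapter.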
Two of the failed Spark tests are due to an unrelated issue. The errors in the third Spark run are related to the cluster not coming up in time, which we have a separate PR to fix.
This PR merges the changes for null equality from here: #744 into existing PR: #151
Problem
Solution
Checklist