You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It would be great to have an identity resolution logic in the package. This would consist of multiple steps:
Have a stitching table that combines the user identifiers
Depending on the use case this could be a table defined in dbt or in a more complex case a python implementation or a graph database.
Example code
--- user_identity_mapping.sql
select distinct
anonymous_customer_id,
last_value(customer_id) over(
partition by anonymous_customer_id
order by event_tstamp
rows between unbounded preceding and unbounded following
) as customer_id,
max(event_tstamp) over (partition by anonymous_customer_id) as end_tstamp
from events
where customer_id is not null
and anonymous_customer_id is not null
Update the activities via a post hook
With that table we could implement a post hook that updates the customer_id as well as the activity_occurence and activity_repeated_at fields.
Example code for the customer_id update
update activity as a
set a.customer_id = ui.customer_id
from user_identity_mapping as ui
where a.anonymous_customer_id = ui.anonymous_customer_id;
Alternatives to consider:
Is it really neccessary to have two id columns in activities?
The text was updated successfully, but these errors were encountered:
It would be great to have an identity resolution logic in the package. This would consist of multiple steps:
Depending on the use case this could be a table defined in dbt or in a more complex case a python implementation or a graph database.
Example code
With that table we could implement a post hook that updates the
customer_id
as well as theactivity_occurence
andactivity_repeated_at
fields.Example code for the customer_id update
Alternatives to consider:
The text was updated successfully, but these errors were encountered: