feat(persons-on-events): Add Kafka table and materialized view for distinct ID overrides #20349

Merged
merged 2 commits into from
Feb 26, 2024


tkaemming
Contributor

This is the companion to #20326. It adds the Kafka table and materialized view that actually consume the messages and write them to the table added there.

@tkaemming tkaemming force-pushed the poe-distinct-id-overrides-kafka-writes branch from 8acda46 to 974790e Compare February 23, 2024 17:06
@tkaemming
Contributor Author

I think this doesn't need to be blocked by #20187 like I previously thought.

Distinct ID reuse is still an issue that should be solved, but at worst this change won't make the current problems any worse, and the data in the overrides table should actually be a bit more accurate over time than the data in person_distinct_id2. Since we periodically delete rows from the overrides table after they have been squashed, some rows that are versioned too low (due to Postgres row deletion and reuse rolling back the version counter) can still be written to overrides in cases where person_distinct_id2 would ignore them.

In any case, this data is still correctable in both tables in the short term by running the fix script. When #20187 is resolved, we'll want to run a mass repair for the entire dataset, which will also fix any lingering issues here (we'll also want to update that script to create tombstones where it currently just warns about future issues.)
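
To illustrate the versioning behavior described above, here is a minimal sketch. It assumes both tables resolve duplicates by keeping the highest-versioned row per distinct ID, which is a simplification; the `Row` type, the `latest_by_version` helper, and the sample values are all hypothetical and not taken from the PostHog codebase.

```python
from typing import NamedTuple


class Row(NamedTuple):
    # Hypothetical, simplified shape of a distinct ID mapping row.
    distinct_id: str
    person_id: str
    version: int


def latest_by_version(rows: list[Row]) -> dict[str, Row]:
    """Keep, for each distinct_id, the row with the highest version."""
    winners: dict[str, Row] = {}
    for row in rows:
        current = winners.get(row.distinct_id)
        if current is None or row.version > current.version:
            winners[row.distinct_id] = row
    return winners


# An older mapping, and a newer one whose version counter was rolled back
# (assumed here to happen when the Postgres row is deleted and reused).
old = Row("d1", "person-a", version=5)
reused = Row("d1", "person-b", version=1)

# A table that retains the old row: the stale high version shadows the new row.
retained = latest_by_version([old, reused])

# A table where the old row was deleted after squashing: the new row is kept.
after_squash_delete = latest_by_version([reused])
```

In this model the table that retains old rows keeps resolving `d1` to `person-a`, while the table that deleted the squashed row resolves it to `person-b`, which is the sense in which the overrides data could end up more accurate.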

Member

@fuziontech fuziontech left a comment

looks great!

@tkaemming tkaemming merged commit 88e1272 into master Feb 26, 2024
75 of 76 checks passed
@tkaemming tkaemming deleted the poe-distinct-id-overrides-kafka-writes branch February 26, 2024 22:26