feat: Add projection on inserted_at #17466
posthog/clickhouse/migrations/0049_add_inserted_at_projection.py
Co-authored-by: James Greenhill <[email protected]>
🙏 this MR would make a big difference for me and my team!
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the |
Hi @tomasfarias, would you be able to help get this MR across the finish line? Really looking forward to being able to successfully sync all my PostHog data to Snowflake without missing stuff.
looks good to me! thanks for only doing this on the 202309 partition
if you wanted to limit this to |
Updated the materialization to |
Problem

Since we added inserted_at, batch exports are querying events based on inserted_at. However, we are forced to include timestamp in the query to better utilize the sort key. It would be better if we could use a sort key based on inserted_at to optimize performance.

Moreover, due to the discrepancy between inserted_at and timestamp, we are missing events that were historically loaded.

Changes

Add a migration that creates a projection with inserted_at in the sort key.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

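For readers unfamiliar with ClickHouse projections, here is a rough sketch of the kind of statements the migration described under Changes might issue. It is not the code from this PR: the table name sharded_events, the projection name, and the (team_id, inserted_at) sort key are assumptions chosen for illustration, and any ON CLUSTER clause or PostHog migration helper is left out. Only the ADD PROJECTION and MATERIALIZE PROJECTION ... IN PARTITION statements themselves are standard ClickHouse syntax.

```python
from typing import List, Optional, Sequence

# Hypothetical names for illustration only; the real migration may differ.
EVENTS_TABLE = "sharded_events"
PROJECTION_NAME = "inserted_at_projection"
# Assumed sort key: the PR only states that inserted_at is part of it.
PROJECTION_SORT_KEY = ("team_id", "inserted_at")


def add_projection_sql(table: str, projection: str, order_by: Sequence[str]) -> str:
    """Build an ALTER TABLE ... ADD PROJECTION statement with the given sort key."""
    if not order_by:
        raise ValueError("order_by must name at least one column")
    sort_key = ", ".join(order_by)
    return (
        f"ALTER TABLE {table} "
        f"ADD PROJECTION IF NOT EXISTS {projection} "
        f"(SELECT * ORDER BY {sort_key})"
    )


def materialize_projection_sql(
    table: str, projection: str, partition: Optional[str] = None
) -> str:
    """Build a MATERIALIZE PROJECTION statement, optionally for one partition."""
    statement = f"ALTER TABLE {table} MATERIALIZE PROJECTION {projection}"
    if partition is not None:
        # Restricting to one partition avoids rewriting every existing part.
        statement += f" IN PARTITION '{partition}'"
    return statement


def build_statements() -> List[str]:
    """Statements a migration could run, in order."""
    return [
        add_projection_sql(EVENTS_TABLE, PROJECTION_NAME, PROJECTION_SORT_KEY),
        materialize_projection_sql(EVENTS_TABLE, PROJECTION_NAME, partition="202309"),
    ]


if __name__ == "__main__":
    for statement in build_statements():
        print(statement + ";")
```

Adding a projection only affects newly written parts. Existing data is covered only once the projection is materialized, which is why the second statement is limited to a single partition (202309, as mentioned in the review) rather than rewriting the whole table.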
How did you test this code?