refactor: Support batch export models as views #23052
Merged: tomasfarias merged 49 commits into master from refactor/support-for-batch-export-model-views on Jun 24, 2024
Changes from 45 commits
b43e7af
refactor: Update metrics to fetch counts at request time
tomasfarias 590f74f
fix: Move import to method
tomasfarias 53c71b6
fix: Add function
tomasfarias ea1d33f
feat: Custom schemas for batch exports
tomasfarias 81e929c
feat: Frontend support for model field
tomasfarias 054a826
fix: Clean-up
tomasfarias 11b47d3
fix: Add missing migration
tomasfarias 106a725
fix: Make new field nullable
tomasfarias 8b19079
Update UI snapshots for `chromium` (1)
github-actions[bot] fdb729a
Update UI snapshots for `chromium` (1)
github-actions[bot] ae9870c
Update UI snapshots for `chromium` (1)
github-actions[bot] 091e380
Update UI snapshots for `chromium` (1)
github-actions[bot] af52af9
Update UI snapshots for `chromium` (1)
github-actions[bot] 46c93de
Update UI snapshots for `chromium` (1)
github-actions[bot] c8bc6a6
fix: Bump migration number
tomasfarias fae1b59
fix: Bump migration number
tomasfarias 85f5094
refactor: Update metrics to fetch counts at request time
tomasfarias b2382e2
fix: Actually use include and exclude events
tomasfarias bc6dd4e
refactor: Switch to counting runs
tomasfarias ce7d4df
refactor: Support batch export models as views
tomasfarias 55f6a5a
fix: Merge conflict
tomasfarias 8e306a4
fix: Quality check fixes
tomasfarias a33563a
refactor: Update metrics to fetch counts at request time
tomasfarias 1f93d43
fix: Move import to method
tomasfarias a395f18
fix: Add function
tomasfarias eb0e581
fix: Typing fixes
tomasfarias 4001daa
feat: Custom schemas for batch exports
tomasfarias 0f3cbe3
feat: Frontend support for model field
tomasfarias fde00d4
fix: Clean-up
tomasfarias ee354cb
fix: Add missing migration
tomasfarias ccab5f6
fix: Make new field nullable
tomasfarias e7b5cb9
Update UI snapshots for `chromium` (1)
github-actions[bot] 6da056a
Update UI snapshots for `chromium` (1)
github-actions[bot] a8b3594
Update UI snapshots for `chromium` (1)
github-actions[bot] 724c208
Update UI snapshots for `chromium` (1)
github-actions[bot] 6b19e95
Update UI snapshots for `chromium` (1)
github-actions[bot] 618d0b8
Update UI snapshots for `chromium` (1)
github-actions[bot] 536cbdb
fix: Bump migration number
tomasfarias 0a9343e
fix: Clean-up unused code
tomasfarias f921bb5
chore: Clean-up unused function and tests
tomasfarias dc9535f
fix: Clean-up unused function
tomasfarias ee3fcb8
fix: HTTP Batch export default fields
tomasfarias c4deddf
fix: Remove test case on new column not present in base table
tomasfarias f2cad91
chore: Clean-up unused functions and queries
tomasfarias 86c86a2
fix: Only run extra clickhouse queries in batch exports tests
tomasfarias 278a067
refactor: Remove coalesce and use only inserted_at in queries
tomasfarias 740188b
fix: Remove deprecated test
tomasfarias 525c682
fix: Add person_id to person model and enforce ordering
tomasfarias 2eabe0c
refactor: Also add version column
tomasfarias
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,131 @@ | ||
CREATE_PERSONS_BATCH_EXPORT_VIEW = """ | ||
CREATE OR REPLACE VIEW persons_batch_export AS ( | ||
SELECT | ||
pd.team_id, | ||
pd.distinct_id, | ||
p.properties, | ||
pd._timestamp AS _timestamp, | ||
NOW64() AS _inserted_at | ||
FROM ( | ||
SELECT | ||
team_id, | ||
distinct_id, | ||
argMax(person_id, version) AS person_id, | ||
max(_timestamp) AS _timestamp | ||
FROM | ||
person_distinct_id2 | ||
WHERE | ||
team_id = {team_id:Int64} | ||
GROUP BY | ||
team_id, | ||
distinct_id | ||
) AS pd | ||
INNER JOIN | ||
person p ON p.id = pd.person_id AND p.team_id = pd.team_id | ||
WHERE | ||
pd.team_id = {team_id:Int64} | ||
AND p.team_id = {team_id:Int64} | ||
AND pd._timestamp >= {interval_start:DateTime64} | ||
AND pd._timestamp < {interval_end:DateTime64} | ||
) | ||
""" | ||
|
||
CREATE_EVENTS_BATCH_EXPORT_VIEW = """ | ||
CREATE OR REPLACE VIEW events_batch_export AS ( | ||
SELECT | ||
team_id AS team_id, | ||
min(timestamp) AS timestamp, | ||
event AS event, | ||
any(distinct_id) AS distinct_id, | ||
any(toString(uuid)) AS uuid, | ||
min(COALESCE(inserted_at, _timestamp)) AS _inserted_at, | ||
any(created_at) AS created_at, | ||
any(elements_chain) AS elements_chain, | ||
any(toString(person_id)) AS person_id, | ||
any(nullIf(properties, '')) AS properties, | ||
any(nullIf(person_properties, '')) AS person_properties, | ||
nullIf(JSONExtractString(properties, '$set'), '') AS set, | ||
nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once | ||
FROM | ||
events | ||
PREWHERE | ||
COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64} | ||
AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64} | ||
WHERE | ||
team_id = {team_id:Int64} | ||
AND events.timestamp >= {interval_start:DateTime64} - INTERVAL {lookback_days:Int32} DAY | ||
AND events.timestamp < {interval_end:DateTime64} + INTERVAL 1 DAY | ||
AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)}) | ||
AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)}) | ||
GROUP BY | ||
team_id, toDate(events.timestamp), event, cityHash64(events.distinct_id), cityHash64(events.uuid) | ||
ORDER BY | ||
_inserted_at, event | ||
SETTINGS optimize_aggregation_in_order=1 | ||
) | ||
""" | ||
|
||
CREATE_EVENTS_BATCH_EXPORT_VIEW_UNBOUNDED = """ | ||
CREATE OR REPLACE VIEW events_batch_export_unbounded AS ( | ||
SELECT | ||
team_id AS team_id, | ||
min(timestamp) AS timestamp, | ||
event AS event, | ||
any(distinct_id) AS distinct_id, | ||
any(toString(uuid)) AS uuid, | ||
min(COALESCE(inserted_at, _timestamp)) AS _inserted_at, | ||
any(created_at) AS created_at, | ||
any(elements_chain) AS elements_chain, | ||
any(toString(person_id)) AS person_id, | ||
any(nullIf(properties, '')) AS properties, | ||
any(nullIf(person_properties, '')) AS person_properties, | ||
nullIf(JSONExtractString(properties, '$set'), '') AS set, | ||
nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once | ||
FROM | ||
events | ||
PREWHERE | ||
COALESCE(events.inserted_at, events._timestamp) >= {interval_start:DateTime64} | ||
AND COALESCE(events.inserted_at, events._timestamp) < {interval_end:DateTime64} | ||
WHERE | ||
team_id = {team_id:Int64} | ||
AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)}) | ||
AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)}) | ||
GROUP BY | ||
team_id, toDate(events.timestamp), event, cityHash64(events.distinct_id), cityHash64(events.uuid) | ||
ORDER BY | ||
_inserted_at, event | ||
SETTINGS optimize_aggregation_in_order=1 | ||
) | ||
""" | ||
|
||
CREATE_EVENTS_BATCH_EXPORT_VIEW_BACKFILL = """ | ||
CREATE OR REPLACE VIEW events_batch_export_backfill AS ( | ||
SELECT | ||
team_id AS team_id, | ||
min(timestamp) AS timestamp, | ||
event AS event, | ||
any(distinct_id) AS distinct_id, | ||
any(toString(uuid)) AS uuid, | ||
min(COALESCE(inserted_at, _timestamp)) AS _inserted_at, | ||
any(created_at) AS created_at, | ||
any(elements_chain) AS elements_chain, | ||
any(toString(person_id)) AS person_id, | ||
any(nullIf(properties, '')) AS properties, | ||
any(nullIf(person_properties, '')) AS person_properties, | ||
nullIf(JSONExtractString(properties, '$set'), '') AS set, | ||
nullIf(JSONExtractString(properties, '$set_once'), '') AS set_once | ||
FROM | ||
events | ||
WHERE | ||
team_id = {team_id:Int64} | ||
AND events.timestamp >= {interval_start:DateTime64} | ||
AND events.timestamp < {interval_end:DateTime64} | ||
AND (length({include_events:Array(String)}) = 0 OR event IN {include_events:Array(String)}) | ||
AND (length({exclude_events:Array(String)}) = 0 OR event NOT IN {exclude_events:Array(String)}) | ||
GROUP BY | ||
team_id, toDate(events.timestamp), event, cityHash64(events.distinct_id), cityHash64(events.uuid) | ||
ORDER BY | ||
_inserted_at, event | ||
SETTINGS optimize_aggregation_in_order=1 | ||
) | ||
""" |
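The three event views above differ mainly in which timestamp bounds they apply. The following is a minimal Python sketch of those row filters, written only to illustrate the predicates in the SQL. The function names and the dictionary row shape are assumptions made for this example, not part of the PR, and the sketch ignores the `team_id` filter, the `GROUP BY` de-duplication, and ordering.

```python
from datetime import datetime, timedelta


def event_name_allowed(event, include_events=(), exclude_events=()):
    """Mirror the include/exclude predicates: an empty list disables that filter."""
    included = len(include_events) == 0 or event in include_events
    not_excluded = len(exclude_events) == 0 or event not in exclude_events
    return included and not_excluded


def in_half_open_interval(value, start, end):
    """Mirror `value >= start AND value < end`."""
    return start <= value < end


def effective_inserted_at(row):
    """Mirror COALESCE(inserted_at, _timestamp)."""
    return row["inserted_at"] if row["inserted_at"] is not None else row["_timestamp"]


def selected_by_default_view(row, interval_start, interval_end, lookback_days,
                             include_events=(), exclude_events=()):
    """Approximate events_batch_export: insertion-time window plus timestamp bounds."""
    return (
        in_half_open_interval(effective_inserted_at(row), interval_start, interval_end)
        and row["timestamp"] >= interval_start - timedelta(days=lookback_days)
        and row["timestamp"] < interval_end + timedelta(days=1)
        and event_name_allowed(row["event"], include_events, exclude_events)
    )


def selected_by_unbounded_view(row, interval_start, interval_end,
                               include_events=(), exclude_events=()):
    """Approximate events_batch_export_unbounded: insertion-time window only."""
    return (
        in_half_open_interval(effective_inserted_at(row), interval_start, interval_end)
        and event_name_allowed(row["event"], include_events, exclude_events)
    )


def selected_by_backfill_view(row, interval_start, interval_end,
                              include_events=(), exclude_events=()):
    """Approximate events_batch_export_backfill: event timestamp window only."""
    return (
        in_half_open_interval(row["timestamp"], interval_start, interval_end)
        and event_name_allowed(row["event"], include_events, exclude_events)
    )


# A late-arriving event: it happened 10 days before the interval but was inserted inside it.
start = datetime(2024, 6, 1, 0, 0)
end = datetime(2024, 6, 1, 1, 0)
late_row = {
    "event": "$pageview",
    "timestamp": start - timedelta(days=10),
    "inserted_at": start + timedelta(minutes=5),
    "_timestamp": start + timedelta(minutes=5),
}

print(selected_by_default_view(late_row, start, end, lookback_days=7))  # False
print(selected_by_unbounded_view(late_row, start, end))                 # True
print(selected_by_backfill_view(late_row, start, end))                  # False
```

Under these assumptions, a row inserted inside the interval but with an event timestamp older than the lookback window is skipped by the default view, picked up by the unbounded view, and skipped by the backfill view, which looks only at the event timestamp.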
17 changes: 17 additions & 0 deletions
posthog/clickhouse/migrations/0064_create_person_batch_export_view.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
from posthog.batch_exports.sql import ( | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW, | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW_BACKFILL, | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW_UNBOUNDED, | ||
CREATE_PERSONS_BATCH_EXPORT_VIEW, | ||
) | ||
from posthog.clickhouse.client.migration_tools import run_sql_with_exceptions | ||
|
||
operations = map( | ||
run_sql_with_exceptions, | ||
[ | ||
CREATE_PERSONS_BATCH_EXPORT_VIEW, | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW, | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW_UNBOUNDED, | ||
CREATE_EVENTS_BATCH_EXPORT_VIEW_BACKFILL, | ||
], | ||
) |
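One detail worth keeping in mind when reading this migration: in Python 3, `map` returns a lazy, single-pass iterator rather than a list. The sketch below shows that behaviour with a hypothetical stand-in for `run_sql_with_exceptions` (the real helper's return value is not shown in this diff). Whether it matters here depends on how the migration runner consumes `operations`, which this page does not show.

```python
def make_operation(sql):
    # Hypothetical stand-in for run_sql_with_exceptions; it only labels the statement.
    return f"operation for: {sql}"


statements = ["SELECT 1", "SELECT 2"]

# Lazy: nothing is built until the iterator is consumed, and it can be consumed only once.
lazy_operations = map(make_operation, statements)
first_pass = list(lazy_operations)
second_pass = list(lazy_operations)  # the iterator is already exhausted

# Eager alternative: a list can be iterated any number of times.
eager_operations = [make_operation(sql) for sql in statements]

print(first_pass)             # ['operation for: SELECT 1', 'operation for: SELECT 2']
print(second_pass)            # []
print(len(eager_operations))  # 2
```

If the runner iterates `operations` exactly once, the `map` form and a list behave the same way, and the operations run in the order given.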
Every change on this file is just cleaning up.