-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Speed up filtering persons #25604
Conversation
📸 UI snapshots have been updated2 snapshot changes in total. 0 added, 2 modified, 0 deleted:
Triggered by this commit. |
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
📸 UI snapshots have been updated2 snapshot changes in total. 0 added, 2 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated3 snapshot changes in total. 0 added, 3 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated3 snapshot changes in total. 0 added, 3 modified, 0 deleted:
Triggered by this commit. |
Suspect IssuesThis pull request was deployed and Sentry observed the following issues:
Did you find this useful? React with a 👍 or 👎 |
* master: (100 commits) revert: perf: Speed up filtering persons (#25715) chore: count data ingested by source (#25714) fix(bi): Dont try to be too clever with variables (#25712) fix(cdp): Ask for kinesis stream name, not the ARN (#25711) fix(batch-exports): Deal with deeply nested events (#25710) fix(batch-exports): Ingest as JSON in Redshift super type (#25709) perf: Speed up filtering persons (#25604) feat(data-warehouse): Add button to cancel batch export backfill run (#25706) feat(cdp): add discord template (#25649) feat(max): Show query summary (#25696) feat(errors): Use exception list and remove stack trace raw (#25655) fix: smaller cohort resource usage for replay filters (#25699) chore: use meta columns in data visualization table (#25697) feat(cdp): add klaviyo templates (#25693) feat(max): Answer feedback buttons (#25670) feat(cdp): add google ads integration warning (#25698) feat(hog): print ast in console (#25608) feat(data-warehouse): edit SQL based import configs (#25685) fix(insights): handle negation property in cohort property filters (#25691) fix: warming and filter (#25674) ...
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Tomás Farías Santana <[email protected]> Co-authored-by: Marius Andra <[email protected]> Co-authored-by: Georgiy Tarasov <[email protected]> Co-authored-by: David Newell <[email protected]> Co-authored-by: PostHog Bot <[email protected]> Co-authored-by: Richard Borcsik <[email protected]> Co-authored-by: Michael Matloka <[email protected]> Co-authored-by: Eric Duong <[email protected]> Co-authored-by: Paul D'Ambra <[email protected]> Co-authored-by: Juraj Majerik <[email protected]> Co-authored-by: Phani Raj <[email protected]> Co-authored-by: Tom Owers <[email protected]> Co-authored-by: Dylan Martin <[email protected]> Co-authored-by: Oliver Browne <[email protected]> Co-authored-by: Marcus Hof <[email protected]> Co-authored-by: Michael Matloka <[email protected]> Co-authored-by: Anirudh Pillai <[email protected]> Co-authored-by: Sandy Spicer <[email protected]> Co-authored-by: Neil Kakkar <[email protected]>
Problem
Currently, we first group/argmax every single person, then we filter that result set down. It'd be much faster to filter first, then group by.
Changes
Filter first, then group by, then filter again to make sure the results are still correct.
The resulting query is a bit ugly (basically 3 sub queries) but it takes the filtering of the persons page for researchgate down from 80-90 seconds to 2 seconds!
Also removed the personsArgMaxVersion and optimizeJoinedFilters modifiers as those were making things more complicated and weren't relevant anymore.
👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Does this work well for both Cloud and self-hosted?
How did you test this code?