feat: Support multiple models in S3 batch export #23105
Err, ignore the last few test failures; they seem to be on GitHub's side. I'll retry the tests.
Hmm, not sure why the hogql_query tests are failing; I don't see any changes of mine around that part of the codebase. I'll investigate tomorrow.
lgtm, nice pattern for extending beyond persons
docker-compose.dev.yml (outdated)
extra_hosts:
  - 'host.docker.internal:host-gateway'
Removed this, as it was failing my builds locally. It shouldn't have been pushed, though.
Hey @fuziontech! You got tagged for review as I rebased this PR and commits from the previous PR you've reviewed popped up in this PR's git history. Feel free to ignore, and apologies for the spam. Unless you are curious, then obviously comments are welcome!
This commit introduces a new `batch_export_model` input for all batch export destinations. This input is of type `BatchExportModel` which can be used to indicate which model a batch export is supposed to target. This opens the door to creating "persons" model exports, and eventually extending this further with more changes. The change was done in a backwards compatible way, so `batch_export_schema` was not removed, and existing export configurations should continue to work (as evidenced by unit tests passing without changes). However, moving forward all batch exports will be created with `batch_export_schema` set to `None` and `batch_export_model` defined in its place. After updating existing exports, `batch_export_schema` and any other code associated with backwards compatibility can be deleted.
* refactor: Update metrics to fetch counts at request time
* fix: Move import to method
* fix: Add function
* feat: Custom schemas for batch exports
* feat: Frontend support for model field
* fix: Clean-up
* fix: Add missing migration
* fix: Make new field nullable
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* fix: Bump migration number
* fix: Bump migration number
* refactor: Update metrics to fetch counts at request time
* refactor: Switch to counting runs
* refactor: Support batch export models as views
* fix: Merge conflict
* fix: Quality check fixes
* refactor: Update metrics to fetch counts at request time
* fix: Move import to method
* fix: Add function
* fix: Typing fixes
* feat: Custom schemas for batch exports
* feat: Frontend support for model field
* fix: Clean-up
* fix: Add missing migration
* fix: Make new field nullable
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* Update UI snapshots for `chromium` (1)
* fix: Bump migration number
* fix: Clean-up unused code
* fix: Clean-up unused function
* fix: Only run extra clickhouse queries in batch exports tests
* feat: Support multiple models in all batch export destinations

  This commit introduces a new `batch_export_model` input for all batch export destinations. This input is of type `BatchExportModel` which can be used to indicate which model a batch export is supposed to target. This opens the door to creating "persons" model exports, and eventually extending this further with more changes.

  The change was done in a backwards compatible way, so `batch_export_schema` was not removed, and existing export configurations should continue to work (as evidenced by unit tests passing without changes). However, moving forward all batch exports will be created with `batch_export_schema` set to `None` and `batch_export_model` defined in its place. After updating existing exports, `batch_export_schema` and any other code associated with backwards compatibility can be deleted.
* fix: Add additional type hints and update tests
* chore: Add test utilities for multi-model batch exports
* fix: Pass properties included as keyword arguments
* fix: Support custom key prefixes for multi-model
* feat: Unit testing for S3 batch exports with multi-model support
* fix: Add and correct type hints
* fix: Typo in parameter
* fix: API tests now work with model parameter
* fix: Re-add hosts to docker-compose
* fix: Use UTC timezone alias
* fix: Add missing test column
* revert: Python 3.11 alias

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Problem
Users wish to export person information to allow identifying unique persons in their event batch exports.
Changes
This commit introduces a new `batch_export_model` input for all batch export destinations. This input is of type `BatchExportModel`, which can be used to indicate which model a batch export is supposed to target. This opens the door to creating "persons" model exports, and eventually extending this further with more changes.

The change was done in a backwards compatible way, so `batch_export_schema` was not removed, and existing export configurations should continue to work (as evidenced by unit tests passing without changes).

However, moving forward all batch exports will be created with `batch_export_schema` set to `None` and `batch_export_model` defined in its place.

After updating existing exports, `batch_export_schema` and any other code associated with backwards compatibility can be deleted.
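To make the backwards-compatible behaviour concrete, here is a minimal sketch of how a destination might pick between the two inputs. It is not the code from this PR: the fields on `BatchExportModel` (`name`, `schema`), the helper `resolve_model`, and the assumption that legacy configurations target the "events" model are all hypothetical and used only for illustration.

```python
from __future__ import annotations

import dataclasses
from typing import Any


@dataclasses.dataclass
class BatchExportModel:
    """Hypothetical shape: the model to export plus an optional custom schema."""

    name: str
    schema: dict[str, Any] | None = None


def resolve_model(
    batch_export_model: BatchExportModel | None,
    batch_export_schema: dict[str, Any] | None,
) -> tuple[str, dict[str, Any] | None]:
    """Return the (model name, schema) pair an export should use.

    New-style configurations set `batch_export_model` and leave
    `batch_export_schema` as None. Legacy configurations only set
    `batch_export_schema`; this sketch assumes they export events.
    """
    if batch_export_model is not None:
        # New path: the model input takes precedence over the legacy schema.
        return batch_export_model.name, batch_export_model.schema
    # Legacy path, kept for backwards compatibility (assumption: events model).
    return "events", batch_export_schema
```

With this shape, a new export would pass something like `BatchExportModel(name="persons")` and `None` for the schema, while an existing export keeps passing only its `batch_export_schema` and continues to behave as before.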
Does this work well for both Cloud and self-hosted?
How did you test this code?
Updated all S3 unit tests to guarantee person batch export support. Follow-up PRs will work on other destinations.
This required generating person data and checking that it was successfully exported. To do this, new utilities were added under `temporal/tests/utils/persons.py`; they work similarly to the event utilities in `temporal/tests/utils/event.py`.
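As a rough illustration of what such a person-data utility could look like, the sketch below generates deterministic test persons and checks them against exported rows. The function names (`generate_test_persons`, `assert_persons_exported`) and the row fields are hypothetical and are not taken from `temporal/tests/utils/persons.py`.

```python
import datetime as dt
import random
import uuid


def generate_test_persons(count, team_id, seed=0):
    """Build `count` fake person rows for one team (hypothetical fields)."""
    rng = random.Random(seed)  # seeded so repeated test runs see the same data
    base = dt.datetime(2024, 1, 1, tzinfo=dt.timezone.utc)
    return [
        {
            "id": str(uuid.UUID(int=rng.getrandbits(128))),
            "team_id": team_id,
            "distinct_id": f"distinct-id-{i}",
            "properties": {"utm_medium": "referral"},
            "created_at": (base + dt.timedelta(minutes=i)).isoformat(),
        }
        for i in range(count)
    ]


def assert_persons_exported(expected, exported):
    """Check that every generated person appears, unchanged, in the export."""
    exported_by_id = {row["id"]: row for row in exported}
    for person in expected:
        assert person["id"] in exported_by_id, f"missing person {person['id']}"
        assert exported_by_id[person["id"]] == person
```

A test would insert the generated rows into the source database, run the export, read the exported file back, and pass both lists to the check function.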
Also, a bug was fixed in the `test_s3_export_workflow_with_minio_bucket_and_custom_key_prefix` test, as it wasn't properly using custom keys.