-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(data-warehouse): Add created_at
to BigQuery persons batch export
#26486
Conversation
created_at
to persons batch export
created_at
to persons batch exportcreated_at
to persons batch export
ac356bf
to
c9da9c8
Compare
Size Change: 0 B Total Size: 1.16 MB ℹ️ View Unchanged
|
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
created_at
to persons batch exportcreated_at
to BigQuery persons batch export
e9dcdf3
to
2ed487a
Compare
exists_ok: bool = True, | ||
not_found_ok: bool = True, | ||
delete: bool = True, | ||
create: bool = True, | ||
) -> collections.abc.AsyncGenerator[bigquery.Table, None]: | ||
"""Manage a table in BigQuery by ensure it exists while in context.""" | ||
if create is True: | ||
assert table_schema is not None, "Table schema is required when creating a table" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions are removed by optimized Python, I'm wondering if we would prefer to still keep the check by instead raising a ValueError("Table schema is required when creating a table")
or other non-retryable error here.
Do we ever expect schema to be None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions are removed by optimized Python
ah right, I wasn't aware of this, thanks!
Do we ever expect schema to be
None
?
To be honest, we're always called managed_table
with a schema so we can probably remove this check
@@ -450,6 +472,109 @@ async def test_insert_into_bigquery_activity_merges_data_in_follow_up_runs( | |||
) | |||
|
|||
|
|||
async def test_insert_into_bigquery_activity_handles_person_schema_changes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The migration will need to be bumped to 0093 (or later) as we'll merge my #26477 first. But otherwise PR looks good!
c6703a3
to
e43bfa5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Problem
We don't currently send the
created_at
field in the persons batch export. Here we add it for the BigQuery destination (others to follow).Changes
created_at
columncreated_at
column will only be used when creating new tables, if the export uses a merge then it will be ignored in order to preserve the current table schema. If the user adds the column to their BigQuery table manually, the data will be included in subsequent batch exports.Questions
created_at
column not existing yet?Post-deployment
Does this work well for both Cloud and self-hosted?
Should do
How did you test this code?
Added unit test to cover both newly created tables and existing ones.