
Bump common-streams to 0.6.0 #374

Merged
merged 4 commits into from
May 9, 2024
Conversation

istreeter and others added 4 commits May 9, 2024 14:42
We already handle one type of exception for the too-many-columns case,
with a message like:

```Too many columns (10067) in schema. Only 10000 columns are allowed```

It turns out BigQuery can fail with a different message in the same
case, such as:

```Too many total leaf fields: 10001, max allowed field count: 10000```

so we have to handle that one as well.
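
For illustration, here is a minimal Scala sketch of detecting both messages. The regex patterns mirror the two messages quoted above; the object and method names are hypothetical, not the loader's actual code.

```scala
// Hypothetical sketch: classify a BigQuery failure message as the
// "too many columns" case, covering both known message variants.
object TooManyColumnsDetector {

  private val tooManyColumns =
    "Too many columns \\((\\d+)\\) in schema\\. Only (\\d+) columns are allowed".r

  private val tooManyLeafFields =
    "Too many total leaf fields: (\\d+), max allowed field count: (\\d+)".r

  def isTooManyColumns(message: String): Boolean =
    tooManyColumns.findFirstIn(message).isDefined ||
      tooManyLeafFields.findFirstIn(message).isDefined
}
```
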
After the loader alters the table to add new columns, it immediately
opens a new Writer and expects the Writer to be aware of the new
columns. However, we have found the Writer might get opened with no
awareness of the newly added columns, presumably because of the async
nature of BigQuery's architecture.

This fix works by retrying opening the Writer until it eventually gets
opened with awareness of the new columns.
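
A minimal sketch of the retry idea, assuming a hypothetical `Writer` type, an `open` function, and a simple exponential backoff; the loader's real types and retry policy will differ.

```scala
import scala.annotation.tailrec

object WriterRetry {

  // Placeholder for the real Writer type; `fields` stands for the
  // columns the Writer knows about at the moment it is opened.
  final case class Writer(fields: Set[String])

  // Re-open the Writer until it reflects the freshly added columns,
  // or until we run out of attempts.
  @tailrec
  def openWithNewColumns(
      open: () => Writer,     // opens a fresh Writer against the table
      required: Set[String],  // columns we just added via ALTER TABLE
      attemptsLeft: Int,
      backoffMillis: Long
  ): Writer = {
    val writer = open()
    if (required.subsetOf(writer.fields) || attemptsLeft <= 0) writer
    else {
      Thread.sleep(backoffMillis) // give BigQuery time to propagate the schema change
      openWithNewColumns(open, required, attemptsLeft - 1, backoffMillis * 2)
    }
  }
}
```
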
When the schema evolves, e.g. when a new nested field is added to an
entity, the loader should explicitly try to alter the underlying
BigQuery schema. This works correctly for self-describing events
(unstruct columns), but it turns out that in the case of contexts the
loader does not modify the schema: the alter is skipped, which results
in bad data. This commit fixes the problem, as sketched below.
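
A minimal sketch of the underlying idea, with hypothetical `Field` and `missingFields` names: the comparison that decides whether an ALTER TABLE is needed must recurse into nested fields for contexts columns too, not just for unstruct columns.

```scala
object SchemaDiff {

  // Hypothetical simplified model of a (possibly nested) BigQuery field.
  final case class Field(name: String, subFields: List[Field] = Nil)

  // Returns the fields present in `incoming` but absent from `table`,
  // descending into nested fields so that a new leaf inside an existing
  // column is still detected.
  def missingFields(table: List[Field], incoming: List[Field]): List[Field] = {
    val byName = table.map(f => f.name -> f).toMap
    incoming.flatMap { in =>
      byName.get(in.name) match {
        case None => List(in) // the whole column is new
        case Some(existing) =>
          val nested = missingFields(existing.subFields, in.subFields)
          if (nested.isEmpty) Nil else List(in.copy(subFields = nested))
      }
    }
  }
}
```

In terms of this sketch, the reported bug is that the nested comparison was effectively skipped for contexts columns, so a new nested field produced an empty diff and the alter never ran.
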
@pondzix pondzix merged commit c1951fe into v2 May 9, 2024
2 checks passed