upgrade airbyte to destinations v2 #73

Closed
15 of 17 tasks
fatchat opened this issue Feb 27, 2024 · 16 comments
fatchat commented Feb 27, 2024

Destinations v2 needs Airbyte 0.50.24 (https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2)

We are deferring the upgrade to 0.50.40 (#70), which is a bigger jump

We will

  1. upgrade from Airbyte 0.50.21 to 0.50.24
  2. upgrade the BQ destination connector to 1.9.0, which supports destinations v2 as an option; this upgrades the connector across all workspaces
  3. verify that Dalgo works with BQ 1.9.0 with destinations v2 both disabled and enabled, i.e. that connections can be created (RAW, not NORMALIZED) and synced

BigQuery clients:

  • Antarang
  • Agency Fund
  • TAP
  • INREM
  • KEF
  • Noora
  • T4D BQ

For Postgres:

  1. upgrade the PG destination connector to 0.6.3, which supports destinations v2 as an option
  2. verify that Dalgo works with the upgraded connector

Postgres clients:

  • SNEHA
  • STiR
  • SHRI
  • Janaagraha
  • LAHI
  • Arghyam
  • superset_usage
  • Dalgo Demo
  • Tech4Dev
  • ATCEF

Target for completion: March 17
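The connector-version bumps above can be scripted against the Airbyte OSS config API rather than clicked through the UI. A minimal sketch, assuming a local server on port 8000 and the pre-1.0 `destination_definitions/update` endpoint; the definition UUID is a placeholder you would look up first:

```python
import json
from urllib import request

AIRBYTE_API = "http://localhost:8000/api/v1"  # assumption: local Airbyte OSS deployment

def build_upgrade_payload(definition_id: str, image_tag: str) -> dict:
    # Body for POST /destination_definitions/update in the Airbyte OSS config API
    return {"destinationDefinitionId": definition_id, "dockerImageTag": image_tag}

def upgrade_destination_definition(definition_id: str, image_tag: str) -> None:
    # NOTE: this bumps the connector for every workspace on the instance at once
    body = json.dumps(build_upgrade_payload(definition_id, image_tag)).encode()
    req = request.Request(
        f"{AIRBYTE_API}/destination_definitions/update",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)  # raises urllib.error.HTTPError on failure

# usage (UUID is hypothetical):
# upgrade_destination_definition("22f6c74f-0000-0000-0000-000000000000", "1.9.0")
```

The instance-wide effect is why a bump in one workspace shows up everywhere, as noted below.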

@fatchat fatchat added this to Dalgo Feb 27, 2024
@fatchat fatchat converted this from a draft issue Feb 27, 2024
@fatchat fatchat self-assigned this Feb 27, 2024
@fatchat fatchat moved this to In Progress in Dalgo Feb 27, 2024

fatchat commented Feb 29, 2024

On the test machine:

  1. set up an org with BQ 1.9.0
  2. set up another org with BQ 1.9.0 with the destinations v2 flag enabled
  3. sync org 1
  4. sync org 2
  5. sync org 1 again

Results:

  • all syncs worked as expected
  • Note that the Airbyte destination connector was not upgraded: 1.9.0 already supports destinations v2


fatchat commented Feb 29, 2024

NOTE: upgrading the BQ connector version in one workspace upgrades it in other workspaces as well!


fatchat commented Mar 1, 2024

Airbyte has been upgraded to 0.50.24 in production
The BQ connector has been upgraded to 1.9.0
Destinations v2 has been enabled for Noora Health
Verified that this did not enable dest-v2 for Antarang


fatchat commented Mar 1, 2024

Antarang's pipeline ran successfully (sync + dbt)


fatchat commented Mar 1, 2024

| Org | Normalization | Who | Emailed? | Migrated? |
| --- | --- | --- | --- | --- |
| Antarang | Yes | GK | x | |
| Agency Fund | No | Linus | x | x |
| TAP | Yes | Ishan | x | |
| INREM | n/a | Rohit | x | |
| KEF | Yes | GK | x | |
| Noora | n/a | Rohit | x | x |
| T4D BQ | No | Rohit | x | |
| SNEHA | No | Abhishek K | x | |
| SHRI | Facility Cost Sheet | Sid | x | |
| STiR | Yes | Sid | x | |
| LAHI | Some | Abhishek N | x | |
| Janaagraha | No | Thomas | x | |
| superset_usage | Yes | Ishan | x | |


fatchat commented Mar 1, 2024

The Airbyte Postgres destination connector 0.6.0 supports Destinations v2 via a switch:

[image attached]


fatchat commented Mar 1, 2024

Postgres

Created a Dalgo workspace using the Postgres destination connector v0.4.0
Synced and verified
Upgraded the connector to 0.6.0
Created a new org, and enabled destinations v2 on the warehouse
Synced and encountered... a bug?

ERROR: relation "airbyte_internal.destinations_v2_raw__stream_Sheet2" does not exist

The warehouse contained airbyte_internal.destinations_v2_raw__stream_sheet2 instead
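The mismatch looks like Postgres identifier case-folding: unquoted identifiers fold to lowercase, so a table created unquoted as `..._sheet2` will never match a quoted lookup for `..._Sheet2`. A minimal sketch of the folding rule (illustrative only, not the connector's actual code):

```python
def pg_fold(identifier: str) -> str:
    """Postgres folds unquoted identifiers to lowercase; quoted ones keep their case."""
    if identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]   # quoted: case preserved
    return identifier.lower()     # unquoted: folded to lowercase

# Created unquoted, the raw table lands as ..._sheet2:
assert pg_fold("destinations_v2_raw__stream_Sheet2") == "destinations_v2_raw__stream_sheet2"
# A quoted lookup preserves the capital S and therefore misses the table:
assert pg_fold('"destinations_v2_raw__stream_Sheet2"') == "destinations_v2_raw__stream_Sheet2"
```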


fatchat commented Mar 1, 2024

Upgraded to 0.6.3 and the sync worked on the new org ... phew

Re-synced the first org (now also on 0.6.3 but with destinations v2 disabled) and this also worked

This means that after upgrading the Airbyte Postgres destination connector to 0.6.3, existing orgs' connections will continue to sync without interruption


fatchat commented Mar 1, 2024

Upgraded Postgres destination connector in production from 0.3.27 to 0.6.3

Ran the sync for superset_usage:sneha and Airbyte reported no errors from either the sync or the normalization step

The connector did report a warning though:

The main thread is exiting while children non-daemon threads from a connector are still active. Ideally, this situation should not happen... Please check with maintainers if the connector or library code should safely clean up its threads before quitting instead.


fatchat commented Mar 2, 2024

Postgres and BQ clients' pipelines ran successfully last night


fatchat commented Mar 4, 2024

For upgrading connections for clients for whom we currently use dbt to normalize (e.g. SHRI, LAHI):

  • Replace occurrences of _airbyte_ab_id with _airbyte_raw_id
  • Replace jsonb_object_keys({{ json_column }}->'data') with jsonb_object_keys('data')
  • Remove calls to flatten_json
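These edits can be scripted across a dbt project's model files. A hedged sketch: `migrate_project` is a name of my own, and it assumes `flatten_json` calls sit on their own lines:

```python
import re
from pathlib import Path

def migrate_model_sql(sql: str) -> str:
    """Apply the three destinations-v2 edits above to one dbt model's SQL."""
    sql = sql.replace("_airbyte_ab_id", "_airbyte_raw_id")
    sql = re.sub(
        r"jsonb_object_keys\(\{\{\s*json_column\s*\}\}->'data'\)",
        "jsonb_object_keys('data')",
        sql,
    )
    # drop lines invoking the flatten_json macro (assumption: one call per line)
    return "\n".join(line for line in sql.splitlines() if "flatten_json" not in line)

def migrate_project(models_dir: str) -> None:
    # rewrite every model file under the dbt project's models/ directory in place
    for path in Path(models_dir).rglob("*.sql"):
        path.write_text(migrate_model_sql(path.read_text()))
```

Run it on a branch and diff the result before merging, since the line-drop heuristic for flatten_json is blunt.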


fatchat commented May 18, 2024

SNEHA is ready to upgrade, running the migration to destinations v2 locally

  1. open airbyte connection
  2. turn on "use destinations v2"
  3. click "test and save"

schemas before

  1. intermediate
  2. prod
  3. public
  4. staging
  5. staging_2021

from airbyte's documentation linked above:

After upgrading the out-of-date destination to a [Destinations V2 compatible version](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#destinations-v2-effective-versions), the following will occur at the next sync for each connection sending data to the updated destination:

1. Existing raw tables replicated to this destination will be copied to a new airbyte_internal schema.
2. The new raw tables will be updated to the new Destinations V2 format.
3. The new raw tables will be updated with any new data since the last sync, like normal.
4. The new raw tables will be typed and de-duplicated according to the Destinations V2 format.
5. Once typing and de-duplication has completed successfully, your previous final table will be replaced with the updated data.


fatchat commented May 18, 2024

from airbyte's logs:
Assessing whether migration is necessary for stream zzz_case
Checking whether v1 raw table _airbyte_raw_zzz_case in dataset staging_2021 exist
Migration Info: Required for Sync mode: true, No existing v2 raw tables: true, A v1 raw table exists: true
Starting v2 Migration for stream zzz_case
create airbyte_internal schema, drop table, create table

The new schema exists, with one table staging_2021_raw__stream_zzz_case (i.e. <schema>_raw__stream_<table>) having columns:
_airbyte_raw_id, _airbyte_extracted_at, _airbyte_loaded_at, _airbyte_data
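The naming convention observed in the log can be captured in a small helper, useful when checking migrated streams in bulk. This is inferred from the log above, not taken from the connector's code:

```python
RAW_SCHEMA = "airbyte_internal"
RAW_COLUMNS = ("_airbyte_raw_id", "_airbyte_extracted_at", "_airbyte_loaded_at", "_airbyte_data")

def v2_raw_table(schema: str, stream: str) -> str:
    """Qualified v2 raw-table name: airbyte_internal.<schema>_raw__stream_<table>."""
    return f"{RAW_SCHEMA}.{schema}_raw__stream_{stream}"

print(v2_raw_table("staging_2021", "zzz_case"))
# airbyte_internal.staging_2021_raw__stream_zzz_case
```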



fatchat commented May 18, 2024

one hour to migrate



fatchat commented May 18, 2024

SNEHA migration completed on production
dbt ran successfully
notified AK with a request to check the dashboards
dbt still on branch 193



fatchat commented May 25, 2024

dashboards fine all week
dbt was moved over to main as well


@fatchat fatchat closed this as completed May 25, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Dalgo May 25, 2024