Fix BQ connection parsing with dataset info (#367)
## Description

Cosmos `0.7.4` does not correctly parse BigQuery connections that
include a dataset. It expects `dataset` as a top-level parameter in the
connection JSON.

However, according to the apache-airflow-providers-google documentation
[here](https://airflow.apache.org/docs/apache-airflow-providers-google/stable/connections/gcp.html#configuring-the-connection),
`dataset` cannot appear as a top-level parameter in the connection JSON.

This PR updates the profile mapping in Cosmos so that it reads the
`dataset` parameter from the `extra` field of the connection.
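
As a rough illustration of why the old mapping found no dataset, the sketch below resolves both mappings against a connection whose `extra` JSON carries the dataset. The dict-based `connection` and the `resolve_param` helper are hypothetical stand-ins written for this example; they are not Cosmos's or Airflow's actual implementation.

```python
import json

# Hypothetical stand-in for an Airflow connection: top-level fields plus a
# JSON-encoded `extra` string (assumed shape, for illustration only).
connection = {
    "conn_id": "my_bigquery_connection",
    "conn_type": "google_cloud_platform",
    "extra": json.dumps(
        {
            "project": "my_project",
            "dataset": "my_dataset",
            "key_path": "/path/to/key.json",
        }
    ),
}


def resolve_param(conn: dict, path: str):
    """Resolve a mapping value such as "dataset" or "extra.dataset".

    Illustrative helper only; not the library's real lookup code.
    """
    if path.startswith("extra."):
        extra = json.loads(conn.get("extra") or "{}")
        return extra.get(path[len("extra."):])
    return conn.get(path)


# Mapping before and after this PR, copied from the diff.
old_mapping = {"project": "extra.project", "dataset": "dataset", "keyfile": "extra.key_path"}
new_mapping = {"project": "extra.project", "dataset": "extra.dataset", "keyfile": "extra.key_path"}

old_profile = {name: resolve_param(connection, path) for name, path in old_mapping.items()}
new_profile = {name: resolve_param(connection, path) for name, path in new_mapping.items()}

print(old_profile["dataset"])  # None: there is no top-level "dataset" field
print(new_profile["dataset"])  # my_dataset
```

Under these assumptions, the old mapping leaves `dataset` unresolved even though the connection defines it, which is consistent with the behaviour reported in the linked issue.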

## Related Issue(s)
closes #365 

## Breaking Change?

None that I am aware of

## Checklist

- [ ] I have made corresponding changes to the documentation (if
required)
- [X] I have added tests that prove my fix is effective or that my
feature works

Co-authored-by: Monideep De <[email protected]>
MonideepDe and Monideep De authored Jul 17, 2023
1 parent c4e8dcd commit 387615f
Showing 2 changed files with 11 additions and 1 deletion.
2 changes: 1 addition & 1 deletion cosmos/profiles/bigquery/service_account_file.py
@@ -24,7 +24,7 @@ class GoogleCloudServiceAccountFileProfileMapping(BaseProfileMapping):

     airflow_param_mapping = {
         "project": "extra.project",
-        "dataset": "dataset",
+        "dataset": "extra.dataset",
         "keyfile": "extra.key_path",
     }

10 changes: 10 additions & 0 deletions tests/profiles/bigquery/test_bq_service_account_file.py
@@ -70,6 +70,16 @@ def test_connection_claiming() -> None:
         profile_mapping = GoogleCloudServiceAccountFileProfileMapping(conn, {})
         assert not profile_mapping.can_claim_connection()

+    # if we have dataset specified in extra, it should claim
+    dataset_dict = {"dataset": "my_dataset"}
+    conn = Connection(
+        conn_id="my_bigquery_connection",
+        conn_type="google_cloud_platform",
+        extra=json.dumps({**extra, **dataset_dict}),
+    )
+    profile_mapping = GoogleCloudServiceAccountFileProfileMapping(conn, {})
+    assert profile_mapping.can_claim_connection()
+
     # if we have them all, it should claim
     conn = Connection(
         conn_id="my_bigquery_connection",