BigQuery metadata ingestion failure for ADC with multiple project_ids #12327

Open
jackson-burke opened this issue Jan 13, 2025 · 0 comments
Labels
bug Bug report

Describe the bug
I'm unable to ingest metadata from BigQuery using Application Default Credentials (ADC), possibly because the credentials are associated with multiple BigQuery projects.

With the following recipe:

source:
  type: "bigquery"
  config:
    column_limit: 10000
    extract_column_lineage: true
sink:
  type: "datahub-rest"
  config:
    server: ${DATAHUB_GMS_URL}
    token: ${DATAHUB_GMS_TOKEN}

I receive this error: Failed to configure the source (bigquery): Project was not passed and could not be determined from the environment.

I could previously ingest successfully by setting the credentials explicitly, as in the recipe below, but I would prefer to use ADC.

source:
  type: "bigquery"
  config:
    credential:
      type: "service_account"
      project_id: ${DATAHUB_BIGQUERY_SA_PROJECT_ID}
      private_key_id: ${DATAHUB_BIGQUERY_SA_PRIVATE_KEY_ID}
      private_key: ${DATAHUB_BIGQUERY_SA_PRIVATE_KEY}
      client_email: ${DATAHUB_BIGQUERY_SA_CLIENT_EMAIL}
      client_id: ${DATAHUB_BIGQUERY_SA_CLIENT_ID}
      auth_uri: ${DATAHUB_BIGQUERY_SA_AUTH_URI}
      token_uri: ${DATAHUB_BIGQUERY_SA_TOKEN_URI}
      auth_provider_x509_cert_url: ${DATAHUB_BIGQUERY_SA_AUTH_PROVIDER_X509_CERT_URL}
      client_x509_cert_url: ${DATAHUB_BIGQUERY_SA_CLIENT_X509_CERT_URL}
    column_limit: 10000
    extract_column_lineage: true
sink:
  type: "datahub-rest"
  config:
    server: ${DATAHUB_GMS_URL}
    token: ${DATAHUB_GMS_TOKEN}

To Reproduce
Steps to reproduce the behavior:

  1. Configure a similar recipe for a GCP profile that has multiple projects associated with it.
  2. Attempt to ingest metadata from those projects.

Expected behavior
I would expect ingestion to cover all projects associated with the credentials, or only the ones listed in the project_ids config when it is set.

Desktop (please complete the following information):

  • OS: iOS
  • Browser: Chrome
  • Version: 14.1

Additional context

  • The ADC path is set correctly (echo $GOOGLE_APPLICATION_CREDENTIALS outputs the correct path), and gcloud projects list shows the relevant projects.
  • I tried adding project_ids = ["target_id"] to the config, but I receive the same error.
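
For triage, it may help to check whether the environment carries a default project at all, since the error text says no project could be determined. The sketch below is not DataHub or Google client code. It only inspects two places a default project can commonly come from: the GOOGLE_CLOUD_PROJECT environment variable and a project_id field in the JSON file that GOOGLE_APPLICATION_CREDENTIALS points to. The function names are hypothetical, and other sources (such as the gcloud config) are deliberately not covered. As far as I know, service account key files include project_id, while user credential files of type authorized_user usually do not.

```python
import json
import os
from typing import Mapping


def describe_adc_project_hints(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect two common sources of a default GCP project (hypothetical helper)."""
    hints = {
        "GOOGLE_CLOUD_PROJECT": environ.get("GOOGLE_CLOUD_PROJECT"),
        "credentials_path": environ.get("GOOGLE_APPLICATION_CREDENTIALS"),
        "credentials_type": None,
        "credentials_project_id": None,
    }
    path = hints["credentials_path"]
    # Only read the file if the variable is set and points at a real file.
    if path and os.path.isfile(path):
        with open(path, encoding="utf-8") as fh:
            info = json.load(fh)
        hints["credentials_type"] = info.get("type")
        hints["credentials_project_id"] = info.get("project_id")
    return hints


def has_default_project(hints: dict) -> bool:
    """True if either inspected source names a project."""
    return bool(hints["GOOGLE_CLOUD_PROJECT"] or hints["credentials_project_id"])


if __name__ == "__main__":
    found = describe_adc_project_hints()
    for key, value in found.items():
        print(f"{key}: {value}")
    print("default project found in inspected sources:", has_default_project(found))
```

If both inspected sources come back empty, exporting GOOGLE_CLOUD_PROJECT before running the ingestion is one thing to try; whether that resolves this particular failure is an assumption, not something confirmed here.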
jackson-burke added the bug (Bug report) label on Jan 13, 2025