
Allow a use_query_cache override directly in create_engine or execution_options #1112

Open
gallodustin opened this issue Aug 30, 2024 · 0 comments
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@gallodustin

Is your feature request related to a problem? Please describe.
If I want to execute a particular query that bypasses the BQ query cache, I have to instantiate a new client object, pass it a QueryJobConfig with use_query_cache=False, and then pass that client into create_engine, like below:

import sqlalchemy
from google.cloud import bigquery

job_config = bigquery.QueryJobConfig(use_query_cache=False)
...
bq_client = bigquery.Client(
    default_query_job_config=job_config,
    ...
)

create_engine_kwargs = {}
create_engine_kwargs["connect_args"] = {"client": bq_client}
...
bq_engine = sqlalchemy.create_engine(url=..., **create_engine_kwargs)
with bq_engine.connect() as conn:
    conn.execute(sqlalchemy.text("SELECT ... FROM ..."))

The main issue with this is that create_engine is slow for BigQuery compared to most other dialects, perhaps because oauth2 verifies the credentials passed into the BQ client. It also means that calling code has to keep track of the job_config, bq_client, and bq_engine objects.

For these reasons I would rather pass a parameter, perhaps a QueryJobConfig object, and let the dialect and the DBAPI handle the client object.

Describe the solution you'd like
I see that BigQueryDialect inherits from DefaultDialect, and that Cursor.execute in the BigQuery DBAPI repo does include a parameter for job_config. I wonder if it is as easy as overriding do_execute in the dialect within this repo to pass along that job_config?
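To make the idea concrete, here is a minimal, dependency-free sketch of what such a do_execute override might look like. The FakeCursor and FakeContext classes below are stand-ins I invented for illustration (the real integration would subclass sqlalchemy_bigquery.BigQueryDialect and receive SQLAlchemy's ExecutionContext and the BigQuery DBAPI cursor, whose execute() does accept a job_config keyword); only the forwarding logic in do_execute is the point.

```python
class FakeCursor:
    """Stand-in for google.cloud.bigquery.dbapi.Cursor, whose
    execute() accepts a job_config keyword argument."""

    def __init__(self):
        self.calls = []

    def execute(self, operation, parameters=None, job_config=None):
        # Record what was forwarded so the behavior is observable.
        self.calls.append((operation, parameters, job_config))


class FakeContext:
    """Stand-in for SQLAlchemy's ExecutionContext, which exposes the
    merged per-statement execution options."""

    def __init__(self, execution_options=None):
        self.execution_options = execution_options or {}


class SketchDialect:
    """Hypothetical dialect: pull job_config out of the statement's
    execution options and hand it to the DBAPI cursor."""

    def do_execute(self, cursor, statement, parameters, context=None):
        job_config = None
        if context is not None:
            job_config = context.execution_options.get("job_config")
        cursor.execute(statement, parameters, job_config=job_config)


# Simulate a statement run with execution_options(job_config=...).
cursor = FakeCursor()
ctx = FakeContext({"job_config": {"use_query_cache": False}})
SketchDialect().do_execute(cursor, "SELECT 1", None, context=ctx)
print(cursor.calls[0])
```

If no job_config is present in the execution options, the cursor just receives job_config=None and the default client behavior applies, so existing callers would be unaffected.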

The caller could then use it like this:

import sqlalchemy
from google.cloud import bigquery

bq_engine = sqlalchemy.create_engine(url=...)  # Just once! I can reuse it whether I want to use the cache or not!
job_config = bigquery.QueryJobConfig(use_query_cache=False)
with bq_engine.connect().execution_options(job_config=job_config) as conn:
    conn.execute(sqlalchemy.text("... query bypassing the cache ..."))
with bq_engine.connect() as conn:
    conn.execute(sqlalchemy.text("... query potentially hitting the cache ..."))

Describe alternatives you've considered
See the first code snippet above. It works for us, but it's cumbersome and incurs a performance hit on every create_engine call.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Aug 30, 2024
@tswast tswast added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Sep 3, 2024