-
Notifications
You must be signed in to change notification settings - Fork 24
Updating BigQuery Views
It is quick and easy to add or update BigQuery views, but order of operations should be deliberate in the event that you are updating a view consumed in a downstream API, such as a metric package. This page uses updating a view consumed as a metric package for the Dashboard as a concrete example where relevant.
-
First, use the BigQuery interface to build and test the query for the new view
-
Once the query is ready, make a new view in
recidiviz-data/recidiviz/calculator/bq/dashboard/views/
-
Be sure to add your view to the
*_views.py
file in the directory where your new view lives -
If updating a metric meant to be consumed as a metric package, run the
dashboard_export_manager
script locally with the appropriate bucket name, e.g. the staging dashboard metric package bucket- Be sure that your local environment has the environment variable
GOOGLE_CLOUD_PROJECT=recidiviz-staging
- If this succeeds, it will create a view in the appropriate dataset in BigQuery, e.g. the staging dashboard views dataset, and export the resulting file to the staging dashboard metric package bucket
- Be sure that your local environment has the environment variable
-
From there, update any downstream consumer that needs to read from the updated view. For the dashboard, add the name of the metric package file to the
server/core/metricsApi.js
to start retrieving via the app's API -
Thus, adding a new view that involves a downstream consumer update should typically require one pull request in
recidiviz-data
for the BigQuery view and one in the downstream app for the consumption-
DO NOT merge the app pull request before the
recidiviz-data
pull request - Once both PRs are merged into main, you must wait until there has been a new deploy of the
recidiviz-data
platform to production before deploying a new version of the app that relies on the new query - Double check that the file that the new logic depends on is in the appropriate metric storage bucket before deploying the dashboard to production
- Dashboard view data is exported every morning around 6am PT. To trigger a manual export of the dashboard view data, publish a message (any message) to the appropriate Pub/Sub topic
-
DO NOT merge the app pull request before the
- First, use the BigQuery interface to edit and test the query of the view for the existing chart (however, don’t ever
Save View
and overwrite an existing view from within the BigQuery interface) - Once the query is ready, make a NEW view in
recidiviz-data/recidiviz/calculator/bq/dashboard/views/
with the updated query, with a different name than the existing view - Be sure to add this view to the
*_views.py
file in the directory where the view lives - This pull request can be merged and deployed to prod without waiting for any downstream consumer changes
- When it's time to transition the existing logic in the downstream app, run the
dashboard_export_manager
script locally with the appropriate bucket name, e.g. the staging dashboard metric package bucket- Be sure that your local environment has the environment variable
GOOGLE_CLOUD_PROJECT=recidiviz-staging
- If this succeeds, it will create a view in the appropriate dataset in BigQuery, e.g. the staging dashboard views dataset, and export the resulting file to the staging dashboard metric package bucket
- Be sure that your local environment has the environment variable
- From there, update any downstream consumer that needs to read from the updated view. For the dashboard, add the name of the metric package file to the
server/core/metricsApi.js
to start retrieving via the app's API- For the dashboard, update the app to start reading from this new file instead of its old view data file
- Once the
recidiviz-data
pull request has been deployed to prod, and the new file is in the appropriate metric package bucket, issue a pull request with the changes to the app -
DO NOT merge the downstream pull request before the
recidiviz-data
pull request has been deployed to prod and the file is in the cloud storage bucket in prod- Dashboard view data is exported every morning around 6am PT. To trigger a manual export of the dashboard view data, publish a message (any message) to the appropriate Pub/Sub topic
- Then, issue a NEW PR in
recidiviz-data
that deletes the old view that's no longer being used by the dashboard. Once this is merged and deployed, delete the old file from any metric storage bucket it lives within
- Home
- Architecture
- Schemas
- Methodology
- Data Extraction
- Data Normalization
- Entity Matching
- Recidivism Measurement
- Development
- Local Development
- Create a Scraper
- Add a New Schema
- Update BigQuery Views
- Continuous Integration
- Operations