- Componentization: compose complex queries from smaller, reusable components
- Test driven development: write tests for each query and run them like unit tests (except for the fact that they make calls to BigQuery)
- Data as code: input and required output for tests can be defined as part of the code (as well as in real database tables)
- No new languages to learn
- Write tests in your own development stack: programming language, IDE and CI/CD pipeline
-
Write an SQL query using Go Text Template notation, for example:
{{ define "fruit" }} WITH base AS ( {{ .Base }} ) SELECT fruit FROM base {{ end }}
The query may contain parameters, like {{ .Base }}
-
Add additional SQL queries to pass as arguments to the main query, for example:
{{ define "base" }} SELECT "orange" AS fruit UNION ALL SELECT "apple" {{ end }}
-
Write your expected result query, for example:
{{ define "fruit_result" }} SELECT "orange" AS fruit UNION ALL SELECT "apple" {{ end }}
-
Create a query definition file describing your query and one or more tests, for example:
Name: fruit Requires: - Base Tests: Test1: Args: - Name: Base Source: base Result: Source: fruit_result
-
Put all files together in a directory
The query definition file specifies how to construct an SQL query from the SQL templates.
A definition file contains Tests, each specifying how to construct an SQL query and an optional expected result.
To create a query from its components, espresso looks for an SQL template with the same name as the definition file itself, and then executes it with the given Args.
Each Arg must have a Name field that corresponds to a required argument in the SQL template and one of the following fields:
- Source - Another SQL template which will be executed and injected into the containing query.
- Table - a table name which will be combined with the Google project and BigQuery dataset (defined in Shot) and injected into the containing query.
- Const - a string that will be injected into the containing query.
Source may contain its own Args. If it doesn't, espresso will look for a corresponding template file with a test of the same name and parse the args from there.
Result is an optional field specifiying the test's expected result. It should have a Source field specifying an SQL template that will be executed to generate the expected result query. Source may contain Args as described above.
The tests require access to BigQuery API. Please set the following environment variables to grant espresso access to BigQuery:
export GCLOUD_PROJECT_ID=<your GCP project ID>
export BIGQUERY_KEY=<a service account with permissions to use BigQuery>
go build
./espresso -dir="./shot/queries/fruit/" -query="fruit" -test="Test1"
docker run --rm -t -e GCLOUD_PROJECT_ID=$GCLOUD_PROJECT_ID -e BIGQUERY_KEY=$BIGQUERY_KEY -v $(pwd)/shot:/shot:ro tufin/espresso -dir="/shot" -query="fruit" -test="Test1"
- Embed your tests directory
- Create an "Espresso Shot" and run it
- Use standard Go assertions to check the expected result against the actual output
func TestEspressoShot_Filesystem(t *testing.T) {
queryValues, resultValues, err := shot.NewShotWithClient(env.GetGCPProjectID(), "", os.DirFS("./queries/fruit")).RunTest("fruit", "Test1", []bigquery.QueryParameter{}, &map[string]bigquery.Value{})
require.NoError(t, err)
require.ElementsMatch(t, queryValues, resultValues)
}
You can also embed the SQL templates directory into the code:
//go:embed queries/fruit
var templates embed.FS
func TestEspressoShot_Embed(t *testing.T) {
queryValues, resultValues, err := shot.NewShotWithClient(env.GetGCPProjectID(), "", templates).RunTest("fruit", "Test1", []bigquery.QueryParameter{}, &map[string]bigquery.Value{})
require.NoError(t, err)
require.ElementsMatch(t, queryValues, resultValues)
}
Currently only Go is supported. If you'd like to contribute additional language support, please start a dicssussion.
- This is an initial proof-of-concept and request-for-comments
- Please submit your feedback as pull requests, issues or discussions.