Google Cloud BigQuery Module

This module allows managing a single BigQuery dataset, including access configuration, tables and views.

TODO

  • check for dynamic values in tables and views
  • add support for external tables

Examples

Simple dataset with access configuration

Access configuration defaults to using the separate google_bigquery_dataset_access resource, so as to leave the default dataset access rules untouched.

To manage access rules in the google_bigquery_dataset resource itself instead, set the dataset_access variable to true. In that case always include at least one OWNER access and avoid duplicate accesses, or terraform apply will fail.

Access rules are split across the access and access_identities variables, so that dynamic values can be passed in for identities (e.g. a service account email generated by a different module or resource).

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  access = {
    reader-group   = { role = "READER", type = "group" }
    owner          = { role = "OWNER", type = "user" }
    project_owners = { role = "OWNER", type = "special_group" }
    view_1         = { role = "READER", type = "view" }
  }
  access_identities = {
    reader-group   = "reader-group@example.com" # placeholder address
    owner          = "owner@example.com"        # placeholder address
    project_owners = "projectOwners"
    view_1         = "my-project|my-dataset|my-table"
  }
}
# tftest modules=1 resources=5
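
As a sketch of the two points made before the example, the configuration below manages access in the dataset resource itself via dataset_access, and passes a dynamically generated service account email as an identity. The google_service_account resource and its names are illustrative assumptions, as is the use of the user type for a service account identity.

```hcl
# Hypothetical service account, used only to show a dynamic identity.
resource "google_service_account" "reader" {
  project      = "my-project"
  account_id   = "bq-reader"
  display_name = "BigQuery reader"
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  # Manage access rules inside the dataset resource.
  dataset_access = true
  access = {
    # Keep at least one OWNER access when dataset_access is true.
    project_owners = { role = "OWNER", type = "special_group" }
    reader-sa      = { role = "READER", type = "user" }
  }
  access_identities = {
    project_owners = "projectOwners"
    # Assumption: service account emails are passed with the user type.
    reader-sa = google_service_account.reader.email
  }
}
```

Keeping the identity in a separate map means the keys of access stay static even when the email is only known after apply.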

IAM roles

Access can also be configured with IAM roles instead of basic roles, via the iam variable. The iam variable is mutually exclusive with the access and access_identities variables used for basic roles.

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  iam = {
    "roles/bigquery.dataOwner" = ["user:owner@example.com"] # placeholder address
  }
}
# tftest modules=1 resources=2

Dataset options

Dataset options are set via the options variable. All options must be specified, but any option can be set to null to use its default.

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  options = {
    default_table_expiration_ms     = 3600000
    default_partition_expiration_ms = null
    delete_contents_on_destroy      = false
  }
}
# tftest modules=1 resources=1

Tables and views

Tables are created via the tables variable, and views via the views variable. External tables are not supported yet.

locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    countries = {
      friendly_name       = "Countries"
      labels              = {}
      options             = null
      partitioning        = null
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
# tftest modules=1 resources=2

If partitioning is needed, populate the table's partitioning attribute using either time or range, and set the unused one to null.

locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning = {
        field = null
        range = null # use start/end/interval for range
        time  = { type = "DAY", expiration_ms = null }
      }
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
# tftest modules=1 resources=2
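
For integer range partitioning, a minimal sketch could look like the following. It assumes the range attribute takes the start, end and interval keys named in the comment in the previous example, and that field names the INT64 column to partition on. The table name and bounds are arbitrary illustrative values.

```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    table_b = {
      friendly_name = "Table b"
      labels        = {}
      options       = null
      partitioning = {
        # Assumption: range partitioning needs an integer column as field.
        field = "population"
        # 200 partitions of 10 million each, from 0 up to 2 billion.
        range = { start = 0, end = 2000000000, interval = 10000000 }
        time  = null # unused when range is set
      }
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
```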

To create views, use the views variable. If a view queries a table created by the same module, the first terraform apply will fail because the table does not exist yet, and a later apply will succeed once the table has been created. It should also be possible to reference the module's output in the view's query to create a dependency on the table, but this has not been verified.

locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    countries = {
      friendly_name       = "Countries"
      labels              = {}
      options             = null
      partitioning        = null
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
  views = {
    population = {
      friendly_name       = "Population"
      labels              = {}
      query               = "SELECT SUM(population) FROM my_dataset.countries"
      use_legacy_sql      = false
      deletion_protection = true
    }
  }
}

# tftest modules=1 resources=3

Variables

| name | description | type | required | default |
|---|---|---|---|---|
| id | Dataset id. | string | ✓ | |
| project_id | Id of the project where datasets will be created. | string | ✓ | |
| access | Map of access rules with role and identity type. Keys are arbitrary and must match those in the access_identities variable, types are domain, group, special_group, user, view. | map(object({…})) | | {} |
| access_identities | Map of access identities used for basic access roles. View identities have the format 'project_id\|dataset_id\|table_id'. | map(string) | | {} |
| dataset_access | Set access in the dataset resource instead of using separate resources. | bool | | false |
| description | Optional description. | string | | "Terraform managed." |
| encryption_key | Self link of the KMS key that will be used to protect destination table. | string | | null |
| friendly_name | Dataset friendly name. | string | | null |
| iam | IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles. | map(list(string)) | | {} |
| labels | Dataset labels. | map(string) | | {} |
| location | Dataset location. | string | | "EU" |
| options | Dataset options. | object({…}) | | {…} |
| tables | Table definitions. Options and partitioning default to null. Partitioning can only use range or time, set the unused one to null. | map(object({…})) | | {} |
| views | View definitions. | map(object({…})) | | {} |
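
Several of the optional variables listed here can be combined in a single module call. The following sketch overrides a few defaults; the values themselves (location, label keys, description text) are illustrative assumptions.

```hcl
module "bigquery-dataset" {
  source        = "./fabric/modules/bigquery-dataset"
  project_id    = "my-project"
  id            = "my_dataset"
  friendly_name = "My dataset"
  description   = "Dataset for country statistics."
  # Overrides the "EU" default.
  location = "US"
  labels = {
    environment = "dev"
  }
}
```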

Outputs

| name | description | sensitive |
|---|---|---|
| dataset | Dataset resource. | |
| dataset_id | Dataset id. | |
| id | Fully qualified dataset id. | |
| self_link | Dataset self link. | |
| table_ids | Map of fully qualified table ids keyed by table ids. | |
| tables | Table resources. | |
| view_ids | Map of fully qualified view ids keyed by view ids. | |
| views | View resources. | |
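
Outputs can be referenced from the calling configuration like any other module output. The sketch below assumes a table keyed countries has been defined in the tables variable, as in the examples in this document; the output block names are hypothetical.

```hcl
# Re-export the dataset id from the root module.
output "dataset_id" {
  value = module.bigquery-dataset.dataset_id
}

# Look up one fully qualified table id by its key in the tables variable.
output "countries_table_id" {
  value = module.bigquery-dataset.table_ids["countries"]
}
```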