
Commit

fix and fmt
jaycarlton committed Nov 30, 2020
1 parent 8c194a9 commit 76ae015
Showing 5 changed files with 117 additions and 23 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@
*.tfstate
*.backup
*.iml
.DS_Store
45 changes: 33 additions & 12 deletions modules/workbench/README.md
@@ -1,24 +1,45 @@
# Workbench Child Modules
# Workbench Terraform Modules
The module directories here represent individually deployable subsystems,
microservices, or other functional units. It's easy enough to put all buckets, say,
in a `gcs` module, but that wouldn't really let us operate on an individual components's bucket.
in a `gcs` module, but that wouldn't really let us operate on a bucket owned by an individual component.

Following is a broad outline of each child module. If you feel irritated that you can't see, for example,
all dashboards in one place, you can still go to the Console or use `gcloud`.
## Goals
### Automate ourselves out of a job
All the existing and planned Terraform modules have some level of scripted or otherwise automated
support processes.
## Non-goals
### Become the only game in town
We don't want to get into a position where we force anyone to use Terraform if it's not the best
choice for them. Terraform is still pretty new, and changing rapidly. The Google provider is also
under rapid development.
### Wag the Dog
We do not have any aspirations to absorb any of the tasks that external teams are responsible for,
including building the GCP projects for each of our environments or conducting all administrative
tasks in either pmi-ops or terra projects. If Terraform really "takes off", then it may make sense to
share learnings, and at that point, there may be opportunities for our Terraform stack to use theirs,
or vice versa. While these boundaries may be fuzzy today, hopefully the addition of clear module
inputs and documentation will drive clarification of responsibilities and visibility into state,
dependencies, etc.
### Bypass security approvals
In some cases, actions that require security approval can be performed in Terraform, particularly
around IAM bindings, access groups, and roles. We don't want a situation where an audit finds that
individuals or service accounts were added or modified without going through the proper channels.

## Reporting
One potential workaround here is to invite sysadmin or security personnel to the private repository
to approve changes to the Terraform module inputs.

## Currently Supported Modules

### Reporting
The state for reporting is currently the BigQuery dataset and its tables and views. In the future,
it makes sense to add j
it makes sense to add these sorts of things:
* Reporting-specific metrics
* Notifications on the system
* Reporting-specific logs
* Data blocks for views (maybe)
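
As a quick illustration of that current state, a minimal standalone sketch of the dataset might look
like the following. In the actual configuration the dataset, tables, and views are created through the
reporting module's wrapper (`module "main"` in `reporting/main.tf`), and the names below are examples only.

```hcl
# Minimal sketch only; the real configuration goes through a wrapper module,
# and the dataset ID and labels here are illustrative.
resource "google_bigquery_dataset" "reporting" {
  project     = var.project_id
  dataset_id  = "reporting"
  location    = "US"
  description = "Daily output of relational tables and time series views for analysis."

  labels = {
    subsystem = "reporting"
  }
}
```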

## Backend Database (future)
This resource is inherently cross-functional, so we can just put in:
* The application DB
* Backup settings
This will take advantage of the `google_sql_database_instance` resource.

Schema migrations work via `Ruby->Gradle->Liquibase->MySql->🚂`
Maybe it needs a `Terraform` caboose. It looks like there's not currently a Liquibase provider.
In other words, the primary focus of the module is the Reporting system, but it may be convenient to
add reporting-specific artifacts that might otherwise belong to Monitoring or other auxiliary
services.
72 changes: 72 additions & 0 deletions modules/workbench/WORKBENCH-MODULE-PLAN.md
@@ -0,0 +1,72 @@
# Workbench Module Plan
The module directories here represent individually deployable subsystems,
microservices, or other functional units. It's easy enough to put all buckets, say,
in a `gcs` module, but that wouldn't really let us operate on an individual component's bucket.

Following is a broad outline of each child module. If you feel irritated that you can't see, for example,
all dashboards in one place, you can still go to the Console or use `gcloud`.

# Workbench Module Development Plan
The Workbench is the topmost parent module in the AoU Workbench
Application configuration. It depends on several modules for individual
subsystems.

After creating a valid Terraform configuration we're not finished,
as we need to make sure we don't step on other tools or automation.
For example, items that pertain to cloud resources will need to move
out of the workbench JSON config system.

I already have automation for Stackdriver settings that fetches all of their configurations,
and plan to migrate it to Terraform.
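
One plausible migration path (a sketch only; the file path and resource choice below are assumptions,
not a description of the current automation) is to check the exported Stackdriver configuration JSON
into the repo and manage it with the `google_monitoring_dashboard` resource:

```hcl
# Hypothetical sketch: manage an exported Stackdriver dashboard definition in Terraform.
# The JSON file path is a placeholder; it would be produced by the existing fetch automation.
resource "google_monitoring_dashboard" "reporting" {
  project        = var.project_id
  dashboard_json = file("${path.module}/assets/dashboards/reporting.json")
}
```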

## Reporting
The state for reporting is currently the BigQuery dataset and its tables and views.
Highlights:
* Reporting-specific metrics with the `google_logging_metric` [resource](https://www.terraform.io/docs/providers/google/r/logging_metric.html)
and others (see the sketch below)
* Notifications on the system
* Reporting-specific logs
* Data blocks for views (maybe)
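
A minimal sketch of such a metric, assuming a hypothetical name and filter (neither exists in the
current configuration), might look like:

```hcl
# Hypothetical example of a reporting-specific logs-based metric; the name and
# filter below are placeholders, not part of the current configuration.
resource "google_logging_metric" "reporting_upload_failures" {
  project = var.project_id
  name    = "reporting/upload_failures"
  filter  = "resource.type=\"gae_app\" AND severity>=ERROR AND textPayload:\"reporting upload\""

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}
```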

## Backend Database (notional)
This resource is inherently cross-functional, so we can just put in:
* The application DB
* Backup settings
This will take advantage of the `google_sql_database_instance` resource.
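
A minimal notional sketch of that resource (instance name, tier, and backup window are placeholders):

```hcl
# Notional sketch only; the instance name, tier, and backup window are placeholders.
resource "google_sql_database_instance" "workbench" {
  project          = var.project_id
  name             = "workbench-maindb"
  database_version = "MYSQL_5_7"
  region           = "us-central1"

  settings {
    tier = "db-n1-standard-2"

    backup_configuration {
      enabled            = true
      start_time         = "03:00"
      binary_log_enabled = true # needed for point-in-time recovery on MySQL
    }
  }
}
```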

Schema migrations work via `Ruby->Gradle->Liquibase->MySql->🚂`
Maybe it needs a `Terraform` caboose. It looks like there's not currently a Liquibase provider.

It may not make sense organizationally to do this in Terraform, as there are dependencies on other
team(s) when instantiating or migrating databases.

## Workbench to RDR Pipeline
Instantiate [google_cloud_tasks_queue](https://www.terraform.io/docs/providers/google/r/cloud_tasks_queue.html)
resources as necessary.
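
A sketch of one such queue; the queue name and rate limits are hypothetical:

```hcl
# Hypothetical queue for the Workbench -> RDR export pipeline; the name and
# limits below are placeholders.
resource "google_cloud_tasks_queue" "rdr_export" {
  project  = var.project_id
  name     = "rdr-export"
  location = "us-central1"

  rate_limits {
    max_dispatches_per_second = 5
    max_concurrent_dispatches = 10
  }

  retry_config {
    max_attempts = 5
  }
}
```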

## API Server
* AppEngine versions, instances, logs, etc. The module isn't just named
App Engine, since that's the resource that gets created.

At the moment, there are no plans to rip and replace our existing deployment process or automation,
but we may find areas where the Terraform approach could be helpful (such as managing dependent
deployment artifacts or steps in a declarative way).
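
For illustration only (not a proposal to move deployment into Terraform), an App Engine version can in
principle be described with `google_app_engine_standard_app_version`; every value below is a placeholder,
and the required blocks vary by runtime:

```hcl
# Illustrative sketch only; the service, runtime, artifact URL, and entrypoint are placeholders.
resource "google_app_engine_standard_app_version" "api" {
  project    = var.project_id
  service    = "api"
  version_id = "v1"
  runtime    = "java11"

  deployment {
    zip {
      source_url = "https://storage.googleapis.com/${var.project_id}-artifacts/api.zip"
    }
  }

  entrypoint {
    shell = "java -jar api.jar"
  }
}
```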

## Action Audit
This module maps to:
* Stackdriver logs for each environment. (This will likely need to move out of the application JSON config.)
* Logs-based metrics on the initial log stream
* Sink to a BigQuery dataset for each environment (Stackdriver may need to create it initially, in which
case we need to do a `terraform import`; see the sketch below.)
* Reporting datasets in BigQuery
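
A sketch of the BigQuery sink (the sink name, dataset, and filter are placeholders), including the
import command for the case where Stackdriver creates the sink first:

```hcl
# Hypothetical sink routing action audit logs to BigQuery; the sink name, dataset,
# and filter below are placeholders.
# If the sink already exists, adopt it into state with something like:
#   terraform import google_logging_project_sink.action_audit projects/PROJECT_ID/sinks/action-audit-bigquery
resource "google_logging_project_sink" "action_audit" {
  project     = var.project_id
  name        = "action-audit-bigquery"
  destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/action_audit"
  filter      = "logName:\"action_audit\""

  unique_writer_identity = true

  bigquery_options {
    use_partitioned_tables = true
  }
}
```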

## Tiers and Egress Detection
There is a [sumo logic provider](https://www.sumologic.com/blog/terraform-provider-hosted/) for Terraform, which is very good
news. It looks really svelte.
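
A sketch of wiring the provider in, assuming the registry source and argument names shown here (worth
double-checking against the provider docs before use):

```hcl
# Assumed provider source and arguments; verify against the Sumo Logic provider documentation.
terraform {
  required_providers {
    sumologic = {
      source = "SumoLogic/sumologic"
    }
  }
}

provider "sumologic" {
  access_id   = var.sumologic_access_id  # hypothetical variables
  access_key  = var.sumologic_access_key
  environment = "us2"                    # placeholder deployment environment
}
```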

We will also want to control the VPC flow logs,
perimeters, etc., but that won't be in this `workbench` module,
because Terra (not Terraform) owns the organization and needs to do
creation manually for now.
2 changes: 1 addition & 1 deletion modules/workbench/main.tf
@@ -1,4 +1,4 @@
# Module for creating an instance of the scratch AoU RW Environment
# Workbench Analytics Reporting Subsystem
module "reporting" {
source = "./modules/reporting"

20 changes: 10 additions & 10 deletions modules/workbench/modules/reporting/main.tf
@@ -58,23 +58,23 @@ locals {
timeseries_view_template_filenames = fileset("${path.module}/assets/views/timeseries", "*.sql")
# expanded to fully qualified path, e.g. ["/repos/workbench/terraform/modules/reporting/views/latest_users.sql", ...]
timeseries_view_template_paths = [for file_name in local.timeseries_view_template_filenames :
pathexpand("${path.module}/assets/views/timeseries/${file_name}")]
pathexpand("${path.module}/assets/views/timeseries/${file_name}")]

live_view_tables = [for table_input in local.table_inputs : table_input["table_id"] ]
live_view_tables = [for table_input in local.table_inputs : table_input["table_id"]]
live_view_template_path = pathexpand("${path.module}/assets/views/live/live_table.sql")

# All live views (live_user, live_cohort, etc) depend on the tables being created first, so we need to make sure
# Terraform treats each view as depending on all the tables. It's not possible to depend on the exact
# table (I think) but this should solve the dependency problem of trying to create the view before
# its table. https://stackoverflow.com/q/64795896/12345554
live_views = [for table_name in module.main.table_names :
merge({
view_id = "live_${table_name}"
query = templatefile(local.live_view_template_path, {
project = var.project_id
dataset = var.reporting_dataset_id
table_name = table_name
})
merge({
view_id = "live_${table_name}"
query = templatefile(local.live_view_template_path, {
project = var.project_id
dataset = var.reporting_dataset_id
table_name = table_name
})
}, local.VIEW_CONSTANTS)]

# Create views for each .sql file in the views directory. There is no Terraform
@@ -107,7 +107,7 @@ module "main" {
description = "Daily output of relational tables and time series views for analysis. Views are provided for general ad-hoc analysis."

tables = local.tables
views = local.views
views = local.views

dataset_labels = {
subsystem = "reporting"
