This repository has been archived by the owner on Dec 17, 2021. It is now read-only.

Merge pull request #7 from ExpediaGroup/orphaned-data-strategy
Orphaned data strategy
Max Jacobs authored Jan 16, 2020
2 parents 9acb041 + 69f7ad5 commit 9e4ef3e
Showing 5 changed files with 25 additions and 8 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,10 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [1.2.1] - 2020-01-16
### Added
- A new argument `orphaned_data_strategy` to use for handling stale data during replication.

## [1.2.0] - 2019-08-29
### Added
- Support for Docker Auth.
18 changes: 10 additions & 8 deletions README.md
@@ -5,25 +5,27 @@ Terraform module for setting up infrastructure for [Shunting Yard](https://githu

For more information please refer to the main [Apiary](https://github.com/ExpediaGroup/apiary) project page.

## Variables
## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| allowed\_s3\_buckets | List of S3 Buckets to which Shunting Yard will have read-write access. eg. `["bucket-1", "bucket-2"]`. | list | `n/a` | yes |
| allowed\_s3\_buckets | List of S3 Buckets to which Shunting Yard will have read-write access. | list | n/a | yes |
| aws\_region | AWS region to use for resources. | string | n/a | yes |
| cpu | The number of CPU units to reserve for the Shunting Yard container. Valid values can be 256, 512, 1024, 2048 and 4096. Reference: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | string | `"1024"` | no |
| ct\_common\_config\_yaml | Common Circus Train configuration to be passed to internal Circus Train instance. It can be used, for example to configure Graphite for Circus Train. Refer to [Circus Train README](https://github.com/HotelsDotCom/circus-train/blob/master/README.md) for an exhaustive list of options supported by Circus Train. | string | n/a | yes |
| docker\_image | Full path of Shunting Yard Docker image. | string | n/a | yes |
| docker\_registry\_auth\_secret\_name | Docker Registry authentication SecretManager secret name. | string | `""` | no |
| docker\_version | Shunting Yard Docker image version. | string | n/a | yes |
| docker\_registry\_auth\_secret\_name | Docker Registry authentication SecretManager secret name. | string | `` | no |
| instance\_name | Shunting Yard instance name to identify resources in multi-instance deployments. | string | `""` | no |
| memory | The amount of memory (in MiB) allocated to the Shunting Yard container. Valid values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | string | `"4096"` | no |
| memory | The amount of memory (in MiB) used to allocate for the Shunting Yard container. Valid values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | string | `"4096"` | no |
| metastore\_events\_sns\_topic | SNS Topic for Hive Metastore events. | string | n/a | yes |
| shuntingyard\_sqs\_queue\_wait\_timeout | Shunting Yard SQS queue wait timeout (in seconds) | string | 15 | no |
| shuntingyard\_sqs\_queue\_stale\_messages\_timeout | Shunting Yard SQS queue stale messages alert timeout (in seconds) | string | 300 | no |
| selected\_tables | Tables selected for Shunting Yard Replication. Supported Format: `[ "database_1.table_1", "database_2.table_2" ]` | list | [] | no |
| orphaned\_data\_strategy | Orphaned data strategy to use for stale data during replication. Supported strategies: "NONE", "HOUSEKEEPING" (default). | string | `"HOUSEKEEPING"` | no |
| selected\_tables | Tables selected for Shunting Yard Replication. Supported Format: [ "database_1.table_1", "database_2.table_2" ] Wildcards are not supported, i.e. you need to specify each table explicitly. | list | `<list>` | no |
| shuntingyard\_sqs\_queue\_stale\_messages\_timeout | Shunting Yard SQS Queue Cloudwatch Alert timeout for messages older than this number of seconds. | string | `"300"` | no |
| shuntingyard\_sqs\_queue\_wait\_timeout | Wait timeout for connecting to the Shunting Yard SQS queue (in seconds) | string | `"15"` | no |
| shuntingyard\_tags | A map of tags to apply to resources. | map | n/a | yes |
| source\_metastore\_uri | Source Metastore URI for Shunting Yard. | string | n/a | yes |
| subnets | ECS container subnets. | list | n/a | yes |
| shuntingyard\_tags | A map of tags to apply to resources. | map | `<map>` | no |
| target\_metastore\_uri | Target Metastore URI for Shunting Yard. | string | n/a | yes |
| vpc\_id | VPC ID. | string | n/a | yes |
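
A minimal sketch of how a caller might set the new `orphaned_data_strategy` input, assuming the Terraform 0.11-style syntax used elsewhere in this commit. The module label `shuntingyard` and the `source` path are hypothetical placeholders, not taken from this repository; only the input name and the `"NONE"` / `"HOUSEKEEPING"` values come from the table above.

```hcl
module "shuntingyard" {
  # Hypothetical source path; replace with wherever this module actually lives.
  source = "./path/to/this/module"

  # Assumption: the other required inputs from the table above
  # (aws_region, vpc_id, subnets, and so on) are also set here.
  # They are omitted to keep the sketch short.

  # Override the default "HOUSEKEEPING" strategy.
  orphaned_data_strategy = "NONE"
}
```

Leaving `orphaned_data_strategy` unset keeps the default of `"HOUSEKEEPING"`. Given the `format("orphaned-data-strategy: %s", var.orphaned_data_strategy)` call in `templates.tf`, the value above would be rendered into the generated Shunting Yard config as the line `orphaned-data-strategy: NONE`.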

1 change: 1 addition & 0 deletions templates.tf
@@ -13,6 +13,7 @@ data "template_file" "shuntingyard_config_yaml" {
    shuntingyard_sqs_queue              = "${aws_sqs_queue.shuntingyard_sqs_queue.id}"
    shuntingyard_sqs_queue_wait_timeout = "${var.shuntingyard_sqs_queue_wait_timeout}"
    selected_tables                     = "${join("\n", formatlist(" - %s", var.selected_tables))}"
    orphaned_data_strategy              = "${format("orphaned-data-strategy: %s", var.orphaned_data_strategy)}"
  }
}

1 change: 1 addition & 0 deletions templates/shuntingyard-config.yml.tmpl
@@ -16,3 +16,4 @@ event-receiver:
source-table-filter:
table-names:
${selected_tables}
${orphaned_data_strategy}
9 changes: 9 additions & 0 deletions variables.tf
@@ -110,6 +110,15 @@ EOF
  default = []
}

variable "orphaned_data_strategy" {
  description = <<EOF
Orphaned data strategy to use for stale data during replication. Supported strategies: "NONE", "HOUSEKEEPING" (default).
EOF

  type    = "string"
  default = "HOUSEKEEPING"
}

variable "docker_registry_auth_secret_name" {
  description = "Docker Registry authentication SecretManager secret name."
  type        = "string"
