Waggledance database aliasing (#77)
* Support for database name aliases.

* test if variable is passed json-encoded

* switch to comma/colon delimited list

Co-authored-by: Scott Barnhart <[email protected]>
barnharts4 and Scott Barnhart authored Jan 19, 2021
1 parent ddef067 commit f3b0f32
Showing 7 changed files with 154 additions and 43 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,10 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [3.2.0] - 2021-01-19
### Added
- Added ability to pass `database-name-mapping` key/value pairs for each federated metastore. See [Waggle Dance Database Name Mapping](https://github.com/HotelsDotCom/waggle-dance#database-name-mapping) for more information. Requires docker image version `1.6.0` or greater.

## [3.1.1] - 2020-08-04

### Changed
116 changes: 107 additions & 9 deletions README.md
@@ -21,14 +21,14 @@ For more information please refer to the main [Apiary](https://github.com/Expedi
| instance_name | Waggle Dance instance name to identify resources in multi-instance deployments. | string | `` | no |
| k8s_namespace | K8s namespace to create waggle-dance deployment.| string | ``| no |
| k8s_docker_registry_secret | Docker Registry authentication K8s secret name.| string | ``| no |
| local_metastores | List of federated Metastore endpoints directly accessible on the local network. | list | `<list>` | no |
| local_metastores | List of federated Metastore endpoints directly accessible on the local network. See section [`local_metastores`](#local_metastores) for more info.| list | `<list>` | no |
| memory | The amount of memory (in MiB) used to allocate for the Waggle Dance container. Valid values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | string | `4096` | no |
| primary_metastore_host | Primary Hive Metastore hostname configured in Waggle Dance. | string | `localhost` | no |
| primary_metastore_port | Primary Hive Metastore port | string | `9083` | no |
| primary_metastore_whitelist | List of Hive databases to whitelist on primary Metastore. | list | `<list>` | no |
| remote_metastores | List of VPC endpoint services to federate Metastores in other accounts. | list | `<list>` | no |
| remote_metastores | List of VPC endpoint services to federate Metastores in other accounts. See section [`remote_metastores`](#remote_metastores) for more info.| list | `<list>` | no |
| secondary_vpcs | List of VPCs to associate with Service Discovery namespace. | list | `<list>` | no |
| ssh_metastores | List of federated Metastores to connect to over SSH via bastion. | list | `<list>` | no |
| ssh_metastores | List of federated Metastores to connect to over SSH via bastion. See section [`ssh_metastores`](#ssh_metastores) for more info.| list | `<list>` | no |
| subnets | ECS container subnets. | list | - | yes |
| tags | A map of tags to apply to resources. | map | `<map>` | no |
| vpc_id | VPC ID. | string | - | yes |
@@ -61,12 +61,15 @@ module "apiary-waggledance" {
primary_metastore_host = "primary-metastore.yourdomain.com"
primary_metastore_whitelist = ["test_.*", "team_.*"]
remote_metastores = [{
endpoint = "com.amazonaws.vpce.us-west-2.vpce-svc-1"
port = "9083"
prefix = "metastore1"
mapped-databases = "default,test"
},
remote_metastores = [
{
endpoint = "com.amazonaws.vpce.us-west-2.vpce-svc-1"
port = "9083"
prefix = "metastore1"
mapped-databases = "default,test"
database-name-mapping = "test:test_alias,default:default_alias"
writable-whitelist = "test"
},
{
endpoint = "com.amazonaws.vpce.us-east-1.vpce-svc-2"
port = "9083"
@@ -78,6 +81,101 @@ module "apiary-waggledance" {
}
```

### local_metastores

A list of maps. Each map entry describes a federated metastore server directly accessible on the local network.

An example entry looks like:
```
local_metastores = [
{
host = "hms-readonly.metastore.svc.cluster.local"
port = "9083"
prefix = "local1"
mapped-databases = "default,test"
database-name-mapping = "test:test_alias,default:default_alias"
writable-whitelist = "test"
}
]
```
`local_metastores` map entry fields:

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| host | Host name of the Hive metastore server on the local network. | string | - | yes |
| port | IP port that the Thrift server of the Hive metastore listens on. | string | `"9083"` | no |
| prefix | Prefix added to the database names from this metastore. Must be unique among all local, remote, and SSH federated metastores in this Waggle Dance instance. | string | - | yes |
| mapped-databases | Comma-separated list of databases from this metastore to expose to federation. If not specified, *all* databases are exposed.| string | `""` | no |
| database-name-mapping | Comma-separated list of `<database>:<alias>` key/value pairs to add aliases for the given databases. Default is no aliases. This is used primarily in migration scenarios where a database has been renamed/relocated. See [Waggle Dance Database Name Mapping](https://github.com/HotelsDotCom/waggle-dance#database-name-mapping) for more information. | string | `""` | no |
| writable-whitelist | Comma-separated list of databases from this metastore that can be in read-write mode. If not specified, all databases are read-only. Use `.*` to allow all databases to be written to. | string | `""` | no |

See [Waggle Dance README](https://github.com/HotelsDotCom/waggle-dance/README.md) for more information on all these parameters.
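
The `database-name-mapping` value is a single comma/colon delimited string rather than a map. As an illustration of that format only, the Python sketch below splits such a string into `<database>`/`<alias>` pairs. `parse_database_name_mapping` is a hypothetical helper written for this note; it is not part of this module or of Waggle Dance, and the strict validation it performs is an assumption.

```python
def parse_database_name_mapping(value: str) -> dict:
    """Parse "db:alias,db2:alias2" into {"db": "alias", "db2": "alias2"}."""
    mapping = {}
    # Spaces are dropped first, as the module's templates also strip them.
    cleaned = value.replace(" ", "")
    if cleaned == "":
        return mapping  # empty string is the default: no aliases
    for pair in cleaned.split(","):
        database, separator, alias = pair.partition(":")
        # Assumption: every entry must have a non-empty name on both sides.
        if not separator or not database or not alias:
            raise ValueError(f"expected <database>:<alias>, got {pair!r}")
        mapping[database] = alias
    return mapping


print(parse_database_name_mapping("test:test_alias,default:default_alias"))
# prints {'test': 'test_alias', 'default': 'default_alias'}
```

Running a check like this before `terraform apply` could catch a malformed entry (for example a missing colon) that would otherwise end up as invalid YAML in the generated federation config.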

### remote_metastores

A list of maps. Each map entry describes a federated metastore endpoint accessible via an AWS VPC endpoint.

An example entry looks like:
```
remote_metastores = [
{
endpoint = "com.amazonaws.vpce.us-west-2.vpce-svc-1"
port = "9083"
prefix = "remote1"
mapped-databases = "default,test"
database-name-mapping = "test:test_alias,default:default_alias"
writable-whitelist = ".*"
}
]
```
`remote_metastores` map entry fields:

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| endpoint | AWS VPC endpoint name that is connected to the remote Hive metastore. | string | - | yes |
| port | IP port that the Thrift server of the remote Hive metastore listens on. | string | `"9083"` | no |
| prefix | Prefix added to the database names from this metastore. Must be unique among all local, remote, and SSH federated metastores in this Waggle Dance instance. | string | - | yes |
| mapped-databases | Comma-separated list of databases from this metastore to expose to federation. If not specified, *all* databases are exposed.| string | `""` | no |
| database-name-mapping | Comma-separated list of `<database>:<alias>` key/value pairs to add aliases for the given databases. Default is no aliases. This is used primarily in migration scenarios where a database has been renamed/relocated. See [Waggle Dance Database Name Mapping](https://github.com/HotelsDotCom/waggle-dance#database-name-mapping) for more information. | string | `""` | no |
| writable-whitelist | Comma-separated list of databases from this metastore that can be in read-write mode. If not specified, all databases are read-only. Use `.*` to allow all databases to be written to. | string | `""` | no |

See [Waggle Dance README](https://github.com/HotelsDotCom/waggle-dance/README.md) for more information on all these parameters.

### ssh_metastores

A list of maps. Each map entry describes a federated metastore endpoint connected via an SSH bastion host.

An example entry looks like:
```
ssh_metastores = [
{
metastore-host = "com.amazonaws.vpce.us-west-2.vpce-svc-1"
port = "9083"
bastion-host = "bastion.remote-account.com"
user = "bastion-user"
timeout = "30000"
prefix = "ssh_metastore1"
mapped-databases = "default,test"
database-name-mapping = "test:test_alias,default:default_alias"
}
]
```
`ssh_metastores` map entry fields:

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| metastore-host | Host name of the Hive metastore that can be resolved/reached from the bastion host. | string | - | yes |
| port | IP port that the Thrift server of the remote Hive metastore listens on. | string | `"9083"` | no |
| bastion-host | Host name of the bastion host. | string | - | yes |
| user | User name that will log in to the bastion host. | string | - | yes |
| timeout | The SSH session timeout in milliseconds; `0` means no timeout. Default is 60000 milliseconds (1 minute). | string | `"60000"` | no |
| prefix | Prefix added to the database names from this metastore. Must be unique among all local, remote, and SSH federated metastores in this Waggle Dance instance. | string | - | yes |
| mapped-databases | Comma-separated list of databases from this metastore to expose to federation. If not specified, *all* databases are exposed.| string | `""` | no |
| database-name-mapping | Comma-separated list of `<database>:<alias>` key/value pairs to add aliases for the given databases. Default is no aliases. This is used primarily in migration scenarios where a database has been renamed/relocated. See [Waggle Dance Database Name Mapping](https://github.com/HotelsDotCom/waggle-dance#database-name-mapping) for more information. | string | `""` | no |
| writable-whitelist | Comma-separated list of databases from this metastore that can be in read-write mode. If not specified, all databases are read-only. Use `.*` to allow all databases to be written to. | string | `""` | no |

See [Waggle Dance README](https://github.com/HotelsDotCom/waggle-dance/README.md) for more information on all these parameters.
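
Each of the three tables requires `prefix` to be unique across all local, remote, and SSH federated metastores. A minimal Python sketch of that rule follows, assuming the three lists are available as plain lists of dicts shaped like the examples above. `duplicate_prefixes` is a hypothetical helper for illustration; nothing in this commit shows the module performing such a check itself.

```python
from collections import Counter


def duplicate_prefixes(local_metastores, remote_metastores, ssh_metastores):
    """Return the sorted prefixes used by more than one federated metastore."""
    all_entries = [*local_metastores, *remote_metastores, *ssh_metastores]
    counts = Counter(entry["prefix"] for entry in all_entries)
    return sorted(prefix for prefix, count in counts.items() if count > 1)


clashes = duplicate_prefixes(
    [{"prefix": "local1"}],
    [{"prefix": "remote1"}],
    [{"prefix": "local1"}],  # clashes with the local metastore
)
print(clashes)  # prints ['local1']
```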

# Contact

## Mailing List
26 changes: 0 additions & 26 deletions endpoints.tf
@@ -41,32 +41,6 @@ resource "aws_vpc_endpoint" "remote_metastores" {
tags = merge(map("Name", "${var.remote_metastores[count.index].prefix}_metastore"), var.tags)
}

data "template_file" "remote_metastores_yaml" {
count = length(var.remote_metastores)
template = file("${path.module}/templates/waggle-dance-federation-remote.yml.tmpl")

vars = {
prefix = var.remote_metastores[count.index].prefix
metastore_host = aws_vpc_endpoint.remote_metastores[count.index].dns_entry[0].dns_name
metastore_port = lookup(var.remote_metastores[count.index], "port", "9083")
mapped_databases = lookup(var.remote_metastores[count.index], "mapped-databases", "")
writable_whitelist = lookup(var.remote_metastores[count.index], "writable-whitelist", "")
}
}

data "template_file" "local_metastores_yaml" {
count = length(var.local_metastores)
template = file("${path.module}/templates/waggle-dance-federation-local.yml.tmpl")

vars = {
prefix = var.local_metastores[count.index].prefix
metastore_host = var.local_metastores[count.index].host
metastore_port = lookup(var.local_metastores[count.index], "port", "9083")
mapped_databases = lookup(var.local_metastores[count.index], "mapped-databases", "")
writable_whitelist = lookup(var.local_metastores[count.index], "writable-whitelist", "")
}
}

resource "aws_route53_zone" "remote_metastore" {
count = var.enable_remote_metastore_dns == "" ? 0 : 1
name = "${local.remote_metastore_zone_prefix}-${var.aws_region}.${var.domain_extension}"
45 changes: 37 additions & 8 deletions templates.tf
@@ -45,19 +45,48 @@ data "template_file" "primary_metastore_whitelist" {
EOF
}

data "template_file" "local_metastores_yaml" {
count = length(var.local_metastores)
template = file("${path.module}/templates/waggle-dance-federation-local.yml.tmpl")

vars = {
prefix = var.local_metastores[count.index].prefix
metastore_host = var.local_metastores[count.index].host
metastore_port = lookup(var.local_metastores[count.index], "port", "9083")
mapped_databases = lookup(var.local_metastores[count.index], "mapped-databases", "")
database_name_mapping = lookup(var.local_metastores[count.index], "database-name-mapping", "")
writable_whitelist = lookup(var.local_metastores[count.index], "writable-whitelist", "")
}
}

data "template_file" "remote_metastores_yaml" {
count = length(var.remote_metastores)
template = file("${path.module}/templates/waggle-dance-federation-remote.yml.tmpl")

vars = {
prefix = var.remote_metastores[count.index].prefix
metastore_host = aws_vpc_endpoint.remote_metastores[count.index].dns_entry[0].dns_name
metastore_port = lookup(var.remote_metastores[count.index], "port", "9083")
mapped_databases = lookup(var.remote_metastores[count.index], "mapped-databases", "")
database_name_mapping = lookup(var.remote_metastores[count.index], "database-name-mapping", "")
writable_whitelist = lookup(var.remote_metastores[count.index], "writable-whitelist", "")
}
}

data "template_file" "ssh_metastores_yaml" {
count = length(var.ssh_metastores)
template = file("${path.module}/templates/waggle-dance-federation-ssh.yml.tmpl")

vars = {
prefix = lookup(var.ssh_metastores[count.index], "prefix")
bastion_host = lookup(var.ssh_metastores[count.index], "bastion-host")
metastore_host = lookup(var.ssh_metastores[count.index], "metastore-host")
metastore_port = lookup(var.ssh_metastores[count.index], "port")
user = lookup(var.ssh_metastores[count.index], "user")
timeout = lookup(var.ssh_metastores[count.index], "timeout", "60000")
mapped_databases = lookup(var.ssh_metastores[count.index], "mapped-databases", "")
writable_whitelist = lookup(var.ssh_metastores[count.index], "writable-whitelist", "")
prefix = lookup(var.ssh_metastores[count.index], "prefix")
bastion_host = lookup(var.ssh_metastores[count.index], "bastion-host")
metastore_host = lookup(var.ssh_metastores[count.index], "metastore-host")
metastore_port = lookup(var.ssh_metastores[count.index], "port")
user = lookup(var.ssh_metastores[count.index], "user")
timeout = lookup(var.ssh_metastores[count.index], "timeout", "60000")
mapped_databases = lookup(var.ssh_metastores[count.index], "mapped-databases", "")
database_name_mapping = lookup(var.ssh_metastores[count.index], "database-name-mapping", "")
writable_whitelist = lookup(var.ssh_metastores[count.index], "writable-whitelist", "")
}
}
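
The `ssh_metastores_yaml` block mixes two-argument and three-argument `lookup` calls: `prefix`, `bastion-host`, `metastore-host`, `port`, and `user` are read without a fallback, while the remaining keys fall back to a default. The Python sketch below mimics that split. It assumes a two-argument `lookup` fails when the key is absent; the names `SSH_REQUIRED_KEYS`, `SSH_DEFAULTS`, and `resolve_ssh_metastore` are hypothetical. Note that `port` has no fallback here even though the README table lists a default of `"9083"` for it.

```python
# Keys read with two-argument lookup(...) in the ssh_metastores_yaml block.
SSH_REQUIRED_KEYS = ("prefix", "bastion-host", "metastore-host", "port", "user")

# Keys read with a fallback, and the fallback each one gets.
SSH_DEFAULTS = {
    "timeout": "60000",
    "mapped-databases": "",
    "database-name-mapping": "",
    "writable-whitelist": "",
}


def resolve_ssh_metastore(entry: dict) -> dict:
    """Return the template variables for one ssh_metastores entry."""
    missing = [key for key in SSH_REQUIRED_KEYS if key not in entry]
    if missing:
        # Assumption: a lookup without a default is an error on a missing key.
        raise KeyError(f"missing required keys: {', '.join(missing)}")
    resolved = {key: entry[key] for key in SSH_REQUIRED_KEYS}
    for key, default in SSH_DEFAULTS.items():
        resolved[key] = entry.get(key, default)
    return resolved
```

An entry that omits `database-name-mapping` therefore resolves to an empty string, which the templates treat as "emit nothing".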

2 changes: 2 additions & 0 deletions templates/waggle-dance-federation-local.yml.tmpl
@@ -4,5 +4,7 @@
remote-meta-store-uris: thrift://${metastore_host}:${metastore_port}
${ mapped_databases == "" ? "" : " mapped-databases:" }
${ mapped_databases == "" ? "" : join("\n",formatlist(" - %s",split(",",mapped_databases))) }
${ database_name_mapping == "" ? "" : " database-name-mapping:" }
${ database_name_mapping == "" ? "" : join("\n", formatlist(" %s", split(",", replace(replace(database_name_mapping, " ", ""), ":", ": ")))) }
${ writable_whitelist == "" ? "" : " writable-database-white-list:" }
${ writable_whitelist == "" ? "" : join("\n",formatlist(" - %s",split(",",writable_whitelist))) }
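
The new `database-name-mapping` expression chains `replace`, `split`, `formatlist`, and `join` to turn the comma/colon delimited string into YAML `key: value` lines. The Python sketch below walks through the same steps for illustration; it is not the template itself. `render_database_name_mapping` is a hypothetical name, and the `indent` parameter is an assumption because the template's original leading whitespace is not recoverable from this page.

```python
def render_database_name_mapping(database_name_mapping: str, indent: str = "  ") -> str:
    """Emulate the template expression that emits the mapping's YAML lines."""
    # Ternary guard: an empty value renders nothing.
    if database_name_mapping == "":
        return ""
    # replace(replace(value, " ", ""), ":", ": ")
    normalized = database_name_mapping.replace(" ", "").replace(":", ": ")
    # split(",", ...) followed by formatlist("<indent>%s", ...)
    lines = [f"{indent}{pair}" for pair in normalized.split(",")]
    # join("\n", ...)
    return "\n".join(lines)


print(render_database_name_mapping("test:test_alias, default:default_alias"))
```

Stripping spaces before inserting `": "` means input such as `"test : test_alias"` still yields a single space after the colon, which is what YAML needs to read each line as a key/value pair.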
2 changes: 2 additions & 0 deletions templates/waggle-dance-federation-remote.yml.tmpl
@@ -4,5 +4,7 @@
remote-meta-store-uris: thrift://${metastore_host}:${metastore_port}
${ mapped_databases == "" ? "" : " mapped-databases:" }
${ mapped_databases == "" ? "" : join("\n",formatlist(" - %s",split(",",mapped_databases))) }
${ database_name_mapping == "" ? "" : " database-name-mapping:" }
${ database_name_mapping == "" ? "" : join("\n", formatlist(" %s", split(",", replace(replace(database_name_mapping, " ", ""), ":", ": ")))) }
${ writable_whitelist == "" ? "" : " writable-database-white-list:" }
${ writable_whitelist == "" ? "" : join("\n",formatlist(" - %s",split(",",writable_whitelist))) }
2 changes: 2 additions & 0 deletions templates/waggle-dance-federation-ssh.yml.tmpl
@@ -10,5 +10,7 @@
timeout: ${timeout}
${ mapped_databases == "" ? "" : " mapped-databases:" }
${ mapped_databases == "" ? "" : join("\n",formatlist(" - %s",split(",",mapped_databases))) }
${ database_name_mapping == "" ? "" : " database-name-mapping:" }
${ database_name_mapping == "" ? "" : join("\n", formatlist(" %s", split(",", replace(replace(database_name_mapping, " ", ""), ":", ": ")))) }
${ writable_whitelist == "" ? "" : " writable-database-white-list:" }
${ writable_whitelist == "" ? "" : join("\n",formatlist(" - %s",split(",",writable_whitelist))) }
