Feature/ro autoscaling (#217)
* enable read-only metastore autoscaling

* fix

* update variable names

* update changelog

* Update CHANGELOG.md

Co-authored-by: Javier Sánchez Beltrán <[email protected]>

Co-authored-by: Raj Poluri <[email protected]>
3 people authored Jul 7, 2022
1 parent 24d31a6 commit 51ac078
Showing 5 changed files with 65 additions and 4 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,10 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [6.13.0] - 2022-07-07
### Added
- Option to enable k8s hive metastore read only instance autoscaling.

## [6.12.4] - 2022-06-03
### Fixed
- Fix k8s read-only metastore to use RDS reader instance.
7 changes: 6 additions & 1 deletion VARIABLES.md
@@ -39,6 +39,7 @@
| ecs\_domain\_extension | Domain name to use for hosted zone created by ECS service discovery. | `string` | `"lcl"` | no |
| elb\_timeout | Idle timeout for Apiary ELB. | `string` | `"1800"` | no |
| enable\_apiary\_s3\_log\_hive | Create hive database to archive s3 logs in parquet format. Only applicable when module manages logs S3 bucket. | `bool` | `true` | no |
| enable\_autoscaling | Enable read only Hive Metastore k8s horizontal pod autoscaling. | `bool` | `false` | no |
| enable\_data\_events | Enable managed buckets S3 event notifications. | `bool` | `false` | no |
| enable\_gluesync | Enable metadata sync from Hive to the Glue catalog. | `bool` | `false` | no |
| enable\_hive\_metastore\_metrics | Enable sending Hive Metastore metrics to CloudWatch. | `bool` | `false` | no |
@@ -59,10 +60,14 @@
| hms\_ro\_db\_connection\_pool\_size | Read-only Hive metastore setting for size of the MySQL connection pool. Default is 10. | `number` | `10` | no |
| hms\_ro\_ecs\_task\_count | Desired ECS task count of the read only Hive Metastore service. | `string` | `"3"` | no |
| hms\_ro\_heapsize | Heapsize for the read only Hive Metastore.<br>Valid values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | `string` | `"2048"` | no |
| hms\_ro\_k8s\_replica\_count | Initial number of read only Hive Metastore k8s pod replicas to create. | `number` | `3` | no |
| hms\_ro\_k8s\_max\_replica\_count | Max number of read only Hive Metastore k8s pod replicas to create. | `number` | `10` | no |
| hms\_ro\_target\_cpu\_percentage | Read only Hive Metastore autoscaling threshold for CPU target usage. | `number` | `60` | no |
| hms\_rw\_cpu | CPU for the read/write Hive Metastore ECS task.<br>Valid values can be 256, 512, 1024, 2048 and 4096.<br>Reference: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | `string` | `"512"` | no |
| hms\_rw\_db\_connection\_pool\_size | Read-write Hive metastore setting for size of the MySQL connection pool. Default is 10. | `number` | `10` | no |
| hms\_rw\_ecs\_task\_count | Desired ECS task count of the read/write Hive Metastore service. | `string` | `"3"` | no |
| hms\_rw\_heapsize | Heapsize for the read/write Hive Metastore.<br>Valid values: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html | `string` | `"2048"` | no |
| hms\_rw\_k8s\_replica\_count | Initial number of read/write Hive Metastore k8s pod replicas to create. | `number` | `3` | no |
| iam\_name\_root | Name to identify Hive Metastore IAM roles. | `string` | `"hms"` | no |
| ingress\_cidr | Generally allowed ingress CIDR list. | `list(string)` | n/a | yes |
| instance\_name | Apiary instance name to identify resources in multi-instance deployments. | `string` | `""` | no |
@@ -296,4 +301,4 @@ Each semicolon-delimited section will create a new statement entry in the bucke
```
#### Interactions with `apiary_consumer_iamroles` and `apiary_consumer_prefix_iamroles`
- Note that any IAM roles in `apiary_consumer_iamroles` would not be subject to the restrictions from `apiary_customer_condition`, and so could read any S3 object, even if they don't have a `data-sensitivity` tag, or if the `data-sensitivity` tag is `true`, or if there is no `data-type` tag of `image*`.
- Note that any IAM roles in `apiary_consumer_prefix_iamroles` would not be subject to the restrictions from `apiary_customer_condition` for the schemas and prefixes specified in the map, and so could read any S3 object under those prefixes, even if they don't have a `data-sensitivity` tag, or if the `data-sensitivity` tag is `true`, or if there is no `data-type` tag of `image*`.
- Note that any IAM roles in `apiary_consumer_prefix_iamroles` would not be subject to the restrictions from `apiary_customer_condition` for the schemas and prefixes specified in the map, and so could read any S3 object under those prefixes, even if they don't have a `data-sensitivity` tag, or if the `data-sensitivity` tag is `true`, or if there is no `data-type` tag of `image*`.
24 changes: 23 additions & 1 deletion k8s-readonly.tf
@@ -16,7 +16,7 @@ resource "kubernetes_deployment" "apiary_hms_readonly" {
}

spec {
replicas = 3
replicas = var.hms_ro_k8s_replica_count
selector {
match_labels = {
name = "${local.hms_alias}-readonly"
@@ -219,6 +219,28 @@ resource "kubernetes_deployment" "apiary_hms_readonly" {
}
}

resource "kubernetes_horizontal_pod_autoscaler" "hms_readonly" {
  count = var.hms_instance_type == "k8s" && var.enable_autoscaling ? 1 : 0

  metadata {
    name      = "${local.hms_alias}-readonly"
    namespace = var.metastore_namespace
  }

  spec {
    min_replicas = var.hms_ro_k8s_replica_count
    max_replicas = var.hms_ro_k8s_max_replica_count

    target_cpu_utilization_percentage = var.hms_ro_target_cpu_percentage

    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = kubernetes_deployment.apiary_hms_readonly[0].metadata[0].name
    }
  }
}
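
For orientation, the scaling rule that the Kubernetes HPA controller applies to `target_cpu_utilization_percentage` can be sketched with Terraform's own numeric functions. This is an illustration only and not part of the commit: the `current_replicas` and `observed_cpu_percent` values are hypothetical, Terraform itself never performs this calculation, and the real controller also applies a tolerance and stabilization behavior that the sketch ignores.

```hcl
# Illustrative sketch: approximates the HPA rule
#   desired = ceil(current_replicas * observed_cpu / target_cpu)
# clamped to [min_replicas, max_replicas]. All local names are hypothetical.
locals {
  current_replicas     = 3  # hypothetical number of running read-only pods
  observed_cpu_percent = 90 # hypothetical average CPU utilization across pods
  target_cpu_percent   = 60 # default of var.hms_ro_target_cpu_percentage
  min_replicas         = 3  # default of var.hms_ro_k8s_replica_count
  max_replicas         = 10 # default of var.hms_ro_k8s_max_replica_count

  # ceil(3 * 90 / 60) = ceil(4.5) = 5
  unclamped_replicas = ceil(local.current_replicas * local.observed_cpu_percent / local.target_cpu_percent)

  # Keep the result inside the configured bounds.
  desired_replicas = min(local.max_replicas, max(local.min_replicas, local.unclamped_replicas))
}

output "desired_replicas" {
  value = local.desired_replicas
}
```

With the defaults from this commit, sustained CPU above 60% of the pods' requested CPU would be expected to add replicas up to the maximum of 10, and the count would not drop below the initial 3.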

resource "kubernetes_service" "hms_readonly" {
count = var.hms_instance_type == "k8s" ? 1 : 0
metadata {
2 changes: 1 addition & 1 deletion k8s-readwrite.tf
@@ -16,7 +16,7 @@ resource "kubernetes_deployment" "apiary_hms_readwrite" {
}

spec {
replicas = 3
replicas = var.hms_rw_k8s_replica_count
selector {
match_labels = {
name = "${local.hms_alias}-readwrite"
32 changes: 31 additions & 1 deletion variables.tf
@@ -344,6 +344,36 @@ variable "hms_ro_ecs_task_count" {
default = "3"
}

variable "hms_rw_k8s_replica_count" {
  description = "Initial Number of read/write Hive Metastore k8s pod replicas to create."
  type        = number
  default     = 3
}

variable "hms_ro_k8s_replica_count" {
  description = "Initial Number of read only Hive Metastore k8s pod replicas to create."
  type        = number
  default     = 3
}

variable "hms_ro_k8s_max_replica_count" {
  description = "Max Number of read only Hive Metastore k8s pod replicas to create."
  type        = number
  default     = 10
}

variable "enable_autoscaling" {
  description = "Enable read only Hive Metastore k8s horizontal pod autoscaling"
  type        = bool
  default     = false
}

variable "hms_ro_target_cpu_percentage" {
  description = "Read only Hive Metastore autoscaling threshold for CPU target usage."
  type        = number
  default     = 60
}
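
As a usage sketch, a caller could opt in to read-only autoscaling by setting the new inputs on the module block. The module label `apiary` and the `source` path below are hypothetical placeholders, and the fragment omits the module's other required inputs, so it is not a complete configuration on its own.

```hcl
module "apiary" {
  # Hypothetical source path; point this at wherever the module lives in your setup.
  source = "./modules/apiary"

  # ... other required inputs (for example ingress_cidr) omitted for brevity ...

  # The autoscaler is only created when the metastore runs on k8s
  # and enable_autoscaling is true.
  hms_instance_type  = "k8s"
  enable_autoscaling = true

  # Read-only metastore: initial (and minimum) replicas, upper bound, CPU target.
  hms_ro_k8s_replica_count     = 3
  hms_ro_k8s_max_replica_count = 10
  hms_ro_target_cpu_percentage = 60

  # Read/write metastore replicas are fixed; this commit adds no autoscaler for them.
  hms_rw_k8s_replica_count = 3
}
```

Because `enable_autoscaling` defaults to `false`, existing deployments keep a fixed replica count of 3 unless they set it explicitly.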

variable "elb_timeout" {
description = "Idle timeout for Apiary ELB."
type = string
@@ -586,7 +616,7 @@ variable "enable_dashboard" {
description = "make EKS & ECS dashboard optional"
type = bool
default = true
}
}

variable "rds_family" {
description = "RDS family"