The Metadata Service is a central store for Metaflow metadata. Namely, it contains information about past runs and pointers to the data artifacts they produced. The Metaflow client talks to the Metadata Service over an HTTP API endpoint. The Metadata Service is not strictly required to use Metaflow (you can use Metaflow in "local" mode without it), but it enables a lot of useful functionality, especially if more than one person on your team uses Metaflow.

This Terraform module provisions the infrastructure to run the Metadata Service on AWS Fargate.

To read more, see the Metaflow docs.

If the `access_list_cidr_blocks` variable is set, only traffic originating from the specified IP addresses will be accepted. Services internal to AWS can directly access the load balancer used by the API.

Name | Description | Type | Default | Required |
---|---|---|---|---|
access_list_cidr_blocks | List of CIDRs we want to grant access to our Metaflow Metadata Service. Usually this is our VPN's CIDR blocks. | list(string) | n/a | yes |
database_name | The database name | string | "metaflow" | no |
database_password | The database password | string | n/a | yes |
database_username | The database username | string | n/a | yes |
datastore_s3_bucket_kms_key_arn | The ARN of the KMS key used to encrypt the Metaflow datastore S3 bucket | string | n/a | yes |
db_migrate_lambda_zip_file | Output path for the zip file containing the DB migrate lambda | string | null | no |
enable_api_basic_auth | Enable basic auth for API Gateway? (requires key export) | bool | true | no |
enable_api_gateway | Enable API Gateway for public metadata service endpoint | bool | true | no |
fargate_execution_role_arn | The IAM role that grants access to ECS and Batch services which we'll use as our Metadata Service API's execution_role for our Fargate instance | string | n/a | yes |
iam_partition | IAM Partition (Select aws-us-gov for AWS GovCloud, otherwise leave as is) | string | "aws" | no |
is_gov | Set to true if IAM partition is 'aws-us-gov' | bool | false | no |
metadata_service_container_image | Container image for metadata service | string | "" | no |
metadata_service_cpu | ECS task CPU unit for metadata service | number | 512 | no |
metadata_service_memory | ECS task memory in MiB for metadata service | number | 1024 | no |
metaflow_vpc_id | ID of the Metaflow VPC the Metadata Service is to be deployed in | string | n/a | yes |
rds_master_instance_endpoint | The database connection endpoint in address:port format | string | n/a | yes |
resource_prefix | Prefix given to all AWS resources to differentiate between applications | string | n/a | yes |
resource_suffix | Suffix given to all AWS resources to differentiate between environment and workspace | string | n/a | yes |
s3_bucket_arn | The ARN of the bucket we'll be using as blob storage | string | n/a | yes |
standard_tags | The standard tags to apply to every AWS resource. | map(string) | n/a | yes |
subnet1_id | First private subnet used for availability zone redundancy | string | n/a | yes |
subnet2_id | Second private subnet used for availability zone redundancy | string | n/a | yes |
vpc_cidr_blocks | The VPC CIDR blocks that we'll access list on our Metadata Service API to allow all internal communications | list(string) | n/a | yes |
with_public_ip | Enable public IP assignment for the Metadata Service. Typically you want this to be set to true if using public subnets as subnet1_id and subnet2_id, and false otherwise | bool | n/a | yes |

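
As a rough illustration of how the required inputs fit together, the sketch below instantiates the module with placeholder values. The module name `metaflow_metadata_service`, the `source` path, and every ID, ARN, CIDR, and endpoint are assumptions made for the example, not values taken from this module; replace them with your own. Optional inputs are omitted and fall back to the defaults listed in the table.

```hcl
# Hypothetical variable: supply the password from a secret store or a
# TF_VAR_database_password environment variable instead of hard-coding it.
variable "database_password" {
  type      = string
  sensitive = true
}

module "metaflow_metadata_service" {
  # Assumption: point this at wherever you obtain the module.
  source = "./modules/metadata-service"

  # Naming and tagging (placeholder values)
  resource_prefix = "metaflow-"
  resource_suffix = "-dev"
  standard_tags   = { project = "metaflow" }

  # Networking (placeholder IDs and CIDRs)
  metaflow_vpc_id         = "vpc-0123456789abcdef0"
  subnet1_id              = "subnet-0123456789abcdef0"
  subnet2_id              = "subnet-0fedcba9876543210"
  vpc_cidr_blocks         = ["10.10.0.0/16"]
  access_list_cidr_blocks = ["203.0.113.0/24"] # e.g. a VPN range
  with_public_ip          = false              # false for private subnets

  # Database (endpoint is in address:port format)
  database_username            = "metaflow"
  database_password            = var.database_password
  rds_master_instance_endpoint = "example-db.abc123.us-west-2.rds.amazonaws.com:5432"

  # Datastore and IAM (placeholder ARNs)
  s3_bucket_arn                   = "arn:aws:s3:::example-metaflow-datastore"
  datastore_s3_bucket_kms_key_arn = "arn:aws:kms:us-west-2:123456789012:key/00000000-0000-0000-0000-000000000000"
  fargate_execution_role_arn      = "arn:aws:iam::123456789012:role/example-ecs-execution-role"
}
```
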
Name | Description |
---|---|
METAFLOW_SERVICE_INTERNAL_URL | URL for Metadata Service (Accessible in VPC) |
METAFLOW_SERVICE_URL | URL for Metadata Service (Open to Public Access) |
api_gateway_rest_api_id | The ID of the API Gateway REST API we'll use to accept Metadata Service requests to forward to the Fargate API instance |
api_gateway_rest_api_id_key_id | API Gateway Key ID for Metadata Service. Fetch Key from AWS Console [METAFLOW_SERVICE_AUTH_KEY] |
metadata_service_security_group_id | The security group ID used by the Metadata Service. We'll grant this access to our DB. |
metadata_svc_ecs_task_role_arn | This role is passed to the AWS ECS task definition as the task_role. It gives the running Metaflow Metadata Service the permissions it needs to talk to other AWS resources. |
migration_function_arn | ARN of DB Migration Function |
network_load_balancer_dns_name | The DNS-addressable name of the Network Load Balancer that accepts requests and forwards them to our Fargate Metadata Service instance(s) |
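
The outputs above can be consumed from the calling configuration. The following is a minimal sketch, assuming the module was instantiated under the hypothetical name `metaflow_metadata_service`. The database security group ID is a placeholder, and port 5432 is an assumption (the PostgreSQL default); use the port from your own `rds_master_instance_endpoint`.

```hcl
# Re-export the service URLs so they can be read with `terraform output`.
output "metaflow_service_url" {
  description = "Public URL of the Metadata Service"
  value       = module.metaflow_metadata_service.METAFLOW_SERVICE_URL
}

output "metaflow_service_internal_url" {
  description = "URL of the Metadata Service reachable inside the VPC"
  value       = module.metaflow_metadata_service.METAFLOW_SERVICE_INTERNAL_URL
}

# Allow the Metadata Service's security group to reach the database.
resource "aws_security_group_rule" "metadata_service_to_db" {
  type                     = "ingress"
  from_port                = 5432 # assumption: PostgreSQL default port
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = "sg-0123456789abcdef0" # placeholder: the database's security group
  source_security_group_id = module.metaflow_metadata_service.metadata_service_security_group_id
}
```

After `terraform apply`, running `terraform output metaflow_service_url` prints the value that the output name suggests belongs in the Metaflow client's `METAFLOW_SERVICE_URL` setting.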