feat: remove cloudfront distribution and custom alternate domain from backend #422

Merged (11 commits) on Aug 23, 2024
38 changes: 29 additions & 9 deletions README.md
@@ -1,7 +1,9 @@
# veda-backend

This project deploys a complete backend for a [SpatioTemporal Asset Catalog](https://stacspec.org/) including a postgres database, a metadata API, and raster tiling API. Veda-backend is a non-forked version of the [eoAPI](https://github.com/developmentseed/eoAPI) demo project. Veda-backend is decoupled from the demo project to selectively incorporate new stable functionality from the fast moving development in eoAPI while providing a continuous baseline for veda-backend users and to support project specific business and deployment logic.

The primary tools employed in the [eoAPI demo](https://github.com/developmentseed/eoAPI) and this project are:

- [stac-spec](https://github.com/radiantearth/stac-spec)
- [stac-api-spec](https://github.com/radiantearth/stac-api-spec)
- [stac-fastapi](https://github.com/stac-utils/stac-fastapi)
@@ -11,6 +13,7 @@ The primary tools employed in the [eoAPI demo](https://github.com/developmentsee
- [eoapi-cdk](https://github.com/developmentseed/eoapi-cdk/tree/main#eoapi-cdk-constructs) + [radiantearth/stac-browser](https://github.com/radiantearth/stac-browser)

## VEDA backend context

![architecture diagram](.readme/veda-overview-bw.svg)

_Edit this diagram in VS Code using the [Draw.io Integration Extension](https://marketplace.visualstudio.com/items?itemName=hediet.vscode-drawio) and export a new SVG_
@@ -33,12 +36,14 @@ An [.example.env](.example.env) template is supplied for local deployments. If u

### Fetch environment variables using AWS CLI

To retrieve the variables for a stage that has been previously deployed, the secrets manager can be used to quickly populate an .env file with [scripts/sync-env-local.sh](scripts/sync-env-local.sh).

```bash
./scripts/sync-env-local.sh <app-secret-name>
```

### Basic environment variables

| Name | Explanation |
| --- | --- |
| `APP_NAME` | Optional app name used to name stack and resources, defaults to `veda-backend` |
@@ -48,6 +53,7 @@ To retrieve the variables for a stage that has been previously deployed, the sec
| `VEDA_DB_SNAPSHOT_ID` | **Once used always REQUIRED** Optional RDS snapshot identifier to initialize RDS from a snapshot |

### Advanced configuration

The constructs and applications in this project are configured using pydantic. The settings are defined in config.py files stored alongside the associated construct or application--for example the settings for the RDS PostgreSQL construct are defined in database/infrastructure/config.py. For custom configuration, use environment variables to override the pydantic defaults.
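
The override mechanism described above can be illustrated with a minimal standard-library sketch. It is not the project's pydantic code: the class `ExampleDbSettings`, its fields, and the `EXAMPLE_DB_` prefix are hypothetical names chosen only to show how a prefixed environment variable can replace a default.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class ExampleDbSettings:
    """Hypothetical settings; field names and defaults are illustrative only."""

    instance_size: str = "t3.small"
    allocated_storage: int = 20


def load_settings(env_prefix: str, environ=None) -> ExampleDbSettings:
    """Override defaults with <PREFIX><FIELD_NAME> environment variables."""
    environ = os.environ if environ is None else environ
    overrides = {}
    for field in fields(ExampleDbSettings):
        raw = environ.get(f"{env_prefix}{field.name.upper()}")
        if raw is not None:
            # Cast the raw string to the type of the field's default value.
            overrides[field.name] = type(field.default)(raw)
    return ExampleDbSettings(**overrides)
```

Calling `load_settings("EXAMPLE_DB_", {"EXAMPLE_DB_ALLOCATED_STORAGE": "100"})` would override only `allocated_storage` and leave `instance_size` at its default. In the real project pydantic performs the lookup and validation.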

| Construct | Env Prefix | Configuration |
@@ -64,24 +70,28 @@ The constructs and applications in this project are configured using pydantic. T
### Deploying to the cloud

#### Install deployment pre-requisites

- [Node](https://nodejs.org/)
- [NVM](https://github.com/nvm-sh/nvm#node-version-manager---)
- [jq](https://jqlang.github.io/jq/) (used for exporting environment variable secrets to `.env` in [scripts/sync-env-local.sh](/scripts/sync-env-local.sh))

These can be installed with [homebrew](https://brew.sh/) on macOS:

```bash
brew install node
brew install nvm
brew install jq
```

#### Virtual environment example

```bash
python3 -m venv .venv
source .venv/bin/activate
```

#### Install requirements

```bash
nvm use --lts
npm install --location=global aws-cdk
@@ -99,7 +109,7 @@ cdk diff
# Execute deployment and standby--security changes will require approval for deployment
cdk deploy
```

## Deleting the CloudFormation stack

If this is a development stack that is safe to delete, you can delete the stack in the CloudFormation console or via `cdk destroy`. However, additional manual steps are required to completely delete the stack resources:
@@ -112,44 +122,51 @@ If this is a development stack that is safe to delete, you can delete the stack
## Custom deployments

The default settings for this project generate a complete AWS environment including a VPC and gateways for the stack. See this guidance for adjusting the veda-backend stack for existing managed and/or shared AWS environments.

- [Deploy to an existing managed AWS environment](docs/deploying_to_existing_environments.md)
- [Creating a shared base VPC and AWS environment](docs/deploying_to_existing_environments.md#optional-deploy-standalone-base-infrastructure)

## Local Docker deployment

Start up a local stack:

```bash
docker compose up
```

Clean up after running locally:

```bash
docker compose down
```

## Running tests locally

To run the tests used in CI, a script is included that requires as little setup as possible:

```bash
./scripts/run-local-tests.sh
```

In case of failure, all container logs will be written out to `container_logs.log`.

# Operations

## Adding new data to veda-backend

> **Warning** PgSTAC records should be loaded in the database using [pypgstac](https://github.com/stac-utils/pgstac#pypgstac) for proper indexing and partitioning.

The VEDA ecosystem includes tools specifically created for loading PgSTAC records and optimizing data assets. The [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) project provides examples of cloud pipelines that transform data to cloud optimized formats, generate STAC metadata, and submit records for publication to the veda-backend database via veda-backend's ingest API. Veda-backend's integrated ingest system includes an API lambda for enqueuing collection and item records in a DynamoDB table and an ingestor lambda that batch loads the enqueued DynamoDB records into the PgSTAC database. Currently, the client ID and domain of an existing Cognito user pool programmatic client must be supplied in [configuration](ingest_api/infrastructure/config.py) as `VEDA_CLIENT_ID` and `VEDA_COGNITO_DOMAIN` (the [veda-auth project](https://github.com/NASA-IMPACT/veda-auth) can be used to deploy a Cognito user pool and client). To dispense auth tokens via the ingest API swagger docs and `/token` endpoints, an administrator must add the ingest API lambda URL to the allowed callbacks of the Cognito client.

## Support scripts
Support scripts are provided for manual system operations.

- [Rotate pgstac password](support_scripts/README.md#rotate-pgstac-password)

# VEDA ecosystem

## Projects

| Name | Explanation |
| --- | --- |
| **veda-backend** | Central index (database) and APIs for recording, discovering, viewing, and using VEDA assets |
@@ -159,6 +176,7 @@ Support scripts are provided for manual system operations.
| [**veda-data**](https://github.com/NASA-IMPACT/veda-data) | Collection and asset discovery configuration |
| [**veda-data-airflow**](https://github.com/NASA-IMPACT/veda-data-airflow) | Cloud optimize data assets and submit records for publication to veda-stac-ingestor |
| [**veda-docs**](https://github.com/NASA-IMPACT/veda-docs) | Documentation repository for end users of VEDA ecosystem data and tools |
| [**veda-routes**](https://github.com/NASA-IMPACT/veda-routes)| Configuration for VEDA's Content Delivery Network |

## VEDA usage examples

@@ -169,7 +187,9 @@ Support scripts are provided for manual system operations.
# STAC community resources

## STAC browser

Radiant Earth's [stac-browser](https://github.com/radiantearth/stac-browser) is a browser for STAC catalogs. The demo version of this browser, [radiantearth.github.io/stac-browser](https://radiantearth.github.io/stac-browser/#/), can be used to browse the contents of the veda-backend STAC catalog: paste the veda-backend stac-api URL deployed by this project into the demo and click load. Read more about the recent developments and usage of stac-browser [here](https://medium.com/radiant-earth-insights/the-exciting-future-of-the-stac-browser-2351143aa24b).

# License

This project is licensed under **Apache 2**, see the [LICENSE](LICENSE) file for more details.
73 changes: 8 additions & 65 deletions app.py
@@ -8,14 +8,12 @@

from config import veda_app_settings
from database.infrastructure.construct import RdsConstruct
from domain.infrastructure.construct import DomainConstruct
from ingest_api.infrastructure.config import IngestorConfig as ingest_config
from ingest_api.infrastructure.construct import ApiConstruct as ingest_api_construct
from ingest_api.infrastructure.construct import IngestorConstruct as ingestor_construct
from network.infrastructure.construct import VpcConstruct
from permissions_boundary.infrastructure.construct import PermissionsBoundaryAspect
from raster_api.infrastructure.construct import RasterApiLambdaConstruct
from routes.infrastructure.construct import CloudfrontDistributionConstruct
from s3_website.infrastructure.construct import VedaWebsite
from stac_api.infrastructure.construct import StacApiLambdaConstruct

@@ -71,15 +69,12 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
stage=veda_app_settings.stage_name(),
)

domain = DomainConstruct(veda_stack, "domain", stage=veda_app_settings.stage_name())

raster_api = RasterApiLambdaConstruct(
veda_stack,
"raster-api",
stage=veda_app_settings.stage_name(),
vpc=vpc.vpc,
database=database,
domain=domain,
)

stac_api = StacApiLambdaConstruct(
@@ -89,23 +84,12 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
vpc=vpc.vpc,
database=database,
raster_api=raster_api,
domain=domain,
)

website = VedaWebsite(
veda_stack, "stac-browser-bucket", stage=veda_app_settings.stage_name()
)

veda_routes = CloudfrontDistributionConstruct(
veda_stack,
"routes",
stage=veda_app_settings.stage_name(),
raster_api_id=raster_api.raster_api.api_id,
stac_api_id=stac_api.stac_api.api_id,
origin_bucket=website.bucket,
region=veda_app_settings.cdk_default_region,
)

# Only create a stac browser if we can infer the catalog url from configuration before synthesis (API Gateway URL not yet available)
stac_catalog_url = veda_app_settings.get_stac_catalog_url()
if stac_catalog_url:
@@ -120,23 +104,27 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
db_secret_name = database.pgstac.secret.secret_name
db_security_group = database.db_security_group

base_api_url = f"https://{veda_app_settings.stage_name()}.{veda_app_settings.veda_custom_host}".strip(
"/"
)

Review comment (Contributor): Doesn't VEDA_CUSTOM_HOST already include the stage name?

Review comment (Member Author): Ah it does!

stac_api_url = f"{base_api_url}{veda_app_settings.veda_stac_root_path}/"
raster_api_url = f"{base_api_url}{veda_app_settings.veda_raster_root_path}/"
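
The URL composition above, and the duplication the review comment points out, can be reproduced with a small standalone sketch. The function `build_api_urls`, its `host_includes_stage` flag, and the `example.com` hosts are hypothetical and used only for illustration; they are not part of this pull request.

```python
from typing import Tuple


def build_api_urls(
    stage: str,
    custom_host: str,
    stac_root_path: str,
    raster_root_path: str,
    host_includes_stage: bool = False,
) -> Tuple[str, str]:
    """Compose STAC and raster API URLs from a host and root paths."""
    # Prefix the stage only when the configured host does not already carry it.
    host = custom_host if host_includes_stage else f"{stage}.{custom_host}"
    # strip("/") removes slashes at both ends; here only a trailing one can occur.
    base = f"https://{host}".strip("/")
    return f"{base}{stac_root_path}/", f"{base}{raster_root_path}/"


# A host that already contains the stage gets the stage twice when prefixed again.
duplicated, _ = build_api_urls("dev", "dev.example.com", "/api/stac", "/api/raster")
clean, _ = build_api_urls(
    "dev", "dev.example.com", "/api/stac", "/api/raster", host_includes_stage=True
)
```

Under these assumptions `duplicated` contains `dev.dev.example.com`, while `clean` contains the host exactly once.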

# ingestor config requires references to other resources, but can be shared between ingest api and bulk ingestor
ingestor_config = ingest_config(
stage=veda_app_settings.stage_name(),
stac_db_security_group_id=db_security_group.security_group_id,
stac_api_url=stac_api.stac_api.url,
raster_api_url=raster_api.raster_api.url,
stac_api_url=stac_api_url,

Review comment (Collaborator): I think we could handle inferring the cloudfront urls in config with a property or function

@property
def veda_stac_api_cf_url(self) -> str:
    """inferred cloudfront url of the stac api if app is configured with a custom host and root path"""
    if self.veda_custom_host and self.veda_stac_root_path:
        return f"https://{self.veda_custom_host}{self.veda_stac_root_path}"
    return None

and then conditionally use them here so that we have a way to use the API gateway url if a cloudfront is not used.

stac_api_url=settings.stac_api_cf_url if settings.stac_api_cf_url else stac_api.stac_api.url

raster_api_url=raster_api_url,
)
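
The reviewer's suggestion above can be sketched as a self-contained example. This is an illustration under stated assumptions, not the code that was merged: `AppSettings` is a hypothetical dataclass standing in for the pydantic `vedaAppSettings` class, and `resolve_stac_api_url` is a hypothetical helper. The return type is annotated as `Optional[str]` because the property may return `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppSettings:
    """Hypothetical stand-in for vedaAppSettings (a pydantic BaseSettings in the project)."""

    veda_custom_host: Optional[str] = None
    veda_stac_root_path: str = ""

    @property
    def veda_stac_api_cf_url(self) -> Optional[str]:
        """Inferred CloudFront URL of the STAC API, or None if not configured."""
        if self.veda_custom_host and self.veda_stac_root_path:
            return f"https://{self.veda_custom_host}{self.veda_stac_root_path}"
        return None


def resolve_stac_api_url(settings: AppSettings, api_gateway_url: str) -> str:
    """Prefer the inferred CloudFront URL and fall back to the API Gateway URL."""
    return settings.veda_stac_api_cf_url or api_gateway_url
```

With this shape, a deployment without a custom host keeps using the API Gateway URL, which is the fallback the reviewer asks for.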


ingest_api = ingest_api_construct(
veda_stack,
"ingest-api",
config=ingestor_config,
db_secret=database.pgstac.secret,
db_vpc=vpc.vpc,
db_vpc_subnets=database.vpc_subnets,
domain=domain,
)

ingestor = ingestor_construct(
@@ -149,51 +137,6 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
db_vpc_subnets=database.vpc_subnets,
)

veda_routes.add_ingest_behavior(
ingest_api=ingest_api.api, stage=veda_app_settings.stage_name()
)

# Must be done after all CF behaviors exist
veda_routes.create_route_records(stage=veda_app_settings.stage_name())


# TODO this conditional supports deploying a second set of APIs to a separate custom domain and should be removed if no longer necessary
Review comment (Collaborator): 🎉

if veda_app_settings.alt_domain():
alt_domain = DomainConstruct(
veda_stack,
"alt-domain",
stage=veda_app_settings.stage_name(),
alt_domain=True,
)

alt_raster_api = RasterApiLambdaConstruct(
veda_stack,
"alt-raster-api",
stage=veda_app_settings.stage_name(),
vpc=vpc.vpc,
database=database,
domain_name=alt_domain.raster_domain_name,
)

alt_stac_api = StacApiLambdaConstruct(
veda_stack,
"alt-stac-api",
stage=veda_app_settings.stage_name(),
vpc=vpc.vpc,
database=database,
raster_api=raster_api,
domain_name=alt_domain.stac_domain_name,
)

alt_ingest_api = ingest_api_construct(
veda_stack,
"alt-ingest-api",
config=ingestor_config,
db_secret=database.pgstac.secret,
db_vpc=vpc.vpc,
domain_name=alt_domain.ingest_domain_name,
)

git_sha = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
try:
git_tag = subprocess.check_output(["git", "describe", "--tags"]).decode().strip()
16 changes: 6 additions & 10 deletions config.py
@@ -90,7 +90,12 @@ class vedaAppSettings(BaseSettings):

veda_stac_root_path: str = Field(
"",
description="Optional path prefix to add to all api endpoints. Used to infer url of stac-api before app synthesis.",
description="STAC API root path. Used to infer url of stac-api before app synthesis.",
)

veda_raster_root_path: str = Field(
"",
description="Raster API root path",
)

ciaransweet marked this conversation as resolved.

veda_domain_create_custom_subdomains: bool = Field(
@@ -115,15 +120,6 @@ def cdk_env(self) -> dict:
else:
return {}

def alt_domain(self) -> bool:
"""True if alternative domain and host parameters provided"""
return all(
[
self.veda_domain_alt_hosted_zone_id,
self.veda_domain_alt_hosted_zone_name,
]
)

def stage_name(self) -> str:
"""Force lowercase stage name"""
return self.stage.lower()
49 changes: 0 additions & 49 deletions domain/infrastructure/config.py

This file was deleted.
