Improve flexibility for publishing options #2964

Merged
merged 15 commits into from
Dec 3, 2023
Changes from 12 commits
10 changes: 9 additions & 1 deletion .github/workflows/build-deploy-pudl.yml
@@ -12,6 +12,7 @@ env:
GITHUB_REF: ${{ github.ref_name }} # This is changed to dev if running on a schedule
GCE_INSTANCE: pudl-deployment-tag # This is changed to pudl-deployment-dev if running on a schedule
GCE_INSTANCE_ZONE: ${{ secrets.GCE_INSTANCE_ZONE }}
GCS_OUTPUT_BUCKET: gs://nightly-build-outputs.catalyst.coop

jobs:
build_and_deploy_pudl:
@@ -34,6 +35,7 @@ jobs:
- name: Get HEAD of the branch (main or dev)
run: |
echo "ACTION_SHA=$(git rev-parse HEAD)" >> $GITHUB_ENV
echo "SHORT_SHA=$(git rev-parse --short HEAD)" >> $GITHUB_ENV

- name: Print action vars
run: |
@@ -83,6 +85,11 @@ jobs:
- name: Set up Cloud SDK
uses: google-github-actions/setup-gcloud@v1

- name: Determine commit information
run: |-
Member commented:

TIL about the block chomping operator. Does GHA complain when there's an extra newline at the end here?
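
For context on the question above: in YAML, `|-` strips the final newline from a block scalar, plain `|` keeps exactly one, and `|+` keeps all trailing newlines. The sketch below does not exercise GitHub Actions itself. It only simulates the three resulting strings and assumes the runner hands the step body to `bash` more or less verbatim, which is an assumption and not something this PR confirms. `run_body` is a hypothetical helper.

```shell
# A literal newline, used to build the three variants portably.
nl='
'
strip_body='echo "hello"'               # `|-` (strip): no trailing newline
clip_body="echo \"hello\"${nl}"         # `|`  (clip): exactly one trailing newline
keep_body="echo \"hello\"${nl}${nl}"    # `|+` (keep): all trailing newlines preserved

# Hypothetical helper: run a step body the way a shell runs any script text.
run_body() {
    bash -c "$1"
}

# Each variant runs the same command; trailing newlines are just empty lines.
for body in "$strip_body" "$clip_body" "$keep_body"; do
    run_body "$body"
done
```

Under that assumption the shell treats all three bodies the same way, so the choice of chomping indicator is mostly a matter of style for a `run:` step.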

echo "COMMIT_BRANCH=$(git rev-parse --abbrev-ref HEAD)" >> $GITHUB_ENV
echo "COMMIT_TIME=$(git log -1 --format=%cd --date=format:%Y-%m-%d-%H%M)" >> $GITHUB_ENV

# Deploy PUDL image to GCE
- name: Deploy
env:
@@ -119,6 +126,7 @@ jobs:
--container-env DAGSTER_PG_DB="dagster-storage" \
--container-env FLY_ACCESS_TOKEN=${{ secrets.FLY_ACCESS_TOKEN }} \
--container-env PUDL_SETTINGS_YML="/home/mambauser/src/pudl/package_data/settings/etl_full.yml" \
--container-env PUDL_GCS_OUTPUT=${{ env.GCS_OUTPUT_BUCKET }}/${{ env.COMMIT_TIME }}-${{ env.SHORT_SHA }}-${{ env.COMMIT_BRANCH }}

# Start the VM
- name: Start the deploy-pudl-vm
@@ -129,6 +137,6 @@ jobs:
uses: slackapi/[email protected]
with:
channel-id: "C03FHB9N0PQ"
slack-message: "build-deploy-pudl status: ${{ job.status }}\n${{ env.ACTION_SHA }}-${{ env.GITHUB_REF }}"
slack-message: "build-deploy-pudl status: ${{ job.status }}\n${{ env.COMMIT_TIME}}-${{ env.SHORT_SHA }}-${{ env.COMMIT_BRANCH }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
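
The workflow above composes `PUDL_GCS_OUTPUT` as `<bucket>/<commit time>-<short SHA>-<branch>`. A minimal sketch of that composition follows. `build_output_path` is a hypothetical helper, and the timestamp and SHA are made-up illustrative values, not taken from any real build.

```shell
# Hypothetical helper mirroring how the workflow composes PUDL_GCS_OUTPUT:
#   <bucket>/<commit time>-<short sha>-<branch>
build_output_path() {
    # $1 = bucket URL, $2 = commit time, $3 = short SHA, $4 = branch name
    # ${1%/} drops one trailing slash from the bucket so the path has no "//".
    printf '%s/%s-%s-%s\n' "${1%/}" "$2" "$3" "$4"
}

# Illustrative values only.
build_output_path "gs://nightly-build-outputs.catalyst.coop" \
    "2023-12-03-0600" "abc1234" "dev"
```

Two details may be worth checking in the real workflow: `git rev-parse --abbrev-ref HEAD` prints `HEAD` when the checkout is detached, and a branch name containing `/` would add an extra level to the bucket path.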
25 changes: 18 additions & 7 deletions docker/gcp_pudl_etl.sh
@@ -2,6 +2,8 @@
# This script runs the entire ETL and validation tests in a docker container on a Google Compute Engine instance.
# This script won't work locally because it needs adequate GCP permissions.

: "${PUDL_GCS_OUTPUT:=gs://nightly-build-outputs.catalyst.coop/$ACTION_SHA-$GITHUB_REF}"
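
The line above relies on the `: "${VAR:=default}"` idiom: `:` is a no-op command, but its argument is still expanded, and `:=` assigns the default when the variable is unset or empty. A small sketch, using a hypothetical `DEMO_OUTPUT` variable and made-up bucket names in place of `PUDL_GCS_OUTPUT`:

```shell
# Case 1: the variable is unset, so the default is assigned.
unset DEMO_OUTPUT
: "${DEMO_OUTPUT:=gs://example-bucket/fallback}"
echo "default applied: $DEMO_OUTPUT"

# Case 2: the caller already set a value, so the default is ignored.
DEMO_OUTPUT="gs://example-bucket/explicit"
: "${DEMO_OUTPUT:=gs://example-bucket/fallback}"
echo "explicit value kept: $DEMO_OUTPUT"
```

This is what lets the script keep working when it is launched without `PUDL_GCS_OUTPUT`, while the workflow can still override the destination.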

set -x

function send_slack_msg() {
@@ -27,26 +29,26 @@ function run_pudl_etl() {
--loglevel DEBUG \
--gcs-cache-path gs://internal-zenodo-cache.catalyst.coop \
--workers 8 \
$PUDL_SETTINGS_YML && \
pudl_etl \
$PUDL_SETTINGS_YML \
&& pudl_etl \
--loglevel DEBUG \
--gcs-cache-path gs://internal-zenodo-cache.catalyst.coop \
$PUDL_SETTINGS_YML && \
pytest \
$PUDL_SETTINGS_YML \
&& pytest \
-n auto \
--gcs-cache-path gs://internal-zenodo-cache.catalyst.coop \
--etl-settings $PUDL_SETTINGS_YML \
--live-dbs test/integration test/unit && \
pytest \
--live-dbs test/integration test/unit \
&& pytest \
-n auto \
--gcs-cache-path gs://internal-zenodo-cache.catalyst.coop \
--etl-settings $PUDL_SETTINGS_YML \
--live-dbs test/validate \
&& touch ${PUDL_OUTPUT}/success
}
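
`run_pudl_etl` chains its steps with `&&`, so `touch ${PUDL_OUTPUT}/success` runs only if every earlier step exited with status 0. Note that a line beginning with `&&` only parses if the previous line ends in a backslash continuation. The sketch below reproduces the pattern with a hypothetical `run_chain` function, using `true` and `false` as stand-ins for the real ETL and test commands.

```shell
# Hypothetical stand-in for run_pudl_etl.
run_chain() {
    # $1 = output directory, $2 = command standing in for an ETL/test step
    true \
    && "$2" \
    && touch "$1/success"
}

out_dir=$(mktemp -d)

# Every step passes, so the marker file is written.
run_chain "$out_dir" true
[ -f "$out_dir/success" ] && echo "all steps passed"

# One step fails, so the chain stops and no marker is written.
rm -f "$out_dir/success"
run_chain "$out_dir" false || echo "chain stopped early"
[ -f "$out_dir/success" ] || echo "no success marker written"
```

Because the outputs are uploaded whether or not the build succeeded, a marker like this gives downstream consumers a cheap way to tell a complete run from a partial one.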

function shutdown_vm() {
# Copy the outputs to the GCS bucket
gsutil -m cp -r $PUDL_OUTPUT "gs://nightly-build-outputs.catalyst.coop/$ACTION_SHA-$GITHUB_REF"

upload_file_to_slack $LOGFILE "pudl_etl logs for $ACTION_SHA-$GITHUB_REF:"

@@ -59,6 +61,11 @@ function shutdown_vm() {
curl -X POST -H "Content-Length: 0" -H "Authorization: Bearer ${ACCESS_TOKEN}" https://compute.googleapis.com/compute/v1/projects/catalyst-cooperative-pudl/zones/$GCE_INSTANCE_ZONE/instances/$GCE_INSTANCE/stop
}

function copy_outputs_to_gcs() {
echo "Copying outputs to GCP bucket $PUDL_GCS_OUTPUT"
gsutil -m cp -r $PUDL_OUTPUT ${PUDL_GCS_OUTPUT}
}

function copy_outputs_to_distribution_bucket() {
echo "Copying outputs to GCP distribution bucket"
gsutil -m -u $GCP_BILLING_PROJECT cp -r "$PUDL_OUTPUT/*" "gs://pudl.catalyst.coop/$GITHUB_REF"
@@ -109,6 +116,9 @@ if [[ $ETL_SUCCESS == 0 ]]; then
ETL_SUCCESS=${PIPESTATUS[0]}
Member commented:

Probably want to remove the success file here prior to distribution.

Collaborator Author replied:

If we replace the success file with, say, a JSON file containing some runtime statistics about how the data was generated, would we want to include it in the distributed files, or leave it out as well?

Member replied:

I think a little build metadata file would be worth including. We're distributing the pudl-etl.log file. It would kind of be a structured accessory to that output.


# Dump outputs to s3 bucket if branch is dev or build was triggered by a tag
# TODO: this behavior should be controlled by on/off switch here and this logic
Member commented:

Agree - having the action just pass in a switch would be nice. I think at some point we should replace this whole nightly build harness with a Python script that's more robust and that would be a nice time to fix this too.

# should be moved to the triggering github action. Having it here feels
# fragmented.
if [ $GITHUB_ACTION_TRIGGER = "push" ] || [ $GITHUB_REF = "dev" ]; then
copy_outputs_to_distribution_bucket
ETL_SUCCESS=${PIPESTATUS[0]}
@@ -124,4 +134,5 @@ else
notify_slack "failure"
fi

copy_outputs_to_gcs
shutdown_vm
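
The script records `${PIPESTATUS[0]}` in `ETL_SUCCESS`. The diff does not show the command that precedes each assignment, so the sketch below assumes the common pattern of piping a step through `tee` for logging, where `$?` alone would report the status of `tee` and not of the step. `demo_pipestatus` and `failing_step` are hypothetical names, and the pipeline runs under an explicit `bash` because `PIPESTATUS` is a bash array.

```shell
# Hypothetical demo: compare $? with PIPESTATUS[0] after a pipeline.
demo_pipestatus() {
    bash <<'EOF'
failing_step() { echo "step output"; return 3; }
failing_step | tee /dev/null
echo "pipeline exit: $? / first stage: ${PIPESTATUS[0]}"
EOF
}

demo_pipestatus
```

`PIPESTATUS` is overwritten by the next command that runs, so it has to be read immediately after the pipeline it describes.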
2 changes: 1 addition & 1 deletion environments/conda-linux-64.lock.yml
@@ -441,7 +441,7 @@ dependencies:
- rich=13.7.0=pyhd8ed1ab_0
- sqlalchemy=2.0.23=py311h459d7ec_0
- stack_data=0.6.2=pyhd8ed1ab_0
- starlette=0.32.0.post1=pyhd8ed1ab_0
- starlette=0.33.0=pyhd8ed1ab_0
- tiledb=2.16.3=h8c794c1_3
- ukkonen=1.0.1=py311h9547e67_4
- uvicorn=0.24.0.post1=py311h38be061_0
24 changes: 12 additions & 12 deletions environments/conda-lock.yml
@@ -21167,45 +21167,45 @@ package:
category: main
optional: false
- name: starlette
version: 0.32.0.post1
version: 0.33.0
manager: conda
platform: linux-64
dependencies:
anyio: <5,>=3.4.0
python: ">=3.8"
typing_extensions: ">=3.10.0"
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.32.0.post1-pyhd8ed1ab_0.conda
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.33.0-pyhd8ed1ab_0.conda
hash:
md5: 9aa6d56db739eee2ff473becbe178fd1
sha256: 9692b83467670b473dc71137376f735249ef2ee6eeefce9068b0dec94810c24c
md5: 55027cf7f50803f0f5ece8b661eff47b
sha256: 3923f4c3e31d8c3a9c574779585137ff834a6108558a8956ef93022d4fcb37a8
category: dev
optional: true
- name: starlette
version: 0.32.0.post1
version: 0.33.0
manager: conda
platform: osx-64
dependencies:
python: ">=3.8"
typing_extensions: ">=3.10.0"
anyio: <5,>=3.4.0
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.32.0.post1-pyhd8ed1ab_0.conda
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.33.0-pyhd8ed1ab_0.conda
hash:
md5: 9aa6d56db739eee2ff473becbe178fd1
sha256: 9692b83467670b473dc71137376f735249ef2ee6eeefce9068b0dec94810c24c
md5: 55027cf7f50803f0f5ece8b661eff47b
sha256: 3923f4c3e31d8c3a9c574779585137ff834a6108558a8956ef93022d4fcb37a8
category: dev
optional: true
- name: starlette
version: 0.32.0.post1
version: 0.33.0
manager: conda
platform: osx-arm64
dependencies:
python: ">=3.8"
typing_extensions: ">=3.10.0"
anyio: <5,>=3.4.0
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.32.0.post1-pyhd8ed1ab_0.conda
url: https://conda.anaconda.org/conda-forge/noarch/starlette-0.33.0-pyhd8ed1ab_0.conda
hash:
md5: 9aa6d56db739eee2ff473becbe178fd1
sha256: 9692b83467670b473dc71137376f735249ef2ee6eeefce9068b0dec94810c24c
md5: 55027cf7f50803f0f5ece8b661eff47b
sha256: 3923f4c3e31d8c3a9c574779585137ff834a6108558a8956ef93022d4fcb37a8
category: dev
optional: true
- name: stevedore
2 changes: 1 addition & 1 deletion environments/conda-osx-64.lock.yml
@@ -421,7 +421,7 @@ dependencies:
- rich=13.7.0=pyhd8ed1ab_0
- sqlalchemy=2.0.23=py311he705e18_0
- stack_data=0.6.2=pyhd8ed1ab_0
- starlette=0.32.0.post1=pyhd8ed1ab_0
- starlette=0.33.0=pyhd8ed1ab_0
- tiledb=2.16.3=hd3a41d5_3
- ukkonen=1.0.1=py311h5fe6e05_4
- uvicorn=0.24.0.post1=py311h6eed73b_0
2 changes: 1 addition & 1 deletion environments/conda-osx-arm64.lock.yml
@@ -421,7 +421,7 @@ dependencies:
- rich=13.7.0=pyhd8ed1ab_0
- sqlalchemy=2.0.23=py311h05b510d_0
- stack_data=0.6.2=pyhd8ed1ab_0
- starlette=0.32.0.post1=pyhd8ed1ab_0
- starlette=0.33.0=pyhd8ed1ab_0
- tiledb=2.16.3=he15c4da_3
- ukkonen=1.0.1=py311he4fd1f5_4
- uvicorn=0.24.0.post1=py311h267d04e_0