Retry wait for stable service in deploy release (#761)
## Ticket

n/a

## Changes
- If waiting for the ECS service to become stable fails during deploy,
retry the wait exactly once

## Context for reviewers
- For two applications using template-infra (the Nava Labs Decision
Support Tool project and an internal Nava tool), the ECS service takes
slightly more than 10 minutes to become stable (typically about 11 or
13).
- The AWS wait command can't be configured to allow more than 10 minutes.
- Other approaches considered:
  - Sleeping. This is probably the simplest solution but doesn't seem as
    robust as simply trying the command twice.
  - Retrying a configurable number of times in a loop. This seems like
    premature complexity.
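
For comparison, the rejected "retry in a loop" alternative might look roughly like the sketch below. It is not what this commit does. `retry_command` is a hypothetical helper, and `fake_wait` is a stand-in for `aws ecs wait services-stable` so the sketch runs without AWS access.

```shell
#!/usr/bin/env bash
# Sketch of the "retry a configurable number of times" alternative.
# All names here are hypothetical and not part of bin/deploy-release.

# Run a command up to max_attempts times, stopping at the first success.
retry_command() {
  local max_attempts="$1"
  shift
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "Giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "Attempt ${attempt} of ${max_attempts} failed; retrying" >&2
    attempt=$((attempt + 1))
  done
}

# Stand-in for the AWS waiter: fails on the first call, succeeds on the second.
fake_wait_calls=0
fake_wait() {
  fake_wait_calls=$((fake_wait_calls + 1))
  [ "${fake_wait_calls}" -ge 2 ]
}

retry_command 3 fake_wait
echo "exit status: $?"
```

Even in this small form, the loop adds a helper, an attempt counter, and a tunable limit, which is the extra complexity the commit avoids by calling the wait twice.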

## Testing
Tested on internal tool (posted in Slack)
KevinJBoyer authored Oct 1, 2024
1 parent ba31278 commit b7a4677
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion bin/deploy-release
@@ -25,6 +25,13 @@ echo "::endgroup::"
 cluster_name=$(terraform -chdir="infra/${app_name}/service" output -raw service_cluster_name)
 service_name=$(terraform -chdir="infra/${app_name}/service" output -raw service_name)
 echo "Wait for service ${service_name} to become stable"
-aws ecs wait services-stable --cluster "${cluster_name}" --services "${service_name}"
+wait_for_service_stability() {
+  aws ecs wait services-stable --cluster "${cluster_name}" --services "${service_name}"
+}
+
+if ! wait_for_service_stability; then
+  echo "Retrying"
+  wait_for_service_stability
+fi
 
 echo "Completed ${app_name} deploy of ${image_tag} to ${environment}"
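
The retry-once control flow can be exercised locally without AWS. This minimal sketch replaces the body of `wait_for_service_stability` with a hypothetical stub that fails on the first call and succeeds on the second. The `wait_calls` and `deploy_status` variables are illustrative and not part of `bin/deploy-release`. According to the AWS CLI documentation, the real waiter polls every 15 seconds and exits with code 255 after 40 failed checks, so two consecutive waits should cover roughly 20 minutes.

```shell
#!/usr/bin/env bash
# Simulate a service that needs two waiter runs to become stable.
wait_calls=0

# Hypothetical stub standing in for `aws ecs wait services-stable`:
# the first call "times out" (non-zero status), the second succeeds.
wait_for_service_stability() {
  wait_calls=$((wait_calls + 1))
  [ "${wait_calls}" -ge 2 ]
}

deploy_status=0
if ! wait_for_service_stability; then
  echo "Retrying"
  # Record a second failure instead of exiting, so the result can be inspected.
  wait_for_service_stability || deploy_status=$?
fi

echo "wait calls: ${wait_calls}, status: ${deploy_status}"
```

In the real script the second call's exit status is not captured. If the script runs under `set -e` (an assumption, since that part of the file is outside this diff), a second failure would abort the deploy before the "Completed" message is printed.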
