Reduce housekeeping pressure on CloudFormation #17

Merged
dagguh merged 6 commits into master from reduce-housekeeping-pressure-on-cloudformation
Jan 4, 2024

Conversation

dagguh
Contributor

@dagguh dagguh commented Jan 3, 2024

No description provided.

@dagguh dagguh requested a review from a team as a code owner January 3, 2024 16:30
@dagguh dagguh force-pushed the reduce-housekeeping-pressure-on-cloudformation branch 2 times, most recently from a4ae713 to f9fb678 Compare January 3, 2024 16:49
When there are a lot of expired stacks (e.g. due to a housekeeping outage),
listing them all requires a lot of CloudFormation requests.
We then get throttled, so the AWS SDK retries a couple of times underneath,
but we finally time out and never reach the stack deletion part.
So the list remains long and the problem perpetuates itself.

Instead, start deleting stacks as soon as we get a batch listed.
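
The change above can be sketched as follows. This is a minimal illustration, not the code from this PR: `BatchedCleanup`, `Page`, `listPage` and `delete` are hypothetical names, and the AWS calls are replaced by plain functions so the sketch stays self-contained. The point it shows is that each listed batch is deleted before the next page is requested, so a timeout in the middle of listing still leaves the backlog shorter.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

final class BatchedCleanup {

    /** One page of listed stack names plus the token of the next page (null on the last page). */
    static final class Page {
        final List<String> names;
        final String nextToken;

        Page(List<String> names, String nextToken) {
            this.names = names;
            this.nextToken = nextToken;
        }
    }

    /**
     * Deletes every stack of a page before asking for the next page.
     * Both arguments are stand-ins (assumptions) for the real listing and deletion calls.
     *
     * @return the number of stacks handed to {@code delete}
     */
    static int deleteAsListed(Function<String, Page> listPage, Consumer<String> delete) {
        int deleted = 0;
        String token = null; // null asks for the first page
        do {
            Page page = listPage.apply(token);
            for (String name : page.names) {
                delete.accept(name);
                deleted++;
            }
            token = page.nextToken;
        } while (token != null);
        return deleted;
    }
}
```

With a lister that collects all pages first, a throttled or timed-out listing deletes nothing. Here the work done before the failure is kept.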
EC2 has not throttled us the way CloudFormation has,
but do the same there for consistency with CloudFormation housekeeping.
Network stacks contain VPCs, and a VPC cannot be deleted while security
groups still exist in it. Some security groups are provisioned outside of
the stack, so deleting such a stack fails due to the dependency.

Delete stacks at the end, so that all external dependencies are already
cleaned up.
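
The ordering described above could look roughly like this. It is a sketch under assumptions: `ReleaseOrder`, `Kind`, `Resource` and `stacksLast` are hypothetical names and do not come from this repository. It relies on `List.sort` being stable, so everything that is not a stack keeps its original relative order and all stacks move to the end.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ReleaseOrder {

    // Hypothetical resource kinds, for illustration only.
    enum Kind { INSTANCE, SECURITY_GROUP, STACK }

    static final class Resource {
        final Kind kind;
        final String id;

        Resource(Kind kind, String id) {
            this.kind = kind;
            this.id = id;
        }

        @Override
        public String toString() {
            return kind + ":" + id;
        }
    }

    /**
     * Returns a copy of the expired resources with all stacks moved to the end.
     * Sorting by a boolean key puts false (not a stack) before true (a stack),
     * and the stable sort preserves the original order within each group.
     */
    static List<Resource> stacksLast(List<Resource> expired) {
        List<Resource> ordered = new ArrayList<>(expired);
        ordered.sort(Comparator.comparing((Resource r) -> r.kind == Kind.STACK));
        return ordered;
    }
}
```

Releasing resources in the returned order means that externally provisioned security groups are already gone by the time the stack that owns the VPC is deleted.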
Avoid false-positive errors like these (note that no stack trace is logged either):
```
16:36:00,592 ERROR {} Ec2Instance(instanceId = i-080edd5933a74418f) failed to release itself
16:36:30,577 ERROR {} Ec2Instance(instanceId = i-061f9c1f94990848e) failed to release itself
```
When you hit the 5k key limit, starting 5k threads is fine for Java,
but not for AWS: we get throttled immediately.
Deleting sequentially is actually faster, and AWS deletes quickly.
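
A minimal sketch of the sequential approach, assuming a hypothetical `release` callback in place of the real AWS call. `SequentialRelease` and `releaseAll` are illustrative names, and collecting failed ids instead of aborting is an assumption of this sketch, not something the PR states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SequentialRelease {

    /**
     * Releases the given ids one at a time on the calling thread,
     * so at most one request is in flight at any moment.
     *
     * @return the ids whose release threw, in the order they were tried
     */
    static List<String> releaseAll(List<String> ids, Consumer<String> release) {
        List<String> failed = new ArrayList<>();
        for (String id : ids) {
            try {
                release.accept(id);
            } catch (RuntimeException e) {
                // Keep going: one failed release should not block the rest.
                failed.add(id);
            }
        }
        return failed;
    }
}
```

Compared with one thread per key, this sends requests at the pace the service answers them, which avoids the burst that triggers throttling.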
@dagguh dagguh force-pushed the reduce-housekeeping-pressure-on-cloudformation branch from 2100663 to b0952f3 Compare January 4, 2024 10:06
@dagguh dagguh merged commit d639f45 into master Jan 4, 2024
3 checks passed
@dagguh dagguh deleted the reduce-housekeeping-pressure-on-cloudformation branch January 4, 2024 13:39
2 participants