A suite of lambdas for testing AWS architecture resiliency
For an introduction, check out the blog post.
For questions or bugs, open an issue.
These Lambdas can be automated by setting up a Cloudwatch event to trigger the Lambda on a recurring basis. For example, to execute every 15 minutes during business hours: */15 14-20 ? * MON-FRI *
Simulates an availability zone outage by terminating all EC2 instances in an availability zone
region - Target region
probability - Probability to run function
zones - List of AZ zones within region
exclusions - List of instance ids that should not be terminated
snsTopic - SNS topic for notifications
Use AmazonEC2FullAccess
policy
Terminates an EC2 instance at random chosen from a list of instances belonging to a configured set of auto scaling groups
region - Target region
probability - Probability to run function
groups - List of auto scaling groups
snsTopic - SNS topic for notifications
Use AmazonEC2FullAccess
policy
Stops an ECS task at random chosen from a list of tasks belonging to a configured set of ECS services
region - Target region
probability - Probability to run function
services - List of ECS services
snsTopic - SNS topic for notifications
Use AmazonECS_FullAccess
policy
Triggers an RDS AZ failover by rebooting the RDS instance with a forced failover
region - Target region
probability - Probability to run function
instance - List of RDS instance identifiers
snsTopic - SNS topic for notifications
Use AmazonRDSFullAccess
policy