We provide a small demo application to give you a quick and easy start into the world of Chaos Engineering. This demo application is a product catalog consisting of products from three different categories (toys, fashion and hot-deals).
The shopping demo consists of three backend services, one per product category (bestseller-fashion, bestseller-toys and hot-deals).
Each microservice provides a list of products.
These products are aggregated by the gateway microservice and exposed to the user via shopping-ui.
In addition, each product microservice uses the inventory-service to determine stock availability.
All services are based on Spring Boot and use different Spring projects.
As mentioned above, the gateway is the entry point for the UI.
The gateway provides the available products via the /products endpoint, which collects all products from each microservice (bestseller-fashion, bestseller-toys and hot-deals).
This endpoint is implemented in the ProductsController.
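The aggregation step can be pictured with a minimal plain-Java sketch. This is not the demo's actual ProductsController: the class name, the `aggregate` method, and the use of `Supplier` as a stand-in for an HTTP call to a product microservice are all assumptions made for illustration. It shows the most basic strategy, where each category is fetched in turn and any failure aborts the whole aggregation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the gateway's aggregation logic (not the demo's real code).
class BasicAggregationSketch {

    // Calls each category's product source in turn and collects the results.
    // An exception thrown by any source propagates and aborts the whole aggregation.
    static Map<String, List<String>> aggregate(Map<String, Supplier<List<String>>> sources) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<List<String>>> source : sources.entrySet()) {
            result.put(source.getKey(), source.getValue().get());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each supplier simulates one product microservice; names and products are made up.
        Map<String, Supplier<List<String>>> sources = new LinkedHashMap<>();
        sources.put("fashion", () -> List.of("scarf"));
        sources.put("toys", () -> List.of("robot"));
        sources.put("hotDeals", () -> { throw new IllegalStateException("hot-deals unreachable"); });
        try {
            System.out.println(aggregate(sources));
        } catch (IllegalStateException ex) {
            // With no resilience pattern in place, one failing service fails the whole request.
            System.out.println("aggregation failed: " + ex.getMessage());
        }
    }
}
```

Because nothing catches the exception inside `aggregate`, a single unreachable service is enough to lose the products of all categories.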
Multiple implementation strategies exist, each using different resilience patterns: fallbacks, timeouts and circuit breakers, as described below.
There are multiple endpoints available to demonstrate different implementations.
URL | Timeout configured | Fallback value | Retry configured | Circuit breaker | Behavior on failure |
---|---|---|---|---|---|
/products | ❌ | ❌ | ❌ | ❌ | If a microservice is unreachable or returns an error: 🔴 HTTP 500. If a microservice responds slowly: 🔴 the whole response is delayed indefinitely. |
/products/exception | ❌ | ✅ via catch exception | ❌ | ❌ | If a microservice is unreachable or returns an error: ✅ products of the category are omitted. If a microservice responds slowly: 🔴 the whole response is delayed indefinitely. |
/products/timeout | ✅ | ✅ via catch exception | ❌ | ❌ | If a microservice is unreachable or returns an error: ✅ products of the category are omitted. If a microservice responds slowly: ✅ products of the category are omitted. |
/products/retry | ✅ | ✅ via fallbackMethod in resilience4j | ✅ | ❌ | Like /products/timeout, but with a maximum of 3 retries, 500 ms apart, if a microservice request is not successful. Pro: 👍 potential recovery from short-term problems. Con: 👎 increased load on the microservices; 👎 increased response time. There is also a blog post about this implementation. |
/products/circuitbreaker | ✅ | ✅ via fallbackMethod in resilience4j | ✅ | ✅ | Like /products/retry, but with a circuit breaker that protects a failing microservice from overload (including from retries) and allows it to recover. |
/products/parallel | default (30s) | ❌ | ❌ | ❌ | Alternative implementation that fetches the products in parallel. This saves time, but the implementation has the same problems as the basic implementation. |
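
The patterns in the table can be sketched with the JDK alone. The demo itself uses Spring and resilience4j, so everything below is an illustrative assumption rather than the demo's code: the class and method names are hypothetical, a `Supplier` stands in for the HTTP call to a product microservice, and the circuit breaker is a deliberately simplified consecutive-failure counter rather than resilience4j's implementation.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Plain-JDK illustration only; the demo itself relies on Spring and resilience4j.
class ResiliencePatternsSketch {

    // Timeout + fallback: wait at most timeoutMillis; on error or timeout return
    // an empty list so the category is simply omitted from the response.
    // Note: the background task is not cancelled when the timeout fires.
    static List<String> withTimeoutAndFallback(Supplier<List<String>> call, long timeoutMillis) {
        try {
            return CompletableFuture.supplyAsync(call).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return List.of();
        } catch (Exception e) { // TimeoutException or ExecutionException
            return List.of();
        }
    }

    // Retry: one initial attempt plus up to maxRetries further attempts,
    // pausing waitMillis between them, then fall back to an empty list.
    static List<String> withRetry(Supplier<List<String>> call, int maxRetries, long waitMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt == maxRetries) {
                    break; // retries exhausted
                }
                try {
                    Thread.sleep(waitMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return List.of();
    }

    // Simplified circuit breaker: opens after failureThreshold consecutive failures and
    // then answers with the fallback, without calling the service, until openMillis pass.
    static final class SimpleCircuitBreaker {
        private final int failureThreshold;
        private final long openNanos;
        private int consecutiveFailures;
        private boolean open;
        private long openedAt;

        SimpleCircuitBreaker(int failureThreshold, long openMillis) {
            this.failureThreshold = failureThreshold;
            this.openNanos = TimeUnit.MILLISECONDS.toNanos(openMillis);
        }

        synchronized List<String> call(Supplier<List<String>> call) {
            if (open && System.nanoTime() - openedAt < openNanos) {
                return List.of(); // open: spare the failing service
            }
            try {
                List<String> result = call.get(); // closed, or a trial call after the wait
                consecutiveFailures = 0;
                open = false;
                return result;
            } catch (RuntimeException e) {
                if (++consecutiveFailures >= failureThreshold) {
                    open = true;
                    openedAt = System.nanoTime();
                }
                return List.of();
            }
        }
    }

    public static void main(String[] args) {
        Supplier<List<String>> failing = () -> { throw new IllegalStateException("unreachable"); };
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 5_000);
        System.out.println(withTimeoutAndFallback(() -> List.of("scarf"), 500));
        System.out.println(withTimeoutAndFallback(failing, 500));
        System.out.println(withRetry(failing, 3, 500));
        System.out.println(breaker.call(failing));
    }
}
```

The helpers compose: wrapping `withTimeoutAndFallback` in a retry, and the retry in a circuit breaker, mirrors the progression from /products/timeout to /products/circuitbreaker. While the breaker is open, retries never reach the failing service, which is what gives it room to recover.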
Our demo can be run on different Docker-based platforms using the deployment scripts provided. Check out the Steadybit Quickstart for more details.
helm repo add steadybit-shopping-demo https://steadybit.github.io/shopping-demo
helm repo update
helm upgrade steadybit-shopping-demo \
--install \
--wait \
--timeout 5m0s \
--set gateway.service.type=ClusterIP \
steadybit-shopping-demo/steadybit-shopping-demo
You can integrate Steadybit into your CI/CD pipeline to validate your resilience continuously and support you in following a GitOps approach.
This section covers how to run experiments continuously to validate your resilience. If you are following a GitOps approach, we recommend you version your experiments and run them on pull requests. You can continue reading about the GitOps approach in our blog post or jump immediately into the definition or latest run of our GitHub Action CI/CD example.
We also recommend using our badges to show your latest run status in various places via HTML, Markdown, or image.
If you want to validate the status of advice for given targets from your CI/CD pipeline, you can do so with our CLI. Check out the definition or latest run of our GitHub Action CI/CD example.
The example validates that all Kubernetes deployments in a particular service tier (identified via the discovered Kubernetes label of the deployment) are following all defined advice.