Hi, I am working with Litmus to induce chaos into my target environment. However, the default health checks prevent my experiments from finishing, or even from starting. My environment runs multiple replicas of each pod, and I would like to be able to induce chaos and delete a pod even if another replica is not yet in the ready state.
I can see in the code that we have a tunable that can disable the default health check for the Application Under Test, which is great -- I already did this. However, there is another default check, i.e., checking if all containers are in running state. See here:
Because of this check, if even a single replica in my target pool of pods is not running at the time of chaos injection, the experiment fails. See below:
What you expected to happen:
I expect that if I set DEFAULT_HEALTH_CHECK to false, I should be able to induce chaos by deleting a pod regardless of whether one or more pods/containers are not in the ready/running state.
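To make the expected behaviour concrete, here is a minimal sketch of a pre-chaos check gated by a flag. This is not the Litmus implementation: the `Pod` type, the `selectTargets` function, and the `strict` parameter are hypothetical names used only for illustration. The assumption is that a strict check fails as soon as any replica is not ready, while a relaxed check skips the non-ready replicas and continues with the rest.

```go
package main

import (
	"errors"
	"fmt"
)

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
// It is NOT the type used by litmus-go.
type Pod struct {
	Name  string
	Ready bool
}

// selectTargets returns the pods that are eligible for deletion.
// With strict=true it fails as soon as any pod is not ready (the
// behaviour described in this issue). With strict=false it skips
// non-ready pods and only fails if no ready pod is left at all.
func selectTargets(pods []Pod, strict bool) ([]Pod, error) {
	var targets []Pod
	for _, p := range pods {
		if !p.Ready {
			if strict {
				return nil, fmt.Errorf("pod %s is not ready", p.Name)
			}
			continue // relaxed mode: ignore this replica
		}
		targets = append(targets, p)
	}
	if len(targets) == 0 {
		return nil, errors.New("no ready pods available for chaos injection")
	}
	return targets, nil
}

func main() {
	pods := []Pod{{"app-0", true}, {"app-1", false}, {"app-2", true}}

	// Strict mode: one non-ready replica blocks the experiment.
	if _, err := selectTargets(pods, true); err != nil {
		fmt.Println("strict check failed:", err)
	}

	// Relaxed mode: the experiment proceeds with the ready replicas.
	targets, err := selectTargets(pods, false)
	if err != nil {
		fmt.Println("relaxed check failed:", err)
		return
	}
	fmt.Println("relaxed check passed, candidate pods:", len(targets))
}
```

In this sketch the relaxed mode still refuses to run when no replica is ready, since there would be nothing meaningful to delete. Whether a real tunable should behave that way is a design choice for the maintainers.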
How to reproduce it (as minimally and precisely as possible):
Prepare a chaos experiment using the generic/pod-delete template.
Select a label that is shared by multiple pods from your AUT.
Each component has multiple replicas, e.g., an application with 3 components might have 15 pods -- 5 replicas per component.
Set the DEFAULT_HEALTH_CHECK tunable to false.
Start your experiment.
In this scenario, if there is one single replica not in running state, then the chaos experiment fails.
Anything else we need to know?:
From reading the documentation and searching GitHub, I was not able to find another tunable to disable this default check. Am I missing something, or does the ability to disable this extra check using a tunable not exist at the moment?
Thank you.
Can you please check why the container is still not ready? Ideally, all the replica containers should be in the Ready state once the chaos has been injected and the chaos duration has passed. If they are not, it indicates that there is an issue with the scaling of your app.
Is this a BUG REPORT or FEATURE REQUEST?
FEATURE REQUEST
Code references for the container status check described in the report:
the starting method:
litmus-go/pkg/status/application.go
Line 25 in 77b30e2
the exact place where I see the last log and we start checking the status:
litmus-go/pkg/status/application.go
Line 341 in 77b30e2