Does chaoskube really kill the pods? #103
Comments
This is a very good question. I also assumed that chaoskube was killing the pods. I think killing a pod instead of terminating it is the best option, because "graceful shutdowns" rarely happen in production environments :) Would it be possible to at least include a flag to choose the behaviour you want (kill vs terminate)? I'm thinking about adding a configurable …
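For context, in Kubernetes API terms the "kill vs terminate" distinction roughly comes down to the grace period used when deleting the pod. A minimal client-go sketch of the two behaviours; the function and its wiring are illustrative only, not chaoskube's actual code:

```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deletePod sketches the two behaviours discussed above. With forceKill=false
// the pod is deleted gracefully: it receives SIGTERM and gets its full
// terminationGracePeriodSeconds to finish in-flight work. With forceKill=true
// the grace period is set to zero, so the pod is removed immediately.
func deletePod(ctx context.Context, client kubernetes.Interface, namespace, name string, forceKill bool) error {
	opts := metav1.DeleteOptions{}
	if forceKill {
		zero := int64(0)
		opts.GracePeriodSeconds = &zero
	}
	return client.CoreV1().Pods(namespace).Delete(ctx, name, opts)
}
```

A "kill vs terminate" flag would then only need to decide which grace period to pass to the delete call.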
@ljanatka @palmerabollo I agree. There's already a pull request for it by @jakewins: #104. It would help me a lot if you would also have a look and leave some feedback.
@ljanatka I just fixed it in case you want to give it a try again.
@linki Hi, we finally got to give it a try.
@ljanatka Thanks for checking it out!
@linki Hi, was my test enough to merge this "hardkill" feature into a new version of chaoskube? When do you expect the new version to be released?
@ljanatka I'm not sure. I want to refactor it a bit before merging and I have a work-in-progress branch for it. @jakewins has a fork of chaoskube where this is merged. You could try using it in the meantime. |
Hi @linki, from the release notes it seems that chaoskube can now "hardkill" pods. However, I did not find any switch that would activate this feature. Or is the hard kill now implemented as the default kill method? Thanks!
Hi @ljanatka, https://github.com/linki/chaoskube/releases/tag/v0.12.1 extracted the current strategy into a separate object behind an interface in order to make it easier to add more ways to terminate a pod. The actual "termination-by-kill" strategy from the original PR hasn't been ported over yet.
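For anyone following along, here is a rough sketch of what a termination strategy behind such an interface could look like. The type and field names are illustrative and not necessarily what chaoskube actually uses:

```go
package terminator

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// Terminator is the pluggable strategy: anything that can take a victim pod
// out of service.
type Terminator interface {
	Terminate(ctx context.Context, victim v1.Pod) error
}

// ForceKillTerminator deletes the victim with a zero grace period, which is
// the "hard kill" behaviour requested in this issue. A graceful variant would
// simply omit GracePeriodSeconds and let the pod's own grace period apply.
type ForceKillTerminator struct {
	Client kubernetes.Interface
}

func (t ForceKillTerminator) Terminate(ctx context.Context, victim v1.Pod) error {
	zero := int64(0)
	return t.Client.CoreV1().Pods(victim.Namespace).Delete(ctx, victim.Name, metav1.DeleteOptions{
		GracePeriodSeconds: &zero,
	})
}
```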
It's been quite a while now since this feature was requested, and I see some refactoring was done. Is there any chance this could be looked at again soon?
Hi Martin,
I am currently working on a project where we are trying to improve the reliability of our software through chaos engineering (but, unfortunately, we have very little experience with it). Currently, our software runs on Azure/Kubernetes.
We found chaoskube to be a promising tool to help us, but we discovered that its behavior is different from what we expected. The description of chaoskube says that it kills pods, so I formed a hypothesis about what would happen if one of our pods was killed while handling a request (there should be an error response, and subsequent requests should be processed by the other pod). When I ran the experiment, the pods were killed but no error occurred.
Then one of my colleagues looked at the source code of chaoskube and found that the pod is not killed (i.e. force-killed instantly), but rather terminated (if I understand it correctly, with this approach the pod finishes dealing with its current task and then "dies" peacefully).
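To illustrate why we probably saw no errors: a typical HTTP service drains in-flight requests on SIGTERM, which is what a normal (graceful) pod deletion sends. A minimal Go sketch of such a shutdown handler, assuming an HTTP workload (not our actual service code):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	// Handle requests in the background.
	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	// A graceful pod deletion sends SIGTERM first, so the server can finish
	// in-flight requests before exiting and clients see no errors. A forced
	// deletion (grace period 0) ends in SIGKILL, which cannot be caught,
	// so in-flight requests would fail instead.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM)
	<-stop

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Println("shutdown:", err)
	}
}
```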
Is this really how chaoskube works?
We are learning more about chaos every day, but there is a lot of knowledge that we need to gain.
Since my hypothesis was probably wrong, I would be really grateful for any advice on what other chaos experiments chaoskube is suitable for.
Thank You,
Ladislav