Disturbances are missing in the glossary under "Feedback" #50

Open
choeffer opened this issue Jan 7, 2022 · 4 comments

Comments

@choeffer

choeffer commented Jan 7, 2022

I work in IT, but have an automation engineering background from university. For me, as an engineer, one key feature of a closed-loop control system is its ability to catch and react to disturbances (called "load" in the picture): effects that are not directly measured, or are even unknown, but that still affect the system.
An example in a k8s cluster could be a node that dies. If this is not measured (normally it is, but assume it is not for the sake of the example), it is a disturbance: it affects the system from outside, makes the actual state deviate from the desired state, and therefore affects cluster operation. But because we have Feedback, new pods are scheduled, routing is adjusted, etc. So because you are able to observe the effect of the disturbance, you can take action to mitigate it. And you can only see these disturbances because they become visible in the Feedback of the system.
And from my point of view, this inherent ability of a closed loop to catch and react to disturbances is what makes e.g. k8s so robust.
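
To make the node example concrete, here is a minimal sketch of such a feedback loop. It is a toy model under stated assumptions, not how Kubernetes is actually implemented: `ToyCluster`, `kill_node`, `schedule_pod`, and `reconcile` are hypothetical names invented for illustration. The point it tries to show is that the controller never learns that a node died; it only compares the desired pod count with the observed one.

```python
# Toy model only: every name here is hypothetical, not the Kubernetes API.
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    # Maps node name -> number of pods currently running on that node.
    nodes: dict = field(default_factory=dict)

    def running_pods(self) -> int:
        # The only signal the controller observes (the feedback).
        return sum(self.nodes.values())

    def kill_node(self, name: str) -> None:
        # The disturbance: a node disappears together with its pods.
        self.nodes.pop(name, None)

    def schedule_pod(self) -> None:
        # Place one pod on the least-loaded surviving node.
        target = min(self.nodes, key=self.nodes.get)
        self.nodes[target] += 1


def reconcile(cluster: ToyCluster, desired: int) -> int:
    """Compare desired vs. actual pod count and schedule the missing pods."""
    missing = max(desired - cluster.running_pods(), 0)
    for _ in range(missing):
        cluster.schedule_pod()
    return missing


cluster = ToyCluster(nodes={"node-a": 2, "node-b": 2, "node-c": 2})
desired_pods = 6

cluster.kill_node("node-b")  # disturbance, never reported to the controller
rescheduled = reconcile(cluster, desired_pods)
print(rescheduled, cluster.running_pods())
```

The `reconcile` function takes no argument describing what went wrong. Any disturbance that lowers the observed pod count would be corrected the same way.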

[Image: closed-loop control diagram in which the disturbance is labeled "load"]

Edit: removed the controller part

@choeffer choeffer changed the title Disturbances are missing in the glossary under "Feedback" + overshoots and oscillations are not covered Disturbances are missing in the glossary under "Feedback" Jan 16, 2022
@christianh814
Member

@choeffer Interesting, and I feel this is an important point. Would you like to submit a PR against the glossary?

@williamcaban

I agree with @choeffer's statement that another way to describe it is a "disturbance". I have seen this term used mostly in fields like physics, chemistry, and circuit engineering (with signal processing).

When it comes to the field of technology and the operations teams in it, "disturbance" may be an unfamiliar term. Do we have any examples of how this term is used in IT operations?

If we were documenting how the closed-loop concept from hardware engineering is the base from which the software closed-loop concept is derived, then yes, I would be inclined to include it in the glossary. Without a text somewhere (blog, whitepaper, etc.) that compares or draws the analogy between "disturbance" in hardware engineering and concepts that are more familiar to IT operations teams, like "drift", "thresholds", etc., I would consider revisiting this at a later time, once more context is available for it.

@choeffer
Author

@williamcaban The term "disturbance" comes from control theory, not from fields like physics, chemistry, or circuit engineering. It is used throughout control theory; see https://www.cds.caltech.edu/~murray/courses/cds101/fa02/caltech/astrom-ch5.pdf page 4. Since we are now applying parts of control theory in IT, I think we should use the same terms as well.

https://en.wikipedia.org/wiki/Control_theory#Other_examples This is a very good example (also used in the PDF above) of what I have tried to explain.
You can also think about other influences such as slowly changing wind speeds, changing road conditions, changing road surfaces, etc. All these influences are captured indirectly by the closed-loop control, without being measured directly. You see their effect in your output value, in this case the car speed, and therefore you are able to compensate for them.
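
A small simulation may illustrate the car example. This is only a sketch under made-up assumptions: the gains `kp` and `ki`, the linear `drag` term, and the size of the disturbance are hypothetical values chosen so the toy model behaves well, and the PI controller is one common choice, not something taken from the linked references. The controller only ever sees the speed error, never the disturbance itself.

```python
# Toy cruise control: a PI controller holds the speed although the
# disturbance (e.g. a hill or headwind) is never measured.
# All numbers are made up for illustration.

def simulate(steps: int = 2000, dt: float = 0.1) -> list:
    target = 25.0      # desired speed in m/s
    speed = 25.0       # actual speed, starts at the target
    integral = 0.0     # accumulated error for the integral term
    kp, ki = 0.8, 0.4  # hypothetical controller gains
    drag = 0.1         # hypothetical linear drag coefficient
    history = []
    for step in range(steps):
        # Disturbance: a constant decelerating effect from step 500 on.
        disturbance = -2.0 if step >= 500 else 0.0
        # Feedback: the controller observes only the speed error.
        error = target - speed
        integral += error * dt
        throttle = kp * error + ki * integral + drag * target
        # Plant: the disturbance acts on the car, not on the controller.
        accel = throttle - drag * speed + disturbance
        speed += accel * dt
        history.append(speed)
    return history


history = simulate()
print("lowest speed:", min(history))
print("final speed:", history[-1])
```

In this toy model the speed dips when the disturbance starts and then returns to the target, because the integral term keeps growing until the error is gone. Wind, road surface, or slope would all be handled by the same loop without any extra sensor.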
The same applies to the dead node in K8s. You do not need to know whether all nodes are in good condition, or whether the network is working properly on all nodes, etc. You just have to know whether the desired number of pods is running and, if not, reschedule them. Other controllers then take care of where to schedule them, how to adjust the networking, etc. (Disclaimer: I am NOT familiar with the details of k8s, but I was taught in my CKA course that different controllers are involved in reaching the desired state stored in etcd.)

@williamcaban

Yes, I understand the references to control theory. In my previous comment I was contrasting two things: in some non-IT fields the term is commonly used to describe influences on, or deviations from, a state, while in software operations or IT operations the term "disturbance" does not seem to be in common use.

For example, it is easy to find software and operations documentation referring to "configuration drifts" or "events" as some of the sources that trigger a correction of a state. Are you aware of any software or IT operations documentation, practice, or model that uses the term "disturbance" to describe something in this context?

Note that my statements are not about the validity of the term; the term is correct. My statements are about how the term may or may not be used in the fields of software or IT operations. Some of the underlying questions I would like to answer are:

  • Is this term widely used, so that anyone in an operations team can relate to it?
  • By adopting the term as part of a definition, could we create confusion by accidentally opening a principle to unexpected interpretations?


3 participants