
Applications Metrics Polling POC #4

Open
atosatto opened this issue Jul 8, 2016 · 6 comments

atosatto (Member) commented Jul 8, 2016

We should definitely put together an initial reference implementation of the Application Metrics Polling system.

The responsibility of this system will be to execute the checks defined by the container label kappa.metric every kappa.rate units of time.
Available checks will be defined in the kappa configuration file.
Each metric should give kappa the desired number of active containers for the given service.
The communication between Kappa's Engine and the Applications Metrics Polling component will happen asynchronously through channels, and each metric will be lazily spawned into its own goroutine.
The first implementation of the Applications Metrics Polling will rely on external bash scripts to actually perform the HTTP requests or cgroup queries that extract the desired number of instances of each container, but the design should make it easy to plug in different Applications Metrics Polling implementations.
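
To make the shape of this concrete, here is a minimal Go sketch of what the poller could look like; Metric, DesiredCount, and Poll are made-up names for the sake of the example, not a settled API:

```go
package polling

import "time"

// Metric describes one application metric to poll. Name and Rate would
// come from the kappa.metric and kappa.rate container labels.
type Metric struct {
	Name string
	Rate time.Duration
	// Check returns the desired number of active containers.
	Check func() (int, error)
}

// DesiredCount is the message the poller sends back to the Engine.
type DesiredCount struct {
	Metric string
	Count  int
}

// Poll lazily spawns one goroutine per metric and reports the desired
// container counts to the Engine asynchronously through a channel.
func Poll(metrics []Metric) <-chan DesiredCount {
	out := make(chan DesiredCount)
	for _, m := range metrics {
		m := m // capture the loop variable for the goroutine
		go func() {
			ticker := time.NewTicker(m.Rate)
			defer ticker.Stop()
			for range ticker.C {
				if n, err := m.Check(); err == nil {
					out <- DesiredCount{Metric: m.Name, Count: n}
				}
			}
		}()
	}
	return out
}
```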

fntlnz (Contributor) commented Jul 8, 2016

I created a little diagram to show what is going on there.

[diagram of the tick/metric flow, attached as a downloadable PDF]

The tick part is based on the rate, while the metric is the actual code executed to obtain the number of containers to scale to.

DOUBT: what are we supposed to do when a tick fails?

Also note that I called them ticks for a reason: I thought it would be better to treat them as single sequential units instead of allowing them to overlap, as happens with cron jobs.
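
For what it's worth, Go's time.Ticker gives us this behaviour almost for free: a plain receive loop runs checks strictly one after another, and since the ticker's channel buffers at most one pending tick, ticks that fire while a check is still running are simply dropped rather than queued. A small runnable demo:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for i := 0; i < 3; i++ {
		<-ticker.C
		fmt.Println("tick", i)
		// Simulate a check that outruns the rate: ticks firing while
		// we sleep are dropped (the ticker channel buffers only one),
		// so checks stay strictly sequential instead of piling up
		// like overdue cron jobs.
		time.Sleep(250 * time.Millisecond)
	}
}
```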

atosatto (Member, Author) commented Jul 9, 2016

Uhm.
Regarding your doubt, I think we have two choices (plus logging):

  • exponentially back off on failure
  • keep polling the metric even if it is failing

I believe the 2nd strategy would be the best one, since I expect users to test and monitor their metrics.
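
A minimal sketch of that second strategy, reusing the hypothetical names from the sketch above (plus the standard log package); on failure we just log and wait for the next tick:

```go
// Inside the per-metric goroutine: on failure we log and keep polling
// at the same rate, leaving monitoring of the metric to the user.
for range ticker.C {
	n, err := m.Check()
	if err != nil {
		log.Printf("metric %q failed: %v", m.Name, err)
		continue
	}
	out <- DesiredCount{Metric: m.Name, Count: n}
}
```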

> I thought it would be better to treat them as single sequential units instead of allowing them to overlap, as happens with cron jobs.

@fntlnz could you be more precise about this? What do you mean by overlapping? Are you referring to the fact that, if a metric is slow to compute the number of desired containers and the rate is too high, we could easily queue up too many metric-polling routines?

I'll open a PR with some code addressing this issue and count on your (@fntlnz & @jnardiello) valuable feedback.

fntlnz (Contributor) commented Jul 12, 2016

@atosatto Yep exactly that.

Let's consider this:

```
t0 -> metric1 check is fired
t1 -> metric1 check finished
t2 -> metric2 check is started
t3 -> metric3 check is started
t4 -> metric3 check is finished
t5 -> metric2 check is finished
```

As you can imagine, allowing metrics to overlap and start in an unmanaged way could lead to more problems than it solves.
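
That trace is exactly what you get with the cron-style approach of spawning a goroutine per tick; a contrived runnable example of the behaviour we want to avoid:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	ticker := time.NewTicker(50 * time.Millisecond)
	defer ticker.Stop()
	for i := 1; i <= 3; i++ {
		<-ticker.C
		// Cron-style: every tick gets its own goroutine, so a slow
		// check can still be running when the next one starts, and
		// completions interleave like the t0..t5 trace above.
		go func(id int) {
			time.Sleep(time.Duration(rand.Intn(200)) * time.Millisecond)
			fmt.Printf("metric%d check finished\n", id)
		}(i)
	}
	time.Sleep(300 * time.Millisecond) // let the stragglers finish
}
```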

fntlnz (Contributor) commented Jul 12, 2016

For the failing-metric thing I would go with the same option you chose.

> keep polling the metric even if it is failing

Mainly because if the check fails, the containers are untouched. I would expose an API, or give some other way to check programmatically whether something is failing.
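
Something as simple as recording each metric's last error and exposing it would be enough; a hypothetical sketch (Status and its methods are invented names, just to make the idea concrete):

```go
package polling

import "sync"

// Status records the outcome of each metric's last check so that the
// Engine (or, say, an HTTP endpoint) can report which ones are failing.
type Status struct {
	mu   sync.Mutex
	last map[string]error // metric name -> last check error, nil if OK
}

func NewStatus() *Status {
	return &Status{last: make(map[string]error)}
}

// Record stores the result of a metric's latest check.
func (s *Status) Record(metric string, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.last[metric] = err
}

// Failing returns the names of the metrics whose last check failed.
func (s *Status) Failing() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var failing []string
	for name, err := range s.last {
		if err != nil {
			failing = append(failing, name)
		}
	}
	return failing
}
```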

atosatto (Member, Author) commented:

Very good. All we need now is the code! 😆
I've already written a few lines of code to address this issue.
I'll submit a PR as soon as there's enough code to review, so you can give me feedback while I work on it.

Thank you!

atosatto (Member, Author) commented Jul 27, 2016

Just submitted PR #5 to track the changes addressing this issue.
