Autorefresh limits #40

jimfrey · 2016-11-30T20:46:45Z

Kentik is experiencing heavy load on our back end due to Grafana users who are setting up auto-refreshing dashboards. There are two approaches to deal with this:

1. The best answer for Kentik in the long term would be for Grafana to retain retrieved data and then pull only incremental updates. Ideally, incremental updates would cover <=2 minutes, because we keep the last 2 minutes in RAM and can serve those responses very fast. I talked to Torkel about this, and he advised that this is not possible at present, because there is no method for saving data, calculating averages, etc. We understand that this requires long-term work.
2. The near-term request is this: in the Kentik plugin, please disable all auto-refresh options that are less than 1 minute.
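
For illustration, the near-term request amounts to filtering the list of selectable refresh options. This is only a minimal sketch: the helper names `intervalToMs` and `allowedRefreshIntervals` are hypothetical, and the option list in the usage note is an assumed example in the style of Grafana's time picker, not taken from the plugin. Only the 1-minute floor comes from the request above.

```typescript
// Hypothetical helpers; only the 1-minute minimum comes from the request.
const UNIT_MS: Record<string, number> = {
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Convert an interval string such as "30s" or "5m" to milliseconds.
function intervalToMs(interval: string): number {
  const match = /^(\d+)([smhd])$/.exec(interval);
  if (!match) {
    throw new Error("Unsupported interval: " + interval);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Keep only the refresh options that are at least minMs long.
function allowedRefreshIntervals(intervals: string[], minMs: number = 60 * 1000): string[] {
  return intervals.filter((interval) => intervalToMs(interval) >= minMs);
}
```

With an assumed option list of `["5s", "10s", "30s", "1m", "5m"]`, the filter would keep only `"1m"` and `"5m"`.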

jimfrey · 2016-12-16T22:40:53Z

Update: for the last several days, one of our customers has been absolutely hammering the Kentik back end through the Grafana plugin.

The issue is actually pretty crazy. They are hitting us with what looks like 45 dashboards, most of which cover 1 month's worth of data, though some may be shorter. They have refresh intervals set at either 30s or 1 min; it's hard to tell for sure. We just see this regular hammer hitting every minute, in repeated batches at :00, :15, and :45 seconds. Each time, the month-long queries spawn around 2.5 million subqueries against our back end.

They are the only ones hitting us so bleeping hard, but clearly anyone using the plugin could do the same, so what we propose is this: limit auto-refresh on a sliding scale.

For queries that are 1 month or longer, no faster than 1 h refresh
For queries that are between 1 day and 1 month, no faster than 15 min refresh
For queries less than 1 full day, no faster than 5 min refresh

This would need to be implemented on the Grafana side. This would give our platform some breathing room as adoption ramps, until such time as incremental querying can be implemented within Grafana.
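
As a rough sketch, the sliding scale above could be expressed as a lookup from the query time range to a minimum refresh interval. The function names are hypothetical, and treating a "month" as 30 days is an assumption; the thread does not define it.

```typescript
const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;
const MONTH_MS = 30 * DAY_MS; // assumption: a "month" is 30 days

// Minimum allowed refresh interval for a query spanning rangeMs.
function minRefreshMs(rangeMs: number): number {
  if (rangeMs >= MONTH_MS) {
    return HOUR_MS; // 1 month or longer: no faster than 1 h
  }
  if (rangeMs >= DAY_MS) {
    return 15 * MINUTE_MS; // 1 day up to 1 month: no faster than 15 min
  }
  return 5 * MINUTE_MS; // less than 1 full day: no faster than 5 min
}

// Clamp the refresh interval a user asked for to the sliding-scale minimum.
function effectiveRefreshMs(requestedMs: number, rangeMs: number): number {
  return Math.max(requestedMs, minRefreshMs(rangeMs));
}
```

For example, a dashboard covering 30 days with a requested 30 s refresh would be clamped to 1 h, while a requested 2 h refresh would be left unchanged.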

alexanderzobnin · 2016-12-17T10:13:28Z

There's no way to change auto-refresh behaviour in Grafana right now, but I can implement this through an incremental queries feature. I'll add a proxy layer (with a cache) for the queries in the plugin. This layer will store data from previous requests. I can also add these limits to the part that invokes the API queries.

For example:

1. Grafana invokes a panel refresh.
2. The Kentik plugin checks the limits.
3. If not enough time has passed, it gets the data from the cache (shows the previously returned data).
4. Otherwise, it invokes the API request and writes the data to the cache.

Then I can expand this pattern to incremental updates by adding an incremental query to step 3.
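
The flow above could be sketched roughly as follows. This is not the code from the plugin: the class name `ThrottledQueryCache`, the string cache key, and the injectable clock are all assumptions made for illustration. The fetcher's result type `R` would typically be a Promise of the API response; note that caching the promise itself also caches a rejected request until the limit expires.

```typescript
interface CacheEntry<R> {
  fetchedAt: number; // clock value when the API was last called
  result: R;
}

// Hypothetical proxy layer: serves cached results until minIntervalMs has passed.
class ThrottledQueryCache<R> {
  private entries = new Map<string, CacheEntry<R>>();
  private fetcher: (key: string) => R;
  private minIntervalMs: number;
  private now: () => number;

  constructor(
    fetcher: (key: string) => R,
    minIntervalMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.fetcher = fetcher;
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  query(key: string): R {
    const cached = this.entries.get(key);
    // Not enough time has passed: return the previously returned data.
    if (cached !== undefined && this.now() - cached.fetchedAt < this.minIntervalMs) {
      return cached.result;
    }
    // Otherwise invoke the API request and write the result to the cache.
    const result = this.fetcher(key);
    this.entries.set(key, { fetchedAt: this.now(), result });
    return result;
  }
}
```

The key would need to identify the query (for example, a serialized form of the panel's query options), so that different panels do not share cached data.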

jimfrey · 2016-12-17T17:35:29Z

Awesome strategy - please proceed!

alexanderzobnin · 2016-12-20T17:39:02Z

@jimfrey please try the incremental-data-update branch. I've added simple data caching and auto-refresh limits.

alexanderzobnin · 2016-12-21T09:42:48Z

@jimfrey I'm working on incremental queries and want to discuss a question.
Let's assume we're querying data for the last 24 hours, and I use Average aggregation for the time series data. When I refresh the panel, for example after 10 minutes, I get averaged data for the last 10 minutes, but the previous set consists of averages over slices of the 24-hour range (1-hour interval for this period). After a few updates I'll get a graph with 10-minute average slices, which is incorrect.
What do you think about it?

alexanderzobnin · 2016-12-21T09:59:20Z

I think that for incremental queries we need to know the aggregation slice size (for the given time range). How can I get this information?
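
To illustrate why the slice size matters: if it were known, finer-grained points from an incremental query could be re-averaged into slices of the same size before being merged with the cached series. The sketch below is only an assumption about how that could look. `rebucketAverage` and the `Point` shape are hypothetical, the slice size is exactly the value being asked about here, and averaging the averages is only accurate when the finer points are evenly spaced and cover each slice completely.

```typescript
interface Point {
  time: number; // timestamp in ms
  value: number; // averaged value for the interval starting at time
}

// Re-average points into slices of sliceMs, keyed by slice start time.
function rebucketAverage(points: Point[], sliceMs: number): Point[] {
  const sums = new Map<number, { total: number; count: number }>();
  for (const p of points) {
    const sliceStart = Math.floor(p.time / sliceMs) * sliceMs;
    const acc = sums.get(sliceStart) ?? { total: 0, count: 0 };
    acc.total += p.value;
    acc.count += 1;
    sums.set(sliceStart, acc);
  }
  return Array.from(sums.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([time, acc]) => ({ time, value: acc.total / acc.count }));
}
```

With a 1-hour slice, three 10-minute points with values 10, 20, and 30 inside the same hour would collapse into a single point with value 20 at the start of that hour.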

jimfrey · 2017-01-04T12:34:13Z

Tests completed - looks like the incremental pull is working as requested. Thanks! I'd suggest we close this issue, and pick up your questions above about aggregation slice size on a separate thread. If you agree, feel free to close.

alexanderzobnin · 2017-01-10T18:26:47Z

@jimfrey It still isn't an incremental pull, just request caching. We need more time for true incremental queries, but I hope these changes help reduce the load on your servers.

nopzor1200 · 2017-03-18T00:36:31Z

This is currently under internal discussion. It's an important feature for Kentik, but it will require some not insignificant work to implement incremental/chunked/staggered (these are deliberately hazy words, for now) query capabilities in data sources. Part of our discussion is whether this functionality applies to other data sources (e.g. Splunk, InfluxDB), and how best to abstract it out while still meeting Kentik's needs.

jimfrey · 2018-02-07T16:35:14Z

11 months have passed since this request. Can we get an update? Thanks.

alexanderzobnin · 2018-03-13T10:08:18Z

@mattttt do we have any estimates for this?

alexanderzobnin added a commit that referenced this issue Dec 1, 2016

Disable auto-refresh intervals that are less than 1 minute, issue #40.

3288a48

alexanderzobnin added a commit that referenced this issue Dec 20, 2016

Add proxy layer for caching API requests, issue #40

5a021c9

briangann added the enhancement label Oct 27, 2018

briangann self-assigned this Oct 27, 2018
