Fix monitoring metrics for individual collectors #389
I think this is awesome: it makes aggregate deltas and the descriptor cache compatible with collect-time filters (resolving #315).
Since both of those features prune their storage via collector usage, we should probably add a collector cache TTL that at least matches the largest TTL if either feature is in use. This way, if a scrape config changes the collect filters, the exporter will eventually clean itself up without needing to be restarted.
Signed-off-by: Ananya Kumar Mallik <[email protected]>
@kgeckhart Added a default collector cache TTL of 2 hours (open to suggestions), which is overridden by the largest of all 3 TTLs.
Sorry, I think I didn't fully explain the purpose of the TTL and how it relates to the descriptor cache/aggregate deltas.
There are three use cases:
- A Collector is consistently being used -> safe to keep forever
- A Collector is no longer being used, with the descriptor cache and/or aggregate deltas enabled -> remove after the greatest TTL between the two features
  - Both features add memory overhead through caching but prune themselves when the collector is in use
  - In the case of aggregate deltas, this could be a very large amount of memory because every metric is kept around to enable the delta aggregations
  - Removing the collector after the TTL will drop these caches, which would otherwise be left around until the exporter is restarted
- A Collector is no longer being used and descriptor cache/aggregate deltas are disabled -> remove after the default 2 hour TTL (the overhead of these collectors should be negligible)
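The three cases above can be sketched roughly as follows. This is a minimal illustration in Python, not the exporter's actual implementation: the function and class names, the use of `None` to mean "feature disabled", and the seconds-based timestamps are all assumptions made for the example. Only the 2 hour default and the "greatest TTL of the two features" rule come from the discussion.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The 2 hour default proposed in this thread, in seconds.
DEFAULT_COLLECTOR_TTL_S = 2 * 60 * 60


def collector_cache_ttl(descriptor_cache_ttl_s: Optional[float],
                        aggregate_deltas_ttl_s: Optional[float]) -> float:
    """Pick the TTL after which an unused collector is removed.

    None means the feature is disabled (a hypothetical convention).
    """
    feature_ttls = [t for t in (descriptor_cache_ttl_s, aggregate_deltas_ttl_s)
                    if t is not None]
    if feature_ttls:
        # Case 2: at least one feature is enabled, so wait for the greatest TTL.
        return max(feature_ttls)
    # Case 3: both features disabled, fall back to the default.
    return DEFAULT_COLLECTOR_TTL_S


@dataclass
class CachedCollector:
    # Hypothetical bookkeeping: refreshed on every scrape that uses the collector,
    # so a consistently used collector (case 1) is never evicted.
    last_used_s: float


def evict_unused(cache: Dict[str, CachedCollector],
                 now_s: float, ttl_s: float) -> List[str]:
    """Drop collectors idle for longer than ttl_s and return their keys."""
    expired = [key for key, entry in cache.items()
               if now_s - entry.last_used_s > ttl_s]
    for key in expired:
        del cache[key]
    return expired
```

Dropping the cache entry is what releases the per-collector descriptor cache and aggregate delta storage in this sketch, since nothing else would hold a reference to them.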
I left some more direct comments for the implementation.
@kgeckhart: Added the suggested changes for the descriptor cache/aggregate deltas use case.
Fix monitoring metrics
Problem
Currently, each time a metrics endpoint is hit (with the collect parameter, e.g., /metrics?collect=pubsub.googleapis.com/topic), a new collector is created. This causes the monitoring metrics (stackdriver_monitoring_api_calls_total, stackdriver_monitoring_scrapes_total, etc.) to reset on each scrape.
Solution
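To make the reset behaviour concrete, here is a small Python sketch that contrasts creating a collector per request with reusing a cached one. The `Collector` and `CollectorCache` classes are hypothetical stand-ins, and keying the cache on the sorted filter values is an assumption for illustration, not necessarily how this PR builds its key.

```python
class Collector:
    """Stand-in for a collector that owns its own monitoring counters."""

    def __init__(self, filters):
        self.filters = filters
        self.scrapes_total = 0  # plays the role of a scrapes_total counter

    def collect(self):
        self.scrapes_total += 1
        return self.scrapes_total


class CollectorCache:
    """Reuses one collector per distinct set of collect filters."""

    def __init__(self):
        self._collectors = {}

    def get(self, collect_params):
        # Assumption: filter order does not matter, so sort before keying.
        key = tuple(sorted(collect_params))
        if key not in self._collectors:
            self._collectors[key] = Collector(key)
        return self._collectors[key]


def scrape_without_cache(collect_params):
    # A fresh collector per request: its counter always restarts from zero.
    return Collector(tuple(collect_params)).collect()


def scrape_with_cache(cache, collect_params):
    # The cached collector keeps its counter across requests.
    return cache.get(collect_params).collect()
```

Scraping the same filter three times without the cache reports a count of 1 each time, while the cached variant reports 1, 2, 3.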
This PR:
Testing