add some bstream metrics #25

Open
wants to merge 8 commits into
base: develop

fschoell

This adds metrics for:

  • blocks/bytes read by filesource from s3
  • blocks/bytes sent from filesource
  • blocks/bytes sent from livesource
  • blocks_behind_live on joiningsource labeled with trace_id

Note: this depends on streamingfast/dmetrics#1 and streamingfast/dtracing#1. Once those are merged, I can remove the mod replacement.

@sduchesneau
Contributor

sduchesneau commented Oct 12, 2022

The per-trace_id label is not good in practice: it will increase the size of the Prometheus database ridiculously!

from: https://prometheus.io/docs/practices/naming/#labels
CAUTION: Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

This should be either:

  1. turned on with a specific debug flag (in which case I don't see how useful it is...)
  2. using a "reused" ordinal number instead of the trace_id (in which case it is not that useful...)
  3. or left out of Prometheus and exposed to the investigating sysadmin through other means, e.g. logs, a custom out-of-band HTTP request, ...

@sduchesneau
Contributor

Another way to get information about the state of the firehose (without per-traceid-status) is to use a counter for how many concurrent streams are behind the HEAD.

A simple Prometheus gauge could be incremented (+1) when a joining source starts reading from the files, and decremented (-1) when it becomes live or when it is shut down, with the decrement wrapped in a sync.Once.Do() so that it only happens once.

This means removing the ctx that was added to the joining source and only playing with those counters.
