[Feature Request] Counter Metrics to detect leader and follower check failures. #12711

Open
gargharsh3134 opened this issue Mar 18, 2024 · 1 comment
Assignees
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request

Comments

@gargharsh3134
Contributor

gargharsh3134 commented Mar 18, 2024

Is your feature request related to a problem? Please describe

With the introduction of the Request Tracing Framework (RTF), which uses OpenTelemetry (OTel), histogram and counter metrics can now be published and used to track failures.

This issue tracks the instrumentation needed to introduce the following two counter metrics, which identify node drops and health check failures for both leader and follower nodes:

  1. Leader Check Failures -> Health check failures for the ClusterManager node (leader), performed by follower nodes.
  2. Follower Check Failures -> Health check failures for follower nodes, performed by the ClusterManager node (leader).

Describe the solution you'd like

OTel Counter Metrics: Support for Counter-type metrics, added as part of #10241, can be used to publish these metrics.
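
A rough sketch of how the two counters might be wired in is shown below. It is illustrative only: `HealthCheckFailureMetrics`, the `Counter` interface, `InMemoryCounter`, and the two callback methods are hypothetical names invented for this example, not the counter API added in #10241. The in-memory counter exists only so the sketch runs without a telemetry backend.

```java
import java.util.concurrent.atomic.LongAdder;

/** Sketch only: every name in this class is hypothetical, not an OpenSearch API. */
final class HealthCheckFailureMetrics {

    /** Minimal stand-in for a monotonically increasing telemetry counter. */
    interface Counter {
        void add(long value);
    }

    /** In-memory counter so the sketch runs without a telemetry backend. */
    static final class InMemoryCounter implements Counter {
        private final LongAdder total = new LongAdder();

        @Override
        public void add(long value) {
            // Counters only ever increase, so negative increments are rejected.
            if (value < 0) {
                throw new IllegalArgumentException("counter increments must be non-negative");
            }
            total.add(value);
        }

        long value() {
            return total.sum();
        }
    }

    private final Counter leaderCheckFailures;
    private final Counter followerCheckFailures;

    HealthCheckFailureMetrics(Counter leaderCheckFailures, Counter followerCheckFailures) {
        this.leaderCheckFailures = leaderCheckFailures;
        this.followerCheckFailures = followerCheckFailures;
    }

    /** Assumed hook: called on a follower when its health check of the leader fails. */
    void onLeaderCheckFailure() {
        leaderCheckFailures.add(1);
    }

    /** Assumed hook: called on the leader when its health check of a follower fails. */
    void onFollowerCheckFailure() {
        followerCheckFailures.add(1);
    }
}
```

In a real implementation the two `Counter` instances would presumably come from the telemetry metrics registry, and the hooks would be invoked from the failure paths of the leader and follower checkers.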

Related component

Cluster Manager

Describe alternatives you've considered

No response

Additional context

No response

@gargharsh3134 gargharsh3134 added enhancement Enhancement or improvement to existing feature or request untriaged labels Mar 18, 2024
@gargharsh3134 gargharsh3134 changed the title [Feature Request] Counter Metrics to detect leader and follower checks. [Feature Request] Counter Metrics to detect leader and follower check failures. Mar 18, 2024
@andrross
Member

[Triage - attendees 1 2 3 4 5 6]
@gargharsh3134 Thanks for filing, looking forward to seeing progress on the PR.

@rwali-aws rwali-aws moved this from 🆕 New to Now(This Quarter) in Cluster Manager Project Board Apr 22, 2024
Projects
Status: Now(This Quarter)
Development

No branches or pull requests

2 participants