Exportable service metrics #134

Open
1 task
j08lue opened this issue Aug 20, 2024 · 5 comments

Comments

@j08lue
Member

j08lue commented Aug 20, 2024

Scope out what it would take to gather service health / usage metrics and make them accessible to any harvester of such metrics, e.g. to feed into a Grafana dashboard or OpenSearch.

Similar to what veda-backend has been doing, e.g.

Perhaps related to

Acceptance criteria

  • Describe possible pathways to an interface that exposes service health and usage metrics / traces
@j08lue
Member Author

j08lue commented Aug 20, 2024

Why this is coming up now:

I learned that the Resource Health building block in EOEPCA may in the future be able to receive OpenTelemetry traces from other building blocks, to ease debugging.

VEDA Backend has set up a centralized usage metrics overview (dashboard), which would also be highly relevant as a feature for eoapi-k8s.

@ranchodeluxe
Contributor

ranchodeluxe commented Aug 27, 2024

I think we have about 75% of this already done in main b/c our metrics are already exportable via Prometheus on an API endpoint (that's how we have custom metrics about each service to autoscale by) and they get fed to Grafana. If I understand correctly the only addition here would be either:

  1. setting up an OTeL deployment that feeds more data into Prometheus
  2. setting up an OTeL exporter deployment that pushes data to an OTeL receiver service that EOEPCA+ sets up

That said we use ETOL on the fire atlas stuff and without writing custom metrics I haven't seen a lot of good data coming out of it that we can't get from other metrics APIs already. So something to think about

@j08lue
Member Author

j08lue commented Aug 28, 2024

Great to hear the infrastructure is basically already there and we just need to add some custom metrics.

Once our services are up, let us see which metrics those could be.

What is ETOL? 🙏

@ranchodeluxe
Contributor

> What is ETOL? 🙏

The Encyclopedia of Trotskyism Online 😉 I mean OTeL

@ranchodeluxe
Contributor

@j08lue: It also occurred to me this means we'd have to build our own custom runtimes to add OTeL stuff 😞 I really, truly feel like this is yet another example of something that should be plumbed through and turned on in all the upstream libraries 😬
