How to debug OTLP trace exporter transient error #3110
-
The response reason already shows the readable message for the error, and it says what it received. Understanding the root cause involves debugging the server side that sent the Internal Server Error. You probably want to spend time debugging the server to find the reason: since you mention the export eventually succeeds, the problem is not in the SDK. How could the SDK help find the root cause beyond what it does today, which is showing the message when it sees a transient error?
-
I monkey-patched the OpenTelemetry library to see the response message from Tempo in case of a 500 status; see https://github.com/grafana/tempo/issues/2237
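For readers who want something similar without patching the library, here is a minimal sketch of a response hook that logs the status, reason and body of failed export requests. It is not the patch referred to above. The function has the shape of a `requests` response hook (registered with `session.hooks["response"].append(...)`); whether your exporter version lets you supply your own `requests.Session` is an assumption you would need to check. The logger name, the 500-character body limit and the stand-in response are hypothetical.

```python
import logging
from types import SimpleNamespace

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("otlp_export_debug")


def log_error_response(response, *args, **kwargs):
    """Log status, reason and body of any HTTP response with status >= 400.

    The signature matches a `requests` response hook. `response` only needs
    the attributes `status_code`, `reason`, `url` and `text`.
    """
    if response.status_code >= 400:
        logger.warning(
            "export request to %s failed: %s %s, body: %r",
            response.url,
            response.status_code,
            response.reason,
            response.text[:500],  # truncate large bodies (arbitrary limit)
        )
    return response


# Stand-in for a real response, used only to exercise the hook in isolation.
fake = SimpleNamespace(
    status_code=500,
    reason="Internal Server Error",
    url="http://collector.invalid/v1/traces",
    text="example body",
)
logging.basicConfig(level=logging.WARNING)
log_error_response(fake)
```

With a real session, the hook would be attached once at startup, and every failed export attempt would then log the server's response body next to the status code.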
-
@manuel-koch is there something else from the OTel Python side that can be discussed here? If not, let's close this discussion, please. ✌️
-
We are using OpenTelemetry to export tracing spans to Grafana/Tempo (grafana/agent:v0.29.0) within a k8s cluster.
For some unknown reason we see lots of log messages in our service like
Such logs follow a request log message (which seems to indicate a server error 500 on the Grafana/Tempo side) like the following
Eventually the trace spans seem to get exported successfully, but the (synchronous) retries during the span export degrade the overall performance of our service.
How could we find more info regarding the root cause of this transient error?
That is, the logged error (although caused by a request error) does not print the response status code of the underlying error.
Is there a way to get more info about the error?
I tried setting the environment variable OTEL_LOG_LEVEL=DEBUG, but this does not result in any more log output for this error.
I tried to align the Grafana/Tempo pod logs with our service pod logs, but could not find any logs in Grafana/Tempo that correspond to the service warning logs.
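One alternative to the environment variable is to raise verbosity through Python's standard `logging` module. This is a minimal sketch, assuming the OpenTelemetry Python packages log under the `opentelemetry` logger namespace and that the exporter sends its requests through `urllib3` (which logs each request line with its status code at DEBUG level). Both namespaces are assumptions about your setup, and `enable_debug_logging` is a hypothetical helper name.

```python
import logging


def enable_debug_logging(names=("opentelemetry", "urllib3")):
    """Set the given logger namespaces to DEBUG and ensure a handler exists."""
    # basicConfig installs a stderr handler on the root logger if none exists.
    # The root level is left untouched, so other libraries stay quiet.
    logging.basicConfig(
        format="%(asctime)s %(levelname)s %(name)s: %(message)s"
    )
    for name in names:
        # Child loggers (e.g. "opentelemetry.exporter") inherit this level.
        logging.getLogger(name).setLevel(logging.DEBUG)


enable_debug_logging()
```

Call this once at service startup, before the tracer provider and exporter are created, so that records emitted during export reach the handler.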