-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenTelemetry OTLP setup for tracing, take 2 #697
base: main
Are you sure you want to change the base?
Conversation
Replace the "movement_timing" tracing target and the logging layer it targeted with an optionally installed OpenTelemetry OTLP exporter. The name of the tracing target matched to send OpenTelemetry events is "movement_telemetry".
Add the telemetry overlay enabling OTLP telemetry export in suzuka-full-node and m1-da-light-node.. In the telemetry overlay for suzuka-full-node, add an OTLP collector start job running a docker container.
Provide telemetry API as separate from tracing rather than a globally installed layer. Installing an OpenTelemetry layer into the global tracing subscriber raises nasty reentrancy issues because the OTLP exporter stack also uses tracing under the hood.
Install an OpenTelemetryLayer configured with the OTLP exporter. The tracing spans and events to export are selected by target "movement_telemetry".
The implementation of shutdown in the opentelemetry_sdk exporter calls futures_executor::block_on, which does not play well with the multithreaded Tokio runtime.
3a27069
to
f713d71
Compare
OpenTelemetry needs spans at the top level of its log event model at least.
Emit telemetry events detailing the success or failure
In the transaction_ingress task of suzuka-full-node and executor's transaction_pipe, add comments detailing which metrics the telemetry events are contributing to.
At the points where transaction is dropped in the submit_transaction method, emit telemetry events. These will help compute the transaction failure rate.
environment: | ||
|
||
processes: | ||
otlp-collector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't seem to be getting anything served by this and the e2e tests for the basic simulation are failing. Is this how I should be using this?:
just suzuka-full-node native build.setup.eth-local.celestia-local.test.telemetry --keep-project
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a feed
overlay in case for some reason this is not in fact staying open with the --keep-project
flag.
just suzuka-full-node native build.setup.eth-local.celestia-local.feed.telemetry --keep-project
@mzabaluev Last commit adds a test. It would be good to add to CI if you like it. |
Apply the telemetry overlay when running tests in the local setup.
I have added the overlays to the local test job, as we need some tests to drive the exporting. Do you think it would be better to test separately? |
Port the simple jaeger setup from process-compose.
Resolves: #555 #556 #559
Summary
protocol-units
,networks
,util
.Integrate an OTLP exporter for tracing events matching the "movement_tracing" target.
The OTLP gRPC endpoint is configured with the MOVEMENT_OTLP environment variable.
This replaces the earlier "movement_timing" tracing layer, as analysis of OpenTelemetry data should be more versatile.
Changelog
MOVEMENT_OTLP
environment variable.MOVEMENT_TIMING
.Testing
The
telemetry
overlay for process-compose runs the jaeger all-in-one Docker container as a local OpenTelemetry collector.