improve documentation of the otelcol benchmarks #36230

Open
cforce opened this issue Nov 6, 2024 · 1 comment
Labels
enhancement (New feature or request), needs triage (New item requiring triage), testbed

Comments

cforce commented Nov 6, 2024

Component(s)

testbed

Is your feature request related to a problem? Please describe.

I’ve reviewed the OpenTelemetry Collector benchmark information at https://opentelemetry.io/docs/collector/benchmarks/ but am still unclear about several aspects of the graphs. I’m looking for a detailed explanation of each test run, along with specifics on the hardware and OS setups used for each. I’m also looking for explanations and advice on tuning the Collector to reduce CPU usage.

Describe the solution you'd like

Some of the questions I have are:

  • What are the CPU specifications, such as the number of cores and their types? How many cores and threads are utilized?
  • What does "SPS" mean in these test names? Is it “Samples per Second” or “Data per Second”?
  • Why does HTTP appear to have lower CPU usage than gRPC in these benchmarks?
  • How is CPU usage distributed in HTTP versus gRPC tests?
  • Which configuration parameters are most relevant for optimizing CPU usage, especially when handling lower trace volumes?
  • How has SPS/DPS performance evolved across OpenTelemetry Collector versions? Are there any historical trends in performance changes?
  • How can I trace and analyze CPU usage for my custom Collector build, especially since the benchmark seems to exclude certain widely-used receivers, processors, exporters, and extensions?
  • The current testbed doesn’t seem to evaluate a complete pipeline, such as an HTTP/gRPC receiver, processor, and gRPC/HTTP exporter. In that case, is HTTP truly the most efficient end-to-end option in terms of CPU usage? How would backpressure on the exporter side impact CPU and RAM usage?
  • How can I tune the OpenTelemetry Collector to prevent CPU spikes, similar to how memory limits can be set?

Any insights into these areas would be immensely helpful!

Describe alternatives you've considered

No response

Additional context

No response

cforce added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Nov 6, 2024
Contributor

github-actions bot commented Nov 6, 2024

Pinging code owners:

  • testbed: @open-telemetry/collector-approvers

See Adding Labels via Comments if you do not have permissions to add labels yourself.
