Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profiler: suppress errors if the profiler is stopped #2886

Merged
merged 1 commit into from
Sep 25, 2024

Conversation

nsrip-dd
Copy link
Contributor

@nsrip-dd nsrip-dd commented Sep 24, 2024

What does this PR do?

The fix in #2865 was intended to suppress spurious metrics profile
errors when the profiler is stopped. It did so by relaxing the
one-second duration constraint of the metrics profiler. However,
the Windows system timer resolution is about 15 milliseconds (see
https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/high-resolution-timers#controlling-timer-accuracy)
This caused the metrics profile tests from #2865 to fail because the
metrics profiler will likely be stopped in less than 15 milliseconds,
meaning we'll see 0 duration between profile collection and log an
error.

This commit actually suppresses the error by checking whether the
profiler was stopped (meaning interruptibleSleep was interrupted). If
so, and if the metrics profiler returned an error, we instead return a
sentinel error indicating that profiling was stopped. If we see that
error, we just drop the profile and don't log an error. We won't upload
the profile anyway. This way, we should only report an error from the
metrics profiler if there is actually a problem with the timer.

Motivation

Fix broken CI. Example failure: https://github.com/DataDog/dd-trace-go/actions/runs/11017364478/job/30595111862

@pr-commenter
Copy link

pr-commenter bot commented Sep 24, 2024

Benchmarks

Benchmark execution time: 2024-09-24 18:42:43

Comparing candidate commit bd99cda in PR branch nick.ripley/fix-windows-metrics-profile-test with baseline commit eef52d3 in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 58 metrics, 0 unstable metrics.

scenario:BenchmarkInjectW3C-24

  • 🟥 execution_time [+82.369ns; +104.831ns] or [+2.033%; +2.588%]

@nsrip-dd nsrip-dd force-pushed the nick.ripley/fix-windows-metrics-profile-test branch from 8772c17 to e398c9c Compare September 24, 2024 17:31
@nsrip-dd nsrip-dd changed the title profiler: handle low Windows timer resolution for metrics profile tests profiler: suppress errors if the profiler is stopped Sep 24, 2024
@nsrip-dd
Copy link
Contributor Author

nsrip-dd commented Sep 24, 2024

Kicking off a Windows run to test this: https://github.com/DataDog/dd-trace-go/actions/runs/11019312785

@nsrip-dd nsrip-dd force-pushed the nick.ripley/fix-windows-metrics-profile-test branch from e398c9c to 232e3d1 Compare September 24, 2024 17:51
The fix in #2865 was intended to suppress spurious metrics profile
errors when the profiler is stopped. It did so by relaxing the
one-second duration constraint of the metrics profiler. However,
the Windows system timer resolution is about 15 milliseconds (see
https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/high-resolution-timers#controlling-timer-accuracy)
This caused the metrics profile tests from #2865 to fail because the
metrics profiler will likely be stopped in less than 15 milliseconds,
meaning we'll see 0 duration between profile collection and log an
error.

This commit actually suppresses the error by checking whether the
profiler was stopped (meaning interruptibleSleep was interrupted). If
so, and if the metrics profiler returned an error, we instead return a
sentinel error indicating that profiling was stopped. If we see that
error, we just drop the profile and don't log an error. We won't upload
the profile anyway. This way, we should only report an error from the
metrics profiler if there is _actually_ a problem with the timer.
@nsrip-dd nsrip-dd force-pushed the nick.ripley/fix-windows-metrics-profile-test branch from 232e3d1 to bd99cda Compare September 24, 2024 17:58
@nsrip-dd nsrip-dd marked this pull request as ready for review September 24, 2024 18:11
@nsrip-dd nsrip-dd requested a review from a team as a code owner September 24, 2024 18:11
Copy link
Member

@felixge felixge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks 🙇

@nsrip-dd nsrip-dd merged commit ac73f9b into main Sep 25, 2024
146 checks passed
@nsrip-dd nsrip-dd deleted the nick.ripley/fix-windows-metrics-profile-test branch September 25, 2024 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants