Try to fix flake in ManualInstrumentation tests #6333

Merged 2 commits on Nov 25, 2024

andrewlock
Member

Summary of changes

Add a mutex to try to avoid flake

Reason for change

We see occasional flakiness in the manual + automatic instrumentation tests. I don't entirely understand why this happens, but the failures show that the httplistener span is being serialized after we have reconfigured the tracer (which is possible because traces aren't tied to a settings instance). This is arguably a bug, but it's "known" to an extent. The problem is that it causes flaky failures when the test changes the base service name.

FWIW, we already "fixed" this same symptom once: #4960

Implementation details

Add a mutex to try to make sure the httplistener span is closed before we close the request span
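
As a rough illustration of the ordering this is meant to enforce, here is a minimal sketch in Python (the real tests are .NET; Python is used only to keep the example short). The names `run_request`, `handle_request`, and `listener_done` are hypothetical and not taken from the PR. It assumes the caller takes the lock before sending the request and the handler releases it once its own span is closed.

```python
import threading


def run_request():
    """Return the order in which the two (simulated) spans were closed."""
    closed_order = []
    # Hypothetical lock: held for as long as the listener span is still open.
    listener_done = threading.Lock()

    def handle_request():
        # Stand-in for the server-side handler that owns the httplistener span.
        try:
            closed_order.append("httplistener")  # simulate closing the listener span
        finally:
            listener_done.release()  # signal that the listener span is closed

    listener_done.acquire()  # taken before the request is sent
    server = threading.Thread(target=handle_request)
    server.start()

    # Blocks until the handler has released the lock, so the request span
    # cannot be closed before the listener span.
    with listener_done:
        closed_order.append("request")  # simulate closing the request span

    server.join()
    return closed_order


print(run_request())
```

A plain `threading.Lock` is used because it may be released by a thread other than the one that acquired it, which is what lets it act as a one-shot handshake here.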

Test coverage

Enabled debug logs so that we can get more details if this does recur.

Other details

Stacked on:

@andrewlock andrewlock added the area:tests unit tests, integration tests label Nov 21, 2024
@andrewlock andrewlock requested a review from a team as a code owner November 21, 2024 15:28
@datadog-ddstaging

datadog-ddstaging bot commented Nov 21, 2024

Datadog Report

Branch report: andrew/ci/manual-instrumentation-flake-2
Commit report: d6306fa
Test service: dd-trace-dotnet

✅ 0 Failed, 449567 Passed, 2726 Skipped, 19h 20m 8.69s Total Time

@andrewlock
Member Author

andrewlock commented Nov 21, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.
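
To make the second threshold concrete, here is a minimal Python sketch of the size filter only. It assumes the percentage is measured relative to the master mean and that both conditions must hold together. The function name `exceeds_thresholds` is hypothetical, and the significance test itself is not reproduced.

```python
def exceeds_thresholds(master_ms, pr_ms, rel_threshold=0.05, abs_threshold_ms=5.0):
    """Return True when the difference is larger than both thresholds.

    Assumption: the relative difference is computed against the master mean.
    """
    diff_ms = abs(pr_ms - master_ms)
    return diff_ms > abs_threshold_ms and diff_ms / master_ms > rel_threshold


# Means from the FakeDbCommand (.NET Core 3.1) CallTarget+Inlining+NGEN chart
# in this report: master 772 ms, this PR 765 ms. The 7 ms gap is above 5 ms
# but under 1% of the master mean, so it is not considered.
print(exceeds_thresholds(772, 765))

# Hypothetical values (not from the report): 100 ms vs 110 ms is a 10 ms, 10% gap.
print(exceeds_thresholds(100, 110))
```

Requiring both an absolute and a relative gap keeps very short runs from being flagged over a few milliseconds of noise, and long runs from being flagged over a tiny percentage.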

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (72ms)  : 64, 79
     .   : milestone, 72,
    master - mean (71ms)  : 64, 78
     .   : milestone, 71,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (1,108ms)  : 1088, 1128
     .   : milestone, 1108,
    master - mean (1,109ms)  : 1083, 1135
     .   : milestone, 1109,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (109ms)  : 106, 111
     .   : milestone, 109,
    master - mean (109ms)  : 106, 112
     .   : milestone, 109,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (765ms)  : 750, 780
     .   : milestone, 765,
    master - mean (772ms)  : 752, 791
     .   : milestone, 772,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (92ms)  : 90, 94
     .   : milestone, 92,
    master - mean (92ms)  : 91, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (725ms)  : 712, 738
     .   : milestone, 725,
    master - mean (726ms)  : 711, 741
     .   : milestone, 726,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (191ms)  : 187, 195
     .   : milestone, 191,
    master - mean (192ms)  : 186, 198
     .   : milestone, 192,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (1,214ms)  : 1193, 1234
     .   : milestone, 1214,
    master - mean (1,214ms)  : 1192, 1237
     .   : milestone, 1214,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (277ms)  : 272, 281
     .   : milestone, 277,
    master - mean (278ms)  : 272, 284
     .   : milestone, 278,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (942ms)  : 926, 959
     .   : milestone, 942,
    master - mean (949ms)  : 933, 965
     .   : milestone, 949,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6333) - mean (266ms)  : 261, 271
     .   : milestone, 266,
    master - mean (266ms)  : 262, 270
     .   : milestone, 266,

    section CallTarget+Inlining+NGEN
    This PR (6333) - mean (930ms)  : 910, 950
     .   : milestone, 930,
    master - mean (931ms)  : 914, 948
     .   : milestone, 931,


@andrewlock andrewlock requested review from a team as code owners November 22, 2024 12:20
@andrewlock andrewlock force-pushed the andrew/ci/manual-instrumentation-flake branch from ee4d15f to ad3394e Compare November 22, 2024 13:32
@andrewlock andrewlock force-pushed the andrew/ci/manual-instrumentation-flake-2 branch from a931007 to 0e651ce Compare November 22, 2024 13:32
@andrewlock
Member Author

andrewlock commented Nov 22, 2024

Benchmarks Report for tracer 🐌

Benchmarks for #6333 compared to master:

  • 3 benchmarks are slower, with geometric mean 1.184
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with a significance level of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 7.97μs 45.4ns 327ns 0.0119 0.00398 0 5.61 KB
master StartStopWithChild netcoreapp3.1 9.92μs 54.9ns 334ns 0.0151 0.00503 0 5.8 KB
master StartStopWithChild net472 16.3μs 51.3ns 199ns 1.06 0.332 0.116 6.21 KB
#6333 StartStopWithChild net6.0 8.08μs 41.3ns 198ns 0.0166 0.0083 0 5.62 KB
#6333 StartStopWithChild netcoreapp3.1 10.6μs 53.8ns 241ns 0.02 0.005 0 5.81 KB
#6333 StartStopWithChild net472 16.2μs 34.4ns 129ns 1.04 0.329 0.104 6.21 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 487μs 394ns 1.53μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 654μs 145ns 543ns 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 859μs 583ns 2.26μs 0.425 0 0 3.3 KB
#6333 WriteAndFlushEnrichedTraces net6.0 487μs 546ns 2.12μs 0 0 0 2.7 KB
#6333 WriteAndFlushEnrichedTraces netcoreapp3.1 656μs 371ns 1.44μs 0 0 0 2.7 KB
#6333 WriteAndFlushEnrichedTraces net472 849μs 360ns 1.3μs 0.425 0 0 3.3 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 152μs 856ns 6.52μs 0.148 0 0 14.47 KB
master SendRequest netcoreapp3.1 167μs 884ns 6μs 0.169 0 0 17.27 KB
master SendRequest net472 0.000513ns 0.000275ns 0.00103ns 0 0 0 0 b
#6333 SendRequest net6.0 152μs 879ns 7.57μs 0.146 0 0 14.47 KB
#6333 SendRequest netcoreapp3.1 158μs 791ns 4.11μs 0.158 0 0 17.27 KB
#6333 SendRequest net472 0.00144ns 0.000753ns 0.00291ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 564μs 2.75μs 11μs 0.558 0 0 41.57 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 680μs 3.75μs 22.8μs 0.406 0 0 41.75 KB
master WriteAndFlushEnrichedTraces net472 863μs 4.32μs 18.3μs 8.56 2.57 0.428 53.28 KB
#6333 WriteAndFlushEnrichedTraces net6.0 568μs 2.79μs 11.5μs 0.548 0 0 41.63 KB
#6333 WriteAndFlushEnrichedTraces netcoreapp3.1 690μs 3.43μs 14.2μs 0.336 0 0 41.82 KB
#6333 WriteAndFlushEnrichedTraces net472 880μs 4.11μs 15.4μs 8.13 2.57 0.428 53.32 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.31μs 0.782ns 2.93ns 0.0145 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.73μs 2.21ns 8.25ns 0.0138 0 0 1.02 KB
master ExecuteNonQuery net472 2.04μs 2.42ns 9.38ns 0.156 0.00102 0 987 B
#6333 ExecuteNonQuery net6.0 1.29μs 1.02ns 3.67ns 0.0145 0 0 1.02 KB
#6333 ExecuteNonQuery netcoreapp3.1 1.8μs 1.29ns 4.83ns 0.0133 0 0 1.02 KB
#6333 ExecuteNonQuery net472 2.1μs 1.83ns 7.08ns 0.157 0.00105 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.2μs 0.724ns 2.71ns 0.0138 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.51μs 1.23ns 4.6ns 0.0129 0 0 976 B
master CallElasticsearch net472 2.46μs 1.32ns 5.12ns 0.157 0 0 995 B
master CallElasticsearchAsync net6.0 1.35μs 0.535ns 2.07ns 0.0129 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.66μs 0.878ns 3.4ns 0.0134 0 0 1.02 KB
master CallElasticsearchAsync net472 2.75μs 1.27ns 4.76ns 0.166 0 0 1.05 KB
#6333 CallElasticsearch net6.0 1.21μs 0.611ns 2.29ns 0.014 0 0 976 B
#6333 CallElasticsearch netcoreapp3.1 1.54μs 0.58ns 2.25ns 0.0133 0 0 976 B
#6333 CallElasticsearch net472 2.52μs 1.11ns 4.28ns 0.157 0 0 995 B
#6333 CallElasticsearchAsync net6.0 1.31μs 1ns 3.87ns 0.0132 0 0 952 B
#6333 CallElasticsearchAsync netcoreapp3.1 1.59μs 0.728ns 2.72ns 0.0135 0 0 1.02 KB
#6333 CallElasticsearchAsync net472 2.66μs 2.6ns 10.1ns 0.167 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.31μs 0.484ns 1.81ns 0.0131 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.62μs 2.24ns 8.37ns 0.0128 0 0 952 B
master ExecuteAsync net472 1.77μs 0.485ns 1.81ns 0.145 0 0 915 B
#6333 ExecuteAsync net6.0 1.32μs 0.56ns 2.09ns 0.013 0 0 952 B
#6333 ExecuteAsync netcoreapp3.1 1.7μs 0.629ns 2.27ns 0.0126 0 0 952 B
#6333 ExecuteAsync net472 1.8μs 0.383ns 1.38ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.32μs 3.27ns 12.7ns 0.0323 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.35μs 3.02ns 11.7ns 0.0375 0 0 2.85 KB
master SendAsync net472 7.37μs 8.02ns 31.1ns 0.493 0 0 3.12 KB
#6333 SendAsync net6.0 4.3μs 1.3ns 5.05ns 0.0323 0 0 2.31 KB
#6333 SendAsync netcoreapp3.1 5.36μs 1.7ns 6.6ns 0.0375 0 0 2.85 KB
#6333 SendAsync net472 7.37μs 3.16ns 12.2ns 0.494 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.6μs 0.785ns 2.94ns 0.0233 0 0 1.64 KB
master EnrichedLog netcoreapp3.1 2.22μs 0.93ns 3.48ns 0.0221 0 0 1.64 KB
master EnrichedLog net472 2.77μs 0.859ns 3.33ns 0.249 0 0 1.57 KB
#6333 EnrichedLog net6.0 1.58μs 0.757ns 2.93ns 0.0229 0 0 1.64 KB
#6333 EnrichedLog netcoreapp3.1 2.28μs 2.8ns 10.5ns 0.0215 0 0 1.64 KB
#6333 EnrichedLog net472 2.71μs 2.33ns 8.7ns 0.249 0 0 1.57 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 119μs 174ns 673ns 0.0596 0 0 4.28 KB
master EnrichedLog netcoreapp3.1 124μs 226ns 875ns 0 0 0 4.28 KB
master EnrichedLog net472 150μs 107ns 399ns 0.681 0.227 0 4.46 KB
#6333 EnrichedLog net6.0 120μs 177ns 686ns 0.0603 0 0 4.28 KB
#6333 EnrichedLog netcoreapp3.1 123μs 170ns 657ns 0 0 0 4.28 KB
#6333 EnrichedLog net472 150μs 105ns 405ns 0.681 0.227 0 4.46 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.12μs 1.24ns 4.82ns 0.0297 0 0 2.2 KB
master EnrichedLog netcoreapp3.1 4.17μs 3.85ns 14.9ns 0.0293 0 0 2.2 KB
master EnrichedLog net472 4.91μs 4.95ns 19.2ns 0.32 0 0 2.02 KB
#6333 EnrichedLog net6.0 3.17μs 1.22ns 4.72ns 0.0302 0 0 2.2 KB
#6333 EnrichedLog netcoreapp3.1 4.11μs 1.68ns 6.3ns 0.0291 0 0 2.2 KB
#6333 EnrichedLog net472 4.98μs 1.1ns 4.27ns 0.319 0 0 2.02 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.29μs 0.403ns 1.51ns 0.016 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.76μs 1ns 3.88ns 0.015 0 0 1.14 KB
master SendReceive net472 2.03μs 1.27ns 4.92ns 0.183 0 0 1.16 KB
#6333 SendReceive net6.0 1.41μs 0.537ns 2.01ns 0.0161 0 0 1.14 KB
#6333 SendReceive netcoreapp3.1 1.76μs 0.568ns 2.2ns 0.0157 0 0 1.14 KB
#6333 SendReceive net472 2.17μs 1.41ns 5.28ns 0.184 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.7μs 0.665ns 2.49ns 0.0217 0 0 1.6 KB
master EnrichedLog netcoreapp3.1 3.99μs 1.83ns 6.59ns 0.0223 0 0 1.65 KB
master EnrichedLog net472 4.34μs 1.7ns 6.12ns 0.322 0 0 2.04 KB
#6333 EnrichedLog net6.0 2.62μs 1.17ns 4.39ns 0.0219 0 0 1.6 KB
#6333 EnrichedLog netcoreapp3.1 3.92μs 1.98ns 7.65ns 0.0216 0 0 1.65 KB
#6333 EnrichedLog net472 4.47μs 1.23ns 4.59ns 0.323 0 0 2.04 KB
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6333

| Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.SpanBenchmark.StartFinishScope‑netcoreapp3.1 | 1.232 | 640.08 | 788.41 | |
| Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 | 1.203 | 387.35 | 465.93 | |
| Benchmarks.Trace.SpanBenchmark.StartFinishScope‑net6.0 | 1.120 | 477.12 | 534.27 | |
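
For reference, the "geometric mean 1.184" quoted in the summary can be reproduced from the three diff/base ratios above. This is a small Python sketch of that arithmetic, not the benchmark tool's own code.

```python
import math

# (base median ns, diff median ns) for the three slower benchmarks listed above.
medians = [
    (640.08, 788.41),  # StartFinishScope, netcoreapp3.1
    (387.35, 465.93),  # StartFinishSpan, net6.0
    (477.12, 534.27),  # StartFinishScope, net6.0
]

# diff/base ratio per benchmark, rounded to three decimals as in the report.
ratios = [round(diff / base, 3) for base, diff in medians]


def geometric_mean(values):
    # exp of the mean of logs, equivalent to the n-th root of the product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(ratios)                           # [1.232, 1.203, 1.12]
print(f"{geometric_mean(ratios):.3f}")  # 1.184
```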

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 388ns 0.528ns 2.04ns 0.00809 0 0 576 B
master StartFinishSpan netcoreapp3.1 615ns 1.12ns 4.34ns 0.00767 0 0 576 B
master StartFinishSpan net472 643ns 1.25ns 4.82ns 0.0917 0 0 578 B
master StartFinishScope net6.0 477ns 0.672ns 2.6ns 0.0099 0 0 696 B
master StartFinishScope netcoreapp3.1 639ns 1.12ns 4.18ns 0.00921 0 0 696 B
master StartFinishScope net472 847ns 1.27ns 4.91ns 0.105 0 0 658 B
#6333 StartFinishSpan net6.0 466ns 1.02ns 3.94ns 0.00794 0 0 576 B
#6333 StartFinishSpan netcoreapp3.1 557ns 0.818ns 3.17ns 0.00759 0 0 576 B
#6333 StartFinishSpan net472 674ns 1.67ns 6.24ns 0.0915 0 0 578 B
#6333 StartFinishScope net6.0 534ns 0.981ns 3.8ns 0.00968 0 0 696 B
#6333 StartFinishScope netcoreapp3.1 786ns 1.83ns 7.08ns 0.00935 0 0 696 B
#6333 StartFinishScope net472 815ns 1.9ns 7.37ns 0.104 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 594ns 0.832ns 3.22ns 0.00963 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 857ns 1.45ns 5.61ns 0.00952 0 0 696 B
master RunOnMethodBegin net472 1.09μs 1.86ns 7.22ns 0.104 0 0 658 B
#6333 RunOnMethodBegin net6.0 650ns 0.874ns 3.38ns 0.00971 0 0 696 B
#6333 RunOnMethodBegin netcoreapp3.1 941ns 2.77ns 10.7ns 0.00936 0 0 696 B
#6333 RunOnMethodBegin net472 1.13μs 2.11ns 8.16ns 0.104 0 0 658 B

Base automatically changed from andrew/ci/manual-instrumentation-flake to master November 22, 2024 17:22
@andrewlock andrewlock force-pushed the andrew/ci/manual-instrumentation-flake-2 branch from 0e651ce to d6306fa Compare November 22, 2024 17:23
@andrewlock andrewlock merged commit 467825f into master Nov 25, 2024
60 of 64 checks passed
@andrewlock andrewlock deleted the andrew/ci/manual-instrumentation-flake-2 branch November 25, 2024 11:31
@github-actions github-actions bot added this to the vNext-v3 milestone Nov 25, 2024
3 participants