[Profiler] Introduced `DD_PROFILER_SKIPPED_METHODS` configuration to exclude specific methods from sampling #6140

GreenMatan · 2024-10-10T08:24:34Z

Summary of changes

This PR introduces a new configuration setting to address a customer issue where the profiler would sample threads transitioning into unmanaged code via P/Invoke, potentially causing crashes due to unsafe operations.

The new configuration, DD_PROFILER_SKIPPED_METHODS, allows users to specify methods that should be excluded from sampling.

Configuration

Key: DD_PROFILER_SKIPPED_METHODS
Value: string
Value Syntax:

Entries are semi-colon (;) delimited
The full type name is specified outside of braces ([ and ])
Methods are specified inside the braces and are comma delimited
Generic methods only need the method name, but generic type names must include the number of generic type variables, like `1 to specify a generic type with one generic type variable

DD_PROFILER_SKIPPED_METHODS="Namespace1.Class1[Method1,Method2];Namespace1.GenericTypeWithOneTypeVariable`1[ExecuteAsync]"

Reason for change

This change was driven by a customer support case involving profiler instability when managed threads transitioned into unmanaged code via P/Invoke are sampled by the profiler.

Implementation details

A new integration type, ProfilerSkippedMethodIntegration, was added. Based on the existing Trace Methods instrumentation, this integration tracks method entry and exit points to mark threads as unsafe for sampling while within these methods.

The shared structure between the profiler's native component and the managed code, TraceContextInfo, has been extended to include a new field _threadMetaInfo (cherry-picked from 5891c12), which tracks the state of the thread:

On method entry, _threadMetaInfo is set to 1, marking the thread as unsafe to sample.
On method exit, it is reset to 0, allowing sampling again.

This ensures the profiler can dynamically adjust sampling behavior based on the thread's current execution context.

Test coverage

UnsafeToUnwindTest that was added in 5891c12, verifies the correct behavior of the profiler when methods are marked as unsafe for sampling.

andrewlock · 2024-10-10T09:06:12Z

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

Welch test with statistical test for significance of 5%
Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (70ms)  : 67, 72
     .   : milestone, 70,
    master - mean (70ms)  : 67, 73
     .   : milestone, 70,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (1,113ms)  : 1093, 1133
     .   : milestone, 1113,
    master - mean (1,111ms)  : 1087, 1134
     .   : milestone, 1111,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (109ms)  : 106, 112
     .   : milestone, 109,
    master - mean (109ms)  : 105, 113
     .   : milestone, 109,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (771ms)  : 758, 785
     .   : milestone, 771,
    master - mean (770ms)  : 756, 784
     .   : milestone, 770,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (92ms)  : 90, 94
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (727ms)  : 709, 744
     .   : milestone, 727,
    master - mean (727ms)  : 710, 744
     .   : milestone, 727,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (190ms)  : 186, 193
     .   : milestone, 190,
    master - mean (190ms)  : 187, 193
     .   : milestone, 190,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (1,197ms)  : 1171, 1223
     .   : milestone, 1197,
    master - mean (1,199ms)  : 1173, 1225
     .   : milestone, 1199,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (275ms)  : 270, 279
     .   : milestone, 275,
    master - mean (275ms)  : 270, 280
     .   : milestone, 275,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (942ms)  : 921, 964
     .   : milestone, 942,
    master - mean (945ms)  : 928, 963
     .   : milestone, 945,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6140) - mean (263ms)  : 260, 266
     .   : milestone, 263,
    master - mean (264ms)  : 259, 268
     .   : milestone, 264,

    section CallTarget+Inlining+NGEN
    This PR (6140) - mean (923ms)  : 904, 943
     .   : milestone, 923,
    master - mean (922ms)  : 899, 946
     .   : milestone, 922,

datadog-ddstaging · 2024-10-10T09:19:10Z

Datadog Report

Branch report: matang/profiler-avoid-sampling-user-pinvokes
Commit report: 0710771
Test service: dd-trace-dotnet

❌ 60 Failed (0 Known Flaky), 292761 Passed, 2941 Skipped, 26h 32m 47.18s Total Time
⌛ 12 Performance Regressions

❌ Failed Tests (60)

This report shows up to 5 failed tests.

AvoidUnwindingUnsafeExecution - Datadog.Profiler.IntegrationTests.Bugs.UnsafeToUnwindTest - Details

Expand for error

 Expected samples.Where(s => s.StackTrace.EndWith(exceptionToSkipStack)) to be null or empty, but found 
     {
         Item1 = |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:RaiseExceptionUnsafeUnwind |fg: |sg:()
 |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:StartException |fg: |sg:(), 
         Item2 = Datadog.Profiler.IntegrationTests.Helpers.PprofHelper+Label
         {
             {
                 Name = "appdomain name", 
                 Value = "clrhost"
             }, 
 ...

AvoidUnwindingUnsafeExecution - Datadog.Profiler.IntegrationTests.Bugs.UnsafeToUnwindTest - Details

Expand for error

 Expected samples.Where(s => s.StackTrace.EndWith(exceptionToSkipStack)) to be null or empty, but found 
     {
         Item1 = |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:RaiseExceptionUnsafeUnwind |fg: |sg:()
 |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:StartException |fg: |sg:()
 |lm:System.Private.CoreLib |ns:System.Threading |ct:ThreadHelper |cg: |fn:ThreadStart_Context |fg: |sg:(object state)
 |lm:System.Private.CoreLib |ns:System.Threading |ct:ExecutionContext |cg: |fn:RunInternal |fg: |sg:(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state)
 |lm:System.Private.CoreLib |ns:System.Threading |ct:ThreadHelper |cg: |fn:ThreadStart |fg: |sg:(), 
         Item2 = Datadog.Profiler.IntegrationTests.Helpers.PprofHelper+Label
         {
             {
 ...

AvoidUnwindingUnsafeExecution - Datadog.Profiler.IntegrationTests.Bugs.UnsafeToUnwindTest - Details

Expand for error

 Expected samples.Where(s => s.StackTrace.EndWith(exceptionToSkipStack)) to be null or empty, but found 
     {
         Item1 = |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:RaiseExceptionUnsafeUnwind |fg: |sg:()
 |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:StartException |fg: |sg:()
 |lm:System.Private.CoreLib |ns:System.Threading |ct:Thread |cg: |fn:StartCallback |fg: |sg:(), 
         Item2 = Datadog.Profiler.IntegrationTests.Helpers.PprofHelper+Label
         {
             {
                 Name = "appdomain name", 
                 Value = "clrhost"
 ...

AvoidUnwindingUnsafeExecution - Datadog.Profiler.IntegrationTests.Bugs.UnsafeToUnwindTest - Details

Expand for error

 Expected samples.Where(s => s.StackTrace.EndWith(exceptionToSkipStack)) to be null or empty, but found 
     {
         Item1 = |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:RaiseExceptionUnsafeUnwind |fg: |sg:()
 |lm:Samples.Computer01 |ns:Samples.Computer01 |ct:UnsafeToUnwind |cg: |fn:StartException |fg: |sg:(), 
         Item2 = Datadog.Profiler.IntegrationTests.Helpers.PprofHelper+Label
         {
             {
                 Name = "appdomain name", 
                 Value = "clrhost"
             }, 
 ...

AvoidUnwindingUnsafeExecution - Datadog.Profiler.IntegrationTests.Bugs.UnsafeToUnwindTest - Details
Expand for error
```
 Expected samples.Any(s => s.StackTrace.EndWith(safeToUnwind)) to be true, but found False.
```

⌛ Performance Regressions vs Default Branch (12)

This report shows up to 5 performance regressions.

Profiler_exceptions - scenarios 5.59s (+264.81ms, +5%) - Details
Profiler_exceptions_cpu_walltime - scenarios 5.61s (+224.07ms, +4%) - Details
Profiler_allocation_cpu_walltime - scenarios 2.76s (+437.82ms, +19%) - Details
Profiler_allocation - scenarios 2.63s (+330.94ms, +14%) - Details
Profiler_cpu_walltime - scenarios 4.76s (+416.22ms, +10%) - Details

andrewlock · 2024-10-10T09:32:43Z

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than latest master (5% drop or greater), results are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6140) (11.011M)   : 0, 11010636
    master (11.099M)   : 0, 11098662
    benchmarks/2.9.0 (11.081M)   : 0, 11080577

    section Automatic
    This PR (6140) (7.243M)   : 0, 7242520
    master (7.353M)   : 0, 7353407
    benchmarks/2.9.0 (7.732M)   : 0, 7732233

    section Trace stats
    master (7.604M)   : 0, 7604389

    section Manual
    master (10.912M)   : 0, 10911730

    section Manual + Automatic
    This PR (6140) (6.720M)   : 0, 6720190
    master (6.757M)   : 0, 6756825

    section DD_TRACE_ENABLED=0
    master (10.056M)   : 0, 10055906

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6140) (9.649M)   : 0, 9649341
    master (9.588M)   : 0, 9587958
    benchmarks/2.9.0 (9.798M)   : 0, 9798067

    section Automatic
    This PR (6140) (6.705M)   : 0, 6704833
    master (6.554M)   : 0, 6554472

    section Trace stats
    master (6.811M)   : 0, 6811095

    section Manual
    master (9.622M)   : 0, 9622473

    section Manual + Automatic
    This PR (6140) (6.229M)   : 0, 6229209
    master (5.894M)   : 0, 5893873

    section DD_TRACE_ENABLED=0
    master (8.887M)   : 0, 8886504

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6140) (10.234M)   : 0, 10234184
    master (10.171M)   : 0, 10171179
    benchmarks/2.9.0 (10.067M)   : 0, 10067315

    section Automatic
    This PR (6140) (6.660M)   : 0, 6660223
    master (6.514M)   : 0, 6513949
    benchmarks/2.9.0 (7.552M)   : 0, 7552193

    section Trace stats
    master (7.311M)   : 0, 7310649

    section Manual
    master (10.008M)   : 0, 10007638

    section Manual + Automatic
    This PR (6140) (6.304M)   : 0, 6304019
    master (6.286M)   : 0, 6285745

    section DD_TRACE_ENABLED=0
    master (9.389M)   : 0, 9388827

andrewlock · 2024-10-10T10:16:55Z

Benchmarks Report for tracer 🐌

Benchmarks for #6140 compared to master:

1 benchmarks are faster, with geometric mean 1.124
1 benchmarks are slower, with geometric mean 1.132
All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

Mann–Whitney U test with statistical test for significance of 5%
Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Gen 2	Allocated
master	`StartStopWithChild`	net6.0	7.85μs	43.7ns	290ns	0.0124	0.00415	0	5.43 KB
master	`StartStopWithChild`	netcoreapp3.1	10μs	56.2ns	364ns	0.0207	0.0104	0	5.61 KB
master	`StartStopWithChild`	net472	16.6μs	38.7ns	150ns	1.03	0.313	0.104	6.08 KB
#6140	`StartStopWithChild`	net6.0	7.67μs	40.1ns	212ns	0.0157	0.00783	0	5.43 KB
#6140	`StartStopWithChild`	netcoreapp3.1	10μs	57ns	399ns	0.0202	0.0101	0	5.61 KB
#6140	`StartStopWithChild`	net472	16.6μs	55.9ns	217ns	1.02	0.308	0.0972	6.06 KB

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`WriteAndFlushEnrichedTraces`	net6.0	478μs	146ns	545ns	0	2.7 KB
master	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	634μs	206ns	800ns	0	2.7 KB
master	`WriteAndFlushEnrichedTraces`	net472	844μs	654ns	2.53μs	0.419	3.3 KB
#6140	`WriteAndFlushEnrichedTraces`	net6.0	476μs	430ns	1.61μs	0	2.7 KB
#6140	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	635μs	329ns	1.27μs	0	2.7 KB
#6140	`WriteAndFlushEnrichedTraces`	net472	837μs	596ns	2.31μs	0.417	3.3 KB

Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`SendRequest`	net6.0	190μs	1.03μs	6.24μs	0.201	18.45 KB
master	`SendRequest`	netcoreapp3.1	224μs	1.28μs	9.23μs	0.212	20.61 KB
master	`SendRequest`	net472	0.0314ns	0.00646ns	0.025ns	0	0 b
#6140	`SendRequest`	net6.0	197μs	1.11μs	7.34μs	0.2	18.45 KB
#6140	`SendRequest`	netcoreapp3.1	225μs	1.25μs	9.98μs	0.218	20.61 KB
#6140	`SendRequest`	net472	0.00496ns	0.00106ns	0.0041ns	0	0 b

Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Gen 2	Allocated
master	`WriteAndFlushEnrichedTraces`	net6.0	559μs	3μs	15.6μs	0.568	0	0	41.63 KB
master	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	680μs	2.27μs	7.86μs	0.342	0	0	41.72 KB
master	`WriteAndFlushEnrichedTraces`	net472	843μs	2.27μs	8.17μs	8.56	2.57	0.428	53.27 KB
#6140	`WriteAndFlushEnrichedTraces`	net6.0	558μs	2.37μs	8.87μs	0.556	0	0	41.8 KB
#6140	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	664μs	3.33μs	19.1μs	0.345	0	0	41.6 KB
#6140	`WriteAndFlushEnrichedTraces`	net472	859μs	3.98μs	15.4μs	8.04	2.23	0.446	53.26 KB

Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`ExecuteNonQuery`	net6.0	1.27μs	1.55ns	5.8ns	0.0145	1.02 KB
master	`ExecuteNonQuery`	netcoreapp3.1	1.76μs	2.38ns	8.9ns	0.0131	1.02 KB
master	`ExecuteNonQuery`	net472	2.1μs	1.85ns	6.93ns	0.157	987 B
#6140	`ExecuteNonQuery`	net6.0	1.34μs	1.17ns	4.39ns	0.0141	1.02 KB
#6140	`ExecuteNonQuery`	netcoreapp3.1	1.8μs	0.924ns	3.58ns	0.0133	1.02 KB
#6140	`ExecuteNonQuery`	net472	2.11μs	0.897ns	3.24ns	0.156	987 B

Benchmarks.Trace.ElasticsearchBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6140

Benchmark	diff/base	Base Median (ns)	Diff Median (ns)	Modality
Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch‑net6.0	1.132	1,090.71	1,234.69

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`CallElasticsearch`	net6.0	1.09μs	1.22ns	4.72ns	0.0139	976 B
master	`CallElasticsearch`	netcoreapp3.1	1.63μs	0.507ns	1.9ns	0.0129	976 B
master	`CallElasticsearch`	net472	2.55μs	1.26ns	4.89ns	0.157	995 B
master	`CallElasticsearchAsync`	net6.0	1.28μs	0.419ns	1.51ns	0.0135	952 B
master	`CallElasticsearchAsync`	netcoreapp3.1	1.63μs	1.55ns	5.81ns	0.0138	1.02 KB
master	`CallElasticsearchAsync`	net472	2.57μs	2.41ns	9.02ns	0.166	1.05 KB
#6140	`CallElasticsearch`	net6.0	1.24μs	0.88ns	3.41ns	0.0136	976 B
#6140	`CallElasticsearch`	netcoreapp3.1	1.54μs	1.86ns	7.22ns	0.0131	976 B
#6140	`CallElasticsearch`	net472	2.48μs	1.8ns	6.72ns	0.157	995 B
#6140	`CallElasticsearchAsync`	net6.0	1.27μs	0.737ns	2.86ns	0.0133	952 B
#6140	`CallElasticsearchAsync`	netcoreapp3.1	1.64μs	0.52ns	2.02ns	0.014	1.02 KB
#6140	`CallElasticsearchAsync`	net472	2.65μs	1.79ns	6.92ns	0.167	1.05 KB

Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`ExecuteAsync`	net6.0	1.29μs	1.12ns	4.32ns	0.0129	952 B
master	`ExecuteAsync`	netcoreapp3.1	1.66μs	3.5ns	13.6ns	0.0125	952 B
master	`ExecuteAsync`	net472	1.72μs	0.571ns	2.21ns	0.145	915 B
#6140	`ExecuteAsync`	net6.0	1.36μs	0.713ns	2.67ns	0.0137	952 B
#6140	`ExecuteAsync`	netcoreapp3.1	1.59μs	0.974ns	3.77ns	0.0127	952 B
#6140	`ExecuteAsync`	net472	1.8μs	0.963ns	3.73ns	0.145	915 B

Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`SendAsync`	net6.0	4.17μs	2.47ns	9.58ns	0.0311	2.22 KB
master	`SendAsync`	netcoreapp3.1	5.01μs	2.23ns	8.65ns	0.0375	2.76 KB
master	`SendAsync`	net472	7.78μs	4.13ns	14.3ns	0.5	3.15 KB
#6140	`SendAsync`	net6.0	4.35μs	1.78ns	6.43ns	0.0305	2.22 KB
#6140	`SendAsync`	netcoreapp3.1	5.04μs	2.13ns	8.24ns	0.0378	2.76 KB
#6140	`SendAsync`	net472	7.73μs	4.76ns	18.4ns	0.497	3.15 KB

Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	1.44μs	0.862ns	3.23ns	0.0231	1.64 KB
master	`EnrichedLog`	netcoreapp3.1	2.31μs	1.19ns	4.45ns	0.0219	1.64 KB
master	`EnrichedLog`	net472	2.69μs	0.793ns	2.86ns	0.25	1.57 KB
#6140	`EnrichedLog`	net6.0	1.55μs	0.928ns	3.47ns	0.0232	1.64 KB
#6140	`EnrichedLog`	netcoreapp3.1	2.08μs	1.11ns	4.16ns	0.0219	1.64 KB
#6140	`EnrichedLog`	net472	2.58μs	1.36ns	5.08ns	0.25	1.57 KB

Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Allocated
master	`EnrichedLog`	net6.0	118μs	134ns	501ns	0.0594	0	4.28 KB
master	`EnrichedLog`	netcoreapp3.1	123μs	225ns	871ns	0	0	4.28 KB
master	`EnrichedLog`	net472	151μs	276ns	1.07μs	0.673	0.224	4.46 KB
#6140	`EnrichedLog`	net6.0	115μs	120ns	450ns	0.0574	0	4.28 KB
#6140	`EnrichedLog`	netcoreapp3.1	121μs	175ns	654ns	0.0608	0	4.28 KB
#6140	`EnrichedLog`	net472	150μs	189ns	733ns	0.68	0.227	4.46 KB

Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	2.91μs	1.11ns	4.29ns	0.0308	2.2 KB
master	`EnrichedLog`	netcoreapp3.1	4.26μs	1.56ns	6.06ns	0.0299	2.2 KB
master	`EnrichedLog`	net472	4.95μs	3.49ns	13.5ns	0.318	2.02 KB
#6140	`EnrichedLog`	net6.0	3.14μs	0.826ns	3.09ns	0.031	2.2 KB
#6140	`EnrichedLog`	netcoreapp3.1	4.08μs	1.49ns	5.59ns	0.0285	2.2 KB
#6140	`EnrichedLog`	net472	5.01μs	2.33ns	9.04ns	0.319	2.02 KB

Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Allocated
master	`SendReceive`	net6.0	1.44μs	0.536ns	2ns	0.0159	0	1.14 KB
master	`SendReceive`	netcoreapp3.1	1.75μs	1.06ns	4.11ns	0.0158	0	1.14 KB
master	`SendReceive`	net472	2.12μs	1.12ns	4.2ns	0.183	0.00106	1.16 KB
#6140	`SendReceive`	net6.0	1.4μs	0.721ns	2.79ns	0.016	0	1.14 KB
#6140	`SendReceive`	netcoreapp3.1	1.76μs	1.38ns	5.34ns	0.0157	0	1.14 KB
#6140	`SendReceive`	net472	2.28μs	1.72ns	6.65ns	0.183	0.00114	1.16 KB

Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	2.71μs	0.634ns	2.46ns	0.0217	1.6 KB
master	`EnrichedLog`	netcoreapp3.1	3.97μs	1.38ns	5.18ns	0.0217	1.65 KB
master	`EnrichedLog`	net472	4.54μs	1.32ns	4.94ns	0.322	2.04 KB
#6140	`EnrichedLog`	net6.0	2.81μs	0.896ns	3.47ns	0.0226	1.6 KB
#6140	`EnrichedLog`	netcoreapp3.1	3.85μs	1.34ns	5.18ns	0.0211	1.65 KB
#6140	`EnrichedLog`	net472	4.39μs	1.74ns	6.74ns	0.323	2.04 KB

Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`StartFinishSpan`	net6.0	400ns	0.173ns	0.669ns	0.00805	576 B
master	`StartFinishSpan`	netcoreapp3.1	594ns	0.522ns	2.02ns	0.00781	576 B
master	`StartFinishSpan`	net472	720ns	0.394ns	1.53ns	0.0915	578 B
master	`StartFinishScope`	net6.0	483ns	0.242ns	0.939ns	0.0097	696 B
master	`StartFinishScope`	netcoreapp3.1	717ns	0.438ns	1.7ns	0.00935	696 B
master	`StartFinishScope`	net472	932ns	0.652ns	2.52ns	0.105	658 B
#6140	`StartFinishSpan`	net6.0	407ns	0.117ns	0.454ns	0.00817	576 B
#6140	`StartFinishSpan`	netcoreapp3.1	585ns	2.59ns	10ns	0.00788	576 B
#6140	`StartFinishSpan`	net472	749ns	0.864ns	2.99ns	0.0915	578 B
#6140	`StartFinishScope`	net6.0	486ns	0.314ns	1.21ns	0.00978	696 B
#6140	`StartFinishScope`	netcoreapp3.1	762ns	0.653ns	2.53ns	0.00956	696 B
#6140	`StartFinishScope`	net472	935ns	0.986ns	3.82ns	0.104	658 B

Benchmarks.Trace.TraceAnnotationsBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #6140

Benchmark	base/diff	Base Median (ns)	Diff Median (ns)	Modality
Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin‑net6.0	1.124	734.52	653.46

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`RunOnMethodBegin`	net6.0	734ns	0.274ns	1.06ns	0.00982	696 B
master	`RunOnMethodBegin`	netcoreapp3.1	916ns	0.21ns	0.758ns	0.00921	696 B
master	`RunOnMethodBegin`	net472	1.12μs	0.475ns	1.71ns	0.104	658 B
#6140	`RunOnMethodBegin`	net6.0	653ns	0.44ns	1.7ns	0.00985	696 B
#6140	`RunOnMethodBegin`	netcoreapp3.1	947ns	0.33ns	1.23ns	0.00897	696 B
#6140	`RunOnMethodBegin`	net472	1.08μs	0.729ns	2.82ns	0.104	658 B

…uration won't be sampled by the profiler, utilizing tracer instrumentation

…tegration

chrisnas

LGTM from the profiler side (just a few comments)

chrisnas · 2024-10-14T15:51:29Z

profiler/src/Demos/Samples.Computer01/UnsafeToUnwind.cs

+ public void Run()
+ {
+ Start();
+ Thread.Sleep(TimeSpan.FromSeconds(10));


If the scenario is used in an integration test, it might be interesting to pass the 10s as a parameter (via --param on the command line). Just in case, it is flacky due to duration issue.

chrisnas · 2024-10-14T15:53:53Z

profiler/src/Demos/Samples.Computer01/UnsafeToUnwind.cs

+ {
+ var t = new Thread(action)
+ {
+ IsBackground = false // set to false to prevent the app from shutting down. The test will fail


the comment may be misleading: "set to false allows the app to shutdown; otherwise, the test would fail"

chrisnas · 2024-10-14T16:03:34Z

profiler/test/Datadog.Profiler.IntegrationTests/Bugs/UnsafeToUnwindTest.cs

+ using var agent = MockDatadogAgent.CreateHttpAgent(_output);
+ runner.Run(agent);
+
+ var toSkipStack = new StackTrace(


These call stacks could declared in the class and shared in the 2 tests and avoid copy/paste

chrisnas · 2024-10-14T16:06:15Z

profiler/test/Datadog.Profiler.IntegrationTests/Bugs/UnsafeToUnwindTest.cs

+ public void AvoidUnwindingUnsafeExecution(string appName, string framework, string appAssembly)
+ {
+ var runner = new TestApplicationRunner(appName, framework, appAssembly, _output, commandLine: "--scenario 28", enableTracer: true);
+ runner.Environment.SetVariable(EnvironmentVariables.SkippedMethods, "Samples.Computer01.UnsafeToUnwind[Wrap_UnSafeToUnwind];Samples.Computer01.UnsafeToUnwind[Wrap_RaiseExceptionUnsafeUnwind];Samples.Computer01.UnsafeToUnwind[Wrap_ContentionUnsafeToUnwind]");


Maybe add another test to validate the different possibilities in term of syntax like type1[foo, bar] in addition to type1[foo];type1[bar]. Is there any log when a given method does not exist (i.e. detect typo)?

chrisnas · 2024-10-14T16:26:46Z

tracer/src/Datadog.Tracer.Native/cor_profiler.cpp

@@ -944,7 +951,8 @@ HRESULT CorProfiler::TryRejitModule(ModuleID module_id, std::vector<ModuleID>& m
 auto profiler_library_path = shared::GetEnvironmentValue(WStr("DD_INTERNAL_PROFILING_NATIVE_ENGINE_PATH"));
 if (!profiler_library_path.empty() && fs::exists(profiler_library_path))
 {
- RewritingPInvokeMaps(module_metadata, WStr("continuous profiler"), profiler_nativemethods_type, profiler_library_path);
+ RewritingPInvokeMaps(module_metadata, WStr("continuous profiler"), profiler_nativemethods_type_in_profiler_native_module, profiler_library_path);
+ RewritingPInvokeMaps(module_metadata, WStr("continuous profiler"), profiler_nativemethods_type_in_tracer_native_module);


Is it expected that "continuous profiler" is used here for tracer native module?

GreenMatan requested review from a team as code owners October 10, 2024 08:24

github-actions bot added the area:profiler Issues related to the continous-profiler label Oct 10, 2024

GreenMatan force-pushed the matang/profiler-avoid-sampling-user-pinvokes branch 4 times, most recently from abd9a8d to 9b6eb3e Compare October 14, 2024 07:47

gleocadie and others added 4 commits October 14, 2024 17:25

Add support for profiler being noticed for unsafe unwinding + tests

8e02691

Added skipped methods functionality where methods defined in a config…

afb3742

…uration won't be sampled by the profiler, utilizing tracer instrumentation

Adjusted skipped methods test accordingly

a72d79c

fine-tuned ThreadMetaInfo naming + the profiler skipped calltarget in…

578cdb2

…tegration

gleocadie force-pushed the matang/profiler-avoid-sampling-user-pinvokes branch from 9b6eb3e to 61c7c9c Compare October 14, 2024 15:26

Adjust test + code change

0710771

gleocadie force-pushed the matang/profiler-avoid-sampling-user-pinvokes branch from 61c7c9c to 0710771 Compare October 14, 2024 15:28

chrisnas approved these changes Oct 14, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Profiler] Introduced `DD_PROFILER_SKIPPED_METHODS` configuration to exclude specific methods from sampling #6140

[Profiler] Introduced `DD_PROFILER_SKIPPED_METHODS` configuration to exclude specific methods from sampling #6140

GreenMatan commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading

datadog-ddstaging bot commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading

Raw results

Raw results

Raw results

Raw results

Raw results

Slower ⚠️ in #6140

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Faster 🎉 in #6140

Raw results

chrisnas left a comment

chrisnas Oct 14, 2024

chrisnas Oct 14, 2024

chrisnas Oct 14, 2024

chrisnas Oct 14, 2024

chrisnas Oct 14, 2024

[Profiler] Introduced DD_PROFILER_SKIPPED_METHODS configuration to exclude specific methods from sampling #6140

Are you sure you want to change the base?

[Profiler] Introduced DD_PROFILER_SKIPPED_METHODS configuration to exclude specific methods from sampling #6140

Conversation

GreenMatan commented Oct 10, 2024 • edited Loading

Summary of changes

Configuration

Reason for change

Implementation details

Test coverage

andrewlock commented Oct 10, 2024 • edited Loading

Execution-Time Benchmarks Report ⏱️

datadog-ddstaging bot commented Oct 10, 2024 • edited Loading

Datadog Report

❌ Failed Tests (60)

⌛ Performance Regressions vs Default Branch (12)

andrewlock commented Oct 10, 2024 • edited Loading

Throughput/Crank Report ⚡

andrewlock commented Oct 10, 2024 • edited Loading

Benchmarks Report for tracer 🐌

Benchmark details

Raw results

Raw results

Raw results

Raw results

Raw results

Slower ⚠️ in #6140

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Faster 🎉 in #6140

Raw results

chrisnas left a comment

Choose a reason for hiding this comment

chrisnas Oct 14, 2024

Choose a reason for hiding this comment

chrisnas Oct 14, 2024

Choose a reason for hiding this comment

chrisnas Oct 14, 2024

Choose a reason for hiding this comment

chrisnas Oct 14, 2024

Choose a reason for hiding this comment

chrisnas Oct 14, 2024

Choose a reason for hiding this comment

[Profiler] Introduced `DD_PROFILER_SKIPPED_METHODS` configuration to exclude specific methods from sampling #6140

[Profiler] Introduced `DD_PROFILER_SKIPPED_METHODS` configuration to exclude specific methods from sampling #6140

GreenMatan commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading

datadog-ddstaging bot commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading

andrewlock commented Oct 10, 2024 •

edited

Loading