profiler: add enable flag to control profiler activation #2840

korECM · 2024-08-31T08:33:04Z

What does this PR do?

This PR introduces a new environment variable DD_PROFILING_ENABLED to control the profiler's behavior in a way similar to DD_TRACE_ENABLED. By default, DD_PROFILING_ENABLED is set to true, meaning profiling will be enabled if profiler.Start() is called in the application code. If DD_PROFILING_ENABLED is set to false, profiling will be disabled even if profiler.Start() is called. This allows the application code to always call profiler.Start() while dynamically adjusting profiling through the environment variable.

Motivation

Fixes #2834

The motivation for this PR is to simplify the control of profiling behavior across multiple applications. By introducing DD_PROFILING_ENABLED, developers can avoid the cumbersome task of managing environment variables within the application code and instead control profiling through a single environment variable.

Additional Information

This PR includes the following changes:

Addition of enable field in profiler config struct.
Update to defaultConfig function to read DD_PROFILING_ENABLED environment variable.
Conditional check in Start function to skip profiling if enable field in profiler config is false.
Unit tests for DD_PROFILING_ENABLED in options_test.go and profiler_test.go.

Reviewer's Checklist

Changed code has unit tests for its functionality at or near 100% coverage.
System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
There is a benchmark for any new code, or changes to existing code.
If this interacts with the agent in a new way, a system test has been added.
Add an appropriate team label so this PR gets put in the right place for the release notes.
Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

Add a new 'enable' field to the profiler config, controlled by the DD_PROFILING_ENABLED environment variable. This allows users to disable profiling even when the Start() function is called. The enable flag defaults to true, maintaining backwards compatibility. When set to false, the profiler will not start, providing a simple way to toggle profiling without code changes. Update tests to cover the new functionality and add logging for the new configuration option.

felixge

LGTM. @nsrip-dd could you also take a look please? Kicking off a CI run now.

felixge · 2024-09-02T08:05:12Z

profiler/options.go

@@ -146,6 +147,7 @@ func logStartup(c *config) {
 "execution_trace_size_limit": c.traceConfig.Limit,
 "endpoint_count_enabled": c.endpointCountEnabled,
 "custom_profiler_label_keys": c.customProfilerLabels,
+ "enable": c.enable,


The same change needs to be added to profiler/telemetry.go.

Actually, let's just remove this. If enable is false we won't send anything at all.

Not sure I follow. Having enable in the debug log is useful for debugging, e.g. when a customer reports that profiling isn't working.

And having it in telemetry is useful for us to understand how our users use this flag?

I also think keeping this information in the debug log would make it easier to handle future reports about the profiler.
I'm fine with either option, so please feel free to share your thoughts and I'll implement them accordingly!

> Having enable in the debug log is useful for debugging, e.g. when a customer reports that profiling isn't working.

Agreed. I added this comment too hastily and didn't notice that the startup log happens before the check for DD_PROFILING_ENABLED. Let's keep this after all.

> And having it in telemetry is useful for us to understand how our users use this flag?

As this PR stands right now, the telemetry client won't start if DD_PROFILING_ENABLED=false. My gut feeling is that we shouldn't start telemetry if we don't start the profiler. IMO this is lower priority than making sure DD_PROFILING_ENABLED=false works, and we can address telemetry in a followup. WDYT @felixge?

I've also added it to profiler/telemetry.go.
3aac285 (#2840)

felixge · 2024-09-02T08:08:46Z

profiler/profiler_test.go

+ // So we should not have an activeProfiler
+ assert.Nil(t, activeProfiler)
+ mu.Unlock()
+ })


Thanks for adding this and the other test case.

NIT: The existing test suite has a lot of tests like this that assert on internal state. However, these days we generally prefer tests that verify user-visible behavior. For example, a test that checks that a disabled profiler doesn't send data would be nice. In practice this might be difficult to write in a non-flaky manner, so I'm okay with keeping the testing as proposed in this PR.

If I understand correctly, we should keep the test that checks activeProfiler, and additionally create a test that verifies no profiling data is sent when DD_PROFILING_ENABLED=false?
Since I'm not very familiar with the codebase, implementing this might be a bit challenging, as you mentioned, but I'll give it a try. Thank you for the suggestion.

Here's a test for DD_PROFILING_ENABLED=false I sketched up while reviewing this:

func TestEnabledFalse(t *testing.T) {
    t.Setenv("DD_PROFILING_ENABLED", "false")
    ch := startTestProfiler(t, 1, WithPeriod(10*time.Millisecond), WithProfileTypes())
    select {
    case <-ch:
        t.Fatal("received profile when profiler should have been disabled")
    case <-time.After(time.Second):
        // This test might succeed incorrectly on an overloaded
        // CI server, but is very likely to fail locally given a
        // buggy implementation
    }
}

Feel free to add it to the PR if it makes sense to you. We can keep the other tests for now.

I've implemented additional changes in f71e851 (#2840).

However, the tests you've already suggested seem sufficient, so I couldn't find any points to modify or add. If there are any areas that need improvement, please feel free to let me know!

nsrip-dd · 2024-09-03T11:02:30Z

profiler/options.go

@@ -146,6 +147,7 @@ func logStartup(c *config) {
 "execution_trace_size_limit": c.traceConfig.Limit,
 "endpoint_count_enabled": c.endpointCountEnabled,
 "custom_profiler_label_keys": c.customProfilerLabels,
+ "enable": c.enable,


Actually, let's just remove this. If enable is false we won't send anything at all.

nsrip-dd · 2024-09-03T11:06:15Z

profiler/options.go

@@ -208,6 +210,7 @@ func defaultConfig() (*config, error) {
 } else {
 c.agentURL = url.String() + "/profiling/v1/input"
 }
+ c.enable = internal.BoolEnv("DD_PROFILING_ENABLED", true)


We're going to support DD_PROFILING_ENABLED=auto set via the Datadog admission controller. Right now this will work with the value auto, but will log a warning saying it's an invalid boolean. Let's perhaps check for auto explicitly and then check the boolean? Something like:

if os.Getenv("DD_PROFILING_ENABLED") == "auto" {
    c.enable = true
} else {
    c.enable = internal.BoolEnv("DD_PROFILING_ENABLED", true)
}

Applied suggested changes here
ab2acd9 (#2840)

Updated profiler options to automatically enable profiling if the environment variable "DD_PROFILING_ENABLED" is set to "auto". This change delegates the decision to the Datadog admission controller when "auto" is specified.
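
To make the resulting value handling easy to check at a glance, here is a rough stand-alone sketch. `parseProfilingEnabled` is a hypothetical helper, not the function in this PR. It assumes an empty string means "unset", and it reports invalid values through a second return value where the real code would log a warning.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseProfilingEnabled is a hypothetical helper that mirrors the logic
// discussed above. It returns the enable decision and whether the raw
// value was recognized.
func parseProfilingEnabled(raw string) (enabled, valid bool) {
	switch raw {
	case "":
		return true, true // assumed to mean unset: default to enabled
	case "auto":
		return true, true // handled explicitly, so no invalid-boolean warning
	}
	b, err := strconv.ParseBool(raw)
	if err != nil {
		return true, false // unrecognized: fall back to the default
	}
	return b, true
}

func main() {
	for _, raw := range []string{"", "auto", "true", "false", "0", "maybe"} {
		enabled, valid := parseProfilingEnabled(raw)
		fmt.Printf("%q -> enabled=%t valid=%t\n", raw, enabled, valid)
	}
}
```

Checking for `auto` before the boolean parse is what keeps that value from being reported as invalid.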

Implemented a new test to verify that no profiles are received when the profiler is disabled. This helps ensure the profiler respects the DD_PROFILING_ENABLED environment variable.

korECM · 2024-10-14T07:13:30Z

Hi @felixge @nsrip-dd

I hope you're doing well! I noticed that my PR has been pending review for about a month. If you have some time, I would greatly appreciate it if you could take a look.

Thank you!

korECM requested a review from a team as a code owner August 31, 2024 08:33

felixge approved these changes Sep 2, 2024

nsrip-dd reviewed Sep 3, 2024

nsrip-dd added the profiler label Sep 3, 2024

korECM added 3 commits September 10, 2024 20:42

Set profiling enablement to true if "auto" is detected

ab2acd9

Add enable flag to telemetry configuration

3aac285

Add test for profiler disabled condition

f71e851
