profiler: add enable flag to control profiler activation #2840
Add a new 'enable' field to the profiler config, controlled by the DD_PROFILING_ENABLED environment variable. This allows users to disable profiling even when the Start() function is called. The enable flag defaults to true, maintaining backwards compatibility. When set to false, the profiler will not start, providing a simple way to toggle profiling without code changes. Update tests to cover the new functionality and add logging for the new configuration option.
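The gating described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the `profiler` type, the `start` function, the `active` variable, and the injected `lookup` parameter are hypothetical stand-ins, and falling back to `true` on an unparsable value is an assumption about how the real boolean helper behaves.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"sync"
)

// profiler is a hypothetical stand-in for the real profiler type.
type profiler struct{}

var (
	mu     sync.Mutex
	active *profiler // stays nil while profiling is disabled
)

// enabledFromEnv reads DD_PROFILING_ENABLED through the supplied lookup
// function. Unset or unparsable values fall back to true (assumed default).
func enabledFromEnv(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("DD_PROFILING_ENABLED")
	if !ok {
		return true
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true
	}
	return enabled
}

// start creates the active profiler only when the flag allows it and
// reports whether a profiler was started.
func start(lookup func(string) (string, bool)) bool {
	mu.Lock()
	defer mu.Unlock()
	if !enabledFromEnv(lookup) {
		return false
	}
	active = &profiler{}
	return true
}

func main() {
	// With the real environment, the result depends on DD_PROFILING_ENABLED.
	fmt.Println("profiler started:", start(os.LookupEnv))
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, is only a convenience for exercising the sketch without mutating the process environment.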
LGTM. @nsrip-dd could you also take a look please? Kicking off a CI run now.
@@ -146,6 +147,7 @@ func logStartup(c *config) {
 		"execution_trace_size_limit": c.traceConfig.Limit,
 		"endpoint_count_enabled":     c.endpointCountEnabled,
 		"custom_profiler_label_keys": c.customProfilerLabels,
+		"enable":                     c.enable,
The same change needs to be added to profiler/telemetry.go.
Actually, let's just remove this. If enable is false we won't send anything at all.
Not sure I follow. Having enable in the debug log is useful for debugging, e.g. when a customer reports that profiling isn't working.
And having it in telemetry is useful for us to understand how our users use this flag?
I also think keeping this information in the debug log would make it easier to handle future reports about the profiler.
I'm fine with either option, so please feel free to share your thoughts and I'll implement them accordingly!
> Having enable in the debug log is useful for debugging, e.g. when a customer reports that profiling isn't working.
Agreed. I added this comment too hastily and didn't notice that the startup log happens before the check for DD_PROFILING_ENABLED. Let's keep this after all.
> And having it in telemetry is useful for us to understand how our users use this flag?
As this PR stands right now, the telemetry client won't start if DD_PROFILING_ENABLED=false. My gut feeling is that we shouldn't start telemetry if we don't start the profiler. IMO this is lower priority than making sure DD_PROFILING_ENABLED=false works, and we can address telemetry in a followup. WDYT @felixge?
SGTM.
I've also added it to profiler/telemetry.go. 3aac285 (#2840)
		// So we should not have an activeProfiler
		assert.Nil(t, activeProfiler)
		mu.Unlock()
	})
Thanks for adding this and the other test case.
NIT: Some of the existing test suite has a lot of tests like this that assert on the internal state of things. However, in general we prefer tests that verify user-visible behavior these days. I.e. having a test that checks that a disabled profiler doesn't send data would be nice. In practice this might be a difficult test to write in a non-flaky manner, so I'm okay with keeping the testing as proposed in this PR.
If I understand correctly, should we keep the test that checks activeProfiler, and additionally create a test that verifies no profiling data is sent when DD_PROFILING_ENABLED=false?
As you mentioned, since I'm not very familiar with the codebase, implementing this might be a bit challenging, but I'll give it a try. Thank you for the suggestion.
Here's a test for DD_PROFILING_ENABLED=false I sketched up while reviewing this:
func TestEnabledFalse(t *testing.T) {
	t.Setenv("DD_PROFILING_ENABLED", "false")
	ch := startTestProfiler(t, 1, WithPeriod(10*time.Millisecond), WithProfileTypes())
	select {
	case <-ch:
		t.Fatal("received profile when profiler should have been disabled")
	case <-time.After(time.Second):
		// This test might succeed incorrectly on an overloaded
		// CI server, but is very likely to fail locally given a
		// buggy implementation
	}
}
Feel free to add it to the PR if it makes sense to you. We can keep the other tests for now.
I've implemented additional changes in f71e851 (#2840).
However, the tests you've already suggested seem sufficient, so I couldn't find any points to modify or add. If there are any areas that need improvement, please feel free to let me know!
profiler/options.go
Outdated
@@ -208,6 +210,7 @@ func defaultConfig() (*config, error) {
 	} else {
 		c.agentURL = url.String() + "/profiling/v1/input"
 	}
+	c.enable = internal.BoolEnv("DD_PROFILING_ENABLED", true)
We're going to support DD_PROFILING_ENABLED=auto set via the Datadog admission controller. Right now this will work with the value auto, but will log a warning saying it's an invalid boolean. Let's perhaps check for auto explicitly and then check the boolean? Something like:
if os.Getenv("DD_PROFILING_ENABLED") == "auto" {
	c.enable = true
} else {
	c.enable = internal.BoolEnv("DD_PROFILING_ENABLED", true)
}
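To make the three cases (auto, a valid boolean, anything else) easier to see, here is a standalone sketch of the same decision as a pure function. The name `parseProfilingEnabled` and the second return value are hypothetical; the sketch uses strconv.ParseBool in place of internal.BoolEnv and assumes an invalid value falls back to the default of true, which is where the real helper would log its warning.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseProfilingEnabled interprets a raw DD_PROFILING_ENABLED value.
// enabled is the effective setting; valid is false when the value is
// neither empty, "auto", nor a boolean that strconv.ParseBool accepts.
func parseProfilingEnabled(raw string) (enabled, valid bool) {
	// An empty value keeps the default, and "auto" is treated as enabled
	// without going through boolean parsing.
	if raw == "" || raw == "auto" {
		return true, true
	}
	b, err := strconv.ParseBool(raw)
	if err != nil {
		// Assumed fallback: keep the default and flag the value as invalid.
		return true, false
	}
	return b, true
}

func main() {
	enabled, valid := parseProfilingEnabled(os.Getenv("DD_PROFILING_ENABLED"))
	fmt.Printf("enabled=%v valid=%v\n", enabled, valid)
}
```

Checking for auto before the boolean parse is what avoids the spurious "invalid boolean" warning described above.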
Applied suggested changes here: ab2acd9 (#2840)
Updated profiler options to automatically enable profiling if the environment variable "DD_PROFILING_ENABLED" is set to "auto". This change delegates the decision to the Datadog admission controller when "auto" is specified.
Implemented a new test to verify that no profiles are received when the profiler is disabled. This helps ensure the profiler respects the DD_PROFILING_ENABLED environment variable.
What does this PR do?
This PR introduces a new environment variable DD_PROFILING_ENABLED to control the profiler's behavior in a way similar to DD_TRACE_ENABLED. By default, DD_PROFILING_ENABLED is set to true, meaning profiling will be enabled if profiler.Start() is called in the application code. If DD_PROFILING_ENABLED is set to false, profiling will be disabled even if profiler.Start() is called. This allows the application code to always call profiler.Start() while dynamically adjusting profiling through the environment variable.
Motivation
Fixes #2834
The motivation for this PR is to simplify the control of profiling behavior across multiple applications. By introducing DD_PROFILING_ENABLED, developers can avoid the cumbersome task of managing environment variables within the application code and instead control profiling through a single environment variable.
Additional Information
This PR includes the following changes:
- The DD_PROFILING_ENABLED environment variable.
- The Start function changed to skip profiling if the enable field in the profiler config is false.
- Tests for DD_PROFILING_ENABLED in options_test.go and profiler_test.go.
Reviewer's Checklist