Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrow detection of activated state. #3533

Closed
wants to merge 1 commit into from

Conversation

mitchell-as
Copy link
Contributor

@mitchell-as mitchell-as commented Oct 8, 2024

BugDX-3086 `state shell` not working in VSCode/WSL, saying "already in a shell" with empty strings.

Do not rely on just config, which may be leftover from a previous activation.

Note: I do not have the means to reproduce this issue. My Windows VM cannot run WSL due to the lack of a hypervisor. However, after reading the code (

func IsActivated(cfg Configurable) bool {
return ActivationPID(cfg) != -1
}
and
func ActivationPID(cfg Configurable) int32 {
pid := int32(os.Getpid())
ppid := int32(os.Getppid())
procInfoErrMsgFmt := "Could not detect process information: %v"
seen := map[int32]bool{}
for pid != 0 && pid != ppid {
pidFileName := ActivationPIDFileName(cfg.ConfigPath(), int(pid))
if fileutils.FileExists(pidFileName) {
return pid
}
if ppid == 0 {
logging.Debug("Parent PID is 0")
return -1
}
if seen[ppid] {
logging.Debug("Parent process PID has already been seen")
return -1
}
pproc, err := process.NewProcess(ppid)
if err != nil {
if err != process.ErrorProcessNotRunning {
multilog.Log(logging.ErrorNoStacktrace, rollbar.Error)(procInfoErrMsgFmt, err)
}
return -1
}
name, err := pproc.Name()
if err != nil {
logging.Error(procInfoErrMsgFmt, err)
} else {
logging.Debug("Parent process name: %s", name)
}
seen[ppid] = true
pid = ppid
ppid, err = pproc.Ppid()
if err != nil {
multilog.Log(logging.ErrorNoStacktrace, rollbar.Error)(procInfoErrMsgFmt, err)
return -1
}
}
return -1
}
), I see that we're relying on a config check, followed by a process ID running check. I think it's possible for a previous activation to be erroneously detected as still running under the right circumstances. The fact that Antoine could not immediately reproduce this issue when filing the initial bug report lends credence to my suspicion.

We may want to consider simplifying this detection to just look for environment variables. I was not present when this detection was initially written, so I don't know exactly why we need it or how accurate it needs to be; it just looks very convoluted to me.

Do not rely on just config, which may be leftover from a previous activation.
@mitchell-as
Copy link
Contributor Author

I'm not saying that my approach is correct. I'm open to feedback, a different approach, this being rejected, etc. Do keep in mind I do not have access to WSL from my Windows VM though.

@mitchell-as
Copy link
Contributor Author

Test failure a timeout and unrelated to this PR.

Copy link
Member

@Naatan Naatan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like we're working around an issue. And it's adding the env var back to detecting activations, which historically has given us lots of issues. IIRC that's the reason we started identifying processes instead.

I'd strongly prefer we try to simplify this mechanic rather than add more complexity to it. If we have a repro case of the issue; can we figure out why the pidfile still exists, and address that?

Also we could probably enrich the config pidfile approach by still verifying the process name. I think if we own the process then we shouldn't see permission errors.

Please always tag me for review on tickets like this. The solution proposed here has lots of implications and goes beyond a bug fix, and was not prescribed in the original ticket. That doesn't mean I'm against it by default or anything, but I'd like to at the very least be aware of the change.

@MDrakos MDrakos removed their request for review October 11, 2024 17:16
@mitchell-as
Copy link
Contributor Author

Closing, as I do not have the ability to reproduce this or debug it further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants