Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[system/process] Fix windows protected process data collection #191

Merged
merged 9 commits into from
Nov 20, 2024

Conversation

VihasMakwana
Copy link
Contributor

@VihasMakwana VihasMakwana commented Nov 12, 2024

This PR merges work done on windows-protected-processes into main.

Fixes elastic/beats#40484

- Enhancement

We can ignore the error in two cases:
- While reading the process executable name. 
- For pid 4, this call fails as we can't access the executable name via
the system call. Same for other kernel-level processes.
- While finding the owner for a particular process.
- We try to open the process token via `syscall.OpenProcessToken`and we
can't access the token for protected processes , even as an
administrator.
    
it's okay to ignore these errors and move forward as we can access few
other metrics (memory, cpu).
    
More context
[here](elastic/beats#40484 (comment))

Relates elastic/beats#40484
This PR does the following:
- It integrates `getIdleMemory` and `getProcessMemory` introduced in
#181. This
is used to capture stats for PID 0.
- `FillMetricsRequiringMoreAccess` adds some extra information about a
process. It's failure is expected for protected processes and if a
non-root user accesses elevated processes. We can just log the error
rather than returning an error. This will not pollute the status message
on fleet.
     - @cmacknz Please let me know how you feel about this.

This is final PR in this series and once this PR gets merged, we can
merge `windows-protected-processes` into `main`.
@VihasMakwana VihasMakwana added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Nov 12, 2024
@VihasMakwana VihasMakwana self-assigned this Nov 12, 2024
@VihasMakwana VihasMakwana changed the title {DRAFT} Windows protected processes [system/process] Fix windows protected processes Nov 14, 2024
@VihasMakwana VihasMakwana marked this pull request as ready for review November 14, 2024 12:14
@VihasMakwana VihasMakwana requested a review from a team as a code owner November 14, 2024 12:14
@VihasMakwana VihasMakwana requested review from AndersonQ, rdner and leehinman and removed request for a team November 14, 2024 12:14
@VihasMakwana VihasMakwana changed the title [system/process] Fix windows protected processes [system/process] Fix windows protected process data collection Nov 14, 2024
Copy link
Contributor

@leehinman leehinman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with request to document the sleep call.

return state, err
}
// sleep is crucial to calculate the idle percentage
time.Sleep(50 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add to the docstring for FillPidMetrics, that it will take 50msec to complete because of this. I just want to be sure that the developer using FillPidMetrics understands this call has a "long" runtime overhead and isn't just reading counters.

Copy link
Contributor Author

@VihasMakwana VihasMakwana Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leehinman I'll remove this part.

We don't show percentage for a process (for windows). I don't see any reason to include percentage for idle process.
In future, we can make use of performance counters for it.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me

Copy link
Member

@AndersonQ AndersonQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one blocker comment I added, the others is up to you to address or not

metric/system/process/process_windows.go Outdated Show resolved Hide resolved
metric/system/process/process_windows.go Outdated Show resolved Hide resolved
state.Memory.Rss.Bytes = opt.UintWith(processInfo.WorkingSetSize)
state.Memory.Size = opt.UintWith(processInfo.PrivatePageCount)
state.NumThreads = opt.IntWith(int(processInfo.NumberOfThreads))
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker]
break or set offset+continue?
Isn't it possible to have other process in the list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But we're only interested in one with pid=0. So, no need to check other processes once we come across the pid 0 process

@VihasMakwana
Copy link
Contributor Author

/test

@VihasMakwana VihasMakwana merged commit 43ae8e1 into main Nov 20, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[metricbeat/system][windows] - Metricbeat reports DEGRADED while running in privileged mode
3 participants