Add support for profiling on Windows #141
Merged
Including:

- add stacks that don't start in user space
- properly register newly started threads
- use ProcessId from the user data instead of the event on thread start events
This allows us to avoid rebuilding it for every event.
They're currently broken
This now gets further when converting an x86_64 etl file which I recorded with the following:

```
xperf -start "NT Kernel Logger" -SetProfInt 2500 -on latency -stackwalk profile+cswitch -start "usersession" -on Microsoft-JScript:0x3+d2d578d9-2936-45b6-a09f-30e32715f42d:0x8000000000010000+c923f508-96e4-5515-e32c-7539d1b10504:0x6
xperf -stop "NT Kernel Logger" -stop "usersession" -d prof.etl
```

In detail:

- Move the things that need admin privileges to a place where they don't run.
- Remove the tokio runtime and the call to get_library_info_for_binary
- Stub out the kernel library stuff
- Put a default for the kernel start address
- Set arch to x86_64 when running on x86_64, so that user and kernel stacks are merged correctly
- Remove an unwrap which was panicking on the file I was testing with
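For anyone scripting the recording step, the two `xperf` invocations quoted in the commit message above could be assembled from Rust roughly as follows. This is only a sketch: the function names (`xperf_start_args`, `xperf_stop_args`, `xperf_command`) are made up for illustration and are not taken from this PR, the flag values are copied from the commit message rather than verified independently, and the provider list is passed through as an opaque string.

```rust
use std::process::Command;

/// Arguments for starting the kernel logger and one user-mode session.
/// The flags mirror the command quoted in the commit message; the
/// `user_providers` string is forwarded verbatim after `-on`.
fn xperf_start_args(user_session: &str, user_providers: &str) -> Vec<String> {
    [
        "-start", "NT Kernel Logger",
        "-SetProfInt", "2500",
        "-on", "latency",
        "-stackwalk", "profile+cswitch",
        "-start", user_session,
        "-on", user_providers,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Arguments for stopping both sessions and merging them into `output`.
fn xperf_stop_args(user_session: &str, output: &str) -> Vec<String> {
    ["-stop", "NT Kernel Logger", "-stop", user_session, "-d", output]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Builds (but does not run) an `xperf` command with the given arguments.
/// Assumes `xperf` is on PATH; running it needs an elevated prompt.
fn xperf_command(args: &[String]) -> Command {
    let mut cmd = Command::new("xperf");
    cmd.args(args);
    cmd
}
```

Calling `xperf_command(&xperf_stop_args("usersession", "prof.etl")).status()` would then run the stop step. Keeping argument construction separate from execution makes the command lines easy to log or check without needing administrator rights.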
mstange
reviewed
Apr 29, 2024
mstange
reviewed
Apr 29, 2024
This PR adds (initial) support for profiling on Windows. It uses the Event Tracing for Windows framework to get very fast, low-overhead captures.
In order to expedite getting something working, this PR calls out to the `xperf.exe` tool to start/stop tracking and save output to a file. It then parses events from that file, adding them to a profile. This is probably good enough for 95% of use cases, but in the future it would be nice to use ETW directly to create an in-memory trace, handling events as they come in. ETW is just such an annoying API to work with that I didn't want to tackle that; I use the `ferrisetw` crate for event parsing, and it can probably be used for kicking off a kernel trace, but the process for enabling kernel providers is convoluted.

There are a few things that I need to fix:

- `xperf`, and to query the list & global address of kernel drivers. Right now, I require that `samply` be run as administrator. But a better solution would be to spawn an elevated child process that can do all the elevated things, while the main process takes care of launching the process to be profiled (right now it's launched as Administrator, not great!)
- `samply start` and wait for input to stop profiling. (Or `samply start --time 10s`.)
- `PROFILE` provider just doesn't work; errors out when you try to start. And I'm on an Apple Silicon Mac using Parallels, with no x86_64 (or non-VM ARM64) for another week.

But it works, and works quite well!
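The "wait for input to stop profiling, or stop after a fixed time" behavior mentioned in the list above could be sketched along these lines. Everything here is hypothetical and not part of this PR: `StopReason`, `wait_for_stop`, and `spawn_stdin_watcher` are illustrative names, and the sketch assumes that a single line on stdin is the stop signal.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why recording ended (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum StopReason {
    UserInput,
    TimeElapsed,
    InputClosed,
}

/// Blocks until a stop signal arrives on `stop_rx`, or until `limit`
/// has elapsed when a limit is given (the `--time 10s` style case).
fn wait_for_stop(stop_rx: &Receiver<()>, limit: Option<Duration>) -> StopReason {
    match limit {
        Some(duration) => match stop_rx.recv_timeout(duration) {
            Ok(()) => StopReason::UserInput,
            Err(RecvTimeoutError::Timeout) => StopReason::TimeElapsed,
            Err(RecvTimeoutError::Disconnected) => StopReason::InputClosed,
        },
        None => match stop_rx.recv() {
            Ok(()) => StopReason::UserInput,
            Err(_) => StopReason::InputClosed,
        },
    }
}

/// Spawns a thread that sends one signal after a line is read from stdin.
/// If stdin is closed or unreadable, the sender is dropped without sending.
fn spawn_stdin_watcher() -> Receiver<()> {
    let (tx, rx) = channel();
    std::thread::spawn(move || {
        let mut line = String::new();
        let got_line = std::io::stdin()
            .read_line(&mut line)
            .map(|bytes| bytes > 0)
            .unwrap_or(false);
        if got_line {
            let _ = tx.send(());
        }
    });
    rx
}
```

A caller would start the trace, then run something like `wait_for_stop(&spawn_stdin_watcher(), Some(Duration::from_secs(10)))` and stop the trace once it returns. Using a channel keeps the timeout and the keypress on the same code path, so an optional time limit is a single argument rather than a separate mode.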
Future work: