documentation of asm contents? #65
Hi @iximeow, thanks for your detailed and thoughtful questions. I'm still looking at #64, but I'd like to take a swing at addressing your questions here.
I'm not entirely sure of the history here, but
That's all essentially correct, but probably mysterious without more knowledge about how DTrace works. We can (and probably should) document this more clearly in the code itself. The assembler directives you mention are in fact designed to store some metadata about the probes themselves in the final binary. If you're curious, you can use

Now, as you point out, the

In particular, the

I put "code" in quotes, because you're also right, that's just a no-op now. But DTrace also rewrites that
That is a macOS-specific assembler directive. You can look at this manual for details, but basically it just creates an undefined symbol in the resulting symbol table with the specified name. This is macOS-specific, since Apple has built support for DTrace directly into their platform linker and loader. That's also why the
Not quite. The only ARM platform we currently support is macOS.
This is similar to how the implementation on macOS works. Take a look at the implementation of that flavor, especially the block comment at the top. On macOS, the linker creates a probe "function" with goofy names like

There are a few places like this, where we probably could use a Rust function call, but don't. The overarching reason is that we want to be completely, exactly sure of what the emitted instructions are. An example is the probe macros themselves. Those could have been functions. However, that may result in a function call, or may not, if the compiler chooses to inline it. But one tenet of DTrace is that there should be absolutely minimal impact on a system when the probes are not enabled. We felt it might interfere with this aim to rely on the compiler to choose to inline a function, while macros are guaranteed to get us here.

In a similar vein, DTrace is clearly relying on some shenanigans, let's say, in the actual assembly. It rewrites instructions dynamically to achieve its ends. We needed to be sure that the instructions are exactly what DTrace expects (or the implementation on the supported platforms, anyway), so that it can be sure to operate correctly.
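The macro-versus-function point can be illustrated with a small sketch. It is not the crate's probe macro: `site_of_probe!` and the caller functions are hypothetical names, and the source line number stands in for the address of a probe site. It only shows that a macro body is expanded separately inside each caller, whereas a function body exists once unless the compiler decides to inline it.

```rust
// Hypothetical stand-in for a probe macro: the body is expanded in place,
// so `line!()` reports the line of each individual invocation.
macro_rules! site_of_probe {
    () => {
        (module_path!(), line!())
    };
}

fn first_caller() -> (&'static str, u32) {
    site_of_probe!()
}

fn second_caller() -> (&'static str, u32) {
    site_of_probe!()
}

// A plain function has a single body, so it reports the same location
// regardless of which caller reached it.
fn site_of_function() -> (&'static str, u32) {
    (module_path!(), line!())
}

fn third_caller() -> (&'static str, u32) {
    site_of_function()
}

fn fourth_caller() -> (&'static str, u32) {
    site_of_function()
}
```

Here `first_caller()` and `second_caller()` report different sites, while `third_caller()` and `fourth_caller()` report the same one.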
You may be correct that

Hope that helps illuminate why we've made the choices we did, but please let me know if other questions remain. Thanks!
I'm thinking more about the flags question, and I actually think the
In addition, DTrace uses the location of a probe as part of its name. If the probe were not inlined then it would be named by the probe's name rather than by the name of the containing function.
... in particular, we can't have the compiler decide to omit some code because it appears to be unreachable! This is one of the few times I feel confident saying that we know better!
Agreed. As @bnaecker noted, the comment "no calling conventions i know of ensure that status registers are preserved" misses the important--but subtle and under-documented--point that no functions are called as a result of that call instruction! The meta point about safety-related comments is well taken, however.
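To make the flags question concrete, here is a minimal sketch of how the `preserves_flags` option interacts with the two ways of zeroing a register mentioned in the issue. The function names are hypothetical and this is not the crate's code. It assumes x86_64 and falls back to plain Rust elsewhere. On x86, `mov` does not modify the status flags, while `xor` does, so the option is only an accurate promise in the first case.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Zero a register with `mov`. `mov` leaves the status flags untouched,
/// so promising `preserves_flags` to the compiler is accurate here.
#[cfg(target_arch = "x86_64")]
fn zero_with_mov() -> u64 {
    let value: u64;
    unsafe {
        asm!("mov {0}, 0", out(reg) value, options(nomem, nostack, preserves_flags));
    }
    value
}

/// Zero a register with `xor`. `xor` clears CF and OF and sets ZF, SF and PF
/// from the result, so `preserves_flags` is deliberately left out.
#[cfg(target_arch = "x86_64")]
fn zero_with_xor() -> u64 {
    let value: u64;
    unsafe {
        asm!("xor {0}, {0}", out(reg) value, options(nomem, nostack));
    }
    value
}

// Fallbacks so the sketch still builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn zero_with_mov() -> u64 {
    0
}

#[cfg(not(target_arch = "x86_64"))]
fn zero_with_xor() -> u64 {
    0
}
```

Both functions return zero; the only difference is what the compiler is allowed to assume about the flags afterwards.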
this is great! i have a much better understanding of how these pieces fit together now, and i think that some of this would be super informative to keep as documentation alongside the code. thank you both.
i definitely would have never thought to look there :) some spelunking shows that this was added as part of some
oh! this was the missing piece. so then you want to be absolutely certain that the compiled probe would have assembly of the shape

    xor rax, rax    ; critical to be present at the `is_enabled` address, otherwise
                    ; enablement breaks. so, `asm!()` it is!
    test rax, rax   ; <- both from rustc, but `rax` is opaque so we always get a
    jz not_enabled  ; <- `test` instruction (and not assumptions `rax` is 0)
    is_enabled:
    ; do the probe stuff
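In Rust terms, that shape might be sketched roughly as follows. This is an illustration under assumptions, not the crate's implementation: `is_enabled_sketch` and `fire_if_enabled` are hypothetical names, the probe-record directives are omitted entirely, and nothing here is rewritten by DTrace. The point is only that the compiler cannot see through the `asm!` block, so it has to emit a real test and branch on the value even though the instruction always produces zero as written.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical is-enabled check. The value comes out of an opaque `asm!`
/// block, so the compiler cannot constant-fold it to `false`.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn is_enabled_sketch() -> bool {
    let flag: u64;
    unsafe {
        // `xor` modifies the status flags, so `preserves_flags` is not claimed.
        asm!("xor rax, rax", out("rax") flag, options(nomem, nostack));
    }
    flag != 0
}

// Fallback so the sketch still builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
#[inline(always)]
fn is_enabled_sketch() -> bool {
    false
}

/// Hypothetical probe site: the "probe stuff" only runs when the check
/// reports that the probe is enabled.
fn fire_if_enabled(counter: &mut u32) {
    if is_enabled_sketch() {
        *counter += 1;
    }
}
```

Without a tracer patching the instruction, the check stays false and the guarded work never runs.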
and this was the other piece i hadn't quite understood; taken together this also explains why in rust terms the probe machinery must be implemented as a macro - probe sites must be unique addresses in the binary otherwise. and then in this way some
ah! so the
i'm not, but i'm piecing it together and see how those line up 😀
i agree! i actually think it would be clearer to make the

bonus thoughts:

it seems that
this means the probe record emission has to be in-line with the probe because of the shared labels but makes the size/address calculation not ambiguous with.. hex constants 😅 (reading
Hi @iximeow! Thanks for your patience on this. I'm coming back to this repo for some maintenance, so I apologize for the hiatus. I'm glad I was able to answer some of your questions about how this crate works. It's true that a lot of it relies on knowing the implementation details of DTrace itself. We happen to have just such a person in @ahl! I've tried to cover the outstanding issues below, but let me know if you have other questions.

I believe you're correct about the

As for

Like I said, it's unclear to me if this "counts" as accessing memory. We're certainly never doing something like

The last thing is the labels.

That said, the before/after syntax should still be OK with better names. I believe I tried initially to use more human-friendly names, like

Hopefully that addresses your outstanding points. I'd be happy to review a PR fixing the ASM block flags, and I'm also still mulling over your PR removing the
no worries! i've recently come back to some issues i've been mulling over elsewhere, too.
i think the rub is that rustc (more likely, LLVM) is attentively considering the attestation as well. so while this probably wouldn't happen today, consider a function like...

    fn probe_example() {
        let mut work: u8 = 0;
        let mut sum = 0;
        while let Some(item) = do_work(&mut sum) {
            work += 1;
        }
        module::report_work!(|| &work); // you _could_ write `work` without a ref, of course. somewhat contrived example..
        cleanup();
        sum
    }

if
it's a bit contrived - i think it requires the local be a type with no
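The concern about memory attestations can be sketched in isolation. The example below is hypothetical and is not how the crate passes probe arguments: `read_through_pointer` is an invented name, and the assembly simply loads one byte through a pointer. It assumes x86_64 and falls back to plain Rust elsewhere. Because the block dereferences its pointer argument, it claims `readonly` (reads memory, never writes it) rather than `nomem`, which would tell the compiler that earlier stores to `work` need not be visible to the block.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical example: load the byte behind `work` from inside `asm!`.
/// `readonly` is the strongest memory claim that is still true here.
#[cfg(target_arch = "x86_64")]
fn read_through_pointer(work: &u8) -> u64 {
    let loaded: u64;
    unsafe {
        asm!(
            "movzx {out}, byte ptr [{ptr}]",
            ptr = in(reg) work as *const u8,
            out = out(reg) loaded,
            // `movzx` does not modify the status flags.
            options(readonly, nostack, preserves_flags),
        );
    }
    loaded
}

// Fallback so the sketch still builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn read_through_pointer(work: &u8) -> u64 {
    u64::from(*work)
}
```

With `readonly`, a local that was incremented before the call is guaranteed to be stored before the block runs, which is the behavior the `probe_example` function above depends on.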
neat! that's an interesting way to acknowledge duplicate labels in an assembler..
sure does! i'll wait to PR about the asm flags until @ahl has a chance to weigh in too. and for the
hi! related to #64, i couldn't tell what the intent of a few asm blocks were to figure out if there might be other ways of getting the same results.. i think that everything i'm seeing might just be assembler directives for some specific target, but here goes:

- in `no-linker.rs` there's a `clr rax` but `clr` is not an x86 instruction. is this expected to be assembled by an assembler that translates `clr` to "clear", either by `xor rax, rax` or `mov rax, 0`?
- `is_enabled_rec` and `probe_rec` are a series of assembler directives and not specific instructions themselves. i think the consequence here is that `is_enabled` will always be `0` (due to `clr rax`) and the `probe_rec` section would never execute. if it were to execute, i think there would be no instructions there but the `nop`, with the directives just setting up bytes describing a probe record elsewhere in the binary?
- in `linker.rs`, there's a `.reference` directive but that's not a directive i can recall seeing anywhere. is there a specific assembler that does recognize this directive, where a reader (me, hi) could go to learn more about what this should cause the assembler to emit?
- since `call_instruction` in `linker.rs` references aarch64, i think this also implies that the `no-linker` paths could reasonably be tried on aarch64 which would then (no `clr`, but also no `rax` 😁)

most of these questions are the result of trying to understand why an explicit `call`/`bl` are necessary here - since we can an expected set of arguments for a probe, i'd expect to be able to transmute to an appropriate-arity `extern "C"` function that is then called rust-style. that leaves the calling convention and specific instruction in the hands of the compiler, and would let you delete the isa-specific bits i think!

and finally, i see a lot of `preserves_flags` in `asm` blocks. i would strongly recommend some `// Safety:`-style comments about why those options are correct - if `clr rax` above is interpreted as `xor rax, rax`, for example, the option is just incorrect and could result in bad flag-clobbering. i think the same is possible in the calling-a-probe case, since no calling conventions i know of ensure that status registers are preserved.. the likelihood of this being a real problem, though, seems low: i haven't often seen llvm keep a boolean in a flags register especially across an interesting number of operations.
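For readers following along, the function-pointer alternative floated in the issue might look roughly like the sketch below. Everything in it is hypothetical: `hypothetical_probe` stands in for whatever address a probe call would target, and `Probe2` is an assumed two-argument signature. It shows the transmute-then-call pattern only, which leaves the choice of call instruction and calling-convention details to the compiler.

```rust
use std::mem;

/// Hypothetical two-argument probe target, standing in for a real address.
extern "C" fn hypothetical_probe(arg0: u64, arg1: u64) -> u64 {
    arg0.wrapping_add(arg1)
}

/// Assumed signature for a probe taking two integer arguments.
type Probe2 = extern "C" fn(u64, u64) -> u64;

/// Reinterpret a raw address as a `Probe2` and call it "rust-style".
///
/// # Safety
/// `address` must point to a function with exactly the `Probe2` signature.
unsafe fn call_via_transmute(address: *const (), arg0: u64, arg1: u64) -> u64 {
    // The compiler, not hand-written assembly, picks the call instruction.
    let probe: Probe2 = unsafe { mem::transmute::<*const (), Probe2>(address) };
    probe(arg0, arg1)
}
```

A call such as `unsafe { call_via_transmute(hypothetical_probe as *const (), 2, 3) }` returns 5. The trade-off is that the emitted instructions are then up to the compiler.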