documentation of `asm` contents? #65

iximeow · 2022-07-15T23:44:57Z

hi! related to #64, i couldn't tell what the intent of a few asm blocks were to figure out if there might be other ways of getting the same results.. i think that everything i'm seeing might just be assembler directives for some specific target, but here goes:

in no-linker.rs there's a clr rax but clr is not an x86 instruction. is this expected to be assembled by an assembler that translates clr to "clear", either by xor rax, rax or mov rax, 0?

is_enabled_rec and probe_rec are a series of assembler directives and not specific instructions themselves. i think the consequence here is that is_enabled will always be 0 (due to clr rax) and the probe_rec section would never execute. if it were to execute, i think there would be no instructions there but the nop, with the directives just setting up bytes describing a probe record elsewhere in the binary?

in linker.rs, there's a .reference directive but that's not a directive i can recall seeing anywhere. is there a specific assembler that does recognize this directive, where a reader (me, hi) could go to learn more about what this should cause the assembler to emit?

since call_instruction in linker.rs references aarch64, i think this also implies that the no-linker paths could reasonably be tried on aarch64 which would then (no clr, but also no rax 😁)

most of these questions are the result of trying to understand why an explicit call/bl are necessary here - since we can an expected set of arguments for a probe, i'd expect to be able to transmute to an appropriate-arity extern "C" function that is then called rust-style. that leaves the calling convention and specific instruction in the hands of the compiler, and would let you delete the isa-specific bits i think!

and finally, i see a lot of preserves_flags in asm blocks. i would strongly recommend some // Safety: -style comments about why those options are correct - if clr rax above is interpreted as xor rax, rax, for example, the option is just incorrect and could result in bad flag-clobbering. i think the same is possible in the calling-a-probe case, since no calling conventions i know of ensure that status registers are preserved.. the likelihood of this being a real problem, though, seems low: i haven't often seen llvm keep a boolean in a flags register especially across an interesting number of operations.

The text was updated successfully, but these errors were encountered:

bnaecker · 2022-07-18T20:30:47Z

Hi @iximeow, thanks your detailed and thoughtful questions. I'm still looking at #64, but I'd like to take a swing at addressing your questions here.

in no-linker.rs there's a clr rax but clr is not an x86 instruction. is this expected to be assembled by an assembler that translates clr to "clear", either by xor rax, rax or mov rax, 0?

I'm not entirely sure of the history here, but clr rax is currently implemented in LLVM as an alias to, you guessed it, xor rax, rax. It's an easy way to zero the target register.

is_enabled_rec and probe_rec are a series of assembler directives and not specific instructions themselves. i think the consequence here is that is_enabled will always be 0 (due to clr rax) and the probe_rec section would never execute. if it were to execute, i think there would be no instructions there but the nop, with the directives just setting up bytes describing a probe record elsewhere in the binary?

That's all essentially correct, but probably mysterious without more knowledge about how DTrace works. We can (and probably should) document this more clearly in the code itself.

The assembler directives you mention are in fact designed to store some metadata about the probes themselves in the final binary. If you're curious, you can use readelf or a similar tool, and look for the section called set_dtrace_probes in the object file. The dusty binary that's part of this workspace will also dump those records, in the deserialized format (not raw binary). Those records contain all the metadata that we need to describe the probes. When the program is loaded, we look for those records, deserialize them, and then "register" them with the DTrace kernel module.

Now, as you point out, the clr rax bit would seem to imply that the probe section would always be skipped, since is_enabled is always zero. However, while it is zero when the program text is loaded, it's not always zero while the program is run. When we register the probes with the DTrace kernel module, we're providing lots of information about the probes. An important one is the address of that is_enabled value. (Well, the offset to it, but same thing.) When you then make a DTrace command-line call like: dtrace -n ':::my-probe', DTrace finds that offset in the running binary, and rewrites it.

In particular, the xor %rax, %rax, is rewritten into an INT (0xCC). On that trap, the kernel ensures that rax has 1 in it, which guarantees that the rest of the "code" in the probe itself runs.

I put "code" in quotes, because you're also right, that's just a no-op now. But DTrace also rewrites that nop into another trap. But you'll notice that just before the nop, we call the probe argument closure. That generates the arguments the user provided, and we're then storing them in the right registers. When the trap runs, we're effectively calling back to the DTrace kernel module with the arguments constructed by the closure.

in linker.rs, there's a .reference directive but that's not a directive i can recall seeing anywhere. is there a specific assembler that does recognize this directive, where a reader (me, hi) could go to learn more about what this should cause the assembler to emit?

That is a macOS-specific assembler directive. You can look at this manual for details, but basically it just creates an undefined symbol in the resulting symbol table with the specified name. This is macOS-specific, since Apple has built support for DTrace directly into their platform linker and loader. That's also why the register_probes() function is a no-op on macOS -- the loader already does that for us. We're getting into the weeds here, but if you're familiar with calling dtrace -h and dtrace -G, the .reference directive essentially takes the place of that.

since call_instruction in linker.rs references aarch64, i think this also implies that the no-linker paths could reasonably be tried on aarch64 which would then (no clr, but also no rax 😁)

Not quite. The only ARM platform we currently support is macOS. no-linker.rs will currently only work on x86 systems, which is all that the other OS's sporting DTrace support anyway. The call_instruction function is there to emit the correct assembly opcode for x86 or ARM, on macOS.

i'd expect to be able to transmute to an appropriate-arity extern "C" function that is then called rust-style. that leaves the calling convention and specific instruction in the hands of the compiler, and would let you delete the isa-specific bits i think!

This is similar to how the implementation on macOS works. Take a look at the implementation of that flavor, especially the block comment at the top. On macOS, the linker creates a probe "function" with goofy names like __dtrace_probe$foo$bar$v1. (That's not a valid Rust function identifier, but it's fine in C. What we're doing here is making an extern-C function with the same link name.) Now that function isn't really a function. On macOS, the linker will rewrite the call into a nop. Now the assembly looks pretty much the same as the no-linker.rs implementation: generate some probe arguments followed by a nop. At runtime, just like on the other platforms, the macOS DTrace kernel module rewrites the nop to a trap when the probe is enabled. The effect is the same.

There are a few places like this, where we probably could use a Rust function call, but don't. The overarching reason is that we want to be completely, exactly sure of what the emitted instructions are. An example is the probe macros themselves. Those could have been functions. However, that may result in a function call, or may not, if the compiler chooses to inline it. But one tenet of DTrace is that there should be absolutely minimal impact on a system when the probes are not enabled. We felt it might interfere with this aim to rely on the compiler to choose to inline a function, while macros are guaranteed to get us here.

In a similar vein, DTrace is clearly relying on some shenanigans, let's say, in the actual assembly. It rewrites instructions dynamically to achieve its ends. We needed to be sure that the instructions are exactly what DTrace expects (or the implementation on the supported platforms, anyway), so that it can be sure to operate correctly.

and finally, i see a lot of preserves_flags in asm blocks. i would strongly recommend some // Safety: -style comments about why those options are correct - if clr rax above is interpreted as xor rax, rax, for example, the option is just incorrect and could result in bad flag-clobbering. i think the same is possible in the calling-a-probe case, since no calling conventions i know of ensure that status registers are preserved.. the likelihood of this being a real problem, though, seems low: i haven't often seen llvm keep a boolean in a flags register especially across an interesting number of operations

You may be correct that preserves_flags is wrong, but I don't know that's because of the definition of clr. Knowing that clr rax is actually xor %rax, %rax, we can still be sure that the flags register is not touched. That instruction will not modify any of the flags listed in the Rust documentation. Now, given that users have control of the probe closure, I guess it's possible for their code to modify those flags, for example if they do some floating point operations. (Floating-point types aren't supported in DTrace or this crate, though in theory the user could still be doing FP operations.)

Hope that helps illuminate why we've made the choices we did, but please let me know if other questions remain. Thanks!

bnaecker · 2022-07-18T20:42:20Z

I'm thinking more about the flags question, and I actually think the preserves_flags option is fine. The user's provided argument closure is called outside the asm! macro. Inside the macro, I don't see anything we're doing that would modify the flags registers, on either x86 or ARM.

ahl · 2022-07-20T19:53:09Z

There are a few places like this, where we probably could use a Rust function call, but don't. The overarching reason is that we want to be completely, exactly sure of what the emitted instructions are. An example is the probe macros themselves. Those could have been functions. However, that may result in a function call, or may not, if the compiler chooses to inline it. But one tenet of DTrace is that there should be absolutely minimal impact on a system when the probes are not enabled. We felt it might interfere with this aim to rely on the compiler to choose to inline a function, while macros are guaranteed to get us here.

In addition, DTrace uses the location of a probe as part of its name. If the probe were not inlined then it would be named by the probe's name rather than by the name of the containing function.

In a similar vein, DTrace is clearly relying on some shenanigans, let's say, in the actual assembly. It rewrites instructions dynamically to achieve its ends. We needed to be sure that the instructions are exactly what DTrace expects (or the implementation on the supported platforms, anyway), so that it can be sure to operate correctly.

... in particular, we can't have the compiler decided to omit some code because it appears to be unreachable! This is one of the few times I feel confident saying that we know better!

I'm thinking more about the flags question, and I actually think the preserves_flags option is fine.

Agreed. As @bnaecker noted, the comment "no calling conventions i know of ensure that status registers are preserved" misses the important--but subtle and under-documented--point that no functions are called as a result of that call instruction! The meta point about safety-related comments is well taken however.

iximeow · 2022-07-20T22:30:29Z

this is great! i have a much better understanding of how these pieces fit together now, and i think that some of this would be super informative to keep as documentation alongside the code. thank you both.

clr rax is currently implemented in LLVM as an alias

i definitely would have never thought to look there :) some spelunking shows that this was added as part of some rdar:// issue link, this was later moved over to tablegen where it now lives... but rdar://8416805 appears to not be public? or i'm looking in the wrong place some more. would you be opposed to a subsequent PR switching this to xor? if not, i'll do a PR adding a comment pointing here.

On that trap, the kernel ensures that rax has 1 in it,

oh! this was the missing piece. so then you want to be absolutely certain that the compiled probe would have assembly of the shape

    xor rax, rax      ; critical to be present at the `is_enabled` address, otherwise
                      ; enablement breaks. so, `asm!()` it is!
    test rax, rax     ; <- both from rustc, but `rax` is opaque so we always get a
    jz not_enabled    ; <- `test` instruction (and not assumptions `rax` is 0)
is_enabled:
    ; do the probe stuff

DTrace also rewrites that nop into another trap

and this was the other piece i hadn't quite understood; taken together this also explains why in rust terms the probe machinery must be implemented as a macro - probe sites must be unique addresses in the binary otherwise. and then in this way some asm declaration is also implicitly load-bearing to prevent the compiler from deciding two different call sites are identical and f.ex candidates for outlining...

On macOS, the linker will rewrite the call into a nop

ah! so the call is more to get an instruction referencing that extern fn at the right address, for macOS to rely on as it handles the records and probe enablement?

if you're familiar with calling dtrace -h and dtrace -G, the .reference directive essentially takes the place of that.

i'm not, but i'm piecing it together and see how those line up 😀

we can still be sure that the flags register is not touched.

xor rax, rax does modify all the typical arithmetic flags; it will clear OF and CF always, then because rax == rax it will also clear SF and ZF. i believe PF is set, but i haven't thought about it in a quick minute. i think nomem is also inaccurate because there may be pointer arguments (strings) passed to the probe function? nostack seems to depend on how the probe function is exactly called - if that's all wired up via traps and kernel, then technically with all arguments in registers and no return address on the stack that does make sense :D

This is one of the few times I feel confident saying that we know better!

i agree! i actually think it would be clearer to make the asm! block a little larger and say exactly this, vs (in the no-linker case) weaving between asm! and Rust with the overhead of declaring intermediate state back to rustc (f.ex preserve_flags). exactly the opposite outcome i'd hoped for when i started looking at this in #64 :)

bonus thoughts:

it seems that is_enabled_rec and probe_rec don't (abstractly) have to be in-line, but do because the labels 990, 991, and 992 are used to figure out sizes and values that describe the probe records themselves? e.g. 990: here must then be referenced as an address by the .8byte directive for the record describing it. i didn't see anything in the macOS assembler guide that b and f are suffixes for label regions but it really seems like that must be the interpretation... would it make sense to name these labels more intentionally, something more like...

#asm_macro!(
    "probe: nop",
    #probe_rec,
    #in_regs
    options(...)
);

... and elsewhere ...

format!(
    r#"
                .pushsection {section_ident}
                .balign 8
        rec_start:
                .4byte rec_end - rec_start
                .byte {version}
                .byte {n_args}
                .2byte {flags}
                .8byte probe
                .asciz "{prov}"
                .asciz "{probe}"
                {arguments}
        rec_end:                 ; note no second balign 8 ... i think it was redundant?
                .popsection

this means the probe record emission has to be in-line with the probe because of the shared labels but makes the size/address calculation not ambiguous with.. hex constants 😅 (reading 990b as not a magic number/bitfield took a moment..)

bnaecker · 2022-11-18T21:55:39Z

Hi @iximeow! Thanks for your patience on this. I'm coming back to this repo for some maintenance, so I apologize for the hiatus.

I'm glad I was able to answer some of your questions about how this crate works. It's true that a lot of it relies on knowing the implementation details of DTrace itself. We happen to have just such a person in @ahl! I've tried to cover the outstanding issues below, but let me know if you have other questions.

I believe you're correct about the preserves_flags option, that should not be specified. We're explicitly clearing rax, so that should indeed modify the arithmetic flags.

As for nomem, I think the answer is much less clear. It's true that there may be strings involved. In particular, the argument closure may create or refer to a string, and we may be passing (a pointer to) that in a register when entering the assembly block. However, the block itself never reads or writes any additional memory. Even if pointed-to data is reference in a D program based on this probe, the kernel itself would be copying that data from user memory. That's all inside the kernel, and the user process is effectively stopped.

Like I said, it's unclear to me if this "counts" as accessing memory. We're certainly never doing something like mov or other operations directly against the memory. I guess the way to think about it is that assembly is just a vehicle to express the intent, which is that we pass a few arguments in registers into the DTrace kernel module. That said, it may be safest to just remove all these flags. I don't see any harm, but @ahl let me know how you feel about it.

The last thing is the labels. 990 etc were just easy to remember numbers. The b and f suffix stand for backward and forward, and direct the assembler to refer to the next label with that name in those directions. That's at least part of the need for putting the probes "in-line": without making the labels themselves variables (which I don't think we wanted to do), you can only refer to them by the same name, so you need the direction to disambiguate which one. Otherwise we'd refer to the probe after us when storing the metadata!

That said, the before/after syntax should still be OK with better names. I believe I tried initially to use more human-friendly names, like rec_start. I have a vague recollection that this did not work, but I definitely don't remember the details. Please feel free to try that!

Hopefully that addresses your outstanding points. I'd be happy to review a PR fixing the ASM block flags, and I'm also still mulling over your PR removing the lock xchg. That's probably better than the unsound thing we had before, but I also may want to explore re-organizing things so we can just take an exclusive reference there.

iximeow · 2022-11-26T17:44:25Z

Hi @iximeow! Thanks for your patience on this. I'm coming back to this repo for some maintenance, so I apologize for the hiatus.

no worries! i've recently come back to some issues i've been mulling over elsewhere, too.

Like I said, it's unclear to me if this "counts" as accessing memory. We're certainly never doing something like mov or other operations directly against the memory. I guess the way to think about it is that assembly is just a vehicle to express the intent, which is that we pass a few arguments in registers into the DTrace kernel module

i think the rub is that rustc (more likely, LLVM) is attentively considering the attestation as well. so while this probably wouldn't happen today, consider a function like...

fn probe_example() {
  let mut work: u8 = 0;
  let mut sum = 0;

  while let Some(item) = do_work(&mut sum) {
      work += 1;
  }

  module::report_work!(|| &work); // you _could_ write `work` without a ref, of course. somewhat contrived example..

  cleanup();

  sum
}

if do_work() causes work to spill to a stack slot..

LLVM could conceivably increment directly to the stack slot (add [rsp + 0x1234], 1)
load the reference to work as a parameter to report_work
conclude that work is no longer used (the asm block promises it won't read the pointer, LLVM doesn't "see" the call)
reuses that stack slot to spill another local (sum), preparing for cleanup() later
enters the report_work thunk, which faithfully passes along a pointer to the clobbered stack slot

it's a bit contrived - i think it requires the local be a type with no Drop impl for example - but i've been on the other side of this stuff and it's always miserable to find.

The b and f suffix stand for backward and forward, and direct the assembler to refer to the next label with that name in those directions.

neat! that's an interesting way to acknowledge duplicate labels in an assembler..

Hopefully that addresses your outstanding points.

sure does! i'll wait to PR about the asm flags until @ahl has a chance to weigh in too. and for the lock xchg PR, if you can take an exclusive reference that'd be great! it's still unsound as i'd PR'd there, but once i realized there was an underlying difference with goblin's notion of ownership i figured it was better to post the patch and let you figure out which form is most palatable 😁

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

documentation of `asm` contents? #65

documentation of `asm` contents? #65

iximeow commented Jul 15, 2022

bnaecker commented Jul 18, 2022

bnaecker commented Jul 18, 2022

ahl commented Jul 20, 2022

iximeow commented Jul 20, 2022

bnaecker commented Nov 18, 2022

iximeow commented Nov 26, 2022

documentation of asm contents? #65

documentation of asm contents? #65

Comments

iximeow commented Jul 15, 2022

bnaecker commented Jul 18, 2022

bnaecker commented Jul 18, 2022

ahl commented Jul 20, 2022

iximeow commented Jul 20, 2022

bonus thoughts:

bnaecker commented Nov 18, 2022

iximeow commented Nov 26, 2022

documentation of `asm` contents? #65

documentation of `asm` contents? #65