-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Solitary USDT probe in a tokio::select!
statement seems to cause other code in one select!
arm to execute unexpectedly
#86
Comments
I was trying to repro this yesterday, and hit a few things I thought I'd add. First, I made a tiny program intended to be a minimal reproducer, but was not able to successfully repro the behavior. Here's the #[usdt::provider]
mod probes {
fn read_bytes(n_bytes: u64) {}
}
#[tokio::main]
async fn main() {
usdt::register_probes().unwrap();
let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();
let counter = std::sync::Arc::new(std::sync::atomic::AtomicU64::new(10));
let counter_ = counter.clone();
let _task = tokio::task::spawn(async move {
while counter_.load(std::sync::atomic::Ordering::SeqCst) != 0 {
tokio::time::sleep(std::time::Duration::from_millis(100)).await;
let n = counter_.fetch_sub(1, std::sync::atomic::Ordering::SeqCst);
tx.send(n).unwrap();
}
});
loop {
tokio::select! {
nread = rx.recv() => {
match nread {
Some(0) | None => {
probes::read_bytes!(|| { 0 } );
break;
}
Some(n) => {
probes::read_bytes!(|| { n });
}
}
}
}
}
assert_eq!(counter.load(std::sync::atomic::Ordering::SeqCst), 0, "Broke early");
println!("Completed normally");
} Here's what I see running this against
It's entirely possible that's not an accurate reproducer. I next tried to go back to the original commit in Propolis on which this failure was discovered, but could not actually build it due to a dependency conflict:
I'm not sure where this error is coming from, but it seems there's some git dependency that's not pinned down to a specific revision. |
Propolis recently stumbled across what seems like a code generation issue in which putting a USDT probe (and only a USDT probe) in exactly the right place caused a function to careen down an unexpected path in a
tokio::select!
statement. We've worked around the issue in Propolis, but we need to investigate further to root-cause it, hence this tracking issue.(I'm not certain whether this is a problem with USDT as such or an unfortunate interaction between USDT and the Rust build tools--we might investigate and find it's the latter. But USDT probes do seem to be involved somehow so I'm filing this in this repo for now.)
Most of the context, and a link to the actual code in question, is in propolis#292. It might be faster just to read that issue. Here, however, is some long-winded background:
Propolis has a tokio task that reads/writes bytes from/to a VM's virtual COM1 and from/to serial console clients connected over a websocket. The body of the task is mostly a large
select!
statement that handles input/output from these various sources. In psuedocode it looks like this:The (Propolis equivalent) of the probe on line 15 has very strange behavior. If you build this code
release
, and this probe (and only this probe) is present in its match arm, then the serial task breaks out of the loop unexpectedly early. Any of the following changes make the loop behave as expected:dev
profileslog::info!
logging statement at line 16 (this need not refer to the value of nread or indeed to any other variables; it can just print a message)This Propolis commit implements this last option as a workaround and shows the actual code in its full glory.
I briefly rummaged through the output of
cargo-show-asm
with and without the offending probe in place, but quickly decided I was in over my head, since (a) I don't (yet) understand how DTrace probes work or how USDT constructs them, and (b) the only way I could find to get all the assembly and matched line numbers I needed was to generate asm for the entire propolis-server lib, which is massive and hard to rummage through at this level of detail.I suspect the next step here is to try to minimize this somewhat: can we reproduce this with a simplified task that fakes out the UART and builds in its own small library? If so that should make it much easier to understand what's happening.
The text was updated successfully, but these errors were encountered: