-
Notifications
You must be signed in to change notification settings - Fork 120
Add RCU bindings #143
Comments
Thinking a bit harder about the long-term API and how to handle writers... the interesting part of RCU is that it separates readers and writers. You don't need to block readers while doing a write; you only need to block writers at some point after they've done a write. Also (as the name and some of the terminology imply) "writer" isn't quite accurate. You can't write to an RCU-protected structure. You can only update it to a new one, and then reclaim the old structure later. Since there's no acquire step on the write side, just an atomic update, if a writer needs to read the old structure to copy and modify it, it needs to ensure that no other writers are conflicting with it. (I suppose this could be done with RCU itself, but in practice it seems to be more useful to hold a spinlock across both the read and the update. There's a helper macro So... I think the way to do this is to treat the read capability and the update capability as two different Rust objects that contain the same pointer. Something roughly like pub struct Rcu<T> {ptr: *const T);
pub struct RcuWrite<T> {ptr: Box<T>);
pub struct RcuReadGuard(());
impl RcuReadGuard {
fn new() -> Self { unsafe { rcu_read_lock() }; RcuReadGuard(()) }
}
impl Drop for RcuReadGuard {
fn drop(&mut self) { unsafe { rcu_read_unlock () }; }
}
impl<T> Rcu<T> {
fn init(data: T) -> (Rcu<T>, RcuWrite<T>) {
let w = RcuWrite { ptr: Box::new(data) };
let r = Rcu { w.ptr.into_raw() };
(r, w)
}
fn read<'a>(&self, _: &'g RcuReadGuard) -> &'g T {
unsafe { rcu_defererence(self.ptr) } as &T
}
}
pub struct RcuWriteGuard(Box<T>);
impl Drop for RcuWriteGuard {
fn drop(&mut self) { unsafe { synchronize_rcu () }; }
}
impl<T> RcuWrite<T> {
fn update(&mut self, data: T) {
let new = Box::new(data);
// protected because we have &mut self
let old = unsafe { Box::from_raw(rcu_dereference_protected(self.ptr, 1)) };
unsafe { rcu_assign_pointer(&mut self.ptr, new.into_raw()) };
RcuWriteGuard(old)
}
} or something. Which you could use, then, by storing both a |
Er, obviously that implementation isn't quite right, you need Also perhaps a better design, instead of spitting out a tuple of the read and write objects, is to construct the In any case, I'm fairly confident about the design of |
[Deleted. This issue is about RCU bindings for Rust, for Rust-coded Linux kernel modules. My earlier comment is not relevant.] |
@nicowilliams-work Unfortunately we need to bind the running kernel's implementation of RCU since we need to access exported RCU-protected data structures. (Although, now that I think about it, for the use case of things that live entirely in Rust-space, I vaguely wonder if crossbeam-epoch would work... it does have no_std support. Might be an interesting thing to benchmark RCU in C vs. RCU Rust bindings vs. crossbeam-epoch.) |
A working version of read support (only) is up at #250. |
Add an RcuReadGuard object for calling functions that expect to be RCU-protected, and add a Rust equivalent of for_each_process and a very light task_struct bindings sufficient to write a simple test/demo.
Add an RcuReadGuard object for calling functions that expect to be RCU-protected, and add a Rust equivalent of for_each_process and a very light task_struct bindings sufficient to write a simple test/demo.
Add an RcuReadGuard object for calling functions that expect to be RCU-protected, and add a Rust equivalent of for_each_process and a very light task_struct bindings sufficient to write a simple test/demo.
There are some APIs like
for_each_process
that want you to hold an RCU read lock when calling them. We should support this, at minimum. We probably want to also have full support for RCU-protected data, but maybe we can put that on hold.Upstream RCU docs: https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
The rough idea of RCU is to implement a reader-writer lock-ish pattern where readers are low-overhead, under the assumption that all readers are short-lived - a valid assumption for kernels, where a "reader" is the execution of a single syscall/interrupt; anything that persists state beyond this will count as a "writer". In particular, readers should not be slowed down when a writer is trying to do an update, so ideally reads should happen concurrently. The way this is done is by having writers allocate and fill in a new structure, atomically update the pointer to it, synchronize memory and wait for existing readers to complete, and then deallocating the old structure. (This pointer can either be a single pointer to some single structure, or one of the pointers in a linked list, or something.) That way, existing readers see a consistent view - either the old or new version - and you're guaranteed that once you're finished synchronizing, any readers in progress are only seeing the new version.
(Multiple writers are not protected by RCU in any way. They should use a conventional mutex, or something.)
The basic RCU API in Rusty pseudocode is
(Note the latter two in C are macros, into which
p
is syntactically passed by reference. If we want to bind them directly, we probably want to expose C helpers that takevoid **
and handle the type safety in the Rust wrapper.)rcu_read_lock
andrcu_read_unlock
take no context info, they cause an RCU read-sized critical section for any RCU use in the kernel. Between those you may callrcu_dereference
on anything. (That said, there are two other RCU "flavors" in the kernel,rcu_read_lock_bh
etc. andrcu_read_lock_sched
etc. They appear to have been consolidated last year but I think from the API they should still be treated as separate? In any case, the other flavors are rarely used and we can probably get away with ignoring them for now.)The rules as I understand them are:
rcu_read_lock
andrcu_read_unlock
(aka "within a read-side critical section"), and you can only read it withrcu_dererence
(though it compiles out on all architectures besides Alpha).rcu_dereference
doesn't actually do the dereferencing itself, it simply gives you a pointer you can safely deference untilrcu_read_unlock
.rcu_assign_pointer
, though you can do so at any point without advance notice.rcu_assign_pointer
valid until you've calledsynchronize_rcu
.You may nest / overlap read-side critical sections, though. It's a reader lock, you can have more than one of those and writers can't do anything to invalidate the data you're reading until all reader locks are dropped. The only difference with RCU is that writers only wait on
synchronize_rcu
for existing reader locks, once they start, any new reader locks don't affect it (new reads are now guaranteed to be ordered after any previousrcu_assign_pointer
s).I think the rough way to handle this is to
Rcu<T>
pointer typeRcuRead
object whose constructor callsrcu_read_lock()
, whose destructor callsrcu_read_unlock()
, and which is required to read anRcu<T>
Rcu<T>
that keeps the old value alive. Perhaps it returns anRcuDropGuard<T>
object that holds on to a pointer to the old object, and that object's destructor callssynchronize_rcu()
and then frees it?I think it's memory-safe that we're using RAII / destructors here. The worst that can happen is that you deadlock (if you forget an
RcuRead
) or you keep the old value alive forever (if you forgetRcuDropGuard<T>
). The unsafety instd::thread::scoped::JoinGuard
was that there were operations that were unsafe to do before theJoinGuard
was dropped and safe after (like, deallocate data used by the thread). There are no operations that are enabled by either anRcuRead
or anRcuDropGuard<T>
going out of scope. (TheRcuDropGuard<T>
itself needs to handle deallocating the old object, for this reason, to guarantee thatsynchronize_rcu()
was in fact called.)I'm a little less sure about making sure you don't hold the result of reading an
Rcu<T>
after the originatingRcuRead
goes out of scope / after you callrcu_read_unlock
. Can we use lifetimes here, by giving you a pointer that's constrained to the lifetime ofRcuRead
? Or do we need to insist on making you pass a callback/lambda?For the short term, I'd like to at least introduce
RcuRead
and have it be a required parameter for binding things likefor_each_process
, and we can handle doing RCU pointer reads and writes in Rust later. (Which also lets us defer figuring out how to integrate the mutex needed for avoiding concurrent writes.)More interesting RCU docs:
More description of things that RCU does and does not guarantee: https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html
An RCU home page, more or less: http://www.rdrop.com/~paulmck/RCU/
"RCU's first-ever CVE, and how I lived to tell the tale": https://youtu.be/hZX1aokdNiY http://www.rdrop.com/~paulmck/RCU/cve.2019.01.23e.pdf (tl;dr: use-after-free because they locked the wrong RCU flavor, but the solution is a bit more complicated than that)
The text was updated successfully, but these errors were encountered: