READ_ONCE and WRITE_ONCE
There are several reasons to use at least `READ_ONCE` and `WRITE_ONCE` for all shared memory accesses:
- It makes code easier to understand
- It is required by relevant standards
- It enables automatic data race detection
And there are no reasons not to use them (see Performance considerations below if you are worried about performance).
Accesses to shared memory are an important thing and they should be clearly visible in the source code. Whether the accessed memory location can be concurrently read/written significantly affects the way you reason about the code. For example, consider the following code:
```c
tty = port->itty;
if (tty == NULL)
        return;
```
It looks like nothing special and does not suggest that this "simple" piece of code actually hides a non-trivial synchronization protocol. If instead the code looked like:
```c
tty = READ_ONCE(port->itty);
if (tty == NULL)
        return;
```
it would be clear that concurrent mutations of `port->itty` are possible, and thus greater attention is required (especially if you are tracking down a bug in the vicinity). For example, what is the ownership story here? If `port->itty` can become `NULL` concurrently, can't it also be deleted concurrently?
Consider another piece of code, which waits for a response from a device and then copies out the reply:
```c
int rc = -1;
...
wait_event_timeout(ps2dev->wait, !(ps2dev->flags & PS2_FLAG_CMD), timeout);
for (i = 0; i < receive; i++)
        resp[i] = ps2dev->cmdbuf[(receive - 1) - i];
if (ps2dev->cmdcnt)
        goto out;
rc = 0;
out:
return rc;
```
This code contains a subtle bug: if the reply arrives right after the timeout, we can copy out uninitialized garbage as the response while expecting to return an error from the function; but by the time we check `cmdcnt` the response has already arrived, so we return success, along with the garbage. If the code used `READ_ONCE` to read `ps2dev->flags` and `ps2dev->cmdcnt`, it would be more visible that we are doing a sequence of loads of potentially mutating state, which would hopefully draw attention to the possibility of the bad scenario described above.
The Linux-Kernel Memory Model documentation says:

> Loads from and stores to normal variables should be protected with the ACCESS_ONCE() macro.
Similarly, the C standard says:

> 5.1.2.4/25 The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
As a consequence, C compilers stopped guaranteeing that "word accesses are atomic". There are a number of ways compilers can miscompile non-race-free programs; see, for example, How to miscompile programs with “benign” data races. Let's consider one example. Below is the result of compiling an OpenSSL function with a release build of gcc:
```
// static void widefelem_diff(widefelem out, const widefelem in)
0000000000fa0390 <widefelem_diff>:
  fa0390:  mov %rdi,-0x8(%rsp)
  fa0395:  mov %rsi,-0x10(%rsp)
  fa039a:  mov -0x8(%rsp),%rsi
  fa039f:  mov (%rsi),%rdi      # Load from memory into the register.
  fa03a2:  mov 0x8(%rsi),%rax
  fa03a6:  mov %rdi,(%rsi)      # Store the same value back into memory.
  ...
```
This code dereferences one of the input arguments and then immediately stores the loaded value back. Such generated code cannot break a race-free program, because in a race-free program there are no concurrent stores to the variable. Now consider what happens if the compiler generates such code for `BUG_ON(obj->refcount > 0)`. The code will load the value of `refcount` into a register and store it back; this can wipe out a concurrent increment in another thread, which, in turn, leads to a use-after-free of the object.
If intentional accesses to shared memory are not marked in some way, then it is also impossible to automatically detect unintentional accesses to shared memory (that is, forgotten locks, accesses from wrong threads, inconsistent reads of uint64 values, etc.). There are tools that try to find these unpleasant classes of bugs (KTSAN, KernelStrider), but they cannot accomplish the task while the kernel source code is sprinkled with intentional racy accesses that are indistinguishable from bugs.
Performance considerations
Given a sufficiently expressive atomic API and a good implementation, you pay only for what you really need (and if you pay even a bit less, the generated code becomes incorrect). So performance is not an argument here. Note that if these assumptions do not hold, that is an issue in itself; it is orthogonal to this topic and should be fixed separately.