Skip to content

Commit

Permalink
Fix incorrect use of POSIX atomics (github Issue 91)
Browse files Browse the repository at this point in the history
The nvtxInit.h macro definitions for NVTX_ATOMIC_WRITE_32 and NVTX_ATOMIC_CAS_32
were correct on Windows, but not on POSIX.  For the CAS (compare and swap), the
"newval" and "oldval" arguments were reversed.  This led to an unfortunate bug
where the NVTX init was not threadsafe, and the problem was not noticed because
the CAS operation with the arguments reversed happened to be returning the FRESH
value even though it did not succeed in doing the swap.  The reporter of Issue
91 on the NVTX github page correctly described that __sync_val_compare_and_swap
takes the comparand (oldval) and exchange (newval) arguments in the reverse
order we were passing them.  The fix here is simply to swap the order we pass
them from the macro to __sync_val_compare_and_swap.

While reviewing the docs for __sync_val_compare_and_swap, I noticed that the
function we used in the implementation of NVTX_ATOMIC_WRITE_32 may not do what
we want on all platforms.  __sync_lock_test_and_set is only an acquire barrier,
not a full memory barrier, and here we are using it as a release instead.  Also
some platforms apparently only support writing the value 1, and we are writing
the value 2.  So I changed this to simply do a memory barrier, a volatile write,
and another memory barrier.

Also, this change removes an unnecessary __sync_synchronize in the macro before
__sync_val_compare_and_swap.  I reviewed the generated assembly using godbolt,
and confirmed on x86 and ARM the correct memory-ordering behavior is produced
just by __sync_val_compare_and_swap by itself.

This raises the issue we need much better threadsafety testing.  We will work
on ways to validate threadsafe initialization programmatically -- a simple
test using multiple threads with an intentionally long init routine would
have caught this error since the init calls would not have serialized.
  • Loading branch information
jcohen-nvidia committed Nov 13, 2024
1 parent c00c100 commit 9cf43fd
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions c/include/nvtx3/nvtxDetail/nvtxInit.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,8 +51,8 @@
#define NVTX_YIELD() sched_yield()
#define NVTX_MEMBAR() __sync_synchronize()
/* Ensure full memory barrier for atomics, to match Windows functions */
#define NVTX_ATOMIC_WRITE_32(address, value) __sync_synchronize(); __sync_lock_test_and_set(address, value)
#define NVTX_ATOMIC_CAS_32(old, address, exchange, comparand) __sync_synchronize(); old = __sync_val_compare_and_swap(address, exchange, comparand)
#define NVTX_ATOMIC_WRITE_32(address, value) __sync_synchronize(); *address = value; __sync_synchronize()
#define NVTX_ATOMIC_CAS_32(old, address, exchange, comparand) old = __sync_val_compare_and_swap(address, comparand, exchange)
#else
#error The library does not support your configuration!
#endif
Expand Down

0 comments on commit 9cf43fd

Please sign in to comment.