You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The Clang thread sanitizer on Linux/Ubuntu detects a data race (v.i., "Traceback:") on the sim_asynch_check variable, which Mark has previously said is a false positive. I'll explain why this isn't the case more carefully than the last time. N.B.: This isn't a discussion of AIO's design, it's a technical discussion with respect to variable synchronization across threads, i.e., implementation.
Here's what's going on:
The main thread reads and writes sim_asynch_check inside the AIO_CHECK_EVENT macro, which is generally expanded at the bottom of simulator instruction loops to process AIO events when --sim_asynch_check < 0.
The AIO thread also executes sim_asynch_check = 0 (in sim_aio_activate, scp.c:439) so that the next time the main thread executes the AIO_CHECK_EVENT macro's code, AIO events are (supposed to be) immediately processed.
Unlike other variables used across threads by AIO, sim_asynch_check is not protected: no mutex, no synchronized access primitives. Consequently, the thread sanitizer correctly identifies an inconsistent read-after-write hazard across threads. This can be a problem for several reasons, one of which is memory store ordering (Intel and ARM do it differently) and another of which is the intent of the AIO code:
sim_asynch_check = 0; /* try to force check */
The important thing to remember is that memory writes are not guaranteed to be sequential or consistent across cores without thread synchronization primitives. Just because the AIO thread writes 0 to sim_asynch_check does not mean that the main thread will read 0 when AIO_EVENT_CHECK decrements sim_async_check (On Intel, there's a high likelihood that the write and reads are more strongly ordered via total store ordering (TSO), whereas ARM has weaker guarantees because ARM does fairly aggressive load/store reordering.)
Looking at the AIO_CHECK_EVENT macro, with extra annotations:
#define AIO_CHECK_EVENT \
/* case 1 */ \
if (0 > --sim_asynch_check) { \
/* case 2 */ \
AIO_UPDATE_QUEUE; \
sim_asynch_check = sim_asynch_inst_latency; \
/* case 3 */ \
} else (void)0
/* case 4 */ \
Case 1: If AIO's write is successfully read by the main thread, then the intent of scp.c:439 is satisfied. Otherwise, AIO_UPDATE_QUEUE is delayed for at least one simulator instruction cycle, possibly delayed by the remaining count in sim_asynch_check depending on the host processor's load/store ordering -- the AIO sim_asynch_check = 0 precedes AIO_CHECK_EVENT and can effectively be ignored or removed from the host processor's write buffers.
Case 2: Main thread wins, delaying AIO_UPDATE_QUEUE by sim_async_inst_latency simulated instruction cycles.
Cases 3, 4: Similar to case 1, where the intent of forcing the AIO check is likely satisfied because main thread's write to sim_asynch_check precedes AIO's write and the host processor can effectively ignore the main thread's write.
Previously, Mark has said (and I'm paraphrasing) that there's no real problem if AIO_UPDATE_QUEUE gets delayed. If that's the case, then scp.c:439 can be removed because there's never a reason to force AIO_EVENT_CHECK to execute immediately.
There is a performance impact if sim_asynch_check is synchronized across threads because synchronization will occur during each simulated instruction when AIO_EVENT_CHECK executes. This should only be a memory fence/barrier around sim_asynch_check, but one can't make guarantees with respect to how host processors or compilers choose to implement memory fences or barriers.
Inserting Clang-specific annotation or a preprocessor conditional forest to detect the Clang thread sanitizer is a non-starter. It makes the code harder to read and maintain. Fixing the race by coming to a consensus as to what the proper behavior is infinitely more preferable.
Traceback:
WARNING: ThreadSanitizer: data race (pid=8206)
Write of size 4 at 0x55f42a1f26b8 by thread T4 (mutexes: write M0):
#0 sim_aio_activate /home/runner/work/open-simh/open-simh/scp.c:439:18 (vax8600+0x1b66ac) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#1 _sim_activate /home/runner/work/open-simh/open-simh/scp.c:12013:1 (vax8600+0x1e978f) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#2 sim_activate /home/runner/work/open-simh/open-simh/scp.c:12005:8 (vax8600+0x1dd1ff) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#3 _disk_io /home/runner/work/open-simh/open-simh/sim_disk.c:266:5 (vax8600+0x206b49) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
Previous write of size 4 at 0x55f42a1f26b8 by main thread:
#0 sim_instr /home/runner/work/open-simh/open-simh/VAX/vax_cpu.c:658:5 (vax8600+0x1033ee) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#1 run_cmd /home/runner/work/open-simh/open-simh/scp.c:9325:13 (vax8600+0x1e38dd) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#2 vax860_boot /home/runner/work/open-simh/open-simh/VAX/vax860_abus.c:700:8 (vax8600+0x144459) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#3 do_cmd_label /home/runner/work/open-simh/open-simh/scp.c (vax8600+0x1c61a5) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#4 call_cmd /home/runner/work/open-simh/open-simh/scp.c:5636:8 (vax8600+0x1d0ea0) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#5 do_cmd_label /home/runner/work/open-simh/open-simh/scp.c (vax8600+0x1c61a5) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#6 do_cmd /home/runner/work/open-simh/open-simh/scp.c:4040:8 (vax8600+0x1c53e0) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
#7 main /home/runner/work/open-simh/open-simh/scp.c:2944:16 (vax8600+0x1b7d9e) (BuildId: 219a3a64523a85072939433b7104158fba55d2ba)
Location is global 'sim_asynch_check' of size 4 at 0x55f42a1f26b8 (vax8600+0x177f6b8)
The text was updated successfully, but these errors were encountered:
The Clang thread sanitizer on Linux/Ubuntu detects a data race (v.i., "Traceback:") on the
sim_asynch_check
variable, which Mark has previously said is a false positive. I'll explain why this isn't the case more carefully than the last time. N.B.: This isn't a discussion of AIO's design, it's a technical discussion with respect to variable synchronization across threads, i.e., implementation.Here's what's going on:
sim_asynch_check
inside theAIO_CHECK_EVENT
macro, which is generally expanded at the bottom of simulator instruction loops to process AIO events when--sim_asynch_check < 0
.sim_asynch_check = 0
(insim_aio_activate
, scp.c:439) so that the next time the main thread executes theAIO_CHECK_EVENT
macro's code, AIO events are (supposed to be) immediately processed.Unlike other variables used across threads by AIO,
sim_asynch_check
is not protected: no mutex, no synchronized access primitives. Consequently, the thread sanitizer correctly identifies an inconsistent read-after-write hazard across threads. This can be a problem for several reasons, one of which is memory store ordering (Intel and ARM do it differently) and another of which is the intent of the AIO code:The important thing to remember is that memory writes are not guaranteed to be sequential or consistent across cores without thread synchronization primitives. Just because the AIO thread writes
0
tosim_asynch_check
does not mean that the main thread will read0
whenAIO_EVENT_CHECK
decrementssim_async_check
(On Intel, there's a high likelihood that the write and reads are more strongly ordered via total store ordering (TSO), whereas ARM has weaker guarantees because ARM does fairly aggressive load/store reordering.)Looking at the
AIO_CHECK_EVENT
macro, with extra annotations:scp.c:439
is satisfied. Otherwise,AIO_UPDATE_QUEUE
is delayed for at least one simulator instruction cycle, possibly delayed by the remaining count insim_asynch_check
depending on the host processor's load/store ordering -- the AIOsim_asynch_check = 0
precedesAIO_CHECK_EVENT
and can effectively be ignored or removed from the host processor's write buffers.AIO_UPDATE_QUEUE
bysim_async_inst_latency
simulated instruction cycles.sim_asynch_check
precedes AIO's write and the host processor can effectively ignore the main thread's write.Previously, Mark has said (and I'm paraphrasing) that there's no real problem if
AIO_UPDATE_QUEUE
gets delayed. If that's the case, thenscp.c:439
can be removed because there's never a reason to forceAIO_EVENT_CHECK
to execute immediately.There is a performance impact if
sim_asynch_check
is synchronized across threads because synchronization will occur during each simulated instruction whenAIO_EVENT_CHECK
executes. This should only be a memory fence/barrier aroundsim_asynch_check
, but one can't make guarantees with respect to how host processors or compilers choose to implement memory fences or barriers.Inserting Clang-specific annotation or a preprocessor conditional forest to detect the Clang thread sanitizer is a non-starter. It makes the code harder to read and maintain. Fixing the race by coming to a consensus as to what the proper behavior is infinitely more preferable.
Traceback:
The text was updated successfully, but these errors were encountered: