linux-user: c11-atomic-exec-4 dejagnu test fails #48

VVIsaev · 2021-08-06T13:17:39Z

This test works fine in system qemu.

Smaller version of this test:

/* Test for _Atomic in C11.  Test that compare-and-exchange is
   operating properly when operations on the same variable are carried
   out in two threads.  */
/* { dg-do run } */
/* { dg-options "-std=c11 -pedantic-errors -pthread -U_POSIX_C_SOURCE -D_POSIX_C_SOURCE=200809L" } */
/* { dg-additional-options "-D_XOPEN_SOURCE=600" { target *-*-solaris2* } } */
/* { dg-require-effective-target pthread } */

#include <stdint.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define ITER_COUNT 10000

static volatile _Atomic bool thread_ready;

/* Generate test code (with NAME used to name functions and variables)
   for atomic compound assignments to a variable of type LHSTYPE.  The
   variable is initialized to INIT, then PRE var POST is executed
   ITER_COUNT times in each of two threads, and the final result
   should be FINAL.  A function test_main_##NAME is generated that
   returns nonzero on failure, zero on success.  */

#define TEST_FUNCS(NAME, LHSTYPE, PRE, POST, INIT, FINAL)		\
									\
static volatile _Atomic LHSTYPE var_##NAME = (INIT);			\
									\
static void *								\
test_thread_##NAME (void *arg)						\
{									\
  thread_ready = true;							\
  for (int i = 0; i < ITER_COUNT; i++)					\
    PRE var_##NAME POST;						\
  return NULL;								\
}									\
									\
static int								\
test_main_##NAME (void)							\
{									\
  thread_ready = false;							\
  pthread_t thread_id;							\
  int pret = pthread_create (&thread_id, NULL, test_thread_##NAME,	\
			     NULL);					\
  if (pret != 0)							\
    {									\
      printf ("pthread_create failed: %d\n", pret);			\
      return 1;								\
    }									\
  while (!thread_ready)							\
    ;									\
  for (int i = 0; i < ITER_COUNT; i++)					\
    PRE var_##NAME POST;						\
  pthread_join (thread_id, NULL);					\
  if (var_##NAME != (FINAL))						\
    {									\
      printf (#NAME " failed %u %u\n", var_##NAME, FINAL);			\
      return 1;								\
    }									\
  else									\
    {									\
      printf (#NAME " passed\n");					\
      return 0;								\
    }									\
}

TEST_FUNCS (uint32_add, uint32_t, , += 1, 0, (uint32_t) 20000)

int
main (void)
{
  int ret = 0;
  ret |= test_main_uint32_add ();
  if (ret)
    abort ();
  else
    exit (0);
}

How to build:

 $ arc-snps-linux-gnu-gcc -mcpu=archs -latomic -lm -pthread atomic-test.c -o ./test
 $ qemu-arc -cpu archs -L ~/tools/glibc_le_archs_linux/arc-snps-linux-gnu/sysroot ./test 
uint32_add failed 12445 20000

The text was updated successfully, but these errors were encountered:

VVIsaev · 2021-08-06T13:19:43Z

Looks like qemu-user runs second thread with second CPU instance on pthread_create and QEMU's llock/scond does not protect properly.

This is how addition is compiled (both functions have similar output):

   1064a:       43c3 0001 4034          mov_s   r3,0x14034
   10650:       236f 10ff               dmb     0x3
   10654:       222f 00d0               llock   r2,[r3]
   10658:       629a                    add_s   r2,r2,r12
   1065a:       222f 00d1               scond   r2,[r3]
   1065e:       f5fc                    bne_s   -8      ;10654 <test_main_uint32_add+0x94>
   10660:       236f 10ff               dmb     0x3

cupertinomiranda · 2021-10-21T17:51:21Z

Looking into this. Did not notice this issue before.

BrunoASMauricio · 2023-03-03T17:18:40Z

The Problem:

While running on full system or baremetal, qemu runs arc_cpu_do_interrupt from target/arc/helper.c on each interrupt, which clears the LF flag.
However, in user-mode this function isn't used because it is the actual host CPU that takes care of the interrupts, preemption and all (as noted by @VVIsaev, the threads are run via pthread_create, and not emulated somehow).

As such, the LF flag isn't being cleared when a thread is preempted, which breaks atomicity between them.
The offending instruction order is as follows:

Thread 1:
1    llock   r2,[r3]
2    add_s   r2,r2,r12
* PREEMPTED *

Thread 2:
...
3    llock   r2,[r3]
4    add_s   r2,r2,r12
5    scond   r2,[r3]
...
6    llock   r2,[r3]
* PREEMPTED *

Thread 1:
7    scond   r2,[r3]

As the LF flag is not cleared in preemption, the llock (6) will enable scond (7) to succeed, and Thread 1 will overwrite Thread 2s' changes.

Verify the problem:

The multi threaded nature of the problem makes it hard to debug and get sensible information from.
A way to verify that this is indeed the problem is to come up with a quick and dirty solution, like adding the following into helper_scond in target/arc/op_helper.c:

    #ifdef CONFIG_USER_ONLY
    pthread_t current_thread = pthread_self();
    if (last_thread_access != current_thread) {
        entry->lpa_lf &= -2;
    }
    #endif

and the following into helper_llock in the same file

    #ifdef CONFIG_USER_ONLY
    last_thread_access = pthread_self();
    #endif

Not forgetting the required pthread_t last_thread_access; before both functions.

I ran the test with these changes 10.000 times, all with success so I do believe it is a fix for it.

This code simply makes llock "register" the current thread ID, and scond try and match it. Only if llock and scond run in sequence, without an llock from a different thread being preempted in between them, can scond succeed.

This code also needs to be in the helper and not, say, translate.c, because in the common case (i.e. without -singlestep), the relevant preemptions happen in the middle of executing the TB and not during the build

As far as I could find, I didn't see any way to do this via QEMU, and as I will be unavailable for the coming week, I leave my research here in case someone wants to take a look into a better solution.

@claziss and @shahab-vahedi if you guys know of a better way to handle this case, possibly via QEMU support, please feel free to comment.

P.S.
The code proposed must go into the already present protected regions (in llock and scond, guarded by mutex locks/unlocks) to prevent race conditions within QEMU itself when accessing last_thread_access

BrunoASMauricio · 2023-03-27T09:48:01Z

This also seems to be the issue for the 'gcc.dg/di-sync-multithread.c' test.

Fix for issue #48 Since an interrupt requires LF to be cleared and there is no hook for usermode interrupts, we save/validate current thread id to detect preemption and clear LF. This only needs to be done where it is relevant for LF (llock/scond variant helpers) Signed-off-by: BrunoASMauricio <[email protected]>

Since an interrupt requires LF to be cleared and there is no hook for usermode interrupts, we save/validate current thread id to detect preemption and clear LF. This only needs to be done where it is relevant for LF (llock/scond variant helpers). Signed-off-by: BrunoASMauricio <[email protected]>

VVIsaev added the bug Something isn't working label Aug 6, 2021

cupertinomiranda self-assigned this Oct 21, 2021

claziss assigned BrunoASMauricio and unassigned cupertinomiranda Jan 27, 2023

kolerov assigned kolerov and unassigned BrunoASMauricio and kolerov Jul 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

linux-user: c11-atomic-exec-4 dejagnu test fails #48

linux-user: c11-atomic-exec-4 dejagnu test fails #48

VVIsaev commented Aug 6, 2021

VVIsaev commented Aug 6, 2021

cupertinomiranda commented Oct 21, 2021

BrunoASMauricio commented Mar 3, 2023 •

edited

Loading

BrunoASMauricio commented Mar 27, 2023

linux-user: c11-atomic-exec-4 dejagnu test fails #48

linux-user: c11-atomic-exec-4 dejagnu test fails #48

Comments

VVIsaev commented Aug 6, 2021

VVIsaev commented Aug 6, 2021

cupertinomiranda commented Oct 21, 2021

BrunoASMauricio commented Mar 3, 2023 • edited Loading

The Problem:

Verify the problem:

BrunoASMauricio commented Mar 27, 2023

BrunoASMauricio commented Mar 3, 2023 •

edited

Loading