Assuming other work-items progress without having a WG barrier in prm/core/memory/memfence/basic/global_u16/st_memfence_screl_wave__ld_memfence_scacq_wave/1_4x1x1_1x1x1
#15 · Open · pjaaskel opened this issue on Feb 8, 2016 · 0 comments
prm/core/memory/memfence/basic/global_u16/st_memfence_screl_wave__ld_memfence_scacq_wave/1_4x1x1_1x1x1 assumes there's forward progress between work-items even if there is no barrier in the kernel. AFAIU this is undefined behavior and a deadlock is possible.
module &sample:1:0:$base:$large:$near;

prog global_u16 &global_var = 0;
prog global_u64 &global_flag;

prog kernel &test_kernel(
    kernarg_u64 %output,
    kernarg_u64 %input)
{
    ld_kernarg_align(8)_u64 $d0, [%input];
    workitemflatabsid_u64 $d1;
    mad_u64 $d2, $d1, 2, $d0;
    ld_global_align(2)_u16 $s1, [$d2];
    workitemflatabsid_u64 $d3;
    div_u64 $d3, $d3, WAVESIZE;
    cmp_ne_b1_u64 $c0, $d3, 3;
    ; All but "wave id" 3 skip the store
    cbr_b1 $c0, @skip_store;
    ; Wave id 3 should write the 1 to global_var here to ...
    st_global_align(2)_u16 $s1, [&global_var];
    mov_b64 $d4, 1;
    memfence_screl_wave;
    atomicnoret_st_global_rlx_system_b64 [&global_flag], $d4;
    br @skip_memfence;
@skip_store:
    atomic_ld_global_rlx_system_b64 $d5, [&global_flag];
    cmp_ne_b1_u64 $c0, $d5, 1;
    ; ... release the other work-items spinning in this loop.
    cbr_b1 $c0, @skip_store;
    memfence_scacq_wave;
@skip_memfence:
    ld_global_align(2)_u16 $s0, [&global_var];
    ld_kernarg_align(8)_u64 $d6, [%output];
    mad_u64 $d7, $d1, 2, $d6;
    st_global_align(2)_u16 $s0, [$d7];
};
An fbarrier that is reached both inside the spin loop and on the "wave id 3" path should make the test case well-defined.
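To illustrate the hazard, here is a rough Python sketch of the spin-wait pattern under two hypothetical schedulers. It is a toy model, not HSA semantics: the names `wave` and `run`, the `fair` flag, the step limit, and the stored value 42 are all assumptions made for illustration. Each `yield` marks a point where a scheduler could switch to another wave.

```python
from typing import Dict, Iterator, List


def wave(wave_id: int, writer_id: int, mem: Dict) -> Iterator[None]:
    """Model one wave; every yield is a possible scheduling point."""
    if wave_id == writer_id:
        mem["global_var"] = 42        # hypothetical payload (the st_global)
        yield
        mem["global_flag"] = 1        # release fence + atomic flag store
    else:
        while mem["global_flag"] != 1:  # spin loop at @skip_store
            yield
    mem["out"][wave_id] = mem["global_var"]


def run(num_waves: int, writer_id: int, fair: bool, max_steps: int = 1000) -> bool:
    """Return True if every wave finishes within max_steps scheduling steps."""
    mem: Dict = {"global_var": 0, "global_flag": 0, "out": {}}
    pending: List[Iterator[None]] = [
        wave(i, writer_id, mem) for i in range(num_waves)
    ]
    steps = 0
    while pending and steps < max_steps:
        try:
            next(pending[0])
            steps += 1
            if fair:
                # Round-robin: move the wave that just ran to the back.
                pending.append(pending.pop(0))
            # Unfair: keep running the same wave until it finishes.
        except StopIteration:
            pending.pop(0)
    return not pending


if __name__ == "__main__":
    print("fair scheduler, writer is wave 3:  ", run(4, 3, fair=True))
    print("unfair scheduler, writer is wave 3:", run(4, 3, fair=False))
```

With the round-robin scheduler the writer eventually runs and releases the spinning waves. With the run-to-completion scheduler, wave 0 spins forever because wave 3 never gets a turn, so the outcome depends on a forward-progress guarantee that the kernel itself does not establish.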