Jit: Threads: Add simple RMW operations #280
Conversation
src/jit/MemoryInl.h
Outdated
sljit_emit_op2(compiler, SLJIT_ADD, SLJIT_R1, 0, kContextReg, 0, SLJIT_IMM, OffsetOfContextField(tmp1));
}

sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R0, 0, GET_SOURCE_REG(addr.memArg.arg, instr->requiredReg(0)), 0);
addr.memArg.arg must be a register, so the second argument is not needed. Furthermore, if addr.memArg.arg is SLJIT_R1, the previous code overwrites it.
src/jit/MemoryInl.h
Outdated
case ByteCode::I64AtomicRmw16XchgUOpcode: {
operationSize = SLJIT_MOV_U16;
size = 2;
options |= MemAddress::CheckNaturalAlignment;
I would set MemAddress::CheckNaturalAlignment as the default and set it to 0 in the 8-bit case.
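One way to read this suggestion, as a minimal standalone sketch: the alignment check starts enabled and only the 8-bit case clears it. The enum, struct, and function names below are hypothetical stand-ins for illustration, not the actual walrus definitions.

```cpp
#include <cstdint>

// Hypothetical stand-ins for MemAddress options and the RMW opcodes.
enum MemOptions : uint32_t { CheckNaturalAlignment = 1u << 0 };
enum class RmwWidth { Bits8, Bits16, Bits32, Bits64 };

struct AccessInfo {
    uint32_t size;    // access size in bytes
    uint32_t options; // MemOptions flags
};

// The alignment check is the default; only the 8-bit case turns it off,
// because a single byte is always naturally aligned.
inline AccessInfo accessInfoFor(RmwWidth width)
{
    AccessInfo info{ 0, CheckNaturalAlignment };
    switch (width) {
    case RmwWidth::Bits8:
        info.size = 1;
        info.options = 0;
        break;
    case RmwWidth::Bits16:
        info.size = 2;
        break;
    case RmwWidth::Bits32:
        info.size = 4;
        break;
    case RmwWidth::Bits64:
        info.size = 8;
        break;
    }
    return info;
}
```

With this shape, each wider case only has to set its size, which removes the repeated `options |= ...` line from every case.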
src/jit/MemoryInl.h
Outdated
#endif /* SLJIT_32BIT_ARCHITECTURE */

struct sljit_label* store_failure = sljit_emit_label(compiler);
sljit_emit_atomic_load(compiler, operationSize, tmpReg, GET_SOURCE_REG(addr.memArg.arg, instr->requiredReg(0)));
Note: this will not work on RISCV in the 8/16-bit cases, but we can work on that later.
src/jit/MemoryInl.h
Outdated
}
#endif /* SLJIT_32BIT_ARCHITECTURE */

struct sljit_label* store_failure = sljit_emit_label(compiler);
I would call this restartOnFailure. Please don't use underscores in names.
src/jit/MemoryInl.h
Outdated
struct sljit_label* store_failure = sljit_emit_label(compiler);
sljit_emit_atomic_load(compiler, operationSize, tmpReg, GET_SOURCE_REG(addr.memArg.arg, instr->requiredReg(0)));
sljit_emit_op1(compiler, SLJIT_MOV, dst.arg, dst.argw, tmpReg, 0);
This is also something RISCV doesn't like. Maybe we should add a feature to keep tmpReg on success, if this is not always the case (compare and exchange often updates it on failure, but not on success).
I mean the RISCV doc says:
An LR/SC sequence begins with an LR instruction and ends with an SC instruction. The dynamic
code executed between the LR and SC instructions can only contain instructions from the base ''I''
instruction set, excluding loads, stores, backward jumps, taken backward branches, JALR, FENCE,
and SYSTEM instructions. If the ''C'' extension is supported, then compressed forms of the
aforementioned ''I'' instructions are also permitted.
This operation might be a memory store in this case.
Nice patch!
d4eb1a6 to ee7a837
src/jit/ByteCodeParser.cpp
Outdated
case ByteCode::I64AtomicRmw32XchgUOpcode: {
Instruction* instr = compiler->append(byteCode, Instruction::Atomic, opcode, 2, 1);
#if (defined SLJIT_32BIT_ARCHITECTURE && SLJIT_32BIT_ARCHITECTURE)
if (opcode == ByteCode::I64AtomicRmwAddOpcode
Could we reorganize the code to do this without an if statement? Move these cases below the FALLTHROUGH; statement above, check whether info is zero (the default value), and if so set kIsCallback in 32-bit mode.
#if (defined SLJIT_32BIT_ARCHITECTURE && SLJIT_32BIT_ARCHITECTURE)
static void atomicRmwAdd64(uint64_t* shared_p, uint64_t* value, uint64_t* result)
{
std::atomic<uint64_t>* shared = reinterpret_cast<std::atomic<uint64_t>*>(shared_p);
Is it valid to cast a pointer to a structure? If the implementation has a different layout, this could be a problem. Could we use some constructor? Something like std::atomic<uint64_t> shared(shared_p)? This way the compiler would decide how to turn a pointer into an atomic pointer.
As far as I know, there is no constructor that takes a pointer and creates an atomic pointer; the interpreter also uses reinterpret_cast:
Line 174 in 69fd93a
std::atomic<T>* shared = reinterpret_cast<std::atomic<T>*>(m_buffer + (offset + addend));
These atomic*-related functions are defined as Memory member functions, which perform the same atomic operations together with memory alignment checks.
Is it more efficient to use these special helper functions in JITC instead of reusing Memory's member functions?
Since the alignment check is already done, it should be a bit faster, but since it is only for 32-bit and probably rarely used, we could use those as well. By the way, not all 32-bit CPUs have 64-bit atomic operations (e.g. RISCV), so I have no idea how they do it.
https://github.com/Samsung/walrus/blob/main/src/runtime/Memory.h#L171
I realized we have no access to the memory structure in the callback; we only get the absolute address of the memory location. Shall we add static functions to Memory to do this?
I see.
I think adding more functions to get the address from Memory would be complex too, so the current JITC helper functions seem better :)
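For reference, a minimal sketch of what such a 32-bit callback helper could look like when it receives only absolute addresses, as discussed above. The function name is hypothetical, and the sketch assumes the JIT has already performed the alignment check and that `std::atomic<uint64_t>` has the same size as `uint64_t` on the target (guarded by a `static_assert`); it is not the code from this patch.

```cpp
#include <atomic>
#include <cstdint>

// The cast is only reasonable if the atomic wrapper adds no extra storage.
static_assert(sizeof(std::atomic<uint64_t>) == sizeof(uint64_t),
              "std::atomic<uint64_t> must have the same size as uint64_t");

// Hypothetical helper: atomically adds *value to the 64-bit cell at sharedPtr
// and writes the previous contents of the cell to *result.
// Assumption: sharedPtr is 8-byte aligned (the caller checked alignment).
inline void atomicRmwAdd64Sketch(uint64_t* sharedPtr, const uint64_t* value, uint64_t* result)
{
    std::atomic<uint64_t>* shared = reinterpret_cast<std::atomic<uint64_t>*>(sharedPtr);
    *result = shared->fetch_add(*value);
}
```

Where C++20 is available, `std::atomic_ref<uint64_t>` is the standard way to apply atomic operations to a plain object without a cast; whether the project can rely on C++20 is not stated in this thread.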
src/jit/MemoryInl.h
Outdated
offset = rmwOperation->offset();

Operand* operands = instr->operands();
MemAddress addr(options, instr->requiredReg(0), instr->requiredReg(1), 0);
Preloading the source value would be a good idea (the store operation also does this to reduce the load overhead).
Where does the store operation do this?
src/jit/MemoryInl.h
Outdated
#endif /* SLJIT_32BIT_ARCHITECTURE */

struct sljit_label* restartOnFailure = sljit_emit_label(compiler);
sljit_emit_atomic_load(compiler, operationSize, tmpReg, SLJIT_EXTRACT_REG(addr.memArg.arg));
What about using SLJIT_TMP_DEST_REG instead of tmpReg?
d21975c to 7095cd7
Patch is getting better.
src/jit/ByteCodeParser.cpp
Outdated
@@ -230,6 +230,14 @@ static bool isFloatGlobal(uint32_t globalIndex, Module* module)

#endif /* SLJIT_32BIT_ARCHITECTURE */

#if defined(ENABLE_EXTENDED_FEATURES)
#define OPERAND_TYPE_LIST_ATOMIC \
This is a theoretical question: since the name "extended features" covers everything, I would call this OPERAND_TYPE_LIST_EXTENDED. The name OTAtomicRmwI32 refers to atomic operations anyway.
src/jit/MemoryInl.h
Outdated
sljit_emit_icall(compiler, SLJIT_CALL, SLJIT_ARGS3V(P, P, P), SLJIT_IMM, faddr);

sljit_emit_op1(compiler, SLJIT_MOV, dstArgPair.arg1, dstArgPair.arg1w, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp2) + WORD_LOW_OFFSET);
sljit_emit_op1(compiler, SLJIT_MOV, dstArgPair.arg2, dstArgPair.arg2w, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp2) + WORD_HIGH_OFFSET);
Another idea: we should do this only if dstArgPair.arg1 != SLJIT_MEM1(kFrameReg). Otherwise we can put the result address directly into the SLJIT_R2 register, and this copy is unnecessary.
}
#endif /* SLJIT_32BIT_ARCHITECTURE */

addr.check(compiler, operands, offset, size);
Something is not right. The sourceReg is passed as the last argument of MemAddress addr(), but you always pass 0 there. See how this code works:
https://github.com/Samsung/walrus/blob/main/src/jit/MemoryInl.h#L250
src/jit/MemoryInl.h
Outdated
struct sljit_label* restartOnFailure = sljit_emit_label(compiler);
sljit_emit_atomic_load(compiler, operationSize, SLJIT_TMP_DEST_REG, SLJIT_EXTRACT_REG(addr.memArg.arg));
sljit_emit_op1(compiler, SLJIT_MOV, dst.arg, dst.argw, SLJIT_TMP_DEST_REG, 0);
Please move this code after sljit_set_label. I mentioned that memory access between an atomic load and store is not necessarily supported. Now sljit_emit_atomic_store keeps the tmp reg value when the operation is successful, so you can safely store it.
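The ordering requested here can be illustrated with a portable C++ analogue (not sljit code): the retry loop contains only the load, the computation, and the conditional store, and the old value is written to its destination only after the store has succeeded. The function name and parameters are hypothetical, and `compare_exchange_weak` stands in for the atomic load/store pair.

```cpp
#include <atomic>
#include <cstdint>

// Analogue of an atomic RMW add with a restart-on-failure loop.
// Nothing inside the loop touches other memory; *dst is written afterwards.
inline void rmwAddSketch(std::atomic<uint32_t>& cell, uint32_t operand, uint32_t* dst)
{
    uint32_t old = cell.load();
    // On failure, compare_exchange_weak refreshes `old` with the current
    // value of the cell, so the loop simply restarts with fresh data.
    while (!cell.compare_exchange_weak(old, old + operand)) {
    }
    // Success: `old` still holds the value that was replaced, so it can be
    // stored to the destination outside the load/store sequence.
    *dst = old;
}
```

Keeping the destination write outside the loop matters because the destination may itself be a memory location, which is exactly the kind of access the quoted RISCV rule excludes between LR and SC.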
7095cd7 to e906558
src/jit/MemoryInl.h
Outdated
sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp1) + WORD_LOW_OFFSET, srcArgPair.arg1, srcArgPair.arg1w);
sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_MEM1(kContextReg), OffsetOfContextField(tmp1) + WORD_HIGH_OFFSET, srcArgPair.arg2, srcArgPair.arg2w);

ASSERT(SLJIT_IS_REG(srcArgPair.arg1) || SLJIT_IS_IMM(srcArgPair.arg1));
This assert is unnecessary. It can be anything.
src/jit/MemoryInl.h
Outdated
#endif /* SLJIT_32BIT_ARCHITECTURE */
JITArg src(operands + 1);
dst = JITArg(operands + 2);
srcReg = GET_SOURCE_REG(src.arg, instr->requiredReg(1));
I don't understand something. srcReg is set to instr->requiredReg(1).
Here sourceReg is the last argument, which is set to instr->requiredReg(2).
https://github.com/Samsung/walrus/blob/main/src/jit/MemoryInl.h#L37
Does not include RmwCmpxchg Signed-off-by: Máté Tokodi [email protected]
e906558 to 63dcece
LGTM
@clover2123 what do you think of this patch?
LGTM
Add simple RMW operations: Add, Sub, And, Or, Xor, Xchg
Does not include RmwCmpxchg