From 056c52ec4200c3f7838e6a16855f3641e5f55df6 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 13:55:18 -0500
Subject: [PATCH 01/16] try another layout

---
 zacas.adoc | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 30fb4cf..20d7b75 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -30,22 +30,6 @@ operation performed by `AMOCAS.W` for RV32 is as follows:
     X(rd) = temp
 ----
 
-For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
-`rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
-and if the comparison is bitwise equal, then stores the lower 32 bits of the
-value held in `rs2` to the original address in `rs1`. The 32-bit value loaded
-from memory is sign-extended and is placed into register `rd`. The operation
-performed by `AMOCAS.W` for RV64 is as follows:
-
-[listing]
------
-    temp[31:0] = mem[X(rs1)]
-    if ( temp[31:0] == X(rd)[31:0] )
-        mem[X(rs1)] = X(rs2)[31:0]
-    endif
-    X(rd) = SignExtend(temp[31:0])
------
-
 `AMOCAS.D` is similar to `AMOCAS.W` but operates on 64-bit data values.
 
 For RV32, `AMOCAS.D` atomically loads 64-bits of a data value from address in
@@ -78,6 +62,22 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
     endif
 ----
 
+For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
+`rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
+and if the comparison is bitwise equal, then stores the lower 32 bits of the
+value held in `rs2` to the original address in `rs1`. The 32-bit value loaded
+from memory is sign-extended and is placed into register `rd`. The operation
+performed by `AMOCAS.W` for RV64 is as follows:
+
+[listing]
+----
+    temp[31:0] = mem[X(rs1)]
+    if ( temp[31:0] == X(rd)[31:0] )
+        mem[X(rs1)] = X(rs2)[31:0]
+    endif
+    X(rd) = SignExtend(temp[31:0])
+----
+
 For RV64, `AMOCAS.D` atomically loads 64-bits of a data value from address in
 `rs1`, compares the loaded value to a 64-bit value held in `rd`, and if the
 comparison is bitwise equal, then stores the 64-bit value held in `rs2` to the
@@ -91,6 +91,7 @@ original address in `rs1`. The value loaded from memory is placed into register
     endif
     X(rd) = temp
 ----
+
 `AMOCAS.Q` (RV64 only) atomically loads 128-bits of a data value from address
 in `rs1`, compares the loaded value to a 128-bit value held in a register pair
 consisting of `rd` and `rd+1`, and if the comparison is bitwise equal, then

From c8f5a4471a31f0c68221cc25e5e4fb97e2a5b096 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:00:15 -0500
Subject: [PATCH 02/16] try another layout

---
 zacas.adoc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/zacas.adoc b/zacas.adoc
index 20d7b75..b385c4c 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -56,6 +56,7 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         mem[X(rs1)+0] = swap0
         mem[X(rs1)+4] = swap1
     endif
+
     if ( rd != x0 )
         X(rd) = temp0
         X(rd+1) = temp1

From 848bb6049a092c53af00fafea06aad8924da9336 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:08:47 -0500
Subject: [PATCH 03/16] try another layout

---
 zacas.adoc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index b385c4c..967476a 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -56,7 +56,6 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         mem[X(rs1)+0] = swap0
         mem[X(rs1)+4] = swap1
     endif
-
     if ( rd != x0 )
         X(rd) = temp0
         X(rd+1) = temp1
@@ -78,7 +77,7 @@ performed by `AMOCAS.W` for RV64 is as follows:
     endif
     X(rd) = SignExtend(temp[31:0])
 ----
-
+ +
 For RV64, `AMOCAS.D` atomically loads 64-bits of a data value from address in
 `rs1`, compares the loaded value to a 64-bit value held in `rd`, and if the
 comparison is bitwise equal, then stores the 64-bit value held in `rs2` to the

From e88b6538fcb952f187b92d886072846831781829 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:13:42 -0500
Subject: [PATCH 04/16] try another layout

---
 zacas.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/zacas.adoc b/zacas.adoc
index 967476a..f5da587 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -61,6 +61,7 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd+1) = temp1
     endif
 ----
+ +
 
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
 `rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
@@ -77,7 +78,6 @@ performed by `AMOCAS.W` for RV64 is as follows:
     endif
     X(rd) = SignExtend(temp[31:0])
 ----
- +
 For RV64, `AMOCAS.D` atomically loads 64-bits of a data value from address in
 `rs1`, compares the loaded value to a 64-bit value held in `rd`, and if the
 comparison is bitwise equal, then stores the 64-bit value held in `rs2` to the

From de84954143eb920de88af0e58d5d3ff371426480 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:16:13 -0500
Subject: [PATCH 05/16] try another layout

---
 zacas.adoc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/zacas.adoc b/zacas.adoc
index f5da587..fb5b513 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -61,6 +61,7 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd+1) = temp1
     endif
 ----
+ +
  +
 
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in

From dd440e29bef7e70b02fa71e83d7e6530be1ec6ca Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:20:24 -0500
Subject: [PATCH 06/16] try another layout

---
 zacas.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/zacas.adoc b/zacas.adoc
index fb5b513..4aa6c8a 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -61,7 +61,7 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd+1) = temp1
     endif
 ----
-
+[%hardbreaks]
  +
 
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in

From ae94b229500388d75fedeb9f219194c8efa9eaf0 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:25:54 -0500
Subject: [PATCH 07/16] try another layout

---
 zacas.adoc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 4aa6c8a..f752df9 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -62,8 +62,6 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
     endif
 ----
 [%hardbreaks]
- +
-
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
 `rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
 and if the comparison is bitwise equal, then stores the lower 32 bits of the

From ba5a090e9cfaf5e3ae92a1429dbc890c0c9690a1 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:33:50 -0500
Subject: [PATCH 08/16] try another layout

---
 zacas.adoc | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index f752df9..fa380ab 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -45,7 +45,6 @@ register of a destination register pair is `x0`, then the entire register result
 is discarded and neither destination register is written.
 The operation performed by `AMOCAS.D` for RV32 is as follows:
 [listing]
-----
     temp0 = mem[X(rs1)+0]
     temp1 = mem[X(rs1)+4]
     comp0 = (rd == x0) ? 0 : X(rd)
@@ -60,8 +59,7 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd) = temp0
         X(rd+1) = temp1
     endif
-----
-[%hardbreaks]
+
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
 `rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
 and if the comparison is bitwise equal, then stores the lower 32 bits of the
@@ -70,26 +68,23 @@ from memory is sign-extended and is placed into register `rd`. The operation
 performed by `AMOCAS.W` for RV64 is as follows:
 
 [listing]
-----
     temp[31:0] = mem[X(rs1)]
     if ( temp[31:0] == X(rd)[31:0] )
         mem[X(rs1)] = X(rs2)[31:0]
     endif
     X(rd) = SignExtend(temp[31:0])
-----
+
 For RV64, `AMOCAS.D` atomically loads 64-bits of a data value from address in
 `rs1`, compares the loaded value to a 64-bit value held in `rd`, and if the
 comparison is bitwise equal, then stores the 64-bit value held in `rs2` to the
 original address in `rs1`. The value loaded from memory is placed into register
 `rd`. The operation performed by `AMOCAS.D` for RV64 is as follows:
 [listing]
-----
     temp = mem[X(rs1)]
     if ( temp == X(rd) )
         mem[X(rs1)] = X(rs2)
     endif
     X(rd) = temp
-----
 
 `AMOCAS.Q` (RV64 only) atomically loads 128-bits of a data value from address
 in `rs1`, compares the loaded value to a 128-bit value held in a register pair
@@ -104,7 +99,6 @@ destination register pair is `x0`, then the entire register result is discarded
 and neither destination register is written. The operation performed by
 `AMOCAS.Q` is as follows:
 [listing]
-----
     temp0 = mem[X(rs1)+0]
     temp1 = mem[X(rs1)+8]
     comp0 = (rd == x0) ? 0 : X(rd)
@@ -119,7 +113,7 @@ and neither destination register is written. The operation performed by
         X(rd) = temp0
         X(rd+1) = temp1
     endif
-----
+
 [NOTE]
 ====
 For a future RV128 extension, `AMOCAS.Q` would encode a single XLEN=128 register
@@ -148,7 +142,6 @@ data value for its comparison.
 The following example code sequence illustrates the use of `AMOCAS.D` in a RV32
 implementation to atomically increment a 64-bit counter.
 [listing]
-----
     # a0 - address of the counter.
 increment:
     lw a2, (a0) # Load current counter value using
@@ -163,14 +156,13 @@ retry:
     bne a2, a6, retry # If amocas.d failed then retry
     bne a3, a7, retry # using current values loaded by amocas.d.
     ret
-----
+
 The following example code sequence illustrates the use of `AMOCAS.Q` to
 implement the _enqueue_ operation for a non-blocking concurrent queue using the
 algorithm outlined in cite:[queue]. The algorithm atomically operates on a
 pointer and its associated modification counter using the `AMOCAS.Q` instruction
 to avoid the ABA problem.
 [listing]
-----
 # Enqueue operation of a non-blocking concurrent queue.
 # Data structures used by the queue:
 # structure pointer_t {ptr: node_t *, count: uint64_t}
@@ -202,7 +194,7 @@ move_tail: # Tail was not pointing to the last node
     addi a3, a3, 1 # Try to swing Tail to the next node
     amocas.q.aqrl a6, a2, (a0)
     j enqueue # Retry
-----
+
 ====
 
 == Additional AMO PMAs

From 407cd14077b0e0ed798374259c94de2e6b95b9a1 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:47:42 -0500
Subject: [PATCH 09/16] try another layout

---
 zacas.adoc | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/zacas.adoc b/zacas.adoc
index fa380ab..72c5c36 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -161,7 +161,13 @@ The following example code sequence illustrates the use of `AMOCAS.Q` to
 implement the _enqueue_ operation for a non-blocking concurrent queue using the
 algorithm outlined in cite:[queue]. The algorithm atomically operates on a
 pointer and its associated modification counter using the `AMOCAS.Q` instruction
-to avoid the ABA problem.
+to avoid the ABA problem.+
+ +
+ +
+ +
+ +
+ +
+
 [listing]
 # Enqueue operation of a non-blocking concurrent queue.
 # Data structures used by the queue:

From 65a2fe81694052d61ca6cb63bfa23af05e30b0ac Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:50:13 -0500
Subject: [PATCH 10/16] try another layout

---
 zacas.adoc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 72c5c36..06d740b 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -157,16 +157,16 @@ retry:
     bne a3, a7, retry # using current values loaded by amocas.d.
     ret
 
-The following example code sequence illustrates the use of `AMOCAS.Q` to
-implement the _enqueue_ operation for a non-blocking concurrent queue using the
-algorithm outlined in cite:[queue]. The algorithm atomically operates on a
-pointer and its associated modification counter using the `AMOCAS.Q` instruction
-to avoid the ABA problem.+
  +
  +
  +
  +
  +
+The following example code sequence illustrates the use of `AMOCAS.Q` to
+implement the _enqueue_ operation for a non-blocking concurrent queue using the
+algorithm outlined in cite:[queue]. The algorithm atomically operates on a
+pointer and its associated modification counter using the `AMOCAS.Q` instruction
+to avoid the ABA problem.
 
 [listing]
 # Enqueue operation of a non-blocking concurrent queue.

From a8fd96ab7fb9adfb95cb108a7ffb76971b5cf08f Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:52:58 -0500
Subject: [PATCH 11/16] try another layout

---
 zacas.adoc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/zacas.adoc b/zacas.adoc
index 06d740b..0a8acd8 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -157,11 +157,14 @@ retry:
     bne a3, a7, retry # using current values loaded by amocas.d.
     ret
 
+
  +
  +
  +
  +
  +
+
+
 The following example code sequence illustrates the use of `AMOCAS.Q` to
 implement the _enqueue_ operation for a non-blocking concurrent queue using the
 algorithm outlined in cite:[queue]. The algorithm atomically operates on a

From 42d20e18145709794870a0e0b2157c484c37f966 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 14:55:50 -0500
Subject: [PATCH 12/16] try another layout

---
 zacas.adoc | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 0a8acd8..9cd5587 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -158,11 +158,14 @@ retry:
     ret
 
 
- +
- +
- +
- +
- +
+
+
+
+
+
+
+
+
 
 
 The following example code sequence illustrates the use of `AMOCAS.Q` to

From cf2e929dff953237fca6de94b7fd1b51bef061b0 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 15:02:58 -0500
Subject: [PATCH 13/16] try another layout

---
 zacas.adoc | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 9cd5587..b9af665 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -156,18 +156,18 @@ retry:
     bne a2, a6, retry # If amocas.d failed then retry
     bne a3, a7, retry # using current values loaded by amocas.d.
     ret
+====
 
 
+ +
+ +
+ +
+ +
+ +
 
 
-
-
-
-
-
-
-
-
+[NOTE]
+====
 The following example code sequence illustrates the use of `AMOCAS.Q` to
 implement the _enqueue_ operation for a non-blocking concurrent queue using the
 algorithm outlined in cite:[queue]. The algorithm atomically operates on a

From 7c16421672af56fbec9903ed3d0ad30623efe5b5 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 15:09:19 -0500
Subject: [PATCH 14/16] try another layout

---
 zacas.adoc | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index b9af665..16ee047 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -158,13 +158,7 @@ retry:
     ret
 ====
 
-
- +
- +
- +
- +
- +
-
+<<<
 
 [NOTE]
 ====

From 953bdc838d074e5172020122f6f3c1036c173845 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 15:16:04 -0500
Subject: [PATCH 15/16] try another layout

---
 zacas.adoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/zacas.adoc b/zacas.adoc
index 16ee047..06e06a1 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -60,6 +60,8 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd+1) = temp1
     endif
 
+<<<
+
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
 `rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
 and if the comparison is bitwise equal, then stores the lower 32 bits of the

From 3884d70db5e4934967def22540c7a135e3ca3787 Mon Sep 17 00:00:00 2001
From: Ved Shanbhogue
Date: Mon, 10 Jul 2023 15:24:43 -0500
Subject: [PATCH 16/16] try another layout

---
 zacas.adoc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/zacas.adoc b/zacas.adoc
index 06e06a1..16ee047 100644
--- a/zacas.adoc
+++ b/zacas.adoc
@@ -60,8 +60,6 @@ The operation performed by `AMOCAS.D` for RV32 is as follows:
         X(rd+1) = temp1
     endif
 
-<<<
-
 For RV64, `AMOCAS.W` atomically loads a 32-bit data value from address in
 `rs1`, compares the loaded value to the lower 32 bits of the value held in `rd`,
 and if the comparison is bitwise equal, then stores the lower 32 bits of the