From 1467c48039cbf6b83dcb88c572f18dbb78ad41c1 Mon Sep 17 00:00:00 2001
From: github-actions This page is a snapshot from the LWG issues list, see the Library Active Issues List for more information and the meaning of New status. Section: 28.5.4.5 [rand.eng.philox] Status: New
+ Submitter: Ilya A. Burylov Opened: 2024-08-06 Last modified: 2024-08-10 Priority: Not Prioritized
+ View all issues with New status. Discussion:
+The P2075R6 proposal introduced the Philox engine and described the algorithm closely
+following the original paper
+(further Random123sc11).
+
+4134. Issue with Philox algorithm specification
+
Random123sc11 refers to the permutation step as "the inputs are permuted using the Threefish +N-word P-box", thus the P2075R6 permutation table ([tab:rand.eng.philox.f]) is taken from +Threefish algorithm. But the permutation for N=4 in this table does not match the reference +implementations. It's worth noting that while Random123sc11 described the Philox algorithm for N=8 +and N=16, there are no known reference implementations of it either provided by authors or +implemented by other libraries. We proposed to drop N=8 and N=16 for now and update +the permutation indices for N=4 to match the reference implementation. Note: the proposed resolution +allows extending N > 4 cases in the future.
The original paper describes the S-box update for X values in terms of L'
and
+R'
but does not clarify their ordering as part of the counter. In order to match Random123sc11
+reference implementation the ordering should be swapped.
Philox alias templates should be updated, because the current ordering of constants matches the +specific optimization in the reference Random123sc11 implementation and not the generic algorithm +description.
+All proposed modifications below are confirmed by: +
+Philox algorithm coauthor Mark Moraes who is planning to publish errata for the original Random123sc11 +Philox paper.
Matt Stephanson, who originally found the mismatch in P2075R6
Proposed resolution:
++This wording is relative to N4988. +
+ +Modify 28.5.4.5 [rand.eng.philox], [tab:rand.eng.philox.f] as indicated (This effectively reduces +16 data columns to 4 data columns and 4 data rows to 2 data rows):
+ ++++
+ +Table 101 — Values for the word permutation fn(j) [tab:rand.eng.philox.f] ++ +fn(j) ++ j
+ + +0 +1 +2 +3 ++ 4+ 5+ 6+ 7+ 8+ 9+ 10+ 11+ 12+ 13+ 14+ 15+ +n ++ +2 +0 +1 ++ + + + + + + + + + + + + + + +4 +2 +01 +30 +23 +1+ + + + + + + + + + + + + ++ 8+ 2+ 1+ 4+ 7+ 6+ 5+ 0+ 3+ + + + + + + + + ++ 16+ 0+ 9+ 2+ 13+ 6+ 11+ 4+ 15+ 10+ 7+ 12+ 3+ 14+ 5+ 8+ 1
Modify 28.5.4.5 [rand.eng.philox] as indicated:
+ ++++-4- […] +
++
+- +
(4.1) — […]
- +
(4.2) — The following computations are applied to the elements of the V sequence:
++++X2k+0 = mulhi
+X2k+1 = mullomullo(V2k+1,Mk,w) +xor keyqk xor V2k+1 +mulhi(V2k+1,Mk,w) +xor keyqk xor V2k+ ++-5- […] +
+-6- Mandates: + ++
+ +- +
(6.1) — […]
- +
(6.2) —
n == 2 || n == 4
is|| n == 8 || n == 16true
, and- +
(6.3) — […]
- +
(6.4) — […]
Modify 28.5.6 [rand.predef] as indicated:
+ ++++using philox4x32 = + philox_engine<uint_fast32_t, 32, 4, 10, + 0xCD9E8D57+0xD2511F53, 0x9E3779B9, 0xD2511F530xCD9E8D57, 0xBB67AE85>; ++++-11- Required behavior: The 10000th consecutive invocation a default-constructed +object of type
+philox4x32
produces the value1955073260
. ++using philox4x64 = + philox_engine<uint_fast64_t, 64, 4, 10, + 0xCA5A826395121157+0xD2E7470EE14C6C93, 0x9E3779B97F4A7C15, 0xD2E7470EE14C6C930xCA5A826395121157, 0xBB67AE8584CAA73B>; ++++-12- Required behavior: The 10000th consecutive invocation a default-constructed +object of type
+philox4x64
produces the value 3409172418970261260. +
Revised 2024-08-06 at 17:42:00 UTC +
Revised 2024-08-10 at 13:24:08 UTC
Reference ISO/IEC IS 14882:2020(E)
Also see:
@@ -199,22 +199,22 @@Section: 28.5.4.5 [rand.eng.philox] Status: New + Submitter: Ilya A. Burylov Opened: 2024-08-06 Last modified: 2024-08-10
+Priority: Not Prioritized +
+View all issues with New status.
+Discussion:
++The P2075R6 proposal introduced the Philox engine and described the algorithm closely +following the original paper +(further Random123sc11). +
+Matt Stephanson implemented P2075R6 and the 10000'th number did not match. Further investigation revealed +several places in Random123sc11 algorithm description, which either deviate from the reference implementation +written by Random123sc11 authors or loosely defined, which opens the way for different interpretations. + +All major implementations of the Philox algorithm (NumPy, Intel MKL, Nvidia cuRAND, etc.) match the +reference implementation written by Random123sc11 authors and we propose to align wording with that. + +The rationale of proposed changes: + +Random123sc11 refers to the permutation step as "the inputs are permuted using the Threefish +N-word P-box", thus the P2075R6 permutation table ([tab:rand.eng.philox.f]) is taken from +Threefish algorithm. But the permutation for N=4 in this table does not match the reference +implementations. It's worth noting that while Random123sc11 described the Philox algorithm for N=8 +and N=16, there are no known reference implementations of it either provided by authors or +implemented by other libraries. We proposed to drop N=8 and N=16 for now and update +the permutation indices for N=4 to match the reference implementation. Note: the proposed resolution +allows extending N > 4 cases in the future.
The original paper describes the S-box update for X values in terms of L'
and
+R'
but does not clarify their ordering as part of the counter. In order to match Random123sc11
+reference implementation the ordering should be swapped.
Philox alias templates should be updated, because the current ordering of constants matches the +specific optimization in the reference Random123sc11 implementation and not the generic algorithm +description.
+All proposed modifications below are confirmed by: +
+Philox algorithm coauthor Mark Moraes who is planning to publish errata for the original Random123sc11 +Philox paper.
Matt Stephanson, who originally found the mismatch in P2075R6
Proposed resolution:
++This wording is relative to N4988. +
+ +Modify 28.5.4.5 [rand.eng.philox], [tab:rand.eng.philox.f] as indicated (This effectively reduces +16 data columns to 4 data columns and 4 data rows to 2 data rows):
+ ++++
+ +Table 101 — Values for the word permutation fn(j) [tab:rand.eng.philox.f] ++ +fn(j) ++ j
+ + +0 +1 +2 +3 ++ 4+ 5+ 6+ 7+ 8+ 9+ 10+ 11+ 12+ 13+ 14+ 15+ +n ++ +2 +0 +1 ++ + + + + + + + + + + + + + + +4 +2 +01 +30 +23 +1+ + + + + + + + + + + + + ++ 8+ 2+ 1+ 4+ 7+ 6+ 5+ 0+ 3+ + + + + + + + + ++ 16+ 0+ 9+ 2+ 13+ 6+ 11+ 4+ 15+ 10+ 7+ 12+ 3+ 14+ 5+ 8+ 1
Modify 28.5.4.5 [rand.eng.philox] as indicated:
+ ++++-4- […] +
++
+- +
(4.1) — […]
- +
(4.2) — The following computations are applied to the elements of the V sequence:
++++X2k+0 = mulhi
+X2k+1 = mullomullo(V2k+1,Mk,w) +xor keyqk xor V2k+1 +mulhi(V2k+1,Mk,w) +xor keyqk xor V2k+ ++-5- […] +
+-6- Mandates: + ++
+ +- +
(6.1) — […]
- +
(6.2) —
n == 2 || n == 4
is|| n == 8 || n == 16true
, and- +
(6.3) — […]
- +
(6.4) — […]
Modify 28.5.6 [rand.predef] as indicated:
+ ++++using philox4x32 = + philox_engine<uint_fast32_t, 32, 4, 10, + 0xCD9E8D57+0xD2511F53, 0x9E3779B9, 0xD2511F530xCD9E8D57, 0xBB67AE85>; ++++-11- Required behavior: The 10000th consecutive invocation a default-constructed +object of type
+philox4x32
produces the value1955073260
. ++using philox4x64 = + philox_engine<uint_fast64_t, 64, 4, 10, + 0xCA5A826395121157+0xD2E7470EE14C6C93, 0x9E3779B97F4A7C15, 0xD2E7470EE14C6C930xCA5A826395121157, 0xBB67AE8584CAA73B>; ++++-12- Required behavior: The 10000th consecutive invocation a default-constructed +object of type
+philox4x64
produces the value 3409172418970261260. +