provide Standard for x86 __m128/256i on stable Rust, add 128xN/sizexN SIMD types #1162

TheIronBorn · 2021-08-20T21:52:04Z

No description provided.

vks · 2021-08-23T15:02:57Z

Looks good! I wonder how to document this; it is not very discoverable at the moment. Should we document it in the README?

TheIronBorn · 2021-08-23T22:33:37Z

Hmm. We don't have any documentation on Standard for SIMD types. We could probably mention the features there, and perhaps under Uniform as well. There's also the experimental rustdoc cfg feature, of course.

dhardy

The use of unsafe needs attention; after that I'd like to do another review.

src/distributions/integer.rs

newpavlov · 2021-08-24T12:29:57Z

I don't think we need the SampleNativeEndian trait. Instead, it should be enough to implement intrinsic_native_le_impl directly by copying simd_impl without the to_le call. The amount of code duplication will be minuscule.

TheIronBorn · 2021-08-25T00:50:10Z

The extra internal NE trait might reduce code duplication for future architectures, but the duplication is probably still minimal.

TheIronBorn · 2021-09-05T21:55:08Z

packed_simd fails on the latest nightly, which is exactly why this is useful.

TheIronBorn · 2021-09-07T20:40:19Z

Noticed we mention SIMD Standard and Uniform here https://rust-random.github.io/book/guide-dist.html#uniform-sampling-by-type

dhardy · 2021-09-11T10:08:03Z

src/distributions/integer.rs

+#[cfg(target_arch = "x86")] use core::arch::x86::*;
+#[cfg(target_arch = "x86_64")] use core::arch::x86_64::*;


We only want two items, right? I'm not so keen on using glob imports.

4 items now. Added 2 setzero intrinsics

True, though if you make the change below those will go away.

dhardy · 2021-09-11T10:22:14Z

src/distributions/integer.rs

+    (__m128i, _mm_setzero_si128),
+    (__m256i, _mm256_setzero_si256)


I'm baffled: (1) the types exist without additional target features while the constructors require target features (sse2 / avx), and (2) the constructors are unsafe. Maybe I should learn a little more about SIMD here...

Stupid questions, but:

This code will fail to compile without sse2 / avx, right?

Is there a reason we shouldn't simply transmute an array with suitable alignment? Especially since we're mostly doing that with the pointer-cast anyway.

AFAIK there are no dedicated instructions for the setzero intrinsics. Usually they get compiled down either to XORing a register with itself or to writing zero bytes to memory. I am also a bit surprised that they are gated on sse2/avx, while the types themselves are not.

I agree that transmuting arrays would be a simpler solution, but instead of creating an array with proper alignment I think it will be easier to write something like this:

    let mut buf = [0u8; mem::size_of::<$ty>()];
    rng.fill_bytes(&mut buf);
    unsafe { mem::transmute_copy(&buf) }

transmute_copy will handle the alignment requirements and in practice should be properly optimized out by the compiler.
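A minimal sketch of the suggestion above, written so it compiles on any target. `Vec128` and `CounterRng` are hypothetical stand-ins (for `__m128i` and for an RNG providing `fill_bytes`); they are not part of rand or `core::arch`. The point is only that the byte buffer has alignment 1 while the destination type has alignment 16, and `mem::transmute_copy` copes with that because it reads the source without assuming the destination's alignment.

```rust
use core::mem;

/// Hypothetical stand-in for `__m128i`: 16 bytes, 16-byte aligned,
/// and every bit pattern is a valid value.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
struct Vec128([u8; 16]);

/// Hypothetical stand-in for an RNG's `fill_bytes`: writes an
/// incrementing counter so the result is easy to check.
struct CounterRng(u8);

impl CounterRng {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = self.0;
            self.0 = self.0.wrapping_add(1);
        }
    }
}

fn sample_vec128(rng: &mut CounterRng) -> Vec128 {
    // A plain byte array only has alignment 1.
    let mut buf = [0u8; mem::size_of::<Vec128>()];
    rng.fill_bytes(&mut buf);
    // SAFETY: `buf` is exactly `size_of::<Vec128>()` bytes and any bit
    // pattern is valid for `Vec128`. `transmute_copy` copies the bytes
    // out of `buf`, so the stricter alignment of `Vec128` is not an issue.
    unsafe { mem::transmute_copy::<[u8; 16], Vec128>(&buf) }
}
```

Calling `sample_vec128(&mut CounterRng(0))` yields a value whose bytes are 0 through 15 in memory order. Whether the same shape is appropriate for the real x86 types is a judgment for the PR; this only illustrates the mechanism.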

It will compile just fine, but without sse/avx it will fail to run.

@TheIronBorn
Using intrinsics without properly checking required target features (either at compile or at run time) is considered UB.
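For illustration, one common way to satisfy that requirement at run time is to put the intrinsic call in a `#[target_feature]` function and call it only after `is_x86_feature_detected!` succeeds. This is a sketch, not what the PR does; the function name `zeroed_m256i_bytes` and the choice to return raw bytes are assumptions made to keep the example small and portable.

```rust
/// Returns the bytes of a zeroed 256-bit vector, or `None` when AVX is
/// not available on the running CPU. (Hypothetical helper for illustration.)
#[cfg(target_arch = "x86_64")]
fn zeroed_m256i_bytes() -> Option<[u8; 32]> {
    use core::arch::x86_64::{__m256i, _mm256_setzero_si256};

    // The attribute allows AVX instructions inside this function; calling
    // it on a CPU without AVX would be undefined behaviour.
    #[target_feature(enable = "avx")]
    unsafe fn zero_avx() -> [u8; 32] {
        unsafe {
            let v: __m256i = _mm256_setzero_si256();
            // Same size (32 bytes), and any bit pattern is a valid byte array.
            core::mem::transmute(v)
        }
    }

    if std::is_x86_feature_detected!("avx") {
        // SAFETY: AVX support was confirmed by the run-time check above.
        Some(unsafe { zero_avx() })
    } else {
        None
    }
}

/// Fallback for other architectures, where the x86 intrinsics do not exist.
#[cfg(not(target_arch = "x86_64"))]
fn zeroed_m256i_bytes() -> Option<[u8; 32]> {
    None
}
```

The compile-time alternative is to gate the code on `#[cfg(target_feature = "avx")]`, which removes the run-time branch but makes the item unavailable unless the crate is built with that feature enabled.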

dhardy · 2021-09-23T10:46:37Z

src/distributions/integer.rs

+                    let mut vec: $ty = <$ty>::default();
+                    unsafe {
+                        let ptr = &mut vec;
+                        let b_ptr = &mut *(ptr as *mut $ty as *mut [u8; mem::size_of::<$ty>()]);
+                        rng.fill_bytes(b_ptr);
+                    }
+                    vec.to_le()


I think this is correct, but we should really use from_bits like the old code to avoid unsafe (but do use fill_bytes instead of gen).

Unfortunately from_bits is not documented on docs.rs; I just dropped a PR for that.

I'm confused by this. Do you mean use fill_bytes on a regular array and then from_slice_unaligned? That would avoid all unsafe.

Hmm, I hadn't figured on Simd<[u8; 2]> etc. being hard to construct from an array. Maybe my suggestion doesn't make sense then.

We could do something like

    let mut bytes = [0_u8; mem::size_of::<$ty>()];
    rng.fill_bytes(&mut bytes);
    let vec = $ty::from_bits($u8xN::from_slice_unaligned(&bytes));
    vec.to_le()

but the usizexN types don't have from_bits.
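As a point of comparison, the effect being aimed for (fill a byte buffer, then interpret it lane by lane as little-endian) can be sketched entirely in safe code with plain arrays. `U32x4` and `CounterRng` below are hypothetical stand-ins, not packed_simd or rand types. On a little-endian target the lanes equal a native reinterpretation of the bytes; on a big-endian target they match what a `to_le` call after the reinterpretation would produce.

```rust
/// Hypothetical stand-in for a 4-lane `u32` SIMD vector.
type U32x4 = [u32; 4];

/// Hypothetical stand-in for an RNG's `fill_bytes`: writes an
/// incrementing counter so the result is easy to check.
struct CounterRng(u8);

impl CounterRng {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = self.0;
            self.0 = self.0.wrapping_add(1);
        }
    }
}

fn sample_u32x4(rng: &mut CounterRng) -> U32x4 {
    let mut bytes = [0u8; 16];
    rng.fill_bytes(&mut bytes);

    let mut lanes = [0u32; 4];
    // Decode each 4-byte chunk as a little-endian lane, so the output
    // does not depend on the endianness of the target.
    for (lane, chunk) in lanes.iter_mut().zip(bytes.chunks_exact(4)) {
        *lane = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    lanes
}
```

With `CounterRng(0)` the first lane is `0x0302_0100` and the last is `0x0F0E_0D0C` on every target. This avoids `unsafe` but does not by itself solve the missing `from_bits` on the usizexN types.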

provide Standard for __m128/256i on stable Rust

0f00ba7

dhardy reviewed Aug 24, 2021

remove uninit, simplify

329ccf5

dhardy reviewed Sep 11, 2021

TheIronBorn added 2 commits September 11, 2021 21:59

add usizexN types, fix x86 types

a703d13

change x86 types documentation

4b77d45

dhardy added the D-changes Do: changes requested label Sep 13, 2021

TheIronBorn changed the title ~~provide Standard for x86 __m128/256i on stable Rust, add 128xN SIMD types~~ provide Standard for x86 __m128/256i on stable Rust, add 128xN/sizexN SIMD types Sep 23, 2021

dhardy reviewed Sep 23, 2021

TheIronBorn mentioned this pull request Jul 7, 2022

switch to std::simd, expand SIMD & docs #1239

Merged

TheIronBorn closed this in #1239 Aug 10, 2022
