v0.7.2
Summary
Post-v0.7.0 fixes; more portable implementations of NEON intrinsics
Details
- common: fix SIMDE_FLOAT64_C macro when SIMDE_FLOAT64_TYPE is defined 1d28a5d @rosbif
- complex: split complex math out into separate header 0678336 @nemequ
- diagnostic: silence a few -Weverything diagnostics on clang < 5 6f8d285 @nemequ
Implementation of NEON intrinsics:
- neon/ceq: implement vceq{s_f32,d_f64} f4f42dc @nemequ
- neon/abd: trivial formatting fix 0b8c8ca @nemequ
- neon/abd: add missing scalar functions 517a613 @nemequ
- neon/abs: add vabsd_s64 4091e3e @nemequ
- neon/abs: vabsd_s64 wasn't added to GCC until 9.1.0 52051cb @nemequ
- neon/add: implement vaddd_s64 and vaddd_u64 03d4d1b @nemequ
- neon/cagt: implement vcagt{s_f32,d_f64} 731cf71 @nemequ
- neon/c{ge,gt,le,lt}: some improved 64-bit comparisons 97f4dfb @nemequ
- neon/ext: work around bug in GCC prior to 9.0 0c29a5f @nemequ
- neon/padd: vpadd_f32 was buggy in older clang versions 623cbf7 @nemequ
- neon/rnd: add NaN and ties to test suite fa950a2 @nemequ
- neon/rndm: initial implementation 5bf93ad @nemequ
- neon/rndn: initial implementation 2c624b5 @nemequ
- neon/rndp: initial implementation 7f1f499 @nemequ
- neon/uqadd: clang prior to 9 used incorrect types for the scalar funcs fa0eca0 @nemequ
- neon/uzp1,neon/uzp2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
x86 intrinsics
SSE*
- sse: fix overflow handling for simde_mm_cvt_ss2si a4658d8 @mr-c
- sse: add SIMDE_MM_{GET,SET}_FLUSH_ZERO_MODE 340bf13 @nemequ
- sse, sse2: add range checks to several conversion functions c3d7abf @nemequ
- sse2: update test for simde_mm_set1_epi32 8854ede @nemequ
- sse2: fix armv7 NEON implementation for simde_mm_shufflehi_epi16 338dac0 @nemequ
- sse2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
- sse2: fix potentially unused variable in loadu functions f43bfed @nemequ
- sse2: use void* for destinations of loadu functions 98c63ae @nemequ
- sse4.1: check for SHUFFLE_VECTOR before using it in _mm_cvtepu32_epi64 cb73aec @nemequ
- sse4.2: some improved 64-bit comparisons 97f4dfb @nemequ
AVX
- avx: use void* for destinations of loadu functions 98c63ae @nemequ
AVX512
- permutex2var: fix some signed/unsigned mismatch warnings 951caa1 @nemequ
- avx512/s{r,l}li: the imm8 parameters should be unsigned ecc388d @nemequ
XOP
- xop: initial implementation 6cc0cef @nemequ
- xop: add a bunch of NEON implementations b602fbc @nemequ
- xop: fix NEON implementation of simde_mm_maccsd_epi16 8d499b5 @nemequ
Testing with Docker/Podman & CI
- docker: add gdb and valgrind to installed packages 4500040 @nemequ
- ci: move icc build from Travis to GitHub Actions 712f01a @nemequ
- gh-actions: run on pull requests 43e7053 @mr-c
- drone: re-organize drone builds 73fe36a @nemequ
- drone: adjust branch triggers 9eba966 @nemequ
- README: update CI information ca440ae @nemequ
- circleci: add Circle CI 5d5350c @nemequ
- circleci: actually build in 32-bit mode 4267926 @nemequ
- cirrus: add Cirrus CI support 0212a07 @nemequ
- cirrus: run asan/ubsan instead of just another GCC build a1c9f1d @nemequ
- docker: allow for an optional persistent build directory 610fa3d @nemequ
- gh-actions, semaphore: move GCC and clang builds to Semaphore 49d0d82 @nemequ
- ci: disable ci/* builds for various providers 28f8775 @nemequ
- travis: disable all builds 687851b @nemequ