v0.7.2

@mr-c mr-c released this 10 May 18:20
· 731 commits to master since this release

Summary

Post-v0.7.0 fixes; more portable implementations of NEON intrinsics

Details

  • common: fix SIMDE_FLOAT64_C macro when SIMDE_FLOAT64_TYPE is defined 1d28a5d @rosbif
  • complex: split complex math out into separate header 0678336 @nemequ
  • diagnostic: silence a few -Weverything diagnostics on clang < 5 6f8d285 @nemequ

Implementation of NEON intrinsics:

  • neon/ceq: implement vceq{s_f32,d_f64} f4f42dc @nemequ
  • neon/abd: trivial formatting fix 0b8c8ca @nemequ
  • neon/abd: add missing scalar functions 517a613 @nemequ
  • neon/abs: add vabsd_s64 4091e3e @nemequ
  • neon/abs: vabsd_s64 wasn't added to GCC until 9.1.0 52051cb @nemequ
  • neon/add: implement vaddd_s64 and vaddd_u64 03d4d1b @nemequ
  • neon/cagt: implement vcagt{s_f32,d_f64} 731cf71 @nemequ
  • neon/c{ge,gt,le,lt}: some improved 64-bit comparisons 97f4dfb @nemequ
  • neon/ext: work around bug in GCC prior to 9.0 0c29a5f @nemequ
  • neon/padd: vpadd_f32 was buggy in older clang versions 623cbf7 @nemequ
  • neon/rnd: add NaN and ties to test suite fa950a2 @nemequ
  • neon/rndm: initial implementation 5bf93ad @nemequ
  • neon/rndn: initial implementation 2c624b5 @nemequ
  • neon/rndp: initial implementation 7f1f499 @nemequ
  • neon/uqadd: clang prior to 9 used incorrect types for the scalar funcs fa0eca0 @nemequ
  • neon/uzp1,neon/uzp2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
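
For readers unfamiliar with the scalar NEON functions mentioned above, here is a minimal portable sketch of what two of them compute. The names `sketch_vabsd_s64` and `sketch_vuqaddb_s8` are hypothetical and are not SIMDe's implementations. The sketch assumes that `vabsd_s64` wraps on `INT64_MIN` (non-saturating absolute value) and that `vuqaddb_s8` takes a signed accumulator plus an unsigned addend and saturates to the signed range.

```c
#include <stdint.h>

/* Hypothetical stand-in for vabsd_s64 (assumed non-saturating).
 * The negation happens in unsigned arithmetic, so INT64_MIN maps back to
 * INT64_MIN without signed-overflow undefined behaviour. The final cast
 * relies on the two's-complement conversion that GCC and clang provide. */
static int64_t sketch_vabsd_s64(int64_t a) {
  uint64_t u = (uint64_t) a;
  return (int64_t) (a < 0 ? (UINT64_C(0) - u) : u);
}

/* Hypothetical stand-in for vuqaddb_s8: signed 8-bit value plus unsigned
 * 8-bit value, saturated to the signed range. The sum can never fall below
 * INT8_MIN (the addend is non-negative), so only the upper bound is clamped.
 * Note the mixed signed/unsigned parameter types. */
static int8_t sketch_vuqaddb_s8(int8_t a, uint8_t b) {
  int sum = (int) a + (int) b;
  return (int8_t) (sum > INT8_MAX ? INT8_MAX : sum);
}
```

The mixed parameter types in the second function are the kind of detail the neon/uqadd entry refers to: declaring both arguments with the same signedness would change which inputs are representable.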

x86 intrinsics

SSE*

  • sse: fix overflow handling for simde_mm_cvt_ss2si a4658d8 @mr-c
  • sse: add SIMDE_MM_{GET,SET}_FLUSH_ZERO_MODE 340bf13 @nemequ
  • sse, sse2: add range checks to several conversion functions c3d7abf @nemequ
  • sse2: update test for simde_mm_set1_epi32 8854ede @nemequ
  • sse2: fix armv7 NEON implementation for simde_mm_shufflehi_epi16 338dac0 @nemequ
  • sse2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
  • sse2: fix potentially unused variable in loadu functions f43bfed @nemequ
  • sse2: use void* for destinations of loadu functions 98c63ae @nemequ
  • sse4.1: check for SHUFFLE_VECTOR before using it in _mm_cvtepu32_epi64 cb73aec @nemequ
  • sse4.2: some improved 64-bit comparisons 97f4dfb @nemequ
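
To illustrate the kind of range check the conversion entries above describe, here is a rough sketch of a guarded truncating float-to-int32 conversion. The function name `sketch_cvtt_f32_to_i32` is hypothetical, and the choice to return `INT32_MIN` for unrepresentable inputs is an assumption that mirrors the x86 "integer indefinite" result (0x80000000); SIMDe's actual code may differ.

```c
#include <stdint.h>

/* Hypothetical helper, not SIMDe's code: truncating float -> int32_t
 * conversion with an explicit range check. Casting an out-of-range or NaN
 * float to an integer is undefined behaviour in C, so the check must come
 * before the cast. */
static int32_t sketch_cvtt_f32_to_i32(float a) {
  /* Written as a negated "in range" test so that NaN, for which every
   * comparison is false, also takes the fallback path. Both bounds are
   * exactly representable as float (-2^31 and 2^31). */
  if (!(a >= -2147483648.0f && a < 2147483648.0f)) {
    return INT32_MIN; /* assumed fallback: x86 "integer indefinite" value */
  }
  return (int32_t) a; /* in range: the cast truncates toward zero */
}
```

A rounding variant (as opposed to truncating) would round first, for example with `nearbyintf` from `<math.h>`, and then apply the same check.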

AVX

  • avx: use void* for destinations of loadu functions 98c63ae @nemequ

AVX512

  • permutex2var: fix some signed/unsigned mismatch warnings 951caa1 @nemequ
  • avx512/s{r,l}li: the imm8 parameters should be unsigned ecc388d @nemequ

XOP

  • xop: initial implementation 6cc0cef @nemequ
  • xop: add a bunch of NEON implementations b602fbc @nemequ
  • xop: fix NEON implementation of simde_mm_maccsd_epi16 8d499b5 @nemequ

Testing with Docker/Podman & CI

  • docker: add gdb and valgrind to installed packages 4500040 @nemequ
  • ci: move icc build from Travis to GitHub Actions 712f01a @nemequ
  • gh-actions: run on pull requests 43e7053 @mr-c
  • drone: re-organize drone builds 73fe36a @nemequ
  • drone: adjust branch triggers 9eba966 @nemequ
  • README: update CI information ca440ae @nemequ
  • circleci: add Circle CI 5d5350c @nemequ
  • circleci: actually build in 32-bit mode 4267926 @nemequ
  • cirrus: add Cirrus CI support 0212a07 @nemequ
  • cirrus: run asan/ubsan instead of just another GCC build a1c9f1d @nemequ
  • docker: allow for an optional persistent build directory 610fa3d @nemequ
  • gh-actions, semaphore: move GCC and clang builds to Semaphore 49d0d82 @nemequ
  • ci: disable ci/* builds for various providers 28f8775 @nemequ
  • travis: disable all builds 687851b @nemequ

Misc

  • cmake: don't explicitly list source files in the x86 directory 88c6f7e @nemequ
  • meson: link to libm if available 251bc0d @nemequ
  • simde-align: allow alignment > 8 on MSVC ≥ 19.16 (VS 2017) 0968271 @jsbache
  • README: fix a couple of outdated links 6001182 @nemequ