Use int16 min/max for _mm_set_epi16() calls #395
This fixes a bug that causes g0 overflows when compiled on Intel processors and run on AMD Ryzen processors.
According to Intel, the _mm_set_epi16() function takes signed shorts (int16): https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=mm_set_epi16&ig_expand=6355,6355
But 0-65535 is the range of uint16, not int16, so passing 65535 overflows to -1. The compiler warns about this:
implicit conversion from int to short changes value from 65535 to -1
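To illustrate the warning, here is a minimal sketch in plain C, not code from this PR. The helper names as_epi16_lane and clamp_to_int16 are hypothetical. The first mimics the implicit int-to-short conversion applied to each _mm_set_epi16() argument. The second shows one way to keep a value inside the int16 range. It assumes a mainstream compiler where out-of-range conversions wrap in two's complement.

```c
#include <stdint.h>

/* Hypothetical helper: mimics the implicit int -> short conversion that is
   applied to an int argument passed to _mm_set_epi16(). The result for
   out-of-range values is implementation-defined before C23; GCC, Clang and
   MSVC wrap, so 65535 becomes -1. */
static int16_t as_epi16_lane(int value) {
    return (int16_t)value;
}

/* Hypothetical helper: saturate a value into [INT16_MIN, INT16_MAX], so the
   conversion to a 16-bit lane never changes its sign. */
static int16_t clamp_to_int16(int value) {
    if (value > INT16_MAX) return INT16_MAX;
    if (value < INT16_MIN) return INT16_MIN;
    return (int16_t)value;
}
```

With these helpers, as_epi16_lane(65535) yields -1 on the compilers named above, while clamp_to_int16(65535) yields INT16_MAX (32767).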
For whatever reason, compiling and running on an AMD Ryzen chip worked fine. Compiling and running on an Intel chip also worked fine. But compiling on Intel and running on AMD caused an exception.
Using the int16 range fixes the problem. However, this is not my skill set and I am not sure if this is the right fix. Please verify. Thank you!