Wrong constant range in simde_vshll_n_XXX intrinsics #1064
Comments
I generally refer to the following document when writing NEON code: From this document:
Maybe you could change the minima from 1 to 0 and add test cases for these zero length shifts. Remark: |
About zero shifts on A32: my documentation didn't mention it, but according to this, a zero shift is permitted, but resulting instruction is The page also mentions that shifts of the element size are allowed. Similarly, the A64 shll instruction allows shifts of the element size. And from my testing, compilers support shifts of the element size when using |
I think that SIMDe should follow the documentation of the NEON intrinsics (link above), not the documentation of the underlying assembly instructions. Some compilers support out of range constants or even variables where strictly a constant is required. So I maintain my position that we should not increase the maxima to the element size in bits but just change the minima from 1 to 0 and add test cases for these zero length shifts. Remark: the document above states that |
I found this documentation (which to me looks like official arm documentation) which describes intrinsics also for shifts of the element size. |
Oops, sorry: you seem to be right. So it looks as if the range should be zero to the element size in bits, and test code must be added for both extremes. |
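To make the range agreed on here concrete, the following is a minimal scalar sketch in C, not SIMDe's actual implementation. The names `shll_shift_in_range` and `widen_shl_u16x4` are hypothetical and exist only for illustration; the sketch assumes the valid constant range is zero to the element size in bits, inclusive, and models four 16-bit unsigned lanes widened to 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of SIMDe): returns nonzero when n is an
 * acceptable shift for source elements that are `bits` bits wide, under
 * the assumption that the range is 0 .. bits inclusive. */
static int shll_shift_in_range(int n, int bits) {
  return n >= 0 && n <= bits;
}

/* Hypothetical scalar model of a widening shift left on four lanes,
 * uint16_t -> uint32_t. Each lane is widened first, then shifted, so a
 * shift of 16 (the element size) loses no bits. */
static void widen_shl_u16x4(const uint16_t in[4], int n, uint32_t out[4]) {
  assert(shll_shift_in_range(n, 16));
  for (size_t i = 0; i < 4; i++) {
    out[i] = (uint32_t)in[i] << n;
  }
}
```

Calling `widen_shl_u16x4` with `n = 0` is a plain widening of each lane, and with `n = 16` it moves each lane into the upper half of the wider element; those are the two extremes a test would want to cover.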
After a search it seems that in the document I cited, a description and pseudo-code corresponding to (There is also the assembly code corresponding to This is not the first time that I have found intrinsics in this document with totally incorrect descriptions and pseudo-code. |
Actually, |
Thank you for your explanation. I must admit that I was not familiar with the bfloat16 data format. |
Can someone summarize this for me or open a PR? I'd like to make a new SIMDe release in the next week. |
Hi Michael, I think that we now both agree that for all the So both Both Both Test cases should be added to I could do this but I haven't contributed to SIMDe for nearly three years. |
Fixed in #1068 |
In simde_vshll_n_XXX intrinsics the constant range is defined as 1-7 for 8-bit vectors, 1-15 for 16-bit vectors and 1-31 for 32-bit vectors (for example, here for simde_vshll_n_s8).
These ranges are not correct because:
- The A32 instruction vshll has two encodings: one for shifts 1-7 for 8-bit vectors, 1-15 for 16-bit vectors and 1-31 for 32-bit vectors, and another for shift 8 for 8-bit vectors, 16 for 16-bit vectors and 32 for 32-bit vectors.
- A64 has the instructions sshll and ushll, which have shifts 0-7 for 8-bit vectors, 0-15 for 16-bit vectors and 0-31 for 32-bit vectors, and the instruction shll, which has shift 8 for 8-bit vectors, 16 for 16-bit vectors and 32 for 32-bit vectors.
Therefore the ranges in simde_vshll_n_XXX intrinsics should be extended at least to 1-8 for 8-bit vectors, 1-16 for 16-bit vectors and 1-32 for 32-bit vectors.
I'm not sure how zero shifts should be handled.
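As a rough illustration of why a shift equal to the element size is meaningful for a widening shift, here is a scalar sketch in C of a single signed 8-bit lane. The function name `widen_shl_s8_lane` is hypothetical, the 0-8 range in the assertion is an assumption rather than something taken from SIMDe, and the code is not SIMDe's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of a widening shift left,
 * int8_t -> int16_t. Assumed valid range for n: 0 .. 8 inclusive. */
static int16_t widen_shl_s8_lane(int8_t x, int n) {
  assert(n >= 0 && n <= 8);
  /* Multiply by 2^n instead of using << so that negative inputs do not
   * rely on left-shifting a negative value. For n <= 8 the product lies
   * in [-32768, 32512] and always fits in int16_t. */
  return (int16_t)((int32_t)x * ((int32_t)1 << n));
}
```

With `n = 8`, an input of -128 becomes -32768 and an input of 127 becomes 32512, so every 8-bit value still has an exact 16-bit result; with `n = 0` the lane is simply sign-extended.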