NEON : Complex operations from Armv8.3-a #1077
About the binary operations and 16-bit floating points: they might work when
Thanks for your help. I have fixed it and pushed the code again!
simde/arm/neon/cmla_lane.h
result = simde_float16x4_from_private(r_);
return result;
Combine these lines
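As a rough illustration of this request, the sketch below contrasts the two-line form with a single return statement. The `demo_` types and functions are hypothetical stand-ins, not SIMDe's actual definitions; only the shape of the change is meant to carry over.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a public vector type and its private,
 * element-accessible counterpart. These are NOT SIMDe's real types. */
typedef struct { uint16_t bits[4]; } demo_f16x4_t;
typedef struct { uint16_t values[4]; } demo_f16x4_private;

/* Hypothetical conversion helper, playing the role of a *_from_private call. */
static demo_f16x4_t demo_f16x4_from_private(demo_f16x4_private p) {
  demo_f16x4_t r;
  memcpy(&r, &p, sizeof(r));
  return r;
}

/* Before: the temporary exists only to be returned on the next line. */
static demo_f16x4_t demo_op_two_lines(demo_f16x4_private r_) {
  demo_f16x4_t result;
  result = demo_f16x4_from_private(r_);
  return result;
}

/* After: the conversion and the return are combined into one statement. */
static demo_f16x4_t demo_op_combined(demo_f16x4_private r_) {
  return demo_f16x4_from_private(r_);
}
```

Both functions return the same value; the combined form simply drops a variable that carried no extra meaning.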
simde/arm/neon/cmla_lane.h
#if defined(SIMDE_ARM_NEON_A32V8_ENABLE_NATIVE_ALIASES)
#undef vcmla_lane_f16
#define vcmla_lane_f16(r, a, b, lane) simde_vcmla_lane_f16(r, a, b, lane)
#endif
#if defined(SIMDE_ARM_NEON_A32V8_ENABLE_NATIVE_ALIASES)
  #undef vcmla_lane_f16
  #define vcmla_lane_f16(r, a, b, lane) simde_vcmla_lane_f16(r, a, b, lane)
#endif
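For readers unfamiliar with the alias block under review, here is a minimal sketch of the same pattern outside SIMDe. The names `DEMO_ENABLE_NATIVE_ALIASES`, `native_add`, and `demo_portable_add` are made up for illustration; the point is only that an opt-in macro maps the short "native" name onto the portable implementation.

```c
/* Hypothetical portable implementation, standing in for a simde_* function. */
static int demo_portable_add(int a, int b) {
  return a + b;
}

/* Hypothetical opt-in switch. In a real header the user would define this
 * before including the header; it is defined here so the sketch compiles. */
#define DEMO_ENABLE_NATIVE_ALIASES

#if defined(DEMO_ENABLE_NATIVE_ALIASES)
  /* Drop any earlier definition of the short name, then redirect it.
   * #undef on a name that was never defined is harmless. */
  #undef native_add
  #define native_add(a, b) demo_portable_add((a), (b))
#endif
```

With the switch defined, a call such as `native_add(2, 3)` expands to `demo_portable_add((2), (3))`, so code written against the short name compiles unchanged.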
simde/arm/neon/cmla_rot90_lane.h
simde_float32x4_private r_ =
simde_float32x4_to_private(simde_vcvt_f32_f16(r)),
This style is more readable
simde_float32x4_private r_ = simde_float32x4_to_private(
    simde_vcvt_f32_f16(r)),
test/arm/neon/cadd_rot270.c
In general, there are several instances of too many blank lines
test/arm/neon/cmla_rot180_lane.c
// simde_float32x4_t r = simde_vcmlaq_rot180_laneq_f32(r_, a, b,
// test_vec[i].lane); simde_test_arm_neon_write_f32x4(2, r,
// SIMDE_TEST_VEC_POS_LAST);
Please tidy up your test code as well; thank you!
Hello! I've cleaned up the code and reformatted it using Clang-Format with the LLVM style. Apologies for the coding style and redundant comments.
Thanks for the cleanup. SIMDe does not (yet) have an official clang-format style. I see that the LLVM style still uses ColumnLimit: 80, which I find to be too narrow.
I will add that guidance to https://github.com/simd-everywhere/simde/wiki/Coding-Style for future contributors.
It looks like all the compiler issues are fixed (I think the rpm-build:fedora-rawhide-i386 failure is not your fault). Please fix the formatting errors and merge/rebase on the latest commits in https://github.com/simd-everywhere/simde/tree/master; I'll then merge this PR! Thank you @wewe5215!
@mr-c Hello! I have reformatted the code and rebased it on the latest commit. I really appreciate your patience!
…ne_f{16/32} and vcmlaq_laneq_f{16/32}
…nd vcmlaq_rot90_lane_f{16/32} and vcmlaq_rot90_laneq_f{16/32}
… and vcmlaq_rot180_lane_f{16/32} and vcmlaq_rot180_laneq_f{16/32}
…_f16 and vcmla{/q}_lane{/q}_f16
… vcmla{/q}_lane{/q}_f16
c51131a to 19ed113
This pull request includes initial implementations and corresponding test cases listed below
Sorry for the typo in commit fa9a14d. The commit title should read:
[Neon] Add vcmla_rot270_lane_f{16/32} and vcmla_rot270_laneq_f{16/32} and vcmlaq_rot270_lane_f{16/32} and vcmlaq_rot270_laneq_f{16/32}
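For context on the intrinsic family named in these commits, the following is a scalar sketch of the per-pair arithmetic that the four rotations perform on one complex element (real part in the even lane, imaginary part in the odd lane). It is an illustration, not the code from this PR: the `demo_` names are hypothetical, the lane-selection step of the `_lane`/`_laneq` variants is left out, and the hardware instruction fuses each multiply-add, so rounding can differ slightly from this plain C version.

```c
/* One complex number: v[0] is the real part, v[1] the imaginary part. */
typedef struct { float v[2]; } demo_cf32;

/* Rotation 0: accumulate a.re * b. */
static demo_cf32 demo_cmla_rot0(demo_cf32 r, demo_cf32 a, demo_cf32 b) {
  r.v[0] += a.v[0] * b.v[0];
  r.v[1] += a.v[0] * b.v[1];
  return r;
}

/* Rotation 90: accumulate a.im * (b rotated by 90 degrees). */
static demo_cf32 demo_cmla_rot90(demo_cf32 r, demo_cf32 a, demo_cf32 b) {
  r.v[0] -= a.v[1] * b.v[1];
  r.v[1] += a.v[1] * b.v[0];
  return r;
}

/* Rotation 180: subtract a.re * b. */
static demo_cf32 demo_cmla_rot180(demo_cf32 r, demo_cf32 a, demo_cf32 b) {
  r.v[0] -= a.v[0] * b.v[0];
  r.v[1] -= a.v[0] * b.v[1];
  return r;
}

/* Rotation 270: accumulate a.im * (b rotated by 270 degrees). */
static demo_cf32 demo_cmla_rot270(demo_cf32 r, demo_cf32 a, demo_cf32 b) {
  r.v[0] += a.v[1] * b.v[1];
  r.v[1] -= a.v[1] * b.v[0];
  return r;
}

/* Rotation 0 followed by rotation 90 yields a full complex product a * b. */
static demo_cf32 demo_complex_mul(demo_cf32 a, demo_cf32 b) {
  demo_cf32 r = {{0.0f, 0.0f}};
  r = demo_cmla_rot0(r, a, b);
  return demo_cmla_rot90(r, a, b);
}
```

For example, with a = 1 + 2i and b = 3 + 4i, `demo_complex_mul` returns -5 + 10i, and applying the 180 and 270 rotations to a zero accumulator instead gives the negated product 5 - 10i.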