Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEON: more fp16 using intrinsics supported by architecture v7 (skip version) #1081

Merged
merged 45 commits into from
Oct 18, 2023

Conversation

yyctw
Copy link
Contributor

@yyctw yyctw commented Oct 17, 2023

Hi all, this is Eric from Andes Technology Corporation.
This PR is based on the previous PR with certain functions removed to avoid triggering compiler bugs.

@yyctw yyctw force-pushed the v7_without_compiler_bugs branch from 9341801 to e895c43 Compare October 17, 2023 11:34
simde/arm/neon/fma.h Outdated Show resolved Hide resolved
SIMDE_CONSTIFY_8_(vget_lane_f16, r, (HEDLEY_UNREACHABLE(), SIMDE_FLOAT16_VALUE(0.0)), lane, v);
SIMDE_CONSTIFY_8_(vgetq_lane_f16, r, (HEDLEY_UNREACHABLE(), SIMDE_FLOAT16_VALUE(0.0)), lane, v);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd, the tests already covered lane values above 3; why didn't that cause a problem already?

@yyctw Can you take a look at

INT8_C( 4),
SIMDE_FLOAT16_VALUE(281.00)},
{ { SIMDE_FLOAT16_VALUE( 392.00), SIMDE_FLOAT16_VALUE( -758.50), SIMDE_FLOAT16_VALUE( -870.50), SIMDE_FLOAT16_VALUE( -511.25),
SIMDE_FLOAT16_VALUE( 731.50), SIMDE_FLOAT16_VALUE( 345.75), SIMDE_FLOAT16_VALUE( -405.25), SIMDE_FLOAT16_VALUE( -353.75) },
INT8_C( 5),
SIMDE_FLOAT16_VALUE(345.75)},
{ { SIMDE_FLOAT16_VALUE( 345.75), SIMDE_FLOAT16_VALUE( 372.75), SIMDE_FLOAT16_VALUE( 802.50), SIMDE_FLOAT16_VALUE( -373.00),
SIMDE_FLOAT16_VALUE( 133.12), SIMDE_FLOAT16_VALUE( 928.00), SIMDE_FLOAT16_VALUE( -18.17), SIMDE_FLOAT16_VALUE( -974.50) },
INT8_C( 6),
SIMDE_FLOAT16_VALUE(-18.17)},
{ { SIMDE_FLOAT16_VALUE( -634.00), SIMDE_FLOAT16_VALUE( -283.75), SIMDE_FLOAT16_VALUE( -99.50), SIMDE_FLOAT16_VALUE( 134.00),
SIMDE_FLOAT16_VALUE( -781.50), SIMDE_FLOAT16_VALUE( 1188.00), SIMDE_FLOAT16_VALUE( -106.88), SIMDE_FLOAT16_VALUE( -497.25) },
INT8_C( 7),
SIMDE_FLOAT16_VALUE(-497.25)},
and investigate why this wasn't caught earlier? Feels like this code path is not being run by the tests..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my previous conclusion [ref], even if the HEDLEY_UNREACHABLE() is executed during the testing process (when lane > 3), it will not cause the test to fail.

Comment on lines +57 to +60
simde_float16x4x2_t s_ = { { simde_float16x4_from_private(a_[0]),
simde_float16x4_from_private(a_[1]) } };
return s_;
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the x86 side we would have something like a simde_float16x4x2_private typedef union with simde_float16x4 and simde_float16x4_private fields.

@yyctw yyctw force-pushed the v7_without_compiler_bugs branch from e895c43 to d5c2855 Compare October 18, 2023 01:15
@mr-c mr-c enabled auto-merge (squash) October 18, 2023 01:44
@mr-c mr-c merged commit 5e7c4d4 into simd-everywhere:master Oct 18, 2023
75 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants