Add config to disable FP16 optimizations on aarch64 UNIX [Build] #18744

oddfacade · 2023-12-07T17:50:55Z

Describe the issue

I'm unable to build for aarch64 targets that don't support FP16 instructions, due to the configuration here, which includes sources compiled with -march=armv8.2-a+fp16 in all aarch64 builds. (The issue is specifically that the optimized methods fail to inline when combined with the rest of the source, which is compiled without +fp16.)

I have been able to work around this by patching the source to prevent MLAS_F16VEC_INTRINSICS_SUPPORTED from being set here. I would like to make this option directly configurable from CMake.

I made a discussion post a few days ago with some context.

If there's support for this feature, I have an implementation ready to contribute.

Urgency

No response

Target platform

aarch64

Build script

CMake build targeting Linux with -march=armv8.2-a.

Error / output

Misc. methods in fp16_common.h fail to inline.

Visual Studio Version

No response

GCC / Compiler Version

11.2, 13.2

The text was updated successfully, but these errors were encountered:

YUNQIUGUO · 2023-12-07T19:27:23Z

@chenfucn

snnn · 2023-12-08T21:38:50Z

They are compile time flags which only impose requirements on compilers, not hardware. For example, in the same file you will see a lot of references to AVX512, but today most PC hardware do not have AVX512. It doesn't cause any problem.

oddfacade · 2023-12-09T01:10:15Z

This is the kind of build error I get when I build using arm-gnu 11.2 with -march=armv8.2-a.

In file included from /var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/mlasi.h:50,                                                                                                                                                  
                 from /var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/fp16_common.h:20,                                                                                                                                            
                 from /var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/activate_fp16.cpp:17:                                                                                                                                        
/opt/toolchains/arm-gnu-toolchain-11.2-2022.02-x86_64-aarch64-none-linux-gnu/lib/gcc/aarch64-none-linux-gnu/11.2.1/include/arm_neon.h: In function ‘MLAS_FLOAT16X8 MlasAddFloat16x8(MLAS_FLOAT16X8, MLAS_FLOAT16X8)’:                         
/opt/toolchains/arm-gnu-toolchain-11.2-2022.02-x86_64-aarch64-none-linux-gnu/lib/gcc/aarch64-none-linux-gnu/11.2.1/include/arm_neon.h:31367:1: error: inlining failed in call to ‘always_inline’ ‘float16x8_t vaddq_f16(float16x8_t, float16x8
_t)’: target specific option mismatch                                                                                                                                                                                                         
31367 | vaddq_f16 (float16x8_t __a, float16x8_t __b)                                                                                                                                                                                          
      | ^~~~~~~~~                                                                                                                                                                                                                             
In file included from /var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/activate_fp16.cpp:17:                                                                                                                                        
/var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/fp16_common.h:113:21: note: called from here
  113 |     return vaddq_f16(Vector1, Vector2);
      |            ~~~~~~~~~^~~~~~~~~~~~~~~~~~
gmake[2]: *** [_deps/onnxruntime-build/CMakeFiles/onnxruntime_mlas.dir/build.make:716: _deps/onnxruntime-build/CMakeFiles/onnxruntime_mlas.dir/var/local/workspace/onnxruntime/onnxruntime/core/mlas/lib/activate_fp16.cpp.o] Error 1

When I build with add_compile_options(-march=armv8.2-a+fp16) set globally, the build succeeds, but the resulting binary fails to run on my Cortex A57 VM (I get an illegal instruction error).

chenfucn · 2023-12-11T17:17:13Z

These are two different issues. It's great to see that with add_compile_options(-march=armv8.2-a+fp16) set globally, the build succeeds. The problem is the illegal instruction error instead of building problem then. We do have code to detect fp16 hardware feature in ARM. We need to fix the bug causing the illegal instruction, instead of using build options to get around it. Can you build in Debug and locate where the illegal instruction error was thrown? Thanks!

snnn · 2023-12-11T17:26:52Z

Have you tried GCC 13? I believe our code can pass the build with GCC 13 without problems.

chenfucn · 2023-12-11T17:27:07Z

https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/mlas/lib/halfgemm.cpp#L29
The hardware feature detection code is here. Unfortunately I don't have hardware to test this at this moment. Would you see whether this function is called before the illegal instruction error is thrown?

oddfacade · 2023-12-12T16:53:19Z

@chenfucn Thank you for the clarification and guidance. I should have some time this week to look into this. I will update here with whatever I find.

@snnn I definitely tried building with GCC 13.2, but I don't think I ran that code on the target to see if the hardware feature detection works. I'll check that as well.

snnn · 2023-12-12T23:49:47Z

When I build with add_compile_options(-march=armv8.2-a+fp16) set globally, the build succeeds, but the resulting binary fails to run on my Cortex A57 VM

It is expected.

github-actions · 2024-01-13T15:00:57Z

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions · 2024-02-13T15:01:02Z

This issue has been automatically closed due to inactivity. Please reactivate if further support is needed.

oddfacade added the build build issues; typically submitted using template label Dec 7, 2023

YUNQIUGUO removed the build build issues; typically submitted using template label Dec 7, 2023

YUNQIUGUO added the build build issues; typically submitted using template label Dec 7, 2023

github-actions bot added the stale issues that have not been addressed in a while; categorized by a bot label Jan 13, 2024

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 13, 2024

vilmursss mentioned this issue Dec 19, 2024

[Build] Unable to cross-compile ONNX Runtime 1.17.1 for ARM Cortex A53 #23152

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add config to disable FP16 optimizations on aarch64 UNIX [Build] #18744

Add config to disable FP16 optimizations on aarch64 UNIX [Build] #18744

oddfacade commented Dec 7, 2023

YUNQIUGUO commented Dec 7, 2023

snnn commented Dec 8, 2023

oddfacade commented Dec 9, 2023

chenfucn commented Dec 11, 2023

snnn commented Dec 11, 2023

chenfucn commented Dec 11, 2023

oddfacade commented Dec 12, 2023

snnn commented Dec 12, 2023

github-actions bot commented Jan 13, 2024

github-actions bot commented Feb 13, 2024

Add config to disable FP16 optimizations on aarch64 UNIX [Build] #18744

Add config to disable FP16 optimizations on aarch64 UNIX [Build] #18744

Comments

oddfacade commented Dec 7, 2023

Describe the issue

Urgency

Target platform

Build script

Error / output

Visual Studio Version

GCC / Compiler Version

YUNQIUGUO commented Dec 7, 2023

snnn commented Dec 8, 2023

oddfacade commented Dec 9, 2023

chenfucn commented Dec 11, 2023

snnn commented Dec 11, 2023

chenfucn commented Dec 11, 2023

oddfacade commented Dec 12, 2023

snnn commented Dec 12, 2023

github-actions bot commented Jan 13, 2024

github-actions bot commented Feb 13, 2024