Crash in onnxruntime_mlas_test.exe when running on ARM #18517

Closed
ivberg opened this issue Nov 20, 2023 · 3 comments
Labels
platform:windows issues related to the Windows platform

Comments

ivberg (Contributor) commented Nov 20, 2023

Describe the issue

We found a crash in onnxruntime_mlas_test.exe when running on ARM64 Windows. AMD64 was fine.

We tested on a Microsoft Surface Pro 9 5G and a Windows Dev Kit 2023, and the crash also reproduced on newer Qualcomm chipsets.

The crash appears to happen when the CPUID query falls into the "unknown ARM CPU part" case (even though this chipset has been released). That branch tries to log a warning through LOGS_DEFAULT, but the default logger has not been initialized yet, so LoggingManager::DefaultLogger throws.

This could crash other binaries as well, but we only tested onnxruntime_mlas_test.exe.
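
The failure mode described above can be sketched in isolation. This is a minimal stand-in, not ONNX Runtime code: `MockLoggingManager`, `DecodePart`, `DecodeThrows`, and the part number `0xD0B` are all hypothetical names and values chosen for illustration. It only assumes what the stack trace shows, namely that asking for the default logger before one is registered throws.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a process-wide logging manager (not the ORT class).
class MockLoggingManager {
 public:
  // Registers the sink that default log messages are appended to.
  static void SetDefault(std::string* sink) { Slot() = sink; }

  // Throws if no default sink has been registered yet.
  static std::string& Default() {
    if (Slot() == nullptr) {
      throw std::logic_error("default logger not initialized");
    }
    return *Slot();
  }

 private:
  static std::string*& Slot() {
    static std::string* sink = nullptr;
    return sink;
  }
};

// Mimics a CPU-part decoder whose fallback branch logs a warning.
// Returns true if the part was recognized.
bool DecodePart(unsigned part) {
  switch (part) {
    case 0xD0B:  // arbitrary "known" part, for illustration only
      return true;
    default:
      // Throws when no default logger exists, like frame 02 in the stack.
      MockLoggingManager::Default() += "unknown ARM CPU part ignored\n";
      return false;
  }
}

// Reports whether decoding the given part ends in an exception.
bool DecodeThrows(unsigned part) {
  try {
    DecodePart(part);
    return false;
  } catch (const std::logic_error&) {
    return true;
  }
}
```

In this sketch a recognized part never touches the logger, which would be consistent with the crash showing up only on chipsets that the decoder does not know about.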

To reproduce

  1. Build main (I used 9e8ad39) for arm64, e.g. .\build.bat --config RelWithDebInfo --build_shared_lib --parallel --cmake_generator "Visual Studio 17 2022" --arm64
  2. Run onnxruntime_mlas_test.exe, e.g. C:\src\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime_mlas_test.exe --help
  3. Crash!

Crashing code (onnxruntime/core/common/cpuid_uarch.cc:140):
LOGS_DEFAULT(WARNING) << "unknown ARM CPU part 0x" << std::hex << midr_get_part(midr) << " ignored";

Debugger stack:

0:000> kn
 # Child-SP          RetAddr               Call Site
00 000000b5`7211d320 00007ffc`6cef5704     KERNELBASE!RaiseException+0x58
01 000000b5`7211d3d0 00007ff6`e4cd4e40     VCRUNTIME140!_CxxThrowException+0xa4 [D:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\throw.cpp @ 74] 
02 000000b5`7211d430 00007ff6`e4ce741c     onnxruntime_mlas_test!onnxruntime::logging::LoggingManager::DefaultLogger+0x88 [C:\src\onnxruntime\include\onnxruntime\core\common\logging\logging.h @ 294] 
03 000000b5`7211d5a0 00007ff6`e4ce4e78     onnxruntime_mlas_test!onnxruntime::decodeMIDR+0x344 [C:\src\onnxruntime\onnxruntime\core\common\cpuid_uarch.cc @ 140] 
04 000000b5`7211e970 00007ff6`e4cbe8bc     onnxruntime_mlas_test!onnxruntime::CPUIDInfo::ArmWindowsInit+0xb8 [C:\src\onnxruntime\onnxruntime\core\common\cpuid_info.cc @ 220] 
05 (Inline Function) --------`--------     onnxruntime_mlas_test!onnxruntime::CPUIDInfo::{ctor}+0x24 [C:\src\onnxruntime\onnxruntime\core\common\cpuid_info.h @ 124] 
06 (Inline Function) --------`--------     onnxruntime_mlas_test!onnxruntime::CPUIDInfo::GetCPUIDInfo+0x88 [C:\src\onnxruntime\onnxruntime\core\common\cpuid_info.h @ 14] 
07 000000b5`7211fa00 00007ff6`e4bb8334     onnxruntime_mlas_test!MlasFp16AccelerationSupported+0x9c [C:\src\onnxruntime\onnxruntime\core\mlas\lib\halfgemm.cpp @ 29] 
08 (Inline Function) --------`--------     onnxruntime_mlas_test!<lambda_c3045e2c0dfc4497141e494e6eea09a5>::operator()+0x4 [C:\src\onnxruntime\onnxruntime\test\mlas\unittest\test_halfgemm.cpp @ 197] 
09 (Inline Function) --------`--------     onnxruntime_mlas_test!std::invoke+0x8 [C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\include\type_traits @ 1762] 
0a 000000b5`7211fa30 00007ff6`e4bf35c0     onnxruntime_mlas_test!std::_Func_impl_no_alloc<<lambda_c3045e2c0dfc4497141e494e6eea09a5>,unsigned __int64,bool>::_Do_call+0x14 [C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\include\functional @ 841] 
0b (Inline Function) --------`--------     onnxruntime_mlas_test!std::_Func_class<unsigned __int64,bool>::operator()+0x2c [C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\include\functional @ 883] 
0c (Inline Function) --------`--------     onnxruntime_mlas_test!LongShortExecuteManager::RegisterAll+0x44 [C:\src\onnxruntime\onnxruntime\test\mlas\unittest\test_main.cpp @ 40] 
0d 000000b5`7211fa50 00007ff6`e4d03230     onnxruntime_mlas_test!main+0x120 [C:\src\onnxruntime\onnxruntime\test\mlas\unittest\test_main.cpp @ 65] 
0e (Inline Function) --------`--------     onnxruntime_mlas_test!invoke_main+0x24 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
0f 000000b5`7211fab0 00007ff6`e4d032d4     onnxruntime_mlas_test!__scrt_common_main_seh+0x128 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
10 (Inline Function) --------`--------     onnxruntime_mlas_test!__scrt_common_main+0x8 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 330] 
11 000000b5`7211faf0 00007ffc`a8fc7ab0     onnxruntime_mlas_test!mainCRTStartup+0x14 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp @ 16] 

Urgency

Not urgent. A workaround is to comment out the logging code:

diff --git a/onnxruntime/core/common/cpuid_uarch.cc b/onnxruntime/core/common/cpuid_uarch.cc
index 52baad7394..95db127df8 100644
--- a/onnxruntime/core/common/cpuid_uarch.cc
+++ b/onnxruntime/core/common/cpuid_uarch.cc
@@ -137,7 +137,8 @@ void decodeMIDR(
               break;
               // #endif /* ARM */
             default:
-              LOGS_DEFAULT(WARNING) << "unknown ARM CPU part 0x" << std::hex << midr_get_part(midr) << " ignored";
+              // LOGS_DEFAULT(WARNING) << "unknown ARM CPU part 0x" << std::hex << midr_get_part(midr) << " ignored";
+              break;
           }
       }
       break;
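
An alternative to dropping the warning entirely would be to route it to a fallback stream when no default logger exists yet. The sketch below illustrates that idea only; it is not the fix that landed in the repository. `WarningStream` and `FormatUnknownPart` are hypothetical helpers, and the default logger is modelled as a nullable `std::ostream*`.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: picks the default logger's stream if one exists,
// otherwise the caller-supplied fallback (for example std::cerr).
std::ostream& WarningStream(std::ostream* default_logger, std::ostream& fallback) {
  return default_logger != nullptr ? *default_logger : fallback;
}

// Builds the same message text as the original warning.
std::string FormatUnknownPart(unsigned part) {
  std::ostringstream os;
  os << "unknown ARM CPU part 0x" << std::hex << part << " ignored";
  return os.str();
}
```

A caller would write something like `WarningStream(logger_or_null, std::cerr) << FormatUnknownPart(part);`, so the message still reaches the user during early startup instead of throwing.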

Platform

Windows

OS Version

Windows 11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

9e8ad39

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the platform:windows label Nov 20, 2023

ivberg (Contributor, Author) commented Nov 20, 2023

Looks like it's probably already fixed by 0a4d76d

snnn (Member) commented Nov 20, 2023

The program needs to initialize the global logger first.
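
The ordering requirement can be illustrated with a small RAII guard that a test executable would create at the top of `main`, before anything queries CPU features. This is a sketch under stated assumptions: `DefaultLoggerScope`, `DetectCpuFeatures`, and `RunWithLogger` are hypothetical names, and the real ONNX Runtime logging setup has a different interface.

```cpp
#include <stdexcept>

// Hypothetical RAII guard: marks a default logger as available for its lifetime.
struct DefaultLoggerScope {
  DefaultLoggerScope() {
    if (Active()) throw std::logic_error("default logger already initialized");
    Active() = true;
  }
  ~DefaultLoggerScope() { Active() = false; }
  DefaultLoggerScope(const DefaultLoggerScope&) = delete;
  DefaultLoggerScope& operator=(const DefaultLoggerScope&) = delete;

  // Process-wide flag standing in for the default logger instance.
  static bool& Active() {
    static bool active = false;
    return active;
  }
};

// Stand-in for startup code that may log while detecting the CPU.
bool DetectCpuFeatures() {
  if (!DefaultLoggerScope::Active()) {
    throw std::logic_error("default logger not initialized");
  }
  return true;
}

// Correct ordering: create the logger first, then run code that may log.
bool RunWithLogger() {
  DefaultLoggerScope scope;
  return DetectCpuFeatures();
}
```

Tying the logger's lifetime to a scope object keeps the setup and teardown order explicit, which matters when detection runs lazily from the first call site that needs it.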

ivberg (Contributor, Author) commented Nov 20, 2023

Closing, since this is already fixed.

ivberg closed this as completed Nov 20, 2023

2 participants