[Build] Trying to build on an embedded device that doesn't support BFLOAT16 #19920

Closed
ChthonicOne opened this issue Mar 14, 2024 · 25 comments
Labels
build (build issues; typically submitted using template), ep:ArmNN (issues related to Arm NN execution provider), platform:mobile (issues related to ONNX Runtime mobile; typically submitted using template)

Comments

@ChthonicOne

Describe the issue

I'm trying to build onnxruntime on a Radxa-Zero, but I've found that the device does not support BFLOAT16 instructions. As a result, the build fails because it requires those instructions, and I cannot find any way to disable them.

I do not need support for these instructions on this device. I can make do with the fp16 datatype instead, despite the rounding error; I already do with NCNN. The problem I have with NCNN is that the device does not properly support Vulkan at this time, and I wanted to try GPU acceleration using ArmNN with ONNX.

Is there a way I can disable these instructions and build onnxruntime without them on this device? I have already built ArmNN on the device.

Urgency

I am currently blocked from using onnxruntime on the Radxa.

Target platform

Radxa-zero aarch64

Build script

./build.sh --config RelWithDebInfo --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --disable_types=float8 --use_armnn --armnn_relu --armnn_bn

Error / output

The debug log is as follows:

./build.sh --config RelWithDebInfo --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --disable_types=float8 --use_armnn --armnn_relu --armnn_bn
2024-03-14 14:49:53,302 build [DEBUG] - Command line arguments:
  --build_dir /home/rock/temp/onnxruntime/build/Linux --config RelWithDebInfo --build_shared_lib --parallel --compile_no_warning_as_error --skip_submodule_sync --disable_types=float8 --use_armnn --armnn_relu --armnn_bn
Namespace(acl_home=None, acl_libs=None, allow_running_as_root=False, android=False, android_abi='arm64-v8a', android_api=27, android_cpp_shared=False, android_ndk_path='', android_run_emulator=False, android_sdk_path='', apple_deploy_target=None, apple_sysroot='', arm=False, arm64=False, arm64ec=False, armnn_bn=True, armnn_home=None, armnn_libs=None, armnn_relu=True, build=False, build_apple_framework=False, build_csharp=False, build_dir='/home/rock/temp/onnxruntime/build/Linux', build_java=False, build_micro_benchmarks=False, build_nodejs=False, build_nuget=False, build_objc=False, build_shared_lib=True, build_wasm=False, build_wasm_static_lib=False, build_wheel=False, buildasx=False, cann_home=None, clean=False, cmake_extra_defines=None, cmake_generator=None, cmake_path='cmake', code_coverage=False, compile_no_warning_as_error=True, config=['RelWithDebInfo'], ctest_path='ctest', cuda_home=None, cuda_version=None, cudnn_home=None, disable_contrib_ops=False, disable_exceptions=False, disable_memleak_checker=False, disable_ml_ops=False, disable_rtti=False, disable_types=['float8'], disable_wasm_exception_catching=False, dml_external_project=False, dml_path='', dnnl_aarch64_runtime='', dnnl_acl_root='', dnnl_gpu_runtime='', dnnl_opencl_root='', eigen_path=None, emscripten_settings=None, emsdk_version='3.1.51', enable_address_sanitizer=False, enable_cuda_line_info=False, enable_cuda_nhwc_ops=False, enable_cuda_profiling=False, enable_external_custom_op_schemas=False, enable_language_interop_ops=False, enable_lazy_tensor=False, enable_lto=False, enable_memory_profile=False, enable_msinternal=False, enable_msvc_static_runtime=False, enable_nccl=False, enable_nvtx_profile=False, enable_onnx_tests=False, enable_pybind=False, enable_reduced_operator_type_support=False, enable_rocm_profiling=False, enable_symbolic_shape_infer_tests=False, enable_training=False, enable_training_apis=False, enable_training_ops=False, enable_transformers_tool_test=False, enable_wasm_api_exception_catching=False, enable_wasm_debug_info=False, enable_wasm_exception_throwing_override=True, enable_wasm_profiling=False, enable_wasm_simd=False, enable_wasm_threads=False, enable_wcos=False, extensions_overridden_path=None, external_graph_transformer_path=None, fuzz_testing=False, gdk_edition='.', gdk_platform='Scarlett', gen_api_doc=False, gen_doc=None, include_ops_by_config=None, ios=False, ios_toolchain_file='', llvm_config='', llvm_path=None, migraphx_home=None, minimal_build=None, mpi_home=None, ms_experimental=False, msbuild_extra_options=None, msvc_toolset=None, nccl_home=None, nnapi_min_api=None, numpy_version=None, nvcc_threads=-1, osx_arch='x86_64', parallel=0, path_to_protoc_exe=None, qnn_home=None, riscv_qemu_path='', riscv_toolchain_root='', rocm_home=None, rocm_version=None, rv64=False, skip_keras_test=False, skip_nodejs_tests=False, skip_onnx_tests=False, skip_submodule_sync=True, skip_tests=False, skip_winml_tests=False, snpe_root=None, target=None, tensorrt_home=None, test=False, test_all_timeout='10800', tvm_cuda_runtime=False, update=False, use_acl=None, use_armnn=True, use_azure=False, use_binskim_compliant_compile_flags=False, use_cache=False, use_cann=False, use_coreml=False, use_cuda=False, use_dml=False, use_dnnl=False, use_extensions=False, use_full_protobuf=False, use_gdk=False, use_jsep=False, use_lock_free_queue=False, use_migraphx=False, use_mimalloc=False, use_mpi=False, use_nnapi=False, use_openvino=None, use_preinstalled_eigen=False, use_qnn=False, use_rknpu=False, use_rocm=False, 
use_snpe=False, use_telemetry=False, use_tensorrt=False, use_tensorrt_builtin_parser=True, use_tensorrt_oss_parser=False, use_triton_kernel=False, use_tvm=False, use_tvm_hash=False, use_vitisai=False, use_webnn=False, use_winml=False, use_xnnpack=False, wasm_malloc=None, wasm_run_tests_in_browser=False, wheel_name_suffix=None, windows_sdk_version=None, winml_root_namespace_override=None, x86=False, xcode_code_signing_identity='', xcode_code_signing_team_id='')
2024-03-14 14:49:53,332 build [DEBUG] - Defaulting to running update, build [and test for native builds].
2024-03-14 14:49:53,333 build [INFO] - Build started
2024-03-14 14:49:53,334 build [INFO] - Generating CMake build tree
2024-03-14 14:49:53,335 build [INFO] - /home/rock/.local/bin/cmake /home/rock/temp/onnxruntime/cmake --compile-no-warning-as-error -Donnxruntime_RUN_ONNX_TESTS=OFF -Donnxruntime_GENERATE_TEST_REPORTS=ON -DPython_EXECUTABLE=/usr/bin/python3 -DPYTHON_EXECUTABLE=/usr/bin/python3 -Donnxruntime_USE_MIMALLOC=OFF -Donnxruntime_ENABLE_PYTHON=OFF -Donnxruntime_BUILD_CSHARP=OFF -Donnxruntime_BUILD_JAVA=OFF -Donnxruntime_BUILD_NODEJS=OFF -Donnxruntime_BUILD_OBJC=OFF -Donnxruntime_BUILD_SHARED_LIB=ON -Donnxruntime_BUILD_APPLE_FRAMEWORK=OFF -Donnxruntime_USE_DNNL=OFF -Donnxruntime_USE_NNAPI_BUILTIN=OFF -Donnxruntime_USE_RKNPU=OFF -Donnxruntime_USE_LLVM=OFF -Donnxruntime_ENABLE_MICROSOFT_INTERNAL=OFF -Donnxruntime_USE_VITISAI=OFF -Donnxruntime_USE_TENSORRT=OFF -Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=ON -Donnxruntime_USE_TVM=OFF -Donnxruntime_TVM_CUDA_RUNTIME=OFF -Donnxruntime_TVM_USE_HASH=OFF -Donnxruntime_USE_MIGRAPHX=OFF -Donnxruntime_DISABLE_CONTRIB_OPS=OFF -Donnxruntime_DISABLE_ML_OPS=OFF -Donnxruntime_DISABLE_RTTI=OFF -Donnxruntime_DISABLE_EXCEPTIONS=OFF -Donnxruntime_MINIMAL_BUILD=OFF -Donnxruntime_EXTENDED_MINIMAL_BUILD=OFF -Donnxruntime_MINIMAL_BUILD_CUSTOM_OPS=OFF -Donnxruntime_REDUCED_OPS_BUILD=OFF -Donnxruntime_ENABLE_LANGUAGE_INTEROP_OPS=OFF -Donnxruntime_USE_DML=OFF -Donnxruntime_USE_WINML=OFF -Donnxruntime_BUILD_MS_EXPERIMENTAL_OPS=OFF -Donnxruntime_USE_TELEMETRY=OFF -Donnxruntime_ENABLE_LTO=OFF -Donnxruntime_USE_ACL=OFF -Donnxruntime_USE_ACL_1902=OFF -Donnxruntime_USE_ACL_1905=OFF -Donnxruntime_USE_ACL_1908=OFF -Donnxruntime_USE_ACL_2002=OFF -Donnxruntime_USE_ACL_2308=OFF -Donnxruntime_USE_ARMNN=ON -Donnxruntime_ARMNN_RELU_USE_CPU=OFF -Donnxruntime_ARMNN_BN_USE_CPU=OFF -Donnxruntime_USE_JSEP=OFF -Donnxruntime_ENABLE_NVTX_PROFILE=OFF -Donnxruntime_ENABLE_TRAINING=OFF -Donnxruntime_ENABLE_TRAINING_OPS=OFF -Donnxruntime_ENABLE_TRAINING_APIS=OFF -Donnxruntime_ENABLE_CPU_FP16_OPS=OFF -Donnxruntime_USE_NCCL=OFF -Donnxruntime_BUILD_BENCHMARKS=OFF -Donnxruntime_USE_ROCM=OFF -DOnnxruntime_GCOV_COVERAGE=OFF -Donnxruntime_USE_MPI=OFF -Donnxruntime_ENABLE_MEMORY_PROFILE=OFF -Donnxruntime_ENABLE_CUDA_LINE_NUMBER_INFO=OFF -Donnxruntime_USE_CUDA_NHWC_OPS=OFF -Donnxruntime_BUILD_WEBASSEMBLY_STATIC_LIB=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_CATCHING=ON -Donnxruntime_ENABLE_WEBASSEMBLY_API_EXCEPTION_CATCHING=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_THROWING=ON -Donnxruntime_WEBASSEMBLY_RUN_TESTS_IN_BROWSER=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_THREADS=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_DEBUG_INFO=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_PROFILING=OFF -Donnxruntime_ENABLE_LAZY_TENSOR=OFF -Donnxruntime_ENABLE_EXTERNAL_CUSTOM_OP_SCHEMAS=OFF -Donnxruntime_ENABLE_CUDA_PROFILING=OFF -Donnxruntime_ENABLE_ROCM_PROFILING=OFF -Donnxruntime_USE_XNNPACK=OFF -Donnxruntime_USE_WEBNN=OFF -Donnxruntime_USE_CANN=OFF -Donnxruntime_USE_TRITON_KERNEL=OFF -Donnxruntime_DISABLE_FLOAT8_TYPES=ON -Donnxruntime_DISABLE_SPARSE_TENSORS=OFF -Donnxruntime_DISABLE_OPTIONAL_TYPE=OFF -DCMAKE_TLS_VERIFY=ON -DFETCHCONTENT_QUIET=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_PREFIX_PATH=/home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/installed
Ignoring COMPILE_WARNING_AS_ERROR target property and CMAKE_COMPILE_WARNING_AS_ERROR variable.
CMake Deprecation Warning at CMakeLists.txt:14 (cmake_policy):
  The OLD behavior for policy CMP0104 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.


CMake Warning (dev) at CMakeLists.txt:55 (include):
  Policy CMP0145 is not set: The Dart and FindDart modules are removed.  Run
  "cmake --help-policy CMP0145" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.

This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at /home/rock/.local/lib/python3.8/site-packages/cmake/data/share/cmake-3.28/Modules/Dart.cmake:47 (message):
  Policy CMP0145 is not set: The Dart and FindDart modules are removed.  Run
  "cmake --help-policy CMP0145" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
Call Stack (most recent call first):
  CMakeLists.txt:55 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at /home/rock/.local/lib/python3.8/site-packages/cmake/data/share/cmake-3.28/Modules/Dart.cmake:57 (find_package):
  Policy CMP0144 is not set: find_package uses upper-case <PACKAGENAME>_ROOT
  variables.  Run "cmake --help-policy CMP0144" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.

  CMake variable DART_ROOT is set to:

    DART_ROOT-NOTFOUND

  For compatibility, find_package is ignoring the variable, but code in a
  .cmake module might still use it.
Call Stack (most recent call first):
  CMakeLists.txt:55 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

F16C instruction set is not supported.
FMA instruction set is not supported.
AVX instruction set is not supported.
One or more AVX/F16C instruction flags are not supported.
Building ONNX Runtime for aarch64 CPU ARCH
Patch found: /usr/bin/patch
Loading Dependencies URLs ...
Loading Dependencies ...
-- Populating abseil_cpp
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/abseil_cpp-subbuild
[100%] Built target abseil_cpp-populate
-- Abseil source dir:/home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/abseil_cpp-src
-- Populating date
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/date-subbuild
[100%] Built target date-populate
-- Populating google_nsync
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/google_nsync-subbuild
[100%] Built target google_nsync-populate
CMake Deprecation Warning at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/google_nsync-src/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Populating safeint
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/safeint-subbuild
[100%] Built target safeint-populate
-- Populating utf8_range
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/utf8_range-subbuild
[100%] Built target utf8_range-populate
-- Populating protobuf
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/protobuf-subbuild
[100%] Built target protobuf-populate
--
-- 3.21.12.0
-- Populating nlohmann_json
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/nlohmann_json-subbuild
[100%] Built target nlohmann_json-populate
CMake Deprecation Warning at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/nlohmann_json-src/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Using the single-header code from /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/nlohmann_json-src/single_include/
-- Populating mp11
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/mp11-subbuild
[100%] Built target mp11-populate
-- Populating re2
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/re2-subbuild
[100%] Built target re2-populate
-- Populating gsl
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-subbuild
[100%] Built target gsl-populate
-- Populating flatbuffers
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/flatbuffers-subbuild
[100%] Built target flatbuffers-populate
CMake Deprecation Warning at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/flatbuffers-src/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Populating pytorch_cpuinfo
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/pytorch_cpuinfo-subbuild
[100%] Built target pytorch_cpuinfo-populate
-- Populating pytorch_clog
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/pytorch_clog-subbuild
[100%] Built target pytorch_clog-populate
CMake Deprecation Warning at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/pytorch_clog-src/deps/clog/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- Populating googletest
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/googletest-subbuild
[100%] Built target googletest-populate
-- Populating eigen
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/eigen-subbuild
[100%] Built target eigen-populate
-- Populating onnx
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-subbuild
[100%] Built target onnx-populate
CMake Deprecation Warning at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-src/CMakeLists.txt:2 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


CMake Warning (dev) at /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-src/CMakeLists.txt:112 (find_package):
  Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules
  are removed.  Run "cmake --help-policy CMP0148" for policy details.  Use
  the cmake_policy command to set the policy and suppress this warning.

This warning is for project developers.  Use -Wno-dev to suppress it.

Generated: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-build/onnx/onnx-ml.proto
Generated: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-build/onnx/onnx-operators-ml.proto
Generated: /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/_deps/onnx-build/onnx/onnx-data.proto
--
-- ******** Summary ********
--   CMake version                     : 3.28.3
--   CMake command                     : /home/rock/.local/lib/python3.8/site-packages/cmake/data/bin/cmake
--   System                            : Linux
--   C++ compiler                      : /usr/bin/c++
--   C++ compiler version              : 9.4.0
--   CXX flags                         :  -ffunction-sections -fdata-sections -Wno-restrict  -DCPUINFO_SUPPORTED -Wnon-virtual-dtor
--   Build type                        : RelWithDebInfo
--   Compile definitions               : ORT_ENABLE_STREAM;EIGEN_MPL2_ONLY;_GNU_SOURCE;__STDC_FORMAT_MACROS
--   CMAKE_PREFIX_PATH                 : /home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/installed
--   CMAKE_INSTALL_PREFIX              : /usr/local
--   CMAKE_MODULE_PATH                 : /home/rock/temp/onnxruntime/cmake/external
--
--   ONNX version                      : 1.15.0
--   ONNX NAMESPACE                    : onnx
--   ONNX_USE_LITE_PROTO               : ON
--   USE_PROTOBUF_SHARED_LIBS          : OFF
--   Protobuf_USE_STATIC_LIBS          : ON
--   ONNX_DISABLE_EXCEPTIONS           : OFF
--   ONNX_DISABLE_STATIC_REGISTRATION  : OFF
--   ONNX_WERROR                       : OFF
--   ONNX_BUILD_TESTS                  : OFF
--   ONNX_BUILD_BENCHMARKS             : OFF
--   ONNX_BUILD_SHARED_LIBS            :
--   BUILD_SHARED_LIBS                 : OFF
--
--   Protobuf compiler                 :
--   Protobuf includes                 :
--   Protobuf libraries                :
--   BUILD_ONNX_PYTHON                 : OFF
Finished fetching external dependencies
NVCC_ERROR =
NVCC_OUT = No such file or directory
CMake Error at CMakeLists.txt:659 (message):
  The compiler doesn't support BFLOAT16!!!


-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/home/rock/temp/onnxruntime/tools/ci_build/build.py", line 2914, in <module>
    sys.exit(main())
  File "/home/rock/temp/onnxruntime/tools/ci_build/build.py", line 2771, in main
    generate_build_tree(
  File "/home/rock/temp/onnxruntime/tools/ci_build/build.py", line 1635, in generate_build_tree
    run_subprocess(
  File "/home/rock/temp/onnxruntime/tools/ci_build/build.py", line 850, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/home/rock/temp/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/rock/.local/bin/cmake', '/home/rock/temp/onnxruntime/cmake', '--compile-no-warning-as-error', '-Donnxruntime_RUN_ONNX_TESTS=OFF', '-Donnxruntime_GENERATE_TEST_REPORTS=ON', '-DPython_EXECUTABLE=/usr/bin/python3', '-DPYTHON_EXECUTABLE=/usr/bin/python3', '-Donnxruntime_USE_MIMALLOC=OFF', '-Donnxruntime_ENABLE_PYTHON=OFF', '-Donnxruntime_BUILD_CSHARP=OFF', '-Donnxruntime_BUILD_JAVA=OFF', '-Donnxruntime_BUILD_NODEJS=OFF', '-Donnxruntime_BUILD_OBJC=OFF', '-Donnxruntime_BUILD_SHARED_LIB=ON', '-Donnxruntime_BUILD_APPLE_FRAMEWORK=OFF', '-Donnxruntime_USE_DNNL=OFF', '-Donnxruntime_USE_NNAPI_BUILTIN=OFF', '-Donnxruntime_USE_RKNPU=OFF', '-Donnxruntime_USE_LLVM=OFF', '-Donnxruntime_ENABLE_MICROSOFT_INTERNAL=OFF', '-Donnxruntime_USE_VITISAI=OFF', '-Donnxruntime_USE_TENSORRT=OFF', '-Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=ON', '-Donnxruntime_USE_TVM=OFF', '-Donnxruntime_TVM_CUDA_RUNTIME=OFF', '-Donnxruntime_TVM_USE_HASH=OFF', '-Donnxruntime_USE_MIGRAPHX=OFF', '-Donnxruntime_DISABLE_CONTRIB_OPS=OFF', '-Donnxruntime_DISABLE_ML_OPS=OFF', '-Donnxruntime_DISABLE_RTTI=OFF', '-Donnxruntime_DISABLE_EXCEPTIONS=OFF', '-Donnxruntime_MINIMAL_BUILD=OFF', '-Donnxruntime_EXTENDED_MINIMAL_BUILD=OFF', '-Donnxruntime_MINIMAL_BUILD_CUSTOM_OPS=OFF', '-Donnxruntime_REDUCED_OPS_BUILD=OFF', '-Donnxruntime_ENABLE_LANGUAGE_INTEROP_OPS=OFF', '-Donnxruntime_USE_DML=OFF', '-Donnxruntime_USE_WINML=OFF', '-Donnxruntime_BUILD_MS_EXPERIMENTAL_OPS=OFF', '-Donnxruntime_USE_TELEMETRY=OFF', '-Donnxruntime_ENABLE_LTO=OFF', '-Donnxruntime_USE_ACL=OFF', '-Donnxruntime_USE_ACL_1902=OFF', '-Donnxruntime_USE_ACL_1905=OFF', '-Donnxruntime_USE_ACL_1908=OFF', '-Donnxruntime_USE_ACL_2002=OFF', '-Donnxruntime_USE_ACL_2308=OFF', '-Donnxruntime_USE_ARMNN=ON', '-Donnxruntime_ARMNN_RELU_USE_CPU=OFF', '-Donnxruntime_ARMNN_BN_USE_CPU=OFF', '-Donnxruntime_USE_JSEP=OFF', '-Donnxruntime_ENABLE_NVTX_PROFILE=OFF', '-Donnxruntime_ENABLE_TRAINING=OFF', '-Donnxruntime_ENABLE_TRAINING_OPS=OFF', '-Donnxruntime_ENABLE_TRAINING_APIS=OFF', '-Donnxruntime_ENABLE_CPU_FP16_OPS=OFF', '-Donnxruntime_USE_NCCL=OFF', '-Donnxruntime_BUILD_BENCHMARKS=OFF', '-Donnxruntime_USE_ROCM=OFF', '-DOnnxruntime_GCOV_COVERAGE=OFF', '-Donnxruntime_USE_MPI=OFF', '-Donnxruntime_ENABLE_MEMORY_PROFILE=OFF', '-Donnxruntime_ENABLE_CUDA_LINE_NUMBER_INFO=OFF', '-Donnxruntime_USE_CUDA_NHWC_OPS=OFF', '-Donnxruntime_BUILD_WEBASSEMBLY_STATIC_LIB=OFF', '-Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_CATCHING=ON', '-Donnxruntime_ENABLE_WEBASSEMBLY_API_EXCEPTION_CATCHING=OFF', '-Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_THROWING=ON', '-Donnxruntime_WEBASSEMBLY_RUN_TESTS_IN_BROWSER=OFF', '-Donnxruntime_ENABLE_WEBASSEMBLY_THREADS=OFF', '-Donnxruntime_ENABLE_WEBASSEMBLY_DEBUG_INFO=OFF', '-Donnxruntime_ENABLE_WEBASSEMBLY_PROFILING=OFF', '-Donnxruntime_ENABLE_LAZY_TENSOR=OFF', '-Donnxruntime_ENABLE_EXTERNAL_CUSTOM_OP_SCHEMAS=OFF', '-Donnxruntime_ENABLE_CUDA_PROFILING=OFF', '-Donnxruntime_ENABLE_ROCM_PROFILING=OFF', '-Donnxruntime_USE_XNNPACK=OFF', '-Donnxruntime_USE_WEBNN=OFF', '-Donnxruntime_USE_CANN=OFF', '-Donnxruntime_USE_TRITON_KERNEL=OFF', '-Donnxruntime_DISABLE_FLOAT8_TYPES=ON', '-Donnxruntime_DISABLE_SPARSE_TENSORS=OFF', '-Donnxruntime_DISABLE_OPTIONAL_TYPE=OFF', '-DCMAKE_TLS_VERIFY=ON', '-DFETCHCONTENT_QUIET=OFF', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_PREFIX_PATH=/home/rock/temp/onnxruntime/build/Linux/RelWithDebInfo/installed']' returned non-zero exit status 1.

GCC's support for the bf16 feature modifier is as follows:

g++ example.cpp -march=armv8.2-a+bf16
cc1plus: error: invalid feature modifier ‘bf16’ in ‘-march=armv8.2-a+bf16’
cc1plus: note: valid arguments are: fp simd crypto crc lse fp16 rcpc rdma dotprod aes sha2 sha3 sm4 fp16fml sve profile rng memtag sb ssbs predres;

Visual Studio Version

n/a

GCC / Compiler Version

Ubuntu 9.4.0-1ubuntu1~20.04.2

@ChthonicOne ChthonicOne added the build label on Mar 14, 2024
@github-actions github-actions bot added the ep:ArmNN and platform:mobile labels on Mar 14, 2024
@hariharans29
Member

Are you able to build the default CPU EP after commenting out that check in CMakeLists.txt? My guess is that on ARM we require bfloat16 support to compile some MLAS kernels, which is why we have that check. Tagging @yufenglee @snnn to comment on the compiler bfloat16 support requirement for compiling the default CPU EP on ARM.

@ChthonicOne
Author

Attempting now. Surprisingly, there are two consecutive checks for what looks like the exact same thing in that CMakeLists.txt... no clue why you would check the same thing twice in a row. You might want to simplify that while you are at it.

@snnn
Member

snnn commented Mar 14, 2024

They are slightly different. "float16" vs "bfloat16"

   check_cxx_compiler_flag(-march=armv8.2-a+bf16 HAS_ARM64_BFLOAT16)
   if(NOT HAS_ARM64_BFLOAT16)
     message(FATAL_ERROR  "The compiler doesn't support BFLOAT16!!!")
   endif()
   check_cxx_compiler_flag(-march=armv8.2-a+fp16 HAS_ARM64_FLOAT16)
   if(NOT HAS_ARM64_FLOAT16)
     message(FATAL_ERROR  "The compiler doesn't support FLOAT16!!!")
   endif()
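For context, check_cxx_compiler_flag passes or fails by compiling a trivial source file with the flag appended, so the manual g++ probe shown in the issue body reproduces exactly what CMake is testing. A minimal probe (illustrative; any empty translation unit works):

    // probe.cpp -- try: g++ -c probe.cpp -march=armv8.2-a+bf16
    // GCC 9 (Ubuntu 20.04) rejects the +bf16 feature modifier, which is why
    // the first check above fails there; GCC 10 and later accept it.
    int main() { return 0; }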

@snnn
Member

snnn commented Mar 14, 2024

Would you mind upgrading your Ubuntu version from 20.04 to 22.04?

@ChthonicOne
Author

Right now, my project requires that I remain on 20.04. If upgrading is required to make this work, I can first test it on a clone of my SD card, then report to my project manager, and they can decide whether we need to step up.

For now, I'm going to wait and see if the project finishes building. If it doesn't build, I'll clone the SD card tomorrow and try stepping up then.

@ChthonicOne
Author

Yep, it failed on MLAS. Going to start the backup process for the SD card now; it'll probably take about an hour. I'll probably be back tomorrow with more information on whether things work in 22.04 Radxa-Ubuntu.

@ChthonicOne
Author

ChthonicOne commented Mar 15, 2024

An update:

-- Performing Test HAS_ARM64_BFLOAT16
-- Performing Test HAS_ARM64_BFLOAT16 - Success
-- Performing Test HAS_ARM64_FLOAT16
-- Performing Test HAS_ARM64_FLOAT16 - Success

I almost didn't have enough room to perform the do-release-upgrade on these tiny SD cards. So many packages wanted upgrading that one actually ran out of space, but the failure was non-critical and I was able to finish the update once space had been cleared.

Now, as I said before, I'm not sure that I even need BFLOAT16 at all in my use case. I think there should be a way to disable it in the build scripts for those who don't need it and who, for some reason, can't build with a non-IEEE format.

@ChthonicOne
Author

Hrm... I have a problem. It seems I've now run into this error:

In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc:6:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h: In member function ‘onnxruntime::common::Status onnxruntime::armnn_ep::Gemm<T>::Compute(onnxruntime::OpKernelContext*) const’:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:63:8: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wdangling-else]
   63 |     if (X) LOGS_DEFAULT(VERBOSE) << "X " << X->Shape().ToString().c_str();
      |        ^
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:64:8: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wdangling-else]
   64 |     if (W) LOGS_DEFAULT(VERBOSE) << "W " << W->Shape().ToString().c_str();
      |        ^
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:65:8: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wdangling-else]
   65 |     if (B) LOGS_DEFAULT(VERBOSE) << "B " << B->Shape().ToString().c_str();
      |        ^
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:109:53: error: no matching function for call to ‘armnn::INetwork::AddFullyConnectedLayer(armnn::FullyConnectedDescriptor&, armnn::ConstTensor&, armnn::Optional<armnn::ConstTensor>, const char [9])’
  109 |         fc_armnn = myNetwork->AddFullyConnectedLayer(fcDescriptor,
      |                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
  110 |                                                      weights,
      |                                                      ~~~~~~~~
  111 |                                                      armnn::Optional<armnn::ConstTensor>(bias),
      |                                                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  112 |                                                      "fc_armnn");
      |                                                      ~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/armnn_common.h:9,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc:5:
/usr/local/include/armnn/INetwork.hpp:477:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddFullyConnectedLayer(const armnn::FullyConnectedDescriptor&, const char*)’
  477 |     IConnectableLayer* AddFullyConnectedLayer(const FullyConnectedDescriptor& fullyConnectedDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:477:24: note:   candidate expects 2 arguments, 4 provided
In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc:6:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:114:53: error: no matching function for call to ‘armnn::INetwork::AddFullyConnectedLayer(armnn::FullyConnectedDescriptor&, armnn::ConstTensor&, armnn::EmptyOptional, const char [9])’
  114 |         fc_armnn = myNetwork->AddFullyConnectedLayer(fcDescriptor,
      |                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
  115 |                                                      weights,
      |                                                      ~~~~~~~~
  116 |                                                      armnn::EmptyOptional(),
      |                                                      ~~~~~~~~~~~~~~~~~~~~~~~
  117 |                                                      "fc_armnn");
      |                                                      ~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/armnn_common.h:9,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc:5:
/usr/local/include/armnn/INetwork.hpp:477:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddFullyConnectedLayer(const armnn::FullyConnectedDescriptor&, const char*)’
  477 |     IConnectableLayer* AddFullyConnectedLayer(const FullyConnectedDescriptor& fullyConnectedDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:477:24: note:   candidate expects 2 arguments, 4 provided
In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc:6:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h: In instantiation of ‘onnxruntime::common::Status onnxruntime::armnn_ep::Gemm<T>::Compute(onnxruntime::OpKernelContext*) const [with T = float]’:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:35:10:   required from here
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:103:23: warning: narrowing conversion of ‘(element_type)(&((const onnxruntime::Tensor*)B)->onnxruntime::Tensor::Shape())->onnxruntime::TensorShape::GetDims().gsl::span<const long int>::operator[](1)’ from ‘element_type’ {aka ‘long int’} to ‘unsigned int’ [-Wnarrowing]
  103 |             biasShape = {B->Shape().GetDims()[1]};
      |             ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:103:23: warning: narrowing conversion of ‘(&((const onnxruntime::Tensor*)B)->onnxruntime::Tensor::Shape())->onnxruntime::TensorShape::GetDims().gsl::span<const long int>::operator[](1)’ from ‘gsl::span<const long int>::element_type’ {aka ‘const long int’} to ‘unsigned int’ [-Wnarrowing]
gmake[2]: *** [CMakeFiles/onnxruntime_providers_armnn.dir/build.make:132: CMakeFiles/onnxruntime_providers_armnn.dir/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.cc.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2063: CMakeFiles/onnxruntime_providers_armnn.dir/all] Error 2

@ChthonicOne
Author

From INetwork.hpp:

    /// Adds a fully connected layer to the network.
    /// @param fullyConnectedDescriptor - Description of the fully connected layer.
    /// @return - Interface for configuring the layer.
    ///
    /// @note Weights and biases are passed in as inputs. If they are constant tensors you can simply store
    ///       them in a ConstantLayer as seen below. A full example can be found in samples/SimpleSample.cpp.
    ///
    /// @code
    /// // Make sure the IsConstant flag is set on the weightsInfo before passing it to the ConstTensor.
    /// ConstTensor weights(weightsInfo, weightsData);
    ///
    /// // Constant layer that now holds weights data for FullyConnected
    /// IConnectableLayer* const constantWeightsLayer = myNetwork->AddConstantLayer(weights, "weights");
    ///
    /// FullyConnectedDescriptor fullyConnectedDesc;
    /// IConnectableLayer* const fullyConnectedLayer = myNetwork->AddFullyConnectedLayer(fullyConnectedDesc,
    ///                                                                                  "fully connected");
    /// IConnectableLayer* InputLayer = myNetwork->AddInputLayer(0);
    /// InputLayer->GetOutputSlot(0).Connect(fullyConnectedLayer->GetInputSlot(0));
    /// constantWeightsLayer->GetOutputSlot(0).Connect(fullyConnectedLayer->GetInputSlot(1));
    /// @endcode
    IConnectableLayer* AddFullyConnectedLayer(const FullyConnectedDescriptor& fullyConnectedDescriptor,
                                              const char* name = nullptr);

There is no other "AddFullyConnectedLayer" method in the INetwork class, so I don't know what you are trying to call here. I built the most recent armnn repository two days ago; the build itself took two days.

@ChthonicOne
Author

From the sample code, it looks like you have to put the weights into a constant layer before you make the fully connected layer. Then you need to make a second constant layer for the bias and connect it as an input to the layer as well.
I'm going to try this myself in the code and see if it compiles as a workaround.
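A minimal sketch of that pattern, going by the INetwork.hpp doc comment quoted above (the helper name and the use of input slots 1 and 2 for weights and bias are my reading of the header, not tested code):

    #include <armnn/ArmNN.hpp>

    // Hypothetical helper: builds a FullyConnected layer fed by ConstantLayers
    // for the weights and bias, per the header's sample code.
    armnn::IConnectableLayer* AddFcWithConstants(armnn::INetwork& network,
                                                 const armnn::ConstTensor& weights,
                                                 const armnn::ConstTensor& bias) {
      armnn::FullyConnectedDescriptor fcDescriptor;
      fcDescriptor.m_BiasEnabled = true;

      // Only the descriptor-plus-name overload exists in current ArmNN.
      armnn::IConnectableLayer* fc =
          network.AddFullyConnectedLayer(fcDescriptor, "fc_armnn");

      // Weights and bias become ConstantLayers wired into input slots 1 and 2.
      armnn::IConnectableLayer* weightsLayer =
          network.AddConstantLayer(weights, "fc_weights");
      armnn::IConnectableLayer* biasLayer =
          network.AddConstantLayer(bias, "fc_bias");

      weightsLayer->GetOutputSlot(0).SetTensorInfo(weights.GetInfo());
      biasLayer->GetOutputSlot(0).SetTensorInfo(bias.GetInfo());
      weightsLayer->GetOutputSlot(0).Connect(fc->GetInputSlot(1));
      biasLayer->GetOutputSlot(0).Connect(fc->GetInputSlot(2));
      return fc;
    }

As the header notes, the weights' TensorInfo must have IsConstant set before the ConstTensor is constructed.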

@ChthonicOne
Author

The code fix for gemm.cc works. I'll post it in a second, but now conv.cc has issues:

In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:122:66: error: cannot convert ‘std::vector<long int>’ to ‘onnxruntime::TensorShapeVector&’ {aka ‘absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >&’}
  122 |   ORT_RETURN_IF_ERROR(conv_attrs_.ComputeKernelShape(W->Shape(), kernel_shape));
      |                                                                  ^~~~~~~~~~~~
      |                                                                  |
      |                                                                  std::vector<long int>
/home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:225:21: note: in definition of macro ‘ORT_RETURN_IF_ERROR_SESSIONID’
  225 |     auto _status = (expr);                                                                                             \
      |                     ^~~~
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:122:3: note: in expansion of macro ‘ORT_RETURN_IF_ERROR’
  122 |   ORT_RETURN_IF_ERROR(conv_attrs_.ComputeKernelShape(W->Shape(), kernel_shape));
      |   ^~~~~~~~~~~~~~~~~~~
In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/cpu/nn/conv.h:7,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.h:7,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:14:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/cpu/nn/conv_attributes.h:76:81: note:   initializing argument 2 of ‘onnxruntime::common::Status onnxruntime::ConvAttributes::ComputeKernelShape(const onnxruntime::TensorShape&, onnxruntime::TensorShapeVector&, bool) const’
   76 |   Status ComputeKernelShape(const TensorShape& weight_shape, TensorShapeVector& kernel_shape, bool weight_channels_last = false) const {
      |                                                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
In file included from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:140:35: error: ‘const struct onnxruntime::ConvAttributes’ has no member named ‘InferOutputShape’; did you mean ‘InferPadsAndOutputShape’?
  140 |   ORT_RETURN_IF_ERROR(conv_attrs_.InferOutputShape(input_shape, kernel_shape, strides, dilations, pads, Y_dims));
      |                                   ^~~~~~~~~~~~~~~~
/home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:225:21: note: in definition of macro ‘ORT_RETURN_IF_ERROR_SESSIONID’
  225 |     auto _status = (expr);                                                                                             \
      |                     ^~~~
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:140:3: note: in expansion of macro ‘ORT_RETURN_IF_ERROR’
  140 |   ORT_RETURN_IF_ERROR(conv_attrs_.InferOutputShape(input_shape, kernel_shape, strides, dilations, pads, Y_dims));
      |   ^~~~~~~~~~~~~~~~~~~
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:161:81: error: could not convert ‘pads’ from ‘onnxruntime::ConvAttributes::ConvPadVector’ {aka ‘absl::lts_20240116::InlinedVector<long int, 10, std::allocator<long int> >’} to ‘std::vector<long int>’
  161 |     armnn::Convolution2dDescriptor convolutionDescriptor = createConvDescriptor(pads, dilations, strides, biasEnabled);
      |                                                                                 ^~~~
      |                                                                                 |
      |                                                                                 onnxruntime::ConvAttributes::ConvPadVector {aka absl::lts_20240116::InlinedVector<long int, 10, std::allocator<long int> >}
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:185:72: error: no matching function for call to ‘armnn::INetwork::AddDepthwiseConvolution2dLayer(armnn::DepthwiseConvolution2dDescriptor&, armnn::ConstTensor&, armnn::Optional<armnn::ConstTensor>, const char [28])’
  185 |           convolution_armnn = myNetwork->AddDepthwiseConvolution2dLayer(depthwiseDescriptor,
      |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
  186 |                                                                         weights,
      |                                                                         ~~~~~~~~
  187 |                                                                         armnn::Optional<armnn::ConstTensor>(bias),
      |                                                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  188 |                                                                         "depthwise_convolution_armnn");
      |                                                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.h:10,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:14:
/usr/local/include/armnn/INetwork.hpp:417:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddDepthwiseConvolution2dLayer(const armnn::DepthwiseConvolution2dDescriptor&, const char*)’
  417 |     IConnectableLayer* AddDepthwiseConvolution2dLayer(const DepthwiseConvolution2dDescriptor& convolution2dDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:417:24: note:   candidate expects 2 arguments, 4 provided
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:190:72: error: no matching function for call to ‘armnn::INetwork::AddDepthwiseConvolution2dLayer(armnn::DepthwiseConvolution2dDescriptor&, armnn::ConstTensor&, armnn::EmptyOptional, const char [28])’
  190 |           convolution_armnn = myNetwork->AddDepthwiseConvolution2dLayer(depthwiseDescriptor,
      |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
  191 |                                                                         weights,
      |                                                                         ~~~~~~~~
  192 |                                                                         armnn::EmptyOptional(),
      |                                                                         ~~~~~~~~~~~~~~~~~~~~~~~
  193 |                                                                         "depthwise_convolution_armnn");
      |                                                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.h:10,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:14:
/usr/local/include/armnn/INetwork.hpp:417:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddDepthwiseConvolution2dLayer(const armnn::DepthwiseConvolution2dDescriptor&, const char*)’
  417 |     IConnectableLayer* AddDepthwiseConvolution2dLayer(const DepthwiseConvolution2dDescriptor& convolution2dDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:417:24: note:   candidate expects 2 arguments, 4 provided
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:209:61: error: no matching function for call to ‘armnn::INetwork::AddConvolution2dLayer(armnn::Convolution2dDescriptor&, armnn::ConstTensor&, armnn::Optional<armnn::ConstTensor>, const char [18])’
  209 |         convolution_armnn = myNetwork->AddConvolution2dLayer(convolutionDescriptor,
      |                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
  210 |                                                              weights,
      |                                                              ~~~~~~~~
  211 |                                                              armnn::Optional<armnn::ConstTensor>(bias),
      |                                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  212 |                                                              "convolution_armnn");
      |                                                              ~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.h:10,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:14:
/usr/local/include/armnn/INetwork.hpp:396:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddConvolution2dLayer(const armnn::Convolution2dDescriptor&, const char*)’
  396 |     IConnectableLayer* AddConvolution2dLayer(const Convolution2dDescriptor& convolution2dDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:396:24: note:   candidate expects 2 arguments, 4 provided
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:214:61: error: no matching function for call to ‘armnn::INetwork::AddConvolution2dLayer(armnn::Convolution2dDescriptor&, armnn::ConstTensor&, armnn::EmptyOptional, const char [18])’
  214 |         convolution_armnn = myNetwork->AddConvolution2dLayer(convolutionDescriptor,
      |                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
  215 |                                                              weights,
      |                                                              ~~~~~~~~
  216 |                                                              armnn::EmptyOptional(),
      |                                                              ~~~~~~~~~~~~~~~~~~~~~~~
  217 |                                                              "convolution_armnn");
      |                                                              ~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/include/armnn/ArmNN.hpp:11,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.h:10,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:14:
/usr/local/include/armnn/INetwork.hpp:396:24: note: candidate: ‘armnn::IConnectableLayer* armnn::INetwork::AddConvolution2dLayer(const armnn::Convolution2dDescriptor&, const char*)’
  396 |     IConnectableLayer* AddConvolution2dLayer(const Convolution2dDescriptor& convolution2dDescriptor,
      |                        ^~~~~~~~~~~~~~~~~~~~~
/usr/local/include/armnn/INetwork.hpp:396:24: note:   candidate expects 2 arguments, 4 provided
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h: In instantiation of ‘onnxruntime::common::Status onnxruntime::armnn_ep::Gemm<T>::Compute(onnxruntime::OpKernelContext*) const [with T = float]’:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:35:10:   required from here
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:106:23: warning: narrowing conversion of ‘(element_type)(&((const onnxruntime::Tensor*)B)->onnxruntime::Tensor::Shape())->onnxruntime::TensorShape::GetDims().gsl::span<const long int>::operator[](1)’ from ‘element_type’ {aka ‘long int’} to ‘unsigned int’ [-Wnarrowing]
  106 |             biasShape = {B->Shape().GetDims()[1]};
      |             ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/math/gemm.h:106:23: warning: narrowing conversion of ‘(&((const onnxruntime::Tensor*)B)->onnxruntime::Tensor::Shape())->onnxruntime::TensorShape::GetDims().gsl::span<const long int>::operator[](1)’ from ‘gsl::span<const long int>::element_type’ {aka ‘const long int’} to ‘unsigned int’ [-Wnarrowing]
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc: In instantiation of ‘onnxruntime::common::Status onnxruntime::armnn_ep::Conv<T>::Compute(onnxruntime::OpKernelContext*) const [with T = float]’:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:93:8:   required from here
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:128:24: error: no matching function for call to ‘std::vector<long int>::vector(const TensorShapeVector&)’
  128 |   std::vector<int64_t> dilations(conv_attrs_.dilations);
      |                        ^~~~~~~~~
In file included from /usr/include/c++/11/vector:67,
                 from /usr/include/c++/11/functional:62,
                 from /usr/include/c++/11/pstl/glue_algorithm_defs.h:13,
                 from /usr/include/c++/11/algorithm:74,
                 from /home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:22,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/usr/include/c++/11/bits/stl_vector.h:653:9: note: candidate: ‘template<class _InputIterator, class> std::vector<_Tp, _Alloc>::vector(_InputIterator, _InputIterator, const allocator_type&) [with _InputIterator = _InputIterator; <template-parameter-2-2> = <template-parameter-1-2>; _Tp = long int; _Alloc = std::allocator<long int>]’
  653 |         vector(_InputIterator __first, _InputIterator __last,
      |         ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:653:9: note:   template argument deduction/substitution failed:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:128:24: note:   candidate expects 3 arguments, 1 provided
  128 |   std::vector<int64_t> dilations(conv_attrs_.dilations);
      |                        ^~~~~~~~~
In file included from /usr/include/c++/11/vector:67,
                 from /usr/include/c++/11/functional:62,
                 from /usr/include/c++/11/pstl/glue_algorithm_defs.h:13,
                 from /usr/include/c++/11/algorithm:74,
                 from /home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:22,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/usr/include/c++/11/bits/stl_vector.h:625:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::initializer_list<_Tp>, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  625 |       vector(initializer_list<value_type> __l,
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:625:43: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘std::initializer_list<long int>’
  625 |       vector(initializer_list<value_type> __l,
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
/usr/include/c++/11/bits/stl_vector.h:607:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  607 |       vector(vector&& __rv, const allocator_type& __m)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:607:7: note:   candidate expects 2 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:589:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::false_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>; std::false_type = std::integral_constant<bool, false>]’
  589 |       vector(vector&& __rv, const allocator_type& __m, false_type)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:589:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:585:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::true_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>; std::true_type = std::integral_constant<bool, true>]’
  585 |       vector(vector&& __rv, const allocator_type& __m, true_type) noexcept
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:585:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:575:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  575 |       vector(const vector& __x, const allocator_type& __a)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:575:7: note:   candidate expects 2 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:572:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&) [with _Tp = long int; _Alloc = std::allocator<long int>]’
  572 |       vector(vector&&) noexcept = default;
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:572:14: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘std::vector<long int>&&’
  572 |       vector(vector&&) noexcept = default;
      |              ^~~~~~~~
/usr/include/c++/11/bits/stl_vector.h:553:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&) [with _Tp = long int; _Alloc = std::allocator<long int>]’
  553 |       vector(const vector& __x)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:553:28: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘const std::vector<long int>&’
  553 |       vector(const vector& __x)
      |              ~~~~~~~~~~~~~~^~~
/usr/include/c++/11/bits/stl_vector.h:522:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>::size_type, const value_type&, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::size_type = long unsigned int; std::vector<_Tp, _Alloc>::value_type = long int; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  522 |       vector(size_type __n, const value_type& __value,
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:522:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:510:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>::size_type, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::size_type = long unsigned int; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  510 |       vector(size_type __n, const allocator_type& __a = allocator_type())
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:510:24: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘std::vector<long int>::size_type’ {aka ‘long unsigned int’}
  510 |       vector(size_type __n, const allocator_type& __a = allocator_type())
      |              ~~~~~~~~~~^~~
/usr/include/c++/11/bits/stl_vector.h:497:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  497 |       vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:497:36: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘const allocator_type&’ {aka ‘const std::allocator<long int>&’}
  497 |       vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
      |              ~~~~~~~~~~~~~~~~~~~~~~^~~
/usr/include/c++/11/bits/stl_vector.h:487:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector() [with _Tp = long int; _Alloc = std::allocator<long int>]’
  487 |       vector() = default;
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:487:7: note:   candidate expects 0 arguments, 1 provided
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:132:24: error: no matching function for call to ‘std::vector<long int>::vector(const TensorShapeVector&)’
  132 |   std::vector<int64_t> strides(conv_attrs_.strides);
      |                        ^~~~~~~
In file included from /usr/include/c++/11/vector:67,
                 from /usr/include/c++/11/functional:62,
                 from /usr/include/c++/11/pstl/glue_algorithm_defs.h:13,
                 from /usr/include/c++/11/algorithm:74,
                 from /home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:22,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/usr/include/c++/11/bits/stl_vector.h:653:9: note: candidate: ‘template<class _InputIterator, class> std::vector<_Tp, _Alloc>::vector(_InputIterator, _InputIterator, const allocator_type&) [with _InputIterator = _InputIterator; <template-parameter-2-2> = <template-parameter-1-2>; _Tp = long int; _Alloc = std::allocator<long int>]’
  653 |         vector(_InputIterator __first, _InputIterator __last,
      |         ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:653:9: note:   template argument deduction/substitution failed:
/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:132:24: note:   candidate expects 3 arguments, 1 provided
  132 |   std::vector<int64_t> strides(conv_attrs_.strides);
      |                        ^~~~~~~
In file included from /usr/include/c++/11/vector:67,
                 from /usr/include/c++/11/functional:62,
                 from /usr/include/c++/11/pstl/glue_algorithm_defs.h:13,
                 from /usr/include/c++/11/algorithm:74,
                 from /home/rock/temp/onnxruntime/include/onnxruntime/core/common/common.h:22,
                 from /home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc:9:
/usr/include/c++/11/bits/stl_vector.h:625:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::initializer_list<_Tp>, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  625 |       vector(initializer_list<value_type> __l,
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:625:43: note:   no known conversion for argument 1 from ‘const TensorShapeVector’ {aka ‘const absl::lts_20240116::InlinedVector<long int, 6, std::allocator<long int> >’} to ‘std::initializer_list<long int>’
  625 |       vector(initializer_list<value_type> __l,
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
/usr/include/c++/11/bits/stl_vector.h:607:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  607 |       vector(vector&& __rv, const allocator_type& __m)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:607:7: note:   candidate expects 2 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:589:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::false_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>; std::false_type = std::integral_constant<bool, false>]’
  589 |       vector(vector&& __rv, const allocator_type& __m, false_type)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:589:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:585:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(std::vector<_Tp, _Alloc>&&, const allocator_type&, std::true_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>; std::true_type = std::integral_constant<bool, true>]’
  585 |       vector(vector&& __rv, const allocator_type& __m, true_type) noexcept
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:585:7: note:   candidate expects 3 arguments, 1 provided
/usr/include/c++/11/bits/stl_vector.h:575:7: note: candidate: ‘std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp, _Alloc>&, const allocator_type&) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::allocator_type = std::allocator<long int>]’
  575 |       vector(const vector& __x, const allocator_type& __a)
      |       ^~~~~~
/usr/include/c++/11/bits/stl_vector.h:575:7: note:   candidate expects 2 arguments, 1 provided
[... remaining constructor candidates are identical to those listed above ...]
gmake[2]: *** [CMakeFiles/onnxruntime_providers_armnn.dir/build.make:160: CMakeFiles/onnxruntime_providers_armnn.dir/home/rock/temp/onnxruntime/onnxruntime/core/providers/armnn/nn/conv.cc.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....

These look like similar problems. It looks like this code has some API incompatibilities with the newer ArmNN.
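
For reference, the error at conv.cc:132 is the implicit conversion that disappeared when TensorShapeVector became an absl::InlinedVector. A minimal sketch of the kind of local fix, assuming nothing about the surrounding code beyond what the log shows, is to construct the std::vector from the iterator range:

  // TensorShapeVector no longer converts to std::vector<int64_t>;
  // build the vector from the InlinedVector's begin/end iterators instead.
  std::vector<int64_t> strides(conv_attrs_.strides.begin(), conv_attrs_.strides.end());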

@ChthonicOne
Author

ChthonicOne commented Mar 15, 2024

Ok, the code patch to gemm.h was as follows:

...
      armnn::IConnectableLayer* fc_armnn;

      armnn::TensorInfo weightsInfo(weightShape, armnn::DataType::Float32);
      armnn::ConstTensor weights(weightsInfo, w_data);
      armnn::IConnectableLayer* const constantWeightsLayer = myNetwork->AddConstantLayer(weights, "weights");

      armnn::IConnectableLayer* InputLayer;

      if (fcDescriptor.m_BiasEnabled) {
        armnn::TensorShape biasShape = ArmNNTensorShape(B->Shape());
        if (B->Shape().NumDimensions() == 2) {
          if (B->Shape().GetDims()[0] == 1 && B->Shape().GetDims()[1] > 1) {
            biasShape = {B->Shape().GetDims()[1]};
            LOGS_DEFAULT(VERBOSE) << "Bias reshaped to: {" << B->Shape().GetDims()[1] << "}";
          }
        }
        armnn::TensorInfo biasDesc(biasShape, armnn::DataType::Float32);
        armnn::ConstTensor bias(biasDesc, b_data);
        armnn::IConnectableLayer* const constantBiasLayer = myNetwork->AddConstantLayer(bias, "bias");
        fc_armnn = myNetwork->AddFullyConnectedLayer(fcDescriptor,
                                                     "fc_armnn");
        InputLayer = myNetwork->AddInputLayer(0);
        InputLayer->GetOutputSlot(0).Connect(fc_armnn->GetInputSlot(0));
        constantWeightsLayer->GetOutputSlot(0).Connect(fc_armnn->GetInputSlot(1));
        constantBiasLayer->GetOutputSlot(0).Connect(fc_armnn->GetInputSlot(2));
      } else {
        fc_armnn = myNetwork->AddFullyConnectedLayer(fcDescriptor,
                                                     "fc_armnn");
        InputLayer = myNetwork->AddInputLayer(0);
        InputLayer->GetOutputSlot(0).Connect(fc_armnn->GetInputSlot(0));
        constantWeightsLayer->GetOutputSlot(0).Connect(fc_armnn->GetInputSlot(1));
      }

      armnn::IConnectableLayer* OutputLayer = myNetwork->AddOutputLayer(0);

      // The input slot was already connected to fc_armnn in both branches above,
      // so only the output connection is needed here.
      fc_armnn->GetOutputSlot(0).Connect(OutputLayer->GetInputSlot(0));
...

Keep in mind that I am working over an SSH connection with command-line access to these files, and my only means of error checking is running your build script again, so my progress is slow. I don't have a proper environment set up to work on your code.

Also, this probably could have been done better by creating an inline function that emulates the old API call. How you ultimately do it is your choice. I'll continue plugging away unless you have a solution for me before I'm done.
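
For illustration, a minimal sketch of such a helper, using only the ArmNN calls already shown in the patch above (the helper name and signature are hypothetical):

  // Hypothetical helper emulating the removed ArmNN overload that accepted
  // weights and bias directly: it adds constant layers and wires them to
  // input slots 1 (weights) and 2 (bias) of the fully connected layer.
  inline armnn::IConnectableLayer* AddFullyConnectedLayerWithConstants(
      armnn::INetwork& network,
      const armnn::FullyConnectedDescriptor& desc,
      const armnn::ConstTensor& weights,
      const armnn::ConstTensor* bias,  // pass nullptr when m_BiasEnabled is false
      const char* name) {
    armnn::IConnectableLayer* fc = network.AddFullyConnectedLayer(desc, name);
    network.AddConstantLayer(weights, "weights")->GetOutputSlot(0).Connect(fc->GetInputSlot(1));
    if (bias != nullptr) {
      network.AddConstantLayer(*bias, "bias")->GetOutputSlot(0).Connect(fc->GetInputSlot(2));
    }
    return fc;
  }

The caller would then connect its own input layer to fc->GetInputSlot(0) and its output layer to fc->GetOutputSlot(0), exactly as in the patch above.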

@snnn
Member

snnn commented Mar 15, 2024

It is an ArmNN-specific problem. Sorry, I cannot provide further help.

@ChthonicOne
Author

So, what you are saying is that they are the ones who write the ArmNN extension for onnxruntime?

@snnn
Member

snnn commented Mar 20, 2024

I mean that I didn't write the ArmNN extension code and am not familiar with it. Sorry, I cannot provide help on that.

@ChthonicOne
Author

I see. Since it is not related to the original issue, should I open a new issue for this topic to get the right developer's attention?

@ChthonicOne
Author

An update here: it seems the GPIO ports stopped working after the update to 22.04, and the Radxa-Zero developers have not yet put out a specially compiled kernel for Ubuntu 22.04. That makes upgrading to 22.04 an unsatisfactory workaround for the BFLOAT16 issue, at least until they provide a new kernel.

In the interim, is there a way you could provide a means to disable BFLOAT16 support so that I can build onnxruntime without it on the Radxa-Zero?

@huyunlei

On Ubuntu 18.04, a Jetson Xavier cannot build onnxruntime because of the BFLOAT16 error.

@snnn
Member

snnn commented Mar 25, 2024

@huyunlei, please try the JetPack 6.0 Developer Preview.

@snnn
Member

snnn commented Mar 25, 2024

@ChthonicOne, in your local branch you may revert #17031.

@dusty-nv

dusty-nv commented Apr 7, 2024

@snnn Xavier doesn't support JetPack 6, so people are still on JetPack 5. I suggest you handle this check gracefully in the future.

@lin168

lin168 commented Apr 18, 2024

Maybe you could try upgrading your GCC version.

@Slahuddin-Ch

Facing the same issue on Voxl 2 chips. Has anyone found a solution for this?

@clementperon
Contributor

I have the same issue. I think it should be auto-detected by CMake, for example:

--- a/cmake/onnxruntime_mlas.cmake
+++ b/cmake/onnxruntime_mlas.cmake
@@ -354,7 +354,7 @@ else()
         )
         set_source_files_properties(${MLAS_SRC_DIR}/sqnbitgemm_kernel_neon.cpp
                                     PROPERTIES COMPILE_FLAGS " -march=armv8.2-a+dotprod")
-        if (NOT APPLE)
+        if (NOT APPLE AND HAS_ARM64_FLOAT16 AND HAS_ARM64_BFLOAT16)
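
For context, a sketch of how those two CMake variables could be populated, assuming the standard CheckCXXCompilerFlag module (the exact probe onnxruntime uses may differ):

  include(CheckCXXCompilerFlag)
  # Probe whether the compiler accepts the fp16/bf16 target flags; an older GCC
  # fails these checks, so the guarded MLAS sources would simply be skipped.
  check_cxx_compiler_flag("-march=armv8.2-a+fp16" HAS_ARM64_FLOAT16)
  check_cxx_compiler_flag("-march=armv8.2-a+bf16" HAS_ARM64_BFLOAT16)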

@snnn
Member

snnn commented Jun 19, 2024

Please upgrade GCC, or apply local patches to remove the code.

@snnn closed this as not planned on Jun 19, 2024.