[Web] Memory access out of bounds / alignment fault #21355

cmario · 2024-07-15T15:43:12Z

Describe the issue

Hello,

I am exploring the use of ONNX, with a particular focus on the ORT model format for web applications. I developed a basic WASM module to perform inference using a UNET-like semantic segmentation model. However, the inference process throws an exception, which I have detailed below. Please note that the same code runs without issues outside of the WASM module.

I built the ONNX runtime for web with the following command:

./build.sh --config Release --build_wasm_static_lib --minimal_build --skip_tests --disable_wasm_exception_catching --disable_rtti

I built the WASM module with the following command:

emcc -g myModule.cpp -o myModule.js -I<opencv headers> -I<onnxruntime headers> -L<opencv lib> -L<onnxruntime lib> -lopencv_core -lopencv_imgproc -lonnxruntime_webassembly -s INITIAL_MEMORY=256MB -s EXPORTED_FUNCTIONS="['_processImage', '_malloc', '_free']" -s EXPORTED_RUNTIME_METHODS="['ccall', 'cwrap']" -s SAFE_HEAP=0 --bind

When running the inference I get the following error:

2024-07-15 18:48:24.453600 [I:onnxruntime:, inference_session.cc:514 TraceSessionOptions] Session Options {  execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath: enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:-1 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:2 intra_op_param:OrtThreadPoolParams { thread_pool_size: 1 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: {  } }
2024-07-15 18:48:24.454500 [I:onnxruntime:, inference_session.cc:414 operator()] Flush-to-zero and denormal-as-zero are off
2024-07-15 18:48:24.454600 [I:onnxruntime:, inference_session.cc:422 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-07-15 18:48:24.454800 [I:onnxruntime:, inference_session.cc:440 ConstructorCommon] Dynamic block base set to 0
2024-07-15 18:48:24.462000 [I:onnxruntime:, inference_session.cc:1583 Initialize] Initializing session.
2024-07-15 18:48:24.462100 [I:onnxruntime:, inference_session.cc:1620 Initialize] Adding default CPU execution provider.
2024-07-15 18:48:24.485000 [V:onnxruntime:, session_state.cc:126 CreateGraphInfo] SaveMLValueNameIndexMapping
2024-07-15 18:48:24.485500 [V:onnxruntime:, session_state.cc:172 CreateGraphInfo] Done saving OrtValue mappings.
2024-07-15 18:48:24.488600 [I:onnxruntime:, session_state_utils.cc:201 SaveInitializedTensors] Saving initialized tensors.
2024-07-15 18:48:24.489500 [I:onnxruntime:, session_state_utils.cc:345 SaveInitializedTensors] Done saving initialized tensors
2024-07-15 18:48:24.491300 [I:onnxruntime:, inference_session.cc:1969 Initialize] Session successfully initialized.

With SAFE_HEAP=0:

RuntimeError: memory access out of bounds
    at myModule.wasm.MlasSgemmOperation(CBLAS_TRANSPOSE, CBLAS_TRANSPOSE, unsigned long, unsigned long, unsigned long, float, float const*, unsigned long, float const*, unsigned long, float, float*, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[9520]:0x24bf4a)
    at myModule.wasm.MlasConvOperation(MLAS_CONV_PARAMETERS const*, float const*, float const*, float const*, float*, float*, unsigned long, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[10013]:0x2921cc)
    at myModule.wasm.MlasConv(MLAS_CONV_PARAMETERS const*, float const*, float const*, float const*, float*, float*, onnxruntime::concurrency::ThreadPool*) (http://localhost:8000/myModule.wasm:wasm-function[9978]:0x28c58a)
    at myModule.wasm.onnxruntime::Conv<float>::Compute(onnxruntime::OpKernelContext*) const (http://localhost:8000/myModule.wasm:wasm-function[9965]:0x288f4c)
    at myModule.wasm.onnxruntime::LaunchKernelStep::Execute(onnxruntime::StreamExecutionContext&, unsigned long, onnxruntime::SessionScope&, bool const&, bool&) (http://localhost:8000/myModule.wasm:wasm-function[7213]:0x1a3e82)
    at myModule.wasm.onnxruntime::RunSince(unsigned long, onnxruntime::StreamExecutionContext&, onnxruntime::SessionScope&, bool const&, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[7221]:0x1a6931)
    at myModule.wasm.onnxruntime::ExecuteThePlan(onnxruntime::SessionState const&, gsl::span<int const, 4294967295ul>, gsl::span<OrtValue const, 4294967295ul>, gsl::span<int const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, std::__2::unordered_map<unsigned long, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__2::hash<unsigned long>, std::__2::equal_to<unsigned long>, std::__2::allocator<std::__2::pair<unsigned long const, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>>>> const&, onnxruntime::logging::Logger const&, bool const&, bool, bool) (http://localhost:8000/myModule.wasm:wasm-function[6719]:0x152035)
    at myModule.wasm.onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, gsl::span<OrtValue const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, std::__2::unordered_map<unsigned long, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__2::hash<unsigned long>, std::__2::equal_to<unsigned long>, std::__2::allocator<std::__2::pair<unsigned long const, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>>>> const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool, onnxruntime::Stream*) (http://localhost:8000/myModule.wasm:wasm-function[6718]:0x14f436)
    at myModule.wasm.onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const, 4294967295ul>, gsl::span<OrtValue const, 4294967295ul>, gsl::span<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>*, std::__2::vector<OrtDevice, std::__2::allocator<OrtDevice>> const*) (http://localhost:8000/myModule.wasm:wasm-function[17805]:0x6f648b)
    at myModule.wasm.onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<char const* const, 4294967295ul>, gsl::span<OrtValue const* const, 4294967295ul>, gsl::span<char const* const, 4294967295ul>, gsl::span<OrtValue*, 4294967295ul>) (http://localhost:8000/myModule.wasm:wasm-function[5392]:0xf1758)

With SAFE_HEAP=1:

RuntimeError: Aborted(alignment fault)
    at abort (http://localhost:8000/myModule.js:625:41)
    at alignfault (http://localhost:8000/myModule.js:354:3)
    at myModule.wasm (http://localhost:8000/myModule.wasm:wasm-function[17477]:0x862651)
    at myModule.wasm.MlasConvIm2Col(MLAS_CONV_PARAMETERS const*, float const*, float*, unsigned long, unsigned long, unsigned long, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[9072]:0x2e0492)
    at myModule.wasm.MlasConvOperation(MLAS_CONV_PARAMETERS const*, float const*, float const*, float const*, float*, float*, unsigned long, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[9074]:0x2e119c)
    at myModule.wasm.MlasConv(MLAS_CONV_PARAMETERS const*, float const*, float const*, float const*, float*, float*, onnxruntime::concurrency::ThreadPool*) (http://localhost:8000/myModule.wasm:wasm-function[9039]:0x2da858)
    at myModule.wasm.onnxruntime::Conv<float>::Compute(onnxruntime::OpKernelContext*) const (http://localhost:8000/myModule.wasm:wasm-function[9026]:0x2d67b0)
    at myModule.wasm.onnxruntime::LaunchKernelStep::Execute(onnxruntime::StreamExecutionContext&, unsigned long, onnxruntime::SessionScope&, bool const&, bool&) (http://localhost:8000/myModule.wasm:wasm-function[6274]:0x1c14f5)
    at myModule.wasm.onnxruntime::RunSince(unsigned long, onnxruntime::StreamExecutionContext&, onnxruntime::SessionScope&, bool const&, unsigned long) (http://localhost:8000/myModule.wasm:wasm-function[6282]:0x1c49bf)
    at myModule.wasm.onnxruntime::ExecuteThePlan(onnxruntime::SessionState const&, gsl::span<int const, 4294967295ul>, gsl::span<OrtValue const, 4294967295ul>, gsl::span<int const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, std::__2::unordered_map<unsigned long, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__2::hash<unsigned long>, std::__2::equal_to<unsigned long>, std::__2::allocator<std::__2::pair<unsigned long const, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>>>> const&, onnxruntime::logging::Logger const&, bool const&, bool, bool) (http://localhost:8000/myModule.wasm:wasm-function[5780]:0x15fa3a)

Best regards,
Mario

To reproduce

Here is the code I used to test the ORT model:

extern "C" {
EMSCRIPTEN_KEEPALIVE
void processImage(const uint8_t* inputImageData, size_t inputImageDataSize, uint8_t* outputImageData, int width, int height) {
    cv::Mat image(height, width, CV_8UC4, const_cast<uint8_t*>(inputImageData));
    cv::Mat rgbImage;
    cv::cvtColor(image, rgbImage, cv::COLOR_BGRA2RGB);
    cv::Mat resizedImage;
    cv::resize(rgbImage, resizedImage, cv::Size(256, 256), 0, 0, cv::INTER_AREA);
    cv::Mat f32Image;
    resizedImage.convertTo(f32Image, CV_32F, 1.0 / 255);
    //
    std::vector<float> inputData;
    inputData.assign((float *) f32Image.datastart, (float *) f32Image.dataend);
    //
    Ort::SessionOptions session_options;
    session_options.SetIntraOpNumThreads(1);
    session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);
    // Decode the base64 model
    std::vector<uint8_t> model_data = base64_decode(base64_model);
    // Load the model from memory
    Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "test");
    Ort::MemoryInfo memory_info = Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);
    Ort::AllocatorWithDefaultOptions allocator;
    Ort::Session session(env, model_data.data(), model_data.size(), session_options);
    // input tensor
    std::vector<int64_t> inputShape = {1, 256, 256, 3};
    Ort::Value inputTensor = Ort::Value::CreateTensor<float>(memory_info, inputData.data(), inputData.size(),
                                                             inputShape.data(), inputShape.size());
    // output tensor
    std::vector<float> outputData(256 * 256 * 4);
    std::vector<int64_t> outputShape = {1, 256, 256, 1};
    Ort::Value outputTensor = Ort::Value::CreateTensor<float>(memory_info,
                                                              outputData.data(), outputData.size(),
                                                              outputShape.data(), outputShape.size());

    auto input_name_alloc = session.GetInputNameAllocated(0, allocator);
    const char *input_name = input_name_alloc.get();
    auto output_name_alloc = session.GetOutputNameAllocated(0, allocator);
    const char *output_name = output_name_alloc.get();

    // Run inference
    session.Run(Ort::RunOptions{nullptr}, &input_name, &inputTensor, 1, &output_name, &outputTensor, 1);

    // Process output tensor
    auto *float_array = outputTensor.GetTensorMutableData<float>();

    // Convert the output tensor to cv::Mat
    cv::Mat outputImg(256, 256, CV_32FC1, float_array);
    outputImg.convertTo(outputImg, CV_8UC1, 255.0);
    cv::resize(outputImg, outputImg, cv::Size(width, height));
    // Copy the output image to the outputImageData buffer
    std::memcpy(outputImageData, outputImg.data, width * height);
}
}

Urgency

No response

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

v1.17.1

Execution Provider

'wasm'/'cpu' (WebAssembly CPU)


YoniGBinahAi · 2024-08-04T14:09:36Z

I have the same issue with version 1.16.3 and emsdk 3.1.44.
My build command is:
./build.sh --config Debug --enable_wasm_simd --emsdk_version=3.1.44 --build_wasm_static_lib --enable_wasm_exception_throwing_override --enable_wasm_threads --enable_wasm_api_exception_catching --skip_tests

error :

YoniGBinahAi · 2024-08-06T06:09:36Z

@cmario - we have noticed that the function MlasSgemmOperation consumes a lot of stack memory. Since WASM by default allocates only 5MB for the stack, it fails there. You can try adding the following flag to your emcc command to see if it solves your issue. It did help us:
-s TOTAL_STACK=10MB

cmario · 2024-08-09T17:54:08Z

@YoniGBinahAi Thank you very much for your feedback. Increasing the total stack to 10MB resolves the issue.
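
To check that a larger stack setting actually reached the final module, the stack bounds can be queried at runtime. The following is a minimal sketch and not something taken from this thread: it assumes Emscripten's `<emscripten/stack.h>` helpers (`emscripten_stack_get_base`, `emscripten_stack_get_end`, `emscripten_stack_get_free`), and the function names `wasmStackSizeBytes` and `wasmStackFreeBytes` are hypothetical. Outside Emscripten both helpers return 0 so that the file still compiles. The default stack size depends on the Emscripten version, and as far as I know newer releases name the setting `STACK_SIZE` while keeping `TOTAL_STACK` as an alias, so it is worth checking the documentation for the emsdk in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef __EMSCRIPTEN__
#include <emscripten/stack.h>
#endif

// Hypothetical helper: total size of the WebAssembly stack in bytes.
// Returns 0 when not compiled with Emscripten.
std::size_t wasmStackSizeBytes() {
#ifdef __EMSCRIPTEN__
    // The stack grows downwards, so the base address is above the end address.
    const std::uintptr_t base = emscripten_stack_get_base();
    const std::uintptr_t end = emscripten_stack_get_end();
    assert(base >= end);
    return static_cast<std::size_t>(base - end);
#else
    return 0;
#endif
}

// Hypothetical helper: stack bytes still available at the point of the call.
// Returns 0 when not compiled with Emscripten.
std::size_t wasmStackFreeBytes() {
#ifdef __EMSCRIPTEN__
    return static_cast<std::size_t>(emscripten_stack_get_free());
#else
    return 0;
#endif
}
```

Logging `wasmStackSizeBytes()` at the start of `processImage` should report roughly 10MB when the module is linked with `-s TOTAL_STACK=10MB`, and logging `wasmStackFreeBytes()` just before `session.Run` gives a rough idea of the headroom left for the convolution kernels.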

cmario added the platform:web label (issues related to ONNX Runtime web; typically submitted using template) on Jul 15, 2024

cmario closed this as completed Sep 18, 2024
