How to use multiple inputs of different types in C++ session #18932

Open

vymao opened this issue Dec 26, 2023 · 0 comments

Describe the issue

I am trying to run the Olive-optimized Whisper model in C++. So far, the model's inputs and outputs look like this:

Inputs
	input_features : -1x-1x-1
	max_length : 1
	min_length : 1
	num_beams : 1
	num_return_sequences : 1
	length_penalty : 1
	repetition_penalty : 1
Outputs
	sequences : -1x-1x-1
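
For reference, a listing like this can be dumped with the session metadata API; a minimal sketch (inputs only):

#include <onnxruntime_cxx_api.h>
#include <iostream>

// Print each input's name and shape; -1 marks a dynamic dimension.
void print_inputs(Ort::Session& session)
{
    Ort::AllocatorWithDefaultOptions allocator;
    for (size_t i = 0; i < session.GetInputCount(); ++i) {
        auto name = session.GetInputNameAllocated(i, allocator);
        auto shape = session.GetInputTypeInfo(i)
                         .GetTensorTypeAndShapeInfo()
                         .GetShape();
        std::cout << name.get() << " : ";
        for (int64_t d : shape) std::cout << d << ' ';
        std::cout << '\n';
    }
}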

While it is standard to create input_features as an Ort::Value tensor, I am wondering what the correct way to supply the other inputs is. They are scalar ints or floats, and I'm not sure whether we are meant to create tensors for these values as well. Right now, I'm creating such tensors with these helpers:

#include <onnxruntime_cxx_api.h>

Ort::Value int_to_tensor(int32_t value)
{
    Ort::MemoryInfo mem_info =
        Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    std::vector<int64_t> shape = {1};

    // Wrap the address of the local parameter in a one-element tensor.
    auto tensor = Ort::Value::CreateTensor<int32_t>(mem_info, &value, 1, shape.data(), 1);
    // Sanity check: read the value back out.
    const int test = static_cast<int>(*tensor.GetTensorData<int32_t>());
    return tensor;
}

Ort::Value float_to_tensor(float value)
{
    Ort::MemoryInfo mem_info =
        Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    std::vector<int64_t> shape = {1};

    // Wrap the address of the local parameter in a one-element tensor.
    auto tensor = Ort::Value::CreateTensor<float>(mem_info, &value, 1, shape.data(), 1);
    // Sanity check: read the value back out (as float, matching the tensor's element type).
    const float* test = tensor.GetTensorData<float>();
    return tensor;
}
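
For context, here is a condensed sketch of how I wire these helpers into the session call (input_features construction elided; shown with the synchronous Run, but the RunAsync path assembles the same name/value arrays):

std::vector<Ort::Value> run_whisper(Ort::Session& session, Ort::Value input_features)
{
    std::vector<Ort::Value> inputs;
    inputs.push_back(std::move(input_features));  // MEL features tensor
    inputs.push_back(int_to_tensor(200));         // max_length
    inputs.push_back(int_to_tensor(0));           // min_length
    inputs.push_back(int_to_tensor(2));           // num_beams
    inputs.push_back(int_to_tensor(1));           // num_return_sequences
    inputs.push_back(float_to_tensor(1.0f));      // length_penalty
    inputs.push_back(float_to_tensor(1.0f));      // repetition_penalty

    const char* input_names[] = {"input_features", "max_length", "min_length",
                                 "num_beams", "num_return_sequences",
                                 "length_penalty", "repetition_penalty"};
    const char* output_names[] = {"sequences"};

    return session.Run(Ort::RunOptions{nullptr},
                       input_names, inputs.data(), inputs.size(),
                       output_names, 1);
}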

As sketched above, I append these Ort::Values to a single std::vector<Ort::Value>. However, when I feed this into RunAsync with the settings {/*max_length=*/200, /*min_length=*/0, /*num_beams=*/2, /*num_return_sequences=*/1, /*length_penalty=*/1.0f, /*repetition_penalty=*/1.0f} as the additional inputs, I get the following error:

2023-12-25 22:06:47.118 main[14133:703243] 2023-12-25 22:06:47.118036 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: /Users/runner/work/1/s/onnxruntime/contrib_ops/cpu/transformers/beam_search_parameters.cc:64 void onnxruntime::contrib::transformers::BeamSearchParameters::ParseFromInputs(onnxruntime::OpKernelContext *) max_length <= kMaxSequenceLength was false. max_length (32759) shall be no more than 4096
ERROR running model inference: Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: /Users/runner/work/1/s/onnxruntime/contrib_ops/cpu/transformers/beam_search_parameters.cc:64 void onnxruntime::contrib::transformers::BeamSearchParameters::ParseFromInputs(onnxruntime::OpKernelContext *) max_length <= kMaxSequenceLength was false. max_length (32759) shall be no more than 4096

I don't know where 32759 comes from, as I never pass that value anywhere.
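
My current suspicion, for what it's worth: the CreateTensor overload used above wraps the caller's buffer rather than copying it, so the returned tensors keep pointing at each helper's local value parameter after the helper returns, and the session then reads stale stack memory. If that is right, a garbage max_length like 32759 would be expected. A sketch of an alternative that lets ONNX Runtime own the scalar's storage (untested on my end; a float version differs only in the template argument):

#include <onnxruntime_cxx_api.h>
#include <vector>

// Allocate storage that the Ort::Value itself owns, then copy the scalar in,
// so nothing dangles once this helper returns.
Ort::Value owned_int_tensor(Ort::AllocatorWithDefaultOptions& allocator, int32_t value)
{
    std::vector<int64_t> shape = {1};
    Ort::Value tensor = Ort::Value::CreateTensor<int32_t>(
        allocator, shape.data(), shape.size());  // tensor-owned buffer
    *tensor.GetTensorMutableData<int32_t>() = value;
    return tensor;
}

Keeping the scalar variables alive in a struct that outlives the RunAsync call should work too.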

To reproduce

  1. Run the Whisper optimization here for Olive, omitting any pre/post processing so that raw MEL features are passed directly into the model
  2. Run the resulting ONNX model in a C++ environment

Urgency

Blocked

Platform

Mac

OS Version

13.5

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.2

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
