How to use multiple inputs of different types in C++ session #18932

Open

vymao opened this issue Dec 26, 2023 · 0 comments

Describe the issue

I am trying to run the Olive-optimized Whisper model in C++. So far, the model's inputs and outputs look like this:

Inputs
	input_features : -1x-1x-1
	max_length : 1
	min_length : 1
	num_beams : 1
	num_return_sequences : 1
	length_penalty : 1
	repetition_penalty : 1
Outputs
	sequences : -1x-1x-1
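
For reference, a listing like this can be dumped with the session metadata API; a minimal sketch (inputs only):

#include <onnxruntime_cxx_api.h>
#include <iostream>

// Print each input's name and shape; -1 marks a dynamic dimension.
void print_inputs(Ort::Session& session)
{
    Ort::AllocatorWithDefaultOptions allocator;
    for (size_t i = 0; i < session.GetInputCount(); ++i) {
        auto name = session.GetInputNameAllocated(i, allocator);
        auto shape = session.GetInputTypeInfo(i)
                         .GetTensorTypeAndShapeInfo()
                         .GetShape();
        std::cout << name.get() << " : ";
        for (int64_t d : shape) std::cout << d << ' ';
        std::cout << '\n';
    }
}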

While it is standard to create input_features as an Ort::Value tensor, I am wondering what the correct way to supply the other inputs is. They are scalar ints or floats, and I'm not sure whether we are meant to create tensors for these values as well. Right now, I'm creating such tensors with these helpers:

#include <onnxruntime_cxx_api.h>

Ort::Value int_to_tensor(int32_t value)
{
    Ort::MemoryInfo mem_info =
        Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    std::vector<int64_t> shape = {1};

    // Wrap the address of the local parameter in a one-element tensor.
    auto tensor = Ort::Value::CreateTensor<int32_t>(mem_info, &value, 1, shape.data(), 1);
    // Sanity check: read the value back out.
    const int test = static_cast<int>(*tensor.GetTensorData<int32_t>());
    return tensor;
}

Ort::Value float_to_tensor(float value)
{
    Ort::MemoryInfo mem_info =
        Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    std::vector<int64_t> shape = {1};

    // Wrap the address of the local parameter in a one-element tensor.
    auto tensor = Ort::Value::CreateTensor<float>(mem_info, &value, 1, shape.data(), 1);
    // Sanity check: read the value back out (as float, matching the tensor's element type).
    const float* test = tensor.GetTensorData<float>();
    return tensor;
}
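
For context, here is a condensed sketch of how I wire these helpers into the session call (input_features construction elided; shown with the synchronous Run, but the RunAsync path assembles the same name/value arrays):

std::vector<Ort::Value> run_whisper(Ort::Session& session, Ort::Value input_features)
{
    std::vector<Ort::Value> inputs;
    inputs.push_back(std::move(input_features));  // MEL features tensor
    inputs.push_back(int_to_tensor(200));         // max_length
    inputs.push_back(int_to_tensor(0));           // min_length
    inputs.push_back(int_to_tensor(2));           // num_beams
    inputs.push_back(int_to_tensor(1));           // num_return_sequences
    inputs.push_back(float_to_tensor(1.0f));      // length_penalty
    inputs.push_back(float_to_tensor(1.0f));      // repetition_penalty

    const char* input_names[] = {"input_features", "max_length", "min_length",
                                 "num_beams", "num_return_sequences",
                                 "length_penalty", "repetition_penalty"};
    const char* output_names[] = {"sequences"};

    return session.Run(Ort::RunOptions{nullptr},
                       input_names, inputs.data(), inputs.size(),
                       output_names, 1);
}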

As sketched above, I append these Ort::Values to a single std::vector<Ort::Value>. However, when I feed this into RunAsync with the settings {/*max_length=*/200, /*min_length=*/0, /*num_beams=*/2, /*num_return_sequences=*/1, /*length_penalty=*/1.0f, /*repetition_penalty=*/1.0f} as the additional inputs, I get the following error:

2023-12-25 22:06:47.118 main[14133:703243] 2023-12-25 22:06:47.118036 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: /Users/runner/work/1/s/onnxruntime/contrib_ops/cpu/transformers/beam_search_parameters.cc:64 void onnxruntime::contrib::transformers::BeamSearchParameters::ParseFromInputs(onnxruntime::OpKernelContext *) max_length <= kMaxSequenceLength was false. max_length (32759) shall be no more than 4096
ERROR running model inference: Non-zero status code returned while running BeamSearch node. Name:'BeamSearch_node' Status Message: /Users/runner/work/1/s/onnxruntime/contrib_ops/cpu/transformers/beam_search_parameters.cc:64 void onnxruntime::contrib::transformers::BeamSearchParameters::ParseFromInputs(onnxruntime::OpKernelContext *) max_length <= kMaxSequenceLength was false. max_length (32759) shall be no more than 4096

I don't know where 32759 comes from, as I never pass that value anywhere.
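
My current suspicion, for what it's worth: the CreateTensor overload used above wraps the caller's buffer rather than copying it, so the returned tensors keep pointing at each helper's local value parameter after the helper returns, and the session then reads stale stack memory. If that is right, a garbage max_length like 32759 would be expected. A sketch of an alternative that lets ONNX Runtime own the scalar's storage (untested on my end; a float version differs only in the template argument):

#include <onnxruntime_cxx_api.h>
#include <vector>

// Allocate storage that the Ort::Value itself owns, then copy the scalar in,
// so nothing dangles once this helper returns.
Ort::Value owned_int_tensor(Ort::AllocatorWithDefaultOptions& allocator, int32_t value)
{
    std::vector<int64_t> shape = {1};
    Ort::Value tensor = Ort::Value::CreateTensor<int32_t>(
        allocator, shape.data(), shape.size());  // tensor-owned buffer
    *tensor.GetTensorMutableData<int32_t>() = value;
    return tensor;
}

Keeping the scalar variables alive in a struct that outlives the RunAsync call should work too.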

To reproduce

  1. Run the Whisper optimization here for Olive, omitting any pre/post processing so that raw MEL features are passed directly into the model
  2. Run the resulting ONNX model in a C++ environment

Urgency

Blocked

Platform

Mac

OS Version

13.5

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.2

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
