
Issue running a model in ONNXruntime #21571

Open
nachoogriis opened this issue Jul 31, 2024 · 2 comments
Labels
stale: issues that have not been addressed in a while; categorized by a bot

Comments

@nachoogriis

Describe the issue

I have generated an ONNX model from the yolo_nas_pose nano model, which can be found here. To generate the ONNX file, I added the following code cell.

from super_gradients.training import models

# Load the pretrained YOLO-NAS Pose nano model and export it to ONNX
yolo_nas_pose = models.get("yolo_nas_pose_n", pretrained_weights="coco_pose")
yolo_nas_pose.export(
    'yolo_nas_pose_n_8_256_128.onnx',
    preprocessing=True,
    postprocessing=True,
    confidence_threshold=0.2,
    num_pre_nms_predictions=1024,
    max_predictions_per_image=120,
    input_image_shape=[256, 128],
    batch_size=8,
    # quantization_mode left at its default
)
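To sanity-check the export on its own with onnxruntime, something like the following can be used (a minimal sketch; the dummy input's shape and dtype are assumptions and may need adjusting to match the exported graph):

import numpy as np
import onnxruntime as ort

# Run the exported model on a dummy batch; query the input name from the
# session rather than hard-coding it.
session = ort.InferenceSession('yolo_nas_pose_n_8_256_128.onnx',
                               providers=['CPUExecutionProvider'])
input_name = session.get_inputs()[0].name
# The export includes preprocessing, so uint8 NCHW input is assumed here;
# switch to float32 if the exported graph expects it.
dummy = np.zeros((8, 3, 256, 128), dtype=np.uint8)
outputs = session.run(None, {input_name: dummy})
print([o.shape for o in outputs])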

Besides this, I have merged this model with the following (simple) postprocessing model, which just takes the first of the 120 predictions the model makes for each image in the batch. Below is the TensorFlow code used to generate its ONNX file.

import tensorflow as tf

def create_post_processing_patches_tf():

    @tf.function
    def post_processing_patches_f(num_predictions, boxes, scores, joints):
        # Keep only the first of the 120 predictions for each image in the batch
        filtered_bboxes = tf.reshape(tf.gather(boxes, indices=[0], axis=1), shape=(8, 1, 4))
        filtered_scores = tf.reshape(tf.gather(scores, indices=[0], axis=1), shape=(8, 1))
        filtered_joints = tf.reshape(tf.gather(joints, indices=[0], axis=1), shape=(8, 1, 17, 3))

        return {
            'N': num_predictions,
            'filtered_bboxes': filtered_bboxes,
            'filtered_scores': filtered_scores,
            'filtered_joints': filtered_joints
        }

    # Input signatures match the yolo_nas_pose outputs for a batch of 8 images
    input_signature = [
        tf.TensorSpec(shape=(8, 1), dtype=tf.int64),             # num_predictions
        tf.TensorSpec(shape=(8, 120, 4), dtype=tf.float32),      # boxes
        tf.TensorSpec(shape=(8, 120), dtype=tf.float32),         # scores
        tf.TensorSpec(shape=(8, 120, 17, 3), dtype=tf.float32)   # joints
    ]

    return post_processing_patches_f, input_signature

import onnx
import tf2onnx

def create_post_processing_patches_onnx(
    post_processing_patches_tf_f, input_signature,
    output_file=MODELS_FOLDER.joinpath('post_processing_patches.onnx')):

    # Convert the tf.function to an ONNX model at opset 17 and save it
    onnx_model, external_tensor_storage = tf2onnx.convert.from_function(
        post_processing_patches_tf_f,
        input_signature=input_signature,
        opset=17, custom_ops=None,
        custom_op_handlers=None, custom_rewriter=None, inputs_as_nchw=None,
        outputs_as_nchw=None, extra_opset=None, shape_override=None,
        target=None, large_model=False, output_path=None)

    onnx.save(onnx_model, output_file)
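The two helpers are used together like this (a small usage sketch):

# Build the tf.function and its signature, then convert and save the ONNX file
post_processing_patches_f, input_signature = create_post_processing_patches_tf()
create_post_processing_patches_onnx(post_processing_patches_f, input_signature)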

When I run this model with onnxruntime on its own, it works correctly. The problem arises when I merge it with a previously created model. The inputs are of exactly the same shape and data type, and the error occurs in one of yolo_nas_pose's ONNX nodes. The error I get is the following.

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'posegraph2_/Concat' Status Message: /Users/runner/work/1/s/onnxruntime/core/framework/op_kernel.cc:83 virtual OrtValue *onnxruntime::OpKernelContext::OutputMLValue(int, const onnxruntime::TensorShape &) status.IsOK() was false. Shape mismatch attempting to re-use buffer. {4,3} != {6,3}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model.

Does anyone know what might be happening? I have tried running shape inference, but the error still occurs.
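For reference, a minimal sketch of the shape-inference pass (full_pose_model.onnx refers to the merged model saved in the snippet below):

import onnx
from onnx import shape_inference

# Load the merged model and run ONNX shape inference over it
model = onnx.load('full_pose_model.onnx')
inferred = shape_inference.infer_shapes(model)

# Print each inferred shape so dim_value/dim_param mismatches can be spotted
for vi in inferred.graph.value_info:
    dims = [d.dim_param or d.dim_value for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)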

To reproduce

To reproduce, generate the two models described above (the necessary code is in the “Describe the issue” section) and merge them using this code snippet.

# Merge the pose model with the postprocessing model, mapping the pose
# model's outputs to the postprocessing model's inputs
full_pose_model = onnx.compose.merge_models(pose_model, post_processing_patches, [
    ('graph2_num_predictions', 'num_predictions'),
    ('graph2_post_nms_boxes', 'boxes'),
    ('graph2_post_nms_scores', 'scores'),
    ('graph2_post_nms_joints', 'joints')
    ], prefix2='pose_')

onnx.save(full_pose_model, directory_path.joinpath('full_pose_model.onnx'))

Then, merge this model with a preceding model whose outputs correspond to its inputs, as sketched below.
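A rough sketch of that final merge (previous_model and the io_map names are placeholders for the actual upstream model and its output/input pairing):

# Hypothetical final merge; replace previous_model and the name pair with
# the real upstream model and its matching output/input names.
complete_model = onnx.compose.merge_models(
    previous_model, full_pose_model,
    io_map=[('previous_output', 'pose_input')],
    prefix2='full_')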

Urgency

No response

Platform

Mac

OS Version

Sonoma 14.1

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@wangyems
Contributor

Hi @nachoogriis, I'm not an expert on onnx.compose.merge_models(), but since you mention that merging the models triggers the ORT exception in the first model, I'd suggest checking whether the merge changes that model's graph, especially the "Concat" part.

A workaround for your case is to merge the models in the original framework and export them as a single ONNX model.
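One way to do that check is to look up the failing node in the merged model and compare its inputs and outputs against the pre-merge model (a rough sketch; the node name is taken from the error message):

import onnx

def find_node(model, name):
    # Return the first node whose name matches, or None if absent
    return next((n for n in model.graph.node if n.name == name), None)

merged = onnx.load('full_pose_model.onnx')
node = find_node(merged, 'posegraph2_/Concat')  # node named in the error
if node is not None:
    print('inputs: ', list(node.input))
    print('outputs:', list(node.output))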

github-actions bot

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label on Aug 31, 2024