
Inference Layer by Layer or feature extraction on Onnx Runtime #19954

Open
IzanCatalan opened this issue Mar 16, 2024 · 2 comments
Labels: stale (issues that have not been addressed in a while; categorized by a bot)

Comments

@IzanCatalan

Describe the issue

Hi everyone, I would like to know whether it is possible to perform layer-by-layer inference in ONNX Runtime with a pre-trained model (in fp32 or int8 data types).

My idea is to take several fp32 and int8-quantized models from the ONNX Model Zoo repo and run inference layer by layer to perform feature extraction. After that, I would modify the outputs of each layer and use them as new inputs for the following layers.

The approximate code would be something like this:

import numpy as np
import onnxruntime as ort

model_path = "model.onnx"
ort_session = ort.InferenceSession(model_path)

input_data = np.random.randn(1, 3, 32, 32).astype(np.float32)

# Run the first layer on the original input...
conv1_output = ort_session.run(None, {'input1': input_data})[0]

# ...then feed its output in as the input of the next layer
conv2_output = ort_session.run(None, {'input2': conv1_output})[0]

# Now I can work with intermediate outputs, modify them, and use them as new inputs

However, I tried to reproduce this code with a ResNet-50 pre-trained model from the ONNX Model Zoo repo, but it seems this model, like the rest of the pre-trained models, has only one input and one output, with no way of accessing intermediate outputs.
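
For the read-only half of this (extracting features without feeding modified values back in), one common workaround is to promote intermediate tensors to graph outputs by editing the model with the onnx Python package. A minimal sketch, assuming onnx is installed; onnxruntime generally accepts extra outputs declared by name only:

import onnx

model = onnx.load("model.onnx")

# Promote every tensor produced by a node to a graph output so a single
# InferenceSession.run(None, ...) call returns all activations.
existing = {out.name for out in model.graph.output}
for node in model.graph.node:
    for name in node.output:
        if name not in existing:
            model.graph.output.append(onnx.ValueInfoProto(name=name))

onnx.save(model, "model_all_outputs.onnx")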

So, is there any way I could do the full loop, modifying the intermediate outputs and feeding them back in?

Thank you!

To reproduce

I am running an onnxruntime build from source with CUDA 11.2, GCC 9.5, CMake 3.27, and Python 3.8 on Ubuntu 20.04.

Urgency

No response

Platform

Linux

OS Version

20.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

onnxruntime-gpu 1.12.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

Cuda 11.2

github-actions bot added the ep:CUDA label (issues related to the CUDA execution provider) on Mar 16, 2024
@hariharans29 (Member)

No, ORT does not support this scenario. Each "session" conceptually maps to an entire model, not a portion of the model.

To achieve what you want, you would have to break up each model at the layers you are interested in into separate sub-models and chain them together, as in the sample code you pasted.
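
For example, a minimal sketch of that split using onnx.utils.extract_model from the onnx Python package; the names input1, conv1_out, and output here are hypothetical and must be replaced with the actual value names in your graph (inspect the model, e.g. with netron, to find them):

import numpy as np
import onnx.utils
import onnxruntime as ort

# Cut the original model into two sub-models at a chosen intermediate tensor.
onnx.utils.extract_model("model.onnx", "part1.onnx",
                         input_names=["input1"], output_names=["conv1_out"])
onnx.utils.extract_model("model.onnx", "part2.onnx",
                         input_names=["conv1_out"], output_names=["output"])

sess1 = ort.InferenceSession("part1.onnx")
sess2 = ort.InferenceSession("part2.onnx")

input_data = np.random.randn(1, 3, 32, 32).astype(np.float32)
conv1_out = sess1.run(None, {"input1": input_data})[0]
conv1_out = conv1_out * 0.5  # modify the intermediate output here
final_out = sess2.run(None, {"conv1_out": conv1_out})[0]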

Hope this helps.

@sophies927 removed the ep:CUDA label (issues related to the CUDA execution provider) on Mar 21, 2024

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label (issues that have not been addressed in a while; categorized by a bot) on Apr 21, 2024