Describe the issue
We create models in Azure ML pipelines and convert them to ONNX format.
Recently, we increased the number of estimators in an ensemble model (LGBMClassifier) to 300, which increased the file size to ~200 MB.
The older model had 33 estimators and a file size of around 2 MB.
Creating an InferenceSession with this new, larger file takes ~10 minutes in C#, while creating an InferenceSession with the same file in Python takes ~15 seconds. [See onxx_results.png in the OneDrive link shared via email.]
We would like to understand this performance difference.
We have sent the link to the OneDrive folder containing the code and files in a separate email with the subject: [Performance] Difference in the ONNX model loading times in C# vs Python
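For reference, a minimal sketch of how we create and time the session in C# (the model path is a placeholder; the actual files are in the OneDrive folder):

```csharp
using System;
using System.Diagnostics;
using Microsoft.ML.OnnxRuntime;

class Program
{
    static void Main()
    {
        // Placeholder path; the actual ~200 MB model is in the "new" folder on OneDrive.
        var modelPath = @"new\model.onnx";

        var sw = Stopwatch.StartNew();
        // Default SessionOptions, default CPU execution provider.
        using var session = new InferenceSession(modelPath);
        sw.Stop();

        Console.WriteLine($"InferenceSession created in {sw.Elapsed.TotalSeconds:F1} s");
    }
}
```

The Python side of the comparison is the equivalent call, onnxruntime.InferenceSession(model_path), timed the same way.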
To reproduce
The OneDrive folder mentioned in the email with the subject [Performance] Difference in the ONNX model loading times in C# vs Python contains a README.md.txt that outlines all the steps to reproduce the issue.
The "old" folder contains ~2MB model with 33 estimators.
The "new" folder contains ~200MB model with 300 estimators.
Please let us know if any other information is needed.
Urgency
No response
Platform
Windows
OS Version
Microsoft Windows 11 Enterprise, 10.0.26100 Build 26100, Surface Laptop 5, 12th Gen Intel(R) Core(TM) i7-1265U, 2700 MHz, 10 Core(s), 12 Logical Processor(s)
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
ONNX Runtime 1.18
ONNX Runtime API
C#
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No