
[Web] Light GBM .ort model multiple times larger than .onnx model #17691

Open
rstokes92 opened this issue Sep 25, 2023 · 3 comments

@rstokes92

Describe the issue

When I convert a LightGBM .onnx model with python -m onnxruntime.tools.convert_onnx_models_to_ort, the resulting .ort file is 2-3 times larger than the original .onnx file. I understand that the ORT format is primarily intended to enable a smaller runtime build, but I was surprised that the model size itself increased.

Is this expected in general, or could it be specific to LightGBM/tree-based models?

To reproduce

# The operator-level converter is what update_registered_converter expects;
# also importing the top-level onnxmltools.convert_lightgbm would shadow it.
from onnxmltools.convert.lightgbm.operator_converters.LightGbm import (
    convert_lightgbm,
)

from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import (
    calculate_linear_classifier_output_shapes,
)

from lightgbm import LGBMClassifier
from sklearn.datasets import load_iris


X, y = load_iris(return_X_y=True)

model = LGBMClassifier().fit(X, y)

# Register the LightGBM converter so skl2onnx knows how to handle LGBMClassifier
update_registered_converter(
    LGBMClassifier,
    "LightGbmLGBMClassifier",
    calculate_linear_classifier_output_shapes,
    convert_lightgbm,
    options={"nocl": [True, False], "zipmap": [True, False]},
)

dim = len(model.feature_name_)
initial_type = [("float_input", FloatTensorType([None, dim]))]
onnx_model = convert_sklearn(
    model,
    initial_types=initial_type,
    options={id(model): {"zipmap": False, "nocl": True}},
)

with open("lgbm_iris_test.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

Gives lgbm_iris_test.onnx with a size of 142.3 KB.

python -m onnxruntime.tools.convert_onnx_models_to_ort "lgbm_iris_test.onnx" --output_dir "lgbm_iris_ort"

Results in lgbm_iris_ort/lgbm_iris_test.ort with a size of 337 KB.
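
For reference, a quick way to confirm the sizes on disk (a small sketch using the paths produced by the repro above):

import os

# Paths as written by the steps above
onnx_size = os.path.getsize("lgbm_iris_test.onnx")
ort_size = os.path.getsize("lgbm_iris_ort/lgbm_iris_test.ort")

print(f".onnx: {onnx_size / 1024:.1f} KB")
print(f".ort:  {ort_size / 1024:.1f} KB")
print(f"ratio: {ort_size / onnx_size:.1f}x")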

onnx==1.14.1
onnxconverter-common==1.14.0
onnxmltools==1.11.2
onnxruntime==1.16.0
skl2onnx==1.15.0

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

Execution Provider

Other / Unknown

@rstokes92 rstokes92 added the platform:web label Sep 25, 2023
@hariharans29
Member

hariharans29 commented Sep 25, 2023

CC: @skottmckay

Removing the platform:web tag as this is not really a Web issue. Not sure what the appropriate tag is (probably a new tools tag, since this is a question about the ONNX->ORT model conversion tool?).

@hariharans29 hariharans29 removed the platform:web label Sep 25, 2023
@skottmckay
Contributor

It's expected when the model has traditional ML operators, because we don't currently pack integers. For example, the ids in a TreeEnsembleClassifier node each take 64 bits in the ORT format model flatbuffer, but are packed into just the required bits when saved in an ONNX protobuf.

We've considered adding this, but there's never been a production use case that required it. Doing so would also mean you couldn't reference the data directly in the ORT format model flatbuffer, so it would potentially cost more memory at runtime.
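
To illustrate the packing difference described above (a minimal sketch, not ORT's actual serialization code): protobuf encodes integers as varints, using roughly one byte per 7 bits of payload, while a fixed-width int64 vector such as a flatbuffer's spends 8 bytes per element regardless of value.

def varint_size(value: int) -> int:
    # Bytes needed to encode a non-negative int as a protobuf varint:
    # 7 payload bits per byte, high bit used as a continuation flag.
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

# Node ids in a tree ensemble are typically small integers.
node_ids = list(range(1000))

varint_bytes = sum(varint_size(v) for v in node_ids)  # 1872 bytes (1-2 per id)
fixed_bytes = 8 * len(node_ids)                       # 8000 bytes (8 per id)
print(varint_bytes, fixed_bytes)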

@github-actions
Contributor

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

@github-actions github-actions bot added the stale label Oct 28, 2023