[Performance] Very slow load of ONNX model in Windows #22219
Labels: performance, platform:windows, stale
Describe the issue
I am trying to load XGBoost ONNX models with onnxruntime on a Windows machine.
The model file is 52 MB, yet loading it consumes about 1378.9 MB of RAM and takes roughly 15 minutes!
This slowness is observed only on Windows; on Linux the same models load in a few seconds, although memory consumption is high on Linux as well.
I tried the solution suggested in https://github.com/microsoft/onnxruntime/issues/3802#issuecomment-624464802, but I get this error:

```
AttributeError: 'onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions' object attribute 'graph_optimization_level' is read-only
```
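For context, the pattern suggested in that comment looks roughly like this (a sketch; the assignment below is what raises the read-only error in my environment):

```python
import onnxruntime as rt

so = rt.SessionOptions()
# Disable graph optimizations at load time, as the linked comment suggests;
# on my setup this assignment raises the AttributeError shown above.
so.graph_optimization_level = rt.GraphOptimizationLevel.ORT_DISABLE_ALL
sess = rt.InferenceSession(modelSav_path, sess_options=so, providers=["CPUExecutionProvider"])
```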
This is the simple code I used to load the model:

```python
import onnxruntime as rt

sess = rt.InferenceSession(modelSav_path, providers=["CPUExecutionProvider"])
```
To reproduce
Train an XGBoost classification model with the following params:
```python
# Classifier (imports added for completeness, following the skl2onnx XGBoost example)
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier
from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import calculate_linear_classifier_output_shapes
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost

update_registered_converter(
    XGBClassifier,
    "XGBoostXGBClassifier",
    calculate_linear_classifier_output_shapes,
    convert_xgboost,
    options={"nocl": [True, False], "zipmap": [True, False, "columns"]},
)

param = {'n_estimators': 3435, 'max_delta_step': 6, 'learning_rate': 0.030567232354470994, 'base_score': 0.700889637773676, 'scale_pos_weight': 0.29833333651319716, 'booster': 'gbtree', 'reg_lambda': 0.0005531812782988272, 'reg_alpha': 4.8213852607021606e-05, 'subsample': 0.9816268623744107, 'colsample_bytree': 0.3187040821569215, 'max_depth': 17, 'min_child_weight': 2, 'eta': 6.2582977222245746e-06, 'gamma': 2.2248460288603035e-07, 'grow_policy': 'depthwise'}

x_train.columns = range(x_train.shape[1])
x_test.columns = range(x_train.shape[1])
pipe = Pipeline([("xgb", MultiOutputClassifier(XGBClassifier(**param)))])
pipe.fit(x_train.to_numpy(), y_train)

model_onnx = convert_sklearn(
    pipe,
    "pipeline_xgboost",
    [("input", FloatTensorType([None, x_train.shape[1]]))],
    verbose=1,
    target_opset={"": 12, "ai.onnx.ml": 2},
)

with open("modelname.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())
```
Train an XGBoost regressor model with the following params:
```python
# Regressor (imports added for completeness)
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from xgboost import XGBRegressor
from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import calculate_linear_regressor_output_shapes
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost

update_registered_converter(
    XGBRegressor,
    "XGBoostXGBRegressor",
    calculate_linear_regressor_output_shapes,
    convert_xgboost,
)

param = {'n_estimators': 3435, 'max_delta_step': 6, 'learning_rate': 0.030567232354470994, 'base_score': 0.700889637773676, 'scale_pos_weight': 0.29833333651319716, 'booster': 'gbtree', 'reg_lambda': 0.0005531812782988272, 'reg_alpha': 4.8213852607021606e-05, 'subsample': 0.9816268623744107, 'colsample_bytree': 0.3187040821569215, 'max_depth': 17, 'min_child_weight': 2, 'eta': 6.2582977222245746e-06, 'gamma': 2.2248460288603035e-07, 'grow_policy': 'depthwise'}

x_train.columns = range(x_train.shape[1])
x_test.columns = range(x_train.shape[1])
pipe = Pipeline([("xgb", MultiOutputRegressor(XGBRegressor(**param)))])
pipe.fit(x_train.to_numpy(), y_train)

model_onnx = convert_sklearn(
    pipe,
    "pipeline_xgboost",
    [("input", FloatTensorType([None, x_train.shape[1]]))],
    verbose=1,
    target_opset={"": 12, "ai.onnx.ml": 2},
    options={type(pipe): {"zipmap": False}},
)

with open("modelname.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())
```
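For reference, a quick way to gauge how large the converted graph is (a diagnostic sketch; assumes the `onnx` Python package is installed):

```python
import onnx

# Inspect the converted model: the node count and operator types give a rough
# sense of how large the TreeEnsemble graph is before it is handed to onnxruntime.
model = onnx.load("modelname.onnx")
print("nodes:", len(model.graph.node))
print("op types:", {node.op_type for node in model.graph.node})
```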
Load the model with the following code:

```python
sess = rt.InferenceSession(modelSav_path, providers=["CPUExecutionProvider"])
```
And observe the load time and RAM usage.
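A minimal sketch of one way to capture those numbers (`psutil` is an assumption here; any process monitor will do):

```python
import time

import psutil
import onnxruntime as rt

proc = psutil.Process()
rss_before = proc.memory_info().rss

# Time only the session creation, which is where the slowdown occurs.
start = time.perf_counter()
sess = rt.InferenceSession(modelSav_path, providers=["CPUExecutionProvider"])
elapsed = time.perf_counter() - start

rss_after = proc.memory_info().rss
print(f"load time: {elapsed:.1f} s")
print(f"RSS delta: {(rss_after - rss_before) / 2**20:.1f} MB")
```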
Urgency
This is a release-critical issue: we can't ship these models with such slow loading. The models themselves perform well, but we are blocked by the load time. We also considered using other libraries to package the ML models, but we don't have the necessary compliance approvals for them, and we trust Microsoft.
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No