[Build] Handling Multiple ONNX Runtime Sessions Sequentially in Docker #19309
Labels: `build` (build issues; typically submitted using template), `core runtime` (issues related to core runtime), `stale` (issues that have not been addressed in a while; categorized by a bot)
Describe the issue
We have a Flask-based API for running computer vision models (YOLO and classifiers) using ONNX Runtime. The models, originally trained in PyTorch, are converted to ONNX format. In the local environment, the system performs well, allowing for the sequential loading and inference of different ONNX models. However, when deployed in Docker, we observe that only the first ONNX model loaded is available for inference, and additional inference sessions cannot be initiated concurrently.
The process flow involves:
1. Loading the YOLO model in an ONNX Runtime session and running detection on the incoming image.
2. Resizing the image/detected region and passing it to a classifier (e.g. the stone-type model) running in its own ONNX Runtime session.
We suspect this might be a resource allocation or session management issue within the Docker environment. The primary question is whether implementing multi-threading within the Docker container could resolve this, and if so, how to approach it.
The models are being loaded for inference using:
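(The code block itself did not survive the post. A minimal sketch of the kind of loading code described, assuming onnxruntime's Python API; the model paths and the `load_sessions` helper are hypothetical — only `sessionStoneType` and `input_name` appear in the traceback below.)

```python
# Hypothetical model paths -- the real ones were not shown in the issue.
MODEL_PATHS = {
    "yolo": "models/yolo.onnx",           # detector, 640x640 input
    "stonetype": "models/stonetype.onnx", # classifier, 224x224 input
}

def load_sessions(paths=MODEL_PATHS):
    """Create one independent InferenceSession per model.

    Each model gets its own session; multiple sessions can coexist in
    one process. Imported lazily here because this sketch requires
    onnxruntime and the model files to actually run.
    """
    import onnxruntime as ort
    return {
        name: ort.InferenceSession(path, providers=["CPUExecutionProvider"])
        for name, path in paths.items()
    }

# Usage inside the Flask app (run once at startup, not per request):
# sessions = load_sessions()
# sessionStoneType = sessions["stonetype"]
# input_name = sessionStoneType.get_inputs()[0].name
```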
Expected Behavior:
Each model (YOLO and the subsequent classifiers) should load and run independently in its own ONNX Runtime session within the Docker environment, as in the local setup.
Observed Behavior:
Only the first model (YOLO) loaded in ONNX Runtime is available for inference. Subsequent attempts to load additional models for inference within the same Docker session are unsuccessful.
Build script
Error / output
[2024-01-29 13:03:20,899] ERROR in app: Exception on /predict [POST]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1463, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 872, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 870, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 855, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
File "/usr/src/app/server.py", line 38, in predict
stonetype_result = stonetype_predict(resized_image)
File "/usr/src/app/services/stonetype_service/app/server.py", line 35, in predict
output = sessionStoneType.run(None, {input_name: image_tensor})
File "/usr/local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: images for the following indices
index: 2 Got: 224 Expected: 640
index: 3 Got: 224 Expected: 640
Please fix either the inputs or the model.
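(Note that the error above is a shape mismatch, not a session-creation failure: a 224×224 tensor reached a graph input that expects 640×640. The mismatch check ORT performs can be reproduced in pure Python; `find_dim_mismatches` is a hypothetical helper, and the shapes are taken from the error text.)

```python
def find_dim_mismatches(expected, got):
    """Compare a graph input's declared dims against an actual tensor shape.

    Symbolic/dynamic dims (None or strings such as 'batch') are skipped,
    mirroring how ONNX Runtime only reports fixed-dimension mismatches.
    Returns a list of (index, expected, got) tuples.
    """
    return [
        (i, e, g)
        for i, (e, g) in enumerate(zip(expected, got))
        if isinstance(e, int) and e != g
    ]

# Shapes from the error above: the graph expects 640x640, the tensor is 224x224.
expected = [1, 3, 640, 640]   # "Expected: 640" at indices 2 and 3
got = (1, 3, 224, 224)        # "Got: 224"

print(find_dim_mismatches(expected, got))  # → [(2, 640, 224), (3, 640, 224)]
```

A check like this against `session.get_inputs()[0].shape` before calling `run` makes it easy to confirm which model file each session was actually built from.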