Errors in grpc_server.cc while installing onnxruntime backend #6989
The ONNX Runtime backend is not included in the 24.02 release due to incompatibility issues (microsoft/onnxruntime#19419). However, the iGPU and Windows build assets do ship with the ONNX Runtime backend.
I have pulled the container nvcr.io/nvidia/tritonserver:24.01-py3 to use the 24.01 version. I followed the instructions to build the onnxruntime backend with CMake, using the following commands:
cd server
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.17.1 \
      -DTRITON_BUILD_CONTAINER_VERSION=24.01 ..
make install
But it gives this error:
/home/aniket/server/src/grpc_server.cc:43:10: fatal error: ../classification.h: No such file or directory
   43 | #include "../classification.h"
      |          ^~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[5]: *** [CMakeFiles/grpc-endpoint-library.dir/build.make:76: CMakeFiles/grpc-endpoint-library.dir/grpc_server.cc.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:440: CMakeFiles/grpc-endpoint-library.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [CMakeFiles/triton-server.dir/build.make:86: triton-server/src/triton-server-stamp/triton-server-build] Error 2
make[1]: *** [CMakeFiles/Makefile2:193: CMakeFiles/triton-server.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
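(An aside, not verified in this thread: the cmake flags used here appear to match the build instructions of the triton-inference-server/onnxruntime_backend repository, while the error above comes from compiling the server repository itself. A hedged sketch of building only the backend from its own repository, assuming the r24.01 release branch matches the 24.01 container:)

# Hedged sketch: build just the ONNX Runtime backend from its own repository
# (r24.01 is assumed from Triton's release-branch naming convention).
git clone -b r24.01 https://github.com/triton-inference-server/onnxruntime_backend.git
cd onnxruntime_backend && mkdir -p build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.17.1 \
      -DTRITON_BUILD_CONTAINER_VERSION=24.01 ..
make install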
@Aniket-20 Triton 24.01 comes with the onnx runtime backend. You can pull the 24.01 image, run a Docker container, and deploy onnx models without building anything. Here's a tutorial explaining how to deploy an onnx model: https://github.com/triton-inference-server/tutorials/blob/main/Quick_Deploy/ONNX/README.md
If you are modifying the onnx backend and want to build it with your changes, please provide more details on what changes you are trying to make and, if possible, the changes you made.
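(For reference, a minimal sketch of that no-build flow; the repository path is a placeholder and the model layout follows the linked tutorial:)

# Hedged sketch: serve an ONNX model from the stock 24.01 container.
# /path/to/model_repository is a placeholder for a local Triton model repository.
docker run --gpus=all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:24.01-py3 \
  tritonserver --model-repository=/models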
Thanks, I'll check it out.
What are the supported backend frameworks in NVIDIA Triton server? I am using this container: nvcr.io/nvidia/tritonserver:24.01-py3-igpu
Please look at this Support Matrix guide - https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html
***@***.***:/opt/tritonserver# tritonserver --model-repository=/mnt/models
I0319 17:48:18.709411 193 pinned_memory_manager.cc:275] Pinned memory pool is created at '0x7f0316000000' with size 268435456
I0319 17:48:18.709824 193 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 67108864
E0319 17:48:18.714876 193 model_repository_manager.cc:1325] Poll failed for model directory 'reports': Invalid model name: Could not determine backend for model 'reports' with no backend in model configuration. Expected model name of the form 'model.<backend_name>'.
I0319 17:48:18.714938 193 model_lifecycle.cc:461] loading: nllb:1
I0319 17:48:18.714972 193 model_lifecycle.cc:461] loading: densenet_onnx:1
I0319 17:48:18.719683 193 onnxruntime.cc:2610] TRITONBACKEND_Initialize: onnxruntime
I0319 17:48:18.719735 193 onnxruntime.cc:2620] Triton TRITONBACKEND API version: 1.17
I0319 17:48:18.719742 193 onnxruntime.cc:2626] 'onnxruntime' TRITONBACKEND API version: 1.17
I0319 17:48:18.719752 193 onnxruntime.cc:2656] backend configuration: {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0319 17:48:18.738488 193 onnxruntime.cc:2721] TRITONBACKEND_ModelInitialize: densenet_onnx (version 1)
I0319 17:48:18.738994 193 onnxruntime.cc:694] skipping model configuration auto-complete for 'densenet_onnx': inputs and outputs already specified
I0319 17:48:18.739550 193 onnxruntime.cc:2786] TRITONBACKEND_ModelInstanceInitialize: densenet_onnx_0 (GPU device 0)
I0319 17:48:18.998704 193 model_lifecycle.cc:827] successfully loaded 'densenet_onnx'
I0319 17:48:22.212668 193 python_be.cc:2362] TRITONBACKEND_ModelInstanceInitialize: nllb_0_0 (GPU device 0)
config.json: 100%|########################################| 846/846 [00:00<00:00, 4.33MB/s]
pytorch_model.bin: 100%|########################################| 2.46G/2.46G [07:52<00:00, 5.21MB/s]
generation_config.json: 100%|########################################| 189/189 [00:00<00:00, 339kB/s]
I0319 17:56:31.427227 193 model_lifecycle.cc:827] successfully loaded 'nllb'
I0319 17:56:31.430189 193 server.cc:606]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0319 17:56:31.430379 193 server.cc:633]
+-------------+------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| Backend     | Path                                                             | Config                                                                                                       |
+-------------+------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
| python      | /opt/tritonserver/backends/python/libtriton_python.so            | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so  | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+-------------+------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
I0319 17:56:31.430787 193 server.cc:676]
+---------------+---------+--------+
| Model | Version | Status |
+---------------+---------+--------+
| densenet_onnx | 1 | READY |
| nllb | 1 | READY |
+---------------+---------+--------+
I0319 17:56:31.506130 193 metrics.cc:877] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1650
I0319 17:56:31.508786 193 metrics.cc:770] Collecting CPU metrics
I0319 17:56:31.510629 193 tritonserver.cc:2498]
+----------------------------------+------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                      |
+----------------------------------+------------------------------------------------------------------------------------------------------------+
| server_id                        | triton |
| server_version                   | 2.42.0 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0]         | /mnt/models |
| model_control_mode               | MODE_NONE |
| strict_model_config              | 0 |
| rate_limit                       | OFF |
| pinned_memory_pool_byte_size     | 268435456 |
| cuda_memory_pool_byte_size{0}    | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness                 | 1 |
| exit_timeout                     | 30 |
| cache_enabled                    | 0 |
+----------------------------------+------------------------------------------------------------------------------------------------------------+
I0319 17:56:31.510680 193 server.cc:307] Waiting for in-flight requests to complete.
I0319 17:56:31.510693 193 server.cc:323] Timeout 30: Found 0 model versions that have in-flight inferences
I0319 17:56:31.510832 193 server.cc:338] All models are stopped, unloading models
I0319 17:56:31.510845 193 server.cc:345] Timeout 30: Found 2 live models and 0 in-flight non-inference requests
I0319 17:56:31.511844 193 onnxruntime.cc:2838] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0319 17:56:31.549081 193 onnxruntime.cc:2762] TRITONBACKEND_ModelFinalize: delete model state
I0319 17:56:31.549910 193 model_lifecycle.cc:612] successfully unloaded 'densenet_onnx' version 1
I0319 17:56:32.511349 193 server.cc:345] Timeout 29: Found 1 live models and 0 in-flight non-inference requests
W0319 17:56:32.513320 193 metrics.cc:631] Unable to get power limit for GPU 0. Status:Success, value:0.000000
I0319 17:56:33.511496 193 server.cc:345] Timeout 28: Found 1 live models and 0 in-flight non-inference requests
W0319 17:56:33.516014 193 metrics.cc:631] Unable to get power limit for GPU 0. Status:Success, value:0.000000
I0319 17:56:33.581047 193 model_lifecycle.cc:612] successfully unloaded 'nllb' version 1
I0319 17:56:34.511649 193 server.cc:345] Timeout 27: Found 0 live models and 0 in-flight non-inference requests
W0319 17:56:34.517163 193 metrics.cc:631] Unable to get power limit for GPU 0. Status:Success, value:0.000000
error: creating server: Internal - failed to load all models
As you can see, there is an error: tritonserver exited even though the models were READY. Please suggest some solutions for this.
It looks like an error with the model repository you are providing. Can you share the structure of your model repository as well as the config.pbtxt files?
You can follow this guide on how to structure your model repository and create config files: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html
You can also have a look at the backends section in this guide to adjust config.pbtxt for the backend you are using: https://github.com/triton-inference-server/backend/blob/main/README.md#backends
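(For context, the 'reports' failure in the log above usually means that model directory has no config.pbtxt naming a backend, and no model.<backend_name> file Triton could infer one from. A hedged sketch of the expected layout and a minimal config, assuming 'reports' is an ONNX model; tensor names, types, and shapes are placeholders that must match the actual model:)

# Hedged sketch: one model directory inside the repository passed to
# --model-repository, with a numbered version subdirectory and a config.pbtxt.
mkdir -p model_repository/reports/1            # put model.onnx inside 1/
cat > model_repository/reports/config.pbtxt <<'EOF'
name: "reports"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "INPUT__0"        # placeholder, must match the ONNX graph
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT__0"       # placeholder, must match the ONNX graph
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
EOF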
Actually, we moved the nllb model to a different repository and then loaded it in Triton server. Doing this solved our issue.
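(For completeness, a hedged alternative to running a separate server: tritonserver accepts --model-repository more than once, so a model moved to another directory can still be served by the same instance. The second path below is a placeholder:)

# Hedged sketch: --model-repository may be repeated to watch several directories.
tritonserver --model-repository=/mnt/models --model-repository=/mnt/nllb_models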
Is there any guide to concurrent inference in NVIDIA Triton server?
@Aniket-20, here's the guide: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/examples/jetson/concurrency_and_dynamic_batching/README.html I'll close this issue since it seems to have been resolved. Feel free to reach out with any other questions.
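(As a rough illustration of what that guide covers: concurrency is mostly configured per model through instance_group and dynamic_batching in config.pbtxt. A hedged sketch with placeholder values; the model name, instance count, and queue delay are assumptions to tune:)

# Hedged sketch: append concurrency settings to an existing model config
# (model name and numeric values below are placeholders).
cat >> model_repository/densenet_onnx/config.pbtxt <<'EOF'
instance_group [
  {
    count: 2          # run two execution instances of this model
    kind: KIND_GPU
  }
]
dynamic_batching {
  max_queue_delay_microseconds: 100
}
EOF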
/home/aniket/server/src/grpc_server.cc: In lambda function:
/home/aniket/server/src/grpc_server.cc:826:24: error: narrowing conversion of ‘(int)byte_size’ from ‘int’ to ‘google::protobuf::stringpiece_internal::StringPiece::size_type’ {aka ‘long unsigned int’} [-Werror=narrowing]
826 | {buffer, (int)byte_size}, response->mutable_config());
| ^~~~~~~~~~~~~~
/home/aniket/server/src/grpc_server.cc: In instantiation of ‘TRITONSERVER_Error* triton::server::{anonymous}::InferResponseCompleteCommon(TRITONSERVER_Server*, TRITONSERVER_InferenceResponse*, inference::ModelInferResponse&, const triton::server::{anonymous}::AllocPayload&) [with ResponseType = inference::ModelInferResponse]’:
/home/aniket/server/src/grpc_server.cc:3800:69: required from here
/home/aniket/server/src/grpc_server.cc:3353:5: error: enumeration value ‘TRITONSERVER_PARAMETER_DOUBLE’ not handled in switch [-Werror=switch]
3353 | switch (type) {
| ^~~~~~
/home/aniket/server/src/grpc_server.cc: In instantiation of ‘TRITONSERVER_Error* triton::server::{anonymous}::InferResponseCompleteCommon(TRITONSERVER_Server*, TRITONSERVER_InferenceResponse*, inference::ModelInferResponse&, const triton::server::{anonymous}::AllocPayload&) [with ResponseType = inference::ModelStreamInferResponse]’:
/home/aniket/server/src/grpc_server.cc:4400:77: required from here
/home/aniket/server/src/grpc_server.cc:3353:5: error: enumeration value ‘TRITONSERVER_PARAMETER_DOUBLE’ not handled in switch [-Werror=switch]
cc1plus: all warnings being treated as errors
make[5]: *** [CMakeFiles/grpc-endpoint-library.dir/build.make:76: CMakeFiles/grpc-endpoint-library.dir/grpc_server.cc.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:440: CMakeFiles/grpc-endpoint-library.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [CMakeFiles/triton-server.dir/build.make:86: triton-server/src/triton-server-stamp/triton-server-build] Error 2
make[1]: *** [CMakeFiles/Makefile2:193: CMakeFiles/triton-server.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
Triton Information
nvcr.io/nvidia/tritonserver:24.02-py3
Version: 24.02
Are you using the Triton container or did you build it yourself?
Using the Docker container.
Using CMake to install the onnxruntime backend:
$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.17.1 -DTRITON_BUILD_CONTAINER_VERSION=24.02 ..
$ make install
Kindly assist in identifying the root causes of these compilation errors and suggest appropriate solutions for a successful server build and execution.
Thank you for your attention to this matter. I'm happy to provide further details; please let me know if you have any questions.
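(An aside, not verified here: if the goal is a 24.02 server image that includes the ONNX Runtime backend, the documented route is build.py from the matching release branch rather than invoking cmake directly in server/build. A hedged, minimal sketch; the flag set is an assumption and may need adjusting:)

# Hedged sketch: containerized server build via build.py (flags are a minimal,
# assumed subset; r24.02 is assumed from Triton's release-branch naming).
git clone -b r24.02 https://github.com/triton-inference-server/server.git
cd server
python3 build.py -v --enable-gpu --endpoint=http --endpoint=grpc --backend=onnxruntime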