
[Build] Why does TensorRT EP need the full version of protobuf? #18040

Open
maekawatoshiki opened this issue Oct 20, 2023 · 10 comments
Labels
build (build issues; typically submitted using template), ep:CUDA (issues related to the CUDA execution provider), ep:TensorRT (issues related to the TensorRT execution provider)

Comments

@maekawatoshiki

Describe the issue

With TensorRT 8.6.1 and --enable_training, I encountered the same problem mentioned in #15131.

After digging into the build scripts, I found that this line is the culprit in my case.
By forcing onnxruntime_USE_FULL_PROTOBUF=OFF, I was able to build onnxruntime without errors.
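A minimal sketch of that workaround (hedged: the grep only locates the relevant line, and passing the variable through --cmake_extra_defines instead of editing the CMake file directly is an untested assumption, since the TensorRT code path may simply set it back to ON):

# Locate where the TensorRT path turns on full protobuf
grep -rn "onnxruntime_USE_FULL_PROTOBUF" cmake/

# Try overriding it from the build command instead of patching the CMake file
./build.sh --config Release --build_shared_lib --parallel \
  --use_cuda --cuda_home /path/to/cuda --cudnn_home /path/to/cuda \
  --use_tensorrt --tensorrt_home /path/to/TensorRT-8.6.1.6 \
  --enable_training \
  --cmake_extra_defines onnxruntime_USE_FULL_PROTOBUF=OFF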

Do different versions of TensorRT need the full protobuf? Even if so, I'd like to know the right way to build ORT with TensorRT and training enabled.

Urgency

Not urgent, but I'd at least like to know a better way to resolve this, if there is one.

Target platform

Linux

Build script

./build.sh \
  --config Release \
  --use_cache \
  --cmake_generator Ninja \
  --parallel \
  --build_wheel \
  --enable_pybind \
  --enable_training \
  --use_cuda \
    --cudnn_home /path/to/cuda/ \
    --cuda_home /path/to/cuda \
    --cuda_version=11.6 \
  --build_micro_benchmarks \
  --cmake_extra_defines CMAKE_EXPORT_COMPILE_COMMANDS=ON \
  --build_shared_lib \
  --skip_tests \
  --skip_submodule_sync \
  --enable_cuda_profiling \
  --use_tensorrt \
    --tensorrt_home /path/to/TensorRT-8.6.1.6

Error / output

Exactly the same as #15131

Visual Studio Version

No response

GCC / Compiler Version

gcc 9.4.0

maekawatoshiki added the build label on Oct 20, 2023
github-actions bot added the ep:CUDA and ep:TensorRT labels on Oct 20, 2023
maekawatoshiki changed the title from "[Build] Why does TensorRT need the full version of protobuf?" to "[Build] Why does TensorRT EP need the full version of protobuf?" on Oct 20, 2023
@jywu-msft
Member

It's because the TensorRT EP has a dependency on onnx-tensorrt, which doesn't support protobuf-lite.

@maekawatoshiki
Author

Is building ORT without the full protobuf expected to fail? In my environment the build finished successfully.

@jywu-msft
Member

jywu-msft commented Oct 20, 2023

Does it work at runtime?
I think it used to fail due to a dependency on an API that wasn't in protobuf-lite.
It's been a while; I will take a look at it again and get back to you.

@sl1pkn07

Hi

Commenting out that line builds ORT with TensorRT without problems. Now I need to test it at runtime (which may fix #15131).

Greetings

@maekawatoshiki
Author

@jywu-msft At least, ort.InferenceSession(classification_model, providers=["TensorrtExecutionProvider"]).run(...) runs without errors.
I should still run some unit tests for TensorRT, though.
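
For reference, the quick check was roughly the following (a sketch; the model path, the float32 input dtype, and replacing symbolic dimensions with 1 are placeholder assumptions, not details from this thread):

import numpy as np
import onnxruntime as ort

# Placeholder model path; any ONNX classification model works for this smoke test.
sess = ort.InferenceSession(
    "classification_model.onnx",
    providers=["TensorrtExecutionProvider"],
)

# Confirm the TensorRT EP was actually registered and not silently dropped.
print(sess.get_providers())

# Feed a dummy input matching the model's declared shape (symbolic dims -> 1).
inp = sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
x = np.random.rand(*shape).astype(np.float32)  # assumes a float32 input
print(sess.run(None, {inp.name: x})[0].shape)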

@github-actions

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label on Nov 20, 2023
@sl1pkn07

ping?

@jywu-msft
Member

FYI: #18413

@sl1pkn07

Oh, OK. Then I need to wait for the 1.16.3 release.

Greetings

github-actions bot removed the stale label on Nov 21, 2023
@sl1pkn07

Sadly, it seems tensorboard now needs the full protobuf:

FAILED: tensorboard/compat/proto/CMakeFiles/tensorboard.dir/config.pb.cc.o 
/opt/cuda/bin/g++ -DDNNL_OPENMP -DEIGEN_MPL2_ONLY -DENABLE_STRIDED_TENSORS -DENABLE_TRAINING -DENABLE_TRAINING_APIS -DENABLE_TRAINING_CORE -DENABLE_TRAINING_OPS -DORT_ENABLE_STREAM -D_GNU_SOURCE -I/tmp/makepkg/python-onnxruntime/src/onnxruntime/include/onnxruntime -I/tmp/makepkg/python-onnxruntime/src/onnxruntime/include/onnxruntime/core/session -I/tmp/makepkg/python-onnxruntime/src/onnxruntime/orttraining/orttraining/training_api/include -I/tmp/makepkg/python-onnxruntime/src/build/_deps/protobuf-src/src -I/tmp/makepkg/python-onnxruntime/src/build -march=native -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=0 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -g -ffile-prefix-map=/tmp/makepkg/python-onnxruntime/src=/usr/src/debug/python-onnxruntime -Wno-maybe-uninitialized -Wno-error=restrict -ffunction-sections -fdata-sections -DCPUINFO_SUPPORTED -g -std=gnu++17 -fPIC -MD -MT tensorboard/compat/proto/CMakeFiles/tensorboard.dir/config.pb.cc.o -MF tensorboard/compat/proto/CMakeFiles/tensorboard.dir/config.pb.cc.o.d -o tensorboard/compat/proto/CMakeFiles/tensorboard.dir/config.pb.cc.o -c /tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:698:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2fcluster_2eproto' has not been declared
  698 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2fcluster_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:699:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2fcost_5fgraph_2eproto' has not been declared
  699 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2fcost_5fgraph_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:700:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2fdebug_2eproto' has not been declared
  700 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2fdebug_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:701:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2fgraph_2eproto' has not been declared
  701 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2fgraph_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:702:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2frewriter_5fconfig_2eproto' has not been declared
  702 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2frewriter_5fconfig_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/makepkg/python-onnxruntime/src/build/tensorboard/compat/proto/config.pb.cc:703:6: error: '::descriptor_table_tensorboard_2fcompat_2fproto_2fstep_5fstats_2eproto' has not been declared
  703 |   &::descriptor_table_tensorboard_2fcompat_2fproto_2fstep_5fstats_2eproto,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:(
