onnxruntime build with --enable_training option fails using MSVC v143 due to ambiguous definition of byte #19269

Closed
mtavenrath opened this issue Jan 25, 2024 · 4 comments
Labels
platform:windows (issues related to the Windows platform)
training (issues related to ONNX Runtime training; typically submitted using template)

Comments

@mtavenrath
Contributor

mtavenrath commented Jan 25, 2024

Describe the issue

training_session_test_utils.cc fails to compile with the following error:

1>C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\um\objidlbase.h(2275,74): error C2872: 'byte': ambiguous symbol
1>(compiling source file '../../../orttraining/orttraining/test/session/training_session_test_utils.cc')
1>C:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\shared\rpcndr.h(191,23):
1>could be 'unsigned char byte'
1>C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.38.33130\include\cstddef(34,24):
1>or       'std::byte'

This is an STL issue described in microsoft/STL#204, which in turn references WinSDK issue https://microsoft.visualstudio.com/OS/_workitems/edit/11785508
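For readers unfamiliar with this class of error, here is a minimal sketch of how the ambiguity can arise, as I understand the linked STL issue. It does not include the real Windows headers: the global `byte` typedef stands in for the one in rpcndr.h, and the `byte_demo` namespace and both helper functions are hypothetical names used only for illustration. Where the using-directive actually comes from in the onnxruntime build is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for the Windows SDK typedef from rpcndr.h ("unsigned char byte").
// Declared here so the sketch compiles without the Windows headers.
typedef unsigned char byte;

namespace byte_demo {  // hypothetical namespace, for illustration only

// With a using-directive in effect, an unqualified `byte` can refer to
// both ::byte and std::byte, which is the ambiguity the compiler reports.
using namespace std;

// byte ambiguous_value = 0;  // would not compile: MSVC reports C2872 here

// Qualifying the name selects one candidate explicitly.
inline ::byte sdk_byte(unsigned value) { return static_cast<::byte>(value); }
inline std::byte std_byte(unsigned value) { return static_cast<std::byte>(value); }

static_assert(std::is_same<::byte, unsigned char>::value,
              "the SDK-style byte is a plain typedef");
static_assert(!std::is_same<::byte, std::byte>::value,
              "std::byte is a distinct enum class type");

}  // namespace byte_demo
```

The SDK header uses the unqualified name, so it cannot be fixed from user code by qualifying it; the sketch only shows why the two declarations collide.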

The suggested workaround is to define _HAS_STD_BYTE as 0 before any include. While this resolves the original error, it introduces a new one:

1>D:\git\onnxruntime\include\onnxruntime\core\common\span_utils.h(82,43): error C2039: 'byte': is not a member of 'std'
1>(compiling source file '../../../orttraining/orttraining/test/session/training_session_test_utils.cc')
1>C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.38.33130\include\cwctype(40,1):
1>see declaration of 'std'

The new error occurs because std::byte is no longer defined once the macro is set to 0. This makes it impossible to build the full training configuration with tests on Windows.
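As a rough sketch of how code could tolerate the workaround, a header might pick a byte type depending on whether std::byte is available. This is an assumption for illustration, not how onnxruntime's span_utils.h is written: the `byte_compat` namespace, the `byte_t` alias, and the `as_bytes` helper are all hypothetical names.

```cpp
// MSVC-specific switch from the STL workaround. It must appear before any
// standard header; other toolchains ignore it.
#define _HAS_STD_BYTE 0

#include <cassert>
#include <cstddef>

namespace byte_compat {  // hypothetical namespace, not part of onnxruntime

#if defined(_MSC_VER) && !_HAS_STD_BYTE
// With the workaround active the MSVC STL no longer declares std::byte,
// so fall back to a plain byte-sized type.
using byte_t = unsigned char;
#else
using byte_t = std::byte;
#endif

static_assert(sizeof(byte_t) == 1, "either choice is exactly one byte wide");

// View the start of an object as raw bytes without naming std::byte directly.
template <typename T>
const byte_t* as_bytes(const T& value) {
  return reinterpret_cast<const byte_t*>(&value);
}

}  // namespace byte_compat
```

The trade-off is that the alias changes type between configurations, so any public signature built on it would differ depending on the macro.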

To reproduce

I used the following command line to trigger this problem with Visual Studio 2022 17.8.5.

build.bat --config RelWithDebInfo --enable_nvtx_profile --enable_training --cuda_home=%CUDA_PATH% --cudnn_home=%CUDA_PATH% --use_cuda --enable_nvtx_profile --use_dml --parallel  --skip_tests --cmake_generator "Visual Studio 17 2022" --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=89 CMAKE_CUDA_FLAGS="-DDISABLE_CUSPARSE_DEPRECATED --disable-warnings" CMAKE_INSTALL_PREFIX=d:\git\onnxruntime\install

Urgency

No response

Platform

Windows

OS Version

22H2

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

5f3113e

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Other / Unknown

Execution Provider Library Version

No response

github-actions bot added the platform:windows label Jan 25, 2024
yf711 added the training label Jan 26, 2024
@snnn
Member

snnn commented Jan 27, 2024

Can you avoid enabling both DML and CUDA at the same time? I do not expect they would work together. And the training code can only use GPUs through CUDA or ROCM, not DML.

@mtavenrath
Contributor Author

I could avoid enabling DML and CUDA at the same time, yet that won't solve the STL/Windows SDK bug I'm currently encountering. The question is whether onnxruntime should try to work around the bug or push to get it fixed at its origin.

@snnn
Member

snnn commented Jan 29, 2024

We have a Windows Training CUDA pipeline that builds the code with "--enable_training" and the CUDA SDKs. It did not encounter the problem, so I'm curious how the error got triggered on your machine.

@mtavenrath
Contributor Author

Building with CUDA only, without DML, works on my machine as well. The issue is triggered when the DML backend is enabled together with "--enable_training".

snnn closed this as not planned Feb 15, 2024