Add numeric_limits for MLFloat16 and BFloat16 #22197

Merged

tianleiwu merged 5 commits into main from tlwu/fp16_bf16_limits

Sep 26, 2024

Contributor

tianleiwu commented Sep 24, 2024 (edited)

Description

Add std::numeric_limits for MLFloat16 and BFloat16.
Update some comments in the C# file OrtFloat16.shared.cs.
Add unit tests (including Clip)
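As a rough illustration of what specializing std::numeric_limits for a 16-bit float type can look like, here is a minimal sketch. `Half16` and `ToFloat` are hypothetical stand-ins invented for this example; they are not ORT's MLFloat16 or its actual implementation. The bit patterns are the standard IEEE 754 binary16 encodings.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a 16-bit IEEE 754 half type (not ORT's MLFloat16).
struct Half16 {
  std::uint16_t bits;
};

// Decode the raw bits to float so the limit values can be inspected.
inline float ToFloat(Half16 h) {
  const int sign = (h.bits >> 15) & 0x1;
  const int exponent = (h.bits >> 10) & 0x1F;
  const int mantissa = h.bits & 0x3FF;
  float magnitude;
  if (exponent == 0x1F) {
    magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                         : std::numeric_limits<float>::infinity();
  } else if (exponent == 0) {
    magnitude = std::ldexp(static_cast<float>(mantissa), -24);  // zero or subnormal
  } else {
    magnitude = std::ldexp(static_cast<float>(mantissa | 0x400), exponent - 25);
  }
  return sign ? -magnitude : magnitude;
}

namespace std {
template <>
class numeric_limits<Half16> {
 public:
  static constexpr bool is_specialized = true;
  static constexpr bool is_signed = true;
  static constexpr bool has_infinity = true;
  static constexpr bool has_quiet_NaN = true;
  static constexpr int digits = 11;  // 10 stored mantissa bits + 1 implicit bit

  static constexpr Half16 min() noexcept { return Half16{0x0400}; }        // 2^-14
  static constexpr Half16 max() noexcept { return Half16{0x7BFF}; }        // 65504
  static constexpr Half16 lowest() noexcept { return Half16{0xFBFF}; }     // -65504
  static constexpr Half16 epsilon() noexcept { return Half16{0x1400}; }    // 2^-10
  static constexpr Half16 infinity() noexcept { return Half16{0x7C00}; }
  static constexpr Half16 quiet_NaN() noexcept { return Half16{0x7E00}; }  // positive quiet NaN
};
}  // namespace std
```

With such a specialization in place, generic code that calls `std::numeric_limits<T>::lowest()` or `max()` gets meaningful bounds for the half type instead of the value-initialized defaults of the primary template.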

Note that the canonical NaN is not consistent between C++ and C#: C# uses a negative quiet NaN as its canonical NaN, while C++ uses a positive quiet NaN. The C# Float16.NaN value was chosen to be consistent with System.Half.NaN.

FP16 data returned from CUDA might use 0x7FFF as NaN, while FP16 data from the CPU provider might use 0x7E00. In any case, ORT currently has no consistent canonical NaN. Because all of these NaNs conform to the IEEE spec, this should not cause issues downstream.
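To make the point about differing NaN encodings concrete, the sketch below checks raw binary16 bit patterns against the IEEE 754 rule (all-ones exponent, non-zero mantissa). The helper names are hypothetical, and 0xFE00 is used only as an example of a negative quiet NaN pattern; the thread does not state the exact bits C# uses.

```cpp
#include <cassert>
#include <cstdint>

// IEEE 754 binary16: a value is NaN when the exponent bits are all ones
// (0x7C00) and the mantissa is non-zero. The sign bit does not matter.
constexpr bool IsHalfNaN(std::uint16_t bits) {
  return (bits & 0x7FFFu) > 0x7C00u;
}

// True when the sign bit is set.
constexpr bool IsHalfNegative(std::uint16_t bits) {
  return (bits & 0x8000u) != 0;
}

static_assert(IsHalfNaN(0x7E00), "positive quiet NaN");
static_assert(IsHalfNaN(0x7FFF), "NaN with all mantissa bits set");
static_assert(IsHalfNaN(0xFE00), "negative quiet NaN");
static_assert(!IsHalfNaN(0x7C00), "+infinity is not NaN");
static_assert(!IsHalfNaN(0xFC00), "-infinity is not NaN");
```

A consumer that tests for NaN with a check of this kind, rather than comparing against one specific bit pattern, treats all of these encodings the same way.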

Motivation and Context

std::numeric_limits is used in the codebase but is not specialized for MLFloat16 and BFloat16. This causes bugs such as #21957, which was introduced by #21493.
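The failure mode can be reproduced in isolation: the primary std::numeric_limits template compiles for any type, but its member functions return a value-initialized `T()`. The sketch below uses a hypothetical `UnspecializedHalf` type and a hypothetical `DefaultClipBounds` helper (neither is ORT code) to show how default bounds silently become zero.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical 16-bit float wrapper with NO std::numeric_limits specialization.
struct UnspecializedHalf {
  std::uint16_t bits = 0;
};

// Picks default clip bounds the way generic code might when min/max inputs
// are omitted. For an unspecialized T both bounds are value-initialized.
template <typename T>
struct DefaultClipBounds {
  T min = std::numeric_limits<T>::lowest();
  T max = std::numeric_limits<T>::max();
};
```

For `float` the bounds span the full finite range, while for `UnspecializedHalf` both bounds have all-zero bits, which matches the "defaults to min or max value 0.0" symptom named in the title of #21957.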

tianleiwu requested a review from yuslepukhin

September 24, 2024 08:22


          Add numeric_limits for MLFloat16 and BFloat16

718e05b

tianleiwu force-pushed the tlwu/fp16_bf16_limits branch from 13e2aa7 to 718e05b

September 24, 2024 08:26

snnn previously approved these changes

yuslepukhin reviewed

csharp/src/Microsoft.ML.OnnxRuntime/OrtFloat16.shared.cs (outdated, resolved)


          revert csharp NaN

4664ddc

tianleiwu dismissed snnn’s stale review via

4664ddc

September 24, 2024 22:01

tianleiwu requested review from yuslepukhin and snnn

September 24, 2024 22:16

Member

yuslepukhin commented Sep 25, 2024

Do we want to add a test for Clip, or is that separate?

yuslepukhin previously approved these changes

Member

yuslepukhin left a comment


          Add tests

87ba638

tianleiwu dismissed yuslepukhin’s stale review via

87ba638

September 25, 2024 19:56

tianleiwu requested a review from yuslepukhin

September 25, 2024 19:56


          update cuda clip

ef4177d

yuslepukhin previously approved these changes

Member

yuslepukhin left a comment


          test Clip-12

44ee680

tianleiwu dismissed yuslepukhin’s stale review via

44ee680

September 25, 2024 21:30

yuslepukhin approved these changes

Member

yuslepukhin left a comment

github-advanced-security bot found potential problems

onnxruntime/core/providers/cuda/math/clip.cc

               template <typename T>
               struct Clip::ComputeImpl {
                 void operator()(cudaStream_t stream, const Tensor* X, const Tensor* min, const Tensor* max, Tensor* Y) const {
-                  auto min_default = clip_internal::LowMax<T>::low();
-                  auto max_default = clip_internal::LowMax<T>::max();
+                  auto min_default = std::numeric_limits<T>::lowest();

Check warning

Code scanning / PREfast

The function 'std::numeric_limits::lowest' is constexpr, mark variable 'min_default' constexpr if compile-time evaluation is desired (con.5).
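For reference, the change this warning suggests amounts to declaring the local as constexpr. The sketch below is a standalone illustration with a hypothetical `DefaultClipMin` helper, not the code in clip.cc. It assumes `T` is a literal type whose `lowest()` is constexpr; the thread does not say whether that holds for every type the CUDA Clip kernel is instantiated with.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: the default lower bound, evaluated at compile time.
template <typename T>
constexpr T DefaultClipMin() {
  // constexpr forces compile-time evaluation, as the con.5 warning suggests.
  constexpr T min_default = std::numeric_limits<T>::lowest();
  return min_default;
}

static_assert(DefaultClipMin<std::int64_t>() ==
                  std::numeric_limits<std::int64_t>::lowest(),
              "evaluated at compile time");
```

Since the warning is advisory ("if compile-time evaluation is desired"), leaving the variable as plain `auto` is also valid.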

onnxruntime/core/providers/cuda/math/clip.cc

               template <typename T>
               struct Clip::ComputeImpl {
                 void operator()(cudaStream_t stream, const Tensor* X, const Tensor* min, const Tensor* max, Tensor* Y) const {
-                  auto min_default = clip_internal::LowMax<T>::low();
-                  auto max_default = clip_internal::LowMax<T>::max();
+                  auto min_default = std::numeric_limits<T>::lowest();

Check warning

Code scanning / PREfast

The function 'std::numeric_limits<__int64>::lowest' is constexpr, mark variable 'min_default' constexpr if compile-time evaluation is desired (con.5).

onnxruntime/core/providers/cuda/math/clip.cc

    
-                  auto min_default = clip_internal::LowMax<T>::low();
-                  auto max_default = clip_internal::LowMax<T>::max();
+                  auto min_default = std::numeric_limits<T>::lowest();
+                  auto max_default = std::numeric_limits<T>::max();

Check warning

Code scanning / PREfast

The function 'std::numeric_limits<__int64>::max' is constexpr, mark variable 'max_default' constexpr if compile-time evaluation is desired (con.5).

onnxruntime/core/providers/cuda/math/clip.cc

    
-                  auto min_default = clip_internal::LowMax<T>::low();
-                  auto max_default = clip_internal::LowMax<T>::max();
+                  auto min_default = std::numeric_limits<T>::lowest();
+                  auto max_default = std::numeric_limits<T>::max();

Check warning

Code scanning / PREfast

The function 'std::numeric_limits::max' is constexpr, mark variable 'max_default' constexpr if compile-time evaluation is desired (con.5).

tianleiwu merged commit 7880342 into main

85 checks passed

tianleiwu deleted the tlwu/fp16_bf16_limits branch

September 26, 2024 00:10

yihonglyu mentioned this pull request

1.19: Clip operator with type FLOAT16 defaults to min or max value 0.0 if not explicitly given, breaking many models using FLOAT16 #21957

Closed

