[Fix Bug] Fp8*Fp8 Run Error (#20911)
Fix the fp8*fp8 GEMM that fails at runtime when input A is e5m2 and input B is e4m3.

KnightYao authored Jul 5, 2024
1 parent 3f6b743 commit 9ef28f0
Showing 1 changed file with 2 additions and 2 deletions.
Showing 1 changed file: onnxruntime/contrib_ops/cuda/math/gemm_float8.cu (2 additions, 2 deletions)

```diff
--- a/onnxruntime/contrib_ops/cuda/math/gemm_float8.cu
+++ b/onnxruntime/contrib_ops/cuda/math/gemm_float8.cu
@@ -207,7 +207,7 @@ Status GemmFloat8::ComputeGemm(
 #endif
     case CUDA_R_8F_E4M3:
     case CUDA_R_8F_E5M2:
-      compute_type = CUBLAS_COMPUTE_32F_FAST_TF32;
+      compute_type = CUBLAS_COMPUTE_32F;
       break;
 #endif
     default:
@@ -219,7 +219,7 @@ Status GemmFloat8::ComputeGemm(
       compute_type = CUBLAS_COMPUTE_32F_FAST_16BF;
       break;
     case CUDA_R_32F:
-      compute_type = CUBLAS_COMPUTE_32F_FAST_TF32;
+      compute_type = CUBLAS_COMPUTE_32F;
       break;
     default:
       ORT_THROW("Unable to determine computeType in operator GemmFloat8.");
```
