Fail tests gracefully for the minimal cuda build #21391

Merged: 4 commits merged on Aug 3, 2024

poweiw (Contributor) commented Jul 17, 2024

Description

Several tests segfault when run against the minimal CUDA build. Test failures are expected because of the limitations of the minimal CUDA EP, but the tests should fail gracefully rather than crash the test binary.

Motivation and Context

To reproduce:

  1. Build ORT with:
     ./build.sh --build_shared_lib --use_full_protobuf --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ --tensorrt_home /TensorRT-10.0.1.6 --parallel --skip_tests --skip_submodule_sync --allow_running_as_root --use_tensorrt --cmake_extra_defines onnxruntime_CUDA_MINIMAL=1
  2. Run onnxruntime_test_all:
...
[----------] 1 test from AllocationPlannerTest
[ RUN      ] AllocationPlannerTest.ReusedInputCrossDifferentStreams
Segmentation fault (core dumped)
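
To tell a crash like the one above apart from a graceful test failure, one option is to look at how the test process exited. The sketch below is a hypothetical helper, not part of this PR. It assumes a POSIX system, where Python's `subprocess` reports a process killed by signal N as return code -N, and it assumes gtest's usual behaviour of exiting non-zero when assertions fail. The names `classify_exit` and `run_test` are made up for illustration.

```python
import signal
import subprocess


def classify_exit(returncode: int) -> str:
    """Classify a process exit status (POSIX convention used by subprocess)."""
    if returncode == 0:
        return "passed"
    if returncode < 0:
        # subprocess reports "killed by signal N" as -N on POSIX.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"crashed ({name})"
    # A positive exit code means the binary ran to completion and
    # reported failure itself, which is the graceful case.
    return "failed gracefully"


def run_test(binary: str, gtest_filter: str) -> str:
    """Run the tests selected by one gtest filter and classify the outcome."""
    proc = subprocess.run([binary, f"--gtest_filter={gtest_filter}"])
    return classify_exit(proc.returncode)
```

For example, `run_test("./onnxruntime_test_all", "AllocationPlannerTest.ReusedInputCrossDifferentStreams")` would be expected to report a crash before a fix of this kind and a graceful failure after it; the binary path here is an assumption about the build directory.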

poweiw (Contributor Author) commented Jul 17, 2024

@microsoft-github-policy-service agree company="nvidia"

poweiw marked this pull request as draft on July 17, 2024 18:35
poweiw marked this pull request as ready for review on July 17, 2024 18:45
poweiw changed the title from "Fail AllocationPlannerTest gracefully for the minimal cuda build" to "Fail tests gracefully for the minimal cuda build" on Jul 17, 2024

tianleiwu (Contributor) commented

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

tianleiwu (Contributor) commented

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

tianleiwu (Contributor) commented

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline

Azure Pipelines successfully started running 3 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

poweiw (Contributor Author) commented Jul 31, 2024

@chilo-ms @yf711 for review. Thanks!

tianleiwu (Contributor) commented

BTW, I did see many test failures for minimal cuda build:
[ FAILED ] 177 tests, listed below:
[ FAILED ] PlannerTest.LocationPlanningForPassThroughExplicitAndImplicitSubgraphInputs
[ FAILED ] PlannerTest.LocationPlanningForInitializersOnlyUsedInANestedSubgraph
[ FAILED ] PlannerTest.LocationPlanningForImplicitInputsWithoutExplicitConsumersInMainGraph
[ FAILED ] AllocationPlannerTest.ReusedInputCrossDifferentStreams
[ FAILED ] InferenceSessionTests.TestArenaShrinkageAfterRun
[ FAILED ] TransformerTest.MemcpyTransformerTest
[ FAILED ] TransformerTest.MemcpyTransformerTestCudaFirst
[ FAILED ] TransformerTest.TestInitializerDuplicationInSubgraph
[ FAILED ] TransformerTest.MemcpyTransformerTestGraphInputConsumedOnMultipleDevices
[ FAILED ] TransformerTest.MemcpyTransformerTestImplicitInputConsumedOnMultipleDevices
[ FAILED ] ActivationOpTest.Sigmoid_bfloat16
[ FAILED ] ActivationOpTest.Tanh_bfloat16
[ FAILED ] ActivationOpTest.Relu_bfloat16
[ FAILED ] AttentionTest.AttentionEmptyPastState_SharedPastPresent
[ FAILED ] AttentionTest.AttentionPastStateBatch1_SharedPastPresent
[ FAILED ] AttentionTest.AttentionPastStateBatch2_SharedPastPresent
[ FAILED ] AttentionTest.AttentionPastStateBatch2WithPadding_SharedPastPresent
[ FAILED ] AttentionTest.AttentionWithNeoXRotaryEmbedding
[ FAILED ] AttentionTest.Attention4DMask
[ FAILED ] BeamSearchTest.GptBeamSearchFp16
[ FAILED ] BeamSearchTest.GptBeamSearchWithInitDecoderFp16
[ FAILED ] BeamSearchTest.GptBeamSearchFp16_VocabPadded
[ FAILED ] BiasAddTest.BiasAddTest_HiddenSize_320
[ FAILED ] BiasAddTest.BiasAddTest_HiddenSize_640
[ FAILED ] BiasAddTest.BiasAddTest_HiddenSize_1280
[ FAILED ] BiasAddTest.BiasAddTest_HiddenSize_768
[ FAILED ] BiasAddTest.BiasAddTest_HiddenSize_1536
[ FAILED ] BiasSoftmaxTest.InnerBroadcast
[ FAILED ] BiasSoftmaxTest.InnerBroadcastNegativeAxis
[ FAILED ] BiasSoftmaxTest.InnerBroadcastEmptyBiasBatch
[ FAILED ] BiasSoftmaxTest.InnerBroadcastFullBiasBatch
[ FAILED ] BiasSoftmaxTest.OuterBroadcast
[ FAILED ] BiasSoftmaxTest.OuterBroadcastNegativeAxis
[ FAILED ] BiasSoftmaxTest.OuterBroadcastEmptyBiasBatch
[ FAILED ] BiasSoftmaxTest.OuterBroadcastFullBiasBatch
[ FAILED ] BiasSoftmaxTest.InnerBroadcastMediumBatch
[ FAILED ] BiasSoftmaxTest.OuterBroadcastMediumBatch
[ FAILED ] BiasSoftmaxTest.InnerBroadcastLargeBatch
[ FAILED ] BiasSoftmaxTest.OuterBroadcastLargeBatch
[ FAILED ] BiasSplitGeluTest.BiasSplitGeluTest_HiddenSize_2560
[ FAILED ] BiasSplitGeluTest.BiasSplitGeluTest_HiddenSize_5120
[ FAILED ] BiasSplitGeluTest.BiasSplitGeluTest_HiddenSize_10240
[ FAILED ] BiasSplitGeluTest.BiasSplitGeluTest_HiddenSize_6144
[ FAILED ] BiasSplitGeluTest.BiasSplitGeluTest_HiddenSize_12288
[ FAILED ] BitmaskDropoutTest.InferenceFloatType
[ FAILED ] BitmaskDropoutTest.InferenceDoubleType
[ FAILED ] BitmaskDropoutTest.InferenceHalfType
[ FAILED ] BitmaskDropoutTest.TrainingFloatType
[ FAILED ] BitmaskDropoutTest.TrainingDoubleType
[ FAILED ] BitmaskDropoutTest.TrainingHalfType
[ FAILED ] ContribOpTest.Rfft
[ FAILED ] ContribOpTest.Irfft
[ FAILED ] DecoderAttentionTest.SelfAttentionNoStateNoCache
[ FAILED ] DecoderAttentionTest.CrossAttentionNoStateNoCache
[ FAILED ] DecoderAttentionTest.SelfAttentionNoStateOutputCache
[ FAILED ] DecoderAttentionTest.CrossAttentionNoStateOutputCache
[ FAILED ] DecoderAttentionTest.SelfAttentionWithCache
[ FAILED ] DecoderAttentionTest.CrossAttentionWithCache
[ FAILED ] DecoderAttentionTest.SelfAttentionNoStateNoCachePaddingMask
[ FAILED ] DecoderMaskedSelfAttentionTest.Test_fp32
[ FAILED ] DecoderMaskedSelfAttentionTest.Test_fp16
[ FAILED ] DynamicTimeWarp.simple
[ FAILED ] MathOpTest.ComplexMul
[ FAILED ] MathOpTest.ComplexMulConj
[ FAILED ] MathOpTest.ComplexMul_fp16
[ FAILED ] MathOpTest.ComplexMulConj_fp16
[ FAILED ] MathOpTest.Add_float
[ FAILED ] MathOpTest.Add_Broadcast_Axis
[ FAILED ] MathOpTest.Add_Broadcast_MultidirectionalAB
[ FAILED ] MathOpTest.Add_Broadcast_MultidirectionalBA
[ FAILED ] MathOpTest.Sub
[ FAILED ] MathOpTest.Mul
[ FAILED ] MathOpTest.Div
[ FAILED ] MathOpTest.Sqrt_bfloat16
[ FAILED ] MathOpTest.MatMul_bfloat16
[ FAILED ] BiasGeluTest.BFloat16
[ FAILED ] FastGeluTest.FastGeluWithBias_BFloat16
[ FAILED ] FusedMatMulOpTest.BFloat16_NoTranspose
[ FAILED ] GemmaRotaryEmbeddingTest.GemmaRotaryEmbedding_Small
[ FAILED ] GreedySearchTest.GptGreedySearchFp16_VocabPadded
[ FAILED ] GroupNormTest.GroupNorm_128
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format0_NoGlobal
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format0_GlobalStart
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format0_UseCompactMemory
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format0_Float16
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format0_FullWindow
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format1_NoGlobal
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format1_GlobalStart
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format1_NoCompactMemory
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format1_Float16
[ FAILED ] LongformerAttentionTest.LongformerAttention_Format1_FullWindow
[ FAILED ] MatMulNBits.Float16
[ FAILED ] MoETest.MoETest_Gelu
[ FAILED ] MoETest.MoETest_Relu
[ FAILED ] MoETest.MoETest_Mixtral
[ FAILED ] MoETest.QMoETest_Mixtral_Int4
[ FAILED ] MultiHeadAttentionTest.CrossAttention_Batch2_HeadSize32_RightSidePadding_Mask1D
[ FAILED ] MultiHeadAttentionTest.CrossAttention_Batch2_HeadSize32_RightSidePadding_Mask2D
[ FAILED ] MultiHeadAttentionTest.CrossAttention_Batch1_HeadSize32_LeftSidePadding_Mask2D
[ FAILED ] MultiHeadAttentionTest.CrossAttention_Batch2_HeadSize32_NoBias_NoMask_PackedKV
[ FAILED ] MultiHeadAttentionTest.SelfAttention_Batch2_HeadSize32_NoBias_NoMask_PackedQKV
[ FAILED ] NhwcConvTest.Conv2D_2
[ FAILED ] NhwcConvTest.Conv2D_Bias_1
[ FAILED ] NhwcConvTest.Conv2D_AutoPad1
[ FAILED ] NhwcConvTest.Conv2D_AutoPad2
[ FAILED ] NhwcConvTest.Conv2D_asymmetric_padding1
[ FAILED ] NhwcConvTest.Conv2D_asymmetric_padding2
[ FAILED ] PackedAttentionTest.NoPack
[ FAILED ] PackedAttentionTest.NoPackWithRelativePositionBias
[ FAILED ] PackedAttentionTest.PackedWithRelativePositionBias
[ FAILED ] PackedAttentionTest.PackedBatch
[ FAILED ] PackedAttentionTest.PackedBatchWithQKV
[ FAILED ] PackedAttentionTest.TestWithRandomData
[ FAILED ] PackedAttentionTest.TestWithRandomDataWithRBP
[ FAILED ] PackedAttentionTest.TestWithRandomDataLargeSeq
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_NoPadding_NoBias_trt
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_NoPadding_NoBias_cutlass
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_NoPadding_NoBias_unfused
[ FAILED ] PackedMultiHeadAttentionTest.Q_K_V_NoPadding_NoBias_trt
[ FAILED ] PackedMultiHeadAttentionTest.Q_K_V_NoPadding_Bias_RelPosBias_cutlass
[ FAILED ] PackedMultiHeadAttentionTest.Q_K_V_NoPadding_Bias_RelPosBias_unfused
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_trt
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_cutlass
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_FlashAttention
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_unfused
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_RelPosBias
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_BroadcastRelPosBias_cutlass
[ FAILED ] PackedMultiHeadAttentionTest.PackedQKV_Padding_NoBias_BroadcastRelPosBias_unfused
[ FAILED ] QOrderedTest.Gelu_3x11x12
[ FAILED ] QOrderedTest.LayerNormalization_RowMajor
[ FAILED ] QOrderedTest.FP32_Quantize_COL
[ FAILED ] QOrderedTest.FP32_Quantize_ROW
[ FAILED ] QOrderedTest.FP32_Quantize_COL32
[ FAILED ] QOrderedTest.FP32_Quantize_COL4_4R2_8C
[ FAILED ] QOrderedTest.FP16_Quantize_COL
[ FAILED ] QOrderedTest.FP16_Quantize_ROW
[ FAILED ] QOrderedTest.FP16_Quantize_COL32
[ FAILED ] QOrderedTest.FP16_Quantize_COL4_4R2_8C
[ FAILED ] QOrderedTest.FP32_Dequantize_COL32
[ FAILED ] QOrderedTest.FP32_Dequantize_ROW
[ FAILED ] QOrderedTest.FP16_Dequantize_COL32
[ FAILED ] QOrderedTest.FP16_Dequantize_ROW
[ FAILED ] QAttentionTest.QAttentionBatch1
[ FAILED ] QAttentionTest.QAttentionBatch1_Float16
[ FAILED ] QAttentionTest.QAttentionBatch2
[ FAILED ] QAttentionTest.QAttentionMaskPartialSequence
[ FAILED ] QAttentionTest.QAttentionMaskExceedSequence
[ FAILED ] QAttentionTest.QAttentionNoMaskIndex
[ FAILED ] QAttentionTest.QAttentionUnidirectional_CUDA
[ FAILED ] QAttentionTest.QAttentionPrunedModel
[ FAILED ] RelativePositionBiasTest.RelativePositionBiasTest_FP32
[ FAILED ] RelativePositionBiasTest.RelativePositionBiasTest_FP16
[ FAILED ] RelativePositionBiasTest.RelativePositionBiasTest2_FP16
[ FAILED ] RelativePositionBiasTest.RelativePositionBiasTest_FP16_No_Bidirectional
[ FAILED ] GatedRelativePositionBiasTest.FP16_BSNHD_1x3x2x4x8
[ FAILED ] GatedRelativePositionBiasTest.FP32_BSNHD_2x3x2x4x8
[ FAILED ] GatedRelativePositionBiasTest.FP32_LongSeq_BSNHD_2x5x2x4x4
[ FAILED ] GatedRelativePositionBiasTest.FP16_BSNHD_2x8x2x4x8_NoPadding
[ FAILED ] RemovePaddingTest.RemovePaddingBatch1_NoPadding
[ FAILED ] RemovePaddingTest.RemovePaddingBatch3_TwoWithPadding
[ FAILED ] RemovePaddingTest.RemovePaddingBatch3_AllWithPadding
[ FAILED ] RestorePaddingTest.RestorePaddingBatch1_NoPadding
[ FAILED ] RestorePaddingTest.RestorePaddingBatch3_TwoWithPadding
[ FAILED ] RestorePaddingTest.RestorePaddingBatch3_AllWithPadding
[ FAILED ] SamplingTest.Gpt2Sampling_GPU
[ FAILED ] SkipGroupNormTest.SkipGroupNorm_with_bias
[ FAILED ] SkipGroupNormTest.SkipGroupNorm_no_bias_broadcast_skip
[ FAILED ] UnfoldTensorOpTest.LastDim
[ FAILED ] UnfoldTensorOpTest.NormalDim
[ FAILED ] ModOpTest.Fmod_bfloat16_mixed_sign
[ FAILED ] GemmOpTest.GemmNoTrans_bfloat16
[ FAILED ] SoftmaxOperator.Simple_bfloat16
[ FAILED ] ReductionOpTest.ReduceSum_bfloat16
[ FAILED ] ReductionOpTest.ReduceSumBFloat16_2
[ FAILED ] GatherOpTest.Gather_invalid_index_gpu
[ FAILED ] IsInfTest.test_isinf_bfloat16
[ FAILED ] IsInfTest.test_Float8E4M3FN
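
A list this long is easier to read when grouped by test suite. The following is a minimal sketch, assuming the log has been saved as text and that each failure line has the plain `Suite.TestName` form shown above. Parameterized test names containing `/` and lines with trailing timing information are deliberately not matched. The function name `failures_by_suite` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "[ FAILED ] MathOpTest.Add_float"; tolerant of the
# extra padding gtest normally prints inside the brackets. The summary
# header ("[ FAILED ] 177 tests, listed below:") has no "Suite.Name" token
# and is therefore skipped.
FAILED_LINE = re.compile(r"\[\s*FAILED\s*\]\s+(\w+)\.(\w+)\s*$")


def failures_by_suite(log_text: str) -> Counter:
    """Count failed tests per suite in a gtest failure summary."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `failures_by_suite(log).most_common()` on the saved log lists the suites with the most failures first, which makes clusters such as the attention and bfloat16 tests easier to spot.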

tianleiwu merged commit 2653226 into microsoft:main on Aug 3, 2024 (87 checks passed)

poweiw (Contributor Author) commented Aug 5, 2024

> BTW, I did see many test failures for minimal cuda build

Yeah, these are expected: even though CUDA EP testing is disabled when USE_TENSORRT is set, the CUDA EP still seems to be checked in the TRT EP tests. As discussed here, maintaining all of them as negative tests would take a disproportionate amount of effort.
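
For local runs where the expected failures are just noise, one workaround is to exclude the affected suites with a negative `--gtest_filter` instead of converting them to negative tests. This is not something the PR does. The sketch below is a hypothetical helper that builds such a filter from a list of failing test names, relying on gtest's filter syntax in which patterns after `-` are exclusions separated by `:`.

```python
from typing import Iterable


def build_negative_filter(expected_failures: Iterable[str]) -> str:
    """Build a --gtest_filter value that skips every suite with an expected failure."""
    # Keep only the suite part of each "Suite.TestName" entry, de-duplicated.
    suites = sorted({name.split(".", 1)[0] for name in expected_failures})
    if not suites:
        return "*"  # nothing to exclude: run everything
    # gtest syntax: patterns after '-' are exclusions, separated by ':'.
    return "-" + ":".join(f"{suite}.*" for suite in suites)
```

Note that this excludes whole suites, so tests in those suites that would pass on the minimal build are skipped as well. Passing full test names instead of `Suite.*` patterns would be more precise at the cost of a much longer filter.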

poweiw deleted the dev/poweiw-ort-cuda-minimal branch on November 6, 2024 01:59