Suppress MLIR ubsan errors #3198

BrianHarrisonAMD · 2024-08-13T21:32:37Z

Suppress UBSan errors reported in shared_ptr_base.h to work around failures that appear after upgrading the base docker to ROCm 6.2.

See relevant PR #3181, and Issue #3192 for additional details.

The issue can be tracked down to simply creating the MLIR handle after upgrading to the latest ROCm 6.2.
Since we froze MLIR to an older release, suppressing this warning is our only real option.
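
For readers unfamiliar with how a UBSan report can be suppressed for one file, here is a minimal sketch of the runtime-suppression mechanism that Clang's UBSan documents: a suppressions file with `check:pattern` lines, passed through `UBSAN_OPTIONS=suppressions=<path>`. This thread does not say whether the PR used this mechanism or a compile-time ignorelist, so the entry `vptr:shared_ptr_base.h`, the file name `ubsan.supp`, and both helper functions are assumptions for illustration only.

```python
import os
import tempfile


def write_ubsan_suppressions(path, entries):
    """Write one 'check:pattern' line per (check, pattern) pair."""
    with open(path, "w") as f:
        for check, pattern in entries:
            f.write(f"{check}:{pattern}\n")


def ubsan_env(supp_path, base_env=None):
    """Return a copy of the environment with the suppressions file
    appended to UBSAN_OPTIONS (sanitizer options are ':'-separated)."""
    env = dict(os.environ if base_env is None else base_env)
    option = f"suppressions={supp_path}"
    existing = env.get("UBSAN_OPTIONS", "")
    env["UBSAN_OPTIONS"] = f"{existing}:{option}" if existing else option
    return env


if __name__ == "__main__":
    # Hypothetical file name and entry; not taken from the PR diff.
    supp = os.path.join(tempfile.mkdtemp(), "ubsan.supp")
    write_ubsan_suppressions(supp, [("vptr", "shared_ptr_base.h")])
    print(ubsan_env(supp, {})["UBSAN_OPTIONS"])
```

The returned dictionary could be passed as `env=` to `subprocess.run` when launching a sanitized test binary. Note that runtime suppressions only apply to recoverable checks, and that an entry like this silences the `vptr` check for every report located in that header, not only those triggered through MLIR.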

Note:

To reproduce, construct the MLIR handle with these params: "--x2 1 --operation conv2d --kernel_id 0 --num_cu 104 --arch amdgcn-amd-amdhsa:gfx90a:sramecc+:xnack- --groupsize 1 --fil_layout GNCHW --fil_type fp32 --in_layout NGCHW --out_layout NGCHW --in_type fp32 --out_type fp32 --batchsize 100 --in_channels 25 --out_channels 300 --in_h 32 --in_w 32 --out_h 30 --out_w 30 --fil_h 3 --fil_w 3 --dilation_h 1 --dilation_w 1 --conv_stride_h 1 --conv_stride_w 1 --padding_h 0 --padding_w 0 --kernel_name mlir_gen_igemm_conv2d_v4r4_fwd_xdlops0". After upgrading to ROCm 6.2, simply creating a handle with those command-line arguments triggers the vptr issue when the test application exits.
ROCm 6.2 also updated the version of clang.
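
The argument string above is hard to read at a glance. The following sketch (not part of MIOpen or MLIR; `parse_flags` and `conv_out_size` are hypothetical helpers) splits it into key/value pairs and checks that the stated output size agrees with the standard convolution output-size formula.

```python
import shlex

# Repro arguments copied verbatim from the note above.
ARGS = (
    "--x2 1 --operation conv2d --kernel_id 0 --num_cu 104 "
    "--arch amdgcn-amd-amdhsa:gfx90a:sramecc+:xnack- --groupsize 1 "
    "--fil_layout GNCHW --fil_type fp32 --in_layout NGCHW --out_layout NGCHW "
    "--in_type fp32 --out_type fp32 --batchsize 100 --in_channels 25 "
    "--out_channels 300 --in_h 32 --in_w 32 --out_h 30 --out_w 30 "
    "--fil_h 3 --fil_w 3 --dilation_h 1 --dilation_w 1 "
    "--conv_stride_h 1 --conv_stride_w 1 --padding_h 0 --padding_w 0 "
    "--kernel_name mlir_gen_igemm_conv2d_v4r4_fwd_xdlops0"
)


def parse_flags(arg_string):
    """Turn a string of '--key value' pairs into a dict of strings."""
    tokens = shlex.split(arg_string)
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of tokens")
    flags = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        if not key.startswith("--"):
            raise ValueError(f"not a flag: {key!r}")
        flags[key[2:]] = value
    return flags


def conv_out_size(in_size, fil, pad, stride, dilation):
    """Standard output-size formula for a strided, dilated convolution."""
    return (in_size + 2 * pad - dilation * (fil - 1) - 1) // stride + 1


if __name__ == "__main__":
    f = parse_flags(ARGS)
    out_h = conv_out_size(int(f["in_h"]), int(f["fil_h"]), int(f["padding_h"]),
                          int(f["conv_stride_h"]), int(f["dilation_h"]))
    print(f["operation"], f["arch"], out_h)
```

With a 32x32 input, a 3x3 filter, no padding, and unit stride and dilation, the formula gives 30, which matches the `--out_h 30 --out_w 30` values in the repro string. In other words, this is an ordinary forward convolution configuration, which is consistent with the report that merely creating the handle is enough to trigger the error.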

junliume

Let's keep track of this WA and minimize the fix later:

shared_ptr_base.h

atamazov · 2024-08-14T18:50:37Z

@junliume maybe we can simply disable MLIR in the failing stage as a W/A?

BrianHarrisonAMD · 2024-08-14T19:13:02Z

I think that could also be a reasonable workaround, but I'm not sure which way we prefer.
Neither is ideal, hehe.

amberhassaan · 2024-08-14T20:24:32Z

I slightly prefer @atamazov's suggestion. With the current change we're saying all bets are off for shared_ptr_base.h, which could let certain memory problems related to shared_ptr sneak through. Disabling MLIR means that we aren't sure of MLIR's safety, but that risk is contained to MLIR as opposed to every use of std::shared_ptr.

BrianHarrisonAMD · 2024-08-14T21:50:50Z

I can update this to instead turn off MLIR for the debug steps.

@junliume that good with you?

DrizztDoUrden · 2024-08-15T01:00:49Z

I think disabling MLIR for the sanitized steps, as Artem suggested, is less harmful: MLIR is at least somewhat tested anyway, and we really would not want to miss shared_ptr misuse in the library if it occurred.

junliume · 2024-08-15T05:51:28Z

> I can update this to instead turn off MLIR for the debug steps.
>
> @junliume that good with you?

@BrianHarrisonAMD yes let's do it by only disabling checks when MLIR is in use. Thanks!

BrianHarrisonAMD · 2024-08-15T13:35:23Z

Sounds good!

I'll close this and open a new PR that disables MLIR for the sanitize steps.

Thanks for the feedback everyone!

BrianHarrisonAMD added 4 commits August 13, 2024 16:37

Suppress ubsan errors due to calling mlir

d36fb23

Fix jekins $ escaping

c9fdcbd

Fix pipeline build

8eb475c

Add missing nightly file

bf02f56

BrianHarrisonAMD requested review from JehandadKhan and junliume as code owners August 13, 2024 21:32

BrianHarrisonAMD self-assigned this Aug 13, 2024

BrianHarrisonAMD mentioned this pull request Aug 13, 2024

[CI][Failure] shared_ptr_base.h:199:9: runtime error: member call on address which does not point to an object of type 'std::_Sp_counted_base<>' #3192

Open

BrianHarrisonAMD requested review from amberhassaan, DrizztDoUrden and cderb August 13, 2024 21:41

BrianHarrisonAMD added non-miopen-bug urgency_blocker complexity_low workaround labels Aug 13, 2024

junliume added the TESTING_CI_PASSED label Aug 14, 2024

junliume approved these changes Aug 14, 2024

BrianHarrisonAMD closed this Aug 15, 2024

BrianHarrisonAMD deleted the bharriso/suppress-mlir-ubsan-errors branch August 15, 2024 14:50

BrianHarrisonAMD mentioned this pull request Aug 15, 2024

[WA] Disable MLIR when building debug to workaround sanitizer issue #3200

Merged

amberhassaan mentioned this pull request Aug 15, 2024

Update docker ROCm version for CI #3181

Merged
