Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constructing a 67108864 TiB array succeeds incorrectly, corrupting memory #54328

Open
LilithHafner opened this issue May 2, 2024 · 17 comments
Open
Labels
arrays [a, r, r, a, y, s] bug Indicates an unexpected problem or unintended behavior

Comments

@LilithHafner
Copy link
Member

julia> x = Array{UInt128}(undef, 2^3, 2^59)
8×576460752303423488 Matrix{UInt128}:
 0x0000ffff1c9ec7b00000000000000000  0x0000ffff27943b900000000000000000    0x0000ffff1fed77c00000000000000003  0x0000ffff1fed58d00000000000000019
 0x0000ffff0e46cc000000ffff2780f340  0x0000ffff0e7426400000ffff0e742650     0x0000ffff2690b1700000ffff27e00038  0x0000ffff273a49900000ffff273a49a0
 0x0000ffff1fed50a00000ffff268f82d0  0x0000ffff2791b3500000000000000000     0x0000ffff1fed60a00000ffff269bb260  0x0000ffff1fed59d00000000000000019
 0x0000ffff268194a80000000000000000  0x0000ffff2690b3100000ffff27f147c8     0x0000ffff2780f3100000ffff2780f320  0x0000ffff277590100000ffff27759020
 0x0000ffff1fed59500000000000000000  0x0000ffff1c47247000003030313a3120     0x0000ffff1fed77c00000000000000003  0x0000ffff1fed59d00000000000000004
 0x0000ffff1c31a0500000ffff1fee6170  0x0000ffff27f147c80000ffff1deb7d90    0x0000ffff2690b1b00000ffff27e00508  0x0000ffff277590500000ffff27759060
 0x0000ffff1c9ec7b00000000000000000  0x0000ffff1c59c3000000ffff2690b310     0x0000ffff1fed60a00000000000000000  0x0000ffff1ccde9f00000000000000004
 0x0000ffff0e46cc000000ffff2780f340  0x0000ffff2690b3500000ffff27dec3d0     0x0000ffff273ae1500000ffff273ae160  0x0000ffff2690b2804000000000000000

julia> x[1] = x[end] = 1
1

julia> rand(x)
0x00000000000000000000000000000001

julia> rand(x, 10)
10-element Vector{UInt128}:
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001
 0x00000000000000000000000000000001

julia> length(x
[13983] signal 11 (1): Segmentation fault
in expression starting at REPL[5]:1
gc_try_claim_and_push at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2118
gc_mark_outrefs at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2838 [inlined]
gc_mark_loop_serial_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2963
gc_mark_loop_serial at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:2986
gc_mark_loop at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3163 [inlined]
_jl_gc_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3552
ijl_gc_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3931
maybe_collect at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:922 [inlined]
jl_gc_pool_alloc_inner at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:1319
jl_gc_pool_alloc_noinline at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:1386 [inlined]
jl_gc_alloc_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia_internal.h:507 [inlined]
jl_gc_alloc at /cache/build/builder-armageddon-1/julialang/julia-master/src/gc.c:3984
ijl_alloc_svec_uninit at /cache/build/builder-armageddon-1/julialang/julia-master/src/simplevector.c:63
inst_datatype_inner at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:2088
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1357 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
inst_datatype_env at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1361 [inlined]
ijl_apply_type at /cache/build/builder-armageddon-1/julialang/julia-master/src/jltypes.c:1377
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3943
intersect_sub_datatype at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3558 [inlined]
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3962
intersect_sub_datatype at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3558 [inlined]
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3962
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3156
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3247
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3896
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3160
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3247
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3870
intersect_tuple at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3468
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3910
intersect_unionall_ at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3160
intersect_unionall at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3270
intersect at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:3893
intersect_all at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4118
intersect_types at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4177
ijl_has_empty_intersection at /cache/build/builder-armageddon-1/julialang/julia-master/src/subtype.c:4188
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3483
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3503
sort_mlmatches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3503
ml_matches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3911
ml_matches at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:3687 [inlined]
ijl_matching_methods at /cache/build/builder-armageddon-1/julialang/julia-master/src/gf.c:2309
_methods_by_ftype at ./reflection.jl:1196
complete_methods! at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:856
complete_keyword_argument at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:1016
completions at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPLCompletions.jl:1327
complete_line at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:655
jfptr_complete_line_9919 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
check_for_hint at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:383
#139 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2519
jl_apply at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia.h:2188 [inlined]
jl_f__call_latest at /cache/build/builder-armageddon-1/julialang/julia-master/src/builtins.c:875
#invokelatest#2 at ./essentials.jl:1032 [inlined]
invokelatest at ./essentials.jl:1029 [inlined]
#27 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:1703
jfptr_YY.27_10520 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
prompt! at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2840
run_interface at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2742
jfptr_run_interface_11025 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
run_frontend at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:1446
#75 at /cache/build/builder-armageddon-1/julialang/julia-master/usr/share/julia/stdlib/v1.12/REPL/src/REPL.jl:496
)jfptr_YY.75_12396 at /home/x/.julia/juliaup/julia-nightly/share/julia/compiled/v1.12/REPL/u0gqU_1rXQa.so (unknown line)
jl_apply at /cache/build/builder-armageddon-1/julialang/julia-master/src/julia.h:2188 [inlined]
start_task at /cache/build/builder-armageddon-1/julialang/julia-master/src/task.c:1202
Allocations: 1516744 (Pool: 1516567; Big: 177); GC: 1
Segmentation fault (core dumped)

Related, constructing an array of size 2^61 or 2^62 "works", but not 2^61-1 or 2^62-1:

julia> Vector{Int}(undef, 2^61);

julia> Vector{Int}(undef, 2^62);

julia> Vector{Int}(undef, 2^61-1);
ERROR: ArgumentError: invalid GenericMemory size: too large for system address width
Stacktrace:
 [1] GenericMemory
   @ ./boot.jl:531 [inlined]
 [2] Vector{Int64}(::UndefInitializer, m::Int64)
   @ Core ./boot.jl:596
 [3] top-level scope
   @ REPL[12]:1

julia> Vector{Int}(undef, 2^62-1);
ERROR: ArgumentError: invalid GenericMemory size: too large for system address width
Stacktrace:
 [1] GenericMemory
   @ ./boot.jl:531 [inlined]
 [2] Vector{Int64}(::UndefInitializer, m::Int64)
   @ Core ./boot.jl:596
 [3] top-level scope
   @ REPL[13]:1

I'm guessing all this is due to unchecked overflow when multiplying by sizeof(eltype).

Much if this behavior is also present on 1.10.

CC @oscardssmith who's been working on this recently.

All examples above are from nightly.

@LilithHafner LilithHafner added the bug Indicates an unexpected problem or unintended behavior label May 2, 2024
@oscardssmith
Copy link
Member

That's suboptimal. We probably need one more checked mul on the C side.

@oscardssmith
Copy link
Member

oscardssmith commented May 2, 2024

hmm, we already are doing the multiplication with wide arithmetic:

wideint_t prod = (wideint_t)nel * elsz;
I'm very unsure why this check isn't failing. Is wideint_t somehow being set to uint64?

@KristofferC
Copy link
Member

What does "UB" mean here (the UB bug really bit people...). This is just a bug.

@oscardssmith oscardssmith changed the title Constructing a 67108864 TiB array succeeds but is UB. Constructing a 67108864 TiB array succeeds incorrectly, corrupting memory May 2, 2024
@KlausC
Copy link
Contributor

KlausC commented May 2, 2024

That is how wideint_t is defined (it depends on the architecture and C-compiler in use)

julia/src/array.c

Lines 19 to 23 in e637be1

#if defined(_P64) && defined(UINT128MAX)
typedef __uint128_t wideint_t;
#else
typedef uint64_t wideint_t;
#endif

@vtjnash
Copy link
Member

vtjnash commented May 2, 2024

UINT128MAX seems to never exist. What is that supposed to be?

@KlausC
Copy link
Contributor

KlausC commented May 2, 2024

It is possible to throw if ((size_t)(-1) >> 1 - 1) / max(elsz + (isunion != 0), 1) + 1 > nel or similar in array.c and 2 spots in genericmemory.c.
Then size_t prod; prod = (elsz + (isunion != 0)) * nel, without need for wide_t and MAXINTVAL.

@vtjnash
Copy link
Member

vtjnash commented May 2, 2024

Ideally we don't need division there, as that is much slower than checking for multiplication overflow

@giordano
Copy link
Contributor

giordano commented May 2, 2024

UINT128MAX seems to never exist. What is that supposed to be?

It was introduced by e9be444 (#5022), but it doesn't seem to be defined/used anywhere else in that revision.

@vtjnash
Copy link
Member

vtjnash commented May 2, 2024

Apparently there is no standards-defined way to avoid this UB in C for multiply. But most compilers have extensions (specifically __builtin_mul_overflow) to work around that.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented May 2, 2024

Could we move the checking into Julia? Seems we can do this more reliably than C.

@oscardssmith
Copy link
Member

Moving the checks to Julia would be a bit dangerous in that anyoen that skips doing the checks could run into weird behavior, but if we give it a bad enough name, it's plausible that people won't try to use it...

@LilithHafner
Copy link
Member Author

What does "UB" mean here

I mean that Array{UInt128}(undef, 2^3, 2^59)[10] semantically invokes LLVM's undefined behavior.

@nsajko nsajko added the correctness bug ⚠ Bugs that are likely to lead to incorrect results in user code without throwing label May 3, 2024
@LilithHafner
Copy link
Member Author

I think it is unlikely for this to result in incorrect results in user code. It is much more likely to result in crashes/segfaults, and it is also unlikely for a user to allocate an array of this size in the first place.

@LilithHafner LilithHafner removed the correctness bug ⚠ Bugs that are likely to lead to incorrect results in user code without throwing label May 3, 2024
@oscardssmith
Copy link
Member

oscardssmith commented Oct 13, 2024

pretty sure #55913 (at least mostly) fixes this by the way

@nsajko nsajko added the arrays [a, r, r, a, y, s] label Oct 13, 2024
@vtjnash vtjnash closed this as completed Oct 14, 2024
@oscardssmith
Copy link
Member

This isn't fixed as #55913 isn't merged.

@oscardssmith oscardssmith reopened this Oct 14, 2024
@inkydragon
Copy link
Member

UINT128MAX seems to never exist. What is that supposed to be?

Introduced in e9be444: "Some work to make libjulia compilable on windows using the Intel compiler"

Maybe we want:

// works on gcc 4.6+; clang 3.3+; icc 16+, test with godbolt.org
#if defined(__SIZEOF_INT128__)

Just like what the Linux kernel does.

@oscardssmith
Copy link
Member

The easier approach is to just use the builtin overflow intrinsics which are supported across all the major compilers. This is done as part of the memorynew PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

9 participants