Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make memorynew intrinsic #56803

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

oscardssmith
Copy link
Member

Attempt to split up #55913 into 2 pieces. This piece now only adds the memorynew intrinsic without any of the optimizations enabled by #55913. As such, this PR should be ready to merge now. (and will make #55913 smaller and simpler)

@oscardssmith oscardssmith added the arrays [a, r, r, a, y, s] label Dec 11, 2024
@KristofferC KristofferC added the needs pkgeval Tests for all registered packages should be run with this change label Dec 11, 2024
src/common_symbols2.inc Outdated Show resolved Hide resolved
src/pipeline.cpp Outdated Show resolved Hide resolved
src/rtutils.c Show resolved Hide resolved
test/core.jl Show resolved Hide resolved
base/genericmemory.jl Outdated Show resolved Hide resolved
base/essentials.jl Outdated Show resolved Hide resolved
doc/src/manual/performance-tips.md Show resolved Hide resolved
doc/src/manual/performance-tips.md Show resolved Hide resolved
@oscardssmith oscardssmith force-pushed the os-memorynew-light branch 2 times, most recently from 69d40b6 to 4ec80ce Compare December 11, 2024 18:00
@oscardssmith oscardssmith changed the title make memorynew intrinsic make memorynew intrinsic (part 1) Dec 11, 2024
@gbaraldi
Copy link
Member

This is much worse than the current implementation btw. For this you at least need a specialized builtin implementation in codegen, even if it just forwards the arguments.

Co-authored-by: Jameson Nash <[email protected]>
Co-authored-by: Jeff Bezanson <[email protected]>
Co-authored-by: Gabriel Baraldi  <[email protected]>
@oscardssmith
Copy link
Member Author

To prevent this PR from being a regression (and to fix the LLVM names test, I think the right way to go is to add the dynamic length version of codegen to this PR. It always goes through C (so LLVM won't be able to delete the whole allocation), but this way this PR on it's own is ~3ns faster to allocate arrays than master without the glory/risk of LLVM deleting the allocation entirely.

@oscardssmith
Copy link
Member Author

oscardssmith commented Dec 12, 2024

@KristofferC I don't think this PR needs a pkgeval. It doesn't have any of the risks of #55913 wrt weird miscompiles. (that said, if you disagree, feel free to run it).

@oscardssmith oscardssmith added performance Must go faster compiler:codegen Generation of LLVM IR and native code labels Dec 12, 2024
@oscardssmith oscardssmith changed the title make memorynew intrinsic (part 1) make memorynew intrinsic Dec 12, 2024
@KristofferC
Copy link
Member

KristofferC commented Dec 12, 2024

I am surprised you would say that and I strongly disagree. The other PR was also just about to be merged before PkgEval showed it was quite buggy so it would be unfortunate to repeat the same mistake again. The whole emit_memorynew looks like it could cause miscompilation issues?

@KristofferC
Copy link
Member

This is much worse than the current implementation btw. For this you at least need a specialized builtin implementation in codegen, even if it just forwards the arguments.

Has this been done now? It is hard to follow the evolution of the PR when the original commit gets amended over and over.

@KristofferC
Copy link
Member

Also, benchmarks should be run here since it touches some core array functionality and to show the performance benefit and that no regressions gets introduced (e.g. from the seemingly new boundscheck).

Compiler/src/tfuncs.jl Outdated Show resolved Hide resolved
src/rtutils.c Show resolved Hide resolved
src/codegen.cpp Outdated Show resolved Hide resolved
@oscardssmith
Copy link
Member Author

The whole emit_memorynew looks like it could cause miscompilation issues?

@KristofferC the reason I say this version is much lower risk is that none of the segfaults that we were seeing come from the emit_memorynew function. (i.e. if you deleted the LLVM optimization passes we do, it would compile correctly). The cause of potential miscompiles in the other PR comes from the fact that if we teach the compiler how to do a Memory allocation completely (i.e. without the ccall that exists in this version), the compiler is able to analyze it and remove the allocation if it believes the allocation doesn't leak. This version of the PR keeps it as an "opaque" object allocation so the class of miscompile we were seeing can't occur. (and hence I think the regular Julia tests are likely to be sufficient for testing the PR).

Has this been done now

yes.

Value *add_with_overflow = ctx.builder.CreateCall(intr, {nel_unboxed, nbytes});
nbytes = ctx.builder.CreateExtractValue(add_with_overflow, 0);
Value *overflow1 = ctx.builder.CreateExtractValue(add_with_overflow, 1);
overflow = ctx.builder.CreateOr(overflow, overflow1);
Copy link
Member

@topolarity topolarity Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the issues we ran into on the last PR were mostly from complicated interaction with a downstream pass, but I think it is worth writing unit tests that target this specific path in codegen

Can you add those please?

(i.e. test nel == 0 vs. nel != 0, isunion vs. !isunion, etc.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure (although just being able to build Julia is a pretty good test that none of these cases are broken)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've actually realized that this is fairly difficult to test for directly in ways that the compiler won't cheat at. i.e. inference knows that memorynew(Memory{Int}, 5) returns a Memory{Int}, so testing return type will just be optimized out unless I go through a bunch of effort to try to fool the compiler (which also definitely would have segfaulted if it couldn't allocate a Memory).

Comment on lines +4462 to +4463
if (inst == NULL)
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means the type isn't Const (i.e. for code like
Memory{rand((Int, Float64))}(undef, 1)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that the check 3 lines above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, right. this case might be for Memory{1}(undef, 1) or something like that where it is Const, but not valid.

@oscardssmith
Copy link
Member Author

@nanosoldier runtests()

@oscardssmith
Copy link
Member Author

turns out that we want #56817 before this PR anyway since the segfaults on the OG memorynew PR were just a bug that cleaning up the IR magically is enough to trigger.

@vchuravy
Copy link
Member

and hence I think the regular Julia tests are likely to be sufficient for testing the PR

You are still adding more precise side effects in the modelling of the intrinsic that could have knock on effects down the road. I agree that this PR is lower risk, but PkgEval is not that expensive and it really helps being able to have that strong signal.

@maleadt
Copy link
Member

maleadt commented Dec 13, 2024

turns out that we want #56817 before this PR anyway since the segfaults on the OG memorynew PR were just a bug that cleaning up the IR magically is enough to trigger.

Better wait before triggering PkgEval then? I had to restart the server, so your submission was lost; you'll need to re-trigger (either right away, or after #56817 is merged).

@oscardssmith
Copy link
Member Author

@nanosoldier runtests()

@oscardssmith
Copy link
Member Author

no clue what's up with packageeval. if it doesn't come back within the next day, I'm merging without it. I have at least 1 PR (and hopefully another few after) that depend on this that I'm hoping to land in the 1.12 window.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] compiler:codegen Generation of LLVM IR and native code needs pkgeval Tests for all registered packages should be run with this change performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants