Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't capture arrays/tuples in BoundsError #43738

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

aviatesk
Copy link
Member

@aviatesk aviatesk commented Jan 10, 2022

This commit changes the semantics of BoundsError so that it doesn't
capture arrays/tuples passed to its constructor.

The change wouldn't improve any performance by itself because the
BoundsError constructor is mostly called on error paths and thus
it is opaque to analyzers/optimizers when compiling a caller context.
Rather it's supposed to be a building block for broadening the possibility
of certain memory optimizations in the future such as copy elision for
ImmutableArray construction and stack allocation of Arrays, by
allowing us to assume the invariant that primitive indexing operations
into array/tuple don't escape it so that our escape analyses
can achieve further accuracy.

Specifically, BoundsError constructor will now compute the "summary"
of its Array/Tuple argument, rather than capturing it for the later
inspection. Assuming Base.summary(x::Array) or Base.summary(x::Tuple)
don't escape x, we can assume that BoundsError doesn't escape x.
This change won't appear as breaking in most cases since showerror will
print the exact same error message as before, but obviously this is
technically breaking since we can no longer access to the original
arrays/tuples by catching BoundsError.
I'd say this breaking semantic change would be still acceptable, since
I think it's enough if we can know size/type of arrays/tuples from
BoundsError in most cases.

The last note, I didn't change the semantics for arbitrary user objects,
since we still don't have a good infrastructure to tell the compiler some
primitive assumptions/rules about user type objects anyway.

@aviatesk aviatesk added arrays [a, r, r, a, y, s] breaking This change will break code labels Jan 10, 2022
This commit changes the semantics of `BoundsError` so that it doesn't
capture arrays/tuples passed to its constructor.

The change wouldn't improve any performance by itself because the
`BoundsError` constructor is mostly called on error paths and thus
it is opaque to analyses/optimizers when compiling a caller context.
Rather it's supposed to be a building block for broadening the possibility
of certain memory optimizations in the future such as copy elision for
`ImmutableArray` construction and stack allocation of `Array`s, by
allowing us to assume the invariant that primitive indexing operations
into array/tuple indexing don't escape it so that our escape analyses
can achieve further accuracy.

Specifically, when `BoundsError` constructor will now compute the "summary"
of its `Array`/`Tuple` argument, rather than capturing it for the later
inspection. Assuming `Base.summary(x::Array)` or `Base.summary(x::Tuple)`
don't escape `x`, we can assume that `BoundsError` doesn't escape `x`.
This change won't appear as breaking in most cases since `showerror` will
print the exact same error message as before, but obviously this is
technically breaking since we can no longer access to the original
arrays/tuples by catching `BoundsError`.
I'd say this breaking semantic change would be still acceptable, since
I think it's enough if we can know size/type of arrays/tuples from
`BoundsError` in most cases.

As a last note, I didn't change the semantics for arbitrary user objects,
since we still don't have a good infrastructure to tell the compiler some
primitive assumptions/rules about user type objects anyway.
@@ -139,11 +139,19 @@ JL_DLLEXPORT void JL_NORETURN jl_atomic_error(char *str) // == jl_exceptionf(jl_
jl_throw(jl_new_struct(jl_atomicerror_type, msg));
}

void JL_NORETURN jl_throw_bounds_error(jl_value_t *v, jl_value_t *t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this code have a special meaning in the context of the other changes, or is this just a refactor?

@ianatol
Copy link
Member

ianatol commented Jan 10, 2022

LGTM, and iirc @JeffBezanson pitched something along these lines when we last discussed this topic. If this looks good to everyone, this should basically be the last requirement to merge #42465 (modulo some more tests / minor changes due to reviews)

# these `invoke_in_world`s here essentially prohibits users from changing
# the semantics of `BoundsError` by overriding `summary(a)` implementations
# (and potentially allowing `a` to be escaped to somewhere)
desc = Base.invoke_in_world(Base._BASE_DEFINED_AGE, Base.summary, a)::String
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried about that. When I joked offline that we "supported" this, @JeffBezanson was "We do not!".

I think a better approach is to define a Core.throw_bounds_error that is defined in the inference world-age.
and have jl_throw_bounds_error invoke in that world-age.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we really need to (or should) call summary here; saving the size and type should be enough.

To handle calling size, we could define constructors here only for Array and Tuple, and add a constructor in Base for other types so it can call Base.size. (I hope to have a better general solution for this one day, but for now this is how it is.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size -> axes

@ianatol ianatol mentioned this pull request Jan 10, 2022
@vtjnash vtjnash added the triage This should be discussed on a triage call label Jan 19, 2022
@JeffBezanson
Copy link
Member

From triage: let's change it to capture the type and axes. For getfield we can use the number of fields as the "axes".

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Jan 20, 2022
@ianatol
Copy link
Member

ianatol commented Jan 20, 2022

This is on top of my list to sort out once I can focus on #42465 again

@vtjnash
Copy link
Member

vtjnash commented Feb 23, 2022

Do we need to (or should we) think about TypeError (typeassert) also?

@ianatol
Copy link
Member

ianatol commented Feb 23, 2022

@vtjnash Are you referring to potential escape through the got field?

The last I talked to Jeff, I think the idea for this PR was trying to mostly handle potential BoundsError escapes at the thrower rather than in the constructor. I wonder if we could do something similar for TypeError --- i.e., require got::Type and call typeof at the thrower if needed?

@vtjnash
Copy link
Member

vtjnash commented Feb 23, 2022

Yeah, but I suppose this is sufficient for now. I think UndefRefError is another candidate where we eventually want to capture something about the array itself too, to provide better context in the error message, but which conflicts with the intent of this PR?

@ianatol
Copy link
Member

ianatol commented Feb 23, 2022

We could take a similar approach with UndefRefError and capture type/axes, but from a cursory look it seems like it's sometimes used in places where we don't really have access to that info? I agree that should probably wait until later anyways.

I don't think bundling BoundsError and TypeError changes in this PR would be too much harder (famous last words) than just doing BoundsError, since there's not too many TypeError throws in Base.

@oscardssmith oscardssmith added triage This should be discussed on a triage call forget me not PRs that one wants to make sure aren't forgotten labels Oct 7, 2024
@oscardssmith
Copy link
Member

We should to get this rebased and merged. It will be very nice once we get #55913

@LilithHafner
Copy link
Member

Bump, but I don't think we need triage to talk about this.

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] breaking This change will break code forget me not PRs that one wants to make sure aren't forgotten
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants