Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing store overlap checks #1111

Conversation

manopapad
Copy link
Contributor

Fixes #1110

@manopapad manopapad added the category:bug-fix PR is a bug fix and will be classified as such in release notes label Jan 4, 2024
@manopapad manopapad requested a review from magnatelee January 4, 2024 03:26
@magnatelee
Copy link
Contributor

We should modify _copy_if_overlapping so it would make a copy only for partial overlapping. This code as it's currently written would make a copy even for cases like cunumeric.add(a, b, out=a), which would be functionally correct but bad for performance.

@manopapad
Copy link
Contributor Author

@magnatelee I assume you only want to catch cases when the rhs and lhs Stores are backed by the exact same (sub-)region? If I'm not mistaken, currently two instances of the same slice expression, e.g. a = arr[1:]; b = arr[1:] will create separate partitions off of the root, so even though they are conceptually covering the same extents, Legate will not be able to coallesce them into the same RegionRequirement.

@magnatelee
Copy link
Contributor

Isomorphic slices are all mapped to the same sub-region/partition, so they can be coalesced to the same region requirement. That being said, the Python core has a bug that prevents the coalescing, unrelated to the mapping between slices and partitions. This issue has already been fixed in the internal core, so we should handle all kinds of exact overlapping correctly in this PR.

Comment on lines +1109 to 1114
if (
view.base.has_storage
and rhs.base.has_storage
and view.base.storage.same_handle(rhs.base.storage)
):
return
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively this could be rolled into a Store.maybe_equals(), and/or move this into DeferredArray.copy().

@manopapad
Copy link
Contributor Author

That being said, the Python core has a bug that prevents the coalescing, unrelated to the mapping between slices and partitions.

If this bug is still present, i.e. the coalescing is not guaranteed to happen, then we cannot safely apply this optimization at the moment. We must be certain that the RegionRequirements we feed to Legion will not overlap, because Legion will not check this on release mode, and this can lead to silent data corruption, as seen in #1110.

I suggest we merge this bugfix for now (as well as the accompanying nv-legate/legate#925 which fixes a related bug on the legate.core side) and tackle this optimization in the future, once the core can provide the coalescing guarantee. I opened #1112 to track this.

@manopapad manopapad merged commit 8693a3d into nv-legate:branch-24.01 Jan 26, 2024
14 of 15 checks passed
@manopapad manopapad deleted the mp/2024-01-03/missing-overlap-checks branch January 26, 2024 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:bug-fix PR is a bug fix and will be classified as such in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cunumeric and numpy behave differently for some inplace operations and slicing
2 participants