Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Remove C shared memory shim. Refine shared memory utils behavior. Add unit test for shared memory #797

Merged
merged 7 commits into from
Oct 31, 2024

Conversation

GuanLuo
Copy link
Contributor

@GuanLuo GuanLuo commented Oct 22, 2024

What does the PR do?

Add unit test to constraint the behavior of shared memory utilities

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Where should the reviewer start?

test_shared_memory.py has inline comment on change in behavior

Test plan:

L0_python_client_unit_tests

  • CI Pipeline ID:

19586188

Caveats:

This PR is marked as draft to collect feedback on the test cases and API level changes, the PR will be updated with new implementation which uses Python's native shared memory support

Background

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@fpetrini15
Copy link
Contributor

Summarizing the changes, it seems like the idea is to more concretely delineate handling when shared memory keys collide? IIUC, the first to register the shared memory controls the size of the region. Subsequent shared memory handles can register with the same key, but must be <= to the original registration size and set create to False.

As a result, we must now account for scenarios where the initial registrant has been deleted. In such a case, other handles registered with the same key will encounter an unlink error/will not exist in the shm region map.

@GuanLuo
Copy link
Contributor Author

GuanLuo commented Oct 22, 2024

Summarizing the changes, it seems like the idea is to more concretely delineate handling when shared memory keys collide?

Yes, the goal is to have unit test to enforce the shared memory utils behavior, whereas previously there is no detailed documentation on it, and I would like to restrict the behavior and document it.

i.e. you may call create_shared_memory_region on the same key with different byte size, and the later call will succeed and in fact resizing the existing shared memory object. I am not OS expert to be sure on what is the implication but intuitively believe that is not good behavior as it makes previous handle carry inconsistent information.

As a result, we must now account for scenarios where the initial registrant has been deleted. In such a case, other handles registered with the same key will encounter an unlink error/will not exist in the shm region map.

Yes, I think I would want to do more implementation change on the shared memory utils to have more internal tracking and only unlink the region when the destroy call on the last handle.

One other change I am going to make is that make create API default to be "allow create or reuse" to avoid breaking existing usage, some of our tests fail due to current change of raising error on reuse by default (ref). Even though it ends up being the test is not cleaning up shared memory properly between test cases, but I am going to avoid this backward-incompatible change, and instead provide option create_only for users that doesn't want the ambiguity from reusing region that may be different from requested size.

@GuanLuo GuanLuo requested a review from fpetrini15 October 23, 2024 01:45
@GuanLuo
Copy link
Contributor Author

GuanLuo commented Oct 23, 2024

test cases and implementation ready for review

@GuanLuo GuanLuo changed the title [draft] refactor: refine shared memory utils behavior. add unit test for shared memory refactor: Remove C shared memory shim. Refine shared memory utils behavior. Add unit test for shared memory Oct 23, 2024
Whether a shared memory region must be created. If False and
a shared memory region of the same name exists, a handle to that
shared memory region will be returned and user must be aware that
the shared memory size can be different from the size requested.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the shared memory size can be different from the size requested.
the previously allocated shared memory size can be larger than the size requested.

Nit. Current language seems to imply the user can request a size that is larger than previously allocated. This was valid behavior before, but is not with your changes now. I think we want to make that clear here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the change but still refer to that the size can be different instead of larger. This part is actually a bit fuzzy. In the previous implementation you can actually write / read from the mapped memory even if it is smaller / larger than the actual region size, which is obviously wrong. Current implementation at least make sure that the created region will be reused without modification, and provide basic region boundary check. So it is now an error if you requested a larger size and write / read larger data, but if you requested a smaller size, then it is fine as long as your following actions act on the requested (smaller) size.

Comment on lines +151 to +156
shm_tensor_view = np.ndarray(
input_value.shape,
input_value.dtype,
buffer=shm_handle._mpsm_handle.buf[offset:],
)
)
offset_current += byte_size
return
shm_tensor_view[:] = input_value[:]
Copy link
Contributor

@rmccorm4 rmccorm4 Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't buffer=shm_handle._mpsm_handle.buf[offset:], fill the array?

Are we immediately overwriting the np array contents with shm_tensor_view[:] = input_value[:]? Or using np array underlying buffer as a mechanism to write to the mpsm_handle.buf?

https://numpy.org/devdocs/reference/generated/numpy.ndarray.html

buffer : object exposing buffer interface, optional
Used to fill the array with data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The passed buffer is the underlying buffer of the numpy array, this method is used in the shared memroy example

Copy link
Contributor

@fpetrini15 fpetrini15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work!

@GuanLuo GuanLuo merged commit 9b2b42d into main Oct 31, 2024
3 checks passed
@GuanLuo GuanLuo deleted the gluo-shared_memory branch October 31, 2024 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants