Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added NativePerJobThreadIntPtrs #5

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

hexonaut
Copy link

@hexonaut hexonaut commented Nov 6, 2019

I generalized NativePerJobThreadIntPtr to take advantage of the wasted cache line space and allow for multiple counters to run at the same time. The buffer will round up to the next cache line size multiple.

Uses array access operators instead of the single Value and Parallel requires an index, but other than that the code is mostly identical.

Copy link
Owner

@jacksondunstan jacksondunstan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your submission and sorry it's taken me so long to review it. Aside from the comments I added, two more changes are needed before merging. First, all collections have a thorough set of unit tests to go along with them. Second, all collections are mentioned with a small example in the README. Please add these and I'll take a look. Thanks again!

@jacksondunstan
Copy link
Owner

One more thing: there should also be a NativePerJobThreadLongPtrs to keep the parity established by NativePerJobThreadLongPtr and NativeLongPtr.

…ty checks; fixed a bug where the data structure wont work with safety checks off
@hexonaut
Copy link
Author

hexonaut commented Nov 26, 2019

I've implemented all the requested changes except for the removal of the arbitrary length. On line 236 you can see that NumCacheLines = (int)Math.Ceiling((double)length / IntsPerCacheLine); makes it so there is potentially more than 1 cache line. I don't see any issue as long as the blocks are in multiples of a cache line. Do you see any performance or race condition issues with this?

Once we are good with this file I'll add in the equivalent long version, tests and readme update.

@jacksondunstan
Copy link
Owner

Hey @hexonaut , thanks for making all these changes! Allow me to elaborate on my feedback that Length should always be constant. The original NativePerJobIntPtr that this is based on is a pointer to exactly one int. In the description for this version, the idea is to avoid wasting cache line space by supporting more than one int. This makes sense to me since a cache line can indeed fit more than one int and it'd be nice to allow users to access the rest of the int values that are stored in that cache line. This would imply that Length is a constant (IntsPerCacheLine) since the number of int values that fit into a cache line never changes.

That said, the presence of a length parameter to the constructor and respecting that value by potentially allocating an arbitrarily large amount of space for any number of int values changes the nature of the type. No longer is it just exposing the rest of the cache line as a constant-length array, but instead it has become a type like NativeArray<int> that supports arbitrary length arrays. While this is a valid design, it's for a very different purpose than stated in the description. It might be better called NativePerJobThreadIntArrays.

So I'd prefer to keep this type to just the constant-length array of int values that fit in a cache line. Making this change will really simplify this type and focus it on just reducing cache line waste. If there's a desire for per-thread int arrays, we should put that in another type.

@hexonaut
Copy link
Author

hexonaut commented Dec 7, 2019

I see what you mean now. I can split the code into NativePerJobThreadIntArrays and remove length from the NativePerJobThreadIntPtrs constructor. Would you prefer a separate PR for NativePerJobThreadIntArrays?

@jacksondunstan
Copy link
Owner

Yes, a separate PR is probably a good idea to keep the branch size manageable.

One more thing: I was thinking about the name NativePerJobThreadIntPtrs and realized that it implies "many pointers to one int" or possibly "many pointers to many ints", which is not accurate because there aren't many pointers and there aren't necessarily many ints. What do you think about renaming it to NativePerJobThreadIntsPtr, which indicates "one pointer to many ints"?

@hexonaut
Copy link
Author

Sounds good to me. I'll make the name switch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants