Performance Issue #2
So you mean instead of each block storing just its own length, have it store the sum of lengths of all before it?
Why assume that?
I'd like to benchmark that.
I disagree. Wouldn't it be better to allocate regions that are as big as we can get away with? The fewer regions we have, the better we'll perform.
Sounds wasteful.
Need to benchmark that.
Thanks for getting back, you raise some good points. Would you have any objections to my proposing a pull request? I found your post in a Quora article and have been giving some thought to the problem. I'm not sure the number of regions matters except if they become small enough to affect cache locality; what really matters is the speed of determining which block of memory to index into. If you use fixed-size chunks, it becomes very fast to calculate which chunk to index into. You are right that throwing away chunks that were successfully allocated is wasteful, so I gave that some thought. These chunks could actually be subdivided rather than being discarded, and this maintains great cache locality. The general goal, though, would be to simplify the algorithm by making the chunks all the same size, reducing complexity and gaining back speed.
Yes, but the speed of determining which block of memory to index into depends on the number of blocks. Given a constant total length, smaller blocks mean a larger number of them.
Yes, but how do you determine what the size should be?
Yes, but that would complicate the housekeeping logic behind the scenes.
A noble goal, but the proposed method poses new challenges that may be fun to tackle.
No objections at all! Feel free to fork the repository and make all the changes you need. I've added a "how to contribute" section to the README.md file.
This implementation traverses the list of blocks on each access, checking the size of each block. It would be MUCH faster if the block size were known in advance. It is presumably also acceptable to sacrifice allocation time for indexing time if we assume all of the memory will be used; this can result in gains in amortized time.
It should be preferable to allocate a list of fixed size regions, and if any of those allocations fail, reduce the size by a factor of two (throwing away all previously allocated segments that may have succeeded) and then index into this structure.
This would allow the block to be computed as
index / block_size
and the offset as
index % block_size
Alternatively, a tree-like approach could be applied to reduce the theoretical O(n) traversal to O(log n), but any performance increase, in reality, would likely be overwhelmed by the increased constant runtime components of such an algorithm.