-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add other binsizes into hashtable than 1 bit and 8 bits. #51
Comments
We can exploit some efficiencies in the 1-, 2-, 4-, and 8-bit cases because those uniformly "tile" a byte. The other cases do not uniformly tile a byte (e.g., the first 3-bit value appears at bit 0 of byte 0, the second 3-bit value appears at bit 3 of byte 0, the third at bit 6 of byte 0, but the fourth appears at bit 1 (and not bit 0) of byte 1). Bitmasks are thus position-dependent in the latter cases, whereas they are either position-independent or unneeded in the former cases. If one wanted a generalized solution (for n-bit counters, where n >= 1), then I would probably create a template with the generalized bitmask calculation and then create specializations of the template to selectively override this in the cases where we know we can do better. |
Related to issue #27. As I see it, the choice of hash functions is independent of counter memory model. We may wish to split the hash table design into the hashing method used and accesses to the counter memory, since those are essentially orthogonal operations. Splitting the read parsers into data pumps and parsers helped us; I believe that a similar split for hashing and counters, as proposed above, would yield benefits as well. |
Agree C. Titus Brown, [email protected] On Apr 22, 2013, at 18:55, Eric McDonald [email protected] wrote:
|
#1035 proposes using the MiniFloat standard to do this. |
#1551 implements a storage class with 4bits per entry |
Right now, Hashbits supports 1 bit per hashtable entry and CountingHash supports 8 bits per hashtable entry. 2 and 4 bits/entry would be great extensions that could enable substantially decreased memory usage in some useful circumstances; not sure if its worth doing it for arbitrary numbers of bits, or if there would be performance penalties.
The text was updated successfully, but these errors were encountered: