Add commentary about the hash function
linas committed Apr 16, 2024
1 parent 1bc4ee0 commit 43546e0
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions link-grammar/connectors.h
@@ -313,10 +313,25 @@ typedef uint32_t connector_hash_t;

static inline connector_hash_t connector_hash(const Connector *c)
{
	// The use of (c->desc->lc_mask & 1) during hashing is important;
	// see pull req #1487 for details. This raises other questions
	// about hashing. Two forms are attempted below. They appear to
	// be equivalent in terms of measured elapsed-time performance.
	// (I did not look at the quality of the distribution.)
	// The second form multiplies by constants that are sums of a few
	// powers of two, so each multiply acts as a set of mixing bitshifts:
	// 266281 == 1 + 8 + 32 + 4096 + (256*1024); it is a prime number.
	// 524429 == 1 + 4 + 8 + 128 + (512*1024); it is also a prime number.
#ifdef SIMPLE_HASH
	return c->desc->uc_num +
	       (c->multi << 19) +
	       (((connector_hash_t)c->desc->lc_mask & 1) << 20) +
	       (connector_hash_t)c->desc->lc_letters;
#else
	return c->desc->uc_num +
	       c->multi * 266281 +
	       (((connector_hash_t)c->desc->lc_mask & 1) * 524429) +
	       ((connector_hash_t)c->desc->lc_letters) * 101;
#endif
}

/**
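The code comment in this commit notes that the distribution quality of the two forms was not examined. The following is a minimal sketch of how one might compare them; it is not code from the commit. The names `demo_desc`, `demo_connector`, `hash_simple`, `hash_mixed` and `count_collisions` are hypothetical, the field types are guesses, and reducing the hash with `% nbuckets` is only an assumption about how a table might consume it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t connector_hash_t;

/* Hypothetical stand-ins for the real structs in connectors.h.
 * Only the fields read by the hash are kept; their types are assumed. */
typedef struct {
	uint64_t lc_letters;
	uint64_t lc_mask;
	uint32_t uc_num;
} demo_desc;

typedef struct {
	const demo_desc *desc;
	uint8_t multi;              /* assumed to hold 0 or 1 */
} demo_connector;

/* The constants are the sums of powers of two given in the comment. */
_Static_assert(1 + 8 + 32 + 4096 + 256 * 1024 == 266281, "sum for 266281");
_Static_assert(1 + 4 + 8 + 128 + 512 * 1024 == 524429, "sum for 524429");

/* Shift form (the SIMPLE_HASH branch). */
static connector_hash_t hash_simple(const demo_connector *c)
{
	return c->desc->uc_num +
	       ((connector_hash_t)c->multi << 19) +
	       (((connector_hash_t)c->desc->lc_mask & 1) << 20) +
	       (connector_hash_t)c->desc->lc_letters;
}

/* Multiplicative form (the #else branch). */
static connector_hash_t hash_mixed(const demo_connector *c)
{
	return c->desc->uc_num +
	       c->multi * 266281u +
	       (((connector_hash_t)c->desc->lc_mask & 1) * 524429u) +
	       ((connector_hash_t)c->desc->lc_letters) * 101u;
}

typedef connector_hash_t (*hash_fn)(const demo_connector *);

#define MAX_BUCKETS 1024

/* Count the connectors that land in an already-occupied bucket.
 * Requires 0 < nbuckets <= MAX_BUCKETS. */
static size_t count_collisions(hash_fn h, const demo_connector *v,
                               size_t n, size_t nbuckets)
{
	unsigned char used[MAX_BUCKETS] = {0};
	size_t collisions = 0;

	for (size_t i = 0; i < n; i++) {
		size_t b = h(&v[i]) % nbuckets;
		if (used[b]) collisions++;
		else used[b] = 1;
	}
	return collisions;
}
```

Under the modulo assumption, the shift form places `multi` in bit 19 and the `lc_mask` bit in bit 20, so a table with a power-of-two bucket count of 2^19 or fewer cannot separate two connectors that differ only in `multi`. The multiplicative form uses an odd multiplier, which changes the low bits as well. Calling `count_collisions` with each function on a sample of real connectors would show whether this matters in practice.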
