forked from katef/libfsm
cdata: Intern individual words from 256-bitsets.
Each state has two 256-bitsets, each stored as a uint64_t[4], but the individual words in them are heavily duplicated. Add a label_word table holding every unique word, sorted by descending frequency, and replace the per-state labels and label_group_starts arrays with an array of offsets into that table. These offsets typically fit in a uint8_t (the code generation switches to a uint16_t when necessary), which makes the per-state data much smaller. The most commonly used label_word entries are grouped together at the start of the table, so they should stay in cache.
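The interning step described above could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names intern_words, word_count, and offsets_fit_in_u8 are hypothetical, the input is assumed to be all bitset words flattened into one array, and a linear search stands in for whatever lookup the real generator uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: one unique 64-bit word and how often it occurs. */
struct word_count {
    uint64_t word;
    size_t count;
};

/* Order by descending count; break ties by word value so output is deterministic. */
static int
cmp_count_desc(const void *pa, const void *pb)
{
    const struct word_count *a = pa;
    const struct word_count *b = pb;
    if (a->count != b->count) { return a->count > b->count ? -1 : 1; }
    if (a->word != b->word) { return a->word < b->word ? -1 : 1; }
    return 0;
}

/* Intern n_words words (e.g. 4 per 256-bitset, all states concatenated).
 * On success, *table_out is a malloc'd table of unique words sorted by
 * descending frequency (caller frees), offsets_out[i] is the table index
 * of words[i], and the number of unique words is returned.
 * Returns 0 on empty input, allocation failure, or if the offsets would
 * not fit in a uint16_t. Assumption: O(n * unique) search is acceptable
 * for a sketch. */
size_t
intern_words(const uint64_t *words, size_t n_words,
    uint64_t **table_out, uint16_t *offsets_out)
{
    assert(words != NULL && table_out != NULL && offsets_out != NULL);
    *table_out = NULL;
    if (n_words == 0) { return 0; }

    struct word_count *counts = malloc(n_words * sizeof(*counts));
    if (counts == NULL) { return 0; }

    /* Pass 1: count occurrences of each distinct word. */
    size_t unique = 0;
    for (size_t i = 0; i < n_words; i++) {
        size_t j;
        for (j = 0; j < unique; j++) {
            if (counts[j].word == words[i]) { break; }
        }
        if (j == unique) {
            counts[unique].word = words[i];
            counts[unique].count = 0;
            unique++;
        }
        counts[j].count++;
    }

    if (unique > (size_t)UINT16_MAX + 1) {
        free(counts);
        return 0;
    }

    /* Most frequent words first, so hot entries sit together. */
    qsort(counts, unique, sizeof(*counts), cmp_count_desc);

    uint64_t *table = malloc(unique * sizeof(*table));
    if (table == NULL) {
        free(counts);
        return 0;
    }
    for (size_t j = 0; j < unique; j++) { table[j] = counts[j].word; }
    free(counts);

    /* Pass 2: replace every word with its offset into the sorted table. */
    for (size_t i = 0; i < n_words; i++) {
        size_t j = 0;
        while (table[j] != words[i]) { j++; }
        offsets_out[i] = (uint16_t)j;
    }

    *table_out = table;
    return unique;
}

/* A uint8_t offset can address at most 256 table entries. */
int
offsets_fit_in_u8(size_t unique)
{
    return unique <= (size_t)UINT8_MAX + 1;
}
```

With a table built this way, a generator could emit uint8_t offsets whenever offsets_fit_in_u8 holds and fall back to uint16_t otherwise, so each state would store small indices in place of full 64-bit words.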
1 parent d3d1c57, commit 5d44725
Showing 1 changed file with 133 additions and 15 deletions.