search: introduce a new hyperscan-backed searchdef type: HyperscanSearchDef
#6
Searchkit currently uses Python's re module, which is not known for its "blow your socks off" pattern-scanning performance, so there is an opportunity for optimization by simply swapping the regex engine.
Hyperscan is a highly optimized, performant regex engine that is typically used in high-throughput network packet inspection systems (e.g. DPI, IDS/IPS) for pattern recognition. The work that searchkit does aligns well with hyperscan's strengths, so it would be beneficial for searchkit to let downstream users leverage hyperscan, especially when searching large files.
This patch introduces a hyperscan-backed SearchDef type, HyperscanSearchDef, which can be used as a drop-in replacement for the existing SearchDef type. The patch also adds hyperscan as a dependency and moves the searchkit tests to a base class so that the same tests run against both SearchDef and HyperscanSearchDef.
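
For reviewers unfamiliar with the shared-test-base approach, here is a minimal sketch of the pattern using only the standard library. It is not the code in this patch: `ReBackend`, `BytesBackend`, `SearchTestsBase`, and `run_suite` are hypothetical stand-ins, and neither backend uses searchkit or hyperscan. The point is only that one set of test methods can run against two interchangeable search types.

```python
import io
import re
import unittest


class ReBackend:
    """Hypothetical stand-in for a re-backed search definition."""

    def __init__(self, pattern):
        self._regex = re.compile(pattern)

    def search(self, line):
        # True if the pattern matches anywhere in the line.
        return self._regex.search(line) is not None


class BytesBackend:
    """Hypothetical stand-in for a second engine with the same interface.

    It matches on bytes instead of str, purely to show that the shared
    tests do not care how a backend works internally.
    """

    def __init__(self, pattern):
        self._regex = re.compile(pattern.encode("utf-8"))

    def search(self, line):
        return self._regex.search(line.encode("utf-8")) is not None


class SearchTestsBase:
    """Shared tests; deliberately not a TestCase, so it is never run alone."""

    searchdef_cls = None  # set by each concrete subclass

    def test_match(self):
        sd = self.searchdef_cls(r"error: \d+")
        self.assertTrue(sd.search("kernel error: 42 detected"))

    def test_no_match(self):
        sd = self.searchdef_cls(r"error: \d+")
        self.assertFalse(sd.search("all good"))


class TestReBackend(SearchTestsBase, unittest.TestCase):
    searchdef_cls = ReBackend


class TestBytesBackend(SearchTestsBase, unittest.TestCase):
    searchdef_cls = BytesBackend


def run_suite():
    """Run the shared tests once per backend and return the result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (TestReBackend, TestBytesBackend):
        suite.addTests(loader.loadTestsFromTestCase(case))
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

Covering another backend then only requires one more small subclass that sets `searchdef_cls`; the test methods themselves stay in one place.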