Construct and use a 'fuzzing dictionary' #8

Zac-HD · 2022-10-26T08:46:22Z

In fuzzing, a "dictionary" is a corpus of known-interesting fragments (boundary values, HTML tags, etc.) that can be mixed in with randomly generated or mutated data to increase the chance of stumbling across interesting bugs.
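To make "mixed in" concrete, here is a minimal standard-library sketch, not how Hypothesis or HypoFuzz actually does it. The names `DICTIONARY`, `generate`, and `p_dict`, and the particular fragments, are assumptions made up for illustration.

```python
import random

# Hypothetical dictionary of known-interesting fragments (illustrative only).
DICTIONARY = [b"<script>", b"\x00", b"\xff\xff\xff\xff", b"-1", b"2147483647"]


def generate(rng, dictionary, max_chunks=4, p_dict=0.3):
    """Build an input from 1..max_chunks chunks.

    Each chunk is a dictionary entry with probability p_dict,
    and otherwise 0 to 8 uniformly random bytes.
    """
    chunks = []
    for _ in range(rng.randint(1, max_chunks)):
        if dictionary and rng.random() < p_dict:
            chunks.append(rng.choice(dictionary))
        else:
            length = rng.randint(0, 8)
            chunks.append(bytes(rng.randrange(256) for _ in range(length)))
    return b"".join(chunks)
```

Passing a seeded `random.Random` keeps the sketch deterministic, but only as long as the dictionary itself is unchanged between runs.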

Hypothesis already sort of supports this for some types; it's how we boost the chances of boundary integers and "interesting" floats. However, there's currently no mechanism for adding to the pool at runtime, and adding one will take some care to ensure that we can still replay failing examples without that runtime pool. See also HypothesisWorks/hypothesis#3086 and HypothesisWorks/hypothesis#3127 (comment).

Once we've got that, the standard easy way to get a dictionary is to run `strings` on your binary. The natural equivalent for us is to grab our Python source code and collect all the `ast.Constant` values (perhaps excluding long strings, which are likely docstrings).
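A rough sketch of that collection step, using only the standard-library `ast` module. The function name `collect_constants` and the 50-character cutoff are assumptions chosen for illustration. Values are grouped by type so that, for example, `True` and `1` are not merged.

```python
import ast
from collections import defaultdict


def collect_constants(source, max_str_len=50):
    """Map each constant's type to the set of values of that type in `source`.

    Strings and bytes longer than max_str_len are skipped, on the
    assumption that they are probably docstrings or embedded data.
    """
    found = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant):
            value = node.value
            if isinstance(value, (str, bytes)) and len(value) > max_str_len:
                continue
            found[type(value)].add(value)
    return dict(found)
```

To build a dictionary for a whole package, this could be applied to the text of each `.py` file and the results merged.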

A more advanced trick, shading into a full research project, would be to investigate Redqueen-style tracking. For example, "a string in the input matched against this regex pattern in the code, so try generating strings matching that pattern".
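Redqueen proper observes comparisons at runtime. As a much cruder static stand-in, and explicitly not Redqueen itself, one could at least harvest the literal patterns passed to `re` functions in the source. The names `RE_FUNCS` and `collect_regex_patterns` are hypothetical. The sketch only recognises calls written as `re.<func>("literal", ...)`.

```python
import ast

# re-module functions whose first positional argument is a pattern.
RE_FUNCS = {"compile", "match", "fullmatch", "search",
            "findall", "finditer", "sub", "subn", "split"}


def collect_regex_patterns(source):
    """Return string literals passed as the first argument to re.<func>(...)."""
    patterns = set()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and node.args):
            continue
        func = node.func
        is_re_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in RE_FUNCS
        )
        first = node.args[0]
        if is_re_call and isinstance(first, ast.Constant) and isinstance(first.value, str):
            patterns.add(first.value)
    return patterns
```

Each collected pattern could then be handed to something like Hypothesis's `st.from_regex` to generate matching strings. This misses aliased imports, precompiled patterns passed around as variables, and anything built dynamically, which is exactly where runtime tracking would earn its keep.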

Zac-HD mentioned this issue Jun 25, 2024

Surprisingly neither hypothesis nor hypofuzz finds the failing example, where there is a small failing example #39

Open
