In fuzzing, a "dictionary" is a corpus of known-interesting fragments (boundary values, HTML tags, etc.) that can be mixed in with randomly generated or mutated data to increase the chance of stumbling across interesting bugs.
We kind of support doing this with Hypothesis for some types already; it's how we boost the chances of boundary integers and "interesting" floats. However, there's currently no mechanism for adding to the pool at runtime, and adding one will take some care to ensure that we can still replay failing examples without that runtime pool. See also HypothesisWorks/hypothesis#3086 and HypothesisWorks/hypothesis#3127 (comment).
Once we've got that, the standard easy way to get a dictionary is to run `strings` on your binary. The natural equivalent is to grab our Python source code and collect all the `ast.Constant` values (perhaps excluding long strings, which are likely docstrings).
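A minimal sketch of that collection step, using only the standard-library `ast` module. The function name `collect_constants`, the `max_str_len` cutoff of 20, and the choice to skip `None`, booleans, and `...` are assumptions for illustration, not anything Hypothesis currently does. Values are grouped by type because `1`, `1.0`, and `True` compare equal and would otherwise collapse into a single set entry.

```python
import ast


def collect_constants(source: str, max_str_len: int = 20) -> dict:
    """Collect literal constants from Python source, grouped by type.

    Hypothetical helper: the cutoff and the skipped values are guesses
    at what makes a useful dictionary entry.
    """
    pool = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Constant):
            continue
        value = node.value
        # None, booleans, and Ellipsis are too common to be informative.
        if value is None or value is Ellipsis or isinstance(value, bool):
            continue
        # Long strings/bytes are probably docstrings or messages.
        if isinstance(value, (str, bytes)) and len(value) > max_str_len:
            continue
        pool.setdefault(type(value), set()).add(value)
    return pool


SAMPLE = '''
def check(s, n):
    """A long docstring that should be skipped because it is not a useful fragment."""
    if s == "magic" and n > 255:
        return 0.5
    return None
'''

pool = collect_constants(SAMPLE)
```

Note that a literal like `-1` parses as a unary minus applied to `Constant(1)`, so this sketch would record `1` rather than `-1`; a fuller version might handle `ast.UnaryOp` as well.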
A more advanced trick, shading into full research project, would be to investigate Redqueen-style tracking. For example, "a string in the input matched against this regex pattern in the code, so try generating strings matching that pattern".
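Real Redqueen-style tracking observes comparisons at runtime, which is well beyond a short example. As a much cruder static approximation of the regex case, the sketch below scans source code for string literals passed as the first argument to functions on a module named `re`. The helper name `collect_regex_patterns` and the list of function names are assumptions for illustration; it will miss aliased imports, precompiled patterns passed around as variables, and anything built dynamically.

```python
import ast

# Functions in the `re` module whose first positional argument is a pattern.
RE_FUNCS = {
    "compile", "match", "fullmatch", "search",
    "findall", "finditer", "split", "sub", "subn",
}


def collect_regex_patterns(source: str) -> set:
    """Return string literals used as patterns in `re.<func>(...)` calls.

    Hypothetical helper: only handles the literal spelling `re.<func>`.
    """
    patterns = set()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and node.args):
            continue
        func = node.func
        is_re_call = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in RE_FUNCS
        )
        first = node.args[0]
        if is_re_call and isinstance(first, ast.Constant) and isinstance(first.value, str):
            patterns.add(first.value)
    return patterns


RE_SAMPLE = r'''
import re
DATE = re.compile(r"\d{4}-\d{2}")

def parse(text, other):
    if re.match("[a-z]+", text):
        return other.match("not-a-regex", text)
'''

found_patterns = collect_regex_patterns(RE_SAMPLE)
```

Each collected pattern could then be handed to a regex-based generator such as Hypothesis's `st.from_regex` to produce matching strings.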