Generator of wordlists unique to your target.

SomeKirill/wordlist_generator

wordlist_generator

wordlist_generator builds a wordlist unique to your target using the techniques from tomnomnom's video "Who, What, Where, When". It takes URLs from gau and extracts directories, file names, and words found on the pages. As an additional feature, it can extract HTML comments. By default, the tool requests only 2000 URLs and extracts all words and directories.
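The directory and file name extraction described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `split_url_paths` is hypothetical, and the rule "the last path segment counts as a file name only if it has an extension" is an assumption made for the example.

```python
from urllib.parse import urlparse, unquote
import posixpath


def split_url_paths(urls):
    """Return (directories, filenames) found in the path of each URL.

    Hypothetical helper: treats the last path segment as a file name
    only when it carries an extension (an assumption for this sketch).
    """
    directories, filenames = set(), set()
    for url in urls:
        # urlparse drops the query string and fragment from .path
        path = unquote(urlparse(url).path)
        segments = [s for s in path.split("/") if s]
        if not segments:
            continue
        last = segments[-1]
        if posixpath.splitext(last)[1] and not path.endswith("/"):
            filenames.add(last)
            segments = segments[:-1]
        directories.update(segments)
    return directories, filenames


# Example URLs of the kind gau might return (made up for illustration).
urls = [
    "https://example.com/api/v1/users.json",
    "https://example.com/static/js/app.min.js?v=3",
    "https://example.com/blog/2020/",
]
dirs, files = split_url_paths(urls)
print(sorted(dirs))   # ['2020', 'api', 'blog', 'js', 'static', 'v1']
print(sorted(files))  # ['app.min.js', 'users.json']
```

Collecting into sets deduplicates segments that recur across many URLs, which is most of them on a typical site.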

To clean the wordlist, wordlist_generator removes every entry that appears in the files of the "denylists" directory, so that only words unique to the target remain. It also cleans the result using regexes from BonJarber's clean_wordlist tool. You can adjust which extensions are ignored during file parsing and page fetching in parsing_allow_extensions.txt and scraping_deny_extensions.txt.
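The denylist step amounts to a set difference between the extracted words and the contents of the denylist files. The sketch below illustrates the idea under several assumptions: the helper names (`load_denylist`, `clean_wordlist`, `should_fetch`) are hypothetical, denylist files are assumed to hold one word per line, extensions are assumed to be listed with a leading dot, and comparison is assumed to be case-insensitive. The regex cleaning borrowed from clean_wordlist is not reproduced here.

```python
from pathlib import Path
from urllib.parse import urlparse
import posixpath


def load_denylist(directory):
    """Collect every non-empty line of every file in `directory` (assumed one word per line)."""
    denied = set()
    for path in Path(directory).iterdir():
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            denied.update(line.strip().lower() for line in text.splitlines() if line.strip())
    return denied


def clean_wordlist(words, denied):
    """Drop denylisted words and duplicates, keeping first-seen order."""
    seen, result = set(), []
    for word in words:
        key = word.strip().lower()  # case-insensitive match is an assumption
        if key and key not in denied and key not in seen:
            seen.add(key)
            result.append(word.strip())
    return result


def should_fetch(url, denied_extensions):
    """Return False when the URL path ends in a denied extension such as '.png'."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return ext not in denied_extensions


# In-memory stand-ins for the denylist files, so the example runs without them.
denied = {"index", "images", "css"}
words = ["index", "Billing", "images", "billing", "export-report"]
cleaned = clean_wordlist(words, denied)
print(cleaned)  # ['Billing', 'export-report']
print(should_fetch("https://example.com/logo.png", {".png", ".jpg"}))  # False
```

With the real files, `denied` would come from something like `load_denylist("denylists")`, assuming the script runs from the repository root.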

Usage:

Examples:

$ ./wordlist_generator.py -d hackerone.com -a 20 -files
$ ./wordlist_generator.py -d bugcrowd.com -a 7500 -dir
$ ./wordlist_generator.py -d intigriti.com > intigriti_wordlist.txt

To display the help for the tool use the -h flag:

$ ./wordlist_generator.py -h

| Flag | Description | Example |
|------|-------------|---------|
| -domain | Target domain | ./wordlist_generator.py -d openbugbounty.org |
| -amount | Amount of URLs to fetch from gau | ./wordlist_generator.py -d twitter.com -a 10000 |
| -dir | Extract only directories | ./wordlist_generator.py -d hackerone.com -dir |
| -f | Extract only filenames | ./wordlist_generator.py -d hackerone.com -f |
| -c | Extract only comments with no filtering | ./wordlist_generator.py -d hackerone.com -c |

Installation:

$ GO111MODULE=on go get -u -v github.com/lc/gau
$ git clone https://github.com/SomeKirill/wordlist_generator/
$ cd wordlist_generator
$ pip3 install -r requirements.txt

Wordlists used in denylists: