This software analyzes exploits from the Exploit Database (ExploitDB). The ExploitDB repository is fetched from GitHub, and the CSV file that lists all exploits is stored in a MongoDB database. Each exploit is then separated into a comment part and a code part to enable a better static code analysis; the comment part also includes any non-Python code of the exploit. If the separation succeeds for an exploit, the comment analyzer and the code analyzer each analyze their part, extract a set of defined features, and enrich the exploit's document in the database. Finally, all information about the exploits in the database is exported in a defined data schema to a running MISP instance.
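The separation step described above can be illustrated with a minimal sketch. It assumes a purely line-based rule (lines starting with `#` are comments, everything else is code); the function name `split_exploit` and the sample exploit are hypothetical, and the project's actual separation logic may differ.

```python
def split_exploit(source):
    """Split exploit source text into (comment lines, code lines).

    Assumption: a line is a comment if it starts with '#'.
    """
    comments, code = [], []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines belong to neither part
        if stripped.startswith("#"):
            # Drop the leading '#' characters and surrounding whitespace.
            comments.append(stripped.lstrip("#").strip())
        else:
            code.append(line)
    return comments, code


# Hypothetical exploit file content, for illustration only.
sample = (
    "# Exploit Title: Demo\n"
    "# Tested on: Debian\n"
    "import socket\n"
    "\n"
    "s = socket.socket()\n"
)
comments, code = split_exploit(sample)
```

A rule this simple misclassifies multi-line strings and non-Python exploits, which is one reason the separation can fail for some exploits.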
This software was tested on Debian Stretch 9.6 and is written in Python 2.7.13.
Get root rights:
su
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
echo "deb http://repo.mongodb.org/apt/debian stretch/mongodb-org/4.0 main" | tee /etc/apt/sources.list.d/mongodb-org-4.0.list
apt update
apt install -y mongodb-org
mkvirtualenv --system-site-packages --python=<path-to-python2.7-bin> <env-name>
add2virtualenv <path-to-this-repo>/src
In any case, install the requirements:
pip install -r requirements.txt
MongoDB runs on localhost:27017.
systemctl start mongod.service
mongo < mongodb_setup.js
If you want to change the name of the database or collection in mongodb_setup.js, you need to change it in database/db_interface.py as well. By default, the script initializes MongoDB with the database exploitdb and the collection exploits.
Configuration settings are stored in conf/config.cfg. The configuration settings for the test cases are stored in conf/test_config.cfg.
Add your MISP authentication key to the MISP section of the config file and run:
python2.7 -m main
Log files are stored in log/. The logging configuration is in conf/logging.cfg. No changes are needed there.
The first module generates various charts. It needs the exploitdb repo stored at the path given by local_git in the config file. While running, it plots various statistics to data/assets/png/ by default. The directory can be changed in the config file. It uses defined color maps for the values in the charts. To create new maps, run:
python2.7 -m visualization.aggregation --help
to get more instructions. To use the default color maps, run it as follows:
python2.7 -m visualization.aggregation
The second module prints some statistics about the data in the database. Run it as follows:
python2.7 -m visualization.db_statistic
The tests require a running MongoDB instance. Pytest creates a test database with a test collection.
pytest test/
doc/arch/ contains the main architecture of the system. arch_single.xml is the architecture of this component, named "Exploit Information Retrieval". arch.xml is the whole idea of a system that includes two other components, "Exploit Execution" and "Exploit Detection", which are not part of this project. The XML files were created with Draw.io.
arch.vpp is a Visual Paradigm project which includes a UML component diagram, some UML class diagrams and a data schema of this project. doc/arch/diagrams.pdf gives an overview of them.
doc/spec/ contains a LaTeX document which describes the specification. In this directory, run make to generate the PDF file.
The data schema is defined in doc/arch/diagrams.pdf. It shows the data parsed from the CSV file of the GitHub repo, which is stored as documents in the database. The two analyzers enrich these documents with additional extracted information. To do this, the config file has an analyzer section, where the terms to extract are defined. Read more about it in the next section.
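As a rough illustration of how CSV rows can become database documents, here is a minimal sketch. The column names and sample rows are assumptions made for this example and may not match the real CSV file or the schema in doc/arch/diagrams.pdf. The sketch is written for Python 3; under Python 2.7 the `csv` module expects byte strings instead.

```python
import csv
import io

# Hypothetical excerpt; the real column names and values may differ.
SAMPLE_CSV = """id,file,description,date,author,type,platform
1,exploits/linux/remote/1.py,Demo Service - Remote Overflow,2018-01-01,alice,remote,linux
2,exploits/php/webapps/2.txt,Demo CMS - SQL Injection,2018-01-02,bob,webapps,php
"""


def csv_to_documents(text):
    """Turn each CSV row into a dict that could be stored as one document."""
    reader = csv.DictReader(io.StringIO(text))
    documents = []
    for row in reader:
        doc = dict(row)
        doc["id"] = int(doc["id"])  # store the exploit id as a number
        documents.append(doc)
    return documents


documents = csv_to_documents(SAMPLE_CSV)
```

Each resulting dict could then be inserted into the exploits collection, and the analyzers would later add their extracted fields to the same document.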
This component analyzes the comment parts of all exploits, which are stored under data/comments/ by default. It does some pattern and string matching. The pattern matching is done in a general analysis and is part of the base class ExploitAnalyzer. The strings to search for are defined in the config file under comment_tags. If another tag is added there, a new MISP attribute has to be added to the MISP attributes config file.
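The string matching can be pictured with a minimal sketch. It assumes case-insensitive substring matching; the function name `match_tags` and the example tags are hypothetical and are not taken from the project's config file.

```python
def match_tags(comment_text, tags):
    """Report for each tag whether it occurs in the comment text.

    Assumption: matching is a case-insensitive substring search.
    """
    lowered = comment_text.lower()
    return {tag: tag.lower() in lowered for tag in tags}


# Hypothetical tags, standing in for the comment_tags config entry.
comment_tags = ["buffer overflow", "sql injection", "metasploit"]
comment = "Title: Demo FTP Server - Remote Buffer Overflow\nTested on: Debian"
result = match_tags(comment, comment_tags)
```

Because the tags are plain data, adding a new one only requires a config change plus the matching MISP attribute, as described above.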
The CodeAnalyzer analyzes the code part of each exploit, stored by default under data/code/, by doing a static code analysis. Adding a new pattern to look for in the code is not as easy as adding a string for matching in the CommentAnalyzer. To do so, add new code to that component and store the results in the dedicated result dictionary. In addition, add a new term under code_tags in the main config file and a new attribute in the MISP attributes config file.
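To give an idea of what a new static check might look like, here is a minimal sketch that collects the imported modules of a code part with Python's `ast` module. The function name `extract_imports` and the `imports` result key are hypothetical, and this is not necessarily how the CodeAnalyzer works. Note that `ast.parse` only accepts code that is valid for the running interpreter, so Python 2 exploits may fail to parse under Python 3.

```python
import ast


def extract_imports(code):
    """Return the sorted module names imported by the code, or None if unparsable."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return None  # the code part is not valid Python for this interpreter
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


# Hypothetical result dictionary entry for the new check.
sample_code = "import socket\nfrom struct import pack\ns = socket.socket()\n"
result = {"imports": extract_imports(sample_code)}
```

A check like this stores its findings under its own key, which would then need a matching term under code_tags and a matching MISP attribute.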
As pointed out above, a running MISP instance is necessary to run this system. The config file has a section for the MISP configuration. You must change the url and key in that section.
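A minimal sketch of reading and validating these two settings follows. The section name `misp`, the sample values and the function name `load_misp_settings` are assumptions for illustration; check conf/config.cfg for the real layout. The sketch uses the Python 3 module name `configparser` (the module is called `ConfigParser` in Python 2.7).

```python
import configparser

# Hypothetical config content; the real section and option layout may differ.
SAMPLE_CONFIG = """
[misp]
url = https://misp.example.org
key = <your-auth-key>
"""


def load_misp_settings(text):
    """Read url and key from the MISP section and reject empty values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    url = parser.get("misp", "url")
    key = parser.get("misp", "key")
    if not url or not key:
        raise ValueError("MISP url and key must both be set")
    return url, key


url, key = load_misp_settings(SAMPLE_CONFIG)
```

Failing early on an empty url or key avoids starting a long analysis run that cannot export its results.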
The workflow is that the MISP component creates a new event for each exploit. Furthermore, it enriches the created event with the predefined attributes from conf/misp_attributes.cfg. Each new event also gets the predefined object template exploit-poc. In addition, a new tag for this exploitdb analyzer project with a predefined colour is created automatically if it does not exist yet. It is defined in the MISP section of the config file. There is also a list of default taxonomies which are used to tag each event automatically. These are:
- CERT-XLM:intrusion-attempts="exploit-known-vuln"
- CERT-XLM:intrusion-attempts="new-attack-signature"
- cyber-threat-framework:Engagement="exploit-vulnerabilities"
- enisa:nefarious-activity-abuse="exploits-exploit-kits"
- europol-event:exploit-tool-exhausting-resources
If an exploit was changed by the maintainers of ExploitDB, this component does not create a new event. Instead, it looks for the changed or added attributes and processes them accordingly. This is done via the event's UUID, which is stored in the database when a new event is created.
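Finding the changed or added attributes can be sketched as a comparison of two dictionaries. The function name `changed_attributes` and the field names in the example are hypothetical, and the project's actual update logic may differ.

```python
def changed_attributes(stored, current):
    """Return the attributes of `current` that are new or differ from `stored`."""
    return {
        name: value
        for name, value in current.items()
        if name not in stored or stored[name] != value
    }


# Hypothetical documents: the stored version and the freshly parsed version.
stored = {"description": "Demo - Overflow", "platform": "linux"}
current = {"description": "Demo - Remote Overflow", "platform": "linux", "port": "21"}
delta = changed_attributes(stored, current)
```

Only the entries in `delta` would then need to be sent to the existing event identified by the stored UUID.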
This process does not extend existing events, because it is not certain that an existing event is extensible for the ExploitDB events. In any case, MISP automatically correlates events to find existing events with the same attributes.