This tool implements a multi-labeling procedure for Android malware apps, based on querying known detection engines.
- Python is the programming language used. Python 3.9.2 or later is recommended.
- PostgreSQL is the database used. PostgreSQL 13.4 or later is recommended.
- VirusTotal is used to analyze the acquired APKs and, if they are malware, to apply multi-labeling.
- First, simply clone this repo:
git clone _this_repo_
- Important: Next, download two large files that are not included in the git clone. To get the files database.sql and /apks_hashes/list_of_hashes_complete, visit this link.
- Install the requirements:
pip install -r requirements.txt
- Set the credentials in the /db/database.ini file.
- Then, create the database:
sudo -u postgres psql -c 'create database database_name;'
- Optional: Import the database (database.sql):
pg_restore -h localhost -d database_name -U postgres database.sql
NOTE: If you run into issues, check your permissions and psql passwords. Here you can consult some common issues with psql authentication.
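For the credentials step above, here is a minimal sketch of what /db/database.ini might contain and how such a file could be read in Python. The `[postgresql]` section name, the `host`, `database`, `user`, and `password` keys, and the `load_db_params` helper are all assumptions made for illustration; check the file shipped with the repository for the actual layout.

```python
from configparser import ConfigParser

# Hypothetical contents of db/database.ini. The section and key names are
# assumptions, not taken from the repository.
EXAMPLE_INI = """
[postgresql]
host = localhost
database = database_name
user = postgres
password = change_me
"""


def load_db_params(ini_text, section="postgresql"):
    """Return the connection settings from an INI string as a dict."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found")
    return dict(parser.items(section))


params = load_db_params(EXAMPLE_INI)
print(params["database"])
```

To read the real file instead of the inline string, pass the result of `open("db/database.ini").read()` to the helper.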
To run the program, you first need a VirusTotal API key.
To obtain it, register on the VirusTotal page and open the following link: https://www.virustotal.com/gui/user/{username}/apikey, where {username} is the username of your account.
Then, write the hash(es) you want to analyze (preferably SHA256) in the file apks_hashes/list_of_selected_sha256 (one hash per line).
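Before launching the tool, it can help to check that the hash file is well formed. The following is a minimal sketch, not part of the tool: `split_hashes` is a hypothetical helper that accepts only SHA-256 strings (64 hexadecimal characters), so any other hash type would be reported as invalid here.

```python
import re

# A SHA-256 digest written in hex is exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")


def split_hashes(lines):
    """Split lines into (valid, invalid) SHA-256 strings, skipping blanks."""
    valid, invalid = [], []
    for line in lines:
        candidate = line.strip()
        if not candidate:
            continue  # ignore empty lines
        if SHA256_RE.fullmatch(candidate):
            valid.append(candidate.lower())
        else:
            invalid.append(candidate)
    return valid, invalid


# Example input: one well-formed hash, one blank line, one malformed entry.
sample = [
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\n",
    "\n",
    "not-a-hash\n",
]
valid, invalid = split_hashes(sample)
print(len(valid), "valid,", len(invalid), "invalid")
```

To check the real list, pass an open file object, for example `split_hashes(open("apks_hashes/list_of_selected_sha256"))`.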
At this point, you can launch the tool apkcollector.py with the -k argument to start processing the hashes found in the apks_hashes/list_of_selected_sha256 file.
`python3 apkcollector.py -k [VirusTotal API key] (-d True)`
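To give an idea of the kind of lookup involved, here is a hedged sketch of a file-report query against the public VirusTotal API v3, and of how per-engine labels could be extracted from the response. This is not the code of apkcollector.py, which may query VirusTotal differently; the helper names are hypothetical and the report fragment at the end is made-up data.

```python
import json
import urllib.request

# VirusTotal API v3 file-report endpoint; the key is sent in the x-apikey header.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_request(sha256, api_key):
    """Build (but do not send) a VirusTotal v3 file-report request."""
    return urllib.request.Request(
        VT_FILE_URL.format(sha256), headers={"x-apikey": api_key}
    )


def fetch_report(sha256, api_key):
    """Send the request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(build_request(sha256, api_key)) as resp:
        return json.load(resp)


def malicious_labels(report):
    """Map engine name -> label for engines that flagged the sample."""
    results = report["data"]["attributes"]["last_analysis_results"]
    return {
        engine: verdict["result"]
        for engine, verdict in results.items()
        if verdict.get("category") == "malicious" and verdict.get("result")
    }


# Offline example with a hand-written report fragment (made-up data).
fake_report = {
    "data": {
        "attributes": {
            "last_analysis_results": {
                "EngineA": {"category": "malicious", "result": "Android.Trojan.Example"},
                "EngineB": {"category": "undetected", "result": None},
            }
        }
    }
}
print(malicious_labels(fake_report))
```

Collecting one label per flagging engine, as above, is one simple way to obtain several labels for the same sample; the tool's actual labeling logic may differ.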
You can check the results with a few simple queries to the database.
- Connect to the PostgreSQL database:
psql database_name user
- You can show the tables of the database with:
\dt
- The results of the multi-labeling are stored in the multi_labeling table (specifically in the malware column). Some example queries:
SELECT * FROM multi_labeling;
(to see all the info)
SELECT malware FROM multi_labeling WHERE hash_id LIKE 'valid_hash_id';
(to see the malware labels assigned for a particular sample)