nesg-ugr/Multi-Labeling-Malware

Multi-Labeling Malware

Project description

This tool implements a multi-labeling procedure for malicious Android apps, based on querying known detection engines.
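As a rough illustration of what multi-labeling from several detection engines can look like, the sketch below merges per-engine verdicts into a list of distinct labels, most frequently reported first. It is not taken from this repository: the function name `collect_labels`, the engine names, and the normalization (trimming and lowercasing) are assumptions made for the example, and the tool's own procedure may differ.

```python
from collections import Counter
from typing import Dict, List, Optional


def collect_labels(engine_results: Dict[str, Optional[str]]) -> List[str]:
    """Merge per-engine verdicts into distinct labels, most frequent first.

    `engine_results` maps a (hypothetical) engine name to the label it
    reported, or to None / "" when the engine did not flag the sample.
    """
    counts = Counter(
        label.strip().lower()          # assumed normalization: trim + lowercase
        for label in engine_results.values()
        if label and label.strip()     # skip engines with no verdict
    )
    # Sort by descending count, then alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ordered]


# Hypothetical verdicts for a single sample.
example_results = {
    "EngineA": "Trojan.AndroidOS.Agent",
    "EngineB": None,
    "EngineC": "trojan.androidos.agent",
    "EngineD": "Adware.Generic",
}
example_labels = collect_labels(example_results)
```

A sample flagged by several engines can therefore carry more than one label, which is the sense in which the result is "multi-labeled".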

Tools

  • Python is the programming language used. Python 3.9.2 or later is recommended.
  • PostgreSQL is the database used. PostgreSQL 13.4 or later is recommended.
  • VirusTotal is used to analyze the acquired APKs and, if they are malware, to apply multi-labeling.

Downloading and Preparing the environment

  1. First, simply clone this repo: git clone _this_repo_

  2. Important: Next, download two large files that are not included in the git clone: database.sql and /apks_hashes/list_of_hashes_complete. To download them, visit this link.

  3. Install the requirements: pip install -r requirements.txt

  4. Set the credentials in the /db/database.ini file.

  5. Then, create the database: sudo -u postgres psql -c 'create database database_name;'

  6. Optional: Import the database (database.sql): pg_restore -h localhost -d database_name -U postgres database.sql

NOTE: If you run into issues, check your permissions and psql passwords. Here you can consult some common issues with psql authentication.
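To check that the credentials file is readable before launching the tool, a small script such as the following can help. This is a minimal sketch under assumptions: the README does not show the layout of /db/database.ini, so the `[postgresql]` section name and the `host`, `database`, `user`, and `password` keys are hypothetical, as is the helper name `load_db_config`.

```python
from configparser import ConfigParser
from typing import Dict


def load_db_config(path: str, section: str = "postgresql") -> Dict[str, str]:
    """Read one section of an INI file into a plain dictionary.

    The default section name is an assumption; adjust it to match the
    actual contents of db/database.ini.
    """
    parser = ConfigParser()
    # ConfigParser.read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise FileNotFoundError(f"Could not read config file: {path}")
    if not parser.has_section(section):
        raise KeyError(f"Section [{section}] not found in {path}")
    return dict(parser.items(section))
```

For example, `load_db_config("db/database.ini")` would return a dictionary of connection settings if the file follows the assumed layout, and raise a clear error otherwise.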


Running the program

To run the program, you first need a VirusTotal API key.

To obtain one, register on the VirusTotal site and open the following link: https://www.virustotal.com/gui/user/{username}/apikey, where {username} is the username of your registered account.

Then, write the hash(es) you want to analyze (preferably SHA256) in the file apks_hashes/list_of_selected_sha256, one hash per line.
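Because the file expects one hash per line, it can be worth checking its contents before spending API quota. The sketch below separates well-formed SHA256 values (64 hexadecimal characters) from everything else. The helper name `split_sha256` is hypothetical and the check is not part of the tool; entries in the second list are not necessarily wrong, since the tool only states a preference for SHA256.

```python
import re
from typing import Iterable, List, Tuple

# A SHA256 digest is written as exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")


def split_sha256(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split lines into (sha256_hashes, other_entries), skipping blank lines."""
    sha256_hashes: List[str] = []
    other_entries: List[str] = []
    for raw in lines:
        candidate = raw.strip()
        if not candidate:
            continue  # ignore empty lines
        if SHA256_RE.fullmatch(candidate):
            sha256_hashes.append(candidate.lower())
        else:
            other_entries.append(candidate)
    return sha256_hashes, other_entries
```

A file object can be passed directly, for example `split_sha256(open("apks_hashes/list_of_selected_sha256"))`, since iterating over a file yields its lines.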

At this point, you can launch the tool apkcollector.py with the -k argument to start processing the hashes listed in the apks_hashes/list_of_selected_sha256 file.

`python3 apkcollector.py -k [VirusTotal API key] (-d True)`
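For readers who want to see what a single hash lookup involves, here is a minimal sketch that builds a request for a file report against the public VirusTotal API v3, which authenticates through the `x-apikey` header. This is an assumption about one way to query VirusTotal; the README does not say which API version or client library apkcollector.py uses, and the helper name `build_vt_request` is hypothetical.

```python
import urllib.request

# File report endpoint of the public VirusTotal API v3.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{file_hash}"


def build_vt_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one file report."""
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash=file_hash),
        headers={"x-apikey": api_key, "Accept": "application/json"},
        method="GET",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the body decoded as JSON. In v3 responses the per-engine verdicts are found under `data.attributes.last_analysis_results`. Keep in mind that API keys are subject to request quotas.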

Reviewing the results

You can check the results with a few simple queries to the database.

  1. Connect to the PostgreSQL database: psql database_name user

  2. You can show the tables of the database with: \dt

  3. The results of the multi-labeling are stored in the multi_labeling table (specifically in the malware column). Some example queries:

    SELECT * FROM multi_labeling; (to see all the info)

    SELECT malware FROM multi_labeling WHERE hash_id like 'valid_hash_id'; (to see the malware labels assigned for a particular sample)
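The same lookup can be scripted from Python. The sketch below runs the second query with a bound parameter instead of a pasted-in string, against any DB-API 2.0 connection. The helper name `fetch_malware_labels` is hypothetical, and it assumes `hash_id` is compared as an exact value. The `%s` placeholder style matches psycopg2, a common PostgreSQL driver, though the README does not state which driver the tool uses.

```python
from typing import Any, List


def fetch_malware_labels(conn: Any, hash_id: str, placeholder: str = "%s") -> List[Any]:
    """Return the `malware` values stored for one sample.

    `conn` is any DB-API 2.0 connection. `placeholder` is the driver's
    parameter marker: "%s" for psycopg2, "?" for sqlite3.
    """
    sql = f"SELECT malware FROM multi_labeling WHERE hash_id = {placeholder};"
    cursor = conn.cursor()
    try:
        # The hash is passed as a bound parameter, never interpolated.
        cursor.execute(sql, (hash_id,))
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()
```

With psycopg2 this would be called as `fetch_malware_labels(psycopg2.connect(...), "valid_hash_id")`, using the credentials from /db/database.ini.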
