
Code for the Deep Android Malware Detection paper, with some added files for testing my own APKs


Sp1keeeee/Deep-Android-Malware-Detection

 
 


Deep Android Malware Detection

This repository contains the code for the paper "Deep Android Malware Detection" (pdf download) | (citation)

We use a convolutional neural network (CNN) for Android malware classification. Classification is based on static analysis of the raw opcode sequence from a disassembled Android APK. Features indicative of malware are learned automatically from the raw opcode sequence, removing the need for hand-engineered malware features. The network runs on a GPU, allowing a very large number of files to be scanned quickly.
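To illustrate how a convolutional network can turn opcode sequences of different lengths into a fixed-size feature vector, here is a minimal pure-Python sketch. The embedding table, the layer sizes, and the global max pooling step are assumptions made for illustration; they are not taken from this repository's Torch code, and the weights are random rather than trained.

```python
import random

def embed(opcode_ids, table):
    """Look up one vector per opcode id."""
    return [table[i] for i in opcode_ids]

def conv1d_relu(seq, kernels):
    """Slide each kernel (K x D weights) over the sequence (T x D) and apply ReLU."""
    k = len(kernels[0])
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        row = []
        for kern in kernels:
            s = sum(w * x for wrow, xrow in zip(kern, window) for w, x in zip(wrow, xrow))
            row.append(max(0.0, s))
        out.append(row)
    return out

def global_max_pool(feature_maps):
    """Keep the strongest response of each filter, wherever it occurs in the sequence."""
    return [max(column) for column in zip(*feature_maps)]

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
VOCAB, DIM, KERNEL, FILTERS = 8, 4, 3, 5
table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
kernels = [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(KERNEL)]
           for _ in range(FILTERS)]

def features(opcode_ids):
    return global_max_pool(conv1d_relu(embed(opcode_ids, table), kernels))

short = features([0, 1, 2, 3])
long = features([7, 6, 5, 4, 3, 2, 1, 0, 1, 2])
print(len(short), len(long))  # 5 5
```

Both sequences produce a vector with one entry per filter, which is what lets a classifier layer sit on top regardless of program length.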

If you use this code please cite the following paper:

@inproceedings{mclaughlin2017codaspy,
title = "Deep Android Malware Detection",
author = "Niall McLaughlin and {Martinez del Rincon}, Jesus and BooJoong Kang and Suleiman Yerima and Paul Miller and Sakir Sezer and Yeganeh Safaeisemnani and Erik Trickel and Ziming Zhao and Adam Doupé and {Joon Ahn}, Gail",
year = "2016",
month = "12",
booktitle = "Proceeding of the ACM Conference on Data and Applications Security and Privacy (CODASPY) 2017",
publisher = "Association for Computing Machinery (ACM)",
}

How to run the code

Given an existing dataset directory (see below for details), the run.sh file will do the following:

  1. Partition the dataset into a training set and a held-out test set
  2. Train a neural network on the training set
  3. Test the trained network on the test set
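The partitioning step can be sketched as follows. This is only an illustration of a reproducible train/test split, not the repository's Lua implementation; the `split_dataset` helper, the 20% test fraction, the seed, and the `.opseq` file names are all hypothetical.

```python
import random

def split_dataset(file_names, test_fraction=0.2, seed=0):
    """Shuffle file names reproducibly and split them into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = sorted(file_names)  # sort first so the result ignores input order
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction)) if shuffled else 0
    return shuffled[n_test:], shuffled[:n_test]

files = [f"sample_{i}.opseq" for i in range(10)]  # hypothetical file names
train, test = split_dataset(files)
print(len(train), len(test))  # 8 2
```

Fixing the seed keeps the held-out set the same across runs, so test results from different training runs stay comparable.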

Prerequisites

Dataset structure

The neural network requires opcode sequence files in the correct format, and a dataset directory with sub-directories containing malware and benign opcode sequence files.

An example dataset with the required directory structure is provided in ./dataset. The dataset directory must have the following structure:

  1. There must be a directory called 'Benign', which contains the non-malware opcode sequence files
  2. The other directory can have any name, and contains the malware opcode sequence files
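Before training, it can help to check that a dataset directory follows this layout. The sketch below is a hypothetical helper, not part of the repository; it assumes exactly two sub-directories are expected ('Benign' plus one malware directory), and the `.opseq` file names in the demo are made up.

```python
import os
import tempfile

def describe_dataset(data_dir):
    """Return {directory name: number of files}, or raise ValueError if the layout is wrong."""
    subdirs = sorted(
        name for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )
    if "Benign" not in subdirs:
        raise ValueError("dataset must contain a directory called 'Benign'")
    if len(subdirs) != 2:
        raise ValueError(f"expected 'Benign' plus one malware directory, found {subdirs}")
    return {name: len(os.listdir(os.path.join(data_dir, name))) for name in subdirs}

# Build a throwaway dataset to exercise the check.
with tempfile.TemporaryDirectory() as root:
    layout = {"Benign": ["a.opseq", "b.opseq"], "Malware": ["c.opseq"]}
    for folder, names in layout.items():
        os.makedirs(os.path.join(root, folder))
        for name in names:
            open(os.path.join(root, folder, name), "w").close()
    summary = describe_dataset(root)

print(summary)  # {'Benign': 2, 'Malware': 1}
```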

Opcode Sequence files

Opcode sequence files can be created from Android APK files using the opcode sequence creation tool, which is located in ./opcodeseq_creator. Please see the readme file in that directory for more information.
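For intuition about what an opcode sequence is, the sketch below pulls instruction mnemonics out of smali text, the disassembly format that APKTool's decode step (`apktool d app.apk -o out_dir`) produces. It is a simplified illustration, not the repository's tool: the `extract_opcodes` helper is hypothetical, it skips only directives, labels, comments, payload data, and annotation blocks, and the actual output format of ./opcodeseq_creator is not described here.

```python
def extract_opcodes(smali_text):
    """Collect instruction mnemonics that appear inside .method bodies."""
    opcodes = []
    in_method = False
    in_annotation = False
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".method"):
            in_method = True
        elif line.startswith(".end method"):
            in_method = False
        elif line.startswith(".annotation"):
            in_annotation = True
        elif line.startswith(".end annotation"):
            in_annotation = False
        elif in_method and not in_annotation and line and line[0].isalpha():
            # Directives start with '.', labels with ':', comments with '#',
            # and payload data with a digit or '-', so none of them reach here.
            opcodes.append(line.split()[0])
    return opcodes

example = """
.class public Lcom/example/Demo;
.super Ljava/lang/Object;

.method public static add(II)I
    .registers 3
    add-int v0, p0, p1
    :done
    return v0
.end method
"""

print(extract_opcodes(example))  # ['add-int', 'return']
```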

Setup

The neural network code is implemented using Torch. A GPU is recommended to accelerate training and testing. For details on installing Torch, please see http://torch.ch

The opcode sequence creator tool requires APKTool: https://ibotpeaches.github.io/Apktool/

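Environment

Java 1.8 and Torch

Usage workflow

  1. Use the Python script in the opcodeseq_creator folder to convert the APK files in the target folder into opcode sequence files, which are written to a specified output folder

    Arguments: <folder containing the APKs> <temporary folder: just use ./tmpfile> <output folder>

  2. The first th command in run.sh runs setup mode, which splits the files in the specified folder (the folder of opseq files) into a training set and a test set, and saves the split as metadata

    Remember to change the -dataDir argument to the <output folder> from step 1

  3. The second th command in run.sh reads the metadata produced by setup mode, trains the network, and stores the trained model in the ./trainedNets/ folder (the code is in trainModel.lua)

    Note: opt.saveModel was manually changed to true in the code; otherwise the model is not saved

  4. Use the third command to run testing, which outputs the confusion matrix

  5. Use testwithMyapp.lua to load the saved model and test whether your own app is malware or benign

    Arguments: -dataDir <opseq file of your own app> -modelpath <path to the model>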
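The confusion matrix printed by the testing step can be read as in the sketch below. This is a generic illustration rather than the repository's Lua code; the class names, their order, and the example labels are made up.

```python
def confusion_matrix(true_labels, predicted_labels, classes=("benign", "malware")):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[pred]] += 1
    return matrix

# Made-up labels for five test files.
truth = ["malware", "malware", "benign", "benign", "malware"]
preds = ["malware", "benign", "benign", "malware", "malware"]

m = confusion_matrix(truth, preds)
(tn, fp), (fn, tp) = m

accuracy = (tp + tn) / len(truth)
precision = tp / (tp + fp)   # of the files flagged as malware, how many really are
recall = tp / (tp + fn)      # of the real malware files, how many were flagged

print(m)         # [[1, 1], [1, 2]]
print(accuracy)  # 0.6
```

For malware scanning, recall and precision are usually more informative than accuracy alone, because benign and malicious samples are rarely balanced.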
    

Open issues

  1. In readMalwareData.lua, line 200, currSize was changed to *100
  2. In trainModel.lua, line 75, batchProg appears to be a type value; why does training still run and produce a model?

