☠️ A CNN model is used to classify grayscale images as either ransomware or normal files.

gunh0/malware-image-classification

Ransomware Detection using Convolutional Neural Networks

License: MIT

Dataset

The training dataset has two categories: "normal" and "ransomware". The original dataset directory must contain one subdirectory per category. The script creates the train, validation, and test directories automatically and copies the images from the original dataset into them using a 60/20/20 split.
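The split described above can be sketched with the standard library alone. This is a minimal illustration, not the repository's script: the function name `split_dataset`, the output folder names, the fixed shuffle seed, and the rounding rule (the test set takes whatever remains) are all assumptions.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, dst_dir, ratios=(0.6, 0.2, 0.2), seed=0):
    """Copy files from src_dir/<category> into dst_dir/<split>/<category>.

    Hypothetical helper: names, seed, and rounding are illustrative choices.
    Returns a dict mapping (split, category) to the number of files copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    counts = {}
    for category in ("normal", "ransomware"):
        files = sorted(p for p in (src / category).iterdir() if p.is_file())
        # Shuffle deterministically so repeated runs give the same split.
        random.Random(seed).shuffle(files)
        n_train = int(len(files) * ratios[0])
        n_val = int(len(files) * ratios[1])
        parts = {
            "train": files[:n_train],
            "validation": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # remainder goes to the test set
        }
        for split, members in parts.items():
            target = dst / split / category
            target.mkdir(parents=True, exist_ok=True)
            for f in members:
                shutil.copy2(f, target / f.name)
            counts[(split, category)] = len(members)
    return counts
```

Calling `split_dataset("original_dataset", "dataset")` would produce `dataset/train/normal`, `dataset/train/ransomware`, and so on; both directory names here are placeholders.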

Model Architecture

The CNN model consists of four convolutional layers, each followed by a max-pooling layer, and then fully connected layers. The layers, in order, are:

Input layer: accepts grayscale images of size 150x150.
Convolutional layer with 32 filters and a 3x3 kernel, followed by ReLU activation.
Max-pooling layer with a 2x2 pool size.
Convolutional layer with 64 filters and a 3x3 kernel, followed by ReLU activation.
Max-pooling layer with a 2x2 pool size.
Convolutional layer with 128 filters and a 3x3 kernel, followed by ReLU activation.
Max-pooling layer with a 2x2 pool size.
Convolutional layer with 128 filters and a 3x3 kernel, followed by ReLU activation.
Max-pooling layer with a 2x2 pool size.
Flatten layer to convert the 3D output to 1D.
Dense layer with 512 units and ReLU activation.
Dropout layer with a rate of 0.5 for regularization.
Dense layer with 256 units and ReLU activation.
Dropout layer with a rate of 0.5 for regularization.
Output layer with a sigmoid activation function, predicting the probability that the input image is ransomware.
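To see what the Flatten layer actually receives, the shape and parameter arithmetic for the layer stack above can be worked through in a few lines. This sketch assumes unpadded ("valid") convolutions with stride 1 and non-overlapping 2x2 pooling, which are the Keras defaults but are not stated in this README; the function name `summarize` is hypothetical.

```python
def conv_out(size, kernel=3):
    # Unpadded convolution with stride 1 (assumed) trims kernel - 1 pixels.
    return size - kernel + 1


def pool_out(size, pool=2):
    # Non-overlapping max pooling (assumed) floors the division.
    return size // pool


def summarize(input_size=150, channels=1,
              conv_filters=(32, 64, 128, 128),
              dense_units=(512, 256, 1)):
    """Return (final spatial size, flattened length, trainable parameters)."""
    size, params = input_size, 0
    for filters in conv_filters:
        # Each filter has 3*3*channels weights plus one bias.
        params += (3 * 3 * channels + 1) * filters
        size = pool_out(conv_out(size))
        channels = filters
    flat = size * size * channels
    features = flat
    for units in dense_units:
        # Dense layer: one weight per input feature plus one bias per unit.
        params += (features + 1) * units
        features = units
    # Dropout and pooling layers add no trainable parameters.
    return size, flat, params


print(summarize())  # (7, 6272, 3583617)
```

Under these assumptions the spatial size shrinks 150, 74, 36, 17, 7, so the Flatten layer outputs 7 x 7 x 128 = 6272 values, and the first dense layer holds most of the model's roughly 3.58 million parameters.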

Model Training

The model is compiled with the binary cross-entropy loss function and the RMSprop optimizer. The training data is augmented using Keras's ImageDataGenerator, which rescales the pixel values and applies random transformations such as shifting and flipping. The validation data is rescaled without augmentation. The model is trained for 30 epochs with a batch size of 20.
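For reference, the binary cross-entropy loss mentioned above has the standard form below. The labeling convention (y_i = 1 for ransomware, y_i = 0 for normal) is an assumption, chosen to match the output layer's description as the probability of ransomware.

```latex
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N}
  \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]
```

Here N is the batch size and \hat{y}_i is the sigmoid output for image i. As a worked example, a prediction of 0.9 costs about 0.105 when the image is ransomware (y = 1) and about 2.303 when it is normal (y = 0), so confident wrong answers are penalized heavily.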

Evaluation and Prediction

After training, the model is evaluated on the test set: the test images are loaded, preprocessed, and fed into the model to compute loss and accuracy. The script can also make predictions on new images. It loads the model and iterates through the images in the test directory, preprocessing each one and predicting its category (normal or ransomware). It prints the predicted and actual category for each image, along with the accuracy for each category and the total accuracy.
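The per-category and total accuracy report can be illustrated without the model itself. The sketch below takes the actual labels and the sigmoid outputs as plain lists; the 0.5 decision threshold and the function names are assumptions, not details taken from the script.

```python
from collections import Counter


def label_from_probability(p, threshold=0.5):
    # The sigmoid output is read as P(ransomware); the 0.5 cutoff is assumed.
    return "ransomware" if p >= threshold else "normal"


def accuracy_report(actual, probabilities, threshold=0.5):
    """Return accuracy per category plus a 'total' entry.

    Expects at least one sample; an empty input would divide by zero.
    """
    total, correct = Counter(), Counter()
    for label, p in zip(actual, probabilities):
        predicted = label_from_probability(p, threshold)
        total[label] += 1
        correct[label] += predicted == label  # True counts as 1
    report = {category: correct[category] / total[category] for category in total}
    report["total"] = sum(correct.values()) / sum(total.values())
    return report


actual = ["normal", "normal", "ransomware", "ransomware"]
probabilities = [0.1, 0.7, 0.9, 0.8]
print(accuracy_report(actual, probabilities))
# {'normal': 0.5, 'ransomware': 1.0, 'total': 0.75}
```

Reporting the two categories separately matters when the classes are imbalanced, since a high total accuracy can hide poor recall on the smaller class.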

Please note that this is a summary of the script's functionality, and the actual code contains more implementation details.


IMG Converter - mini project [PyQt]

A small project that displays files as grayscale images.

It is implemented with PyQt.

Given a directory, it converts every file in that directory into an image.
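The core of such a conversion is treating each byte of a file as one 8-bit pixel. The following sketch shows that idea with the standard library only; it is not the PyQt tool's code. The fixed row width of 256, the zero padding of the last row, the PGM output format, and all function names are assumptions made for illustration.

```python
from pathlib import Path


def bytes_to_rows(data, width=256):
    """Split raw bytes into fixed-width rows, one byte per grayscale pixel."""
    # Pad the final row with zero (black) pixels; the padding rule is assumed.
    padded = data + bytes(-len(data) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def write_pgm(rows, out_path):
    """Write rows as a binary PGM (P5) image, a simple grayscale format."""
    header = f"P5\n{len(rows[0])} {len(rows)}\n255\n".encode("ascii")
    Path(out_path).write_bytes(header + b"".join(rows))


def convert_directory(src_dir, dst_dir, width=256):
    """Convert every non-empty file in src_dir into a .pgm image in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(src_dir).iterdir()):
        if path.is_file() and path.stat().st_size > 0:
            out = dst / (path.name + ".pgm")
            write_pgm(bytes_to_rows(path.read_bytes(), width), out)
            written.append(out)
    return written
```

The resulting images vary in height with file size, so they would still need resizing to 150x150 before being fed to the classifier described above.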


Reference

(2011) Malware images: visualization and automatic classification

https://dl.acm.org/doi/10.1145/2016904.2016908

We propose a simple yet effective method for visualizing and classifying malware using image processing techniques.

Malware binaries are visualized as gray-scale images, with the observation that for many malware families, the images belonging to the same family appear very similar in layout and texture.

Motivated by this visual similarity, a classification method using standard image features is proposed.

Neither disassembly nor code execution is required for classification.

Preliminary experimental results are quite promising with 98% classification accuracy on a malware database of 9,458 samples with 25 different malware families.

Our technique also exhibits interesting resilience to popular obfuscation techniques such as section encryption.
