This repository contains two different approaches for detecting deepfake images: a deep learning model and a k-means clustering-based method. Both approaches leverage pre-trained models and feature extraction techniques to distinguish between real and fake images.
The deep learning model for deepfake detection is implemented in the deepfake_detector.py
file. This approach follows the ideas presented in the following research papers:
-
"Multi-attentional Deepfake Detection" by Hanqing Zhao, Xiaolei Zhu, Duc Liao, Iuri Petrov, Lyudmila Shagaeva, and Kevin Park.
- This paper proposes a multi-attentional framework for deepfake detection, combining features from various attention modules and leveraging the power of attention mechanisms to capture relevant spatial and temporal information.
-
"DeepFake Detection by Analyzing Convolutional Traces" by Luca Guarnera, Oliver Giudice, and Sebastiano Battiato.
- This work focuses on analyzing the convolutional traces (activations) of pre-trained models to detect artifacts and inconsistencies introduced by deepfake generation methods, enabling effective detection without the need for manual feature engineering.
The deepfake_detector.py
file contains the following key components:
- VGG16 Pre-trained Model: The VGG16 model, pre-trained on the ImageNet dataset, is used as the feature extractor.
- Custom Layer for Stacking: A custom TensorFlow layer (
StackLayer
) is defined to stack the feature maps from the VGG16 model. - Deepfake Detection Model: The deepfake detection model is built by utilizing the VGG16 model as the feature extractor, followed by a flattening layer, a dense layer, and a final dense layer with a sigmoid activation for binary classification.
- Detection Function: The
detect_deepfake
function takes an image path, loads the image, preprocesses it, and uses the deepfake detection model to predict whether the image is real or fake. - Evaluation: The script iterates over a dataset of real and fake images and counts the number of correctly detected real and fake images.
The k-means clustering-based approach for deepfake detection is implemented in the kmean.py
file. This method is inspired by the following research papers:
-
"DeepfakeDetector: A Clustering-based Approach for Deepfake Detection" by Md. Jabed Nesar, Shengrui Zhang, and Md. Mehedi Hassan Bhuiyan (2022).
- This paper proposes a clustering-based approach using k-means clustering for detecting deepfake videos, leveraging pre-trained models for feature extraction.
-
"Clustering-based Deepfake Detection" by Jingwei Ran, Yu Liu, and Yi Huang (2021).
- This work introduces a clustering-based method for deepfake detection using k-means clustering on features extracted from pre-trained models.
-
"Deepfake Detection Using K-Means Clustering and Ensemble Learning" by Shahriar Sazzad, Yaqoub Nazir, and Md. Jabed Nesar (2021).
- This paper proposes a two-stage approach: k-means clustering followed by an ensemble learning classifier for deepfake detection.
The kmean.py
file contains the following key components:
- Pre-trained Models: Multiple pre-trained models (VGG16, ResNet50, and InceptionV3) are loaded and used for feature extraction.
- Feature Extraction: Features are extracted from real and fake images using the pre-trained models, and the features from different models are concatenated to create a combined feature representation.
- Dimensionality Reduction: Principal Component Analysis (PCA) is applied to reduce the dimensionality of the combined features, potentially enhancing the clustering performance.
- Clustering Algorithms: Two clustering algorithms are employed: k-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
- Silhouette Coefficient: The Silhouette Coefficient is calculated for both k-means and DBSCAN clustering results to evaluate the quality of the clustering.
- Visualization: The clustering results are visualized using matplotlib, allowing for a qualitative assessment of the separation between real and fake image clusters.
To run the deep learning model for deepfake detection, execute the deepfake_detector.py
script:
python deepfake_detector.py
To run the k-means clustering-based approach, execute the kmean.py
script:
python kmean.py
Note that both scripts assume the presence of a Dataset
directory containing subdirectories Test/Real
and Test/Fake
with the corresponding real and fake images.
The following dependencies are required to run the code:
- Python 3.x
- TensorFlow
- OpenCV
- NumPy
- Scikit-learn
- Matplotlib
You can install the required dependencies using pip:
pip install tensorflow opencv-python numpy scikit-learn matplotlib
Contributions to this repository are welcome. If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the BSD-3 CLAUSE.
- Aaradhy Sharma
- Unnat Sharma
- Dhruv Mishra
- Suhani Singh
- Nihar Puthin