Skip to content

Latest commit

 

History

History
98 lines (63 loc) · 5.2 KB

README.md

File metadata and controls

98 lines (63 loc) · 5.2 KB

Deepfake Detection

This repository contains two different approaches for detecting deepfake images: a deep learning model and a k-means clustering-based method. Both approaches leverage pre-trained models and feature extraction techniques to distinguish between real and fake images.

Deep Learning Model

The deep learning model for deepfake detection is implemented in the deepfake_detector.py file. This approach follows the ideas presented in the following research papers:

  1. "Multi-attentional Deepfake Detection" by Hanqing Zhao, Xiaolei Zhu, Duc Liao, Iuri Petrov, Lyudmila Shagaeva, and Kevin Park.

    • This paper proposes a multi-attentional framework for deepfake detection, combining features from various attention modules and leveraging the power of attention mechanisms to capture relevant spatial and temporal information.
  2. "DeepFake Detection by Analyzing Convolutional Traces" by Luca Guarnera, Oliver Giudice, and Sebastiano Battiato.

    • This work focuses on analyzing the convolutional traces (activations) of pre-trained models to detect artifacts and inconsistencies introduced by deepfake generation methods, enabling effective detection without the need for manual feature engineering.

Implementation Details

The deepfake_detector.py file contains the following key components:

  • VGG16 Pre-trained Model: The VGG16 model, pre-trained on the ImageNet dataset, is used as the feature extractor.
  • Custom Layer for Stacking: A custom TensorFlow layer (StackLayer) is defined to stack the feature maps from the VGG16 model.
  • Deepfake Detection Model: The deepfake detection model is built by utilizing the VGG16 model as the feature extractor, followed by a flattening layer, a dense layer, and a final dense layer with a sigmoid activation for binary classification.
  • Detection Function: The detect_deepfake function takes an image path, loads the image, preprocesses it, and uses the deepfake detection model to predict whether the image is real or fake.
  • Evaluation: The script iterates over a dataset of real and fake images and counts the number of correctly detected real and fake images.

K-means Clustering Approach

The k-means clustering-based approach for deepfake detection is implemented in the kmean.py file. This method is inspired by the following research papers:

  1. "DeepfakeDetector: A Clustering-based Approach for Deepfake Detection" by Md. Jabed Nesar, Shengrui Zhang, and Md. Mehedi Hassan Bhuiyan (2022).

    • This paper proposes a clustering-based approach using k-means clustering for detecting deepfake videos, leveraging pre-trained models for feature extraction.
  2. "Clustering-based Deepfake Detection" by Jingwei Ran, Yu Liu, and Yi Huang (2021).

    • This work introduces a clustering-based method for deepfake detection using k-means clustering on features extracted from pre-trained models.
  3. "Deepfake Detection Using K-Means Clustering and Ensemble Learning" by Shahriar Sazzad, Yaqoub Nazir, and Md. Jabed Nesar (2021).

    • This paper proposes a two-stage approach: k-means clustering followed by an ensemble learning classifier for deepfake detection.

Implementation Details

The kmean.py file contains the following key components:

  • Pre-trained Models: Multiple pre-trained models (VGG16, ResNet50, and InceptionV3) are loaded and used for feature extraction.
  • Feature Extraction: Features are extracted from real and fake images using the pre-trained models, and the features from different models are concatenated to create a combined feature representation.
  • Dimensionality Reduction: Principal Component Analysis (PCA) is applied to reduce the dimensionality of the combined features, potentially enhancing the clustering performance.
  • Clustering Algorithms: Two clustering algorithms are employed: k-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
  • Silhouette Coefficient: The Silhouette Coefficient is calculated for both k-means and DBSCAN clustering results to evaluate the quality of the clustering.
  • Visualization: The clustering results are visualized using matplotlib, allowing for a qualitative assessment of the separation between real and fake image clusters.

Usage

To run the deep learning model for deepfake detection, execute the deepfake_detector.py script:

python deepfake_detector.py

To run the k-means clustering-based approach, execute the kmean.py script:

python kmean.py

Note that both scripts assume the presence of a Dataset directory containing subdirectories Test/Real and Test/Fake with the corresponding real and fake images.

Dependencies

The following dependencies are required to run the code:

  • Python 3.x
  • TensorFlow
  • OpenCV
  • NumPy
  • Scikit-learn
  • Matplotlib

You can install the required dependencies using pip:

pip install tensorflow opencv-python numpy scikit-learn matplotlib

Contributing

Contributions to this repository are welcome. If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request.

License

This project is licensed under the BSD-3 CLAUSE.

Team

  • Aaradhy Sharma
  • Unnat Sharma
  • Dhruv Mishra
  • Suhani Singh
  • Nihar Puthin