This project classifies images from online marketplaces as fraudulent or non-fraudulent by analyzing the text within the images.

StrangePineAplle/CLIP-Marketplace-Fraud-Image-Classifier

CLIP-Marketplace-Fraud-Image-Classifier

Python CLIP CatBoost

OCR base solution

Overview

The Marketplace Fraud Image Classifier detects fraudulent images in online marketplaces by classifying each image as fraudulent or non-fraudulent. The model is trained to identify images that contain deceptive text, such as "деньги за отзывы телеграм @mony_tg" ("money for reviews, Telegram @mony_tg"), which is commonly associated with scams.

Data

The dataset used for training the Marketplace Fraud Image Classifier consists of two main categories: fraudulent images and non-fraudulent (normal) images.

Fraudulent Images

The fraudulent data examples contain images that feature deceptive text, such as scam messages or suspicious offers. These images are crucial for training the model to accurately identify potential fraud attempts. Some examples of fraudulent images are shown below:

Non-Fraudulent (Normal) Images

The non-fraudulent data examples represent legitimate images that do not contain any suspicious content. These images serve as a contrast to the fraudulent data, helping the model distinguish between genuine and fraudulent listings. Examples of non-fraudulent images are provided below:

By training the model on a balanced dataset consisting of both fraudulent and non-fraudulent images, the Marketplace Fraud Image Classifier can learn to effectively distinguish between legitimate and suspicious content, ultimately enhancing the safety and trust in online marketplaces.

Model Architecture

CLIP (Contrastive Language-Image Pretraining)

The project utilizes the CLIP model, developed by OpenAI, which is a multimodal neural network designed to understand and relate images and text. CLIP is trained on a large dataset of image-text pairs, allowing it to learn visual concepts from natural language descriptions. This training enables the model to predict which text snippet corresponds to a given image, effectively bridging the gap between visual and textual information.

How CLIP Works

  • Multi-Modal Learning: CLIP consists of two encoders: one for images and another for text. Both encoders map their respective inputs into a shared embedding space.
  • Contrastive Learning: During training, CLIP learns to maximize the similarity between correct image-text pairs while minimizing the similarity between incorrect pairs. This approach allows the model to generalize well across various tasks without needing task-specific training data.
  • Zero-Shot Capabilities: After training, CLIP can perform zero-shot classification, meaning it can classify images based on textual descriptions without needing additional fine-tuning.
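The shared embedding space described above can be illustrated with a small, dependency-free sketch. It does not load CLIP and is not this project's pipeline (which feeds embeddings to CatBoost). The three-dimensional vectors, the prompt strings, and the `temperature` value are all made-up assumptions chosen only to show how cosine similarity between an image embedding and several text embeddings turns into zero-shot class probabilities.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, temperature=100.0):
    """Return a probability per text prompt for one image embedding.

    `temperature` scales the similarities before the softmax; the value
    used here is an illustrative assumption.
    """
    labels = list(text_embeddings)
    sims = [cosine_similarity(image_embedding, text_embeddings[k]) for k in labels]
    probs = softmax([temperature * s for s in sims])
    return dict(zip(labels, probs))


# Toy embeddings (hypothetical): real CLIP embeddings have hundreds of dimensions.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "an image with a scam message": [1.0, 0.0, 0.1],
    "an ordinary product photo": [0.0, 1.0, 0.3],
}

probs = zero_shot_classify(image_embedding, text_embeddings)
best_label = max(probs, key=probs.get)
print(best_label, probs)
```

With a real model, the image and text vectors would come from CLIP's two encoders; the similarity and softmax steps would stay the same in spirit.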

CatBoost Classifier

In this project, the embeddings generated by the CLIP model serve as input features for the CatBoost classifier. CatBoost is a gradient boosting library built on decision trees. It is best known for its handling of categorical features, but here it operates on the numeric embedding vectors and provides robust performance in classification tasks.

Features

  • Fraud Detection: Classifies images based on the presence of scam-related text and visual features.
  • Text Understanding: Trained to comprehend the meaning of text within images, enabling effective identification of fraudulent content.
  • Advanced Techniques: Combines CLIP for multimodal feature extraction with CatBoost for classification on the resulting embeddings.

Results

The performance of the Marketplace Fraud Image Classifier was evaluated using various metrics, with a primary focus on the F1 score, which balances precision and recall.

After training and validation, the final model achieved an F1 score of 0.92. This score indicates that the model keeps both false positives and false negatives low when distinguishing between fraudulent and non-fraudulent images.
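For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from scratch on a small set of made-up labels. The labels are hypothetical and unrelated to the project's evaluation data.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = fraudulent, 0 = non-fraudulent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Here there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.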

Performance Metrics

  • F1 Score: 0.92
  • Speed: about 1 minute or less per 3,000 images
