- Create a new conda environment with all required dependencies: `conda env create -f environment.yml`
- Activate the environment: `conda activate mami-san`
- Download the dataset here and put all files in the `./MAMI DATASET/` folder
- Download the ResNet weights here and put them in the main directory
- Download the BERT weights here and put them in the main directory
- To run everything with one command, run `python main.py`. Alternatively, go through it step by step as described below.
The model used for image classification is the pretrained Wide ResNet-50-2 from “Wide Residual Networks”.
ResNet weights can be found here
It achieves an accuracy of 65.8% on the test set and 75.9% on the training set after just 3 epochs.
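As a rough sketch, the image branch could be set up along these lines (torchvision is assumed; the two-class output head and the `resnet_weights.pth` file name are illustrative placeholders, not necessarily what the scripts use):

```python
import torch
import torchvision

# Pretrained Wide ResNet-50-2 backbone (ImageNet weights)
model = torchvision.models.wide_resnet50_2(pretrained=True)

# Replace the final fully connected layer with a two-class head
# (assumed binary misogynous / not-misogynous label)
model.fc = torch.nn.Linear(model.fc.in_features, 2)

# Load the downloaded fine-tuned weights (placeholder file name)
model.load_state_dict(torch.load("resnet_weights.pth", map_location="cpu"))
model.eval()
```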
The model used for text classification is the pretrained cased BERT (Bidirectional Encoder Representations from Transformers), which achieves 53.8% accuracy on the test set after 3 epochs.
BERT weights can be found here
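A minimal sketch of the text branch, assuming the Hugging Face `transformers` API; the `bert-base-cased` checkpoint and the `bert_weights.pth` file name are assumptions, not taken from the scripts:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

# Load the downloaded fine-tuned weights (placeholder file name)
model.load_state_dict(torch.load("bert_weights.pth", map_location="cpu"))
model.eval()

# Classify a single meme caption
inputs = tokenizer("example meme text", return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    prediction = model(**inputs).logits.argmax(dim=-1).item()
```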
The combined accuracy of the two models on the whole test set is 62.8%, and the macro-averaged F1 score is 58.2.
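For reference, these metrics can be recomputed from model predictions with scikit-learn; the label lists below are placeholders for the real test labels and combined predictions:

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder labels: substitute the real test labels and combined predictions
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```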
To preprocess the dataset, run `python read_dataset.py` once.
The command unzips the data into the `data` folder in the working directory (creating it if necessary), preprocesses the files, and builds the dataloaders for training and testing.
Two files named `labels.csv` are created, one in the train folder and one in the test folder. A sketch of how such a file could feed a dataloader follows below.
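As an illustration of how such a `labels.csv` could be turned into a PyTorch dataloader (the column names `file_name` and `misogynous` and the `data/train` path are assumptions, not necessarily what `read_dataset.py` produces):

```python
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class MemeImageDataset(Dataset):
    """Wraps a labels.csv file and the corresponding image folder."""

    def __init__(self, csv_path, image_dir):
        self.labels = pd.read_csv(csv_path)
        self.image_dir = image_dir
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        row = self.labels.iloc[idx]
        # "file_name" and "misogynous" are assumed column names
        image = Image.open(f"{self.image_dir}/{row['file_name']}").convert("RGB")
        return self.transform(image), int(row["misogynous"])

train_loader = DataLoader(MemeImageDataset("data/train/labels.csv", "data/train"),
                          batch_size=32, shuffle=True)
```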
To run the training script for the image classifier, execute `python resnet.py`.
To run the training script for the text classifier, execute `python bert.py`.
To load the image classifier weights and run the evaluation on 4 random images from the test set, run `python vizualize.py`.
To load both classifiers and compute their accuracy, run `python combined.py`.
Data preprocessing, exploration and visualization, as well as all the code from the scripts with their outputs, can be found in the Jupyter notebook `bert_resnet.ipynb`.