Automatic Detection of Fake News in the Syrian War

This repository contains the code and input needed to reproduce our work.

feature_extraction, and learning_model are the source code directories that contain the code behind the work
main.py is the file needed to reproduce our work.
feature_selection/ is the directory that contains feature selection code. It accounts for both regression and classification, can run various FS methods, can take any training/testing dataset, can scale data if specified

Testing feature extraction:

In order to test the feature extraction approach in our work, you need to first download the 'stanford-corenlp-full-2018-02-27' Stanford dependency parser which should consist of the following:

stanford-corenlp-3.9.1.jar
stanford-corenlp-3.9.1-models.jar

It can be downloaded using the following link: http://nlp.stanford.edu/software/stanford-corenlp-full-2018-02-27.zip

You will also need to download the StanfordNERTagger 'stanford-ner-2018-02-27' which consists of the following:

classifiers/english.all.3class.distsim.crf.ser.gz
stanford-ner.jar

It can be downloaded using the following link: https://nlp.stanford.edu/software/stanford-ner-2018-02-27.zip

Download the above two requirements and save them in this directory. Inside main.py, uncomment test_feature_extraction() from the main function in order to test feature extraction. Feature extraction csv files will be outputted into the output directory

Testing machine learning models:

By running the main.py function, you run the best found machine learning model on the FA-KES train and test datasets You can change the paths in the test_features_model() function in the main.py file in order to test it on other datasets. The other datasets are found in the other_datasets directory (Buzzfeed train and test features extraction are there) comment the test_features_model() in the main function call in the main.py file if you do not want to test the feature-based machine learning model.

Testing baseline text-based deep learning model:

You need to download the Glove file glove.6B.300d.txt and place it in the same directory as the main.py file
It can be downloaded from this link: http://nlp.stanford.edu/data/glove.6B.zip
uncomment the test_text_model() in the main function call in the main.py file if you want to test the text-based deep learning model.
You can change the paths in the test_text_model() function in the main.py file in order to test it on other datasets. The other datasets are found in the other_datasets directory (Buzzfeed train and test features extraction are there)

Meta-Learning vs. Shallow Models

We have applied meta learning to the FAKES dataset. More specifically, we experimented with MAML
We have compared performance to shallow models
All Experiments can be found here

Adavnced ML Evaluations: Meta Learning Vs. Shallow

We have also run advanced machine learning evaluations to assess the quality of the predictions of each of the best meta learners and best performing shallow model:

Results can be found here
Code for Running Advanced Evaluations can be found here

Crowd Sourced Annotations

Can be found here

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Automatic Detection of Fake News in the Syrian War

Testing feature extraction:

Testing machine learning models:

Testing baseline text-based deep learning model:

Meta-Learning vs. Shallow Models

Adavnced ML Evaluations: Meta Learning Vs. Shallow

Crowd Sourced Annotations

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
Experiments		Experiments
exploratory_analysis		exploratory_analysis
feature_extraction		feature_extraction
feature_selection		feature_selection
input		input
learning_model		learning_model
other_datasets		other_datasets
output		output
.DS_Store		.DS_Store
ReadME.md		ReadME.md
crowdsourced-annotations.xlsx		crowdsourced-annotations.xlsx
main.py		main.py
requirements.txt		requirements.txt

hiyamgh/fake_news_detection

Folders and files

Latest commit

History

Repository files navigation

Automatic Detection of Fake News in the Syrian War

Testing feature extraction:

Testing machine learning models:

Testing baseline text-based deep learning model:

Meta-Learning vs. Shallow Models

Adavnced ML Evaluations: Meta Learning Vs. Shallow

Crowd Sourced Annotations

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages