This repository features a script that guesses whether a Reddit submission is a true or false rumour.

torbenal/RumourResolution


Rumour Veracity Resolution for Reddit

This tool guesses whether a rumourous Reddit submission written in Danish is true or false.

It first applies stance classification, then performs rumour veracity classification on the resulting stance labels.
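
The two-stage data flow described above can be sketched as follows. This is only an illustration of the pipeline shape, not the repository's models: the label set, the English keyword rules in `toy_stance`, and the counting rule in `toy_veracity` are all hypothetical stand-ins for the trained classifiers.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical label set; the README does not list the stance labels.
STANCES = ("support", "deny", "query", "comment")


def toy_stance(comment: str) -> str:
    """Stand-in stance classifier: keyword rules instead of a trained model."""
    text = comment.lower()
    if "?" in text:
        return "query"
    if any(word in text for word in ("false", "fake", "wrong")):
        return "deny"
    if any(word in text for word in ("true", "confirmed", "agree")):
        return "support"
    return "comment"


def toy_veracity(stances: List[str]) -> bool:
    """Stand-in veracity classifier: True unless denials outnumber supports."""
    counts = Counter(stances)
    return counts["support"] >= counts["deny"]


def resolve(
    comments: List[str],
    stance_fn: Callable[[str], str] = toy_stance,
    veracity_fn: Callable[[List[str]], bool] = toy_veracity,
) -> bool:
    # Stage 1: one stance label per comment.
    stances = [stance_fn(comment) for comment in comments]
    # Stage 2: veracity is decided from the stance labels only.
    return veracity_fn(stances)
```

The point of the sketch is that the second stage never sees the comment text, only the sequence of stance labels produced by the first stage.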

Prerequisites

Python libraries

The tool requires Python and a number of libraries to be installed:

  • Afinn
  • NumPy
  • scikit-learn
  • hmmlearn
  • nltk
  • psaw
  • praw
  • joblib
  • ...
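
A quick way to see which of the named libraries are still missing is to look up their import names. The sketch below assumes the usual import names (for example, scikit-learn is imported as `sklearn`) and only covers the libraries named in the list above; the list is open-ended, so other packages may also be needed.

```python
from importlib.util import find_spec

# Assumed import names for the libraries named above.
MODULES = ["afinn", "numpy", "sklearn", "hmmlearn", "nltk", "psaw", "praw", "joblib"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```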

Reddit Permissions

For this tool to work, a file 'praw.ini' must be created in this folder.

It should have the format presented below:

[uuuu]
client_id=XXX
client_secret=XXX
user_agent=python:XXX:v1.0 (by /u/<Reddit_user_name>)

Here uuuu is the name passed on the command line when calling the program. The application, client_id and client_secret can be obtained by following these steps. Note that '<Reddit_user_name>' must be replaced with the username of your own Reddit account.
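
Before running the tool, it can help to confirm that praw.ini contains the expected section and keys. The following is a minimal sketch using only the standard library; the helper name `missing_praw_keys` is hypothetical and not part of this repository.

```python
import configparser

# Keys shown in the praw.ini format above.
REQUIRED_KEYS = ("client_id", "client_secret", "user_agent")


def missing_praw_keys(ini_path: str, site: str) -> list:
    """Return the required keys that are absent or empty in section [site]."""
    # Interpolation is disabled so that a '%' in a value is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_path)  # a missing file is silently treated as empty
    if not parser.has_section(site):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(site, key, fallback="").strip()
    ]
```

An empty result for `missing_praw_keys("praw.ini", "uuuu")` means the section is complete. PRAW itself can then load that section by name, for example with `praw.Reddit("uuuu")`.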

Danish word embeddings

Danish word2vec word embeddings must be downloaded and placed in the '/data/word_embeddings/' folder.

They can be obtained here.

Running the tool

To run the tool, run 'py veracity.py -u <uuuu> -s_id <submissionID>'

Here <uuuu> must match the [uuuu] section in the praw.ini file, and <submissionID> is the ID of the Reddit submission you want to analyse.
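
For illustration, the two command-line options could be parsed as shown below. This is a sketch of one possible approach with argparse, not the actual contents of veracity.py; the destination names `site` and `submission_id` are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the -u and -s_id options described above."""
    parser = argparse.ArgumentParser(prog="veracity.py")
    # -u: name of the section in praw.ini holding the Reddit credentials.
    parser.add_argument("-u", dest="site", required=True,
                        help="section name in praw.ini")
    # -s_id: ID of the Reddit submission to analyse.
    parser.add_argument("-s_id", dest="submission_id", required=True,
                        help="Reddit submission ID")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"site={args.site} submission={args.submission_id}")
```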

Credits

  • DSL: The word embeddings have been trained on both sentence data from DSL and Reddit data from the Danish stance dataset.
  • Afinn: The AFINN sentiment scoring is provided by the afinn sentiment library, which has been linked above. Further credits: Finn Årup Nielsen, "A new ANEW: evaluation of a word list for sentiment analysis in microblogs", Proceedings of the ESWC2011 Workshop on 'Making Sense of Microposts': Big things come in small packages. Volume 718 in CEUR Workshop Proceedings: 93-98. 2011 May. Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie, Mariann Hardey (editors)
  • Polyglot for POS tagging. See: "Al-Rfou, Rami and Perozzi, Bryan and Skiena, Steven, (2013), Polyglot: Distributed Word Representations for Multilingual NLP"
