message-translation

An assistive writing tool to analyze linguistic and cultural variation across communities

Environment Setup

Please run the following:

conda create -n message python=3.8
conda activate message
pip install -r requirements.txt

Dataset

Follow the instructions from the public BLM Twitter dataset to download the tweets, using our filtered tweet IDs to generate a smaller dataset of ~200K pro-BLM tweets and ~100K anti-BLM tweets. The preprocessing code and data are here. Then move the dataset to ./data/blm_alm/raw/ so that it contains the following two files: pro_blm_200k.txt and anti_blm_100k.txt.
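Before running the analysis scripts, it can help to confirm that both files are in place. The snippet below is a minimal sketch, not part of the repository: the helper name check_raw_files is hypothetical, and it assumes each file stores one tweet per line, which this README does not state.

```python
from pathlib import Path

# Directory and file names come from the README; the one-tweet-per-line
# layout is an assumption made for this sketch.
RAW_DIR = Path("./data/blm_alm/raw")
EXPECTED_FILES = ("pro_blm_200k.txt", "anti_blm_100k.txt")


def check_raw_files(raw_dir=RAW_DIR, names=EXPECTED_FILES):
    """Return {file name: non-empty line count}; raise if a file is missing."""
    counts = {}
    for name in names:
        path = Path(raw_dir) / name
        if not path.is_file():
            raise FileNotFoundError(f"expected dataset file not found: {path}")
        with path.open(encoding="utf-8") as handle:
            # Count only lines that contain something besides whitespace.
            counts[name] = sum(1 for line in handle if line.strip())
    return counts
```

Calling check_raw_files() from the repository root returns the line count per file, which should be roughly 200K and 100K if the one-tweet-per-line assumption holds.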

Semantic Shift Analysis

cd semantic_shift
# download BERTweet to your local machine
python download_bertweet.py
sh ./bash_scripts/compute_semantic_shifts.sh

Check the notebook to see the analysis.
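This README does not describe what compute_semantic_shifts.sh computes internally. As a generic illustration only, one common way to quantify semantic shift is the cosine distance between a word's averaged contextual embeddings in two corpora. The sketch below assumes the embeddings have already been extracted as plain lists of floats; the function names are hypothetical and are not taken from the repository.

```python
import math


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity: 0 for the same direction, larger for more shift."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def shift_score(vectors_a, vectors_b):
    """Distance between a word's average embedding in corpus A and corpus B."""
    return cosine_distance(mean_vector(vectors_a), mean_vector(vectors_b))
```

Here vectors_a would hold the embeddings of one word's occurrences in the pro-BLM tweets and vectors_b those from the anti-BLM tweets; a score near 0 suggests similar usage in both communities.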

Cultural and Ideological Analysis

cd ideology-alignment
sh train_script.sh

Check the notebook to see the analysis.

Acknowledgement

This repository builds on UiO-UvA at SemEval-2020 Task 1 and Aligning Multidimensional Worldviews and Discovering Ideological Differences.
