Bachelor's thesis on removing hate from online comments through paraphrasing, using the DPhate algorithm.
To recreate the data generated in the research paper (also available here), where the inputs are hateful sentences from the HateXplain dataset, run:
python3 DPhate.py
To test the algorithm on your own examples, use the following Python code:
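from DPhate import DPhate

dphate = DPhate()
phrase = "I fucking love your mother."
# Score the phrase's toxicity with dphate.modelD
toxicity = dphate.modelD.predict(phrase)['toxicity']
# Map the score to a category: 0.125-wide buckets starting at 0.5
toxCategory = int((toxicity-0.5)//0.125)
# Run DPhate on the phrase for that toxicity category
dphate.predict(phrase,toxCategory)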
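The toxCategory expression above may be easier to follow with a few concrete scores. The sketch below only reproduces that arithmetic in a standalone helper; the name toxicity_to_category is hypothetical and is not part of DPhate, and it does not load any model.

```python
def toxicity_to_category(toxicity: float) -> int:
    """Hypothetical helper mirroring the README formula.

    Scores are split into 0.125-wide buckets starting at 0.5,
    so 0.5 falls in bucket 0 and 0.625 in bucket 1.
    """
    return int((toxicity - 0.5) // 0.125)


if __name__ == "__main__":
    # Scores chosen to sit on bucket edges, plus one score below 0.5.
    for score in (0.5, 0.625, 0.75, 0.875, 1.0, 0.25):
        print(score, toxicity_to_category(score))
```

Note that a score below 0.5 yields a negative category, and a score of exactly 1.0 yields 4. Which category values dphate.predict accepts is not stated here, so it may be worth checking the value before passing it on.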
The fine-tuned T5 models are too big for GitHub and can be downloaded here. The download is a 2.3 GB zip file containing three different T5 models.