Code for the PAN12 Deception Detection: Sexual Predator Identification task
Data can be requested at https://zenodo.org/record/3713280#.Yfl8LOrMJEY
The notebooks contain the following:
- Preprocessing of the XML documents to extract the corpus
- Preprocessing with TF-IDF
- Modelling with the Naive Bayes algorithm
- Modelling with logistic regression
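
As a rough illustration of the first step (extracting the corpus from the XML documents), here is a minimal sketch using only the Python standard library. It is not the notebooks' actual code. The tag names (`conversation`, `message`, `author`, `text`), the sample data, and the helper `extract_corpus` are assumptions made for this example, and grouping messages by author is just one possible way to build the corpus.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample mimicking the assumed XML layout; not real PAN12 data.
SAMPLE_XML = """<conversations>
  <conversation id="c1">
    <message line="1">
      <author>a1</author>
      <time>10:00</time>
      <text>hi there</text>
    </message>
    <message line="2">
      <author>a2</author>
      <time>10:01</time>
      <text>hello</text>
    </message>
  </conversation>
  <conversation id="c2">
    <message line="1">
      <author>a1</author>
      <time>11:00</time>
      <text>how are you</text>
    </message>
  </conversation>
</conversations>"""


def extract_corpus(xml_string):
    """Return a dict mapping each author id to that author's joined messages."""
    root = ET.fromstring(xml_string)
    texts = defaultdict(list)
    for conversation in root.iter("conversation"):
        for message in conversation.iter("message"):
            author = (message.findtext("author") or "").strip()
            text = (message.findtext("text") or "").strip()
            # Skip messages with a missing author or an empty body.
            if author and text:
                texts[author].append(text)
    return {author: " ".join(parts) for author, parts in texts.items()}


corpus = extract_corpus(SAMPLE_XML)
```

The resulting dictionary (one text per author) could then be passed to a TF-IDF vectorizer and a classifier such as Naive Bayes or logistic regression.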