NJUST-at-SMP

Text Provenance Competition: Code Repository

This repository contains the code used for the Text Provenance competition. In this project, we implemented several text similarity algorithms, including:

LDA (Latent Dirichlet Allocation)
Doc2Vec
Word2Vec
Jaccard Distance
Edit Distance
TF-IDF (Term Frequency-Inverse Document Frequency)

These methods were used to calculate text similarity as part of the competition’s task.
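As a rough illustration of three of the measures listed above (Jaccard distance, edit distance, and TF-IDF with cosine similarity), here is a minimal standard-library sketch. It is not the code from this repository: the function names, the pre-tokenized list input, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def jaccard_distance(a_tokens, b_tokens):
    """Jaccard distance: 1 - |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 0.0  # convention chosen here: two empty texts are identical
    return 1.0 - len(a & b) / len(a | b)


def edit_distance(s, t):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) for a list of tokenized documents.

    Assumption: relative term frequency times a smoothed IDF,
    log((1 + N) / (1 + df)) + 1. Other weightings are equally valid.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy inputs, tokenized by whitespace for simplicity.
    doc_a = "the cat sat on the mat".split()
    doc_b = "the cat lay on the rug".split()
    print(jaccard_distance(doc_a, doc_b))
    print(edit_distance("kitten", "sitting"))  # 3
    vec_a, vec_b = tfidf_vectors([doc_a, doc_b])
    print(cosine_similarity(vec_a, vec_b))
```

The functions take token lists rather than raw strings so that the tokenizer (whitespace, characters, or a word segmenter) can be swapped without touching the similarity code.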

More detailed information can be found here: http://www.cips-smp.org/smp_data/4

Feel free to explore the code and adapt it for your own text analysis projects!
