This repository contains some basic text similarity algorithms.

michellemashutian/NJUST-at-SMP

NJUST-at-SMP

Text Provenance Competition: Code Repository

This repository contains the code used for the Text Provenance competition, which can be found here. In this project, we implemented various text similarity algorithms, including:

  • LDA (Latent Dirichlet Allocation)
  • Doc2Vec
  • Word2Vec
  • Jaccard Distance
  • Edit Distance
  • TF-IDF (Term Frequency-Inverse Document Frequency)

These methods were used to calculate text similarity as part of the competition’s task.
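
As a rough illustration of three of the measures listed above, here is a minimal standard-library sketch of Jaccard similarity, edit (Levenshtein) distance, and TF-IDF cosine similarity. It is not the code from this repository: the function names, the use of pre-tokenized input, and the smoothed IDF formula are all assumptions made for the example, and the competition code may preprocess and weight text differently.

```python
import math
from collections import Counter


def jaccard_similarity(a_tokens, b_tokens):
    """Jaccard similarity of two token collections: |A & B| / |A | B|."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0  # convention assumed here: two empty texts count as identical
    return len(a & b) / len(a | b)


def edit_distance(s, t):
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def tfidf_vectors(docs):
    """Map each tokenized document to a {term: tf * idf} dict.

    Assumed weighting: length-normalized term frequency and a smoothed
    idf of log((1 + N) / (1 + df)) + 1. Other variants are equally valid.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1
        vectors.append({term: (c / total) * idf[term] for term, c in tf.items()})
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Jaccard similarity and TF-IDF cosine both work on token lists, so the caller decides how to tokenize (whitespace splitting, a word segmenter, character n-grams). Edit distance works directly on strings. Jaccard distance, as named in the list, is simply `1 - jaccard_similarity(a, b)`.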

More detailed information can be found here: http://www.cips-smp.org/smp_data/4

Feel free to explore the code and adapt it for your own text analysis projects!
