Is it possible to recreate and classify topics assigned to news articles by using a topic modelling algorithm on their respective content?

notsaman/newspaper_topic_modelling

Newspaper Topic Modelling

  • Scraped around 280,000 articles from Spiegel Online, distributed across 11 topics

Research Questions:

Is it possible to recreate and classify topics assigned to news articles by using a topic modelling algorithm on their respective content?

Subquestions

  • Which topic has the most topic markers output by the algorithm?

  • Which topic marker is assigned most often?

  • Are articles on certain topics classified more accurately than others?
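
One way the first two subquestions could be answered is by tallying which topic marker dominates each article. The sketch below is only an illustration: it assumes a "topic marker" means the LDA topic with the highest weight for an article, and the section names, topic ids, and the `summarize_markers` helper are hypothetical, not taken from this repository.

```python
from collections import Counter, defaultdict

# Hypothetical (editorial section, dominant LDA topic id) pairs, one per article.
pairs = [
    ("politics", 0), ("politics", 3), ("politics", 0),
    ("sport", 1), ("sport", 1), ("sport", 1),
    ("culture", 3),
]

def summarize_markers(pairs):
    """Return (section with the most distinct markers, most frequent marker)."""
    markers_per_section = defaultdict(set)
    marker_counts = Counter()
    for section, topic in pairs:
        markers_per_section[section].add(topic)
        marker_counts[topic] += 1
    # Section whose articles spread over the largest number of distinct markers.
    section_most = max(markers_per_section,
                       key=lambda s: len(markers_per_section[s]))
    # Marker that is the dominant topic for the largest number of articles.
    marker_most = marker_counts.most_common(1)[0][0]
    return section_most, marker_most

section_most, marker_most = summarize_markers(pairs)
print(section_most, marker_most)
```

With real data, the pairs would come from the editorial section of each scraped article and the arg-max of its LDA topic distribution.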

Results

  • The classifier is a linear SVM, with topics generated through LDA.
  • Average accuracy is 79% on both the test set and the entire dataset, with a recall of 78% and per-class precision ranging from 68% to 93% across the (unevenly distributed) classes.
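
Because the classes are unevenly distributed, overall accuracy and per-class precision or recall can diverge noticeably. The following minimal sketch shows how these metrics relate on a toy example. The labels and the `per_class_report` helper are hypothetical and are not the evaluation code used for the numbers above.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return overall accuracy plus per-class precision and recall dicts."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    actual = Counter(y_true)      # how often each class really occurs
    predicted = Counter(y_pred)   # how often each class was predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(correct.values()) / len(y_true)
    # A class that was never predicted (or never occurs) gets 0.0 by convention.
    precision = {c: correct[c] / predicted[c] if predicted[c] else 0.0
                 for c in labels}
    recall = {c: correct[c] / actual[c] if actual[c] else 0.0
              for c in labels}
    return accuracy, precision, recall

# Hypothetical, deliberately imbalanced toy labels (not real Spiegel data).
y_true = ["politics"] * 6 + ["sport"] * 3 + ["culture"]
y_pred = (["politics"] * 5 + ["sport"]      # 5 of 6 politics articles correct
          + ["sport"] * 2 + ["politics"]    # 2 of 3 sport articles correct
          + ["politics"])                   # the single culture article missed

accuracy, precision, recall = per_class_report(y_true, y_pred)
macro_recall = sum(recall.values()) / len(recall)
print(f"accuracy={accuracy:.2f} macro_recall={macro_recall:.2f}")
for label in precision:
    print(f"{label}: precision={precision[label]:.2f} recall={recall[label]:.2f}")
```

In this toy case the accuracy is 0.70 while the unweighted mean recall is only 0.50, because the smallest class is never predicted correctly. That is why per-class figures are worth reporting alongside the average.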
