COVIPEDIA

A Recommendation System for Navigating COVID-19 Research Articles

NLP Unsupervised ML project

Goal

The goal of this project is to build a recommendation system that helps scientists and researchers navigate the current surge of COVID-19 papers, find what is relevant to their work, and uncover hidden semantic relationships. From the COVID-19 Open Research Dataset, I used the abstracts of articles published from January 2020 to May 2021 (about 260,000 articles) as the text corpus. With an LDA model, I assigned each document its dominant topic and its relevance to that topic, then grouped the articles by topic for the recommendation system, so researchers can look up articles by the topic related to their work. Lastly, I deployed a Streamlit app on Heroku, using a smaller dataset, that recommends the top 20 related articles for a selected topic.
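The grouping and top-N lookup described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: it assumes each document already has a list of `(topic_id, weight)` pairs, the shape an LDA model typically produces per document. The function names (`dominant_topic`, `build_topic_index`, `recommend`) and the toy titles and weights are hypothetical.

```python
from collections import defaultdict


def dominant_topic(topic_weights):
    """Return the (topic_id, weight) pair with the highest weight."""
    return max(topic_weights, key=lambda pair: pair[1])


def build_topic_index(doc_topics):
    """Group documents by dominant topic, most relevant first.

    doc_topics maps a document title to its list of (topic_id, weight)
    pairs (hypothetical input; assumed to come from a topic model).
    """
    index = defaultdict(list)
    for title, weights in doc_topics.items():
        topic_id, weight = dominant_topic(weights)
        index[topic_id].append((title, weight))
    # Sort each topic's articles by relevance, highest first.
    for articles in index.values():
        articles.sort(key=lambda pair: pair[1], reverse=True)
    return dict(index)


def recommend(index, topic_id, top_n=20):
    """Return up to top_n article titles for the selected topic."""
    return [title for title, _ in index.get(topic_id, [])[:top_n]]


# Toy data standing in for per-document topic distributions.
doc_topics = {
    "Vaccine efficacy trial": [(0, 0.82), (1, 0.18)],
    "Mask mandates and transmission": [(0, 0.10), (1, 0.90)],
    "mRNA vaccine side effects": [(0, 0.65), (1, 0.35)],
}
index = build_topic_index(doc_topics)
print(recommend(index, 0, top_n=2))
```

Sorting once when the index is built keeps each lookup cheap, which suits an app that only needs to slice the first 20 titles for whichever topic the user selects.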

To learn more, see my blog post and presentation slides.

The topic model visualization built with pyLDAvis is saved as an HTML file, which you can download from here.

Try out the Heroku app for COVIPEDIA.

Workflow

Code (in Workflow Folder)
- Streamlit app on Heroku
  - Main Python file
  - Procfile, setup script, and required libraries for Heroku
  - Dataset used in app

Technologies

Python (pandas, numpy)
langdetect
regex, string
spaCy, scispaCy ("en_core_sci_lg" model for biomedical, scientific, and clinical vocabulary)
NLTK
scikit-learn
Gensim
WordCloud
pyLDAvis
Streamlit
Heroku


crystal-ctrl/nlp_project
