umkcmax edited this page Sep 11, 2018 · 4 revisions

Welcome to the CS5560-Hongcheng-Jiang-LabSubmission- wiki!

1. Check Project Details in the Google Sheets for Potential Data Sources.

2. Mine 10 unique publication abstracts relevant to your "Project Topic" from the Data Sources (use the Abstract Download Code).

Here are the screenshots.

We need to get the abstracts together with their IDs.
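As a rough illustration of keeping each abstract tied to its ID, the sketch below builds an ID-to-text mapping and rejects duplicate IDs so the 10 abstracts stay unique. The function name `index_abstracts`, the record layout and the sample IDs are assumptions made for this example; they are not the output format of the Abstract Download Code.

```python
def index_abstracts(records):
    """Map each abstract ID to its text, skipping blanks and rejecting duplicate IDs."""
    by_id = {}
    for abstract_id, text in records:
        text = text.strip()
        if not text:
            continue  # ignore records with an empty abstract
        if abstract_id in by_id:
            raise ValueError(f"duplicate abstract ID: {abstract_id}")
        by_id[abstract_id] = text
    return by_id


# Hypothetical records; real IDs would come from the chosen data source.
records = [
    ("A001", "Gene expression in tumor cells."),
    ("A002", "  Protein folding and disease. "),
]
abstracts = index_abstracts(records)
```

Keeping the mapping keyed by ID makes it easy to report statistics per abstract in the later steps.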

3. Report Data Statistics (e.g. Year of Publication, Number of terms presented in the paper, Number of images & graphs)

https://github.com/umkcmax/CS5560-Hongcheng-Jiang-LabSubmission-/blob/master/Lab1/documentation/3.png

4. Perform Basic NLP (Tokenization, Lemmatization) and provide the statistics
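A minimal sketch of tokenization and simple term statistics, using only the Python standard library. The regular expression, the lowercasing and the helper names are assumptions for illustration; lemmatization is not covered here, since it needs a linguistic resource from an NLP toolkit.

```python
import re
from collections import Counter

# Assumed token pattern: runs of letters, optionally with an apostrophe part.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def term_statistics(text):
    """Return the token count, unique term count and most frequent terms."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "unique_terms": len(counts),
        "top_terms": counts.most_common(3),
    }


# Hypothetical sample text standing in for one abstract.
sample = "Tumor cells divide. The tumor cells spread."
stats = term_statistics(sample)
```

Running the same function over every abstract gives a per-abstract table of term counts.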

We need to count pos_noun, pos_verb and ner_name. Here is the screenshot.

5. Perform Valid Word Filtering
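One possible reading of valid word filtering is sketched below: keep tokens that are purely alphabetic, at least three characters long and not stopwords. These rules, the tiny `STOPWORDS` set and the function name are assumptions for this example, not the criteria defined by the lab.

```python
# Hypothetical, deliberately small stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "to"}


def filter_valid_words(tokens, stopwords=STOPWORDS, min_length=3):
    """Keep alphabetic tokens of at least min_length that are not stopwords."""
    return [
        token
        for token in tokens
        if token.isalpha()
        and len(token) >= min_length
        and token.lower() not in stopwords
    ]


# Hypothetical token list from one abstract.
tokens = ["The", "tumor", "p53", "is", "in", "cells", "--", "of", "mice"]
valid = filter_valid_words(tokens)
```

Note that a rule this strict also drops tokens such as `p53`, so the alphabetic check may need relaxing for biomedical text.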

6. Perform Valid Medical Word Filtering and provide the statistics
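A minimal sketch of medical word filtering with statistics, assuming a medical vocabulary is available as a set of lowercase terms. The `MEDICAL_TERMS` set here is a hypothetical placeholder; in practice the vocabulary would come from a real medical word list.

```python
# Hypothetical placeholder vocabulary for the example.
MEDICAL_TERMS = {"tumor", "carcinoma", "insulin", "biopsy"}


def medical_word_stats(tokens, vocabulary=MEDICAL_TERMS):
    """Return the medical tokens plus their count and share of all tokens."""
    medical = [token for token in tokens if token.lower() in vocabulary]
    total = len(tokens)
    return {
        "medical_words": medical,
        "medical_count": len(medical),
        "total_count": total,
        "medical_ratio": len(medical) / total if total else 0.0,
    }


# Hypothetical token list, e.g. the output of the valid word filter.
tokens = ["tumor", "cells", "insulin", "mice"]
stats = medical_word_stats(tokens)
```

The ratio gives a quick per-abstract measure of how much of the text is covered by the medical vocabulary.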
