The data from the given link was scraped using BeautifulSoup4.
The given features were extracted and the data was cleaned.
This is done in the file step1.py.
step1.py also generates a JSON file, which was imported into a MongoDB database using the following command: mongoimport --db test --collection productreviews --file data.json
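As a rough sketch of the export that feeds this command, assuming step1.py holds the cleaned reviews as a list of dictionaries: the field names and the write_reviews helper below are hypothetical, not taken from step1.py. Since the command does not pass --jsonArray, the sketch writes one JSON document per line, which is the layout mongoimport reads by default.

```python
import json

# Hypothetical cleaned records; the real field names come from step1.py.
reviews = [
    {"product": "Example phone", "rating": 4, "review": "Good battery life."},
    {"product": "Example phone", "rating": 2, "review": "Screen cracked early."},
]


def write_reviews(records, path="data.json"):
    """Write one JSON document per line and return how many were written."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


if __name__ == "__main__":
    print(write_reviews(reviews), "documents written")
```

If the file were written as a single JSON array instead, the import command would need the --jsonArray option.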
Topic modelling of the first 100 reviews using the Latent Dirichlet Allocation (LDA) algorithm
The text is lemmatized and the stop words are removed.
The TF-IDF of the text is then computed and used along with the LDA model from the nltk library.
This gives the topics that might be associated with the given text.
More training would likely yield better results.
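To make the preprocessing and TF-IDF step concrete, here is a minimal pure-Python sketch. It is not the project's code: the stop-word list, the sample sentences and the tokenize and tf_idf helpers are illustrative assumptions, lemmatization is left out, and the plain tf * log(N / df) weighting used here differs from the smoothed variants that libraries typically apply. The resulting weights are the kind of input a topic model would then consume.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a full list.
STOP_WORDS = {"the", "is", "a", "and", "it", "was", "this", "but"}


def tokenize(text):
    """Lowercase, keep alphabetic words only, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample reviews, used only to exercise the functions.
docs = [
    "The battery is good and the screen is good",
    "The battery was bad",
    "This screen is sharp but the speaker is bad",
]

if __name__ == "__main__":
    for doc, scores in zip(docs, tf_idf(docs)):
        top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:2]
        print(doc, "->", top)
```

Terms that are frequent in one review but rare across the others get the highest weight, which is why "good" outranks "battery" in the first sample sentence.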
Finally, the sentiment analysis is done using the Afinn library.
This does not yield the best results, but due to the time constraint I've used it.
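The general idea behind word-list sentiment scoring can be sketched as follows. This is a simplified stand-in and not the Afinn library itself: the LEXICON values and the lexicon_score helper are hypothetical, and the real Afinn word list is far larger and differs in detail. The sketch also hints at one plausible reason such scores can mislead, since a plain word sum ignores negation and context.

```python
import re

# Hypothetical mini-lexicon in the AFINN style (word -> integer valence).
# The values are illustrative only.
LEXICON = {"good": 3, "great": 3, "love": 3, "bad": -3, "terrible": -3, "broken": -1}


def lexicon_score(text):
    """Sum the valence of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


if __name__ == "__main__":
    samples = ["Great phone, love it", "The screen is not great", "Arrived broken"]
    for sentence in samples:
        print(sentence, "->", lexicon_score(sentence))
```

In this sketch the second sample still receives a positive score because "not" carries no weight, so the negation is lost.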