After we scrape data from the web, we will need a means of categorizing it. Is this product a loaf of bread? Is it detergent? For our app to differentiate between groups, we will need to give it a model that sorts items into categories.
In this part of the project, we will try several machine learning (statistical) algorithms and see which model works best.
Workflow: Convert product titles into vectors. This lets us compare titles numerically and see whether they are closely related. There are two methods I know of that we can use to convert the titles into vectors:
- Word2Vec: learned word embeddings, which can be trained with TensorFlow
- TF-IDF: term frequency-inverse document frequency weighting
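
To make the vector idea concrete, here is a minimal TF-IDF sketch in plain Python. It is only an illustration under some assumptions: the product titles are made up, the helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`) are hypothetical, and the smoothed IDF formula is one common variant rather than the only one. In the real project we would more likely use a library implementation.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it on whitespace (a deliberately simple tokenizer)."""
    return title.lower().split()


def tfidf_vectors(titles):
    """Return (vocabulary, vectors), with one TF-IDF vector per title."""
    docs = [tokenize(t) for t in titles]
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)

    # Document frequency: the number of titles that contain each word.
    df = Counter(word for doc in docs for word in set(doc))

    # Smoothed IDF (an assumed variant) so no word ends up with zero weight.
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1 for w in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty titles
        vectors.append([counts[w] / total * idf[w] for w in vocab])
    return vocab, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up example titles, not real scraped data.
titles = [
    "whole wheat bread loaf",
    "white bread loaf sliced",
    "liquid laundry detergent",
]

vocab, vectors = tfidf_vectors(titles)
bread_vs_bread = cosine_similarity(vectors[0], vectors[1])
bread_vs_detergent = cosine_similarity(vectors[0], vectors[2])
```

With these titles, the two bread products share words and so score a higher similarity than the bread and detergent pair, which share none. That gap is what a classifier would later use to sort items into categories.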