This is a text mining project developed by me and my colleagues at the University of Manchester. Its purpose is to apply Natural Language Processing (NLP) methods to a Tripadvisor dataset in order to answer questions such as which restaurant type European users seem to prefer. The data for this project was obtained from Kaggle, and the pipeline was written in R. Python was also used for some data cleaning and data preparation.
Customer reviews of restaurants can give important insights into customers’ experience and satisfaction. In this research, we analysed customers’ reviews from Tripadvisor.com using sentiment analysis to determine the polarity of their opinions and, based on that, to draw conclusions about which cuisine type is most preferred. We also used Latent Dirichlet Allocation (LDA) to perform topic modelling. The results showed that people tend to submit a review to express a positive opinion, and that European cuisine is the most common option. The topics people discuss most are the restaurant, the service and the food.
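To make the idea of review polarity concrete, here is a minimal sketch in Python of lexicon-based sentiment scoring averaged per cuisine type. It is not the code from this repository (the actual pipeline is in R): the word lists, the function names `polarity` and `mean_polarity_by_cuisine`, and the sample reviews are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon for illustration only; a real analysis
# would rely on a full sentiment lexicon.
POSITIVE = {"good", "great", "delicious", "friendly", "excellent"}
NEGATIVE = {"bad", "slow", "rude", "cold", "terrible"}


def polarity(text):
    """Return the count of positive words minus the count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def mean_polarity_by_cuisine(reviews):
    """Average the polarity score of the reviews within each cuisine type."""
    scores = defaultdict(list)
    for cuisine, text in reviews:
        scores[cuisine].append(polarity(text))
    return {cuisine: sum(vals) / len(vals) for cuisine, vals in scores.items()}


# Made-up sample data: (cuisine type, review text) pairs.
reviews = [
    ("European", "Great food and friendly service"),
    ("European", "Delicious pasta but slow service"),
    ("Asian", "The soup was cold and the waiter was rude"),
]

scores = mean_polarity_by_cuisine(reviews)
print(scores)
```

A positive average suggests that reviews for a cuisine lean favourable and a negative one that they lean unfavourable. Simple word counting like this ignores negation and context, so it should be read only as a rough indicator.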
You can see a detailed presentation of the code in the Restaurant Reviews Analysis file in this repository. You can also check out the code in the NLP.R file.