Description

The data set, titled “Political Social Media Posts”, was retrieved from Kaggle.com and consists of 5000 observations with 21 variables. We analyze it through questions that involve natural language processing of one or more variables. Our ultimate goal is to analyze the sentiment of each tweet and see how it varies by attributes such as audience, message, and political party. In addition, we ran regression and classification models using the controversiality label to see whether it can predict bias.
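The sentiment-by-attribute idea described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the method used in the notebook: the word lists are a hypothetical toy lexicon, the `text` and `party` field names are assumptions, and the sample rows are made up rather than taken from the data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon (assumption); a real analysis would use a full
# sentiment lexicon or a trained model.
POSITIVE = {"great", "proud", "support", "thank", "win"}
NEGATIVE = {"bad", "fail", "attack", "oppose", "wrong"}


def sentiment_score(text):
    """Return (positive count - negative count) / word count, in [-1, 1]."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def mean_sentiment_by(rows, attribute):
    """Average the sentiment of each row's 'text' within each value of `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(sentiment_score(row["text"]))
    return {key: mean(scores) for key, scores in groups.items()}


# Made-up rows for illustration only; field names are assumptions.
rows = [
    {"text": "Proud to support this bill!", "party": "A"},
    {"text": "This policy is wrong.", "party": "B"},
    {"text": "Thank you all, great win.", "party": "A"},
]
by_party = mean_sentiment_by(rows, "party")
print(by_party)
```

The same `mean_sentiment_by` helper could be called with a different attribute name (for example a hypothetical `audience` or `message` field) to compare sentiment across the other groupings mentioned above.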

NLP on Reddit Comments

Python Notebook: https://github.com/wafer110/Python-NLP-Analyze_TextualData_on_Reddit_Comments/blob/master/%5Bwh%5D%20FinalProj_Coding.ipynb

Project Presentation: https://github.com/wafer110/Python-NLP-Analyze_TextualData_on_Reddit_Comments/blob/master/NLP%20on%20Reddit%20Comments.pdf

Project Report: https://github.com/wafer110/Python-NLP-Analyze_TextualData_on_Reddit_Comments/blob/master/Alex_Wafer_FinalReport.pdf

Knitted PDF: https://github.com/wafer110/Python-NLP-Analyze_TextualData_on_Reddit_Comments/blob/master/%5Bwh%5D%20FinalProj_Coding.pdf