Master's thesis, MSc Statistics for Data Science

Universidad Carlos III de Madrid (UC3M)

This repository contains the following scripts:

  • Dataset Creation Part 1: Retrieving datasets from Google Analytics and performing initial web scraping using Beautiful Soup.
  • Dataset Creation Part 2: Conducting sentiment analysis using the syuzhet package.
  • Preprocessing: Handling missing data with iterative web scraping, processing multinomial variables, and plotting the exploratory data analysis (EDA).
  • Modeling: Three different models were implemented:
    • Poisson regression: Using a negative binomial to handle overdispersion.
    • Bayesian regression: Also using a negative binomial to handle overdispersion.
    • Neural networks: Less interpretable than the regression models, but with higher predictive power.
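
The role of the negative binomial in the two regression models above can be illustrated with a minimal sketch. It is not taken from the thesis scripts: the counts are made up, the function names (`poisson_loglik`, `negbin_loglik`, `moment_fit`) are hypothetical, and the fit uses a simple method-of-moments estimate with no covariates, so it only shows why a negative binomial suits overdispersed counts better than a plain Poisson.

```python
import math
from statistics import mean, variance

# Made-up daily page-view counts, for illustration only (not thesis data).
counts = [0, 1, 0, 3, 12, 0, 2, 45, 1, 0, 7, 0, 19, 2, 0, 88, 4, 1, 0, 6]


def poisson_loglik(data, lam):
    """Log-likelihood of the data under a Poisson with mean lam."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in data)


def negbin_loglik(data, mu, r):
    """Log-likelihood under a negative binomial with mean mu and
    dispersion r (variance = mu + mu**2 / r)."""
    p = r / (r + mu)
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1 - p)
        for k in data
    )


def moment_fit(data):
    """Method-of-moments estimates (mu, r); requires variance > mean."""
    mu, var = mean(data), variance(data)
    if var <= mu:
        raise ValueError("data are not overdispersed")
    return mu, mu ** 2 / (var - mu)


# A Poisson forces variance == mean, so a ratio well above 1 signals overdispersion.
dispersion = variance(counts) / mean(counts)

mu, r = moment_fit(counts)
pois_ll = poisson_loglik(counts, mu)
nb_ll = negbin_loglik(counts, mu, r)

print(f"dispersion index: {dispersion:.1f}")
print(f"Poisson log-likelihood:           {pois_ll:.1f}")
print(f"negative binomial log-likelihood: {nb_ll:.1f}")
```

On overdispersed counts such as these, the negative binomial log-likelihood comes out higher than the Poisson one, because its extra dispersion parameter lets the variance exceed the mean. A full regression would additionally link the mean to covariates.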

Citation

To cite this thesis in publications, use:

@mastersthesis{l.ripoll2024,
  author    = {Luisa Ripoll},
  title     = {Advanced Predictive Models for the Young Readership of `La Razón' Newspaper},
  school    = {Universidad Carlos III de Madrid},
  year      = {2024},
  url       = {https://github.com/luisarip/masters-thesis/}
}