
Master's thesis in Statistics and Data Science: modelling the young readership of the Spanish newspaper "La Razón"


luisarip/masters-thesis


Master's thesis, MSc Statistics for Data Science

Universidad Carlos III de Madrid (UC3M)

This repository contains the following scripts:

  • Dataset Creation Part 1: Retrieving datasets from Google Analytics and performing initial web scraping using Beautiful Soup.
  • Dataset Creation Part 2: Conducting sentiment analysis using the syuzhet package.
  • Preprocessing: Handling missing data with iterative web scraping, processing multinomial variables, and plotting the exploratory data analysis (EDA).
  • Modelling: Implementing three different models:
    • Poisson regression: Using a negative binomial model to manage overdispersion.
    • Bayesian regression: Using a negative binomial likelihood to manage overdispersion.
    • Neural networks: Less interpretable, but with higher predictive power.
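
The following minimal sketch illustrates why a negative binomial is a common way to manage overdispersion in count data. It is not the thesis code. The function names and the `page_views` values are hypothetical and made up for illustration. Under a Poisson model the variance equals the mean, so a variance-to-mean ratio far above 1 signals overdispersion. The negative binomial (NB2 form) adds a parameter `alpha` so that Var(Y) = mu + alpha * mu^2, estimated here by the method of moments.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    return variance(counts) / mean(counts)


def nb2_alpha_moments(counts):
    """Method-of-moments estimate of alpha in Var(Y) = mu + alpha * mu**2.

    Returns 0.0 when the sample variance does not exceed the sample mean,
    i.e. when there is no evidence of overdispersion.
    """
    m = mean(counts)
    v = variance(counts)
    return max(0.0, (v - m) / m**2)


# Hypothetical reader counts per article (assumption, not thesis data).
page_views = [0, 1, 1, 2, 3, 5, 8, 13, 40, 127]

if __name__ == "__main__":
    print("dispersion index:", dispersion_index(page_views))
    print("NB2 alpha (moments):", nb2_alpha_moments(page_views))
```

A dispersion index well above 1 suggests that a plain Poisson model would understate the variance. In that case a positive `alpha` gives the extra variance the negative binomial absorbs.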

Citation

To cite this thesis in publications use:

@mastersthesis{l.ripoll2024,
  author    = {Luisa Ripoll},
  title     = {Advanced Predictive Models for the Young Readership of `La Razón' Newspaper},
  school    = {Universidad Carlos III de Madrid},
  year      = {2024},
  url       = {https://github.com/luisarip/masters-thesis/}
}
