The idea came to me during the pandemic, when everything was slow and everyone was moving toward online content. Netflix's recommender helped people enjoy millions of shows. A closer look shows how carefully Netflix curates these recommendations. Small touches, like changing a title's image to suit each user's perspective, are why I wanted to study it and build something that would help people enjoy what they watch.
The basic idea is to build a dataset that helps us determine what could count as a mood or style. To do this, I scraped the Netflix website (see below for what was scraped) to collect key indicators of viewers' likings and mood. The scraped dataset was then cleaned (painstaking, but a clean dataset is important 😛) and visualized to see what the data looks like. These steps shaped the final model.
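The cleaning step can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the field names `title`, `year`, and `moods`, the helper `clean_rows`, and the sample rows are all assumptions made up for the example.

```python
def clean_rows(rows):
    """Return cleaned copies of scraped rows, dropping blank and duplicate titles."""
    seen = set()
    cleaned = []
    for row in rows:
        title = (row.get("title") or "").strip()
        if not title or title.lower() in seen:
            continue  # skip rows with no title and titles already kept
        seen.add(title.lower())
        year_text = str(row.get("year") or "").strip()
        mood_text = row.get("moods") or ""
        cleaned.append({
            "title": title,
            # Keep the year only when it is a plain number; otherwise mark it missing.
            "year": int(year_text) if year_text.isdecimal() else None,
            # Split "Dark, Suspenseful" into ["dark", "suspenseful"].
            "moods": [m.strip().lower() for m in mood_text.split(",") if m.strip()],
        })
    return cleaned


# Hypothetical scraped rows (made-up data, for illustration only).
raw = [
    {"title": " Example Show ", "year": "2020", "moods": "Dark, Suspenseful"},
    {"title": "example show", "year": "2020", "moods": "Dark"},  # duplicate title
    {"title": "", "year": "2019", "moods": "Funny"},             # missing title
    {"title": "Another Title", "year": "n/a", "moods": ""},      # unparseable year
]
result = clean_rows(raw)
```

Normalizing whitespace and case before de-duplicating keeps near-identical rows from slipping through, and storing missing years as `None` makes them easy to filter out before plotting.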
These pictures show some of the data scraped from Netflix title pages. The pages contain phrases like "This _ has", which give us the mood, along with categories, audio support, ratings, and release year. All of this feeds the final model, which recommends Netflix titles.
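To make concrete how tags like mood and category could drive a recommendation, here is a small sketch. It does not reproduce the project's model, which is not described here; it swaps in a plain Jaccard overlap between tag sets as a stand-in. The `Title` class, the `jaccard` and `recommend` helpers, and the sample titles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Title:
    name: str
    moods: set = field(default_factory=set)
    categories: set = field(default_factory=set)

    def tags(self):
        """All mood and category tags for this title as one set."""
        return self.moods | self.categories


def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, catalog, top_n=3):
    """Names of the titles whose tags overlap most with the target's tags."""
    scored = [
        (jaccard(target.tags(), other.tags()), other.name)
        for other in catalog
        if other.name != target.name
    ]
    # Highest score first; break ties alphabetically so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


# Made-up catalog, for illustration only.
a = Title("Title A", {"dark"}, {"thriller"})
b = Title("Title B", {"dark"}, {"thriller", "crime"})
c = Title("Title C", {"feel-good"}, {"comedy"})
suggestions = recommend(a, [a, b, c])
```

Here `Title B` shares two of three tags with `Title A`, while `Title C` shares none and is dropped. Audio support, rating, and year could be folded in as extra tags or as filters in the same way.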
These are some of the visualizations produced from the dataset.
- The first one shows how many titles Netflix released each year. This helped us understand Netflix's growth and gave us a sense of how its subscriber base increased.
- The second one shows the average age restriction of titles released each year. The average age rating has been decreasing over time, which suggests Netflix is catering to a much broader clientele as the years go by.
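The numbers behind both charts come down to a per-year aggregation, which might look something like the sketch below. The field names `year` and `min_age` (a numeric minimum viewer age derived from the rating), the helper `yearly_summary`, and the sample rows are assumptions for illustration, not the project's actual schema or data.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(rows):
    """Per release year: number of titles and their average minimum viewer age."""
    ages_by_year = defaultdict(list)
    for row in rows:
        ages_by_year[row["year"]].append(row["min_age"])
    return {
        year: {"titles": len(ages), "avg_min_age": mean(ages)}
        for year, ages in sorted(ages_by_year.items())
    }


# Made-up rows, for illustration only.
rows = [
    {"year": 2019, "min_age": 18},
    {"year": 2019, "min_age": 16},
    {"year": 2020, "min_age": 13},
    {"year": 2020, "min_age": 7},
    {"year": 2020, "min_age": 16},
]
summary = yearly_summary(rows)
```

The `titles` count per year feeds the first chart and `avg_min_age` feeds the second, so any plotting library can draw both from the same summary.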