
Releases: prushh/movie-lens-mlp

TMDB dataset

15 Apr 08:30

In acquisition.py, the retrieve_tmdb() function issues a large number of API requests to build the uploaded dataset.
Because this is time-consuming, we upload the final version here so that it does not have to be regenerated.

It contains the following features:

  • movieId is the id for the MovieLens movies [integer]
  • tmdbId is the id for the TMDB movies [integer]
  • runtime is the duration of the film denoted in minutes [integer or null]
  • adult indicates whether a movie is explicit or not [boolean]
  • imdb_id is the id for the IMDb movies [string or null]
  • budget is the budget available to make the movie [integer]
  • revenue is the revenue earned by the movie [integer]

To create a different version of the CSV with a different set of features, specify them as follows:
features = {'imdb_id', 'budget', 'revenue', 'adult', 'runtime'}
and set the boolean USE_API = True in the const.py file to save the new one.
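As a rough illustration of what selecting features means, the sketch below keeps only the requested keys from a movie record and combines them with the two ids. It is not the project's retrieve_tmdb() implementation: build_row is a hypothetical helper, and the payload is a made-up dictionary standing in for whatever the API returns for one movie.

```python
# Feature set as given in the release notes.
features = {'imdb_id', 'budget', 'revenue', 'adult', 'runtime'}


def build_row(movie_id, tmdb_id, payload, selected):
    """Hypothetical helper: build one CSV row from a movie record.

    Keeps the two ids and only the keys named in `selected`;
    a feature missing from the payload becomes None (null).
    """
    row = {"movieId": movie_id, "tmdbId": tmdb_id}
    for name in sorted(selected):  # sorted for a stable column order
        row[name] = payload.get(name)
    return row


# Made-up record for illustration only; not real TMDB data.
payload = {
    "imdb_id": "tt0000001",
    "budget": 1000000,
    "revenue": 2500000,
    "adult": False,
    "title": "Example",  # not selected, so it is dropped
}

row = build_row(1, 100, payload, features)
```

Here runtime is absent from the made-up record, so it ends up as None in the row, which matches the "integer or null" type given for that feature.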