Releases: prushh/movie-lens-mlp
TMDB dataset
In the acquisition.py file, the retrieve_tmdb()
function issues a large number of API requests to build the uploaded dataset.
Because this is a time-consuming operation, the final version is uploaded here.
It contains the following features:
- movieId is the id for the MovieLens movies [integer]
- tmdbId is the id for the TMDB movies [integer]
- runtime is the duration of the movie in minutes [integer or null]
- adult indicates whether the movie is explicit [boolean]
- imdb_id is the id for the IMDB movies [string or null]
- budget is the budget available to make the movie [integer]
- revenue is the revenue earned by the movie [integer]
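As a rough illustration of the column types listed above, the sketch below parses a CSV with this schema using only the Python standard library. The sample rows are invented, and the assumptions that null values appear as empty cells and booleans as True/False are not confirmed by the uploaded file; the helper names (parse_row, load_tmdb) are hypothetical.

```python
import csv
import io

# Invented sample rows for illustration only; not taken from the real dataset.
SAMPLE = """movieId,tmdbId,runtime,adult,imdb_id,budget,revenue
1,100,90,False,tt0000001,1000,2500
2,200,,False,,0,0
"""


def _opt_int(value):
    """Return an int, or None for an empty cell (assumed null encoding)."""
    return int(value) if value != "" else None


def _opt_str(value):
    """Return the string, or None for an empty cell (assumed null encoding)."""
    return value if value != "" else None


def parse_row(row):
    """Convert one CSV row (all strings) to the types given in the feature list."""
    return {
        "movieId": int(row["movieId"]),
        "tmdbId": int(row["tmdbId"]),
        "runtime": _opt_int(row["runtime"]),   # integer or null
        "adult": row["adult"].strip().lower() == "true",
        "imdb_id": _opt_str(row["imdb_id"]),   # string or null
        "budget": int(row["budget"]),
        "revenue": int(row["revenue"]),
    }


def load_tmdb(handle):
    """Read every row from an open text handle into a list of typed dicts."""
    return [parse_row(row) for row in csv.DictReader(handle)]


rows = load_tmdb(io.StringIO(SAMPLE))
```

To read the released file instead of the sample, pass an open file handle (for example one opened with newline="") to load_tmdb.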
A different version of the CSV, with a different set of features, can be created by specifying them as follows:
features = {'imdb_id', 'budget', 'revenue', 'adult', 'runtime'}
and setting the boolean USE_API = True
in the const.py file to save the new version.
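To make the role of the features set more concrete, here is a minimal sketch of how such a set could be used to keep only the requested fields from one movie-details record. The select_features helper and the sample record are hypothetical and only illustrate the idea; the actual logic inside retrieve_tmdb() may differ.

```python
features = {'imdb_id', 'budget', 'revenue', 'adult', 'runtime'}


def select_features(details, features):
    """Keep only the requested keys; keys missing from the record become None."""
    return {name: details.get(name) for name in sorted(features)}


# Invented record standing in for a single API response.
details = {
    'id': 100,
    'title': 'Example',
    'runtime': 90,
    'adult': False,
    'budget': 1000,
    'revenue': 2500,
}

row = select_features(details, features)
```

Here imdb_id is absent from the sample record, so it is stored as None, which matches the "string or null" type given in the feature list.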