This dataset contains information about movies, including ratings, genres, and other metadata.
The dataset consists of 10 columns with 29,697 rows.
-
TCONST (object):
- Unique IMDb identifier for each entry
- Corresponds to IMDb's tconst (title constant) identifier
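
A quick sanity check on TCONST values can be sketched as follows. It assumes the usual IMDb convention of `tt` followed by at least seven digits. The helper name `is_valid_tconst` is hypothetical, and the pattern has not been checked against this particular dataset.

```python
import re

# Assumed format: "tt" followed by seven or more ASCII digits (e.g. "tt0111161").
TCONST_PATTERN = re.compile(r"tt[0-9]{7,}")


def is_valid_tconst(value: object) -> bool:
    """Return True if value is a string that looks like an IMDb title identifier."""
    return isinstance(value, str) and TCONST_PATTERN.fullmatch(value) is not None
```

Applied to every entry in the column, a check like this would flag rows whose identifier was truncated or mixed up with another ID type (such as a name identifier starting with `nm`).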
-
ORIGINAL_TITLE (object):
- The original title of the movie or TV show
-
TYPE (object):
- Indicates whether the entry is a movie, TV show, or other media type
-
AVG_RATING (float64):
- The average rating of the title
- Numeric values on a rating scale (e.g., 1 to 10)
-
VOTE_COUNT (int64):
- The number of votes or ratings received
- Integer values
-
GENRES (object):
- The genre(s) associated with the title
- Likely a string, possibly comma-separated if multiple genres
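
If GENRES does turn out to be comma-separated, a small helper along these lines could turn each value into a list. The comma separator is an assumption taken from the description above, and the name `split_list_field` is hypothetical.

```python
from typing import List


def split_list_field(value: object, sep: str = ",") -> List[str]:
    """Split a delimited string into trimmed, non-empty items.

    Non-string inputs (for example None or NaN for a missing value)
    yield an empty list.
    """
    if not isinstance(value, str):
        return []
    return [item.strip() for item in value.split(sep) if item.strip()]
```

For example, `split_list_field("Action, Drama")` gives `["Action", "Drama"]`. The same helper would work for any other column that stores multiple values in one string.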
-
DIRECTORS (object):
- The director(s) of the title
- Comma-separated if multiple directors
-
YEAR (object):
- The year of release or production
- Stored as an object (string) rather than a numeric type
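
Because YEAR is stored as a string, numeric operations such as sorting or filtering by decade need a conversion step first. The sketch below assumes that valid entries are four-digit years and that anything else (for example a missing-value placeholder) should become `None`. The helper name `parse_year` is hypothetical.

```python
from typing import Optional


def parse_year(value: object) -> Optional[int]:
    """Return the year as an int, or None if value is not a plain four-digit year."""
    text = str(value).strip()
    # isascii() guards against non-ASCII digit characters that int() rejects.
    if len(text) == 4 and text.isascii() and text.isdigit():
        return int(text)
    return None
```

Returning `None` rather than raising keeps the conversion safe to apply to the whole column. The unparsed rows can then be counted and inspected separately.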
-
OVERVIEW (object):
- Brief synopsis or description of the title
-
COMBINED_FEATURES (object):
- Combines multiple features (ORIGINAL_TITLE, GENRES, YEAR, DIRECTORS, OVERVIEW) into a single text field for TF-IDF processing
- How these fields are joined (order, separator, preprocessing) is not specified in the provided information
-
Column type summary:
- 8 columns are of type 'object' (likely strings)
- 1 column is of type 'float64' (AVG_RATING)
- 1 column is of type 'int64' (VOTE_COUNT)
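
For the COMBINED_FEATURES column described earlier, one plausible construction is to concatenate the five source fields with spaces. This is only a sketch: the field order, the space separator, and the helper name `combine_features` are assumptions, not the method actually used to build the dataset.

```python
from typing import Mapping

# Source fields named in the COMBINED_FEATURES description; the order is assumed.
FEATURE_COLUMNS = ("ORIGINAL_TITLE", "GENRES", "YEAR", "DIRECTORS", "OVERVIEW")


def combine_features(row: Mapping[str, object]) -> str:
    """Join the non-empty feature fields of one row into a single string."""
    parts = []
    for column in FEATURE_COLUMNS:
        value = row.get(column)
        if value is None:
            continue  # skip missing fields instead of emitting "None"
        text = str(value).strip()
        if text:
            parts.append(text)
    return " ".join(parts)


# Illustrative row with made-up values, not taken from the dataset.
example_row = {
    "ORIGINAL_TITLE": "Example Title",
    "GENRES": "Drama,Mystery",
    "YEAR": "2001",
    "DIRECTORS": "Jane Doe",
    "OVERVIEW": "A short synopsis.",
}
combined = combine_features(example_row)
```

A string built this way could then be passed to a TF-IDF vectorizer, with each row treated as one document.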