This folder contains the data processing scripts for the Anime Data Visualization. Each script is a Python Jupyter notebook that outputs one or more JSON files when run; these can be manually tweaked for formatting. The final JSON files used by the data visualization app are placed in the absolute folder /app/public/data.
To run the scripts you will need:
- Python 3 (link): scripting and programming runtime and environment
- Jupyter Notebook (link): web-based development application for interactive Python scripts
- Pandas (link): data analysis library for Python
- JikanPy (link): Python wrapper library around the Jikan API
- The original dataset CSVs, which can be found here, placed in the relative folder ./raw (you may need to create it)
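Before opening the notebooks, it can help to confirm that the raw folder exists and holds the CSVs they read. The following is a minimal sketch, assuming it is run from this folder. The helper name `missing_raw_files` is hypothetical, and the two file names are the inputs named in the notebook descriptions in this README.

```python
from pathlib import Path

# CSVs read by the notebooks in this folder (names taken from this README).
EXPECTED_CSVS = ("anime_cleaned.csv", "AnimeList.csv")


def missing_raw_files(raw_dir="raw", expected=EXPECTED_CSVS):
    """Create raw_dir if needed and return the expected CSVs not yet in it."""
    raw = Path(raw_dir)
    raw.mkdir(parents=True, exist_ok=True)  # "you may need to create it"
    return [name for name in expected if not (raw / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing from ./raw:", ", ".join(missing))
    else:
        print("All expected CSVs are in ./raw")
```

An empty result means every expected CSV is in place; otherwise the returned names are the files still to be downloaded.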
Each .ipynb can be opened in your local Jupyter Notebook instance as a standalone script.
- Genres.ipynb: takes as input raw/anime_cleaned.csv and produces genre_data.json and genre_top_animes_data.json, containing the data for the genres bubble diagram.
- History.ipynb: takes as input raw/AnimeList.csv and produces history.json, containing the data for the histogram.
- Actors.ipynb: takes as input raw/anime_cleaned.csv and fetches data from the Jikan API. It outputs vA_datasets.json and vA_infos.json (the latter is to be placed in /app/src/pages/chord for bundling), which contain the data for the actors chord diagram.
- Studios.ipynb: takes as input raw/anime_cleaned.csv and produces sankey_dataset.json, which contains the data for the studios sankey diagram.
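Once the notebooks have run, the output files listed above go to two places: vA_infos.json to the chord page folder, and the rest to the public data folder. The following sketch shows one way to copy them. It rests on two assumptions not stated in this README: that the notebooks write their JSON files next to themselves, and that /app/... is read relative to the repository root. The function name `publish_outputs` is hypothetical.

```python
import shutil
from pathlib import Path

# Output file -> destination folder, relative to the repository root (assumed).
# vA_infos.json is bundled with the chord page; the others are public data.
DESTINATIONS = {
    "genre_data.json": "app/public/data",
    "genre_top_animes_data.json": "app/public/data",
    "history.json": "app/public/data",
    "vA_datasets.json": "app/public/data",
    "sankey_dataset.json": "app/public/data",
    "vA_infos.json": "app/src/pages/chord",
}


def publish_outputs(script_dir, repo_root, destinations=DESTINATIONS):
    """Copy each generated JSON file that exists; return the names copied."""
    copied = []
    for name, dest in destinations.items():
        src = Path(script_dir) / name
        if not src.is_file():
            continue  # the notebook producing this file has not been run yet
        target_dir = Path(repo_root) / dest
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / name)
        copied.append(name)
    return copied
```

Files that have not been generated yet are skipped rather than treated as errors, so the helper can be rerun after each notebook. Any manual formatting tweaks should be made before copying.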