The purpose of this project is to extract, archive and show data related to the MANGA Plus Hottest Manga list.
The original Hottest Manga list only shows the current state of the top 40 manga titles, with no history, log or charts. This project therefore aims to build a data history and generate an interactive chart of the Hottest Manga list.
This project is not affiliated, associated, authorized, endorsed by, or in any way officially connected with "MANGA Plus by SHUEISHA", SHUEISHA Inc., or any of its subsidiaries or its affiliates. The official MANGA Plus website can be found at https://mangaplus.shueisha.co.jp/.
This project extracts only the ranking information (manga ranks and titles); no other website resources (such as manga logos and story images) are stored or processed in any way.
Required tools and libraries:
Node.js (for data scraping and processing)
Tested version: v16.15.1
- yarn (alternative package manager)
- puppeteer (for web scraping)
- slugify (for title id generation)
- js-yaml (for data processing)
Once inside the project directory, install the dependencies by running the following command:
yarn install
or
npm install
Sake (for webpage processing)
It's a Makefile I created to generate simple static sites.
Make (for tasks)
I prefer writing tasks with Make (instead of Node.js package scripts) because it is simpler and more portable.
NOTE: Make is optional for data scraping and processing, but required for webpage processing.
If you just want to see the chart, ignore the following instructions and go here: https://manga.tumeo.space
Follow the instructions below if you want to run this project locally:
To scrape new data, stay in the repository root directory and run the following command:
make today
The data will be saved to scraped-data/YYYY/YYYY-MM-DD.tsv
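As a rough illustration of the path layout above, the following sketch builds the output path for a given date. The helper name scrapedDataPath is hypothetical, and taking the date in UTC is an assumption; the actual task may use local time or a different clock.

```javascript
// Hypothetical helper, not part of the project: builds the output path
// for a given date, following the scraped-data/YYYY/YYYY-MM-DD.tsv layout.
// Assumption: the date in the file name is taken in UTC.
function scrapedDataPath(date) {
  const iso = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const year = iso.slice(0, 4); // "YYYY"
  return `scraped-data/${year}/${iso}.tsv`;
}

// Month is zero-based in Date.UTC, so 6 means July.
console.log(scrapedDataPath(new Date(Date.UTC(2022, 6, 1))));
// prints "scraped-data/2022/2022-07-01.tsv"
```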
NOTE: This task is performed daily by GitHub Actions.
After scraping the data, it needs to be processed to be easily consumable by the website generator.
To process the scraped data, stay in the repository root directory and run the following command:
make yaml
NOTE: This task is already performed automatically by the GitHub Pages environment.
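To give an idea of what this processing step involves, here is a minimal sketch that turns one scraped file into a list of records with a generated title id. It assumes each line has the form "<rank><TAB><title>" with no header row, which may not match the real files. The functions parseRanking and toId are hypothetical; toId is a simplified stand-in for the slugify package, not its actual behavior, and the YAML serialization done with js-yaml is left out. The sample rows are made up for illustration.

```javascript
// Hypothetical parser: assumes each line is "<rank>\t<title>" with no
// header row. The real column layout of the scraped files may differ.
function parseRanking(tsv) {
  return tsv
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "") // skip blank lines
    .map((line) => {
      const [rank, title] = line.split("\t");
      return { rank: Number(rank), title, id: toId(title) };
    });
}

// Simplified stand-in for the slugify package: lowercase the title,
// replace runs of characters other than a-z and 0-9 with "-", and trim
// leading and trailing "-".
function toId(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Illustrative rows, not real scraped data.
const sample = "1\tONE PIECE\n2\tSPY x FAMILY\n";
console.log(parseRanking(sample));
```

Records in this shape can then be serialized to YAML and grouped by id across days to form the history shown in the chart.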
The webpage is built only with make, jinja2, fd, yj and jq.
See the site generator project for more info: https://github.com/williamd1k0/sake
The chart library (Chart.js) is loaded from a CDN.
To process the webpage, change to the site directory and run the following command:
make
NOTE: This task is already performed automatically by the GitHub Pages environment; see the results here: https://manga.tumeo.space