The scraper iterates over a subaccount (or over a CSV called course_IDs.csv), checks each course's 'Syllabus' tab, downloads every .pdf file found there, and saves each one into the pdf folder, renamed by course code and year.
dl_data.csv records which files were downloaded for each course.
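If you want to inspect the download log programmatically, a minimal sketch like the following works with the standard library. The exact column layout of dl_data.csv is not specified here, so the helper makes no assumptions about it and simply returns the raw rows:

```python
import csv

def load_download_log(path="dl_data.csv"):
    """Read the scraper's download log into a list of rows.

    The column names are whatever the scraper wrote; this helper does
    not assume any particular schema, it just parses the CSV.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))
```

You could then filter the returned rows for a particular course code, or count how many PDFs were fetched per course.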
- If you do not have Python, install it. If you are new to Python, I recommend installing it via https://www.anaconda.com/download/.
- Clone this GitHub repository.
- Install the dependencies (first-time setup only): run `pip install -r requirements.txt` from a command shell in the directory of your cloned repo.
- Run the script. It will prompt you for the following:
  - Token (Canvas API token)
  - Subaccount to run in
  - Term to search through
Please note that this script is rather slow: to avoid the risk of overloading the AWS server, all API calls are made on a single thread. This script will be rewritten soon.
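To illustrate the single-threaded approach described above, here is a minimal sketch of sequential Canvas API pagination. This is not the scraper's actual code; it assumes a generic paginated endpoint and uses the `requests` library (Canvas paginates responses via `Link` headers, which `requests` exposes as `resp.links`):

```python
import requests

def get_paginated(url, token, params=None):
    """Fetch every page of a Canvas API endpoint sequentially.

    Each request waits for the previous one to finish, so the server
    only ever sees one in-flight call from this client.
    """
    headers = {"Authorization": f"Bearer {token}"}
    results = []
    while url:
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        results.extend(resp.json())
        # Follow the rel="next" link header until no pages remain.
        url = resp.links.get("next", {}).get("url")
        params = None  # the next-page URL already embeds the query params
    return results
```

Because each page is fetched only after the previous one returns, total runtime grows linearly with the number of courses, which is why large subaccounts take a while.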