Jupyter-based dashboards to help visualise activity in issues and Pull Requests across many repositories and organisations - all in one place!
Click here to view the activity dashboard! 👉
Click here to view the past activity summary! 👉
get-data.py is a Python script that makes calls to the GitHub REST API to collect information about issues and pull requests.
It specifically makes requests to the search endpoint, which allows us to search for issues and pull requests as we would in GitHub's own search bar.
For example, is:issue is:open assignee:sgibson91 would return all open issues assigned to me.
This turned out to be much more efficient than using the 'list issues assigned to the authenticated user' endpoint, since it makes fewer individual requests and is therefore less likely to get the script rate limited.
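To make the search call concrete, here is a minimal sketch of how such a request could be built with only the standard library. The helper name build_search_request and the per_page value are illustrative assumptions, not taken from get-data.py; the endpoint URL and the q parameter are the ones documented for GitHub's issue search.

```python
import os
import urllib.parse
import urllib.request

# GitHub's REST endpoint for searching issues and pull requests
SEARCH_URL = "https://api.github.com/search/issues"


def build_search_request(query, token=None, per_page=100):
    """Build (but do not send) a search request. Hypothetical helper."""
    params = urllib.parse.urlencode({"q": query, "per_page": per_page})
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Authenticated requests get a higher rate limit
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(f"{SEARCH_URL}?{params}", headers=headers)


request = build_search_request(
    "is:issue is:open assignee:sgibson91",
    token=os.environ.get("ACCESS_TOKEN"),
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen and decoding the JSON body would return the matching results under an "items" key.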
The script searches for all issues and pull requests that meet the following criteria:
- the user is either assigned to or has created them,
- they involve the user and were closed in the last month,
- they involve the user and were closed or updated in the last week,
- and any pull requests where the user's review has been requested.
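The criteria above might translate into search queries along the following lines. This is a sketch under stated assumptions: the function name build_queries is hypothetical, "last month" is approximated as 30 days, and the exact qualifiers used by get-data.py may differ. The qualifiers themselves (assignee, author, involves, closed, updated, review-requested) are standard GitHub search syntax.

```python
from datetime import date, timedelta


def build_queries(username, today=None):
    """Return one search query per criterion. Hypothetical helper."""
    today = today or date.today()
    last_week = (today - timedelta(days=7)).isoformat()
    last_month = (today - timedelta(days=30)).isoformat()  # assumption: 30 days
    return [
        f"assignee:{username}",                        # assigned to the user
        f"author:{username}",                          # created by the user
        f"involves:{username} closed:>={last_month}",  # closed in the last month
        # Closing an item also updates it, so one updated: filter
        # covers "closed or updated in the last week"
        f"involves:{username} updated:>={last_week}",
        f"is:pr review-requested:{username}",          # review requested
    ]


for query in build_queries("sgibson91"):
    print(query)
```

Each query is sent separately because qualifiers within a single query are combined with AND, not OR.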
The results are compiled into a pandas DataFrame, along with some metadata, and then written to a CSV file called github-activity.csv.
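As an illustration of this step, the sketch below flattens a couple of made-up search results into a DataFrame and writes it to CSV. The column names and the to_dataframe helper are assumptions and will not match the real script exactly; the input keys (title, html_url, repository_url, created_at, updated_at, closed_at, pull_request) are fields returned by the GitHub search API.

```python
import pandas as pd

# Made-up results, trimmed to the fields used below
sample_items = [
    {
        "title": "Fix typo in docs",
        "html_url": "https://github.com/example-org/example-repo/issues/1",
        "repository_url": "https://api.github.com/repos/example-org/example-repo",
        "created_at": "2024-03-01T10:00:00Z",
        "updated_at": "2024-03-02T10:00:00Z",
        "closed_at": None,
    },
    {
        "title": "Add new widget",
        "html_url": "https://github.com/example-org/example-repo/pull/2",
        "repository_url": "https://api.github.com/repos/example-org/example-repo",
        "created_at": "2024-03-03T10:00:00Z",
        "updated_at": "2024-03-04T10:00:00Z",
        "closed_at": "2024-03-04T10:00:00Z",
        "pull_request": {},  # this key is only present for pull requests
    },
]


def to_dataframe(items):
    """Flatten search results into one row per item. Hypothetical helper."""
    rows = [
        {
            "title": item["title"],
            "link": item["html_url"],
            "repo": item["repository_url"].split("/repos/", 1)[1],
            "type": "pull_request" if "pull_request" in item else "issue",
            "created_at": item["created_at"],
            "updated_at": item["updated_at"],
            "closed_at": item.get("closed_at"),
        }
        for item in items
    ]
    # The same item can match several queries, so keep one row per link
    return pd.DataFrame(rows).drop_duplicates(subset="link")


df = to_dataframe(sample_items)
df.to_csv("github-activity.csv", index=False)
```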
You can provide a .repoignore file to prevent results from specific repos turning up in the dataset.
This is a plain text file with one repository to ignore per line, each in the form ORG_OR_USER/REPO_NAME.
You can also use regular expressions here.
For example, the entry ORG_NAME/.* ignores a whole organisation.
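A filter like this could be implemented roughly as follows. The function names are hypothetical, and the use of re.fullmatch (so that a pattern must match the whole ORG_OR_USER/REPO_NAME string) is an assumption about the intended behaviour, not a description of get-data.py.

```python
import re


def load_ignore_patterns(text):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def is_ignored(repo, patterns):
    """True if any pattern matches the full ORG_OR_USER/REPO_NAME string."""
    return any(pattern.fullmatch(repo) for pattern in patterns)


# Example .repoignore content with made-up names
patterns = load_ignore_patterns("some-org/.*\nsomeone/some-repo\n")

print(is_ignored("some-org/any-repo", patterns))
print(is_ignored("someone/other-repo", patterns))
```

Because each line is treated as a regular expression, a literal dot in a repository name also matches any single character; escape it as \. if that matters.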
The get-data.py script is run in a GitHub Actions workflow on a regular cron trigger.
The workflow runs the script just as you would locally and commits the updated CSV file to the main branch.
The data are visualised using the activity-dashboard.ipynb and past-activity-summary.ipynb Jupyter Notebooks.
They each implement widgets to interact with the data so that users can filter by an individual repository and sort by time created, updated, or (in the past activity summary only) closed.
The Notebooks are executed with voila to give the dashboards a more aesthetically pleasing look.
The dashboards can be launched in Binder to generate a quick view without needing to use the repository locally.
Binder usually rebuilds the Docker image of the repository with every new commit it sees on the provided git reference.
However, since the CSV file is regularly updated, Binder was rebuilding far more often than it needed to: only the data were changing, not the Notebooks or the environment they require.
To reduce the number of rebuilds Binder needs to make, the requirements.txt file, which contains only the packages needed to run the Notebooks, has been separated out onto the notebook-env branch.
This is the branch we build with Binder.
We then use nbgitpuller to dynamically pull in the content from the main branch.
This results in a Binder environment that is only rebuilt when the Notebooks' requirements change, but still operates with the most up-to-date data from the main branch.
Binder needs BOTH the main branch and the notebook-env branch to operate in this way!
If you are using this project as a template or forking it, DO NOT remove the notebook-env branch without ALSO updating the Binder link!
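To show how the two branches fit together in a single link, the sketch below assembles a Binder URL that builds notebook-env and asks nbgitpuller to pull main. The binder_url helper and its parameter names are hypothetical; the URL layout follows the Binder badge links used in this README, where the nbgitpuller query is percent-encoded twice.

```python
from urllib.parse import quote


def binder_url(handle, notebook, repo="github-activity-dashboard",
               env_branch="notebook-env", content_branch="main"):
    """Build a Binder link that pulls content with nbgitpuller. Hypothetical helper."""
    # First level of encoding: the values inside the git-pull query
    repo_url = quote(f"https://github.com/{handle}/{repo}", safe="")
    voila_path = quote(f"/voila/render/{repo}/{notebook}", safe="")
    git_pull = f"git-pull?repo={repo_url}&urlpath={voila_path}&branch={content_branch}"
    # Second level: the whole git-pull query becomes Binder's urlpath value
    return (f"https://mybinder.org/v2/gh/{handle}/{repo}/{env_branch}"
            f"?urlpath={quote(git_pull, safe='')}")


print(binder_url("sgibson91", "activity-dashboard.ipynb"))
```

The env_branch appears in the path that Binder builds, while content_branch is only seen by nbgitpuller, which is why removing either branch breaks the link.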
- Create your own version of this repository by clicking the "Use this template" button at the top of this page. 🔥 Make sure to check the "Include all branches" box when creating your repo, as you will also need the notebook-env branch for the Binder links to work! 🔥 You can delete any other branches, except for main and notebook-env.
- Delete the github-activity.csv file from your repo. (It will be regenerated when the CI job next runs!)
- Delete the .repoignore file, or edit it to contain a list of repos you'd like excluded from the dataset, in the form ORG_OR_USER/REPO_NAME.
- Create a Personal Access Token with public_repo scope and add it as a repository secret called ACCESS_TOKEN.
- Edit the README and update the Binder badges at the top of the document, replacing all instances of {{ YOUR_GITHUB_HANDLE_HERE }} (including the {{}}!!!) with your GitHub handle in the below snippet:
  [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/{{ YOUR_GITHUB_HANDLE_HERE }}/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252F{{ YOUR_GITHUB_HANDLE_HERE }}%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Factivity-dashboard.ipynb%26branch%3Dmain) [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/{{ YOUR_GITHUB_HANDLE_HERE }}/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252F{{ YOUR_GITHUB_HANDLE_HERE }}%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Fpast-activity-summary.ipynb%26branch%3Dmain)
🚨 Be careful not to edit anything else in the URL! 🚨
To produce your github-activity.csv, you can either get started straight away by manually triggering the 'Update GitHub Activity' workflow, or wait for the cron job to run it for you.
Once the file has been added to your repo, click your edited Binder badges to see your dashboards!
This project requires a Python installation. Any minor version of Python 3 should suffice, but that hasn't been tested so proceed with caution!
The packages required to run this project are stored in requirements.txt and can be installed via pip:
pip install -r requirements.txt
- If you have not already done so, create a Personal Access Token with the public_repo scope.
- Add this as a variable called ACCESS_TOKEN to your shell environment:
  export ACCESS_TOKEN="PASTE YOUR TOKEN HERE"
- Run the Python script to generate the github-activity.csv file:
  python get-data.py
🚨 If you see the message "You are rate limited! 😱", you will need to wait ~1 hour before trying to run the script again 🚨
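If you would rather know how long to wait than guess, GitHub reports the state of the rate limit in the X-RateLimit-Remaining and X-RateLimit-Reset response headers (the latter as a Unix timestamp). The sketch below turns those two headers into a wait time; seconds_until_reset is a hypothetical helper and is not part of get-data.py.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the rate limit resets (0 if not limited)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0  # requests are still available
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0, int(reset_at - now))


# Example with made-up header values
headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700003600"}
print(seconds_until_reset(headers, now=1700000000))
```

A plain dict is case-sensitive, so the keys here must be spelled exactly as shown; header objects returned by HTTP libraries are usually case-insensitive.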
Once github-activity.csv has been generated, view the dashboards by running:
voila activity-dashboard.ipynb
voila past-activity-summary.ipynb
A browser window should open automatically.