Overhaul exercise for 2024 course #11

Merged on Aug 21, 2024 (38 commits)
Showing changes from 31 commits

Commits
04081c0
Begin change to Colored MNIST
adjavon Jul 25, 2024
84f0a44
Update README overview
adjavon Jul 25, 2024
354005a
wip: Add GAN script
adjavon Jul 25, 2024
14d8e72
wip: Update tasks, parts 1-3
adjavon Jul 25, 2024
4232d59
Add workflow for building notebooks
adjavon Jul 25, 2024
bfc68ba
Commit from GitHub Actions (Build Notebooks)
adjavon Jul 25, 2024
f43a1f5
Clean up tags for parts 1 and 2
adjavon Jul 25, 2024
690d2d0
Commit from GitHub Actions (Build Notebooks)
adjavon Jul 25, 2024
ecef44d
Add EMA to UNet and validate GAN
adjavon Jul 28, 2024
7afaef3
Restart training from checkpoint
adjavon Aug 6, 2024
4fc7a43
Add stargan figure
adjavon Aug 6, 2024
f12e6d8
Reduce hard-coding in viewing results
adjavon Aug 6, 2024
343e364
wip: Add explanations about the GAN trainig
adjavon Aug 6, 2024
3d887bc
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 6, 2024
b3d267d
wip: Add GAN training task
adjavon Aug 12, 2024
3327629
wip: Begin evaluation of the counterfactuals using classifier
adjavon Aug 12, 2024
03e6aaa
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 12, 2024
5e963df
wip: Add EMA to GAN training
adjavon Aug 12, 2024
846525f
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 12, 2024
702c0e3
wip: Add discriminative attribution
adjavon Aug 12, 2024
b4595ab
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 12, 2024
f864649
Finish style space, explanations, and conclusion
adjavon Aug 15, 2024
33a6110
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 15, 2024
c1a6e28
Fix numbering, missing todos, and plotting bug
adjavon Aug 15, 2024
544c6a7
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 15, 2024
559ccf9
Update setup script
adjavon Aug 16, 2024
14d5975
Merge branch '2024' of github.com:dlmbl/knowledge_extraction into 2024
adjavon Aug 16, 2024
5ccd575
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 16, 2024
12a6ff9
Fix enviroment creation script
afoix Aug 17, 2024
81652c5
update exercise number in the README.md
afoix Aug 17, 2024
4921a76
Commit from GitHub Actions (Build Notebooks)
afoix Aug 17, 2024
2599651
Ben/review (#12)
Ben-Salmon Aug 20, 2024
d759e63
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 20, 2024
83495ec
Fix exercise setup
adjavon Aug 20, 2024
7d92477
Move data to extras
adjavon Aug 20, 2024
b454546
Split loss plot
adjavon Aug 20, 2024
81751d2
Update README
adjavon Aug 20, 2024
0fca9ec
Commit from GitHub Actions (Build Notebooks)
adjavon Aug 20, 2024
32 changes: 32 additions & 0 deletions .github/workflows/build-notebooks.yaml
@@ -0,0 +1,32 @@
name: Build Notebooks
on:
  push:

jobs:
  run:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Install dependencies
        run: |
          python -m pip install -U pip
          python -m pip install jupytext nbconvert


      - name: Build notebooks
        run: |
          jupytext --to ipynb --update-metadata '{"jupytext":{"cell_metadata_filter":"all"}}' solution.py

          jupyter nbconvert solution.ipynb --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags solution --to notebook --output exercise.ipynb
          jupyter nbconvert solution.ipynb --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags task --to notebook --output solution.ipynb

      - uses: EndBug/add-and-commit@v9
        with:
          add: solution.ipynb exercise.ipynb
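
The two `nbconvert` calls in this workflow rely on cell tags: cells tagged `solution` are dropped to produce `exercise.ipynb`, and cells tagged `task` are dropped from `solution.ipynb`. As a rough illustration of that filtering rule, and not nbconvert's actual implementation, the sketch below applies it to a notebook represented as a plain dictionary. The function name `remove_tagged_cells` and the three-cell sample notebook are hypothetical.

```python
import copy


def remove_tagged_cells(notebook: dict, tag: str) -> dict:
    """Return a copy of `notebook` without the cells carrying `tag`.

    Assumes the notebook JSON layout in which a cell may list its tags
    under cell["metadata"]["tags"].
    """
    result = copy.deepcopy(notebook)
    result["cells"] = [
        cell
        for cell in result["cells"]
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return result


# Hypothetical notebook: shared text, a task stub, and the matching solution.
notebook = {
    "cells": [
        {"source": "# Part 1", "metadata": {}},
        {"source": "model = ...  # TODO", "metadata": {"tags": ["task"]}},
        {"source": "model = build_model()", "metadata": {"tags": ["solution"]}},
    ]
}

exercise = remove_tagged_cells(notebook, "solution")  # what students receive
solution = remove_tagged_cells(notebook, "task")      # reference answers
```

Untagged cells survive in both outputs, which is why shared explanations only need to be written once in `solution.py`.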
37 changes: 27 additions & 10 deletions README.md
@@ -1,19 +1,39 @@
-# Exercise 9: Explainable AI and Knowledge Extraction
+# Exercise 8: Explainable AI and Knowledge Extraction

## Overview
The goal of this exercise is to learn how to probe what a pre-trained classifier has learned about the data it was trained on.

We will be working with a simple example: a fun variation on the MNIST dataset, which you will have seen in previous exercises in this course.
Unlike regular MNIST, our dataset is classified not by number, but by color!

![CMNIST](assets/cmnist.png)

Review comment (Contributor):

@adjavon I have a question regarding the image. You explained that the dataset classifies by color, however, the different classes contain different colors. Is that OK? What is the logic behind the classes? 😄

Reply (Contributor Author):

Check the very end of the solutions notebook!! 😈 🌈 🙊


In this exercise, we will return to conventional, gradient-based attribution methods to see what they can tell us about what the classifier knows.
We will see that, even for such a simple problem, there is some information that these methods do not give us.
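
As a toy illustration of what a gradient-based attribution computes, the sketch below scores each input pixel as gradient times input. It is only a sketch under stated assumptions: the gradient is estimated with finite differences so that no autograd library is needed, and both `input_x_gradient` and `toy_score` are hypothetical names standing in for the classifier used in the exercise.

```python
from typing import Callable, List


def input_x_gradient(
    score: Callable[[List[float]], float],
    pixels: List[float],
    eps: float = 1e-4,
) -> List[float]:
    """Attribute `score(pixels)` to each pixel as gradient * input.

    The gradient is estimated with central finite differences; a real
    model would obtain it through backpropagation instead.
    """
    attributions = []
    for i, value in enumerate(pixels):
        up = list(pixels)
        down = list(pixels)
        up[i] = value + eps
        down[i] = value - eps
        gradient = (score(up) - score(down)) / (2 * eps)
        attributions.append(gradient * value)
    return attributions


# Hypothetical class score over a three-pixel "image": it rewards the first
# pixel, penalizes the second, and ignores the third.
def toy_score(pixels: List[float]) -> float:
    return 2.0 * pixels[0] - 1.0 * pixels[1]


attribution = input_x_gradient(toy_score, [0.5, 1.0, 0.8])
```

In this toy case the third pixel receives an attribution of approximately zero even though it is bright, because the score does not depend on it.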

We will then train a generative adversarial network, or GAN, to try to create counterfactual images.
These images are modified versions of the originals that are able to fool the classifier into thinking they come from a different class!
We will evaluate this GAN using our classifier: is it really able to change an image's class in a meaningful way?
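
One simple way to put a number on that question is the fraction of generated images that the classifier assigns to the class the GAN was asked to produce. The sketch below assumes the predictions have already been collected as integer labels. The function name `counterfactual_success_rate` and the sample labels are hypothetical, and the exercise notebook may use a different metric.

```python
from typing import List


def counterfactual_success_rate(predicted: List[int], target: List[int]) -> float:
    """Fraction of counterfactuals that the classifier assigns to the target class."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("need at least one counterfactual to evaluate")
    hits = sum(p == t for p, t in zip(predicted, target))
    return hits / len(predicted)


# Hypothetical labels: the classes the GAN was asked to produce, and what the
# classifier predicted for the generated images.
target_classes = [1, 2, 3, 0]
predicted_classes = [1, 2, 0, 0]
rate = counterfactual_success_rate(predicted_classes, target_classes)
```

Here three of the four counterfactuals land in their target class, so `rate` is 0.75.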

Finally, we will combine the two methods, attribution and counterfactual, to get a full explanation of what exactly the classifier is doing. We will find out whether it can teach us anything, and whether we should trust it!

If time permits, we will try to apply this all over again as a bonus exercise to a much more complex and more biologically relevant problem.

![synister](assets/synister.png)
## Setup

-Before anything else, in the super-repository called `DL-MBL-2023`:
+Before anything else, in the super-repository called `DL-MBL-2024`:
```
git pull
-git submodule update --init 09_knowledge_extraction
+git submodule update --init 08_knowledge_extraction
```

Then, if you have any other exercises still running, please save your progress and shut down those kernels.
This is a GPU-hungry exercise so you're going to need all the GPU memory you can get.

Next, run the setup script. It might take a few minutes.
```
-cd 09_knowledge_extraction
+cd 08_knowledge_extraction
source setup.sh
```
This will:
@@ -28,10 +48,7 @@ jupyter lab
```
...and continue with the instructions in the notebook.

## Overview

In this exercise we will:
1. Train a classifier to predict, from 2D EM images of synapses, which neurotransmitter is (mostly) used at that synapse
2. Use a gradient-based attribution method to try to find out what parts of the images contribute to the prediction
3. Train a CycleGAN to create counterfactual images
4. Run a discriminative attribution from counterfactuals
### Acknowledgments

This notebook was written by Jan Funke and modified by Tri Nguyen and Diane Adjavon, using code from Nils Eckstein and a modified version of the [CycleGAN](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) implementation.
Binary file added assets/cmnist.png
Binary file added assets/same_class_diff_color.png
Binary file added assets/same_color_diff_class.png
Binary file added assets/stargan.png
5 changes: 5 additions & 0 deletions create_environment.sh
@@ -0,0 +1,5 @@
# Contains the steps that I used to create the environment, kept for reference
mamba create -n 08_knowledge_extraction python=3.11 pytorch torchvision pytorch-cuda=12.1 -c conda-forge -c pytorch -c nvidia
mamba activate 08_knowledge_extraction
pip install -r requirements.txt
mamba env export > environment.yaml
Empty file removed dac/__init__.py
Empty file.
72 changes: 0 additions & 72 deletions dac/activations.py

This file was deleted.
