Merge pull request #6 from annahedstroem/bridge

Bridge
annahedstroem authored Sep 4, 2023
2 parents b58f58b + 0796e52 commit e59414e
Showing 24 changed files with 1,191 additions and 8,841 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/python-package.yml
@@ -38,9 +38,10 @@ jobs:
         # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
         flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
         # run mypy
-        mypy quantus
+        # mypy metaquantus
+        # run balck
-        black quantus
+        black metaquantus
     - name: Test with pytest
       run: |
         export PYTHONPATH=${PYTHONPATH}:.
         pytest
3 changes: 2 additions & 1 deletion .gitignore
@@ -9,4 +9,5 @@ dist/
 *.egg-info/
 .coverage*
 coverage.xml
-venvs/
+venvs/
+*.ipynb_checkpoints/
44 changes: 22 additions & 22 deletions README.md
@@ -125,61 +125,61 @@ To reproduce the results of this paper, you will need to follow these three step

 1. **Generate the dataset.** Run the notebook [
 Tutorial-Data-Generation-Experiments.ipynb](https://github.com/annahedstroem/MetaQuantus/blob/main/tutorials/Tutorial-Data-Generation-Experiments.ipynb) to generate the necessary data for the experiments. This notebook will guide you through the process of downloading and preprocessing the data in order to save it to appropriate test sets. Please store the models in a folder called `assets/models/` and the tests sets under `assets/test_sets/`.
-2. **Run the experiments.** To obtain the results for the respective experiments, you have to run the respective Python scripts which are detailed below. All these Python files are located in the `scripts/` folder. If you want to run the experiments on other explanation methods, datasets or models, feel free to change the hyperparameters.
-3. **Analyse the results.** Once the results are obtained for your chosen experiments, run the [Tutorial-Reproduce-Paper-Experiments.ipynb](https://github.com/annahedstroem/MetaQuantus/blob/main/tutorials/Tutorial-Reproduce-Experiments.ipynb) to analyse the results. (In the notebook itself, we have also listed which specific Python scripts that need to be run in order to obtain the results for this analysis step.)
+2. **Run the experiments.** To obtain the results for the respective experiments, you have to run the respective Python experiments which are detailed below. All these Python files are located in the `experiments/` folder. If you want to run the experiments on other explanation methods, datasets or models, feel free to change the hyperparameters.
+3. **Analyse the results.** Once the results are obtained for your chosen experiments, run the [Tutorial-Reproduce-Paper-Experiments.ipynb](https://github.com/annahedstroem/MetaQuantus/blob/main/tutorials/Tutorial-Reproduce-Experiments.ipynb) to analyse the results. (In the notebook itself, we have also listed which specific Python experiments that need to be run in order to obtain the results for this analysis step.)
 
 <details>
 <summary><b><normal>Additional details on step 2 (Run the Experiments)</normal></b></summary>
 
 **Test**: Go to the root folder and run a simple test that meta-evaluation work.
 ```bash
-python3 scripts/run_test.py --K=5 --iters=10 --dataset=MNIST
+python3 experiments/run_test.py --K=5 --iters=10 --dataset=MNIST
 ```
 
 **Application**: Run the benchmarking experiments (also used for category convergence analysis).
 ```bash
-python3 scripts/run_benchmarking.py --dataset=MNIST --fname=f --K=5 --iters=3
-python3 scripts/run_benchmarking.py --dataset=fMNIST --fname=f --K=5 --iters=3
-python3 scripts/run_benchmarking.py --dataset=cMNIST --fname=f --K=5 --iters=3
-python3 scripts/run_benchmarking.py --dataset=ImageNet --fname=ResNet18 --K=5 --iters=3 --batch_size=50 --start_idx_fixed=100 --end_idx_fixed=150 --reverse_order=False --folder=benchmarks_imagenet/ --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking.py --dataset=MNIST --fname=f --K=5 --iters=3
+python3 experiments/run_benchmarking.py --dataset=fMNIST --fname=f --K=5 --iters=3
+python3 experiments/run_benchmarking.py --dataset=cMNIST --fname=f --K=5 --iters=3
+python3 experiments/run_benchmarking.py --dataset=ImageNet --fname=ResNet18 --K=5 --iters=3 --batch_size=50 --start_idx_fixed=100 --end_idx_fixed=150 --reverse_order=False --folder=benchmarks_imagenet/ --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 **Application**: Run hyperparameter optimisation experiment.
 ```bash
-python3 scripts/run_hp.py --dataset=MNIST --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_hp.py --dataset=ImageNet --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_hp.py --dataset=MNIST --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_hp.py --dataset=ImageNet --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 **Experiment**: Run the faithfulness ranking disagreement exercise.
 ```bash
-python3 scripts/run_ranking.py --dataset=cMNIST --fname=f --K=5 --iters=3 --category=Faithfulness --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_ranking.py --dataset=cMNIST --fname=f --K=5 --iters=3 --category=Faithfulness --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 **Sanity-Check**: Run sanity-checking exercise: adversarial estimators.
 ```bash
-python3 scripts/run_sanity_checks.py --dataset=ImageNet --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_sanity_checks.py --dataset=ImageNet --K=3 --iters=2 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 **Sanity-Check**: Run sanity-checking exercise: L dependency.
 ```bash
-python3 scripts/run_l_dependency.py --dataset=MNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_l_dependency.py --dataset=fMNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_l_dependency.py --dataset=cMNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_l_dependency.py --dataset=MNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_l_dependency.py --dataset=fMNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_l_dependency.py --dataset=cMNIST --K=5 --iters=3 --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 **Benchmarking Transformers**: Run transformer benchmarking experiment.
 ```bash
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=0 --end_idx=40 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=40 --end_idx=80 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=80 --end_idx=120 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=120 --end_idx=160 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=0 --end_idx=40 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=40 --end_idx=80 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=80 --end_idx=120 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=120 --end_idx=160 --category=localisation --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 
 ```bash
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=40 --end_idx=80 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=0 --end_idx=40 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=80 --end_idx=120 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
-python3 scripts/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=120 --end_idx=160 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=40 --end_idx=80 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=0 --end_idx=40 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=80 --end_idx=120 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
+python3 experiments/run_benchmarking_transformers.py --dataset=ImageNet --K=5 --iters=3 --start_idx=120 --end_idx=160 --category=complexity --PATH_ASSETS=../assets/ --PATH_RESULTS=results/
 ```
 </details>

18 changes: 18 additions & 0 deletions experiments/experiment_kwargs/bridge_estimators_101.ini
@@ -0,0 +1,18 @@
+[DEFAULT]
+perturbation_levels=None
+nr_levels=5
+nr_samples=10
+x_noise=0.01
+abs=False
+normalise=True
+similarity_func=quantus.similarity_func.correlation_spearman
+measure_func=quantus.similarity_func.squared_difference
+normalise_func=quantus.normalise_func.normalise_by_average_second_moment_estimate
+return_aggregate=False
+disable_warnings=True
+display_progressbar=False
+xai_settings = {"MNIST": ["Saliency", "InputXGradient", "LayerGradCam", "GradientShap"],
+    "fMNIST": ["Saliency", "InputXGradient", "LayerGradCam", "GradientShap"],
+    "cMNIST": ["Gradient", "InputXGradient", "LayerGradCam"],
+    "ImageNet": ["Saliency", "InputXGradient", "GradientShap"],}
+std_max = {"MNIST": 2.0, "fMNIST": 2.0, "cMNIST": 0.75, "ImageNet": 0.5}
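
The values in this new kwargs file are Python expressions serialized as strings — booleans, numbers, dicts, and dotted references to `quantus` functions — so the consuming experiment script has to evaluate them after parsing. A minimal sketch of how such a file could be loaded, assuming `configparser` plus `eval`; the `load_experiment_kwargs` helper is a hypothetical illustration, not code from this PR:

```python
import configparser

import quantus  # assumed importable; the config values reference quantus.* functions


def load_experiment_kwargs(path: str) -> dict:
    """Hypothetical loader: parse an experiment kwargs .ini file into Python objects."""
    config = configparser.ConfigParser()
    config.read(path)
    kwargs = {}
    for key, raw in config["DEFAULT"].items():
        # Every value ('True', '0.01', 'quantus.similarity_func.correlation_spearman',
        # multi-line dicts, ...) is stored as a string; eval it with quantus in scope.
        kwargs[key] = eval(raw, {"quantus": quantus})
    return kwargs


kwargs = load_experiment_kwargs("experiments/experiment_kwargs/bridge_estimators_101.ini")
print(kwargs["nr_levels"], kwargs["std_max"]["ImageNet"])  # expected: 5 0.5
```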
12 changes: 12 additions & 0 deletions experiments/experiment_kwargs/bridge_estimators_101_hp.ini
@@ -0,0 +1,12 @@
+[DEFAULT]
+nr_models = [1, 5, 10]
+nr_levels = [2, 5, 10, 20]
+dist_funcs = {
+    "sq": quantus.similarity_func.squared_difference,
+    "cos": quantus.similarity_func.cosine,
+    "euc": quantus.similarity_func.distance_euclidean,
+    }
+simi_funcs = {
+    "pear": quantus.similarity_func.correlation_pearson,
+    "spear": quantus.similarity_func.correlation_spearman,
+    }
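
This second config declares a small hyperparameter search grid (3 × 4 × 3 × 2 = 72 combinations). A sketch of how the grid might be expanded into individual meta-evaluation settings once the values are parsed; the sweep loop and the `run_id` naming are illustrative assumptions, not taken from the repository:

```python
from itertools import product

import quantus  # assumed importable, as in the config above

# The grid as declared in bridge_estimators_101_hp.ini (after eval'ing the strings).
nr_models = [1, 5, 10]
nr_levels = [2, 5, 10, 20]
dist_funcs = {
    "sq": quantus.similarity_func.squared_difference,
    "cos": quantus.similarity_func.cosine,
    "euc": quantus.similarity_func.distance_euclidean,
}
simi_funcs = {
    "pear": quantus.similarity_func.correlation_pearson,
    "spear": quantus.similarity_func.correlation_spearman,
}

# One setting per grid point; a real script would configure and launch a run for each.
for n_models, n_levels, (d_name, d_func), (s_name, s_func) in product(
    nr_models, nr_levels, dist_funcs.items(), simi_funcs.items()
):
    run_id = f"m{n_models}_lv{n_levels}_{d_name}_{s_name}"
    settings = {
        "nr_models": n_models,
        "nr_levels": n_levels,
        "measure_func": d_func,
        "similarity_func": s_func,
    }
    print(run_id, sorted(settings))
```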
66 changes: 41 additions & 25 deletions scripts/run_benchmarking.py → experiments/run_benchmarking.py
@@ -50,24 +50,35 @@
 start_idx_fixed = eval(args.start_idx_fixed)
 PATH_ASSETS = str(args.PATH_ASSETS)
 PATH_RESULTS = str(args.PATH_RESULTS)
-print(dataset_name, K, iters, batch_size, fname, reverse_order, folder, start_idx_fixed, end_idx_fixed, PATH_ASSETS, PATH_RESULTS)
+print(
+    dataset_name,
+    K,
+    iters,
+    batch_size,
+    fname,
+    reverse_order,
+    folder,
+    start_idx_fixed,
+    end_idx_fixed,
+    PATH_ASSETS,
+    PATH_RESULTS,
+)
 
 #########
 # GPUs. #
 #########
 
 # Setting device on GPU if available, else CPU.
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-print("Using device:", device)
-print()
-print(torch.version.cuda)
+print("\nUsing device:", device)
+print(f"\t{torch.version.cuda}")
 
 # Additional info when using cuda.
 if device.type == "cuda":
-    print(torch.cuda.get_device_name(0))
-    print("Memory Usage:")
-    print("Allocated:", round(torch.cuda.memory_allocated(0) / 1024 ** 3, 1), "GB")
-    print("Cached: ", round(torch.cuda.memory_cached(0) / 1024 ** 3, 1), "GB")
+    print(f"\t{torch.cuda.get_device_name(0)}")
+    print("\tMemory Usage:")
+    print("\tAllocated:", round(torch.cuda.memory_allocated(0) / 1024 ** 3, 1), "GB")
+    print("\tCached: ", round(torch.cuda.memory_cached(0) / 1024 ** 3, 1), "GB")
 
 # Reduce the number of explanation methods and samples for ImageNet.
 if dataset_name == "ImageNet":
@@ -83,19 +94,19 @@
     dataset_name=dataset_name, path_assets=PATH_ASSETS, device=device
 )
 dataset_settings = {dataset_name: SETTINGS[dataset_name]}
-dataset_kwargs = dataset_settings[dataset_name]["estimator_kwargs"]
+estimator_kwargs = dataset_settings[dataset_name]["estimator_kwargs"]
 
 # Get analyser suite.
 analyser_suite = setup_test_suite(dataset_name=dataset_name)
 
 # Get estimators.
 estimators = setup_estimators(
-    features=dataset_kwargs["features"],
-    num_classes=dataset_kwargs["num_classes"],
-    img_size=dataset_kwargs["img_size"],
-    percentage=dataset_kwargs["percentage"],
-    patch_size=dataset_kwargs["patch_size"],
-    perturb_baseline=dataset_kwargs["perturb_baseline"],
+    features=estimator_kwargs["features"],
+    num_classes=estimator_kwargs["num_classes"],
+    img_size=estimator_kwargs["img_size"],
+    percentage=estimator_kwargs["percentage"],
+    patch_size=estimator_kwargs["patch_size"],
+    perturb_baseline=estimator_kwargs["perturb_baseline"],
 )
 
 estimators_sub = {
@@ -109,8 +120,8 @@
 # Get explanation methods.
 xai_methods = setup_xai_methods(
     gc_layer=dataset_settings[dataset_name]["gc_layers"][model_name],
-    img_size=dataset_kwargs["img_size"],
-    nr_channels=dataset_kwargs["nr_channels"],
+    img_size=estimator_kwargs["img_size"],
+    nr_channels=estimator_kwargs["nr_channels"],
 )
 
 ###########################
@@ -156,8 +167,9 @@
     }
 elif fname == "Deit":
     dataset_settings[dataset_name]["models"] = {
-        "Deit": timm.create_model(model_name='deit_tiny_distilled_patch16_224',
-                                  pretrained=True),
+        "Deit": timm.create_model(
+            model_name="deit_tiny_distilled_patch16_224", pretrained=True
+        ),
     }
 
 # Prepare batching.
@@ -172,7 +184,7 @@

     # Get indicies.
     end_idx = min(int(start_idx + batch_size), nr_samples)
-    if (end_idx-start_idx) < batch_size:
+    if (end_idx - start_idx) < batch_size:
         continue
 
     if end_idx_fixed:
@@ -192,9 +204,15 @@
     )
 
     # Reduce the number of samples.
-    dataset_settings[dataset_name]["x_batch"] = dataset_settings[dataset_name]["x_batch"][start_idx:end_idx]
-    dataset_settings[dataset_name]["y_batch"] = dataset_settings[dataset_name]["y_batch"][start_idx:end_idx]
-    dataset_settings[dataset_name]["s_batch"] = dataset_settings[dataset_name]["s_batch"][start_idx:end_idx]
+    dataset_settings[dataset_name]["x_batch"] = dataset_settings[dataset_name][
+        "x_batch"
+    ][start_idx:end_idx]
+    dataset_settings[dataset_name]["y_batch"] = dataset_settings[dataset_name][
+        "y_batch"
+    ][start_idx:end_idx]
+    dataset_settings[dataset_name]["s_batch"] = dataset_settings[dataset_name][
+        "s_batch"
+    ][start_idx:end_idx]
 
     # Benchmark!
     benchmark = MetaEvaluationBenchmarking(
@@ -212,5 +230,3 @@

     if start_idx_fixed is not None:
         break
-
-
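
For orientation, the batching logic touched by this file's diff slices the test set into `batch_size`-sized index windows, skips a trailing partial window, and lets `--start_idx_fixed`/`--end_idx_fixed` pin the loop to one specific window. A condensed, self-contained sketch of that control flow with simplified stand-in values (the real script reads them from the CLI and runs the benchmark per window):

```python
# Condensed sketch of the batching control flow in run_benchmarking.py.
nr_samples = 160                           # size of the loaded test set
batch_size = 50
start_idx_fixed, end_idx_fixed = 100, 150  # None unless passed on the CLI

for start_idx in range(0, nr_samples, batch_size):
    end_idx = min(start_idx + batch_size, nr_samples)
    if (end_idx - start_idx) < batch_size:
        continue  # skip a trailing window smaller than batch_size

    # Fixed indices override the loop's own window.
    if end_idx_fixed is not None:
        end_idx = end_idx_fixed
    if start_idx_fixed is not None:
        start_idx = start_idx_fixed

    print(f"benchmarking samples [{start_idx}:{end_idx})")

    if start_idx_fixed is not None:
        break  # only the single fixed window is processed
```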
(Diffs for the remaining 18 changed files were not loaded.)