
InterpretationFragility

Code for the implementation of "Interpretation of Neural Networks is Fragile" (AAAI 2019).

Please cite the following work if you use this benchmark or the provided tools or implementations:

```bibtex
@inproceedings{ghorbani2019interpretation,
  title={Interpretation of neural networks is fragile},
  author={Ghorbani, Amirata and Abid, Abubakar and Zou, James},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={33},
  pages={3681--3688},
  year={2019}
}
```


License

This project is licensed under the MIT License - see the LICENSE.md file for details

Large-scale results of the attack methods against four well-known feature-attribution methods

Examples of targeted attacks producing semantically meaningful changes in feature importance

Attack examples on Deep Taylor Decomposition

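The attacks illustrated above perturb an input slightly so that its feature-importance map changes while the model's prediction stays essentially the same. Below is a minimal, self-contained sketch of that top-k idea, using a hypothetical toy two-layer network and numerical gradients rather than the repository's actual models, attack losses, or hyper-parameters:

```python
import numpy as np

# Hypothetical toy model and hyper-parameters, for illustration only;
# the repository's attacks target real CNNs with analytic gradients.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 20))
b1 = rng.normal(size=8)
W2 = rng.normal(size=8)

def predict(x):
    """Scalar output of a tiny two-layer tanh network."""
    return W2 @ np.tanh(W1 @ x + b1)

def saliency(x, eps=1e-5):
    """Gradient saliency |df/dx_i|, estimated by central differences."""
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (predict(x + d) - predict(x - d)) / (2 * eps)
    return np.abs(g)

def topk_attack(x, k=5, steps=40, lr=0.05, budget=0.3):
    """Perturb x within an L-inf budget to push saliency mass off the
    original top-k features, rejecting steps that flip the prediction."""
    orig_top = np.argsort(saliency(x))[-k:]
    orig_sign = np.sign(predict(x))
    x_adv = x.copy()
    for _ in range(steps):
        # Loss = saliency mass still on the original top-k features.
        def loss(z):
            return saliency(z)[orig_top].sum()
        g = np.zeros_like(x_adv)
        fd = 1e-4
        for i in range(x_adv.size):
            d = np.zeros_like(x_adv)
            d[i] = fd
            g[i] = (loss(x_adv + d) - loss(x_adv - d)) / (2 * fd)
        step = lr * g / (np.linalg.norm(g) + 1e-12)
        # Gradient descent on the loss, projected back into the budget.
        x_new = np.clip(x_adv - step, x - budget, x + budget)
        if np.sign(predict(x_new)) == orig_sign:  # keep prediction fixed
            x_adv = x_new
    return x_adv

x = rng.normal(size=20)
x_adv = topk_attack(x)
```

The rejection step is a crude stand-in for the paper's constraint that the classifier's output stay unchanged; by construction the perturbation stays inside the L-inf budget and preserves the prediction's sign.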
