Official implementation of the EMNLP 2021 paper SPECTRA: Sparse Structured Text Rationalization.
Nuno M Guerreiro and André Martins
Abstract: Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences). Commonly, rationales are modeled as stochastic binary masks, requiring sampling-based gradient estimators, which complicates training and requires careful hyperparameter tuning. Sparse attention mechanisms are a deterministic alternative, but they lack a way to regularize the rationale extraction (e.g., to control the sparsity of a text highlight or the number of alignments). In this paper, we present a unified framework for deterministic extraction of structured explanations via constrained inference on a factor graph, forming a differentiable layer. Our approach greatly eases training and rationale regularization, generally outperforming previous work on what comes to performance and plausibility of the extracted rationales. We further provide a comparative study of stochastic and deterministic methods for rationale extraction for classification and natural language inference tasks, jointly assessing their predictive power, quality of the explanations, and model variability.
If you use this code in your work, please cite our paper.
- Paper (arXiv)
All material is made available under the MIT license. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.
This code was tested on Python 3.8.2. To install, follow these steps:
- In a virtual environment, first install Cython:
pip install cython
- Clone the Eigen repository to the main folder:
git clone git@gitlab.com:libeigen/eigen.git
- Clone the LP-SparseMAP fork repository to main folder, and follow the installation instructions found there (for reproducibility purposes, please use this commit)
- Install PyTorch:
pip install torch
(we used version 1.6.0)
- Install the requirements:
pip install -r requirements.txt
- Install the spectra-rationalization package:
pip install .
(or in editable mode, if you want to make changes: pip install -e .)
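After the steps above, a quick import check can confirm that the environment is usable. The following is a minimal sketch, not part of the repository: the import names `lpsmap` (for the LP-SparseMAP fork) and `rationalizers` (for the spectra-rationalization package) are assumptions and may need adjusting.

```python
import importlib.util
import sys

# Top-level import names to look for. "lpsmap" and "rationalizers" are
# ASSUMED names for the LP-SparseMAP fork and this package, respectively.
REQUIRED = ["Cython", "torch", "lpsmap", "rationalizers"]


def missing_packages(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The README reports testing on Python 3.8.2 and PyTorch 1.6.0.
    print("Python", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If a name is reported as missing, the corresponding installation step probably did not complete in the active virtual environment.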
We have used Hugging Face Datasets to get the data for our models. All data modules can be found in rationalizers/data_modules. If you wish to use a dataset that you cannot find on Datasets, please refer to rationalizers/custom_hf_datasets and follow the examples there. The data is downloaded automatically when you start training and will remain cached.
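To inspect a dataset outside of training, it can be loaded directly with Hugging Face Datasets. This is a hedged sketch: the helper `load_split` and the hub identifiers in `ASSUMED_HUB_IDS` are illustrative assumptions, and the repository's own data modules may use different identifiers or preprocessing.

```python
# Hub identifiers are ASSUMPTIONS for illustration; check the data modules
# in rationalizers/data_modules for the ones actually used.
ASSUMED_HUB_IDS = {"agnews": "ag_news", "imdb": "imdb"}


def load_split(name, split="train"):
    """Load one split of a dataset via Hugging Face Datasets.

    The first call downloads the data; later calls reuse the local cache.
    """
    hub_id = ASSUMED_HUB_IDS[name]  # raises KeyError for unknown names
    # Imported lazily so this file can be read without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(hub_id, split=split)
```

For example, `load_split("agnews", split="test")` would return the test split as a `Dataset` object, fetching it over the network only if it is not already cached.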
To train a model, you need to define a .yaml config. We have made several of them available in /configs/. Each config includes all relevant hyperparameters for that run. Below, we show some examples:
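To see which configs ship with the repository, the configs folder can be scanned for .yaml files. This is a small sketch using only the standard library; `list_configs` is a hypothetical helper, not something the package provides.

```python
from pathlib import Path


def list_configs(root="configs"):
    """Return every .yaml file under `root` as a sorted list of relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.yaml"))


if __name__ == "__main__":
    # Run from the repository's main folder so that "configs" resolves.
    for config in list_configs():
        print(config)
```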
Train SPECTRA on AgNews:
python3 -W ignore rationalizers train --config configs/agnews/agnews_spectra
Train HardKuma on IMDB:
python3 -W ignore rationalizers train --config configs/imdb/imdb_hardkuma
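When several runs need to be launched in sequence, the commands above can be assembled programmatically. The sketch below only mirrors the command lines shown in this README; `build_command` and `run_all` are hypothetical helpers, and `dry_run` defaults to printing the commands rather than executing them.

```python
import subprocess


def build_command(task, config, ckpt=None):
    """Assemble the argument list for one `rationalizers` invocation."""
    cmd = ["python3", "-W", "ignore", "rationalizers", task, "--config", config]
    if ckpt is not None:
        cmd += ["--ckpt", ckpt]
    return cmd


def run_all(configs, dry_run=True):
    """Train on each config in turn; with dry_run=True, only print the commands."""
    for config in configs:
        cmd = build_command("train", config)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop if a training run exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(["configs/agnews/agnews_spectra", "configs/imdb/imdb_hardkuma"])
```

Passing the arguments as a list avoids shell quoting issues if a config path contains spaces.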
To test a model, you can use the same .yaml config you used for training. After training, the test set is run automatically. However, if you want to run the test loop afterwards, you may run:
Test SPECTRA on AgNews:
python3 -W ignore rationalizers predict --config configs/agnews/agnews_spectra --ckpt {ckpt_path}
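The `{ckpt_path}` placeholder has to be filled in by hand. If checkpoints are saved as files under a known folder, the most recent one can be located as sketched below. Both the folder and the `*.ckpt` pattern are assumptions here; this README does not state where checkpoints are written or how they are named.

```python
from pathlib import Path


def latest_checkpoint(directory, pattern="*.ckpt"):
    """Return the most recently modified file matching `pattern`, or None.

    `directory` and `pattern` are ASSUMPTIONS: adjust them to wherever your
    training run actually stores its checkpoints.
    """
    candidates = list(Path(directory).rglob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be substituted for `{ckpt_path}` in the command above.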
If you want to resume training from a given checkpoint, you will need to change your .yaml config to include resume args (you may copy the predict ones). Then, you may run:
Resume SPECTRA on AgNews:
python3 -W ignore rationalizers resume --config configs/agnews/agnews_spectra --ckpt {ckpt_path}
We want to thank Marcos Treviso for helping to start this codebase. We are also very grateful to Vlad Niculae for all the help he provided regarding the use of the LP-SparseMAP library. We also thank Jasmijn Bastings, because the code in this repository was inspired by the structure and implementations in Interpretable Predictions.