The code for the SelfExplain framework (https://arxiv.org/abs/2103.12279)
Currently, this repo supports SelfExplain-XLNet and SelfExplain-RoBERTa version for SST-2 dataset, SST-5 dataset, and SUBJ dataset. We have also tested it with CoLA, which only RoBERTa provide reasonable performance because sentences in the CoLA are too short for XLNet.
Data for preprocessing available in data/
folder
On a python shell, do the following for installing the parser
>>> import benepar
>>> benepar.download('benepar_en3')
sh scripts/run_preprocessing.sh
For preprocessing, we want to point out that we will need to adjust the hyperparameters on the top. We have created two separate folders in data folder: RoBERTa-SST-2 and XLNet-SST-2. We expect users follow this practice because concept store are unique for each Transformer-based classifier and each dataset.
Please comfirm DATA_FOLDER is the correct path.
Please comfirm TOKENIZER_NAME is the correct tokenizer you would like to use. (roberta-base or
xlnet-base-cased).
Please comfirm MAX_LENGTH because this will affect the number of concepts. If MAX_LENGTH is
small and average length for dataset is long, you may end up in training errors.
Example:
export DATA_FOLDER='data/SST-2-XLNet'
export TOKENIZER_NAME='xlnet-base-cased'
export MAX_LENGTH=5
Note if you wish to parse test.tsv please edit process_trec_dataset.py at line 57. Note we have provided data for SST-2 and SUBJ.
For training, please edit data path and control other parameters.
sh scripts/run_self_explain.sh
Example:
python model/run.py --dataset_basedir data/RoBERTa-SST-2 \
--lr 2e-5 --max_epochs 5 \
--gpus 1 \
--model_name roberta-base \
--concept_store data/RoBERTa-SST-2/concept_store.pt \
--topk 5 \
--gamma 0.1 \
--lamda 0.1
Note the specified model_name should accord with the tokenizer used in the pre-processing stage.
The Original author claims this is in developing setting. We have utilized it and it works well.
python model/infer_model.py
--ckpt $PATH_TO_BEST_DEV_CHECKPOINT \
--concept_map $DATA_FOLDER/concept_idx.json \
--batch_size $BS \
--paths_output_loc $PATH_TO_OUTPUT_PREDS \
--dev_file $PATH_TO_DEV_FILE
Example:
python model/infer_model.py
--ckpt lightning_logs/version_3/checkpoints/epoch=2-step=1499-val_acc_epoch=0.9570.ckpt \
--concept_map data/RoBERTa-SST-2/concept_idx.json \
--paths_output_loc result/result_roberta_7.csv \
--dev_file data/RoBERTa-SST-2/dev_with_parse.json \
--batch_size 16
@misc{rajagopal2021selfexplain,
title={SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers},
author={Dheeraj Rajagopal and Vidhisha Balachandran and Eduard Hovy and Yulia Tsvetkov},
year={2021},
eprint={2103.12279},
archivePrefix={arXiv},
primaryClass={cs.CL}
}