Salience-aware-Learning

Code for our paper "Table-based Fact Verification with Salience-aware Learning", published in Findings of EMNLP 2021.

Installation

pip install -r requirements.txt

Install pytorch_scatter.

Data

We conduct experiments on the TabFact dataset. The statements in the officially released train/val/test sets are lemmatized; we use the raw (unlemmatized) statements instead. More discussion can be found in this issue.

Download the train/val/test set to ./data.

Download the table set to ./data/tables.

To convert raw data to model inputs:

cd data
python preprocess.py

Token Salience Detection

cd token_salience

First, run bash run_origin.sh to get predictions for original inputs.
Second, run bash run_masked.sh to get predictions for inputs with masked tokens.
Third, run python calculate_salience.py to get salience scores by comparing the outputs of the previous two steps.
Finally, run python add_salience_to_data.py to merge the salience scores into input data.
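
The comparison in the third step can be pictured with a minimal sketch. It is not the code in calculate_salience.py: the function name, the input layout, and the scoring rule (drop in probability of the originally predicted label when a token is masked) are assumptions made for illustration only.

```python
from typing import Dict, List


def salience_scores(
    origin_probs: List[float],
    masked_probs: Dict[int, List[float]],
) -> Dict[int, float]:
    """Hypothetical scoring rule: a token is salient if masking it lowers
    the probability of the label predicted for the original input.

    origin_probs: class probabilities for the unmasked statement.
    masked_probs: maps a token position to the class probabilities obtained
                  when the token at that position is masked.
    """
    # Label the model predicted for the original (unmasked) input.
    predicted = max(range(len(origin_probs)), key=origin_probs.__getitem__)
    # Salience = probability drop on that label caused by masking the token.
    return {
        pos: origin_probs[predicted] - probs[predicted]
        for pos, probs in masked_probs.items()
    }


# Toy example with two classes (refuted, entailed) and two masked positions.
origin = [0.2, 0.8]
masked = {0: [0.25, 0.75], 3: [0.7, 0.3]}
scores = salience_scores(origin, masked)
most_salient = max(scores, key=scores.get)
```

In this toy input, masking position 3 changes the prediction far more than masking position 0, so position 3 receives the higher score. The actual scripts may use a different measure.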

Non-salient Token Replacement

cd token_replacement

First, run bash run_mlm.sh to get predictions for replacing non-salient tokens.
Second, run python add_token_replacement.py to merge the token replacement candidates into input data.
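
As a rough illustration of what replacing non-salient tokens could look like once salience scores and MLM candidates are available, here is a minimal sketch. It does not reproduce add_token_replacement.py: the function name, the data layout, and the selection rule (replace the lowest-scoring positions that have a candidate) are hypothetical.

```python
from typing import Dict, List


def replace_non_salient(
    tokens: List[str],
    scores: List[float],
    candidates: Dict[int, List[str]],
    num_replace: int = 1,
) -> List[str]:
    """Hypothetical helper: swap the least salient tokens for their top
    MLM candidate and leave every other token unchanged.

    tokens:      statement tokens.
    scores:      one salience score per token (higher = more salient).
    candidates:  maps a token position to replacement candidates,
                 best candidate first.
    num_replace: how many positions to replace.
    """
    # Positions that have at least one candidate, least salient first.
    replaceable = sorted(
        (pos for pos, cands in candidates.items() if cands),
        key=lambda pos: scores[pos],
    )
    chosen = set(replaceable[:num_replace])
    return [
        candidates[i][0] if i in chosen else tok
        for i, tok in enumerate(tokens)
    ]


# Toy example: "the" and "games" have low salience and MLM candidates.
tokens = ["the", "team", "won", "five", "games"]
scores = [0.01, 0.10, 0.40, 0.55, 0.05]
candidates = {0: ["a"], 4: ["matches"]}
augmented = replace_non_salient(tokens, scores, candidates, num_replace=2)
single = replace_non_salient(tokens, scores, candidates, num_replace=1)
```

The salient tokens ("won", "five") are never touched here because they have no candidates and the highest scores, which is the property the section title suggests; the repository's scripts decide the actual thresholds and candidate selection.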

Joint Fact Verification and Salient Token Prediction

cd joint_model
bash run_joint_model.sh

Citing

@inproceedings{wang-etal-2021-table-based,
    title = "Table-based Fact Verification With Salience-aware Learning",
    author = "Wang, Fei  and
      Sun, Kexuan  and
      Pujara, Jay  and
      Szekely, Pedro  and
      Chen, Muhao",
    booktitle = "EMNLP - findings",
    year = "2021",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.338",
    pages = "4025--4036"
}
