- If `gcc` does not exist on your system, install build-essential: `sudo apt-get install build-essential`
- Create and activate a conda environment: `conda create -n nlp_atk python=3.9.7` and `conda activate nlp_atk`
- Install textattack: `cd TextAttack; pip install -e '.[tensorflow]' --extra-index-url https://download.pytorch.org/whl/cu113`
- Download omw-1.4: `cd ..; python download.py`
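Before running the setup steps, it can help to confirm that the tools they rely on are on `PATH`. The following is a minimal sketch of such a check; `missing_cmds` is a hypothetical helper name introduced here for illustration and is not part of this repository.

```shell
# Report which of the given commands are missing from PATH.
# Prints one line per missing command; returns 1 if any is missing, else 0.
# NOTE: missing_cmds is a hypothetical helper, not part of this repository.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Check the tools used by the setup steps (gcc for building, conda for the environment).
if missing_cmds gcc conda; then
    echo "all prerequisites found"
fi
```

If `gcc` is reported as missing, the build-essential step applies; if `conda` is missing, it needs to be installed before the environment can be created.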
- `--recipe` : full attack recipe (e.g. `bayesattack-wordnet`).
- `--random-seed` : random seed.
- `--num-examples` : the number of examples to process.
- `--sidx` : start index of the dataset.
- `--pkl-dir` : directory to save budget information.
- `--use-sod` : use this option for sod dataset sampling.
- `--post-opt` : use `v3` to enable the post-optimization algorithm in our paper.
- `--dpp-type` : `dpp_posterior` for batch update via DPP.
- `--max-budget-key-type` : the name of the target baseline to compare against. This option sets the maximum query budget of our method to that of the target baseline. One of ['pwws','textfooler','pso','lsh','bae'].
- `--max-loop` : the number of loops of BBA (`5` in the default setting).
- `--fit-iter` : the number of update steps in GP parameter fitting (`3` in the default setting).
- `--max-patience` : query budget for post optimization.
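To show how the options above combine into a single run, here is a small sketch that prints one BBA command from four parameters. `bba_command` is a hypothetical helper introduced for illustration; the fixed flag values (500 examples, seed 0, `RESULTS`, `v3`, `dpp_posterior`) are copied from the commands in this README rather than being defaults of the tool.

```shell
# Print the textattack command for one BBA run (the command is printed, not executed).
# Arguments: <search space: wordnet|embedding|hownet> <model> <baseline> <max-patience>
# NOTE: bba_command is a hypothetical helper, not part of this repository.
bba_command() {
    echo "textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0" \
         "--recipe bayesattack-$1 --model $2 --num-examples 500 --sidx 0" \
         "--pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior" \
         "--max-budget-key-type $3 --max-patience $4"
}

# Example: BBA on the WordNet search space against BERT on Movie Review,
# with the query budget matched to PWWS.
bba_command wordnet bert-base-uncased-mr pwws 50
```

Printing the command instead of running it keeps the sketch safe to try; the output can be inspected and then pasted into a shell.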
To reproduce the results of the baseline methods in Table 1, we provide some example commands.
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --product-space --recipe pwws --model bert-base-uncased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --product-space --recipe textfooler --model lstm-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --product-space --recipe pso --model xlnet-base-cased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS
Categories
- recipe : pwws, textfooler, pso
- model : bert-base-uncased-mr, bert-base-uncased-ag-news, lstm-mr, lstm-ag-news, xlnet-base-cased-mr, xlnet-base-cased-ag-news
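The recipe and model categories above can be combined mechanically. The sketch below prints one baseline command per (recipe, model) pair, using the same fixed flags as the example commands; `baseline_commands` is a hypothetical helper name, and the commands are only printed, not executed.

```shell
# Recipes and models taken from the categories listed in this README.
recipes="pwws textfooler pso"
models="bert-base-uncased-mr bert-base-uncased-ag-news lstm-mr lstm-ag-news xlnet-base-cased-mr xlnet-base-cased-ag-news"

# Print one baseline command per (recipe, model) pair.
# NOTE: baseline_commands is a hypothetical helper, not part of this repository.
baseline_commands() {
    for recipe in $recipes; do
        for model in $models; do
            echo "textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --product-space --recipe $recipe --model $model --num-examples 500 --sidx 0 --pkl-dir RESULTS"
        done
    done
}

baseline_commands
```

This yields 3 recipes × 6 models = 18 commands. Not every combination necessarily corresponds to a row in Table 1, so select the ones you need before running them.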
We provide commands to reproduce our results in Table 1. The `max-patience` value used in our experiment can be found in Table 7.
Dataset | Model | Method | ASR (%) | MR (%) | Qrs |
---|---|---|---|---|---|
AG | BERT-base | PWWS | 57.1 | 18.3 | 367 |
AG | BERT-base | BBA | 77.4 | 17.8 | 217 |
AG | LSTM | PWWS | 78.3 | 16.4 | 336 |
AG | LSTM | BBA | 83.2 | 15.4 | 190 |
MR | XLNet-base | PWWS | 83.9 | 14.4 | 143 |
MR | XLNet-base | BBA | 87.8 | 14.4 | 77 |
MR | BERT-base | PWWS | 82.0 | 15.0 | 143 |
MR | BERT-base | BBA | 88.3 | 14.6 | 94 |
MR | LSTM | PWWS | 94.2 | 13.3 | 132 |
MR | LSTM | BBA | 94.2 | 13.0 | 67 |
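One way to read the Qrs column is as a relative saving in queries. As an illustration, the sketch below computes the percentage reduction from a baseline query count to a BBA query count; `query_reduction` is a hypothetical helper, and the inputs are taken from the AG / BERT-base rows above (367 for PWWS, 217 for BBA).

```shell
# Percentage reduction in queries going from the baseline count ($1) to the BBA count ($2).
# NOTE: query_reduction is a hypothetical helper, not part of this repository.
query_reduction() {
    awk -v base="$1" -v ours="$2" 'BEGIN { printf "%.1f\n", 100 * (base - ours) / base }'
}

# AG / BERT-base: PWWS uses 367 queries, BBA uses 217.
query_reduction 367 217   # prints 40.9
```

The same call works for any pair of rows in the tables of this README.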
BERT (AG's News, WordNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model bert-base-uncased-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 50
LSTM (AG's News, WordNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model lstm-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 50
XLNet (Movie Review, WordNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model xlnet-base-cased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 100
BERT (Movie Review, WordNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model bert-base-uncased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 50
LSTM (Movie Review, WordNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-wordnet --model lstm-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pwws --max-patience 50
Dataset | Model | Method | ASR (%) | MR (%) | Qrs |
---|---|---|---|---|---|
AG | BERT-base | TF | 84.7 | 24.9 | 346 |
AG | BERT-base | BBA | 96.0 | 18.9 | 154 |
AG | LSTM | TF | 94.9 | 17.3 | 228 |
AG | LSTM | BBA | 98.5 | 16.6 | 142 |
MR | XLNet-base | TF | 95.0 | 18.0 | 101 |
MR | XLNet-base | BBA | 96.3 | 16.2 | 68 |
MR | BERT-base | TF | 89.2 | 20.0 | 115 |
MR | BERT-base | BBA | 95.7 | 16.9 | 67 |
MR | LSTM | TF | 98.2 | 13.6 | 72 |
MR | LSTM | BBA | 98.2 | 13.1 | 54 |
BERT-base (AG's News, Embedding)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-embedding --model bert-base-uncased-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type textfooler --max-patience 20
LSTM (AG's News, Embedding)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-embedding --model lstm-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type textfooler --max-patience 20
XLNet-base (Movie Review, Embedding)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-embedding --model xlnet-base-cased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type textfooler --max-patience 20
BERT-base (Movie Review, Embedding)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-embedding --model bert-base-uncased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type textfooler --max-patience 20
LSTM (Movie Review, Embedding)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-embedding --model lstm-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type textfooler --max-patience 20
Dataset | Model | Method | ASR (%) | MR (%) | Qrs |
---|---|---|---|---|---|
AG | BERT-base | PSO | 67.2 | 21.2 | 65860 |
AG | BERT-base | BBA | 70.8 | 15.5 | 5176 |
AG | LSTM | PSO | 71.0 | 19.7 | 44956 |
AG | LSTM | BBA | 71.9 | 13.7 | 3278 |
MR | XLNet-base | PSO | 91.3 | 18.6 | 4504 |
MR | XLNet-base | BBA | 91.3 | 11.7 | 321 |
MR | BERT-base | PSO | 90.9 | 17.3 | 6299 |
MR | BERT-base | BBA | 90.9 | 12.4 | 403 |
MR | LSTM | PSO | 94.4 | 15.3 | 2030 |
MR | LSTM | BBA | 94.4 | 11.2 | 138 |
BERT-base (AG's News, HowNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-hownet --model bert-base-uncased-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pso --max-patience 100
LSTM (AG's News, HowNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-hownet --model lstm-ag-news --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pso --max-patience 100
XLNet-base (Movie Review, HowNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-hownet --model xlnet-base-cased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pso --max-patience 100
BERT-base (Movie Review, HowNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-hownet --model bert-base-uncased-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pso --max-patience 100
LSTM (Movie Review, HowNet)
textattack attack --silent --shuffle --shuffle-seed 0 --random-seed 0 --recipe bayesattack-hownet --model lstm-mr --num-examples 500 --sidx 0 --pkl-dir RESULTS --post-opt v3 --use-sod --dpp-type dpp_posterior --max-budget-key-type pso --max-patience 100
For the IMDB, Yelp, and NLI experiments, refer to the commands in exp_imdb.txt, exp_yelp.txt, and exp_nli.txt.
@inproceedings{leeICML22,
title = {Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization},
author = {Lee, Deokjae and Moon, Seungyong and Lee, Junhyeok and Song, Hyun Oh},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2022}
}