Generating adversarial examples by stacking multiple augmentation methods automatically.
About • Setup • Main Usage • Other Usage • Design
Augmentation Strategy Optimization for Language Understanding is a Python framework for adversarial attacks, data augmentation, and model training in NLP. Stacked Data Augmentation (SDA) stacks different augmentation methods automatically using reinforcement learning.
You should be running Python 3.6.13 to use this package. A CUDA-compatible GPU is optional but will greatly speed up training.
SDA can be installed directly from GitHub. For a basic install, run:
git clone https://github.com/BigPigKing/Adversarial_Data_Boost.git
cd Adversarial_Data_Boost
pip3 install -r requirements.txt
rsync -avh -e 'ssh -p 12030' [email protected]:/home/god/lab/Adversarial_Data_Boost/data . # fetch the datasets
rsync -avh -e 'ssh -p 12030' [email protected]:/home/god/lab/Intergration/model_record . # fetch the recorded model weights
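After installation, the environment can be sanity-checked from Python (a minimal sketch; it assumes PyTorch is among the installed requirements):

```python
# Verify the interpreter version and GPU visibility before training.
# Assumes PyTorch is installed via requirements.txt.
import sys
import torch

print(sys.version.split()[0])     # expect 3.6.13
print(torch.cuda.is_available())  # True if a CUDA-compatible GPU is usable
```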
The procedure for training with SDA from scratch can be divided into four steps.
SDA provides six datasets for testing: SST-2, SST-5, MPQA, TREC-6, CR, and SUBJ.
To work with a specific dataset, load the corresponding model config JSON file provided in the model_configs directory:
rm model_config.json
cp model_configs/sst2_model_config.json model_config.json
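To confirm that the copy took effect, the active configuration can be inspected from Python (a minimal sketch; the exact keys inside model_config.json are repo-specific and not shown in this README):

```python
# Pretty-print the currently active model configuration.
import json

with open("model_config.json") as f:
    config = json.load(f)

print(json.dumps(config, indent=2, sort_keys=True))
```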
To perform adversarial training, we first need to train a text model from scratch on clean data:
python3 sst_complete.py
Wait until the training is finished. If the process proceeds to the REINFORCE stage, press Ctrl-C to cancel it; otherwise, the generator and the discriminator will interact for one round of training.
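Conceptually, each generator/discriminator round follows the standard REINFORCE recipe: the generator (policy) samples a stack of augmentations, the text model is evaluated on the augmented data, and the policy is rewarded when the model struggles. A hypothetical sketch of such an update (the actual loop lives in sst_complete.py and may differ):

```python
# Hypothetical REINFORCE update for the augmentation-stacking policy.
# log_probs: log-probabilities of the sampled augmentation actions
# rewards:   e.g. the text model's loss on the augmented examples
import torch

def reinforce_update(optimizer, log_probs, rewards):
    rewards = torch.tensor(rewards, dtype=torch.float)
    loss = -(torch.stack(log_probs) * rewards).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```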
It is essential to retain the parameters of the original clean model so that comparisons can be conducted easily:
cd model_record
cp -r text_model_weights test_bed # Retain the original clean model
vim model_config.json
You will see a model_config.json like the one shown below.
./run.sh
The number of training rounds can be chosen by changing the number passed to seq (30 in the figure).
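run.sh essentially repeats the adversarial training step that many times; a rough Python equivalent of its loop (hypothetical; the exact command run.sh invokes inside the loop may differ):

```python
# Hypothetical equivalent of the run.sh loop: repeat training ROUNDS times.
import subprocess

ROUNDS = 30  # corresponds to the number passed to seq in run.sh
for _ in range(ROUNDS):
    subprocess.run(["python3", "sst_complete.py"], check=True)
```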
Once the experiments are finished, run ./clean.sh to restore the original text model, which was trained only on the clean dataset without adversarial training.
./clean.sh
SDA also supports several other functions, including Visualization, TextAttack, TextAugment, and ModelLoading.
One can monitor the progress of adversarial training with TensorBoard:
tensorboard --logdir=runs --samples_per_plugin=text=100
Or check log.txt for more detailed information:
cat log.txt
SDA also provides training with other adversarial training methods by leveraging the TextAttack module:
pip3 install textattack # should already be installed from the previous steps
./attack.sh
Running attack.sh will automatically run three different attack methods: DeepWordBug (DWB), PWWS, and TextBugger.
If you want to attack a different target model, just change the model card passed to --target-model.
Details of the different attack methods are provided below:
Attack Recipe Name | Goal Function | Constraints Enforced | Transformation | Search Method | Main Idea |
---|---|---|---|---|---|
**Attacks on classification tasks, like sentiment classification and entailment:** | | | | | |
`bae` | Untargeted Classification | USE sentence encoding cosine similarity | BERT Masked Token Prediction | Greedy-WIR | BERT masked language model transformation attack from ["BAE: BERT-based Adversarial Examples for Text Classification" (Garg & Ramakrishnan, 2019)](https://arxiv.org/abs/2004.01970). |
`deepwordbug` | {Untargeted, Targeted} Classification | Levenshtein edit distance | {Character Insertion, Character Deletion, Neighboring Character Swap, Character Substitution} | Greedy-WIR | Greedy replace-1 scoring and multi-transformation character-swap attack from ["Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers" (Gao et al., 2018)](https://arxiv.org/abs/1801.04354). |
`fast-alzantot` | Untargeted {Classification, Entailment} | Percentage of words perturbed, Language Model perplexity, Word embedding distance | Counter-fitted word embedding swap | Genetic Algorithm | Modified, faster version of the Alzantot et al. genetic algorithm, from ["Certified Robustness to Adversarial Word Substitutions" (Jia et al., 2019)](https://arxiv.org/abs/1909.00986). |
`hotflip` (word swap) | Untargeted Classification | Word Embedding Cosine Similarity, Part-of-speech match, Number of words perturbed | Gradient-Based Word Swap | Beam search | ["HotFlip: White-Box Adversarial Examples for Text Classification" (Ebrahimi et al., 2017)](https://arxiv.org/abs/1712.06751). |
`iga` | Untargeted {Classification, Entailment} | Percentage of words perturbed, Word embedding distance | Counter-fitted word embedding swap | Genetic Algorithm | Improved genetic-algorithm-based word substitution from ["Natural Language Adversarial Attacks and Defenses in Word Level" (Wang et al., 2019)](https://arxiv.org/abs/1909.06723). |
`input-reduction` | Input Reduction | | Word deletion | Greedy-WIR | Greedy attack with word importance ranking; reduces the input while maintaining the prediction, from ["Pathologies of Neural Models Make Interpretation Difficult" (Feng et al., 2018)](https://arxiv.org/pdf/1804.07781.pdf). |
`kuleshov` | Untargeted Classification | Thought vector encoding cosine similarity, Language model similarity probability | Counter-fitted word embedding swap | Greedy word swap | ["Adversarial Examples for Natural Language Classification Problems" (Kuleshov et al., 2018)](https://openreview.net/pdf?id=r1QZ3zbAZ). |
`pso` | Untargeted Classification | | HowNet Word Swap | Particle Swarm Optimization | ["Word-level Textual Adversarial Attacking as Combinatorial Optimization" (Zang et al., 2020)](https://www.aclweb.org/anthology/2020.acl-main.540/). |
`pwws` | Untargeted Classification | | WordNet-based synonym swap | Greedy-WIR (saliency) | Greedy attack with word importance ranking based on word saliency and synonym swap scores, from ["Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency" (Ren et al., 2019)](https://www.aclweb.org/anthology/P19-1103/). |
`textbugger` (black-box) | Untargeted Classification | USE sentence encoding cosine similarity | {Character Insertion, Character Deletion, Neighboring Character Swap, Character Substitution} | Greedy-WIR | ["TextBugger: Generating Adversarial Text Against Real-world Applications" (Li et al., 2018)](https://arxiv.org/abs/1812.05271). |
`textfooler` | Untargeted {Classification, Entailment} | Word Embedding Distance, Part-of-speech match, USE sentence encoding cosine similarity | Counter-fitted word embedding swap | Greedy-WIR | Greedy attack with word importance ranking, from ["Is BERT Really Robust?" (Jin et al., 2019)](https://arxiv.org/abs/1907.11932). |
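For reference, any recipe in the table can also be run directly through TextAttack's Python API, independently of attack.sh (a minimal sketch assuming textattack >= 0.3; the model card and dataset below are public examples, not this repo's own models):

```python
# Run the PWWS recipe against a public SST-2 model via TextAttack's API.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import PWWSRen2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

name = "textattack/bert-base-uncased-SST-2"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
wrapper = HuggingFaceModelWrapper(model, tokenizer)

dataset = HuggingFaceDataset("glue", "sst2", split="validation")
attack = PWWSRen2019.build(wrapper)
attacker = Attacker(attack, dataset, AttackArgs(num_examples=10))
attacker.attack_dataset()
```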
SDA also provides a function to augment the target dataset automatically using ten different augmentation methods: SEDA, EDA, Word Embedding, CLARE, CheckList, CharSwap, BackTranslation (De, Zh, Ru), and Spelling.
./make_noisy_to_all.sh
If you want to change the hyperparameters of a specific augmentation method, edit make_noisy.sh:
vim make_noisy.sh
and set the hyperparameters to your preferred values.
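For orientation, the hyperparameters exposed in make_noisy.sh roughly correspond to those of TextAttack's augmentation API (a minimal sketch; the mapping to this repo's flags is an assumption):

```python
# Word-embedding-based augmentation via TextAttack's Augmenter API.
from textattack.augmentation import EmbeddingAugmenter

augmenter = EmbeddingAugmenter(
    pct_words_to_swap=0.1,          # fraction of words to replace
    transformations_per_example=2,  # augmented copies per input
)
print(augmenter.augment("the movie was surprisingly good"))
```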
SDA stores all model weights in model_record.
The weights for the different attack methods are stored in outputs, where all the details can be accessed.
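To reload a saved checkpoint for later evaluation, something like the following works (a hypothetical sketch; the actual file names under model_record/text_model_weights are not listed in this README):

```python
# Load a saved checkpoint from model_record (file name is hypothetical).
import torch

state_dict = torch.load(
    "model_record/text_model_weights/model.pt",  # hypothetical file name
    map_location="cpu",
)
print(sorted(state_dict.keys())[:5])  # peek at a few parameter names
```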
If you use Augmentation Strategy Optimization for Language Understanding in your research, please cite: