Scenarios

Armory is intended to evaluate threat-model scenarios. Baseline evaluation scenarios are described below. Additionally, we've provided some standard academic scenarios.

Configuration Files

Scenario configuration files are found in the scenario_configs directory here. The most recent config files are in the eval5 subfolder, and older configs are in the eval1-4 subfolder. The base of the scenario_configs directory also contains symlinks to representative configs.
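For orientation, the sketch below outlines the general shape of a scenario config as a Python dict. The block names are an assumption based on typical Armory configs and may differ between versions; the files in scenario_configs are the authoritative examples.

```python
# Hedged outline of a scenario config's top-level structure (block names are
# an assumption; see scenario_configs/ for authoritative, version-specific examples).
config_outline = {
    "_description": "Human-readable description of the evaluation",
    "adhoc": None,   # scenario-specific extras (e.g., poisoning options)
    "attack": {},    # attack module/name/kwargs and knowledge level
    "dataset": {},   # dataset module/name, batch size, framework
    "defense": None, # optional defense block (see the RESISC defense sketch below)
    "metric": {},    # task and perturbation metrics to record
    "model": {},     # model module/name, weights file, fit kwargs
    "scenario": {},  # scenario module/name (and export_samples, see the end of this document)
    "sysconfig": {}, # docker image, GPU usage, etc.
}
```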

Base Scenario Class

All scenarios inherit from the Base Armory Scenario. The base class parses an armory configuration file and calls a particular scenario's private _evaluate method to perform all of the computation for a given threat model's robustness to attack. All _evaluate methods return a dictionary of recorded metrics, which is saved into the armory output_dir upon completion.
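The following is a minimal sketch of the _evaluate pattern, not the actual armory base-class API: it scores a model on benign and attacked batches and returns the metrics dictionary that would be written to output_dir.

```python
import numpy as np

# Hedged sketch, not the actual armory API: a scenario-style _evaluate loop
# that scores a classifier on benign and attacked inputs and returns the
# metrics dictionary that would be written to output_dir.
def _evaluate(model_predict, attack_generate, data_batches):
    """model_predict(x) -> predicted labels; attack_generate(x, y) -> x_adv."""
    benign, adversarial = [], []
    for x, y in data_batches:
        benign.append(np.mean(model_predict(x) == y))
        x_adv = attack_generate(x, y)
        adversarial.append(np.mean(model_predict(x_adv) == y))
    return {
        "benign_mean_accuracy": float(np.mean(benign)),
        "adversarial_mean_accuracy": float(np.mean(adversarial)),
    }
```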

Baseline Scenarios

The following scenarios are currently available within the armory package.

RESISC image classification (Updated June 2020)

  • Description: In this scenario, the system under evaluation is assumed to be a real-time overhead imagery scene classification system that a human operator is either passively monitoring or not monitoring at all.
  • Dataset: The dataset is the NWPU RESISC-45 dataset. It comprises 45 classes and 700 images per class. Images 1-500 of each class are in the training split, 501-600 in the validation split, and 601-700 in the test split.
  • Baseline Model: To maximize time spent on defense research, a trained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model is an ImageNet-pretrained DenseNet-121 that is fine-tuned on RESISC-45.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary may simply wish to evade detection
      • Targeted - an adversary may wish to divert attention or resources to scenes that are otherwise uninteresting
    • Adversary Operating Environment:
      • Non real-time, digital evasion attack - the attack is not "per-example" but rather "universal," and could be created offline (i.e., non real-time). The goal is to mimic conditions under which a physical evasion attack could be carried out.
      • Black-box, white-box, and adaptive attacks will be performed on defenses - for black-box attacks, a held-back model or dataset will be used as a surrogate.
    • Adversary Capabilities and Resources
      • Attacks that are non-overtly perceptible under quick glance are allowed - we assume in this scenario that a human may at most passively monitor the classifier system. Use your own judgment on the maximum perturbation budget allowed while meeting the perceptibility requirement.
      • Types of attacks that will be implemented during evaluation: universal perturbation (untargeted attack) and universal patch (targeted attack)
        • For universal patch attack, assume the total area of the patch is at most 25% of the total image area. The location and shape of the patch will vary.
  • Metrics of Interest:
    • Primary metrics:
      • Accuracy (mean, per-class), attack computational cost, defense computational cost, various distance measures of perturbation (Lp-norms, Wasserstein distance)
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Model Performance: (results obtained using Armory < v0.10)
    • Baseline Clean Top-1 Accuracy: 93%
    • Baseline Attacked (Universal Perturbation) Top-1 Accuracy: 6%
    • Baseline Attacked (Universal Patch) Top-1 Accuracy: 23%
  • Baseline Defense Performance: (results obtained using Armory < v0.10) The baseline defense is art_experimental.defences.JpegCompressionNormalized(clip_values=(0.0, 1.0), quality=50, channel_index=3, apply_fit=False, apply_predict=True, means=[0.36386173189316956, 0.38118692953271804, 0.33867067558870334], stds=[0.20350874, 0.18531173, 0.18472934]) - see resisc45_baseline_densenet121_adversarial.json for example usage and the defense-block sketch at the end of this scenario. Baseline defense performance is evaluated for a grey-box attack: adversarial examples generated on the undefended baseline model are evaluated on the defended model.
    • Baseline Clean Top-1 Accuracy: 92%
    • Baseline Attacked (Universal Perturbation) Top-1 Accuracy: 40%
    • Baseline Attacked (Universal Patch) Top-1 Accuracy: 21%
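As referenced in the baseline defense description above, a defense of this kind is specified in the scenario config's defense block. The sketch below (a Python dict mirroring the JSON) reuses the exact kwargs listed above; the surrounding block layout and module path are assumptions, and resisc45_baseline_densenet121_adversarial.json is the authoritative example.

```python
# Illustrative defense block matching the kwargs listed above; the block
# layout, module path, and "type" field are assumed, not copied from the repo.
defense_block = {
    "kwargs": {
        "apply_fit": False,
        "apply_predict": True,
        "channel_index": 3,
        "clip_values": (0.0, 1.0),
        "quality": 50,
        "means": [0.36386173189316956, 0.38118692953271804, 0.33867067558870334],
        "stds": [0.20350874, 0.18531173, 0.18472934],
    },
    "module": "art_experimental.defences.jpeg_compression_normalized",  # assumed path
    "name": "JpegCompressionNormalized",
    "type": "Preprocessor",
}
```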

Librispeech speaker audio classification (Updated June 2020)

  • Description: In this scenario, the system under evaluation is a speaker identification system that a human operator is either passively monitoring or not monitoring at all.
  • Dataset: The dataset is the LibriSpeech dataset. Due to the large size of the dataset, a custom subset is created. It comprises 40 speakers (20 male/ 20 female), each with 4/2/2 minutes of audio in the train/validation/test splits, respectively.
  • Baseline Model: To maximize time spent on defense research, two trained baseline models will be provided - one based on spectrogram (not mel-cepstrum or MFCC) and one based on raw audio - but performers are not required to use them if their defense requires a different architecture. The spectrogram-based model is developed and trained from scratch, and the raw audio-based model is SincNet, trained from scratch.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary may simply wish to evade detection
      • Targeted - an adversary may wish to impersonate someone else
    • Adversary Operating Environment:
      • Non-real time, digital evasion attack - the attack is not "per-example" but rather "universal," and could be created offline (i.e., non real-time). The goal is to mimic conditions under which a physical evasion attack could be carried out.
      • Assuming perfect acoustic channel
      • Black-box, white-box, and adaptive attacks will be performed on defenses - for black-box attacks, the spectrogram-based model will be the surrogate for the raw audio-based model, and vice versa.
    • Adversary Capabilities and Resources
      • Attacks that are non-overtly perceptible under passive listening are allowed - we assume in this scenario that a human may at most passively monitor the classifier system. Use your own judgment on the maximum perturbation budget allowed while meeting the perceptibility requirement.
      • Types of attacks that will be implemented during evaluation: universal perturbation (untargeted and targeted attacks)
  • Metrics of Interest:
    • Primary metrics:
      • Accuracy (mean, per-class), attack computational cost, defense computational cost, various distance measures of perturbation (Lp-norms, Wasserstein distance, signal-to-noise ratio; see the SNR sketch at the end of this scenario)
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Model Performance: To be added
  • Baseline Defense Performance: To be added
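Signal-to-noise ratio, listed among the perturbation measures above, can be computed by treating the adversarial perturbation as noise. A minimal sketch using the standard definition (not necessarily Armory's exact implementation):

```python
import numpy as np

def snr_db(clean_audio, adversarial_audio):
    """Signal-to-noise ratio in dB, treating the perturbation as noise."""
    clean = np.asarray(clean_audio, dtype=np.float64)
    noise = np.asarray(adversarial_audio, dtype=np.float64) - clean
    signal_power = np.mean(np.square(clean))
    noise_power = np.mean(np.square(noise))
    if noise_power == 0:
        return np.inf  # no perturbation applied
    return 10.0 * np.log10(signal_power / noise_power)
```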

UCF101 video classification (Updated July 2021)

  • Description: In this scenario, the system under evaluation is a video action recognition system that a human operator is either passively monitoring or not monitoring at all.
  • Dataset: The dataset is the UCF101 dataset. It comprises 101 actions and 13,320 total videos. For the training/testing split, we use the official Split 01.
  • Baseline Model: To maximize time spent on defense research, a trained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model uses the MARS architecture, which is a single-stream (RGB) 3D convolution architecture that simultaneously mimics the optical flow stream. The provided model is pre-trained on the Kinetics dataset and fine-tuned on UCF101.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary may simply wish to evade detection
    • Adversary Operating Environment:
      • Non-real time, digital evasion attack - we assume an adversary is the author of the video, so they could create an evasion attack offline before distributing the video. Typically, a non real-time attack is "universal," but in this case, it is "per-example."
      • Adaptive attacks will be performed on defenses
    • Adversary Capabilities and Resources
      • Attacks that are non-overtly perceptible under quick glance are allowed, as are attacks that create perceptible but non-suspicious patches - we assume in this scenario that a human may at most passively monitor the classifier system.
  • Metrics of Interest:
    • Primary metrics:
      • Accuracy (mean, per-class), attack budget
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: Video Compression
  • Baseline Model Performance: (results obtained using Armory v0.13.3)
| Attack | Budget | Benign Top1/Top5 Accuracy (Undefended) | Adversarial Top1/Top5 Accuracy (Undefended) | Benign Top1/Top5 Accuracy (Defended) | Adversarial Top1/Top5 Accuracy (Defended) | Test Size |
|---|---|---|---|---|---|---|
| Flicker (low perceptibility) | beta_0=4.0, beta_1=0.1, beta_2=0.9 | 0.92/1.00 | 0.51/1.00 | 0.92/1.00 | 0.44/1.00 | 100 |
| Flicker (medium perceptibility) | beta_0=2.0, beta_1=0.1, beta_2=0.9 | 0.92/1.00 | 0.39/1.00 | 0.92/1.00 | 0.40/0.97 | 100 |
| Flicker (high perceptibility) | beta_0=1.0, beta_1=0.1, beta_2=0.9 | 0.92/1.00 | 0.37/1.00 | 0.92/1.00 | 0.38/0.98 | 100 |
| Frame Border | patch ratio=0.10 | 0.92/1.00 | 0.00/0.25 | 0.93/1.00 | 0.03/0.36 | 100 |
| Frame Border | patch ratio=0.15 | 0.92/1.00 | 0.00/0.19 | 0.93/1.00 | 0.01/0.29 | 100 |
| Frame Border | patch ratio=0.20 | 0.92/1.00 | 0.00/0.19 | 0.93/1.00 | 0.00/0.25 | 100 |
| Masked PGD | patch ratio=0.10 | 0.92/1.00 | 0.02/0.61 | 0.93/1.00 | 0.01/0.66 | 100 |
| Masked PGD | patch ratio=0.15 | 0.92/1.00 | 0.00/0.42 | 0.93/1.00 | 0.00/0.36 | 100 |
| Masked PGD | patch ratio=0.20 | 0.92/1.00 | 0.00/0.28 | 0.93/1.00 | 0.00/0.31 | 100 |
| Frame Saliency (iterative_saliency) | eps=0.004 | 0.92/1.00 | 0.00/0.96 | 0.92/1.00 | 0.81/1.00 | 100 |
| Frame Saliency (iterative_saliency) | eps=0.008 | 0.92/1.00 | 0.00/0.96 | 0.92/1.00 | 0.47/1.00 | 100 |
| Frame Saliency (iterative_saliency) | eps=0.015 | 0.92/1.00 | 0.00/0.96 | 0.92/1.00 | 0.23/0.99 | 100 |
| Frame Saliency (one_shot) | eps=0.004 | 0.92/1.00 | 0.00/0.26 | 0.93/1.00 | 0.79/0.97 | 100 |
| Frame Saliency (one_shot) | eps=0.008 | 0.92/1.00 | 0.00/0.22 | 0.93/1.00 | 0.46/0.89 | 100 |
| Frame Saliency (one_shot) | eps=0.015 | 0.92/1.00 | 0.00/0.20 | 0.93/1.00 | 0.21/0.74 | 100 |

Find reference baseline configurations here

German traffic sign poisoned image classification (Updated December 2020)

  • Description: In this scenario, the system under evaluation is a traffic sign recognition system that requires continuous training, and the training data is procured through less trustworthy external sources (e.g., third-party, Internet, etc.) and may contain backdoor triggers, where some images and labels are intentionally altered to mislead the system into making specific test-time decisions.
  • Dataset: The dataset is the German Traffic Sign Recognition Benchmark. It comprises 43 classes and more than 50,000 total images. The official Final_Training and Final_Test data are used for the train/test split. The dataset is available in canonical format, but the official scenario applies preprocessing consisting of contrast equalization, cropping to a square shape, and resizing.
  • Baseline Model: To maximize time spent on defense research, an untrained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model uses the MicronNet architecture. Also provided will be poison data (1/5/10% of the training size) that should be mixed with the training data.
  • Threat Scenario:
    • Adversary objectives:
      • Targeted
    • Adversary Operating Environment:
      • Non-real time, digital evasion attack - the goal is to mimic conditions under which a physical evasion attack could be carried out.
      • Black-box, white-box, and adaptive attacks will be performed on defenses - for black-box attacks, a held-back model or dataset will be used as a surrogate.
    • Adversary Capabilities and Resources
      • Attacks that are non-overtly perceptible under quick glance are allowed, as are attacks that create perceptible but non-suspicious triggers - we assume in this scenario that a human may at most passively monitor the classifier system.
  • Metrics of Interest:
    • Primary metrics:
      • Accuracy (mean, per-class), backdoor success rate, attack computational cost, defense computational cost
    • Fairness metrics. These attempt to ascertain whether the model or the filter is biased toward data that is unusual in some sense:
      • Filter Perplexity compares the class distribution of False Positives to the class distribution of unpoisoned data. A filter is biased if it removes, besides poison, data from only a few classes.
      • Model Subclass Bias measures, for each class, the difference in accuracy in two sub-populations of the class
      • Filter Subclass Bias measures, for each class, the difference in filtering rate in two sub-populations of the class
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
    • Dirty-label Backdoor Attack: 1 to 30% of a source class in the training data have a trigger added and are intentionally mislabeled with the target label; at test time, the same trigger is added to an input of the source class to cause targeted misclassification.
    • Clean-label Backdoor Attack: 20 to 80% of the target class in the training data are imperceptibly perturbed (so they are still correctly labeled) and have the trigger added; at test time, the same trigger is added to an input of a source class to cause targeted misclassification
      • Perturbation constraints: Linf (eps <= 16/255), L2 (eps <= 8/255 * sqrt(N)), where N is the number of pixels in a single input (see the worked example at the end of this scenario)
  • Baseline Defense: Activation Clustering and/or Spectral Signature
  • Baseline Model Performance: To be added
  • Baseline Defense Performance: To be added
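To make the clean-label perturbation budgets above concrete, the worked example below computes them for a single input. The 48x48 image size is an assumption about the preprocessed GTSRB resolution, and N is taken to be the per-image pixel count as stated in the constraint; substitute the values actually used.

```python
import numpy as np

# Worked example of the clean-label backdoor perturbation budgets above.
# The 48x48 input size is an assumption about the preprocessed GTSRB images;
# substitute the resolution actually used by the model.
height, width = 48, 48
n_pixels = height * width  # N in the constraint above

linf_budget = 16 / 255
l2_budget = (8 / 255) * np.sqrt(n_pixels)

print(f"Linf budget: {linf_budget:.4f}")  # ~0.0627
print(f"L2 budget:   {l2_budget:.4f}")    # ~1.5059 for a 48x48 input
```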

Librispeech automatic speech recognition (Updated June 2022)

  • Description: In this scenario, the system under evaluation is an automatic speech recognition system that a human operator is either passively monitoring or not monitoring at all.
  • Dataset: The dataset is the LibriSpeech dataset and comprises train_clean100, train_clean360 and test_clean.
  • Baseline Model: To maximize time spent on defense research, a trained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model uses the DeepSpeech 2 architecture with pretrained weights from either the AN4, LibriSpeech, or TEDLIUM datasets. Custom weights may also be loaded by the model.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary may simply wish for speech to be transcribed incorrectly
      • Targeted - an adversary may wish for specific strings to be predicted
    • Adversary Operating Environment:
      • Non-real time, digital evasion attack. Each attack will be "universal" with respect to channel conditions (under a single perfect channel, the attack will be "per-example.")
      • Under some threat models, the channel model consists of only a single perfect acoustic channel, and under others, it may consist of one additional multipath channel.
      • Adaptive attacks will be performed on defenses.
    • Adversary Capabilities and Resources
      • To place an evaluation bound on the perceptibility of perturbations, the SNR is restricted to >20 dB.
  • Metrics of Interest:
    • Primary metrics:
      • Word error rate, SNR, entailment rate (see the WER sketch at the end of this scenario)
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: MP3 Compression
  • Baseline Model Performance: (Table 1 results are obtained using Armory v0.13.3, Table 2 results are obtained using Armory v0.15.2)

Table 1

| Attack | Targeted | Budget | Benign WER (Undefended) | Adversarial WER (Undefended) | Benign WER (Defended) | Adversarial WER (Defended) | Test Size |
|---|---|---|---|---|---|---|---|
| Imperceptible ASR | yes | max_iter_1=100 | 0.10 | 0.63 | 0.13 | N/A* | 320 |
| Imperceptible ASR | yes | max_iter_1=200 | 0.10 | 0.20 | 0.13 | N/A | 320 |
| Imperceptible ASR | yes | max_iter_1=400 | 0.10 | 0.11 | 0.13 | N/A | 320 |
| Kenansville | no | snr=20dB | 0.10 | 0.27 | 0.13 | 0.36 | 1000 |
| Kenansville | no | snr=30dB | 0.10 | 0.11 | 0.13 | 0.17 | 1000 |
| Kenansville | no | snr=40dB | 0.10 | 0.10 | 0.13 | 0.13 | 1000 |
| PGD (single channel) | no | snr=20dB | 0.10 | 0.46 | 0.13 | 0.53 | 100 |
| PGD (single channel) | no | snr=30dB | 0.10 | 0.46 | 0.13 | 0.50 | 100 |
| PGD (single channel) | no | snr=40dB | 0.10 | 0.33 | 0.13 | 0.36 | 100 |
| PGD (single channel)* | yes | snr=20dB | 0.11 | 1.03 | 0.15 | 1.01 | 100 |
| PGD (single channel)* | yes | snr=30dB | 0.11 | 1.02 | 0.15 | 0.99 | 100 |
| PGD (single channel)* | yes | snr=40dB | 0.11 | 0.88 | 0.15 | 0.84 | 100 |
| PGD (multiple channels) | no | snr=20dB | 0.13 | 0.96 | N/A | N/A | 100 |
| PGD (multiple channels) | no | snr=30dB | 0.13 | 0.59 | N/A | N/A | 100 |
| PGD (multiple channels) | no | snr=40dB | 0.13 | 0.38 | N/A | N/A | 100 |
| PGD (multiple channels)* | yes | snr=20dB | 0.13 | 0.99 | N/A | N/A | 100 |
| PGD (multiple channels)* | yes | snr=30dB | 0.13 | 0.92 | N/A | N/A | 100 |
| PGD (multiple channels)* | yes | snr=40dB | 0.13 | 0.75 | N/A | N/A | 100 |
  • *Targeted attack, where a random target phrase of similar length as the ground truth was applied, but WER with respect to the ground truth was calculated

Find reference baseline configurations here

  • Missing defended baseline is due to current incompatibility of the attack and defense.

Table 2

| Attack | Targeted | Budget | Attack Parameters | Entailment/Contradiction/Neutral Rates (Benign Undefended) | Entailment/Contradiction/Neutral Rates (Adversarial Undefended) | Entailment/Contradiction/Neutral Rates (Benign Defended) | Entailment/Contradiction/Neutral Rates (Adversarial Defended) | Test Size |
|---|---|---|---|---|---|---|---|---|
| PGD* | yes | snr=20dB | eps_step=0.05, max_iter=500 | 0.95/0.05/0.00 | 0.01/0.98/0.01 | 0.93/0.07/0.00 | 0.02/0.96/0.02 | 100 |
| PGD* | yes | snr=30dB | eps_step=0.03, max_iter=500 | 0.95/0.05/0.00 | 0.04/0.95/0.01 | 0.93/0.07/0.00 | 0.19/0.79/0.02 | 100 |
| PGD* | yes | snr=40dB | eps_step=0.01, max_iter=500 | 0.95/0.05/0.00 | 0.43/0.53/0.04 | 0.93/0.07/0.00 | 0.66/0.34/0.00 | 100 |
  • *Targeted attack, where contradictory target phrases are generated from ground truth phrases by changing a few key words (e.g., target phrase: he is a bad person; ground truth phrase: he is a good person)

Find reference baseline configurations here
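Word error rate, the primary task metric for this scenario, is the word-level edit distance between the predicted and reference transcripts divided by the number of reference words (so it can exceed 1.0, as in the targeted PGD rows above). A minimal sketch using the standard definition, not necessarily Armory's exact implementation:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substituted word in a five-word reference -> WER = 0.2
print(word_error_rate("he is a good person", "he is a bad person"))
```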

so2sat multimodal image classification (Updated July 2021)

  • Description: In this scenario, the system under evaluation is an image classifier which determines local climate zone from a combination of co-registered synthetic aperture radar (SAR) and multispectral electro-optical (EO) images.
  • Dataset: The dataset is the so2sat dataset. It comprises 352k/24k images in train/validation datasets and 17 classes of local climate zones.
  • Baseline Model: To maximize time spent on defense research, a trained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model uses a custom CNN architecture with a single input that stacks SAR (first four channels only, representing the real and imaginary components of the reflected electromagnetic waves) and EO (all ten channels) data. Immediately after the input layer, the data is split into SAR and EO data streams and fed into their respective feature extraction networks. In the final layer, the two networks are fused to produce a single prediction output. A minimal sketch of this two-stream design appears at the end of this scenario.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary wishes to evade correct classification
    • Adversary Operating Environment:
      • Non-real time, digital evasion attack
      • Adversary perturbs a single modality (SAR or EO)
      • Adaptive attacks will be performed on defenses.
    • Adversary Capabilities and Resources
      • Patch ratio < 15% of the image area
  • Metrics of Interest:
    • Primary metrics:
      • Accuracy (mean, per-class), Patch size
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: JPEG Compression for Multi-Channel
  • Baseline Model Performance: (results obtained using Armory v0.13.3)
| Attacked Modality | Patch Ratio | Benign Accuracy (Undefended) | Adversarial Accuracy (Undefended) | Benign Accuracy (Defended) | Adversarial Accuracy (Defended) | Test Size |
|---|---|---|---|---|---|---|
| EO | 0.05 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |
| EO | 0.10 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |
| EO | 0.15 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |
| SAR | 0.05 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |
| SAR | 0.10 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |
| SAR | 0.15 | 0.583 | 0.00 | 0.556 | 0.00 | 1000 |

Find reference baseline configurations here
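As noted in the baseline model description above, the network takes a single stacked input and splits it into SAR and EO streams that are fused at the end. The sketch below illustrates that two-stream, late-fusion pattern in PyTorch; the layer sizes and depths are illustrative assumptions, not the provided baseline model.

```python
import torch
import torch.nn as nn

class TwoStreamSo2SatSketch(nn.Module):
    """Illustrative two-stream CNN: a 14-channel stacked input is split into
    SAR (first 4 channels) and EO (last 10 channels) streams, then fused."""

    def __init__(self, num_classes: int = 17):
        super().__init__()

        def stream(in_channels: int) -> nn.Sequential:
            # Tiny placeholder feature extractor for one modality.
            return nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
            )

        self.sar_stream = stream(4)
        self.eo_stream = stream(10)
        self.classifier = nn.Linear(32 + 32, num_classes)  # late fusion

    def forward(self, x):
        # x: (batch, 14, H, W) with SAR and EO channels stacked at the input.
        sar, eo = x[:, :4], x[:, 4:]
        fused = torch.cat([self.sar_stream(sar), self.eo_stream(eo)], dim=1)
        return self.classifier(fused)
```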

xView object detection (Updated July 2021)

  • Description: In this scenario, the system under evaluation is an object detector which localizes and identifies various classes from satellite imagery.
  • Dataset: The dataset is the xView dataset. It comprises 59k/19k train and test images (each with dimensions 300x300, 400x400, or 500x500) and 62 classes.
  • Baseline Model: To maximize time spent on defense research, a trained baseline model will be provided, but performers are not required to use it if their defense requires a different architecture. The model uses the Faster-RCNN ResNet-50 FPN architecture pre-trained on MSCOCO objects and fine-tuned on xView.
  • Threat Scenario:
    • Adversary objectives:
      • Untargeted - an adversary wishes to disable object detection
    • Adversary Operating Environment:
      • Non-real time, digital and physical-like evasion attacks and translation.
      • Adaptive attacks will be performed on defenses.
  • Adversary Capabilities and Resources
    • Patch size <100x100 pixels
  • Metrics of Interest:
    • Primary metrics:
      • Average precision (mean, per-class) of ground truth classes, Patch Size
    • Derivative metrics - see end of document
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: JPEG Compression
  • Baseline Model Performance: (results obtained using Armory v0.13.3)
| Attack | Patch Size | Benign mAP (Undefended) | Adversarial mAP (Undefended) | Benign mAP (Defended) | Adversarial mAP (Defended) | Test Size |
|---|---|---|---|---|---|---|
| Masked PGD | 50x50 | 0.284 | 0.142 | 0.232 | 0.139 | 100 |
| Masked PGD | 75x75 | 0.284 | 0.071 | 0.232 | 0.094 | 100 |
| Masked PGD | 100x100 | 0.284 | 0.076 | 0.232 | 0.092 | 100 |
| Robust DPatch | 50x50 | 0.284 | 0.193 | 0.232 | 0.184 | 100 |
| Robust DPatch | 75x75 | 0.284 | 0.184 | 0.232 | 0.146 | 100 |
| Robust DPatch | 100x100 | 0.284 | 0.173 | 0.232 | 0.165 | 100 |

Find reference baseline configurations here

DAPRICOT object detection (Updated July 2021)

  • Description: In this scenario, the system under evaluation is an object detector trained to identify the classes in the Microsoft COCO dataset.
  • Dataset: The dataset is the Dynamic APRICOT (DAPRICOT) dataset 1 and dataset 2. It is similar to the APRICOT dataset (see below), but instead of pre-generated physical patches taken in the natural environment, the DAPRICOT dataset contains greenscreens and natural lighting metadata that allow digital, adaptive patches to be inserted and rendered into the scene as if they were physically printed. This dataset contains 15 scenes, where each scene contains 3 different greenscreen shapes, taken at 3 different distances, 3 different heights, and using 3 different camera angles, for a total of over 1000 images.
  • Baseline Model: The model uses the pretrained Faster-RCNN with ResNet-50 model.
  • Threat Scenario:
    • Adversary objectives:
      • Targeted attack - objective is to force an object detector to localize and classify the patch as an MSCOCO object.
    • Adversary Operating Environment:
      • Non-real time, digital and physical-like patch attacks
      • Adaptive attacks will be performed on defenses.
  • Adversary Capabilities and Resources
    • Patches of different shapes as dictated by the greenscreen sizes in the images
  • Metrics of Interest:
    • Primary metrics:
      • Average precision (mean, per-class) of patches, Average target success
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: JPEG Compression
  • Baseline Model Performance: (results obtained using Armory v0.13.3 and dev test data)
| Attack | Patch Size | Target Success (Undefended) | Target mAP (Undefended) | Target Success (Defended) | Target mAP (Defended) | Test Size |
|---|---|---|---|---|---|---|
| Masked PGD | all | 0.99 | 0.91 | 0.99 | 0.91 | 100 |
| Masked PGD | small | 0.97 | 0.91 | 0.97 | 0.91 | 100 |
| Masked PGD | medium | 1.00 | 1.00 | 1.00 | 0.91 | 100 |
| Masked PGD | large | 1.00 | 1.00 | 1.00 | 0.91 | 100 |
| Robust DPatch | all | 0.56 | 0.64 | 0.61 | 0.64 | 100 |
| Robust DPatch | small | 0.51 | 0.64 | 0.60 | 0.64 | 100 |
| Robust DPatch | medium | 0.61 | 0.64 | 0.65 | 0.73 | 100 |
| Robust DPatch | large | 0.55 | 0.64 | 0.63 | 0.73 | 100 |

Find reference baseline configurations here

APRICOT object detection (Updated December 2020)

  • Description: In this scenario, the system under evaluation is an object detector trained to identify the classes in the Microsoft COCO dataset.
  • Dataset: The dataset is the APRICOT dataset, which includes over 1000 natural images with physically-printed adversarial patches, covering three object detection architectures (Faster-RCNN with ResNet-50, SSD with MobileNet, and RetinaNet), two shapes (circular and rectangular), and ten MS-COCO classes as targets.
  • Baseline Model: The model uses the pretrained Faster-RCNN with ResNet-50, SSD with MobileNet, and RetinaNet models. Note: currently, only Tensorflow Faster-RCNN with ResNet-50 is implemented, with the other two architectures to be implemented in the near future. In order to perform as close to a white-box evaluation as possible, it is strongly recommended, but not required, that performers adopt one of the above architectures for defense research - the pretrained weights may not be robust, so performers can change the weights.
  • Threat Scenario:
    • Adversary Operating Environment:
      • This is a dataset of precomputed adversarial images on which trained models will be evaluated.
      • Each patch is a targeted attack, whose objective is to force an object detector to localize and classify the patch as an MSCOCO object.
  • Metrics of Interest:
    • Primary metrics:
      • Average precision (mean, per-class) of patches, Average precision of MSCOCO objects
    • Additional metrics specific to the scenario or that are informative may be added later
  • Baseline Attacks:
  • Baseline Defense: JPEG Compression
  • Baseline Model Performance: (results obtained using Armory v0.13)
    • Baseline MSCOCO Objects mAP: 8.76% (all test examples)
    • Baseline Targeted Patch mAP: 5.70% (all test examples)
  • Baseline Defense Performance: (results obtained using Armory v0.13)
    Baseline defense is art_experimental.defences.jpeg_compression_normalized(clip_values=(0.0, 1.0), quality=10, channel_index=3, apply_fit=False, apply_predict=True).
    Baseline defense performance is evaluated for a transfer attack.
    • Baseline MSCOCO Objects mAP: 7.83% (all test examples)
    • Baseline Targeted Patch mAP: 4.59% (all test examples)

CARLA object detection (Updated June 2022)

  • Description: In this scenario, the system under evaluation is an object detector trained to identify vehicles, pedestrians, and traffic lights.
  • Dataset: The development dataset is generated using CARLA, which includes RGB and depth channels for 31 synthetic images from both vehicle and security camera perspectives. Each image contains a green-screen intended for adversarial patch insertion. The dataset contains natural lighting metadata that allow digital, adaptive patches to be inserted and rendered into the scene as if they were physically printed.
  • Baseline Model:
  • Threat Scenario:
    • Adversary objectives:
      • To degrade the performance of an object detector through the insertion of adversarial patches.
    • Adversary Operating Environment:
      • Non-real time, physical-like patch attacks
      • Adaptive attacks will be performed on defenses.
  • Adversary Capabilities and Resources
    • Patches of different size/shape as dictated by the green-screen in each image. In the multimodal case, both RGB and depth channels are to be perturbed.
  • Metrics of Interest:
    • Primary metrics:
      • mAP
      • Disappearance rate
      • Hallucinations per image
      • Misclassification rate
      • True positive rate
  • Baseline Attacks:
  • Baseline Defense: JPEG Compression
  • Baseline Model Performance: (results obtained using Armory v0.15.2 and dev data)

Single Modality (RGB) Object Detection

| Attack | Attack Parameters | Benign mAP | Benign Disappearance Rate | Benign Hallucinations per Image | Benign Misclassification Rate | Benign True Positive Rate | Adversarial mAP | Adversarial Disappearance Rate | Adversarial Hallucinations per Image | Adversarial Misclassification Rate | Adversarial True Positive Rate | Test Size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Robust DPatch | depth_delta_meters=3, learning_rate=0.002, learning_rate_depth=0.0001, max_iter=2000 | 0.76/0.72 | 0.19/0.22 | 3.97/3.48 | 0.06/0.06 | 0.75/0.71 | 0.68/0.66 | 0.27/0.28 | 4.48/3.65 | 0.06/0.07 | 0.67/0.65 | 31 |
| Adversarial Patch | depth_delta_meters=3, learning_rate=0.003, learning_rate_depth=0.0001, max_iter=1000 | 0.76/0.72 | 0.19/0.22 | 3.97/3.48 | 0.06/0.06 | 0.75/0.71 | 0.54/* | 0.32/* | 22.16/* | 0.05/* | 0.62/* | 31 |

Multimodality (RGB+depth) Object Detection

| Attack | Attack Parameters | Benign mAP | Benign Disappearance Rate | Benign Hallucinations per Image | Benign Misclassification Rate | Benign True Positive Rate | Adversarial mAP | Adversarial Disappearance Rate | Adversarial Hallucinations per Image | Adversarial Misclassification Rate | Adversarial True Positive Rate | Test Size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Robust DPatch | depth_delta_meters=3, learning_rate=0.002, learning_rate_depth=0.0001, max_iter=2000 | 0.87/0.86 | 0.06/0.04 | 1.23/2.55 | 0.05/0.05 | 0.88/0.91 | 0.76/0.83 | 0.10/0.06 | 5.68/4.87 | 0.05/0.05 | 0.84/0.89 | 31 |
| Adversarial Patch | depth_delta_meters=3, learning_rate=0.003, learning_rate_depth=0.0001, max_iter=1000 | 0.87/0.86 | 0.06/0.04 | 1.23/2.55 | 0.05/0.05 | 0.88/0.91 | 0.66/0.76 | 0.11/0.10 | 10.74/7.13 | 0.06/0.05 | 0.83/0.85 | 31 |

a/b in the tables refer to undefended/defended performance results, respectively.

* Defended results are not available for the Adversarial Patch attack against the single-modality model because the JPEG Compression defense is not implemented in PyTorch and so is not fully differentiable.

Find reference baseline configurations here

CARLA video tracking (Updated June 2022)

  • Description: In this scenario, the system under evaluation is an object tracker trained to localize pedestrians.
  • Dataset: The development dataset is the CARLA Video Tracking dataset, which includes 20 videos, each of which contains a green-screen in all frames intended for adversarial patch insertion. The dataset contains natural lighting metadata that allow digital, adaptive patches to be inserted and rendered into the scene as if they were physically printed.
  • Baseline Model:
  • Threat Scenario:
    • Adversary objectives:
      • To degrade the performance of the tracker through the insertion of adversarial patches.
    • Adversary Operating Environment:
      • Non-real time, physical-like patch attacks
      • Adaptive attacks will be performed on defenses.
  • Adversary Capabilities and Resources
    • Patches of different size/shape as dictated by the green-screen in the frames. The adversary is expected to apply a patch with constant texture across all frames in the video, but the patch's position relative to the sensor may change due to sensor motion.
  • Metrics of Interest:
    • Primary metrics:
      • Mean IoU (see the IoU sketch at the end of this scenario)
      • Mean success rate (mean IoUs are calculated for multiple IoU thresholds and averaged)
  • Baseline Attacks:
  • Baseline Defense: Video Compression
  • Baseline Model Performance: (results obtained using Armory v0.15.2 and dev data)
| Attack Parameters | Benign Mean IoU | Benign Mean Success Rate | Adversarial Mean IoU | Adversarial Mean Success Rate | Test Size |
|---|---|---|---|---|---|
| step_size=0.02, max_iter=100 | 0.55/0.57 | 0.57/0.60 | 0.14/0.19 | 0.15/0.20 | 20 |

a/b in the tables refer to undefended/defended performance results, respectively.

Find reference baseline configurations here
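Mean IoU, referenced above, is the intersection-over-union between the predicted and ground-truth boxes, averaged over frames. A minimal sketch for axis-aligned boxes (standard definition; the [x1, y1, x2, y2] coordinate convention is an assumption):

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes in [x1, y1, x2, y2] form."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# Example: two half-overlapping unit squares -> IoU = 1/3
print(box_iou([0, 0, 1, 1], [0.5, 0, 1.5, 1]))
```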

Academic Scenarios

CIFAR-10 image classification

  • Description: This is a standard white-box attack scenario.
  • Threat Scenario: White-box attack
  • Metrics of Interest: Benign accuracy, Adversarial accuracy, Adversarial perturbation
  • Baseline Model Performance:
  • Baseline Defense Performance: See the academic literature for the most up-to-date results

MNIST image classification

  • Description:
  • Threat Scenario: White-box attack
  • Metrics of Interest: Benign accuracy, Adversarial accuracy, Adversarial perturbation
  • Baseline Model Performance:
  • Baseline Defense Performance: See the academic literature for the most up-to-date results

Creating a new scenario

Users may want to create their own scenario because the baseline scenarios do not fit the requirements of some defense/threat model, or because it may be easier to debug code they have access to than code pre-installed by the armory package.

An example of doing this can be found in our armory-examples repo.

Derivative metrics

[Figure: derivative metrics]

Exporting Samples

Scenarios can be configured to export benign and adversarial image, video, and audio samples. This feature is enabled by setting the export_samples field under scenario in the configuration file to a non-zero integer. The specified number of samples will be saved in the output directory for this evaluation, along with a pickle file which stores the ground truth and model output for each sample. For video files, samples are saved both in a compressed video format and frame-by-frame.
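For example, a scenario block requesting exported samples might look like the sketch below; only export_samples comes from the description above, and the remaining fields are placeholders for a real config's values.

```python
# Illustrative scenario block; only "export_samples" is described above, and
# the other fields are placeholders for a real config's module/name/kwargs.
scenario_block = {
    "module": "...",
    "name": "...",
    "kwargs": {},
    "export_samples": 5,  # save 5 benign and adversarial samples plus a pickle of outputs
}
```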