From 3a8151f77895cac26b356672859f8ff5f3926d62 Mon Sep 17 00:00:00 2001
From: Samuel Sasaki <64754486+sammsaski@users.noreply.github.com>
Date: Fri, 6 Dec 2024 18:12:22 -0600
Subject: [PATCH 1/2] changes to smoke test and readme

---
 .../Submission/FORMALISE2025/README.md        |  49 +-
 .../FORMALISE2025/run_single_sample_vvn.sh    |  95 ++++
 .../Submission/FORMALISE2025/run_smoketest.sh |  87 +---
 .../FORMALISE2025/run_subset_vvn.sh           |  11 +-
 .../FORMALISE2025/src/analysis/make_table2.py | 417 +++++++++---------
 .../Submission/FORMALISE2025/src/run.py       |  29 +-
 6 files changed, 364 insertions(+), 324 deletions(-)
 create mode 100755 code/nnv/examples/Submission/FORMALISE2025/run_single_sample_vvn.sh

diff --git a/code/nnv/examples/Submission/FORMALISE2025/README.md b/code/nnv/examples/Submission/FORMALISE2025/README.md
index 670bba605..6117800c0 100644
--- a/code/nnv/examples/Submission/FORMALISE2025/README.md
+++ b/code/nnv/examples/Submission/FORMALISE2025/README.md
@@ -2,20 +2,20 @@
 
 This artifact is used to reproduce the results in _Robustness Verification of Video Classification Neural Networks_.
 
-All results from _Robustness Verification of Video Classification Neural Networks_ were captured using an `Apple M1 Max 10-core CPU@3.20GHz×10` with 64GB of RAM. The times reported to run the smoke test and reproduce results are approximated based on this computer.
+All results from _Robustness Verification of Video Classification Neural Networks_ were captured using an `Apple M1 Max 10-core CPU@3.20GHz×10` with 64GB of RAM.
 
 # Requirements
 
 The following resources are required to run this artifact:
 
-- MATLAB 2024a with NNV and npy-matlab installed and added to the path.
+- MATLAB 2024a with NNV and `npy-matlab` installed and added to the path.
 - A conda environment with Python v3.11. Install requirements from `requirements.txt`. Make sure to install the source files.
 
 # Installation
 
-This section describes all of the necessary steps for installing tools, dependencies, etc. needed to reproduce the artifact. For the remainder of these instructions, when the `FORMALISE2025` directory is referred to, this really means the directory at the following path: `nnv/code/nnv/examples/Submission/FORMALISE2025`.
+This section describes all of the necessary steps for installing tools, dependencies, etc. needed to reproduce the artifact. For the remainder of these instructions, when the `FORMALISE2025` directory is referred to, we are really referring to the directory at the following path: `nnv/code/nnv/examples/Submission/FORMALISE2025`.
 
-1. Clone NNV and npy-matlab and install them ensuring that both have been added to the MATLAB path.
+1. Clone NNV and `npy-matlab` and install them, ensuring that both have been added to the MATLAB path.
 
 ```
 # Clone NNV and npy-matlab
@@ -33,7 +33,7 @@ Either comment it out or remove it.
 
 For `npy-matlab`, add and save the path to MATLAB. Again, instructions are available on [`npy-matlab`'s README](https://github.com/kwikteam/npy-matlab/blob/master/README.md).
 
-For installing both of these tools, it will be necessary to run with administrator privileges so that you can save them to the MATLAB path.
+> NOTE: It will be necessary to perform installations with administrator privileges for both of these tools so that it is possible to `savepath` in MATLAB after completing their respective installation instructions.
 
 2. Download the following dataset files from [here](https://drive.google.com/drive/folders/1sXRtSObHLBTeKVss2IA-NGPKljirgD8D?usp=drive_link) into the `FORMALISE2025/data` folder so that the file tree now looks like
 
@@ -47,25 +47,26 @@ FORMALISE2025
  ...
 ```
 
-Note that the `GTSRB` data folder is quite large, so it will likely be downloaded in separate parts. Please make sure the organization of the folder is the same as in the download link.
+Note that the `GTSRB` data folder is quite large, so it will likely be downloaded in separate parts. Please make sure the organization of the folder is the same as in the download link and that all filenames match.
 
-3. Create a conda environment with `Python v3.11` and install the requirements from `requirements.txt`. Additionally, install the source files to the environment. Both can be done by running the following commands from the root directory (`FORMALISE2025`):
+3. Create a conda environment with `Python v3.11` and install the requirements from `requirements.txt`. Instructions for installing conda can be found [here](https://docs.conda.io/projects/conda/en/latest/user-guide/install/), with links under `# Regular installation` to instructions for the various operating systems. After confirming that conda has installed successfully by creating a conda environment and activating it, proceed to install the requirements as detailed below. In addition to installing from `requirements.txt`, you must install the source files to the environment. Both of these can be done by running the following commands from the root directory (`FORMALISE2025`):
 
-    ```bash
-    pip install -r requirements.txt
+```bash
+# installing requirements
+pip install -r requirements.txt
 
-    # before installing source files, make sure to navigate to this src directory, e.g.
-    cd /path/to/FORMALISE2025/src
-    pip install -e .
-    ```
+# before installing source files, make sure to navigate to this src directory, e.g.
+cd /path/to/FORMALISE2025/src
+pip install -e .
+```
 
 4. Modify the `.env` file to add the path to your NNV and npy-matlab directories (the repositories that were cloned earlier). For the `npy-matlab` repository, you'll want to reference the subfolder in the directory also called `npy-matlab`, e.g. `/some/path/to/npy-matlab/npy-matlab`.
 
 5. With all of these steps done, you are now ready to begin reproducing the artifacts.
 
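For reference, step 3 above assumes that a conda environment has already been created and activated before the `pip` commands are run. A minimal sketch of that setup is shown below; the environment name `formalise2025` is only an illustration and is not prescribed by the artifact:

```bash
# create a conda environment with Python 3.11 (the environment name is illustrative)
conda create -n formalise2025 python=3.11

# activate it before running the pip commands from step 3
conda activate formalise2025
```
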
-# Smoke Test Instructions (~30 minutes)
+# Smoke Test Instructions
 
-Instructions to quickly test the software needed to reproduce the artifacts of the paper. If no issues arise during the smoke test, you can safely proceed to reproducing all artifacts as described in the below section.
+Instructions to quickly test the software needed to reproduce the artifacts of the paper. If no issues arise during the smoke test, you can safely proceed to reproducing all artifacts as described in the below sections.
 
 1. Open a terminal and navigate to the `FORMALISE2025` directory.
 2. Make sure the conda environment with the installed dependencies is activated.
@@ -75,7 +76,7 @@ Instructions to quickly test the software needed to reproduce the artifacts of t
 
 chmod +x run_smoketest.sh && ./run_smoketest.sh
 ```
 
-4. The smoketest will verify a single sample from each of the different datasets and with the different verification algorithms. The results will be output to a `smoketest_outputs.txt` file. Assuming everything ran smoothly, you should see:
+4. The smoke test will verify a single sample. If the smoke test is successful, then the following message will be output.
 
 ```
 **********************************************
@@ -85,7 +86,7 @@ chmod +x run_smoketest.sh && ./run_smoketest.sh
 
 # Reproducing a Subset of the Results
 
-Assuming the average runtime for the experiments remains as shown in the paper, then it will take approximately 9-10 days to reproduce the results. For that reason, this set of instructions is for reproducing a subset of the results. In this case, we verify every tenth sample from what was done for the original verification. Generating the reachable output range plots remains the same.
+Assuming the average runtime for the experiments remains as shown in the paper, then it will take approximately 9-10 days to reproduce the results. For that reason, this set of instructions is for reproducing a subset of the results. More specifically, we reproduce the first row of Table 2 from the paper, i.e. the verification results for the 4-frame variation of the Zoom In dataset with the relax verification algorithm. The results will be output to the console after the script completes. Additionally, this script will generate the reachable output range plots used in Figure 7.
 
 1. Open a terminal and navigate to the `FORMALISE2025` directory.
 2. Make sure the conda environment with the installed dependencies is activated.
@@ -95,9 +96,21 @@ Assuming the average runtime for the experiments remains as shown in the paper,
 
 chmod +x run_subset_vvn.sh && ./run_subset_vvn.sh
 ```
 
+# Reproducing a Subset of the Results pt. 2
+
+There is additionally a script to reproduce a single sample from all variations of the datasets. To run this subset, please perform the following:
+
+1. Open a terminal and navigate to the `FORMALISE2025` directory.
+2. Make sure the conda environment with the installed dependencies is activated.
+3. Run the following command to begin reproducing a subset of the artifacts:
+
+```bash
+chmod +x run_single_sample_vvn.sh && ./run_single_sample_vvn.sh
+```
+
 # Reproducing the Full Results
 
-This set of instructions describes how to reproduce the full results. The instructions will be just like the previous parts only we use a different script.
+This set of instructions describes how to reproduce the full results.
 
 1. Open a terminal and navigate to the `FORMALISE2025` directory.
 2. Make sure the conda environment with the installed dependencies is activated.
diff --git a/code/nnv/examples/Submission/FORMALISE2025/run_single_sample_vvn.sh b/code/nnv/examples/Submission/FORMALISE2025/run_single_sample_vvn.sh
new file mode 100755
index 000000000..d905cb71a
--- /dev/null
+++ b/code/nnv/examples/Submission/FORMALISE2025/run_single_sample_vvn.sh
@@ -0,0 +1,95 @@
+#!/bin/bash
+
+OUTPUT_FILE="test_single_all_samples.txt"
+
+# Clear the file if it exists
+> $OUTPUT_FILE
+
+echo "Starting test..."
+echo "Starting test..."
>> $OUTPUT_FILE + +####################### +### ZOOM IN RESULTS ### +####################### +# Zoom In 4f, relax +python src/vvn/verify.py zoom_in relax 1 4 1800 751 >> $OUTPUT_FILE 2>&1 + +# Zoom In 8f, relax +python src/vvn/verify.py zoom_in relax 1 8 1800 751 >> $OUTPUT_FILE 2>&1 + +# Zoom In 16f, relax +python src/vvn/verify.py zoom_in relax 1 16 1800 751 >> $OUTPUT_FILE 2>&1 + +# Zoom In 4f, approx +python src/vvn/verify.py zoom_in approx 1 4 1800 751 >> $OUTPUT_FILE 2>&1 + +# Zoom In 8f, approx +python src/vvn/verify.py zoom_in approx 1 8 1800 751 >> $OUTPUT_FILE 2>&1 + + +######################## +### ZOOM OUT RESULTS ### +######################## +# Zoom Out 4f, relax +python src/vvn/verify.py zoom_out relax 1 4 1800 624 >> $OUTPUT_FILE 2>&1 + +# Zoom Out 8f, relax +python src/vvn/verify.py zoom_out relax 1 8 1800 624 >> $OUTPUT_FILE 2>&1 + +# Zoom Out 16f, relax +python src/vvn/verify.py zoom_out relax 1 16 1800 624 >> $OUTPUT_FILE 2>&1 + +# Zoom Out 4f, approx +python src/vvn/verify.py zoom_out approx 1 4 1800 624 >> $OUTPUT_FILE 2>&1 + +# Zoom Out 8f, approx +python src/vvn/verify.py zoom_out approx 1 8 1800 624 >> $OUTPUT_FILE 2>&1 + + +####################### +#### GTSRB RESULTS #### +####################### +# GTSRB 4f, relax +python src/vvn/verify.py gtsrb relax 1 4 1800 12597 >> $OUTPUT_FILE 2>&1 + +# GTSRB 8f, relax +python src/vvn/verify.py gtsrb relax 1 8 1800 12597 >> $OUTPUT_FILE 2>&1 + +# GTSRB 16f, relax +python src/vvn/verify.py gtsrb relax 1 16 1800 12597 >> $OUTPUT_FILE 2>&1 + +# GTSRB 4f, approx +python src/vvn/verify.py gtsrb approx 1 4 1800 12597 >> $OUTPUT_FILE 2>&1 + +# GTSRB 8f, approx +python src/vvn/verify.py gtsrb approx 1 8 1800 12597 >> $OUTPUT_FILE 2>&1 + +# GTSRB 16f, approx +python src/vvn/verify.py gtsrb approx 1 16 1800 12597 >> $OUTPUT_FILE 2>&1 + + +####################### +### STMNIST RESULTS ### +####################### +# STMNIST 16f, relax +python src/vvn/verify.py stmnist relax 1 16 1800 311 >> $OUTPUT_FILE 2>&1 + +# STMNIST 32f, relax +python src/vvn/verify.py stmnist relax 1 32 1800 311 >> $OUTPUT_FILE 2>&1 + +# STMNIST 64f, relax +python src/vvn/verify.py stmnist relax 1 64 1800 311 >> $OUTPUT_FILE 2>&1 + +# STMNIST 16f, approx +python src/vvn/verify.py stmnist approx 1 16 1800 311 >> $OUTPUT_FILE 2>&1 + +# STMNIST 32f, approx +python src/vvn/verify.py stmnist approx 1 32 1800 311 >> $OUTPUT_FILE 2>&1 + +# STMNIST 64f, approx +python src/vvn/verify.py stmnist approx 1 64 1800 311 >> $OUTPUT_FILE 2>&1 + + +echo -e "\n\n**********************************************" +echo " Test all samples complete. " +echo -e "**********************************************\n\n" diff --git a/code/nnv/examples/Submission/FORMALISE2025/run_smoketest.sh b/code/nnv/examples/Submission/FORMALISE2025/run_smoketest.sh index 8cfaddddf..924c9da9e 100755 --- a/code/nnv/examples/Submission/FORMALISE2025/run_smoketest.sh +++ b/code/nnv/examples/Submission/FORMALISE2025/run_smoketest.sh @@ -1,94 +1,9 @@ #!/bin/bash -OUTPUT_FILE="smoketest_outputs.txt" - -# Clear the file if it exists -> $OUTPUT_FILE - echo "Starting smoke test..." -echo "Starting smoke test..." 
>> $OUTPUT_FILE -####################### -### ZOOM IN RESULTS ### -####################### # Zoom In 4f, relax -python src/vvn/verify.py zoom_in relax 1 4 1800 751 >> $OUTPUT_FILE 2>&1 - -# Zoom In 8f, relax -python src/vvn/verify.py zoom_in relax 1 8 1800 751 >> $OUTPUT_FILE 2>&1 - -# Zoom In 16f, relax -python src/vvn/verify.py zoom_in relax 1 16 1800 751 >> $OUTPUT_FILE 2>&1 - -# Zoom In 4f, approx -python src/vvn/verify.py zoom_in approx 1 4 1800 751 >> $OUTPUT_FILE 2>&1 - -# Zoom In 8f, approx -python src/vvn/verify.py zoom_in approx 1 8 1800 751 >> $OUTPUT_FILE 2>&1 - - -######################## -### ZOOM OUT RESULTS ### -######################## -# Zoom Out 4f, relax -python src/vvn/verify.py zoom_out relax 1 4 1800 624 >> $OUTPUT_FILE 2>&1 - -# Zoom Out 8f, relax -python src/vvn/verify.py zoom_out relax 1 8 1800 624 >> $OUTPUT_FILE 2>&1 - -# Zoom Out 16f, relax -python src/vvn/verify.py zoom_out relax 1 16 1800 624 >> $OUTPUT_FILE 2>&1 - -# Zoom Out 4f, approx -python src/vvn/verify.py zoom_out approx 1 4 1800 624 >> $OUTPUT_FILE 2>&1 - -# Zoom Out 8f, approx -python src/vvn/verify.py zoom_out approx 1 8 1800 624 >> $OUTPUT_FILE 2>&1 - - -####################### -#### GTSRB RESULTS #### -####################### -# GTSRB 4f, relax -python src/vvn/verify.py gtsrb relax 1 4 1800 12597 >> $OUTPUT_FILE 2>&1 - -# GTSRB 8f, relax -python src/vvn/verify.py gtsrb relax 1 8 1800 12597 >> $OUTPUT_FILE 2>&1 - -# GTSRB 16f, relax -python src/vvn/verify.py gtsrb relax 1 16 1800 12597 >> $OUTPUT_FILE 2>&1 - -# GTSRB 4f, approx -python src/vvn/verify.py gtsrb approx 1 4 1800 12597 >> $OUTPUT_FILE 2>&1 - -# GTSRB 8f, approx -python src/vvn/verify.py gtsrb approx 1 8 1800 12597 >> $OUTPUT_FILE 2>&1 - -# GTSRB 16f, approx -python src/vvn/verify.py gtsrb approx 1 16 1800 12597 >> $OUTPUT_FILE 2>&1 - - -####################### -### STMNIST RESULTS ### -####################### -# STMNIST 16f, relax -python src/vvn/verify.py stmnist relax 1 16 1800 311 >> $OUTPUT_FILE 2>&1 - -# STMNIST 32f, relax -python src/vvn/verify.py stmnist relax 1 32 1800 311 >> $OUTPUT_FILE 2>&1 - -# STMNIST 64f, relax -python src/vvn/verify.py stmnist relax 1 64 1800 311 >> $OUTPUT_FILE 2>&1 - -# STMNIST 16f, approx -python src/vvn/verify.py stmnist approx 1 16 1800 311 >> $OUTPUT_FILE 2>&1 - -# STMNIST 32f, approx -python src/vvn/verify.py stmnist approx 1 32 1800 311 >> $OUTPUT_FILE 2>&1 - -# STMNIST 64f, approx -python src/vvn/verify.py stmnist approx 1 64 1800 311 >> $OUTPUT_FILE 2>&1 - +python src/vvn/verify.py zoom_in relax 1 4 1800 751 echo -e "\n\n**********************************************" echo " Smoke test complete. " diff --git a/code/nnv/examples/Submission/FORMALISE2025/run_subset_vvn.sh b/code/nnv/examples/Submission/FORMALISE2025/run_subset_vvn.sh index 7e14c94df..5c8ba3fa5 100755 --- a/code/nnv/examples/Submission/FORMALISE2025/run_subset_vvn.sh +++ b/code/nnv/examples/Submission/FORMALISE2025/run_subset_vvn.sh @@ -1,14 +1,9 @@ #!/bin/bash -# TABLE II (GET ALL VERIFICATION RESULTS) -python src/run.py subset # MNIST Video -python src/run_gtsrb.py subset # GTSRB -python src/run_stmnist.py subset # STMNIST -python src/analysis/make_table2.py - -# FIGURE 8 (COMPARISON OF AVERAGE RUNTIME) -python src/analysis/make_plots.py +# Reproduce the first row of table 2. 
+# TABLE II (GET ALL VERIFICATION RESULTS) +python src/run.py --subset # MNIST Video # FIGURE 7 (REACHABLE OUTPUT PLOTS) python src/generate_output_plots.py \ No newline at end of file diff --git a/code/nnv/examples/Submission/FORMALISE2025/src/analysis/make_table2.py b/code/nnv/examples/Submission/FORMALISE2025/src/analysis/make_table2.py index 922345d73..3e7914ccb 100644 --- a/code/nnv/examples/Submission/FORMALISE2025/src/analysis/make_table2.py +++ b/code/nnv/examples/Submission/FORMALISE2025/src/analysis/make_table2.py @@ -56,216 +56,217 @@ def summarize(output_file_dir): return num_samples_verified, average_times -# OUTPUTS -tv = [] -rt = [] - -# ZOOM IN RELAX -config = Config( - class_size=10, - epsilon=[1/255, 2/255, 3/255], - ds_type='zoom_in', - sample_len=4, - ver_algorithm='relax', - timeout=1800, - output_dir=output_dir -) -zoomin_4_tv_relax, zoomin_4_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomin_4_tv_relax) -rt.append(zoomin_4_relax) - -config.sample_len = 8 -zoomin_8_tv_relax, zoomin_8_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomin_8_tv_relax) -rt.append(zoomin_8_relax) - -config.sample_len = 16 -zoomin_16_tv_relax, zoomin_16_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomin_16_tv_relax) -rt.append(zoomin_16_relax) - -# ZOOM IN APPROX -config.sample_len = 4 -config.ver_algorithm = 'approx' -zoomin_4_tv_approx, zoomin_4_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomin_4_tv_approx) -rt.append(zoomin_4_approx) - -config.sample_len = 8 -zoomin_8_tv_approx, zoomin_8_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomin_8_tv_approx) -rt.append(zoomin_8_approx) - - -# ZOOM OUT RELAX -config.ds_type = 'zoom_out' -config.sample_len = 4 -config.ver_algorithm = 'relax' -zoomout_4_tv_relax, zoomout_4_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomout_4_tv_relax) -rt.append(zoomout_4_relax) - -config.sample_len = 8 -zoomout_8_tv_relax, zoomout_8_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomout_8_tv_relax) -rt.append(zoomout_8_relax) - -config.sample_len = 16 -zoomout_16_tv_relax, zoomout_16_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomout_16_tv_relax) -rt.append(zoomout_16_relax) - -# ZOOM OUT APPROX -config.sample_len = 4 -config.ver_algorithm = 'approx' -zoomout_4_tv_approx, zoomout_4_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomout_4_tv_approx) -rt.append(zoomout_4_approx) - -config.sample_len = 8 -zoomout_8_tv_approx, zoomout_8_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) -tv.append(zoomout_8_tv_approx) -rt.append(zoomout_8_approx) - - -# GTSRB RELAX -config.ds_type = 'gtsrb' -config.sample_len = 4 -config.ver_algorithm = 'relax' -gtsrb_4_tv_relax, gtsrb_4_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_4_tv_relax) -rt.append(gtsrb_4_relax) - -config.sample_len = 8 -gtsrb_8_tv_relax, gtsrb_8_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_8_tv_relax) -rt.append(gtsrb_8_relax) - -config.sample_len = 16 -gtsrb_16_tv_relax, gtsrb_16_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_16_tv_relax) 
-rt.append(gtsrb_16_relax) - -# GTSRB APPROX -config.sample_len = 4 -config.ver_algorithm = 'approx' -gtsrb_4_tv_approx, gtsrb_4_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_4_tv_approx) -rt.append(gtsrb_4_approx) - -config.sample_len = 8 -gtsrb_8_tv_approx, gtsrb_8_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_8_tv_approx) -rt.append(gtsrb_8_approx) - -config.sample_len = 16 -gtsrb_16_tv_approx, gtsrb_16_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(gtsrb_16_tv_approx) -rt.append(gtsrb_16_approx) - - -# STMNIST RELAX -config.ds_type = 'stmnist' -config.sample_len = 16 -config.ver_algorithm = 'relax' -stmnist_16_tv_relax, stmnist_16_relax = summarize(vsp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_16_tv_relax) -rt.append(stmnist_16_relax) - -config.sample_len = 32 -stmnist_32_tv_relax, stmnist_32_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_32_tv_relax) -rt.append(stmnist_32_relax) - -config.sample_len = 64 -stmnist_64_tv_relax, stmnist_64_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_64_tv_relax) -rt.append(stmnist_64_relax) - -# STMNIST APPROX -config.sample_len = 16 -config.ver_algorithm = 'approx' -stmnist_16_tv_approx, stmnist_16_approx = summarize(vsp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_16_tv_approx) -rt.append(stmnist_16_approx) - -config.sample_len = 32 -stmnist_32_tv_approx, stmnist_32_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_32_tv_approx) -rt.append(stmnist_32_approx) - -config.sample_len = 64 -stmnist_64_tv_approx, stmnist_64_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) -tv.append(stmnist_64_tv_approx) -rt.append(stmnist_64_approx) - - -if __name__ == "__main__": - datasets = ["Zoom In", "Zoom Out", "GTSRB", "STMNIST"] - veralgs = ["relax", "approx"] - - # Create all the labels - labels = [] - for ds in datasets: - for va in veralgs: - if (ds == "Zoom In" or ds == "Zoom Out") and va == "approx": - sample_lengths = ["4", "8"] - - elif ds == "STMNIST": - sample_lengths = ["4", "8", "16"] - else: - sample_lengths = ["16", "32", "64"] - - for sl in sample_lengths: - for eps in ["1/255", "2/255", "3/255"]: - labels.append(f'{ds} {va} {sl} {eps}') - - # needs to be flattened for pairing with label later - total_verified_flattened = [ - x for sample in tv - for x in sample - ] - - # needs to be flattened for pairing with label later - running_time_flattened = [ - x for sample in rt - for x in sample - ] - - MAX_LENGTH_LABEL = max([len(label) for label in labels]) - - # Pair together the labels and results - labels_and_results = zip(labels, total_verified_flattened, running_time_flattened) - - # Create the row for the table - def format_row(sample): - label, tv, t = sample - label = f"{label:<{MAX_LENGTH_LABEL}}" +if __name__=="__main__": + # OUTPUTS + tv = [] + rt = [] + + # ZOOM IN RELAX + config = Config( + class_size=10, + epsilon=[1/255, 2/255, 3/255], + ds_type='zoom_in', + sample_len=4, + ver_algorithm='relax', + timeout=1800, + output_dir=output_dir + ) + zoomin_4_tv_relax, zoomin_4_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomin_4_tv_relax) + rt.append(zoomin_4_relax) + + config.sample_len = 8 + zoomin_8_tv_relax, zoomin_8_relax 
= summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomin_8_tv_relax) + rt.append(zoomin_8_relax) + + config.sample_len = 16 + zoomin_16_tv_relax, zoomin_16_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomin_16_tv_relax) + rt.append(zoomin_16_relax) + + # ZOOM IN APPROX + config.sample_len = 4 + config.ver_algorithm = 'approx' + zoomin_4_tv_approx, zoomin_4_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomin_4_tv_approx) + rt.append(zoomin_4_approx) + + config.sample_len = 8 + zoomin_8_tv_approx, zoomin_8_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomin_8_tv_approx) + rt.append(zoomin_8_approx) + + + # ZOOM OUT RELAX + config.ds_type = 'zoom_out' + config.sample_len = 4 + config.ver_algorithm = 'relax' + zoomout_4_tv_relax, zoomout_4_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomout_4_tv_relax) + rt.append(zoomout_4_relax) + + config.sample_len = 8 + zoomout_8_tv_relax, zoomout_8_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomout_8_tv_relax) + rt.append(zoomout_8_relax) + + config.sample_len = 16 + zoomout_16_tv_relax, zoomout_16_relax = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomout_16_tv_relax) + rt.append(zoomout_16_relax) + + # ZOOM OUT APPROX + config.sample_len = 4 + config.ver_algorithm = 'approx' + zoomout_4_tv_approx, zoomout_4_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomout_4_tv_approx) + rt.append(zoomout_4_approx) + + config.sample_len = 8 + zoomout_8_tv_approx, zoomout_8_approx = summarize(vp.build_output_filepath(config=config, parent_only=True)) + tv.append(zoomout_8_tv_approx) + rt.append(zoomout_8_approx) + + + # GTSRB RELAX + config.ds_type = 'gtsrb' + config.sample_len = 4 + config.ver_algorithm = 'relax' + gtsrb_4_tv_relax, gtsrb_4_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_4_tv_relax) + rt.append(gtsrb_4_relax) + + config.sample_len = 8 + gtsrb_8_tv_relax, gtsrb_8_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_8_tv_relax) + rt.append(gtsrb_8_relax) + + config.sample_len = 16 + gtsrb_16_tv_relax, gtsrb_16_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_16_tv_relax) + rt.append(gtsrb_16_relax) + + # GTSRB APPROX + config.sample_len = 4 + config.ver_algorithm = 'approx' + gtsrb_4_tv_approx, gtsrb_4_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_4_tv_approx) + rt.append(gtsrb_4_approx) + + config.sample_len = 8 + gtsrb_8_tv_approx, gtsrb_8_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_8_tv_approx) + rt.append(gtsrb_8_approx) + + config.sample_len = 16 + gtsrb_16_tv_approx, gtsrb_16_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(gtsrb_16_tv_approx) + rt.append(gtsrb_16_approx) + + + # STMNIST RELAX + config.ds_type = 'stmnist' + config.sample_len = 16 + config.ver_algorithm = 'relax' + stmnist_16_tv_relax, stmnist_16_relax = summarize(vsp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_16_tv_relax) + rt.append(stmnist_16_relax) + + config.sample_len = 32 + stmnist_32_tv_relax, stmnist_32_relax = 
summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_32_tv_relax) + rt.append(stmnist_32_relax) + + config.sample_len = 64 + stmnist_64_tv_relax, stmnist_64_relax = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_64_tv_relax) + rt.append(stmnist_64_relax) + + # STMNIST APPROX + config.sample_len = 16 + config.ver_algorithm = 'approx' + stmnist_16_tv_approx, stmnist_16_approx = summarize(vsp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_16_tv_approx) + rt.append(stmnist_16_approx) + + config.sample_len = 32 + stmnist_32_tv_approx, stmnist_32_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_32_tv_approx) + rt.append(stmnist_32_approx) + + config.sample_len = 64 + stmnist_64_tv_approx, stmnist_64_approx = summarize(vgp.build_output_filepath(config=config, parent_only=True)) + tv.append(stmnist_64_tv_approx) + rt.append(stmnist_64_approx) + + + if __name__ == "__main__": + datasets = ["Zoom In", "Zoom Out", "GTSRB", "STMNIST"] + veralgs = ["relax", "approx"] + + # Create all the labels + labels = [] + for ds in datasets: + for va in veralgs: + if (ds == "Zoom In" or ds == "Zoom Out") and va == "approx": + sample_lengths = ["4", "8"] + + elif ds == "STMNIST": + sample_lengths = ["4", "8", "16"] + else: + sample_lengths = ["16", "32", "64"] + + for sl in sample_lengths: + for eps in ["1/255", "2/255", "3/255"]: + labels.append(f'{ds} {va} {sl} {eps}') + + # needs to be flattened for pairing with label later + total_verified_flattened = [ + x for sample in tv + for x in sample + ] + + # needs to be flattened for pairing with label later + running_time_flattened = [ + x for sample in rt + for x in sample + ] + + MAX_LENGTH_LABEL = max([len(label) for label in labels]) + + # Pair together the labels and results + labels_and_results = zip(labels, total_verified_flattened, running_time_flattened) - # compute percent verified - num_attempted = 215 if 'GTSRB' in label else 100 - percent_verified = round((tv / num_attempted) * 100, 2) - t = round(t, 2) + # Create the row for the table + def format_row(sample): + label, tv, t = sample + label = f"{label:<{MAX_LENGTH_LABEL}}" + + # compute percent verified + num_attempted = 215 if 'GTSRB' in label else 100 + percent_verified = round((tv / num_attempted) * 100, 2) + t = round(t, 2) - pv_string = f"{percent_verified:<{6}}" - at_string = f"{t:<{9}}" + pv_string = f"{percent_verified:<{6}}" + at_string = f"{t:<{9}}" - return " | ".join([label, pv_string, at_string]) - - # Create the human-readable table - output_file = os.path.join(output_dir, "table2.txt") - with open(output_file, 'w') as f: - - # write the header - label_header = f"{'LABEL':^{MAX_LENGTH_LABEL}}" - psrv_header = f"{'PSRV':^{6}}" - time_header = f"{'Avg. Time':^{10}}" - header = " | ".join([label_header, psrv_header, time_header]) - line = '-'*len(header) - f.write(header + "\n") - f.write(line + "\n") - - for sample in labels_and_results: - f.write(format_row(sample) + "\n") + return " | ".join([label, pv_string, at_string]) + # Create the human-readable table + output_file = os.path.join(output_dir, "table2.txt") + with open(output_file, 'w') as f: + + # write the header + label_header = f"{'LABEL':^{MAX_LENGTH_LABEL}}" + psrv_header = f"{'PSRV':^{6}}" + time_header = f"{'Avg. 
Time':^{10}}" + header = " | ".join([label_header, psrv_header, time_header]) + line = '-'*len(header) + f.write(header + "\n") + f.write(line + "\n") + + for sample in labels_and_results: + f.write(format_row(sample) + "\n") + diff --git a/code/nnv/examples/Submission/FORMALISE2025/src/run.py b/code/nnv/examples/Submission/FORMALISE2025/src/run.py index b58ef367c..822e0ee36 100644 --- a/code/nnv/examples/Submission/FORMALISE2025/src/run.py +++ b/code/nnv/examples/Submission/FORMALISE2025/src/run.py @@ -3,6 +3,7 @@ import vvn.prep as vp import vvn.verify as vvn +from analysis.make_table2 import summarize from vvn.config import Config @@ -27,10 +28,6 @@ zoom_in_samples = [751, 137, 34, 878, 328, 289, 262, 168, 871, 126, 796, 877, 645, 108, 693, 501, 44, 41, 117, 257, 272, 597, 707, 37, 661, 237, 845, 763, 830, 645, 498, 259, 532, 692, 331, 972, 8, 904, 966, 193, 826, 501, 407, 331, 187, 254, 908, 402, 126, 116, 452, 121, 429, 411, 709, 317, 969, 57, 863, 543, 636, 150, 450, 99, 652, 350, 999, 736, 724, 432, 680, 230, 833, 90, 61, 780, 267, 922, 346, 100, 272, 125, 452, 331, 537, 744, 435, 198, 442, 423, 248, 790, 320, 830, 806, 761, 92, 714, 744, 207] zoom_out_samples = [624, 882, 278, 180, 540, 439, 306, 757, 821, 654, 248, 817, 368, 949, 963, 59, 260, 34, 357, 465, 304, 69, 238, 666, 867, 356, 239, 776, 585, 460, 760, 536, 158, 301, 154, 280, 908, 659, 632, 297, 910, 687, 499, 686, 463, 418, 248, 152, 596, 578, 96, 922, 50, 117, 169, 738, 176, 989, 809, 491, 702, 67, 445, 441, 547, 616, 285, 649, 12, 809, 872, 126, 812, 630, 916, 303, 952, 758, 390, 120, 332, 507, 174, 529, 4, 873, 868, 297, 586, 933, 196, 594, 112, 736, 337, 755, 719, 223, 169, 433] - if len(sys.argv) < 2: - zoom_in_samples = zoom_in_samples[:, :, 10] - zoom_out_samples = zoom_out_samples[:, :, 10] - # ===================================== # ============ RELAX ================== # ===================================== @@ -38,6 +35,30 @@ # run experiment #1 : dataset = zoom in, video length = 4 vvn.run(config=config, indices=zoom_in_samples) + if "--subset" in sys.argv: + # summarize the results + tv, rt = summarize(vp.build_output_filepath(config=config, parent_only=True)) + rt = [round(rt[i], 2) for i in range(len(rt))] + + label = "Zoom In, relax 1/255 2/255 3/255 " + tv_row = f"{'PSRV':>{14}}" + f"{' ':<{5}}" + f"{tv[0]:<{12}}" + f"{tv[1]:<{12}}" + f"{tv[2]:<{12}}" + rt_row = f"{'Avg. 
Time (s)':>{14}}" + f"{' ':<{5}}" + f"{rt[0]:<{12}}" + f"{rt[1]:<{12}}" + f"{rt[2]:<{12}}" + + length_toplines = max(len(tv_row), len(rt_row), len(label)) + + # print results + print('\n') + print("="*length_toplines) + print(label) + print("-"*length_toplines) + print(tv_row) + print(rt_row) + print("="*length_toplines) + print('\n') + + # ensure the program stops there + sys.exit() + # run experiment #2 : dataset = zoom out, video length = 4 config.ds_type = 'zoom_out' vvn.run(config=config, indices=zoom_out_samples) From f6561984d63e1cf37708f26291179b8d84179e0a Mon Sep 17 00:00:00 2001 From: Samuel Sasaki <64754486+sammsaski@users.noreply.github.com> Date: Fri, 6 Dec 2024 18:20:21 -0600 Subject: [PATCH 2/2] add timing to README --- code/nnv/examples/Submission/FORMALISE2025/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/code/nnv/examples/Submission/FORMALISE2025/README.md b/code/nnv/examples/Submission/FORMALISE2025/README.md index 6117800c0..59a99ffee 100644 --- a/code/nnv/examples/Submission/FORMALISE2025/README.md +++ b/code/nnv/examples/Submission/FORMALISE2025/README.md @@ -64,7 +64,7 @@ pip install -e . 5. With all of these steps done, you are now ready to begin reproducing the artifacts. -# Smoke Test Instructions +# Smoke Test Instructions (~1min) Instructions to quickly test the software needed to reproduce the artifacts of the paper. If no issues arise during the smoke test, you can safely proceed to reproducing all artifacts as described in the below sections. @@ -84,7 +84,7 @@ chmod +x run_smoketest.sh && ./run_smoketest.sh ********************************************** ``` -# Reproducing a Subset of the Results +# Reproducing a Subset of the Results (~1-2 hours) Assuming the average runtime for the experiments remains as shown in the paper, then it will take approximately 9-10 days to reproduce the results. For that reason, this set of instructions is for reproducing a subset of the results. More specifically, we reproduce the first row of Table 2 from the paper, e.g. the verification results for the 4-frame variation of the Zoom In dataset with the relax verification algorithm. The results will be output to the console after the script completes. Additionally, this script will generate the reachable output range plots used in Figure 7.