This repository contains the code for the method of Adversarial Alpha-Divergence Minimization (AADM) algorithm. For further information on this method please refer to the article.
Some very simple examples are given for the use of the code: a toy problem, one application to the Boston Housing UCI dataset and a final example with a large dataset (Airlines Delay).
To run the codes we have used Python 3.6. Moreover, the following dependencies must be installed
Here are included some examples of the experiments performed with AADM, including code and the datasets employed. Whenever one of the codes included is executed, it creates a folder and stores the results automatically, which in each case might be different depending on the task we are interested in.
Each code provides the results for three metrics: for the regression cases, those would be the RMSE, the test log-likelihood and the CRPS. However, for the classification codes we will have the classification error rate, the test log-likelihood and a form of the Brier score, depending on the experiment (binary or multiclass classification).
Two synthetic dataset were generated: one where the data follows a bimodal posterior distribution and another where the posterior is heteroscedastic. Both the datasets and the code employed can be found in the folder synthetic_problems
. The final results, when comparing alpha = 1.0 to alpha = 0.0001 behave as the following:
As an example for the rest of the datasets employed, the folder regression_example
includes the code and files necessary to try the AADM algorithm in the Boston Housing dataset (included in the folder as well). The file named permutations_boston_housing.txt
contains 20 rows of the randomly shuffled index of the rows of the dataset so every method can train using the exact same permutations as the others. To extract the results we executed all the codes in all of these 20 permutations and averaged the performance of each method across them. There are three codes included:
AADM_boston.py
- Our proposed method. To run it usepython3.6 AADM_boston.py <split number> <alpha value> <layers of the main network - 1 or 2> boston_housing.txt
.AVB_boston.py
- The Adversarial Variational Bayes code we have reproduced to compare results. This code can be run with the same line of AADM substituting the name of the python script and skipping over the value of alpha needed there.VI_boston.py
- Variational Inference to use as a baseline to compare results as well. To run this code usepython3.6 VI_boston.py <split_number> <layers of the main network - 1 or 2> boston_housing.txt
.
For each dataset we obtained a new set of 20 permutations and averaged the performance, obtaining the results presented in the paper. In order to use this codes for other similar datasets we would only have to change the name of the file employed in the previous calls for the algorithms and create a permutations file accordingly.
Similar to the regression case, the folder classification_example
includes the code and files necessary to try the AADM algorithm in the Iono binary classification dataset (included in the folder as well). The structure and execution of the code is identical to the case for regression. There are three codes included:
AADM_iono.py
- Our proposed method. To run it usepython3.6 AADM_iono.py <split number> <alpha value> <layers of the main network - 1 or 2> iono.txt
.AVB_iono.py
- Adversarial Variational Bayes for comparison. This code can be run with the same line of AADM substituting the name of the python script and without including the value for alpha.VI_iono.py
- Variational Inference as a baseline to compare results as well. To run this code usepython3.6 VI_iono.py <split_number> <layers of the main network - 1 or 2> iono.txt
.
For each dataset we obtained a new set of 20 permutations and averaged the performance, obtaining the results presented in the paper. In order to use this codes for other similar datasets we would only have to change the name of the file employed in the previous calls for the algorithms and create a permutations file accordingly.
Finally we include the code for one of the experiments we performed in a big dataset to study the convergence. The folder big_data_example
contains the code to perform the Airlines Delay study we present in the paper. The data needs to be downloaded from here and converted to the variables we have described in the paper. Once this is done, the data need to be shuffled and stored as shuffled_airlines.npy
. The three codes included here then can be used. They are:
AADM_airlines.py
- Our method. To run it usepython3.6 AADM_airlines.py <alpha value> <number of layers of the main network - 1 or 2>
.AVB_airlines.py
- Adversarial Variational Vayes. To run it, use the previous line excluding the flag of the value of alpha.VI_airlines.py
- Variational Inference. To run it do the same as in the case of AVB.
For the case of MNIST we provide two different versions of the code: one with AADM, AVB and VI implemented into fully-connected networks, and another using convolutional neural networks. In the first case, special techniques can be used to accelerate the computation, such as normalizing flows. We have also deactivated the adaptive contrast in this simple case. In the other hand, AADM can be implemented in CNNs, as can be seen here. The resulting code can be used for CIFAR-10 without very strong changes (only modifiying those values concerning the fact that CIFAR-10 images have 3 input channels instead of 1, as in MNIST).
For more examples of the code, the datasets employed for any of the experiments and other results or further questions, please contact at [email protected] or open an issue.