From 6f2a8ab6a5b7408162014d2bfd7d49328325db3a Mon Sep 17 00:00:00 2001
From: Arnau Quera-Bofarull
Date: Mon, 25 Sep 2023 15:03:50 +0100
Subject: [PATCH] added examples readme, updated bayesflow reference

---
 examples/README.md | 42 ++++++++++++++++++++++++++++++++++++++++++
 paper/paper.bib    | 17 +++++++++++------
 2 files changed, 53 insertions(+), 6 deletions(-)
 create mode 100644 examples/README.md

diff --git a/examples/README.md b/examples/README.md
new file mode 100644
index 0000000..9d5ff72
--- /dev/null
+++ b/examples/README.md
@@ -0,0 +1,42 @@
+# Examples
+
+Here we include multiple examples showcasing the utility of `BlackBIRDS` for performing calibration in a variety of ways, including variational inference (VI), MCMC, and simulated minimum distance (SMD).
+
+# 1. Models
+
+## 1.1 Random walk
+
+The random walk process we consider is given by
+
+$$
+x(t+1) = x(t) + 2\epsilon - 1, \;\;\; \epsilon \sim \mathrm{Bernoulli}(p),
+$$
+
+and the inference exercise consists of inferring the value of $p$ from an observed time-series $\{x(t)\}_t$. In `smd/01-random_walk.ipynb` we recover $p$ simply by using gradient descent to minimise the L2 distance between the simulated and the observed time-series. A Bayesian approach using generalised variational inference (GVI) is shown in `variational_inference/01-random_walk.ipynb`. In this case, the candidate family approximating the generalised posterior is a family of normal distributions in which we vary the mean $\mu$ and the standard deviation $\sigma$.
+
+## 1.2 SIR model
+
+In `variational_inference/02-SIR.ipynb` we perform GVI on a typical agent-based SIR model. We assume that an infectious disease spreads through a population by direct contact between agents. The contacts of the agents are given by a contact graph, which can be any graph generated by `networkx`. For each contact with an infected agent, the receiving agent has a probability $\beta$ of becoming infected. Thus, at each time-step, the probability of agent $i$ becoming infected is
+
+$$
+p_i = 1 - (1-\beta)^{n_i},
+$$
+
+where $n_i$ is the number of infected neighbours. Each infected agent can then recover with probability $\gamma$. Note that this parameterisation of the SIR model does not recover the standard ODE approach in the case of a complete graph. In contrast to the random walk model, here we take $\mathcal Q$, the family of distributions approximating the generalised posterior, to be a normalising flow. As before, we specify a metric to compute the distance between an observed and a simulated time-series and perform GVI to recover the model parameters: $\beta$, $\gamma$, and the initial fraction of infected agents.
+
+## 1.3 Brock & Hommes model
+
+The [Brock & Hommes](https://www.sciencedirect.com/science/article/abs/pii/S0165188998000116) (BH) model is an agent-based model of asset pricing with heterogeneous agents that, despite its simplicity, can generate chaotic dynamics, making it a good test case for the robustness of the methods implemented in this package. We refer to [this](https://arxiv.org/pdf/2307.01085.pdf) paper for a more detailed explanation of the model, as well as for technical experiments run with `BlackBIRDS`. In `variational_inference/03-brock-hommes.ipynb` we show an example of GVI inferring 4 parameters of the BH model.
+
+# 2. Multi-GPU parallelisation support
+
+`BlackBIRDS` supports multi-CPU and multi-GPU parallelisation when performing GVI. At each epoch, we need to draw $n$ samples from the candidate posterior and evaluate each sample with the loss function. This process is embarrassingly parallel, so we can exploit whatever computational resources are available; we use `mpi4py` to do so. The example showcased in [the documentation](https://www.arnau.ai/blackbirds/examples/gpu_parallelization/) illustrates the full setup.
+
+# 3. Score-based vs pathwise gradients
+
+To perform GVI, we need to compute the gradient of the expected loss with respect to the parameters of the candidate posterior (see [this paper](https://jmlr.org/papers/volume21/19-346/19-346.pdf) for a good review of Monte Carlo gradient estimation). This gradient can be obtained through the score-based estimator, which only requires gradients of the posterior density and makes no assumptions about the simulator, or through the pathwise gradient, which requires the simulator to be differentiable.
+
+While GVI can be conducted without a differentiable simulator, the pathwise gradient typically exhibits lower variance and yields more efficient training than the score-based approach, making it worthwhile to port simulators to differentiable frameworks. A comparison of both methods is shown in `examples/variational_inference/04-score_vs_pathwise_gradient.ipynb`; a toy sketch of the two estimators follows below.
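+
+The following is a minimal sketch of the two estimators on a toy problem, not the `BlackBIRDS` API; the loss and all names here are illustrative assumptions. We minimise $\mathbb{E}_{x \sim \mathcal{N}(\mu, 1)}[x^2]$ over $\mu$, whose exact gradient is $2\mu$:
+
+```python
+import torch
+
+torch.manual_seed(0)
+mu = torch.tensor(1.0, requires_grad=True)
+n = 10_000
+
+def loss_fn(x):
+    # Stand-in for a simulator-based loss.
+    return x ** 2
+
+q = torch.distributions.Normal(mu, 1.0)
+
+# Pathwise gradient: rsample() reparameterises x = mu + eps, eps ~ N(0, 1),
+# so the gradient flows through the (differentiable) loss itself.
+x = q.rsample((n,))
+pathwise = torch.autograd.grad(loss_fn(x).mean(), mu)[0]
+
+# Score-based gradient: no gradient flows through the samples; we only
+# differentiate log q(x; mu), weighted by the loss values.
+x = q.sample((n,))
+surrogate = (q.log_prob(x) * loss_fn(x)).mean()
+score = torch.autograd.grad(surrogate, mu)[0]
+
+print(pathwise.item(), score.item())  # both approximate 2 * mu = 2
+```
+
+Re-running this sketch with different seeds shows the score-based estimate fluctuating considerably more than the pathwise one, which is the variance gap mentioned above.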
+
+
diff --git a/paper/paper.bib b/paper/paper.bib
index 255b40c..fafe28c 100644
--- a/paper/paper.bib
+++ b/paper/paper.bib
@@ -187,10 +187,15 @@ @article{pyabc
   url = {https://doi.org/10.21105/joss.04304},
 }
 
-@misc{radev2023bayesflow,
-  title = {BayesFlow: Amortized {B}ayesian Workflows With Neural Networks},
-  author = {Stefan T Radev and Marvin Schmitt and Lukas Schumacher and Lasse Elsem\"{u}ller and Valentin Pratz and Yannik Sch\"{a}lte and Ullrich K\"{o}the and Paul-Christian B\"{u}rkner},
-  year = {2023},
-  publisher= {arXiv},
-  url={https://arxiv.org/abs/2306.16015}
+@article{radev2023bayesflow,
+  author = {Radev, Stefan T. and Schmitt, Marvin and Schumacher, Lukas and Elsemüller, Lasse and Pratz, Valentin and Schälte, Yannik and Köthe, Ullrich and Bürkner, Paul-Christian},
+  doi = {10.21105/joss.05702},
+  journal = {Journal of Open Source Software},
+  month = sep,
+  number = {89},
+  pages = {5702},
+  title = {{BayesFlow: Amortized Bayesian Workflows With Neural Networks}},
+  url = {https://joss.theoj.org/papers/10.21105/joss.05702},
+  volume = {8},
+  year = {2023}
 }
\ No newline at end of file