From a489c4624512e5480f167211f4cf419c517bb7fb Mon Sep 17 00:00:00 2001
From: David Nabergoj
Date: Wed, 14 Aug 2024 13:35:38 +0200
Subject: [PATCH] Update docs

---
 README.md                                     | 66 ++-----------------
 .../source/guides/mathematical_background.rst | 38 ++++++++++++++
 docs/source/guides/usage.rst                  |  3 +-
 docs/source/index.rst                         |  3 +
 4 files changed, 50 insertions(+), 60 deletions(-)
 create mode 100644 docs/source/guides/mathematical_background.rst

diff --git a/README.md b/README.md
index 3fe5c48..016d1bd 100644
--- a/README.md
+++ b/README.md
@@ -30,16 +30,20 @@
 print(log_prob.shape)  # (100,)
 print(x_new.shape)  # (50, 3)
 ```
 
-We provide more examples [here](examples/).
+Check the [documentation](https://torchflows.readthedocs.io/en/latest/) for tutorials and the list of supported architectures.
+We also provide examples [here](examples/).
 
 ## Installing
 
-Install via pip:
+We support Python versions 3.7 and upwards.
+
+Install Torchflows via pip:
+
 ```
 pip install torchflows
 ```
 
-Install the package directly from Github:
+Install Torchflows directly from Github:
 ```
 pip install git+https://github.com/davidnabergoj/torchflows.git
@@ -53,59 +57,3 @@
 cd torchflows
 pip install -r requirements.txt
 ```
-We support Python versions 3.7 and upwards.
-
-## Brief background
-
-A normalizing flow (NF) is a flexible trainable distribution.
-It is defined as a bijective transformation of a simple distribution, such as a standard Gaussian.
-The bijection is typically an invertible neural network.
-Training a NF using a dataset means optimizing the bijection's parameters to make the dataset likely under the NF.
-We can use a NF to compute the probability of a data point or to independently sample data from the process that
-generated our dataset.
-
-The density of a NF $q(x)$ with the bijection $f(z) = x$ and base distribution $p(z)$ is defined as:
-$$\log q(x) = \log p(f^{-1}(x)) + \log\left|\det J_{f^{-1}}(x)\right|.$$
-Sampling from a NF means sampling from the simple distribution and transforming the sample using the bijection.
-
-## Supported architectures
-
-We list supported NF architectures below.
-We classify architectures as either autoregressive, residual, or continuous; as defined
-by [Papamakarios et al. (2021)](https://arxiv.org/abs/1912.02762).
-We specify whether the forward and inverse passes are exact; otherwise they are numerical or not implemented (Planar,
-Radial, and Sylvester flows).
-An exact forward pass guarantees exact density estimation, whereas an exact inverse pass guarantees exact sampling.
-Note that the directions can always be reversed, which enables exact computation for the opposite task.
-We also specify whether the logarithm of the Jacobian determinant of the transformation is exact or computed numerically.
-
-| Architecture | Bijection type | Exact forward | Exact inverse | Exact log determinant |
-|--------------------------------------------------------------------------|:--------------------------:|:---------------:|:-------------:|:---------------------:|
-| [NICE](http://arxiv.org/abs/1410.8516) | Autoregressive | ✔ | ✔ | ✔ |
-| [Real NVP](http://arxiv.org/abs/1605.08803) | Autoregressive | ✔ | ✔ | ✔ |
-| [MAF](http://arxiv.org/abs/1705.07057) | Autoregressive | ✔ | ✔ | ✔ |
-| [IAF](http://arxiv.org/abs/1606.04934) | Autoregressive | ✔ | ✔ | ✔ |
-| [Rational quadratic NSF](http://arxiv.org/abs/1906.04032) | Autoregressive | ✔ | ✔ | ✔ |
-| [Linear rational NSF](http://arxiv.org/abs/2001.05168) | Autoregressive | ✔ | ✔ | ✔ |
-| [NAF](http://arxiv.org/abs/1804.00779) | Autoregressive | ✔ | ✗ | ✔ |
-| [UMNN](http://arxiv.org/abs/1908.05164) | Autoregressive | ✗ | ✗ | ✔ |
-| [Planar](https://onlinelibrary.wiley.com/doi/abs/10.1002/cpa.21423) | Residual | ✔ | ✗ | ✔ |
-| [Radial](https://proceedings.mlr.press/v37/rezende15.html) | Residual | ✔ | ✗ | ✔ |
-| [Sylvester](http://arxiv.org/abs/1803.05649) | Residual | ✔ | ✗ | ✔ |
-| [Invertible ResNet](http://arxiv.org/abs/1811.00995) | Residual | ✔ | ✗ | ✗ |
-| [ResFlow](http://arxiv.org/abs/1906.02735) | Residual | ✔ | ✗ | ✗ |
-| [Proximal ResFlow](http://arxiv.org/abs/2211.17158) | Residual | ✔ | ✗ | ✗ |
-| [FFJORD](http://arxiv.org/abs/1810.01367) | Continuous | ✗ | ✗ | ✗ |
-| [RNODE](http://arxiv.org/abs/2002.02798) | Continuous | ✗ | ✗ | ✗ |
-| [DDNF](http://arxiv.org/abs/1810.03256) | Continuous | ✗ | ✗ | ✗ |
-| [OT flow](http://arxiv.org/abs/2006.00104) | Continuous | ✗ | ✗ | ✗ |
-
-
-We also support simple bijections (all with exact forward passes, inverse passes, and log determinants):
-
-* Permutation
-* Elementwise translation (shift vector)
-* Elementwise scaling (diagonal matrix)
-* Rotation (orthogonal matrix)
-* Triangular matrix
-* Dense matrix (using the QR or LU decomposition)
diff --git a/docs/source/guides/mathematical_background.rst b/docs/source/guides/mathematical_background.rst
new file mode 100644
index 0000000..cb4dd94
--- /dev/null
+++ b/docs/source/guides/mathematical_background.rst
@@ -0,0 +1,38 @@
+What is a normalizing flow
+==========================
+
+A normalizing flow (NF) is a flexible trainable distribution.
+It is defined as a bijective transformation of a simple distribution, such as a standard Gaussian.
+The bijection is typically an invertible neural network.
+Training a NF using a dataset means optimizing the bijection's parameters to make the dataset likely under the NF.
+We can use a NF to compute the probability of a data point or to independently sample data from the process that
+generated our dataset.
+
+The density of a NF :math:`q(x)` with the bijection :math:`f(z) = x` and base distribution :math:`p(z)` is defined as:
+
+.. math::
+    \log q(x) = \log p(f^{-1}(x)) + \log\left|\det J_{f^{-1}}(x)\right|.
+
+Sampling from a NF means sampling from the simple distribution and transforming the sample using the bijection.
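+
+The following sketch shows both operations in code, mirroring the quickstart example from the README.
+The import paths and names below (``Flow``, ``RealNVP``) follow that example and are illustrative rather than a definitive API reference; see :doc:`basic_usage` for details.
+
+.. code-block:: python
+
+    import torch
+    from torchflows import Flow
+    from torchflows.architectures import RealNVP
+
+    # 100 training points from a 3-dimensional data-generating process.
+    x = torch.randn(100, 3)
+
+    # A flow pairs a base distribution with a trainable bijection (Real NVP here).
+    flow = Flow(RealNVP(event_shape=(3,)))
+    flow.fit(x)
+
+    # Density estimation: log q(x) via the change-of-variables formula above.
+    log_prob = flow.log_prob(x)  # shape: (100,)
+
+    # Sampling: draw z from the base distribution and compute x = f(z).
+    x_new = flow.sample(50)  # shape: (50, 3)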
diff --git a/docs/source/guides/usage.rst b/docs/source/guides/usage.rst
index fd8cd2c..9b91ff6 100644
--- a/docs/source/guides/usage.rst
+++ b/docs/source/guides/usage.rst
@@ -5,7 +5,8 @@
 We provide tutorials and notebooks for typical Torchflows use cases.
 
 .. toctree::
+    mathematical_background
     basic_usage
     event_shapes
     image_modeling
-    choosing_base_distributions
\ No newline at end of file
+    choosing_base_distributions
diff --git a/docs/source/index.rst b/docs/source/index.rst
index bb5696f..606b626 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -12,6 +12,9 @@ It implements many normalizing flow architectures and their building blocks for:
 * easy use of normalizing flows as trainable distributions;
 * easy implementation of new normalizing flows.
 
+Torchflows is structured according to the review paper `Normalizing Flows for Probabilistic Modeling and Inference <https://arxiv.org/abs/1912.02762>`_ by Papamakarios et al. (2021), which classifies flow architectures as autoregressive, residual, or continuous.
+Visit the `Github page <https://github.com/davidnabergoj/torchflows>`_ to keep up to date, and post questions or issues `here <https://github.com/davidnabergoj/torchflows/issues>`_.
+
 Installing
 ---------------
 Torchflows can be installed easily using pip: