
Commit

Widening the scope of the package and dropping support for batching (#214)

* renamed rv to result in forward

* added abstract type Transform and removed dimensionality from Bijector

* updated Composed to new interface

* updated Exp and Log to new interface

* updated Logit to new interface

* removed something that shouldn't be there

* removed false statement in docstring of Transform

* fixed a typo in implementation of logabsdetjac_batch

* added types for representing batches

* make it possible to use broadcasting for working with batches

* updated SimplexBijector to new interface, I think

* updated PDBijector to new interface

* use transform_batch rather than broadcasting

* added default implementations for batches

* updated ADBijector to new interface

* updated CorrBijector to new interface

* updated Coupling to new interface

* updated LeakyReLU to new interface

* updated NamedBijector to new interface

* updated BatchNormalisation to new interface

* updated Permute to new interface

* updated PlanarLayer to new interface

* updated RadialLayer to new interface

* updated RationalQuadraticSpline to new interface

* updated Scale to new interface

* updated Shift to new interface

* updated Stacked to new interface

* updated TruncatedBijector to new interface

* added ConstructionBase as dependency

* fixed a bunch of small typos and errors from previous commits

* forgot to wrap some in Batch

* allow inverses of non-bijectors

* relax definition of VectorBatch so Vector{<:Real} is covered

* just perform invertibility check in Inverse rather than inv

* moved some code around

* added docstrings and default impls for mutating batched methods

* add element type to VectorBatch

* simplify Shift bijector

* added rrules for logabsdetjac_shift

* use type-stable implementation of eachslice

* initial work on adding proper testing

* make Batch compatible with Zygote

* updated OrderedBijector

* temporary stuff

* added docs

* removed all batch related functionality

* move bijectors over to with_logabsdet_jacobian and drop official batch support

* updated compat

* updated tests

* updated docs

* removed redundant dep

* remove batch

* remove redundant defs of transform

* removed unnecessary impls of with_logabsdet_jacobian

* remove usage of Exp and Log in tests

* fixed docs

* added bijectors with docs to docs

* small change to docs

* fixed bug in computation of logabsdetjac of truncated

* bump minor version

* run GH actions on Julia 1.6, which is the new LTS, instead of 1.3

* added Github actions for making docs, etc.

* removed left-overs from batch impls

* removed redundant comment

* don't return NamedTuple from with_logabsdet_jacobian

* remove unused methods

* remove old deprecation warnings

* fix exports

* updated tests for deprecations

* completed some random TODOs

* fix SimplexBijector tests

* removed whitespace

* made some docstrings into doctests

* removed unused method

* improved show for scale and shift

* converted example for Coupling into doctest

* added reference to Coupling bijector for NamedCoupling

* fixed docstring

* fixed documentation setup

* nvm, now I fixed documentation setup

* removed references to dimensionality in code

* fixed typo

* add impl of invertible for Elementwise

* added transforms and distributions as separate pages in docs

* removed all the unnecessary stuff in README

* added examples to docs

* added some show methods for certain bijectors

* added compat entries to docs

* updated docstring for RationalQuadraticSpline

* removed commented code

* remove reference to logpdf_forward

* remove enforcement of type of input and output being the same in tests

* make logpdf_with_trans compatible with logpdf when it comes to
handling batches

* Apply suggestions from code review

Co-authored-by: David Widmann <[email protected]>

* remove usage of invertible, etc. and use InverseFunctions.NoInverse instead

* specialize transform on Function

* removed unnecessary show and deprecation warnings

* remove references to Log and Exp

---------

Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: David Widmann <[email protected]>
3 people authored Feb 1, 2023
1 parent 0bb86e2 commit 8b924d0
Showing 44 changed files with 851 additions and 2,235 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/DocsPreviewCleanup.yml
@@ -0,0 +1,26 @@
name: DocsPreviewCleanup

on:
  pull_request:
    types: [closed]

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout gh-pages branch
        uses: actions/checkout@v2
        with:
          ref: gh-pages
      - name: Delete preview and history + push changes
        run: |
          if [ -d "previews/PR$PRNUM" ]; then
            git config user.name "Documenter.jl"
            git config user.email "[email protected]"
            git rm -rf "previews/PR$PRNUM"
            git commit -m "delete preview"
            git branch gh-pages-new $(echo "delete history" | git commit-tree HEAD^{tree})
            git push --force origin gh-pages-new:gh-pages
          fi
        env:
          PRNUM: ${{ github.event.number }}
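The `git commit-tree` line in the `run` step is the subtle part: piping a message into `git commit-tree HEAD^{tree}` creates a brand-new, *parentless* commit containing the current tree, so force-pushing the branch that points at it discards the branch's entire history. A minimal sketch of the same trick in a throwaway repository (all paths, names, and messages here are illustrative):

```shell
set -eu
# Set up a throwaway repo with two commits.
tmp=$(mktemp -d)
cd "$tmp"
git init -q -b main .
git config user.name "demo"
git config user.email "demo@example.invalid"
echo one > f.txt && git add f.txt && git commit -qm "first"
echo two > f.txt && git commit -qam "second"

# Create a single parentless commit holding the current tree, then point a
# new branch at it: that branch's history is now exactly one commit deep.
squashed=$(echo "delete history" | git commit-tree "HEAD^{tree}")
git branch squashed "$squashed"

git rev-list --count main      # two commits of history
git rev-list --count squashed  # just one
```

The tree (file contents) is identical on both branches; only the commit history differs, which is exactly what the workflow above relies on when pruning old documentation previews.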
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,6 +1,6 @@
name = "Bijectors"
uuid = "76274a88-744f-5084-9051-94815aaf08c4"
-version = "0.10.6"
+version = "0.11.0"

[deps]
ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197"
252 changes: 1 addition & 251 deletions README.md
@@ -1,5 +1,6 @@
# Bijectors.jl

[![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://turinglang.github.io/Bijectors.jl/stable)
[![Interface tests](https://github.com/TuringLang/Bijectors.jl/workflows/Interface%20tests/badge.svg?branch=master)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22Interface+tests%22+branch%3Amaster)
[![AD tests](https://github.com/TuringLang/Bijectors.jl/workflows/AD%20tests/badge.svg?branch=master)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22AD+tests%22+branch%3Amaster)

@@ -135,19 +136,6 @@

Pretty neat, huh? `Inverse{Logit}` is also a `Bijector` where we've defined `(ib::Inverse{<:Logit})(y)` as the inverse transformation of `(b::Logit)(x)`. Note that it's not always the case that `inverse(b) isa Inverse`, e.g. the inverse of `Exp` is simply `Log` so `inverse(Exp()) isa Log` is true.

#### Dimensionality
One more thing. See the `0` in `Inverse{Logit{Float64}, 0}`? It represents the *dimensionality* of the bijector, in the same sense as for an `AbstractArray`, with the exception that `0` means it expects 0-dim input and output, i.e. `<:Real`. This can also be accessed through `dimension(b)`:

```julia
julia> Bijectors.dimension(b)
0

julia> Bijectors.dimension(Exp{1}())
1
```

In most cases specification of the dimensionality is unnecessary, as a `Bijector{N}` is usually only defined for a particular value of `N`, e.g. `Logit isa Bijector{0}` since it only makes sense to apply `Logit` to a real number (or a vector of reals if you're doing batch-computation). As a user, you'll rarely have to deal with this dimensionality specification. Unfortunately there are exceptions, e.g. `Exp`, which can be applied to both a real number and a vector of real numbers, in both cases treating it as a single input. This means that when `Exp` receives a vector `x` as input, it's ambiguous whether to treat `x` as a *batch* of 0-dim inputs or as a single 1-dim input. As a result, to support batch-computation it is necessary to know the expected dimensionality of the input and output. Notice that we assume the dimensionality of the input and output to be the *same*. This is a reasonable assumption considering we're working with *bijections*.
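To make the ambiguity concrete, here is a sketch using the dimensionality-parameterised API described in this section (note: this is exactly the interface this commit removes, so it does not run against Bijectors 0.11; the comments describe the intended behaviour):

```julia
using Bijectors

x = [0.5, 1.0, 2.0]

# Treated as a batch of three 0-dim inputs: one log-jacobian term per element.
b0 = Exp{0}()
b0(x)                # elementwise exp of three independent samples
logabsdetjac(b0, x)  # a 3-element vector of log-jacobians (for exp: x itself)

# Treated as a single 1-dim input: one log-jacobian for the whole vector.
b1 = Exp{1}()
b1(x)                # the same values...
logabsdetjac(b1, x)  # ...but a single scalar (for exp: sum(x))
```

The transformed values agree in both cases; only the interpretation, and hence the shape of `logabsdetjac`, differs.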

#### Composition
Also, we can _compose_ bijectors:

@@ -491,244 +479,6 @@ julia> x, y, logjac, logpdf_y = forward(flow) # sample + transform and returns a
This method is for example useful when computing quantities such as the _evidence lower bound (ELBO)_ between this transformed distribution and some other joint density. If no analytical expression is available, we have to approximate the ELBO by a Monte Carlo estimate. But one term in the ELBO is the entropy of the base density, which we _do_ know analytically in this case. Using the analytical expression for the entropy and a Monte Carlo estimate for the remaining terms in the ELBO gives an estimate with lower variance than using a Monte Carlo estimate for the entire expectation.
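That variance-reduction idea can be sketched as follows (a hypothetical helper, not part of the package; `logp` stands in for the other joint log-density, and `with_logabsdet_jacobian` supplies the change-of-variables term):

```julia
using Bijectors, Distributions

# Monte Carlo ELBO estimate for a transformed distribution `td`, using the
# analytical entropy of the base density rather than estimating it:
#   ELBO = E_q[logp(y)] + H[q],  where  H[q] = H[base] + E[logabsdetjac].
function elbo_sketch(td, logp; nsamples = 1_000)
    d, b = td.dist, td.transform
    acc = 0.0
    for _ in 1:nsamples
        x = rand(d)                               # sample from the base density
        y, logjac = with_logabsdet_jacobian(b, x) # transform + log-jacobian
        acc += logp(y) + logjac                   # Monte Carlo part
    end
    return acc / nsamples + entropy(d)            # exact entropy term
end
```

Only `logp(y) + logjac` is estimated by sampling; `entropy(d)` contributes no Monte Carlo noise at all, which is where the variance reduction comes from.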


### Normalizing flows with bounded support


## Implementing your own `Bijector`
There are mainly two ways you can implement your own `Bijector`, and which one you choose depends on the following question: are you bothered enough to manually implement `logabsdetjac`? If the answer is "Yup!", then you subtype `Bijector`; if "Naaaah", then you subtype `ADBijector`.

### `<:Bijector`
Here's a simple example taken from the source code, the `Identity`:

```julia
import Bijectors: logabsdetjac

struct Identity{N} <: Bijector{N} end
(::Identity)(x) = x # transform itself, "forward"
(::Inverse{<: Identity})(y) = y # inverse transform, "backward"

# see the proper implementation for `logabsdetjac` in general
logabsdetjac(::Identity{0}, y::Real) = zero(eltype(y)) # ∂ₓid(x) = ∂ₓ x = 1 → log(abs(1)) = log(1) = 0
```

A slightly more complex example is `Logit`:

```julia
using LogExpFunctions: logit, logistic

struct Logit{T<:Real} <: Bijector{0}
a::T
b::T
end

(b::Logit)(x::Real) = logit((x - b.a) / (b.b - b.a))
(b::Logit)(x) = map(b, x)
# `orig` contains the `Bijector` which was inverted
(ib::Inverse{<:Logit})(y::Real) = (ib.orig.b - ib.orig.a) * logistic(y) + ib.orig.a
(ib::Inverse{<:Logit})(y) = map(ib, y)

logabsdetjac(b::Logit, x::Real) = - log((x - b.a) * (b.b - x) / (b.b - b.a))
logabsdetjac(b::Logit, x) = map(x -> logabsdetjac(b, x), x)
```

(Batch computation is not fully supported by all bijectors yet (see issue #35), but is actively worked on. In the particular case of `Logit` there's only one thing that makes sense, which is elementwise application. Therefore we've added the `map` fallbacks to the implementation above, so this works for any `AbstractArray{<:Real}`.)

Then

```julia
julia> b = Logit(0.0, 1.0)
Logit{Float64}(0.0, 1.0)

julia> b(0.6)
0.4054651081081642

julia> inverse(b)(y)
Tracked 2-element Array{Float64,1}:
0.3078149833748082
0.72380041667891

julia> logabsdetjac(b, 0.6)
1.4271163556401458

julia> logabsdetjac(inverse(b), y) # defaults to `- logabsdetjac(b, inverse(b)(x))`
Tracked 2-element Array{Float64,1}:
-1.546158373866469
-1.6098711387913573

julia> with_logabsdet_jacobian(b, 0.6) # defaults to `(b(x), logabsdetjac(b, x))`
(0.4054651081081642, 1.4271163556401458)
```

For further efficiency, one could manually implement `with_logabsdet_jacobian(b::Logit, x)`:

```julia
julia> using Bijectors: Logit

julia> import Bijectors: with_logabsdet_jacobian

julia> function with_logabsdet_jacobian(b::Logit{<:Real}, x)
totally_worth_saving = @. (x - b.a) / (b.b - b.a) # spoiler: it's probably not
y = logit.(totally_worth_saving)
logjac = @. - log((b.b - x) * totally_worth_saving)
return (y, logjac)
end
with_logabsdet_jacobian (generic function with 16 methods)

julia> with_logabsdet_jacobian(b, 0.6)
(0.4054651081081642, 1.4271163556401458)

julia> @which with_logabsdet_jacobian(b, 0.6)
with_logabsdet_jacobian(b::Logit{#s4} where #s4<:Real, x) in Main at REPL[43]:2
```
As you can see it's a very contrived example, but you get the idea.
### `<:ADBijector`
We could also have implemented `Logit` as an `ADBijector`:
```julia
using LogExpFunctions: logit, logistic
using Bijectors: ADBackend

struct ADLogit{T, AD} <: ADBijector{AD, 0}
a::T
b::T
end

# ADBackend() returns ForwardDiffAD, which means we use ForwardDiff.jl for AD
ADLogit(a::T, b::T) where {T<:Real} = ADLogit{T, ADBackend()}(a, b)

(b::ADLogit)(x) = @. logit((x - b.a) / (b.b - b.a))
(ib::Inverse{<:ADLogit{<:Real}})(y) = @. (ib.orig.b - ib.orig.a) * logistic(y) + ib.orig.a
```
No implementation of `logabsdetjac`, but:
```julia
julia> b_ad = ADLogit(0.0, 1.0)
ADLogit{Float64,Bijectors.ForwardDiffAD}(0.0, 1.0)

julia> logabsdetjac(b_ad, 0.6)
1.4271163556401458

julia> y = b_ad(0.6)
0.4054651081081642

julia> inverse(b_ad)(y)
0.6

julia> logabsdetjac(inverse(b_ad), y)
-1.4271163556401458
```
Neat! And just to verify that everything works:
```julia
julia> b = Logit(0.0, 1.0)
Logit{Float64}(0.0, 1.0)

julia> logabsdetjac(b, 0.6)
1.4271163556401458

julia> logabsdetjac(b_ad, 0.6) ≈ logabsdetjac(b, 0.6)
true
```
We can also use Tracker.jl for the AD, rather than ForwardDiff.jl:
```julia
julia> Bijectors.setadbackend(:reversediff)
:reversediff

julia> b_ad = ADLogit(0.0, 1.0)
ADLogit{Float64,Bijectors.TrackerAD}(0.0, 1.0)

julia> logabsdetjac(b_ad, 0.6)
1.4271163556401458
```
### Reference
Most of the methods and types mentioned below have docstrings with more elaborate explanations and examples, e.g.
```julia
help?> Bijectors.Composed
Composed(ts::A)

∘(b1::Bijector{N}, b2::Bijector{N})::Composed{<:Tuple}
composel(ts::Bijector{N}...)::Composed{<:Tuple}
composer(ts::Bijector{N}...)::Composed{<:Tuple}

where A refers to either

• Tuple{Vararg{<:Bijector{N}}}: a tuple of bijectors of dimensionality N

• AbstractArray{<:Bijector{N}}: an array of bijectors of dimensionality N

A Bijector representing composition of bijectors. composel and composer results in a Composed for which application occurs from left-to-right and right-to-left, respectively.

Note that all the alternative ways of constructing a Composed return a Tuple of bijectors. This ensures type-stability of implementations of all related methods, e.g. inverse.

If you want to use an Array as the container instead you can do

Composed([b1, b2, ...])

In general this is not advised since you lose type-stability, but there might be cases where this is desired, e.g. if you have an insanely large number of bijectors to compose.

Examples
≡≡≡≡≡≡≡≡≡≡

It's important to note that ∘ does what is expected mathematically, which means that the bijectors are applied to the input right-to-left, e.g. first applying b2 and then b1:

(b1 ∘ b2)(x) == b1(b2(x)) # => true

But in the Composed struct itself, we store the bijectors left-to-right, so that

cb1 = b1 ∘ b2 # => Composed.ts == (b2, b1)
cb2 = composel(b2, b1) # => Composed.ts == (b2, b1)
cb1(x) == cb2(x) == b1(b2(x)) # => true
```
If anything is lacking or not clear in docstrings, feel free to open an issue or PR.
#### Types
The following are the bijectors available:
- Abstract:
- `Bijector`: super-type of all bijectors.
- `ADBijector{AD} <: Bijector`: subtypes of this only require the user to implement `(b::UserBijector)(x)` and `(ib::Inverse{<:UserBijector})(y)`. Automatic differentiation will be used to compute `jacobian(b, x)` and thus `logabsdetjac(b, x)`.
- Concrete:
- `Composed`: represents a composition of bijectors.
- `Stacked`: stacks univariate and multivariate bijectors
- `Identity`: does what it says, i.e. nothing.
- `Logit`
- `Exp`
- `Log`
- `Scale`: scaling by a scalar value, though at the moment `logabsdetjac` is only well-defined for the univariate case.
- `Shift`: shifts by a scalar value.
- `Permute`: permutes the input array using matrix multiplication
- `SimplexBijector`: mostly used as the constrained-to-unconstrained bijector for `SimplexDistribution`, e.g. `Dirichlet`.
- `PlanarLayer`: §4.1 Eq. (10) in [1]
- `RadialLayer`: §4.1 Eq. (14) in [1]
The distribution interface consists of:
- `TransformedDistribution <: Distribution`: implements the `Distribution` interface from Distributions.jl. This means `rand` and `logpdf` are provided at the moment.
#### Methods
The following methods are implemented by all subtypes of `Bijector`, this also includes bijectors such as `Composed`.
- `(b::Bijector)(x)`: implements the transform of the `Bijector`
- `inverse(b::Bijector)`: returns the inverse of `b`, i.e. `ib::Bijector` s.t. `(ib ∘ b)(x) ≈ x`. In most cases this is `Inverse{<:Bijector}`.
- `logabsdetjac(b::Bijector, x)`: computes `log(abs(det(jacobian(b, x))))`.
- `with_logabsdet_jacobian(b::Bijector, x)`: returns the tuple `(b(x), logabsdetjac(b, x))` in the most efficient manner.
- `∘`, `composel`, `composer`: convenient and type-safe constructors for `Composed`. `composel(bs...)` composes s.t. the resulting composition is evaluated left-to-right, while `composer(bs...)` is evaluated right-to-left. `∘` is right-to-left, as expected from standard mathematical notation.
- `jacobian(b::Bijector, x)` [OPTIONAL]: returns the Jacobian of the transformation. In some cases the analytical Jacobian has been implemented for efficiency.
- `dimension(b::Bijector)`: returns the dimensionality of `b`.
- `isclosedform(b::Bijector)`: returns `true` or `false` depending on whether or not `b(x)` has a closed-form implementation.
For `TransformedDistribution`, together with default implementations for `Distribution`, we have the following methods:
- `bijector(d::Distribution)`: returns the default constrained-to-unconstrained bijector for `d`
- `transformed(d::Distribution)`, `transformed(d::Distribution, b::Bijector)`: constructs a `TransformedDistribution` from `d` and `b`.
- `logpdf_forward(d::Distribution, x)`, `logpdf_forward(d::Distribution, x, logjac)`: computes the `logpdf(td, td.transform(x))` using the forward pass, which is potentially faster depending on the transform at hand.
- `forward(d::Distribution)`: returns `(x = rand(dist), y = b(x), logabsdetjac = logabsdetjac(b, x), logpdf = logpdf_forward(td, x))` where `b = td.transform`. This combines sampling from base distribution and transforming into one function. The intention is that this entire process should be performed in the most efficient manner, e.g. the `logabsdetjac(b, x)` call might instead be implemented as `- logabsdetjac(inverse(b), b(x))` depending on which is most efficient.
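As a sketch of how these pieces fit together under the pre-0.11 interface documented here (hedged: the NamedTuple return of `forward` is changed by this very commit, so treat the shapes as illustrative):

```julia
using Bijectors, Distributions

td = transformed(Beta(2, 2))  # default bijector maps (0, 1) to ℝ
res = forward(td)             # (x = ..., y = ..., logabsdetjac = ..., logpdf = ...)

# The fields are mutually consistent:
res.y == td.transform(res.x)                             # transformed sample
res.logpdf ≈ logpdf(td, res.y)                           # forward pass agrees with logpdf
res.logpdf ≈ logpdf(td.dist, res.x) - res.logabsdetjac   # change of variables
```

The last identity is just the change-of-variables formula, `log p_y(y) = log p_x(x) - logabsdetjac(b, x)`, which is why computing `logpdf` via the forward pass can be cheaper than inverting the transform.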
# Bibliography
1. Rezende, D. J., & Mohamed, S. (2015). Variational Inference With Normalizing Flows. [arXiv:1505.05770](https://arxiv.org/abs/1505.05770v6).
2. Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016). Automatic Differentiation Variational Inference. [arXiv:1603.00788](https://arxiv.org/abs/1603.00788v1).
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -8,7 +8,7 @@ makedocs(
sitename = "Bijectors",
format = Documenter.HTML(),
modules = [Bijectors],
-pages = ["Home" => "index.md", "Distributions.jl integration" => "distributions.md", "Examples" => "examples.md"],
+pages = ["Home" => "index.md", "Transforms" => "transforms.md", "Distributions.jl integration" => "distributions.md", "Examples" => "examples.md"],
strict=false,
checkdocs=:exports,
)

2 comments on commit 8b924d0

@torfjelde (Member Author)

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/76818

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.11.0 -m "<description of version>" 8b924d0f091377bc190ed50359213043b14b4d37
git push origin v0.11.0
