Skip to content

Latest commit

 

History

History
178 lines (146 loc) · 10.1 KB

README.md

File metadata and controls

178 lines (146 loc) · 10.1 KB

ModelWrappers

Documentation, Stable Build Status Coverage ColPrac: Contributor's Guide on Collaborative Practices for Community Packages

ModelWrappers.jl is a utility package that makes it easier to work with Model parameters stated as (nested) NamedTuples. It handles

  1. flattening/unflattening model parameter fields of arbitrary dimensions.
  2. constraining/unconstraining parameter (if a corresponding constraint is provided).
  3. using Automatic Differentation for Model parameter given as a NamedTuple for a user specified function, i.e., a log-posterior distribution.

Flattening/Unflattening Model Parameter

ModelWrappers.jl allows you to flatten a (nested) NamedTuple to a vector, and also returns an unflatten function to convert a vector back to a NamedTuple. By default, discrete parameter are not flattened and the default flatten type is Float64. One can construct flatten/unflatten via a ReConstructor.

using ModelWrappers
myparameter = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6])
reconstruct = ReConstructor(myparameter)
vals_vec = flatten(reconstruct, myparameter) #Vector{Float64} with 3 elements (1., 3., 4.)
vals = unflatten(reconstruct, vals_vec) #(a = 1.0f0, b = 2, c = [3.0, 4.0], d = [5, 6])

You can adjust these settings by using the FlattenDefault struct. For instance, the following settings will map myparameter to a Float16 vector and also flatten the Integer values.

flattendefault = FlattenDefault(; output = Float16, flattentype = FlattenAll())
reconstruct = ReConstructor(flattendefault, myparameter)
vals_vec = flatten(reconstruct, myparameter) #Vector{Float16} with 6 elements (1., 2., 3., 4., 5., 6.)
vals = unflatten(reconstruct, vals_vec) #(a = 1.0f0, b = 2, c = [3.0, 4.0], d = [5, 6])

Flatten/Unflatten can also be used for Automatic Differentiation. The functions flattenAD and unflattenAD return output based on the input type. Check the differences to the first two cases in this example:

myparameter = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6])
flattendefault = FlattenDefault(; output = Float32, flattentype = FlattenAll())
reconstruct = ReConstructor(flattendefault, myparameter)
vals_vec = flattenAD(reconstruct, myparameter)  #Vector{Float64} with 6 elements (1.00 2.00 3.00 4.00 5.00 6.00)
vals = unflattenAD(reconstruct, vals_vec) #(a = 1.0, b = 2.0, c = [3.0, 4.0], d = [5.0, 6.0])

A ReConstructor will assign buffers for flatten and unflatten, so most operations can be performed without allocations. Unflatten can usually be performed free of most allocations, even if arrays are involved:

using BenchmarkTools
myparameter2 = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6], e = randn(1000), f = rand(1:2, 1000), g = randn(1000, 2))
reconstruct = ReConstructor(myparameter2)
vals_vec = flatten(reconstruct, myparameter2)
vals_vec #Vector{Float64} with 3003 element
@btime unflatten($reconstruct, $vals_vec)   # 419.095 ns (0 allocations: 0 bytes)
@btime flatten($reconstruct, $myparameter2) # 3.475 μs (8 allocations: 39.83 KiB)

Note that it is possible to nest NamedTuples, and use arbitrary Array-of-Arrays structures for your parameter, but this will often come with a performance penalty:

myparameter3 = (a = myparameter, b = (c = (d = myparameter2, ), ), e = [rand(10), rand(15), rand(20)])
reconstruct = ReConstructor(myparameter3)
vals_vec = flatten(reconstruct, myparameter3)
vals_vec #Vector{Float64} with 3051 element
@btime unflatten($reconstruct, $vals_vec)   # 1.220 μs (32 allocations: 3.19 KiB)
@btime flatten($reconstruct, $myparameter3) # 7.275 μs (19 allocations: 88.17 KiB)

Constraining/Unconstraining Model Parameter

Consider now the following problem: you have a model that consists of various (unknown) parameters and you want to estimate these parameter with a custom algorithm. Many common algorithms not only require you to take a vector as function argument, but also require you to know in which space your parameter operate.

If a corresponding prior distribution is provided, ModelWrappers.jl allows you to efficiently constrain and unconstrain your parameter tuple. To do so, one can initiate a Param struct, which is a temporary constructor that checks if the package can handle the (value, constraint) combination. The initial NamedTuple can then be wrapped in a ModelWrapper struct.

using Distributions
myparameter4 == Param(Normal(), 0.0,), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)

Note that providing a distribution from 'Distributions.jl' in Param will just assign a bijector from 'Bijectors.jl' to the parameter. Other valid constraint options for a Param struct at the moment include

  1. a bijector from Bijectors.jl,
  2. all distributions that work with Bijectors.jl,
  3. a Fixed struct, which keeps val fixed and excludes it from flatten/unflatten,
  4. an Unconstrained struct, which flattens val without taking into account any constraint,
  5. and a Constrained struct, which flattens val without taking into account any constraint, but will take into account the constraints when constraining values.
using Bijectors
myparameter_constraints = (
    μ = Param(Normal(), 0.0,),
    σ = Param(Bijection(bijector(Gamma(2,2))), 1.0,),

    buffer1 = Param(Fixed(), zeros(Int64, 2,3,4), ),
    buffer2 = Param(Unconstrained(), [zeros(10), zeros(20)], ),
    buffer3 = Param(Constrained(1., 5.), 3., ),

)
model_constraints = ModelWrapper(myparameter_constraints)
flatten(model_constraints) #Vector{Float64}

A ModelWrapper struct is mutable, and contains the values of your NamedTuple field. Values can be flattened or unconstrained, and may be updated by new values/samples. Also, when a ModelWrapper struct is created, an unflatten function for strict and variable type conversion is stored. To show this, we will a create ModelWrapper struct, flatten its values, and update the struct with new values:

using Distributions, Random
_rng = MersenneTwister(2)
myparameter4 == Param(Normal(), 0.0, ), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)
#Flatten/Unconstrain Model parameter
vals_vec = flatten(mymodel) #Vector{Float64} with 2 elements
unconstrain(mymodel) #(μ = 0.0, σ = 0.0)
unconstrain_flatten(mymodel) #Vector{Float64} with 2 elements

#Unflatten/Constrain proposed parameter from unconstrained space
θ_proposed = randn(_rng, length(vals_vec)) #Vector{Float64} with 2 elements
ModelWrappers.unflatten(mymodel, θ_proposed) #(μ = 0.7396206598864331, σ = -0.7445071021408705)
unflatten_constrain(mymodel, θ_proposed) #(μ = 0.7396206598864331, σ = 0.4749683531374296)

#Replacing current model parameter with proposed parameter
mymodel.val #(μ = 0.0, σ = 1.0)
unflatten_constrain!(mymodel, θ_proposed)
mymodel.val #(μ = 0.7396206598864331, σ = 0.4749683531374296)

Using Automatic Differentiation with a ModelWrapper

ModelWrappers.jl supports the usage of various Automatic Differentiation backends by providing an immutable Objective struct that contains your ModelWrapper, data, and all parameter that you want to get derivative information from. Objective is a functor, and you can manually add a target function wrt your original parameter NamedTuple that should be included in the AD call, i.e., a log-posterior density.

Let us work with the model from before. We first sample data, create the objective and then define a function that we want to use AD for:

using UnPack
#Create Model and data
myparameter4 == Param(Normal(), 0.0, ), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)
data = randn(1000)

#Create objective for both μ and σ and define a target function for it
myobjective = Objective(mymodel, data, (, ))
function (objective::Objective{<:ModelWrapper{BaseModel}})(θ::NamedTuple)
	@unpack data = objective
	lprior = Distributions.logpdf(Distributions.Normal(),θ.μ) + Distributions.logpdf(Distributions.Exponential(), θ.σ)
    	llik = sum(Distributions.logpdf( Distributions.Normal.μ, θ.σ), data[iter] ) for iter in eachindex(data))
	return lprior + llik
end

myobjective can take a vector from an unconstrained space as input, constrains and converts the argument to a NamedTuple, checks if all conversions are finite, and adds all eventual Jacobian adjustments from the transformations before your target function is evaluated. This can usually be done efficiently:

#Sample new parameter, and evaluate target function wrt to Vector (not NamedTuple)
θ_proposed = randn(_rng, length(vals_vec))
myobjective(θ_proposed)

#Functor call wrt NamedTuple parameter
@btime $myobjective($mymodel.val) #6.420 μs (0 allocations: 0 bytes)
#Functor call wrt proposed Parameter Vector
@btime $myobjective($θ_proposed) #6.480 μs (0 allocations: 0 bytes)

Objective can also be called from various AD frameworks:

using ForwardDiff, ReverseDiff, Zygote
grad_fwd = ForwardDiff.gradient(myobjective, θ_proposed)
grad_rvd = ReverseDiff.gradient(myobjective, θ_proposed)
grad_zyg = Zygote.gradient(myobjective, θ_proposed)
all(grad_fwd .≈ grad_rvd .≈ grad_zyg[1]) #true

Going Forward

This package is still highly experimental - suggestions and comments are always welcome! New constraints should be reasonable simple to add, check out src/Core/constrain/constraints/constrained.jl as an example with guidance in the comments.