Wishart priors resulting in PosDefException: matrix is not positive definite; Cholesky factorization failed
#2188
Comments
Likely related to TuringLang/Bijectors.jl#313
The above example now works with the most recent Bijectors.jl release. A small step size is useful for stable optimisation or HMC sampling:

```julia
julia> chain = sample(demo_model, HMC(0.01, 10), 10000);
Sampling 100%|████████████████████████████████████████| Time: 0:00:00
```

I am not sure how to specify the step size for optimisation. @mhauru, is that possible?
These numerical errors can be explicitly caught and used to tell the inference backend to reject the proposal:

```julia
julia> @model function demo(x)
           try
               _A ~ Wishart(5, A)  # A is the scale matrix from the MWE
               _x_mu = sum(_A)
               x ~ Normal(_x_mu, 1)
           catch e
               if e isa PosDefException
                   # Assign -Inf log-probability so the sampler rejects the proposal
                   Turing.@addlogprob! -Inf
               else
                   rethrow()  # don't swallow unrelated errors
               end
           end
       end
```
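For context on the exception itself: `cholesky` from the standard LinearAlgebra library throws `PosDefException` whenever it is handed a matrix that is not positive definite, which is what happens internally when a proposal leaves the Wishart support. A minimal standalone demonstration:

```julia
using LinearAlgebra

# Symmetric but indefinite matrix (eigenvalues 3 and -1),
# so its Cholesky factorization must fail:
M = [1.0 2.0; 2.0 1.0]

try
    cholesky(M)
catch e
    println(e isa PosDefException)  # prints "true"
end
```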
```julia
julia> chain = sample(demo(1), NUTS(), 1000)
┌ Info: Found initial step size
└   ϵ = 0.0125
Sampling 100%|████████████████████████████████████████| Time: 0:00:00
Chains MCMC chain (1000×21×1 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 0.19 seconds
Compute duration  = 0.19 seconds
parameters        = _A[1, 1], _A[2, 1], _A[3, 1], _A[1, 2], _A[2, 2], _A[3, 2], _A[1, 3], _A[2, 3], _A[3, 3]
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec
      Symbol   Float64   Float64   Float64    Float64    Float64   Float64       Float64

    _A[1, 1]    3.6504    2.3821    0.0935   546.6195   372.3050    1.0003     2907.5507
    _A[2, 1]   -1.4789    1.7435    0.0766   565.4885   588.1340    1.0066     3007.9174
    _A[3, 1]   -1.5504    1.7307    0.1053   254.4582   458.9476    1.0133     1353.5010
    _A[1, 2]   -1.4789    1.7435    0.0766   565.4885   588.1340    1.0066     3007.9174
    _A[2, 2]    3.5547    2.2049    0.0896   549.1068   560.1294    1.0012     2920.7810
    _A[3, 2]   -1.4334    1.6267    0.0781   466.9718   495.1435    0.9991     2483.8925
    _A[1, 3]   -1.5504    1.7307    0.1053   254.4582   458.9476    1.0133     1353.5010
    _A[2, 3]   -1.4334    1.6267    0.0781   466.9718   495.1435    0.9991     2483.8925
    _A[3, 3]    3.5786    2.2005    0.1018   556.3600   493.4034    1.0026     2959.3615

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

    _A[1, 1]    0.5643    1.8723    3.1353    4.8230    9.5504
    _A[2, 1]   -5.6961   -2.3031   -1.2180   -0.3144    1.1457
    _A[3, 1]   -5.8425   -2.3965   -1.2686   -0.4025    1.0603
    _A[1, 2]   -5.6961   -2.3031   -1.2180   -0.3144    1.1457
    _A[2, 2]    0.5728    1.9093    3.1736    4.6654    8.8305
    _A[3, 2]   -5.1991   -2.3551   -1.2366   -0.3020    1.2692
    _A[1, 3]   -5.8425   -2.3965   -1.2686   -0.4025    1.0603
    _A[2, 3]   -5.1991   -2.3551   -1.2366   -0.3020    1.2692
    _A[3, 3]    0.7432    1.9730    3.1545    4.6346    9.2545
```
It depends on the optimisation algorithm, but if the algorithm has a notion of a step size, then usually yes. The default algorithm is LBFGS, which first finds a direction to move in and then does a line search along that direction to figure out how far to go, so there isn't a fixed step size, but you can set an initial guess for the step size like this:
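[The code snippet that followed was lost from this thread. As a hedged sketch, not the author's original example: Optim.jl's `LBFGS` accepts an `alphaguess` keyword from LineSearches.jl that sets the initial trial step length of each line search, so with Turing's mode-estimation interface it might look like:]

```julia
using Turing, Optim, LineSearches

# `demo_model` is a stand-in for the model instance from the MWE.
# InitialStatic(alpha = 0.01) makes every line search start from a small step.
opt = LBFGS(alphaguess = LineSearches.InitialStatic(alpha = 0.01))
result = optimize(demo_model, MAP(), opt)
```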
That seems to help avoid the loss-of-positivity errors in this case.
@yebai, I don't think anything I did fixing the correlation bijectors would have fixed this. I'm not at a computer right now, but I imagine the problem is similar; the Jacobian computation is going through a […] The other place a […]
Thanks, @sethaxen, for the clarification. For now, users can be referred to #2188 (comment) before numerically more stable alternatives are implemented. We should also update the docs to include some guidance on how to use […]
```
julia v1.10.2
Turing v0.30.7
Distributions v0.25.107
```
Using Wishart priors results in several functions throwing the above error, which does not make sense to me. In this MWE, maximising with MAP() throws the error, as does sampling with HMC and NUTS, but not with MH.
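[The MWE's code block did not survive extraction. A minimal sketch consistent with the 3×3 `_A` reported in the chain summary above; the variable names, the degrees of freedom, and the identity scale matrix are all assumptions, not the reporter's original code:]

```julia
using Turing, LinearAlgebra

A = Matrix{Float64}(I, 3, 3)  # assumed 3×3 scale matrix

@model function demo(x)
    _A ~ Wishart(5, A)
    x ~ Normal(sum(_A), 1)
end

# Per the report: optimize(demo(1), MAP()) and sampling with HMC or NUTS
# hit PosDefException; sampling with MH() does not.
```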