ParetoSmooth
Documentation for ParetoSmooth.
ParetoSmooth.ModelComparison
ParetoSmooth.Psis
ParetoSmooth.PsisLoo
ParetoSmooth.loo
ParetoSmooth.loo_compare
ParetoSmooth.loo_from_psis
ParetoSmooth.naive_lpd
ParetoSmooth.pointwise_log_likelihoods
ParetoSmooth.psis
ParetoSmooth.psis!
ParetoSmooth.psis_ess
ParetoSmooth.psis_loo
ParetoSmooth.relative_eff
ParetoSmooth.sup_ess
ParetoSmooth.ModelComparison — Type
ModelComparison
A struct containing the results of model comparison.
Fields
pointwise::KeyedArray: A KeyedArray of pointwise estimates. See PsisLoo.
estimates::KeyedArray: A table containing the results of model comparison, with the following columns:
    cv_elpd: The difference in total leave-one-out cross validation scores between models.
    cv_avg: The difference in average LOO-CV scores between models.
    weight: A set of Akaike-like weights assigned to each model, which can be used in pseudo-Bayesian model averaging.
std_err::NamedTuple: A named tuple containing the standard error of cv_elpd. Note that these estimators (incorrectly) assume all folds are independent, despite their substantial overlap, which creates a downward biased estimator. LOO-CV differences are not asymptotically normal, so these standard errors cannot be used to calculate a confidence interval.
gmpd::NamedTuple: The geometric mean of the predictive distribution. It equals the geometric mean of the probability assigned to each data point by the model, that is, exp(cv_avg). This measure is only meaningful for classifiers (variables with discrete outcomes). We can think of it as measuring how often the model was right: a model that always predicts incorrectly will have a GMPD of 0, while a model that always predicts correctly will have a GMPD of 1. However, the GMPD gives a model "partial points" between 0 and 1 whenever the model assigns a probability other than 0 or 1 to the outcome that actually happened.
See also: PsisLoo
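As a rough illustration of what "Akaike-like weights" can look like, the sketch below turns total scores into weights with a softmax. This is one common construction for pseudo-Bayesian model averaging and is assumed here for illustration only. It is not taken from the package source, and the scores are made up.

```julia
# Hypothetical total LOO-CV scores for three models (higher is better).
cv_elpd = [-120.3, -121.1, -127.8]

# Softmax of the scores. Subtracting the maximum first avoids overflow
# and does not change the resulting weights.
shifted = cv_elpd .- maximum(cv_elpd)
weights = exp.(shifted) ./ sum(exp.(shifted))

println(weights)
```

Because only differences between scores enter the softmax, adding the same constant to every model's score leaves the weights unchanged.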
ParetoSmooth.Psis — Type
Psis{R<:Real, AT<:AbstractArray{R, 3}, VT<:AbstractVector{R}}
A struct containing the results of Pareto-smoothed importance sampling.
Fields
weights: A vector of smoothed, truncated, and normalized importance sampling weights.
pareto_k: Estimates of the shape parameter k of the generalized Pareto distribution.
ess: Estimated effective sample size for each LOO evaluation, based on the variance of the weights.
sup_ess: Estimated effective sample size for each LOO evaluation, based on the supremum norm, i.e. the size of the largest weight. More likely than ess to warn when importance sampling has failed. However, it can have a high variance.
r_eff: The relative efficiency of the MCMC chain, i.e. ESS / posterior sample size.
tail_len: Vector indicating how large the "tail" is for each observation.
posterior_sample_size: How many draws from an MCMC chain were used for PSIS.
data_size: How many data points were used for PSIS.
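To make the difference between the two effective sample sizes concrete, here is a minimal sketch using the textbook formulas for normalized weights: one over the sum of squared weights, and one over the largest weight. The weights are hypothetical, and the sketch deliberately ignores any correction for MCMC autocorrelation (the r_eff field), so it should not be read as the package's exact computation.

```julia
# Hypothetical normalized importance weights for a single observation.
w = [0.70, 0.10, 0.10, 0.05, 0.05]

# Variance-based (L2) effective sample size: 1 / sum of squared weights.
ess_l2 = 1 / sum(abs2, w)

# Supremum-based effective sample size: 1 / largest weight.
ess_sup = 1 / maximum(w)

println((ess_l2, ess_sup))
```

For normalized weights the supremum-based value can never exceed the variance-based one, which is consistent with it being the more pessimistic of the two diagnostics. With perfectly uniform weights both equal the number of draws.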
ParetoSmooth.PsisLoo — Type
PsisLoo <: AbstractCV
A struct containing the results of leave-one-out cross validation computed with Pareto smoothed importance sampling.
Fields
estimates::KeyedArray: A KeyedArray with columns :total, :se_total, :mean, :se_mean, and rows :cv_elpd, :naive_lpd, :p_eff. See the Extended help section below for more.
    :cv_elpd contains estimates for the out-of-sample prediction error, as estimated using leave-one-out cross validation.
    :naive_lpd contains estimates of the in-sample prediction error.
    :p_eff is the effective number of parameters: a model with a p_eff of 2 is "about as overfit" as a model with 2 parameters and no regularization.
pointwise::KeyedArray: A KeyedArray of pointwise estimates with the following columns:
    :cv_elpd contains the estimated out-of-sample error for this point, as measured using leave-one-out cross validation.
    :naive_lpd contains the in-sample estimate of error for this point.
    :p_eff is the difference in the two previous estimates.
    :ess is the L2 effective sample size, which estimates the simulation error caused by using Monte Carlo estimates. It does not measure model performance.
    :inf_ess is the supremum-based effective sample size, which estimates the simulation error caused by using Monte Carlo estimates. It is more robust than :ess and should therefore be preferred. It does not measure model performance.
    :pareto_k is the estimated value for the parameter ξ of the generalized Pareto distribution. Values above 0.7 indicate that PSIS has failed to approximate the true distribution.
psis_object::Psis: A Psis object containing the results of Pareto-smoothed importance sampling.
gmpd: The geometric mean of the predictive density. It is defined as the geometric mean of the probability assigned to each data point by the model, i.e. exp(cv_avg). This measure is only interpretable for classifiers (variables with discrete outcomes). We can think of it as measuring how often the model was right: a model that always predicts incorrectly will have a GMPD of 0, while a model that always predicts correctly will have a GMPD of 1. However, the GMPD gives a model "partial points" between 0 and 1 whenever the model assigns a probability other than 0 or 1 to the outcome that actually happened, making it a fully Bayesian measure of model quality.
mcse: A float containing the estimated Monte Carlo standard error for the total cross-validation estimate.
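The GMPD described above can be sketched in a few lines of plain Julia. The probabilities below are hypothetical values that a classifier might assign to the outcomes that actually occurred, and the variable names are chosen for illustration rather than taken from the package.

```julia
using Statistics

# Hypothetical probabilities assigned to the observed outcomes,
# one entry per data point.
probs = [0.9, 0.8, 0.5, 0.99]

# Average log score (plays the role of cv_avg in the description above).
cv_avg = mean(log.(probs))

# GMPD: exponentiating the average log score gives the geometric mean
# of the probabilities.
gmpd = exp(cv_avg)

# The same quantity computed directly as a geometric mean.
gmpd_direct = prod(probs)^(1 / length(probs))

println((gmpd, gmpd_direct))
```

A model that assigns probability 1 to every observed outcome gets a GMPD of exactly 1, and a single probability of 0 drives the GMPD to 0, matching the two extremes in the description.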
Extended help
The total score depends on the sample size, and summarizes the weight of evidence for or against a model. Total scores are on an interval scale, meaning that only differences of scores are meaningful. It is not possible to interpret a total score by looking at it. The total score is not a goodness-of-fit statistic (for this, see the average score).
The average score is the total score, divided by the sample size. It estimates the expected log score, i.e. the expectation of the log probability density of observing the next point. The average score is a relative goodness-of-fit statistic which does not depend on sample size.
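The relationship between the two scores can be checked with a small sketch. The pointwise log scores below are made up, and the names are illustrative only.

```julia
# Hypothetical pointwise log scores for one model on four data points.
pointwise = [-1.2, -0.8, -1.5, -0.9]

total = sum(pointwise)
average = total / length(pointwise)

# Observing the same data twice doubles the total score
# but leaves the average score unchanged.
doubled = vcat(pointwise, pointwise)
total_doubled = sum(doubled)
average_doubled = total_doubled / length(doubled)

println((total, average, total_doubled, average_doubled))
```

This is why the total score is read as accumulated evidence, while the average score is the quantity to compare across data sets of different sizes.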
Unlike for chi-square goodness of fit tests, models do not have to be nested for model comparison using cross-validation methods.
See also: loo, bayes_cv, psis_loo, Psis
ParetoSmooth.loo — Method
function loo(args...; kwargs...) -> PsisLoo
Compute an approximate leave-one-out cross-validation score.
Currently, this function only serves to call psis_loo, but this could change in the future. The default methods or return type may change without warning, so we recommend using psis_loo instead if reproducibility is required.
ParetoSmooth.loo_compare — Method
function loo_compare(
    cv_results...;
    sort_models::Bool=true,
Turing Example
This example demonstrates how to correctly compute PSIS LOO for a model developed with Turing.jl. Below, we show two ways to correctly specify the model in Turing. What is most important is to specify the model so that pointwise log densities are computed for each observation.
To make things simple, we will use a Gaussian model in each example. Suppose observations $Y = \{y_1, y_2, \dots, y_n\}$ come from a Gaussian distribution with an unknown parameter $\mu$ and known parameter $\sigma=1$. The model can be stated as follows:
$\mu \sim \mathrm{Normal}(0, 1)$
$Y \sim \mathrm{Normal}(\mu, 1)$
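To make "pointwise log densities" concrete before turning to Turing, here is a minimal sketch in plain Julia. The observations and the "posterior draws" of $\mu$ are synthetic placeholders (real draws would come from a sampler), `normal_logpdf` is a hand-written helper, and the [data, draw, chain] layout is an assumption about what downstream functions expect, so check the psis_loo docstring before relying on it.

```julia
# Synthetic observations and synthetic draws of μ (made up for illustration).
y = [0.3, -0.5, 1.2, 0.8]
mu_draws = [0.40 0.50; 0.45 0.35; 0.50 0.42]   # 3 draws × 2 chains

# Log density of Normal(μ, 1) evaluated at y, written out by hand.
normal_logpdf(y, mu) = -0.5 * log(2π) - 0.5 * (y - mu)^2

# Pointwise: one log density per (observation, draw, chain).
log_lik = [normal_logpdf(y[i], mu_draws[s, c])
           for i in eachindex(y), s in axes(mu_draws, 1), c in axes(mu_draws, 2)]

# Joint: summing over observations leaves one number per (draw, chain),
# which discards the per-observation information that LOO needs.
joint = dropdims(sum(log_lik; dims=1); dims=1)

println(size(log_lik))   # (4, 3, 2)
println(size(joint))     # (3, 2)
```

The contrast between `log_lik` and `joint` is the point of the example: a model that only exposes the summed log density cannot be used to score individual held-out observations.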
For Loop Method
One way to specify a model to correctly compute PSIS LOO is to iterate over the observations using a for loop, as follows:
using Turing
using ParetoSmooth
using Distributions