diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index c0c58dd..6ec0b29 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-11-01T19:28:22","documenter_version":"1.7.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-12-01T21:14:55","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/dev/api/index.html b/dev/api/index.html index 829a538..6e9cbbf 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,6 +1,6 @@ API · PosteriorStats.jl

API

    Summary statistics

    PosteriorStats.SummaryStatsType
    struct SummaryStats{D, V<:(AbstractVector)}

    A container for a column table of values computed by summarize.

    This object implements the Tables and TableTraits column table interfaces. It has a custom show method.

    SummaryStats behaves like an OrderedDict of columns, where the columns can be accessed using either Symbols or a 1-based integer index.

    • name::String: The name of the collection of summary statistics, used as the table title in display.

    • data::Any: The summary statistics for each parameter. It must implement the Tables interface.

    • parameter_names::AbstractVector: Names of the parameters

    SummaryStats([name::String,] data[, parameter_names])
    SummaryStats(data[, parameter_names]; name::String="SummaryStats")

    Construct a SummaryStats from tabular data with optional stats name and param_names.

    data must not contain a column :parameter, as this is reserved for the parameter names, which are always in the first column.
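    As a hedged sketch of this interface (the column values are made up; any Tables-compatible column table, such as a NamedTuple of vectors, should work as data):

```julia
using PosteriorStats

data = (mean=[0.05, 10.1], std=[0.99, 1.02])          # a NamedTuple column table
stats = SummaryStats(data, ["theta", "mu"]; name="Demo")

stats[:mean]   # column access by Symbol
stats[1]       # first column by 1-based integer index
```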

    source
    PosteriorStats.default_diagnosticsFunction
    default_diagnostics(focus=Statistics.mean; kwargs...)

    Default diagnostics to be computed with summarize.

    The value of focus determines the diagnostics to be returned:

    • Statistics.mean: mcse_mean, mcse_std, ess_tail, ess_bulk, rhat
    • Statistics.median: mcse_median, ess_tail, ess_bulk, rhat
    source
    PosteriorStats.default_statsFunction
    default_stats(focus=Statistics.mean; prob_interval=0.94, kwargs...)

    Default statistics to be computed with summarize.

    The value of focus determines the statistics to be returned:

    • Statistics.mean: mean, std, hdi_94%
    • Statistics.median: median, mad, eti_94%

    If prob_interval is set to a different value than the default, then different HDI and ETI statistics are computed accordingly. hdi refers to the highest-density interval, while eti refers to the equal-tailed interval.

    See also: hdi, eti
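    A minimal sketch of adjusting prob_interval (assuming, as with the other default_* helpers, that the returned statistics are splatted into summarize):

```julia
using PosteriorStats, Statistics

x = randn(1000, 4, 2);

summarize(x, default_stats(; prob_interval=0.9)...)        # mean, std, hdi_90%
summarize(x, default_stats(median; prob_interval=0.9)...)  # median, mad, eti_90%
```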

    source
    PosteriorStats.summarizeFunction
    summarize(data, stats_funs...; name="SummaryStats", [var_names]) -> SummaryStats

    Compute the summary statistics in stats_funs on each parameter in data.

    stats_funs is a collection of functions, each of which reduces a matrix with shape (draws, chains) to a scalar or a collection of scalars. Alternatively, an item in stats_funs may be a Pair of the form name => fun, specifying the name to be used for the statistic, or of the form (name1, ...) => fun when the function returns a collection. When the function returns a collection, the names in this latter format must be provided.

    If no stats functions are provided, then those specified in default_summary_stats are computed.

    var_names specifies the names of the parameters in data. If not provided, the names are inferred from data.

    To support computing summary statistics from a custom object, overload this method specifying the type of data.

    See also SummaryStats, default_summary_stats, default_stats, default_diagnostics.

    Examples

    Compute mean, std and the Monte Carlo standard error (MCSE) of the mean estimate:

    julia> using Statistics, StatsBase
     
     julia> x = randn(1000, 4, 3) .+ reshape(0:10:20, 1, 1, :);
     
            q5     q25       q50     q75    q95
      1  -1.61  -0.668   0.00447   0.653   1.64
      2   8.41   9.34   10.0      10.7    11.6
      3  18.4   19.3    20.0      20.6    21.6
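    The Pair forms described above can be sketched as follows (the statistic names here are illustrative):

```julia
using PosteriorStats, Statistics, StatsBase

x = randn(1000, 4, 3);

# single name => function
summarize(x, mean, std, :sem => sem)

# tuple of names => function returning a collection
summarize(x, (:q25, :q75) => (draws -> quantile(vec(draws), [0.25, 0.75])))
```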
    source

    Credible intervals

    PosteriorStats.hdiFunction
    hdi(samples::AbstractVecOrMat{<:Real}; [prob, sorted, method]) -> IntervalSets.ClosedInterval
     hdi(samples::AbstractArray{<:Real}; [prob, sorted, method]) -> Array{<:IntervalSets.ClosedInterval}

    Estimate the highest density interval (HDI) of samples for the probability prob.

    The HDI is the minimum width Bayesian credible interval (BCI). That is, it is the smallest possible interval containing (100*prob)% of the probability mass.[Hyndman1996] This implementation uses the algorithm of [ChenShao1999].

    See also: hdi!, eti, eti!.

    Arguments

    • samples: an array of shape (draws[, chains[, params...]]). If multiple parameters are present, a marginal HDI is computed for each.

    Keywords

    • prob: the probability mass to be contained in the HDI. Default is 0.94.
    • sorted=false: if true, the input samples are assumed to be sorted.
    • method::Symbol: the method used to estimate the HDI. Available options are:
      • :unimodal: Assumes a unimodal distribution (default). Bounds are entries in samples.
      • :multimodal: Fits a density estimator to samples and finds the HDI of the estimated density. For continuous data, the density estimator is a kernel density estimate (KDE) computed using kde_reflected. For discrete data, a histogram with bin width 1 is used.
      • :multimodal_sample: Like :multimodal, but uses the density estimator to find the entries in samples with the highest density and computes the HDI from those points. This can correct for inaccuracies in the density estimator.
    • is_discrete::Union{Bool,Nothing}=nothing: Specify if the data is discrete (integer-valued). If nothing, it's automatically determined.
    • kwargs: For continuous data and multimodal methods, remaining keywords are forwarded to kde_reflected.

    Returns

    • intervals: If samples is a vector or matrix, then a single IntervalSets.ClosedInterval is returned for :unimodal method, or a vector of IntervalSets.ClosedInterval for multimodal methods. For higher dimensional inputs, an array with the shape (params...,) is returned, containing marginal HDIs for each parameter.
    Note

    Any default value of prob is arbitrary. The default value of prob=0.94 instead of a more common default like prob=0.95 is chosen to remind the user of this arbitrariness.

    Examples

    Here we calculate the 83% HDI for a normal random variable:

    julia> x = randn(2_000);
     
     julia> hdi(x; prob=0.83)
     julia> hdi(x; method=:multimodal)
     2-element Vector{IntervalSets.ClosedInterval{Float64}}:
      -1.8882401079608677 .. 2.0017686164555037
      2.9839268475847436 .. 6.9235952578447275
    source
    PosteriorStats.hdi!Function
    hdi!(samples::AbstractArray{<:Real}; [prob, sorted])

    A version of hdi that partially sorts samples in-place while computing the HDI.

    See also: hdi, eti, eti!.

    source
    PosteriorStats.etiFunction
    eti(samples::AbstractVecOrMat{<:Real}; [prob, kwargs...]) -> IntervalSets.ClosedInterval
     eti(samples::AbstractArray{<:Real}; [prob, kwargs...]) -> Array{<:IntervalSets.ClosedInterval}

    Estimate the equal-tailed interval (ETI) of samples for the probability prob.

    The ETI of a given probability is the credible interval with the property that the probability of being below the interval is equal to the probability of being above it. That is, it is defined by the (1-prob)/2 and 1 - (1-prob)/2 quantiles of the samples.

    See also: eti!, hdi, hdi!.

    Arguments

    • samples: an array of shape (draws[, chains[, params...]]). If multiple parameters are present

    Keywords

    • prob: the probability mass to be contained in the ETI. Default is 0.94.
    • kwargs: remaining keywords are passed to Statistics.quantile.

    Returns

    • intervals: If samples is a vector or matrix, then a single IntervalSets.ClosedInterval is returned. Otherwise, an array with the shape (params...,) is returned, containing a marginal ETI for each parameter.
    Note

    Any default value of prob is arbitrary. The default value of prob=0.94 instead of a more common default like prob=0.95 is chosen to remind the user of this arbitrariness.

    Examples

    Here we calculate the 83% ETI for a normal random variable:

    julia> x = randn(2_000);
     
     julia> eti(x; prob=0.83)
     3-element Vector{IntervalSets.ClosedInterval{Float64}}:
      -1.951006825019686 .. 1.9011666217153793
      3.048993174980314 .. 6.9011666217153795
      8.048993174980314 .. 11.90116662171538
    source
    PosteriorStats.eti!Function
    eti!(samples::AbstractArray{<:Real}; [prob, kwargs...])

    A version of eti that partially sorts samples in-place while computing the ETI.

    See also: eti, hdi, hdi!.

    source

    LOO and WAIC

    PosteriorStats.AbstractELPDResultType
    abstract type AbstractELPDResult

    An abstract type representing the result of an ELPD computation.

    Every subtype stores estimates of both the expected log predictive density (elpd) and the effective number of parameters p, as well as standard errors and pointwise estimates of each, from which other relevant estimates can be computed.

    Subtypes implement the following functions:

    source
    PosteriorStats.PSISLOOResultType

    Results of Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO).

    See also: loo, AbstractELPDResult

    • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

    • pointwise: Pointwise estimates

    • psis_result: Pareto-smoothed importance sampling (PSIS) results

    source
    PosteriorStats.WAICResultType

    Results of computing the widely applicable information criterion (WAIC).

    See also: waic, AbstractELPDResult

    • estimates: Estimates of the expected log pointwise predictive density (ELPD) and effective number of parameters (p)

    • pointwise: Pointwise estimates

    source
    PosteriorStats.elpd_estimatesFunction
    elpd_estimates(result::AbstractELPDResult; pointwise=false) -> (; elpd, elpd_mcse, lpd)

    Return the (E)LPD estimates from the result.

    source
    PosteriorStats.information_criterionFunction
    information_criterion(elpd, scale::Symbol)

    Compute the information criterion for the given scale from the elpd estimate.

    scale must be one of (:deviance, :log, :negative_log).

    See also: loo, waic

    source
    information_criterion(result::AbstractELPDResult, scale::Symbol; pointwise=false)

    Compute information criterion for the given scale from the existing ELPD result.

    scale must be one of (:deviance, :log, :negative_log).

    If pointwise=true, then pointwise estimates are returned.
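    As an assumed illustration of the usual scale conventions (stated here as an assumption, not from the docstring: :log is the ELPD itself, :negative_log flips the sign, and :deviance multiplies by -2), using the scalar method:

```julia
using PosteriorStats

elpd = -31.0
information_criterion(elpd, :log)           # the ELPD itself
information_criterion(elpd, :negative_log)  # -elpd
information_criterion(elpd, :deviance)      # -2 * elpd
```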

    source
    PosteriorStats.looFunction
    loo(log_likelihood; reff=nothing, kwargs...) -> PSISLOOResult{<:NamedTuple,<:NamedTuple}

    Compute the Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO). [Vehtari2017][LOOFAQ]

    log_likelihood must be an array of log-likelihood values with shape (draws, chains[, params...]).

    Keywords

    • reff::Union{Real,AbstractArray{<:Real}}: The relative effective sample size(s) of the likelihood values. If an array, it must have the same data dimensions as the corresponding log-likelihood variable. If not provided, then this is estimated using MCMCDiagnosticTools.ess.
    • kwargs: Remaining keywords are forwarded to PSIS.psis.

    See also: PSISLOOResult, waic

    Examples

    Manually compute $R_\mathrm{eff}$ and calculate PSIS-LOO of a model:

    julia> using ArviZExampleData, MCMCDiagnosticTools
     
     julia> idata = load_example_data("centered_eight");
     
     Pareto shape (k) diagnostic values:
                         Count      Min. ESS
      (-Inf, 0.5]  good  7 (87.5%)  151
      (0.5, 0.7]  okay  1 (12.5%)  446
    source
    PosteriorStats.waicFunction
    waic(log_likelihood::AbstractArray) -> WAICResult{<:NamedTuple,<:NamedTuple}

    Compute the widely applicable information criterion (WAIC).[Watanabe2010][Vehtari2017][LOOFAQ]

    log_likelihood must be an array of log-likelihood values with shape (draws, chains[, params...]).

    See also: WAICResult, loo

    Examples

    Calculate WAIC of a model:

    julia> using ArviZExampleData
     
     julia> idata = load_example_data("centered_eight");
     
     julia> waic(log_like)
     WAICResult with estimates
      elpd  elpd_mcse    p  p_mcse
       -31        1.4  0.9    0.33
    source

    Model comparison

    PosteriorStats.ModelComparisonResultType
    ModelComparisonResult

    Result of model comparison using ELPD.

    This struct implements the Tables and TableTraits interfaces.

    Each field returns a collection of the corresponding entry for each model:

    • name: Names of the models, if provided.

    • rank: Ranks of the models (ordered by decreasing ELPD)

    • elpd_diff: ELPD of a model subtracted from the largest ELPD of any model

    • elpd_diff_mcse: Monte Carlo standard error of the ELPD difference

    • weight: Model weights computed with weights_method

    • elpd_result: AbstractELPDResults for each model, which can be used to access useful stats like ELPD estimates, pointwise estimates, and Pareto shape values for PSIS-LOO

    • weights_method: Method used to compute model weights with model_weights

    source
    PosteriorStats.compareFunction
    compare(models; kwargs...) -> ModelComparisonResult

    Compare models based on their expected log pointwise predictive density (ELPD).

    The ELPD is estimated either by Pareto-smoothed importance sampling leave-one-out cross-validation (LOO) or using the widely applicable information criterion (WAIC). We recommend loo. For the underlying theory, see the paper at dx.doi.org/10.1111/1467-9868.00353.

    Arguments

    • models: a Tuple, NamedTuple, or AbstractVector whose values are either AbstractELPDResult entries or any argument to elpd_method.

    Keywords

    • weights_method::AbstractModelWeightsMethod=Stacking(): the method to be used to weight the models. See model_weights for details
    • elpd_method=loo: a method that computes an AbstractELPDResult from an argument in models.
    • sort::Bool=true: Whether to sort models by decreasing ELPD.

    Returns

    • ModelComparisonResult: A container for the model comparison results. The fields contain a similar collection to models.

    Examples

    Compare the centered and non-centered models of the eight schools problem using the defaults: loo and Stacking weights. A custom myloo method formats the inputs as expected by loo.

    julia> using ArviZExampleData
     
     julia> models = (
                centered=load_example_data("centered_eight"),
                    rank  elpd  elpd_mcse  elpd_diff  elpd_diff_mcse  weight    p   ⋯
      non_centered     1   -31        1.4       0              0.0      0.52  0.9   ⋯
      centered         2   -31        1.4       0.06           0.067    0.48  0.9   ⋯
                                                                     1 column omitted
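    One possible shape for the myloo helper mentioned above (hypothetical: the field names and the dimension permutation depend on how the example data store the log-likelihood):

```julia
using PosteriorStats

# Reorder a stored log-likelihood into the (draws, chains, params...)
# array that loo expects, then compute PSIS-LOO.
function myloo(idata)
    log_like = permutedims(idata.log_likelihood.obs, (2, 3, 1))
    return loo(log_like)
end

# compare(models; elpd_method=myloo) would then accept the raw datasets.
```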
    source
    PosteriorStats.model_weightsFunction
    model_weights(elpd_results; method=Stacking())
     model_weights(method::AbstractModelWeightsMethod, elpd_results)

    Compute weights for each model in elpd_results using method.

    elpd_results is a Tuple, NamedTuple, or AbstractVector with AbstractELPDResult entries. The weights are returned in the same type of collection.

    Stacking is the recommended approach, as it performs well even when the true data generating process is not included among the candidate models. See [YaoVehtari2018] for details.

    See also: AbstractModelWeightsMethod, compare

    Examples

    Compute Stacking weights for two models:

    julia> using ArviZExampleData
     
     julia> models = (
       :non_centered => 1.0

    Now we compute BootstrappedPseudoBMA weights for the same models:

    julia> model_weights(elpd_results; method=BootstrappedPseudoBMA()) |> pairs
     pairs(::NamedTuple) with 2 entries:
       :centered     => 0.483727
      :non_centered => 0.516273
    source

    The following model weighting methods are available

    PosteriorStats.BootstrappedPseudoBMAType
    struct BootstrappedPseudoBMA{R<:Random.AbstractRNG, T<:Real} <: AbstractModelWeightsMethod

    Model weighting method using pseudo Bayesian Model Averaging with Akaike-type weighting and the Bayesian bootstrap (pseudo-BMA+)[YaoVehtari2018].

    The Bayesian bootstrap stabilizes the model weights.

    BootstrappedPseudoBMA(; rng=Random.default_rng(), samples=1_000, alpha=1)
    BootstrappedPseudoBMA(rng, samples, alpha)

    Construct the method.

    • rng::Random.AbstractRNG: The random number generator to use for the Bayesian bootstrap

    • samples::Int64: The number of samples to draw for bootstrapping

    • alpha::Real: The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default (1) corresponds to a uniform distribution on the simplex.

    See also: Stacking

    source
    PosteriorStats.PseudoBMAType
    struct PseudoBMA <: AbstractModelWeightsMethod

    Model weighting method using pseudo Bayesian Model Averaging (pseudo-BMA) and Akaike-type weighting.

    PseudoBMA(; regularize=false)
    PseudoBMA(regularize)

    Construct the method with optional regularization of the weights using the standard error of the ELPD estimate.

    Note

    This approach is not recommended, as it produces unstable weight estimates. It is recommended to instead use BootstrappedPseudoBMA to stabilize the weights or Stacking. For details, see [YaoVehtari2018].

    See also: Stacking

    source
    PosteriorStats.StackingType
    struct Stacking{O<:Optim.AbstractOptimizer} <: AbstractModelWeightsMethod

    Model weighting using stacking of predictive distributions[YaoVehtari2018].

    Stacking(; optimizer=Optim.LBFGS(), options=Optim.Options())
    Stacking(optimizer[, options])

    Construct the method, optionally customizing the optimization.

    • optimizer::Optim.AbstractOptimizer: The optimizer to use for the optimization of the weights. The optimizer must support projected gradient optimization via a manifold field.

    • options::Optim.Options: The Optim options to use for the optimization of the weights.

    See also: BootstrappedPseudoBMA

    source

    Predictive checks

    PosteriorStats.loo_pitFunction
    loo_pit(y, y_pred, log_weights; kwargs...) -> Union{Real,AbstractArray}

    Compute leave-one-out probability integral transform (LOO-PIT) checks.

    Arguments

    • y: array of observations with shape (params...,)
    • y_pred: array of posterior predictive samples with shape (draws, chains, params...).
    • log_weights: array of normalized log LOO importance weights with shape (draws, chains, params...).

    Keywords

    • is_discrete: If not provided, then it is set to true iff elements of y and y_pred are all integer-valued. If true, then data are smoothed using smooth_data to make them non-discrete before estimating LOO-PIT values.
    • kwargs: Remaining keywords are forwarded to smooth_data if data is discrete.

    Returns

    • pitvals: LOO-PIT values with same size as y. If y is a scalar, then pitvals is a scalar.

    LOO-PIT is a marginal posterior predictive check. If $y_{-i}$ is the array $y$ of observations with the $i$th observation left out, and $y_i^*$ is a posterior prediction of the $i$th observation, then the LOO-PIT value for the $i$th observation is defined as

    \[P(y_i^* \le y_i \mid y_{-i}) = \int_{-\infty}^{y_i} p(y_i^* \mid y_{-i}) \mathrm{d} y_i^*\]

    The LOO posterior predictions and the corresponding observations should have similar distributions, so if conditional predictive distributions are well-calibrated, then all LOO-PIT values should be approximately uniformly distributed on $[0, 1]$.[Gabry2019]
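    For a single observation, the integral above reduces to a weighted empirical CDF of the predictive draws evaluated at the observation; a minimal sketch (not the package's implementation, which also handles smoothing of discrete data):

```julia
# log_weights are assumed normalized so that exp.(log_weights) sums to 1
pit_scalar(y, y_pred, log_weights) = sum(exp.(log_weights) .* (y_pred .<= y))

y_pred = randn(1000, 4)                                 # (draws, chains)
log_weights = fill(-log(length(y_pred)), size(y_pred))  # uniform weights
pit_scalar(0.0, y_pred, log_weights)                    # near 0.5 for standard normal draws
```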

    Examples

    Calculate LOO-PIT values using as test quantity the observed values themselves.

    julia> using ArviZExampleData
    +  :non_centered => 0.516273
    source

    The following model weighting methods are available

    PosteriorStats.BootstrappedPseudoBMAType
    struct BootstrappedPseudoBMA{R<:Random.AbstractRNG, T<:Real} <: AbstractModelWeightsMethod

    Model weighting method using pseudo Bayesian Model Averaging using Akaike-type weighting with the Bayesian bootstrap (pseudo-BMA+)[YaoVehtari2018].

    The Bayesian bootstrap stabilizes the model weights.

    BootstrappedPseudoBMA(; rng=Random.default_rng(), samples=1_000, alpha=1)
    +BootstrappedPseudoBMA(rng, samples, alpha)

    Construct the method.

    • rng::Random.AbstractRNG: The random number generator to use for the Bayesian bootstrap

    • samples::Int64: The number of samples to draw for bootstrapping

    • alpha::Real: The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default (1) corresponds to a uniform distribution on the simplex.

    See also: Stacking

    source
    PosteriorStats.PseudoBMAType
    struct PseudoBMA <: AbstractModelWeightsMethod

    Model weighting method using pseudo Bayesian Model Averaging (pseudo-BMA) and Akaike-type weighting.

    PseudoBMA(; regularize=false)
    +PseudoBMA(regularize)

    Construct the method with optional regularization of the weights using the standard error of the ELPD estimate.

    Note

    This approach is not recommended, as it produces unstable weight estimates. It is recommended to instead use BootstrappedPseudoBMA to stabilize the weights or Stacking. For details, see [YaoVehtari2018].

    See also: Stacking

    source
    PosteriorStats.StackingType
    struct Stacking{O<:Optim.AbstractOptimizer} <: AbstractModelWeightsMethod

    Model weighting using stacking of predictive distributions[YaoVehtari2018].

    Stacking(; optimizer=Optim.LBFGS(), options=Optim.Options()
    +Stacking(optimizer[, options])

    Construct the method, optionally customizing the optimization.

    • optimizer::Optim.AbstractOptimizer: The optimizer to use for the optimization of the weights. The optimizer must support projected gradient optimization via a manifold field.

    • options::Optim.Options: The Optim options to use for the optimization of the weights.

    See also: BootstrappedPseudoBMA

    source

    Predictive checks

    PosteriorStats.loo_pitFunction
    loo_pit(y, y_pred, log_weights; kwargs...) -> Union{Real,AbstractArray}

    Compute leave-one-out probability integral transform (LOO-PIT) checks.

    Arguments

    • y: array of observations with shape (params...,)
    • y_pred: array of posterior predictive samples with shape (draws, chains, params...).
    • log_weights: array of normalized log LOO importance weights with shape (draws, chains, params...).

    Keywords

    • is_discrete: If not provided, then it is set to true iff elements of y and y_pred are all integer-valued. If true, then data are smoothed using smooth_data to make them non-discrete before estimating LOO-PIT values.
    • kwargs: Remaining keywords are forwarded to smooth_data if data is discrete.

    Returns

    • pitvals: LOO-PIT values with the same size as y. If y is a scalar, then pitvals is a scalar.

    LOO-PIT is a marginal posterior predictive check. If $y_{-i}$ is the array $y$ of observations with the $i$th observation left out, and $y_i^*$ is a posterior prediction of the $i$th observation, then the LOO-PIT value for the $i$th observation is defined as

    \[P(y_i^* \le y_i \mid y_{-i}) = \int_{-\infty}^{y_i} p(y_i^* \mid y_{-i}) \mathrm{d} y_i^*\]

    The LOO posterior predictions and the corresponding observations should have similar distributions, so if conditional predictive distributions are well-calibrated, then all LOO-PIT values should be approximately uniformly distributed on $[0, 1]$.[Gabry2019]
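    The argument shapes can be sketched with synthetic arrays (hypothetical data, using uniform importance weights rather than weights from a fitted model):

    ```julia
    using PosteriorStats

    ndraws, nchains, nparams = 100, 4, 8
    y = randn(nparams)                         # observations, shape (params...,)
    y_pred = randn(ndraws, nchains, nparams)   # posterior predictive draws
    # Normalized log-weights: uniform weight 1/(ndraws*nchains) for every draw
    log_weights = fill(-log(ndraws * nchains), ndraws, nchains, nparams)
    pit = loo_pit(y, y_pred, log_weights)
    @assert size(pit) == size(y)
    ```

    In practice log_weights would come from Pareto-smoothed importance sampling rather than being uniform.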

    Examples

    Calculate LOO-PIT values using as test quantity the observed values themselves.

    julia> using ArviZExampleData
     
     julia> idata = load_example_data("centered_eight");
     
      "Hotchkiss"         0.435094
      "Lawrenceville"     0.220627
      "St. Paul's"        0.775086
     "Mt. Hermon"        0.296706
    source
    PosteriorStats.r2_scoreFunction
    r2_score(y_true::AbstractVector, y_pred::AbstractArray) -> (; r2, r2_std)

    $R²$ for linear Bayesian regression models.[GelmanGoodrich2019]

    Arguments

    • y_true: Observed data of length noutputs
    • y_pred: Predicted data with size (ndraws[, nchains], noutputs)
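    The call shape can be sketched with synthetic draws (hypothetical data, not the packaged example):

    ```julia
    using PosteriorStats

    ndraws, noutputs = 200, 30
    x = randn(noutputs)
    # Synthetic "posterior" draws of a linear predictor plus noise:
    y_true = 2 .* x .+ 0.5 .* randn(noutputs)
    y_pred = 2 .* x' .+ 0.5 .* randn(ndraws, noutputs)  # size (ndraws, noutputs)
    (; r2, r2_std) = r2_score(y_true, y_pred)
    ```

    r2 is the posterior mean Bayesian R² and r2_std its posterior standard deviation.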

    Examples

    julia> using ArviZExampleData
     
     julia> idata = load_example_data("regression1d");
     
     julia> r2_score(y_true, y_pred) |> pairs
     pairs(::NamedTuple) with 2 entries:
       :r2     => 0.683197
       :r2_std => 0.0368838
    source

    Utilities

    PosteriorStats.kde_reflectedFunction
    kde_reflected(data::AbstractVector{<:Real}; bounds=extrema(data), kwargs...)

    Compute the boundary-corrected kernel density estimate (KDE) of data using reflection.

    For $x \in (l, u)$, the reflected KDE has the density

    \[\hat{f}_R(x) = \hat{f}(x) + \hat{f}(2l - x) + \hat{f}(2u - x),\]

    where $\hat{f}$ is the usual KDE of data. This is equivalent to augmenting the original data with 2 additional copies of the data reflected around each bound, computing the usual KDE, trimming the KDE to the bounds, and renormalizing.

    Any non-finite bounds are ignored. Remaining kwargs are passed to KernelDensity.kde. The default bandwidth is estimated using the Improved Sheather-Jones (ISJ) method [Botev2010].
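    For instance, to correct boundary bias for nonnegative data (a sketch; remaining keywords pass through to KernelDensity.kde):

    ```julia
    using PosteriorStats

    data = abs.(randn(1_000))                 # nonnegative data with a hard boundary at 0
    k = kde_reflected(data; bounds=(0, Inf))  # the non-finite upper bound is ignored
    ```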

    source
    PosteriorStats.smooth_dataFunction
    smooth_data(y; dims=:, interp_method=CubicSpline, offset_frac=0.01)

    Smooth y along dims using interp_method.

    interp_method is a 2-argument callable that takes the arguments y and x and returns a DataInterpolations.jl interpolation method, defaulting to a cubic spline interpolator.

    offset_frac is the fraction of the length of y to use as an offset when interpolating.
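    A minimal sketch of smoothing discrete draws, as done internally before computing LOO-PIT values (synthetic data):

    ```julia
    using PosteriorStats

    y = rand(0:10, 1_000)      # integer-valued draws, as flagged by is_discrete in loo_pit
    y_smooth = smooth_data(y)  # cubic-spline smoothing with the default offset_frac=0.01
    ```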

    source
    • Hyndman1996Rob J. Hyndman (1996) Computing and Graphing Highest Density Regions, Amer. Stat., 50(2): 120-6. DOI: 10.1080/00031305.1996.10474359 jstor.
    • ChenShao1999Ming-Hui Chen & Qi-Man Shao (1999) Monte Carlo Estimation of Bayesian Credible and HPD Intervals, J Comput. Graph. Stat., 8:1, 69-92. DOI: 10.1080/10618600.1999.10474802 jstor.
    • Vehtari2017Vehtari, A., Gelman, A. & Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat Comput 27, 1413–1432 (2017). doi: 10.1007/s11222-016-9696-4 arXiv: 1507.04544
    • LOOFAQAki Vehtari. Cross-validation FAQ. https://mc-stan.org/loo/articles/online-only/faq.html
    • Watanabe2010Watanabe, S. Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory. J. Mach. Learn. Res. 11(116):3571−3594, 2010. https://jmlr.csail.mit.edu/papers/v11/watanabe10a.html
    • YaoVehtari2018Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman. Using Stacking to Average Bayesian Predictive Distributions. 2018. Bayesian Analysis. 13, 3, 917–1007. doi: 10.1214/17-BA1091 arXiv: 1704.02030
    • Gabry2019Gabry, J., Simpson, D., Vehtari, A., Betancourt, M. & Gelman, A. Visualization in Bayesian Workflow. J. R. Stat. Soc. Ser. A Stat. Soc. 182, 389–402 (2019). doi: 10.1111/rssa.12378 arXiv: 1709.01449
    • GelmanGoodrich2019Andrew Gelman, Ben Goodrich, Jonah Gabry & Aki Vehtari (2019) R-squared for Bayesian Regression Models, The American Statistician, 73:3, 307-9, DOI: 10.1080/00031305.2018.1549100.
    • Botev2010Kernel density estimation via diffusion. Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010) Annals of Statistics, Volume 38, Number 5, pages 2916-2957. doi: 10.1214/10-AOS799

    Home · PosteriorStats.jl

    PosteriorStats

    PosteriorStats implements widely-used and well-characterized statistical analyses for the Bayesian workflow. These functions generally estimate properties of posterior and/or posterior predictive distributions. The default implementations defined here operate on Monte Carlo samples.

    See the API for details.

    Extending this package

    The methods defined here are intended to be extended by two types of packages.

    • packages that implement data types for storing Monte Carlo samples
    • packages that implement other representations for posterior distributions than Monte Carlo draws