more accuracy for p_sig/p_dir #1012

Merged · 5 commits · Sep 12, 2024
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -113,6 +113,7 @@ Suggests:
coxme,
cplm,
dbscan,
distributional,
domir (>= 0.2.0),
drc,
DRR,
@@ -221,4 +222,4 @@ Config/testthat/edition: 3
Config/testthat/parallel: true
Config/Needs/website: easystats/easystatstemplate
Config/rcmdcheck/ignore-inconsequential-notes: true
Remotes: easystats/insight
Remotes: easystats/insight, easystats/bayestestR
98 changes: 64 additions & 34 deletions R/equivalence_test.R
@@ -10,15 +10,14 @@
#'
#' @param x A statistical model.
#' @param range The range of practical equivalence of an effect. May be
#' `"default"`, to automatically define this range based on properties of the
#' model's data.
#' `"default"`, to automatically define this range based on properties of the
#' model's data.
#' @param ci Confidence Interval (CI) level. Defaults to `0.95` (`95%`).
#' @param rule Character, indicating the rules when testing for practical
#' equivalence. Can be `"bayes"`, `"classic"` or `"cet"`. See
#' 'Details'.
#' equivalence. Can be `"bayes"`, `"classic"` or `"cet"`. See 'Details'.
#' @param test Hypothesis test for computing contrasts or pairwise comparisons.
#' See [`?ggeffects::test_predictions`](https://strengejacke.github.io/ggeffects/reference/test_predictions.html)
#' for details.
#' See [`?ggeffects::test_predictions`](https://strengejacke.github.io/ggeffects/reference/test_predictions.html)
#' for details.
#' @param verbose Toggle warnings and messages.
#' @param ... Arguments passed to or from other methods.
#' @inheritParams model_parameters.merMod
@@ -28,18 +27,17 @@
#' readings can be found in the references. See also [`p_significance()`] for
#' a unidirectional equivalence test.
#'
#' @details
#' In classical null hypothesis significance testing (NHST) within a frequentist
#' framework, it is not possible to accept the null hypothesis, H0 - unlike
#' in Bayesian statistics, where such probability statements are possible.
#' "[...] one can only reject the null hypothesis if the test
#' @details In classical null hypothesis significance testing (NHST) within a
#' frequentist framework, it is not possible to accept the null hypothesis, H0 -
#' unlike in Bayesian statistics, where such probability statements are
#' possible. "[...] one can only reject the null hypothesis if the test
#' statistics falls into the critical region(s), or fail to reject this
#' hypothesis. In the latter case, all we can say is that no significant effect
#' was observed, but one cannot conclude that the null hypothesis is true."
#' (_Pernet 2017_). One way to address this issue without Bayesian methods
#' is *Equivalence Testing*, as implemented in `equivalence_test()`.
#' While you either can reject the null hypothesis or claim an inconclusive result
#' in NHST, the equivalence test - according to _Pernet_ - adds a third category,
#' (_Pernet 2017_). One way to address this issue without Bayesian methods is
#' *Equivalence Testing*, as implemented in `equivalence_test()`. While you
#' can either reject the null hypothesis or claim an inconclusive result in
#' NHST, the equivalence test - according to _Pernet_ - adds a third category,
#' *"accept"*. Roughly speaking, the idea behind equivalence testing in a
#' frequentist framework is to check whether an estimate and its uncertainty
#' (i.e. confidence interval) falls within a region of "practical equivalence".
@@ -95,25 +93,25 @@
#' ## p-Values
#' The equivalence p-value is the area of the (cumulative) confidence
#' distribution that is outside of the region of equivalence. It can be
#' interpreted as p-value for *rejecting* the alternative hypothesis
#' and *accepting* the "null hypothesis" (i.e. assuming practical
#' equivalence). That is, a high p-value means we reject the assumption of
#' practical equivalence and accept the alternative hypothesis.
#' interpreted as the p-value for *rejecting* the alternative hypothesis and
#' *accepting* the "null hypothesis" (i.e. assuming practical equivalence). That
#' is, a high p-value means we reject the assumption of practical equivalence
#' and accept the alternative hypothesis.
#'
#' ## Second Generation p-Value (SGPV)
#' Second generation p-values (SGPV) were proposed as a statistic that
#' represents _the proportion of data-supported hypotheses that are also null
#' hypotheses_ _(Blume et al. 2018, Lakens and Delacre 2020)_. It represents the
#' proportion of the _full_ confidence interval range (assuming a normally
#' distributed, equal-tailed interval) that is inside the ROPE. The SGPV ranges
#' from zero to one. Higher values indicate that the effect is more likely to be
#' practically equivalent ("not of interest").
#' proportion of the _full_ confidence interval range (assuming a normally or
#' t-distributed, equal-tailed interval, based on the model) that is inside the
#' ROPE. The SGPV ranges from zero to one. Higher values indicate that the
#' effect is more likely to be practically equivalent ("not of interest").
#'
#' Note that the assumed interval, which is used to calculate the SGPV, is an
#' estimation of the _full interval_ based on the chosen confidence level. For
#' example, if the 95% confidence interval of a coefficient ranges from -1 to 1,
#' the underlying _full (normally distributed) interval_ approximately ranges
#' from -1.9 to 1.9, see also following code:
#' the underlying _full (normally or t-distributed) interval_ approximately
#' ranges from -1.9 to 1.9; see also the following code:
#'
#' ```
#' # simulate full normal distribution
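#' # (a sketch, assuming a 95% CI of [-1, 1]; the original snippet is
#' # collapsed in this diff view)
#' out <- bayestestR::distribution_normal(n = 10000, mean = 0, sd = 1 / stats::qnorm(0.975))
#' range(out) # roughly -1.9 to 1.9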
@@ -390,6 +388,7 @@
focal <- attributes(x)$original.terms
obj_name <- attributes(x)$model.name
ci <- attributes(x)$ci.lvl
dof <- attributes(x)$df

x <- .get_ggeffects_model(x)

@@ -419,6 +418,7 @@
ci_narrow,
range_rope = range,
rule = rule,
dof = dof,
verbose = verbose
)
}, conf_int, conf_int2
@@ -490,6 +490,18 @@
}


# ==== check degrees of freedom ====

df_column <- grep("(df|df_error)", colnames(x))
if (length(df_column) > 0) {
dof <- unique(x[[df_column]])
if (length(dof) > 1) {
dof <- Inf
}
} else {
dof <- Inf
}
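# a sketch, not part of the diff: with few dof, t-quantiles are noticeably
# wider than normal quantiles, which is why the dof extracted here matter, e.g.
#   stats::qt(0.975, df = 5)  # ~2.571
#   stats::qnorm(0.975)       # ~1.960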

# ==== requested confidence intervals ====

params <- conf_int <- .ci_generic(x, ci = ci)
@@ -513,6 +525,7 @@
ci_narrow,
range_rope = range,
rule = rule,
dof = dof,
verbose = verbose
)
}, conf_int, conf_int2
@@ -619,11 +632,12 @@

#' @keywords internal
.equivalence_test_numeric <- function(ci = 0.95,
                                      ci_wide,
                                      ci_narrow,
                                      range_rope,
                                      rule,
                                      dof = Inf,
                                      verbose) {
final_ci <- NULL

# ==== HDI+ROPE decision rule, by Kruschke ====
@@ -674,7 +688,7 @@
data.frame(
CI_low = final_ci[1],
CI_high = final_ci[2],
SGPV = .rope_coverage(ci = ci, range_rope, ci_range = final_ci),
SGPV = .rope_coverage(ci = ci, range_rope, ci_range = final_ci, dof = dof),
ROPE_low = range_rope[1],
ROPE_high = range_rope[2],
ROPE_Equivalence = decision,
@@ -726,8 +740,8 @@
# same range / limits as the confidence interval, thus indeed representing a
# normally distributed confidence interval. We then calculate the probability
# mass of this interval that is inside the ROPE.
.rope_coverage <- function(ci = 0.95, range_rope, ci_range) {
out <- .generate_posterior_from_ci(ci, ci_range)
.rope_coverage <- function(ci = 0.95, range_rope, ci_range, dof = Inf) {
out <- .generate_posterior_from_ci(ci, ci_range, dof = dof)
# compare: ci_range and range(out)
# The SGPV refers to the proportion of the confidence interval inside the
# full ROPE - thus, we set ci = 1 here
@@ -736,23 +750,39 @@
}


.generate_posterior_from_ci <- function(ci = 0.95, ci_range, precision = 10000) {
.generate_posterior_from_ci <- function(ci = 0.95, ci_range, dof = Inf, precision = 10000) {
# this function creates an approximate normal or t-distribution that covers
# the CI-range, i.e. we "simulate" a posterior distribution from a frequentist CI

# first we need the z-values of the quantiles from the CI range
z_value <- stats::qnorm((1 + ci) / 2)
# then we need the range of the CI (in units), also to calculate the mean of
# sanity check - dof argument
if (is.null(dof)) {
dof <- Inf
}
# first we need the range of the CI (in units), also to calculate the mean of
# the normal distribution
diff_ci <- abs(diff(ci_range))
mean_dist <- ci_range[2] - (diff_ci / 2)
# then we need the critical values of the quantiles from the CI range
z_value <- stats::qt((1 + ci) / 2, df = dof)
# the range of Z-scores (from lower to upper quantile) gives us the range of
# the provided interval in terms of standard deviations. now we divide the
# known range of the provided CI (in units) by the z-score-range, which will
# give us the standard deviation of the distribution.
sd_dist <- diff_ci / diff(c(-1 * z_value, z_value))
# we now know all parameters (mean and sd) to simulate a normal distribution
bayestestR::distribution_normal(n = precision, mean = mean_dist, sd = sd_dist)
# generate a normal distribution if we don't have a t-distribution, or if
# we don't have the necessary packages installed
if (is.infinite(dof) || !insight::check_if_installed("distributional", quietly = TRUE)) {
# tell user to install "distributional"
if (!is.infinite(dof)) {
insight::format_alert("For models with only few degrees of freedom, install the {distributional} package to increase accuracy of `p_direction()`, `p_significance()` and `equivalence_test()`.") # nolint
}
# we now know all parameters (mean and sd) to simulate a normal distribution
bayestestR::distribution_normal(n = precision, mean = mean_dist, sd = sd_dist)
} else {
insight::check_if_installed("distributional")
out <- distributional::dist_student_t(df = dof, mu = mean_dist, sigma = sd_dist)
sort(unlist(distributional::generate(out, times = precision), use.names = FALSE))
}
}
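For intuition, here is a minimal standalone sketch of the recovery step above in the normal case (`dof = Inf`); the variable names mirror the function body, but the block itself is illustrative and not part of the diff:

```r
# recover the "full" normal distribution behind a 95% CI of [-1, 1]
ci <- 0.95
ci_range <- c(-1, 1)
diff_ci <- abs(diff(ci_range))                   # CI width, in units
mean_dist <- ci_range[2] - (diff_ci / 2)         # CI midpoint, here 0
z_value <- stats::qt((1 + ci) / 2, df = Inf)     # ~1.96; equals stats::qnorm() at df = Inf
sd_dist <- diff_ci / diff(c(-z_value, z_value))  # ~0.51
out <- bayestestR::distribution_normal(n = 10000, mean = mean_dist, sd = sd_dist)
range(out)                                       # roughly -1.9 to 1.9
```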


26 changes: 14 additions & 12 deletions R/p_significance.R
Original file line number Diff line number Diff line change
@@ -35,21 +35,22 @@ bayestestR::p_significance
#' for functions related to checking effect existence and significance.
#'
#' @details `p_significance()` returns the proportion of the _full_ confidence
#' interval range (assuming a normally distributed, equal-tailed interval) that
#' is outside a certain range (the negligible effect, or ROPE, see argument
#' `threshold`). If there are values of the distribution both below and above
#' the ROPE, `p_significance()` returns the higher probability of a value being
#' outside the ROPE. Typically, this value should be larger than 0.5 to indicate
#' practical significance. However, if the range of the negligible effect is
#' rather large compared to the range of the confidence interval,
#' `p_significance()` will be less than 0.5, which indicates no clear practical
#' significance.
#' interval range (assuming a normally or t-distributed, equal-tailed interval,
#' based on the model) that is outside a certain range (the negligible effect,
#' or ROPE, see argument `threshold`). If there are values of the distribution
#' both below and above the ROPE, `p_significance()` returns the higher
#' probability of a value being outside the ROPE. Typically, this value should
#' be larger than 0.5 to indicate practical significance. However, if the range
#' of the negligible effect is rather large compared to the range of the
#' confidence interval, `p_significance()` will be less than 0.5, which
#' indicates no clear practical significance.
#'
#' Note that the assumed interval, which is used to calculate the practical
#' significance, is an estimation of the _full interval_ based on the chosen
#' confidence level. For example, if the 95% confidence interval of a
#' coefficient ranges from -1 to 1, the underlying _full (normally distributed)
#' interval_ approximately ranges from -1.9 to 1.9, see also following code:
#' coefficient ranges from -1 to 1, the underlying _full (normally or
#' t-distributed) interval_ approximately ranges from -1.9 to 1.9; see also
#' the following code:
#'
#' ```
#' # simulate full normal distribution
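#' # (a sketch, assuming a 95% CI of [-1, 1]; the original snippet is
#' # collapsed in this diff view)
#' out <- bayestestR::distribution_normal(n = 10000, mean = 0, sd = 1 / stats::qnorm(0.975))
#' range(out) # roughly -1.9 to 1.9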
@@ -178,11 +179,12 @@ p_significance.lm <- function(x, threshold = "default", ci = 0.95, verbose = TRU
.posterior_ci <- function(x, ci, ...) {
# first, we need CIs
out <- ci(x, ci = ci, ...)
dof <- .safe(insight::get_df(x, type = "wald"), Inf)
# we now iterate all confidence intervals and create an approximate normal
# (or t-) distribution that covers the CI-range.
posterior <- as.data.frame(lapply(seq_len(nrow(out)), function(i) {
ci_range <- as.numeric(out[i, c("CI_low", "CI_high")])
.generate_posterior_from_ci(ci, ci_range)
.generate_posterior_from_ci(ci, ci_range, dof = dof)
}))
colnames(posterior) <- out$Parameter

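A hedged usage sketch (not part of the diff): with a small sample, the Wald dof picked up by `.posterior_ci()` are low, so the t-based recovery visibly affects `p_significance()` and `equivalence_test()`:

```r
library(parameters)

# small sample -> few residual degrees of freedom
m <- lm(mpg ~ wt + hp, data = mtcars[1:12, ])
p_significance(m, ci = 0.95)
equivalence_test(m, ci = 0.95)
```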
39 changes: 19 additions & 20 deletions man/equivalence_test.lm.Rd

Generated file; diff not rendered.

2 changes: 1 addition & 1 deletion man/p_direction.lm.Rd

Generated file; diff not rendered.
