survey_prop should default to proportions #142

Merged (5 commits) on Feb 20, 2023
35 changes: 24 additions & 11 deletions R/survey_statistics.r
@@ -1,12 +1,14 @@
#' Calculate mean/proportion and its variation using survey methods
#'
#' Calculate means and proportions from complex survey data. A wrapper
#' around \code{\link[survey]{svymean}}, or if \code{proportion = TRUE},
#' \code{\link[survey]{svyciprop}}. \code{survey_mean} should always be
#' called from \code{\link{summarise}}.
#' Calculate means and proportions from complex survey data.
#' \code{survey_mean} with \code{proportion = FALSE} (the default) or \code{survey_prop} with \code{proportion = FALSE}
#' is a wrapper around \code{\link[survey]{svymean}}.
#' \code{survey_prop} with \code{proportion = TRUE} (the default) or \code{survey_mean} with \code{proportion = TRUE}
#' is a wrapper around \code{\link[survey]{svyciprop}}.
#' \code{survey_mean} and \code{survey_prop} should always be called from \code{\link{summarise}}.
#'
#' Using \code{survey_prop} is equivalent to leaving out the \code{x} argument in
#' \code{survey_mean} and this calculates the proportion represented within the
#' \code{survey_mean} and setting \code{proportion = TRUE} and this calculates the proportion represented within the
#' data, with the last grouping variable "unpeeled". \code{\link{interact}}
#' allows for "unpeeling" multiple variables at once.
#'
@@ -93,7 +95,7 @@ survey_mean <- function(
vartype = c("se", "ci", "var", "cv"),
level = 0.95,
proportion = FALSE,
prop_method = c("logit", "likelihood", "asin", "beta", "mean"),
prop_method = c("logit", "likelihood", "asin", "beta", "mean", "xlogit"),
deff = FALSE,
df = NULL,
...
@@ -105,9 +107,15 @@ survey_mean <- function(
}
prop_method <- match.arg(prop_method)
if (is.null(df)) df <- survey::degf(cur_svy_full())
if (missing(x)) return(survey_prop(vartype = vartype, level = level,
proportion = proportion, prop_method = prop_method,
deff = deff, df = df, .svy = cur_svy()))
if (missing(x)){
if (missing(proportion) & ("ci" %in% vartype)){
Owner

This message is not actually true because of how missing args work: even though `missing(proportion)` is TRUE, the default value comes from line 97, not line 143, in this branch of code.

I actually kind of prefer this behavior: my argument would be that this way we adhere better to the mental model that survey_mean's behavior comes from survey::svymean and survey_prop's comes from survey::svyciprop. But I wanted to check whether you disagree.

My preference is to fix this by removing this inform, but the other option is to change line 97 so that the default is TRUE.

library(srvyr)
#> 
#> Attaching package: 'srvyr'
#> The following object is masked from 'package:stats':
#> 
#>     filter
data(api, package = "survey")

dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)



dstrata %>%
  group_by(above = api99 > 600) %>%
  summarise(
    prop_default = survey_mean(vartype = "ci"),
    prop_true = survey_mean(proportion = TRUE, vartype = "ci"),
    prop_false = survey_mean(proportion = FALSE, vartype = "ci")
  ) %>% select(ends_with("low"))
#> When `proportion` is unspecified, `survey_mean()` now defaults to `proportion = TRUE` when `x` is left out. This should improve confidence interval coverage.
#> This message is displayed once per session.
#> # A tibble: 2 × 3
#>   prop_default_low prop_true_low prop_false_low
#>              <dbl>         <dbl>          <dbl>
#> 1            0.365         0.367          0.365
#> 2            0.483         0.483          0.483
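
The point about `missing()` in the comment above can be reproduced in base R without srvyr. The sketch below uses two hypothetical stand-in functions, `inner()` and `outer()`, which are not part of the package; they only mirror the pattern of one function forwarding its own argument to another that has a different default.

```r
# Hypothetical stand-ins for survey_prop() and survey_mean(); not srvyr code.
inner <- function(proportion = TRUE) {
  # Returns whichever value of `proportion` is actually in effect here.
  proportion
}

outer <- function(proportion = FALSE) {
  list(
    # TRUE when the caller of outer() left `proportion` out.
    caller_omitted = missing(proportion),
    # The argument is forwarded explicitly, so inner() receives outer()'s
    # default (FALSE); inner()'s own default (TRUE) is never consulted.
    value_used = inner(proportion = proportion)
  )
}

res <- outer()
res$caller_omitted  # TRUE
res$value_used      # FALSE
inner()             # TRUE, inner()'s default applies only when called directly
```

So a message keyed on `missing(proportion)` in the outer function can fire even though the value that reaches the inner function is still the outer default.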

Owner

@gergness, Feb 20, 2023

Sorry, I just noticed this as I was getting ready to submit. I need to submit in the next few days, so I plan to go with my gut when I get back to this later tonight, unless I hear from you.

inform("When `proportion` is unspecified, `survey_mean()` now defaults to `proportion = TRUE` when `x` is left out. This should improve confidence interval coverage.",
.frequency = "once", .frequency_id="sm_pd")
}
return(survey_prop(vartype = vartype, level = level,
proportion = proportion, prop_method = prop_method,
deff = deff, df = df, .svy = cur_svy()))
}
stop_for_factor(x)
if (!proportion) {
if (is.logical(x)) x <- as.integer(x)
@@ -132,15 +140,20 @@ survey_mean <- function(
survey_prop <- function(
vartype = c("se", "ci", "var", "cv"),
level = 0.95,
proportion = FALSE,
prop_method = c("logit", "likelihood", "asin", "beta", "mean"),
proportion = TRUE,
prop_method = c("logit", "likelihood", "asin", "beta", "mean", "xlogit"),
deff = FALSE,
df = NULL,
...
) {
.svy <- cur_svy()
.full_svy <- cur_svy_full()

if (missing(proportion) & ("ci" %in% vartype)){
inform("When `proportion` is unspecified, `survey_prop()` now defaults to `proportion = TRUE`. This should improve confidence interval coverage.",
Owner

After seeing how long the message is, let's break it up by doing the following:

inform(
    c(
        "When `proportion` is unspecified, `survey_prop()` now defaults to `proportion = TRUE`.",  
        i = "This should improve confidence interval coverage."
    ),
    .frequency = "once", .frequency_id="spd"
)
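
As a rough check of how the suggested call behaves, the sketch below wraps it in a hypothetical helper (`notify_prop_default()`, not part of srvyr) and counts how many message conditions are signalled across two calls. It assumes the rlang package is installed; the `.frequency_id` value is made up for the example so it does not collide with the real one.

```r
library(rlang)

# Hypothetical helper wrapping the suggested inform() call.
notify_prop_default <- function() {
  inform(
    c(
      "When `proportion` is unspecified, `survey_prop()` now defaults to `proportion = TRUE`.",
      # A bullet named `i` is rendered as an informational bullet.
      i = "This should improve confidence interval coverage."
    ),
    .frequency = "once",
    .frequency_id = "spd_example"  # illustrative id, not the package's
  )
}

# Count the message conditions signalled by two consecutive calls.
n_signalled <- 0
withCallingHandlers(
  {
    notify_prop_default()
    notify_prop_default()
  },
  message = function(cnd) {
    n_signalled <<- n_signalled + 1
    cnd_muffle(cnd)  # keep the message from being printed
  }
)
n_signalled
```

With `.frequency = "once"`, rlang documents that the message is displayed once per session, so the second call should stay silent.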

.frequency = "once", .frequency_id="spd")
}

if (!is.null(vartype)) {
vartype <- if (missing(vartype)) "se" else match.arg(vartype, several.ok = TRUE)
}
18 changes: 10 additions & 8 deletions man/survey_mean.Rd
