Updates to stats vignette #406
Added Lu2021 and updated two other references. Signed-off-by: wolbersm <[email protected]>
Updated section on Standard errors of the treatment effect Signed-off-by: wolbersm <[email protected]>
Yes, sorry for the blunder and thanks for spotting this!! Co-authored-by: Alessandro Noci <[email protected]> Signed-off-by: wolbersm <[email protected]>
Just made a few comments, looks good! :)
Co-authored-by: Alessandro Noci <[email protected]> Signed-off-by: wolbersm <[email protected]>
Yes, good suggestion. Please make sure that this is also adapted in the actual new vignette. Co-authored-by: Alessandro Noci <[email protected]> Signed-off-by: wolbersm <[email protected]>
Hi both, just to say please commit the updated html to the repo, as the vignettes are not rebuilt at package installation time but are taken "as-is". EDIT - Also please update the news file to give a brief summary of what was changed with this update :)
@gowerc I have added the updated html. In addition to this, I have included a new vignette (called CondMean_Inference). I have included it by doing the following:
Could you please check that this is done correctly? In addition, it would be great if you could review the vignette and its code. Thanks!
PS: this closes #403
As described in section 3.10.2 of the statistical specifications of the package (`vignette(topic = "stat_specs", package = "rbmi")`), two different types of variance estimators have been proposed for reference-based imputation methods in the statistical literature (@Bartlett2021). The first is the frequentist variance which describes the actual repeated sampling variability of the estimator and results in inference which is correct in the frequentist sense, i.e. hypothesis tests have accurate type I error control and confidence intervals have correct coverage probabilities under repeated sampling if the reference-based assumption is correctly specified (@Bartlett2021, @Wolbers2021). Reference-based missing data assumptions are strong and borrow information from the control arm for imputation in the active arm. As a consequence, the size of frequentist standard errors for treatment effects may decrease with increasing amounts of missing data. The second is the so-called "information-anchored" variance which was originally proposed in the context of sensitivity analyses (@CroEtAl2019). This variance estimator is based on disentangling point estimation and variance estimation altogether. The resulting information-anchored variance is typically very similar to the variance under missing-at-random (MAR) imputation and increases with increasing amounts of missing data at approximately the same rate as MAR imputation. However, the information-anchored variance does not reflect the actual variability of the reference-based estimator and the resulting frequentist inference is highly conservative resulting in a substantial power loss.

Reference-based conditional mean imputation combined with a resampling method such as the jackknife or the bootstrap was first introduced in @Wolbers2021. This approach naturally targets the frequentist variance. The information-anchored variance is typically estimated using Rubin's rules for Bayesian multiple imputation which are not applicable within the conditional mean imputation framework. However, an alternative information-anchored variance proposed by @Lu2021 can easily be obtained as we show below. The basic idea of @Lu2021 is to obtain the information-anchored variance via a MAR imputation combined with a delta-adjustment where delta is selected in a data-driven way to match the reference-based estimator. For conditional mean imputation, the proposal by @Lu2021 can be implemented by choosing the delta-adjustment as the difference between the conditional mean imputation under the chosen reference-based assumption and MAR on the original dataset. The variance can then be obtained via the jackknife or the bootstrap while keeping the delta-adjustment fixed. The resulting variance estimate is very similar to Rubin's variance. Moreover as shown in @CroEtAl2019, the variance of MAR-imputation combined with a delta-adjustment achieves even better information-anchoring properties than Rubin's variance for reference-based imputation.
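The delta-adjusted jackknife described in the paragraph above can be illustrated with a small toy sketch. It is not rbmi code and does not use the package's API: all function names are hypothetical, the data are made up, and plain arm means stand in for the model-based conditional means (single visit, no covariates, fully observed control arm). The delta is computed once on the original data and then held fixed while the MAR imputation is redone in every leave-one-out sample.

```python
from statistics import mean

def arm_means(data):
    """Observed means per arm; these play the role of the imputation model."""
    obs = {arm: [y for a, y in data if a == arm and y is not None]
           for arm in ("control", "active")}
    return mean(obs["control"]), mean(obs["active"])

def effect(data, assumption, delta=0.0):
    """Difference in means after conditional mean imputation in the active arm.

    assumption="reference": missing values are filled with the control mean.
    assumption="mar":       missing values are filled with the active mean + delta.
    Toy assumption: the control arm has no missing values.
    """
    mean_c, mean_a = arm_means(data)
    fill = mean_c if assumption == "reference" else mean_a + delta
    active = [fill if y is None else y for a, y in data if a == "active"]
    control = [y for a, y in data if a == "control"]
    return mean(active) - mean(control)

def jackknife_se(data, estimator):
    """Leave-one-subject-out jackknife standard error of estimator(data)."""
    n = len(data)
    reps = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = mean(reps)
    return ((n - 1) / n * sum((r - center) ** 2 for r in reps)) ** 0.5

# Made-up data: None marks a missing outcome in the active arm.
control = [("control", y) for y in [0.1, -0.4, 0.3, 0.9, -0.7, 0.2, -0.1, 0.5]]
active = [("active", y) for y in [1.8, 2.4, 1.5, 2.9, 2.2, None, None, None]]
data = control + active

# Delta = reference-based imputation minus MAR imputation on the ORIGINAL data.
mean_c, mean_a = arm_means(data)
delta = mean_c - mean_a

est_ref = effect(data, "reference")
est_mar_delta = effect(data, "mar", delta)

# Frequentist: the reference-based imputation is redone in each jackknife sample.
se_freq = jackknife_se(data, lambda d: effect(d, "reference"))
# Information-anchored: MAR imputation is redone, delta stays fixed.
se_ia = jackknife_se(data, lambda d: effect(d, "mar", delta))

print(est_ref, est_mar_delta, se_freq, se_ia)
```

On the original data both imputations give the same completed dataset, so the point estimates agree. The two standard errors differ because, in each leave-one-out sample, the reference-based fill tracks the re-estimated control mean whereas the MAR fill tracks the re-estimated active mean shifted by the fixed delta.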
For the sake of my own understanding, can I ask why this is?
My guess is that if you have lots of data in the control arm then all missing data essentially gets filled in by the mean (well, the mean conditioned on covariates), so the more missing data you have, the more observations are imputed at the mean and the less variability there is in the data you are analysing. Is that roughly right?
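A rough way to see this is a deliberately simplified setting, which is an illustrative assumption and not the model used in the package: a single visit, no covariates, a fully observed control arm of size $n_c$, an active arm of size $n_a$ in which a fixed fraction $\pi$ is observed, a common outcome variance $\sigma^2$, and missing active-arm outcomes imputed by the control-arm mean $\bar y_c$. Writing $\bar y_a$ for the mean of the observed active-arm outcomes, the reference-based estimator and its variance are

$$
\hat\theta_{\text{ref}} = \pi \bar y_a + (1-\pi)\,\bar y_c - \bar y_c = \pi\,(\bar y_a - \bar y_c),
\qquad
\operatorname{Var}(\hat\theta_{\text{ref}}) = \pi^2\left(\frac{\sigma^2}{\pi n_a} + \frac{\sigma^2}{n_c}\right)
= \frac{\pi\,\sigma^2}{n_a} + \frac{\pi^2\sigma^2}{n_c},
$$

which shrinks as the observed fraction $\pi$ decreases. Under MAR-type mean imputation in the same toy setting, the estimator is $\bar y_a - \bar y_c$ with variance $\sigma^2/(\pi n_a) + \sigma^2/n_c$, which grows as $\pi$ decreases.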
However, an alternative information-anchored variance proposed by @Lu2021 can easily be obtained as shown below.
The basic idea of @Lu2021 is to obtain the information-anchored variance via a MAR imputation combined with a delta-adjustment where delta is selected in a data-driven way to match the reference-based estimator.
Apologies, I'm not sure I understand this. Based on this description, my understanding is that your per-patient estimates are the MAR estimate + (Reference estimate - MAR estimate).
Doesn't this just give you the Reference estimates as your imputed values? So why would the variances be different? (Apologies, I appreciate this is likely a stupid question.)
Moreover, as shown in @CroEtAl2019, the variance of MAR-imputation combined with a delta-adjustment achieves even better information-anchoring properties than Rubin's variance for reference-based imputation.
Co-authored-by: Craig Gower-Page <[email protected]> Signed-off-by: wolbersm <[email protected]>
Co-authored-by: Craig Gower-Page <[email protected]> Signed-off-by: Alessandro Noci <[email protected]>
Co-authored-by: Craig Gower-Page <[email protected]> Signed-off-by: Alessandro Noci <[email protected]>
@gowerc I have tried to re-write the function using only base R (i.e. avoid dplyr functions) and I have updated the vignette title. I have included most of your comments but there are two remaining suggestions from you that I cannot commit because they are "outdated" (however, I agree with them). Could you please have a final review and make the final changes? Thank you!
Hi @nociale
I have updated the stats vignette (revised section on standard errors and added links to upcoming new vignette) and modified the references bib accordingly. Could you please have a look and if you are happy approve the changes.
Thanks,
Marcel