-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
166 lines (123 loc) · 7.91 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
library(knitr)
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
to <- asis_output("\U2192")
leadsto <- asis_output("\U2933")
library(paths)
library(gbm)
```
# paths: An Imputation Approach to Estimating Path-specific Causal Effects
This package implements an imputation approach to estimating path-specific effects (PSEs) in causal mediation analysis ([Zhou and Yamamoto 2020](https://osf.io/2rx6p)). Estimation uncertainty is assessed using the nonparametric bootstrap.
## Installation
You can install the released version of paths from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("paths")
```
And the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("mdtrinh/paths")
# or
devtools::install_github("xiangzhou09/paths")
```
## Estimating Path-specific Effects (PSEs)
Given a binary treatment *A*, an outcome of interest *Y*, a set of pretreatment confounders *X*, and *K* causally ordered (possibly vector-valued) mediators *M~1~*, *M~2~*,...*M~K~*, the average total effect of the treatment on the outcome can be decomposed into *K+1* PSEs that represent the causal paths *A* `r to` *M~1~* `r leadsto` *Y*, *A* `r to` *M~2~* `r leadsto` *Y*,... *A* `r to` *M~K~* `r to` *Y*, and *A* `r to` *Y*, where the straight arrow `r to` denotes a direct path and the squiggly arrow `r leadsto` represents a combination of all direct and indirect paths. These effects are nonparametrically identified under standard ignorability assumptions ([Avin, Shpitser and Pearl 2005](https://escholarship.org/content/qt45x689gq/qt45x689gq.pdf)).
The `paths` function estimates these PSEs using an imputation approach, which requires the user to provide *K+1* outcome models that characterize E[*Y* | *X*, *A*], E[*Y* | *X*, *A*, *M~1~*], ...E[*Y* | *X*, *A*, *M~K~*], respectively. These outcome models can be objects returned by `lm`, `glm`, `gbm::gbm`, `BART::pbart`, or `BART::wbart`. Below is an example illustrating the use of `paths` to estimate the direct and indirect effects of ancestor victimization on descendants' political attitudes. The original data come from [Lupu and Peisakhin (2017)](https://doi.org/10.1111/ajps.12327).
```{r include = FALSE}
set.seed(02138)
```
```{r cache = TRUE}
library(paths)
library(gbm)
# K=3 causally ordered mediators
m1 <- c("trust_g1", "victim_g1", "fear_g1")
m2 <- c("trust_g2", "victim_g2", "fear_g2")
m3 <- c("trust_g3", "victim_g3", "fear_g3")
mediators <- list(m1, m2, m3)
# outcome model formulas
formula_m0 <- annex ~ kulak + prosoviet_pre + religiosity_pre + land_pre +
orchard_pre + animals_pre + carriage_pre + otherprop_pre + violence
formula_m1 <- update(formula_m0, ~ . + trust_g1 + victim_g1 + fear_g1)
formula_m2 <- update(formula_m1, ~ . + trust_g2 + victim_g2 + fear_g2)
formula_m3 <- update(formula_m2, ~ . + trust_g3 + victim_g3 + fear_g3)
# outcome models
gbm_m0 <- gbm(formula_m0, data = tatar, distribution = "bernoulli",
interaction.depth = 3)
gbm_m1 <- gbm(formula_m1, data = tatar, distribution = "bernoulli",
interaction.depth = 3)
gbm_m2 <- gbm(formula_m2, data = tatar, distribution = "bernoulli",
interaction.depth = 3)
gbm_m3 <- gbm(formula_m3, data = tatar, distribution = "bernoulli",
interaction.depth = 3)
gbm_ymodels <- list(gbm_m0, gbm_m1, gbm_m2, gbm_m3)
# causal paths analysis using gbm
tatar_paths <- paths(a = "violence", y = "annex", m = mediators,
gbm_ymodels, data = tatar, nboot = 250)
# summarize results
summary(tatar_paths)
```
## An Imputation-based Weighting Estimator
The same PSEs can be alternatively estimated using an imputation-based weighting estimator detailed in [Zhou and Yamamoto (2020)](https://osf.io/2rx6p). To implement this estimator, the user needs to additionally supply a propensity score model for the treatment, which can be objects returned by `glm`, `gbm::gbm`. `twang::ps`, or `BART::pbart`.
```{r cache = TRUE}
# propensity score model via gbm
formula_ps <- violence ~ kulak + prosoviet_pre + religiosity_pre + land_pre +
orchard_pre + animals_pre + carriage_pre + otherprop_pre
gbm_ps <- gbm(formula_ps, data = tatar, distribution = "bernoulli",
interaction.depth = 3)
# causal paths analysis using both the pure imputation estimator and the imputation-based weighting estimator
tatar_paths2 <- paths(a = "violence", y = "annex", m = mediators,
ps_model = gbm_ps, gbm_ymodels, data = tatar, nboot = 250)
```
## Plotting PSEs
The `plot.paths` method can be used to visualize the total and path-specific causal effects estimated by `paths`.
```{r, out.width="80%", out.height="60%", fig.align = "center"}
plot(tatar_paths2, mediator_names = c("G1 identity", "G2 identity", "G3 identity"),
estimator = "both")
```
## Sensitivity Analysis
The `sens()` function implements a set of bias formulas detailed in [Zhou and Yamamoto (2020)](https://osf.io/2rx6p) for assessing the sensitivity of estimated PSEs to an unobserved confounder U of a mediator-outcome relationship. The user provides a fitted `paths` object, the mediator whose relationship with the outcome is potentially confounded, the estimand whose sensitivity to unobserved confounding is being investigated, type of estimator, type of decomposition, and possible values of the sensitivity parameters γ and η.
The object returned by the `sens()` function is a list containing (a) the original estimate of the corresponding PSE, and (b) a data frame where each row represents a potential combination of γ and η, the corresponding bias, bias-adjusted estimate, and an indicator for whether the bias-adjusted estimate is of the opposite sign to the original estimate.
The `plot.sens()` function visualizes the sensitivity analysis
by showing contours of bias-adjusted estimates under different values of γ and η.
```{r, out.width="80%", out.height="60%", fig.align = "center"}
# sensitivity analysis for the path-specific effect via M1
sens_paths <- sens(tatar_paths, confounded = "M1", estimand = "via M1", gamma_values = - seq(0, 0.5, 0.002), eta_values = seq(-0.5, 0.5, 0.002))
plot(sens_paths)
```
## Experimental Data
In experimental data where treatment is randomly assigned, the pure imputation estimator and the imputation-based weighting estimator are equivalent, so that a propensity score model for the treatment is not needed. In addition, because the treatment-outcome relationship is not confounded, the pretreatment covariates *X* can be omitted from the baseline model. Below is an example illustrating the use of `paths` to estimate the direct and indirect effects of issue framing on citizens' support for welfare reform ([Slothuus 2008](https://doi.org/10.1111/j.1467-9221.2007.00610.x)). In this example, the treatment (issue framing) has been randomly assigned via a survey experiment so the baseline outcome model can be a simple linear regression of *Y* on *A*.
```{r cache = TRUE}
# variable names
x <- c("gender1", "educ1", "polint1", "ideo1", "know1", "value1")
a <- "ttt"
m1 <- c("W1", "W2")
m2 <- c("M1","M2","M3","M4","M5")
y <- "Y"
m <- list(m1, m2)
# formulas
form_m0 <- as.formula(paste0(y, "~", a))
form_m1 <- as.formula(paste0(y, "~", paste0(c(x, a, m1), collapse = "+")))
form_m2 <- as.formula(paste0(y, "~", paste0(c(x, a, m1, m2), collapse = "+")))
# baseline model for overall treatment effect
lm_m0 <- lm(form_m0, data = welfare)
# GBM outcome models
gbm_m1 <- gbm(form_m1, data = welfare, distribution = "gaussian",
interaction.depth = 3)
gbm_m2 <- gbm(form_m2, data = welfare, distribution = "gaussian",
interaction.depth = 3)
gbm_ymodels <- list(lm_m0, gbm_m1, gbm_m2)
# causal paths analysis
welfare_paths <- paths(a, y, m, models = gbm_ymodels,
data = welfare, nboot = 250)
```