diff --git a/content/topics/Analyze/causal-inference/did/goodmanbacon.md b/content/topics/Analyze/causal-inference/did/goodmanbacon.md
new file mode 100644
index 000000000..3c4a532d2
--- /dev/null
+++ b/content/topics/Analyze/causal-inference/did/goodmanbacon.md
@@ -0,0 +1,151 @@
+---
+title: "Detecting TWFE Bias in Staggered DiD: the Goodman-Bacon Decomposition"
+description: "The limitations of the TWFE estimator in staggered DiD settings and the use of the Goodman-Bacon decomposition as a diagnostic"
+keywords: "difference-in-difference, causal inference, did, DID, staggered treatment, regression, model, DiD, R, event study, Goodman-Bacon decomposition, Bacon decomposition, TWFE, two-way fixed effects"
+draft: false
+weight: 7
+author: "Victor Arutyunov"
+aliases:
+- /goodman-bacon
+- /bacon-decomp
+---
+
+## Overview
+
+[Staggered difference-in-differences (DiD)](/staggered-did) and [event study designs](/event-studies) allow for causal inference in settings where units are treated at different times – for example, government laws or policies that were introduced at different times in different countries or states. In such cases, the conventional two-way fixed effects (TWFE) method designed for [classic DiD studies](/canonical-DiD) with just 2 groups and 2 periods may yield incorrect and biased results.
+
+The Goodman-Bacon decomposition is a descriptive tool which allows us to see how the staggered DiD estimator is actually constructed and to assess how likely it is to give biased results in our context. In this topic, we will discuss why TWFE may fail in staggered treatment settings, what the Goodman-Bacon decomposition is and why it’s useful, and explore an applied example in R.
+
+{{% tip %}}
+
+This topic deals with the _reasons_ behind the shortcomings of TWFE in staggered DiD designs and how the Goodman-Bacon decomposition allows us to _detect_ them. For ways to _address_ the limitations of TWFE, see [this topic](/staggered-did), which introduces Callaway and Sant’Anna’s (2021) robust method.
+
+{{% /tip %}}
+
+## Staggered treatment and shortcomings of TWFE
+
+The [TWFE model](/within) is widely used for causal inference with [panel data](/paneldata) – that is, data which includes observations across both different cross-sectional groups and different time periods. It is estimated with two types of fixed effects:
+
+1. **Group fixed effects**: these capture group-specific unobservable characteristics that are constant across time.
+
+2. **Time fixed effects**: these capture time trends, or differences between time periods, that are common to all groups.
+
+This model works well when we only have two groups and two time periods (one before and one after treatment). When we have more than two groups or more than two time periods, our treatment estimate becomes a weighted average of all 2x2 TWFE comparisons – all instances where one group changes its treatment status while another’s treatment status remains unaltered.
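+
+As a sketch in notation that does not appear in the original text (the symbols $y_{it}$, $D_{it}$, $\alpha_i$, $\lambda_t$, $\beta$ and $w_k$ are assumptions introduced here for illustration), the TWFE regression for unit $i$ in period $t$ can be written as:
+
+$$
+y_{it} = \alpha_i + \lambda_t + \beta D_{it} + \varepsilon_{it}
+$$
+
+where $\alpha_i$ are the group fixed effects, $\lambda_t$ are the time fixed effects, and $D_{it}$ equals 1 if unit $i$ is treated in period $t$. With two groups (treated $T$ and control $C$) and two periods, the estimate of $\beta$ is the familiar difference of differences in mean outcomes:
+
+$$
+\hat{\beta}^{2 \times 2} = \left(\bar{y}_{T,\text{post}} - \bar{y}_{T,\text{pre}}\right) - \left(\bar{y}_{C,\text{post}} - \bar{y}_{C,\text{pre}}\right)
+$$
+
+With staggered timing, the single TWFE coefficient can instead be read as a weighted average over all such 2x2 comparisons $k$:
+
+$$
+\hat{\beta}^{TWFE} = \sum_{k} w_k \, \hat{\beta}^{2 \times 2}_k, \qquad \sum_{k} w_k = 1
+$$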
+
+{{% example %}}
+
+Suppose that we are studying the effects of unilateral (no-fault) divorce laws on socioeconomic outcomes for women, which different states implement in different years. Our units of observation are counties across three states – A, B, and C. State A introduces a no-fault divorce law in year 1, state B introduces one in year 2, and state C does not pass any unilateral divorce legislation. This is a typical case of staggered treatment timing: state A is treated earlier, state B is treated later, and state C is never treated.
+
+Our TWFE estimate in this case would be a weighted average of all possible 2x2 comparisons between the states. That is, state A vs state B for years 0 and 1, state A vs state B for years 1 and 2, state B vs state C for years 1 and 2, etc. The weights attached to each comparison would depend on the number of observations in the compared groups (in our case, the number of counties in the compared states) and how much the treatment indicator varies within each comparison.
+
+| State/Year | 0 | 1 | 2 |
+| -------- | ------- | ------- | ------- |
+| A | Not Treated | Treated | Treated |
+| B | Not Treated | Not Treated | Treated |
+| C | Not Treated | Not Treated | Not Treated |
+
+{{% /example %}}
+
+The weighted average of all 2x2 comparisons is problematic for two reasons:
+
+1. **Forbidden comparisons/contrasts**: in our example, one of the 2x2 comparisons that form part of the final TWFE estimate is as follows:
+
+| State/Year | 1 | 2 |
+| -------- | ------- | ------- |
+| A | Treated | Treated |
+| B | Not Treated | Treated |
+
+The group that changes its treatment status here (i.e., the treatment group) is state B, as it goes from being untreated in year 1 to being treated in year 2. State A therefore serves as the control group, even though it is actually treated in both years!
+
+This would not be a problem if the treatment effect were completely constant over time, as state A would not see any changes in the outcome from year 1 to year 2 _due to the treatment_. However, this is very rarely the case: treatment effects are often dynamic – they can vary over time due to adaptation or learning effects, for example.
+
+The implication of these forbidden comparisons is that our supposedly causal TWFE estimates are partly composed of non-causal, biased coefficients, which actually compare treatment to treatment, as opposed to treatment to control.
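+
+To see where the bias comes from, here is a short derivation under assumed notation that is not in the original text: $y_{gt}(0)$ is the outcome state $g$ would have in year $t$ without treatment, and $\tau_g(t)$ is the treatment effect for state $g$ in year $t$. In the comparison above, the observed outcomes are $y_{A1} = y_{A1}(0) + \tau_A(1)$, $y_{A2} = y_{A2}(0) + \tau_A(2)$, $y_{B1} = y_{B1}(0)$ and $y_{B2} = y_{B2}(0) + \tau_B(2)$. If untreated outcomes follow parallel trends, so that $y_{B2}(0) - y_{B1}(0) = y_{A2}(0) - y_{A1}(0)$, the 2x2 estimate becomes:
+
+$$
+\left(y_{B2} - y_{B1}\right) - \left(y_{A2} - y_{A1}\right) = \tau_B(2) - \left[\tau_A(2) - \tau_A(1)\right]
+$$
+
+The term in square brackets is the change in state A's treatment effect between years 1 and 2. It vanishes only when that effect is constant over time. As a purely hypothetical illustration, if state A's effect grows from 1 to 3 while the true effect in state B is 2, this comparison returns $2 - (3 - 1) = 0$.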
+
+2. **Negative weights**: each comparison is assigned a weight which is proportional to group size (the number of observations within the group) and variation in treatment exposure. The larger the group size and the larger the variation in treatment exposure, the higher the attached weight. All weights sum to one, but they can also be negative: comparisons at the tails of the time period distribution (in very early or very late periods) can sometimes carry negative weights, as treatment variation is typically smaller there (most groups are not yet treated in early periods and are already treated in late periods).
+
+{{
+
+{{% warning %}}
+In this case, we actually have 4 types of comparisons, because the data also includes states which had introduced unilateral divorce laws (that is, were treated) before the years that are included in the dataset. Therefore, some units are 'always treated'.
+{{% /warning %}}
+
+We see that a very large weight (around 2/3 in total) is attached to the forbidden comparisons ('Later vs Always Treated' and 'Later vs Earlier Treated'), that is, to cases where our control group has been treated in the past. As previously discussed, this can be a source of bias if the treatment effect is not homogeneous over time (which is very likely to be the case). Therefore, these results of the Goodman-Bacon decomposition tell us that it is advisable to re-run our study using one of the recently developed robust methods for staggered DiD, such as those proposed by [Callaway and Sant’Anna (2021)](/staggered-did) and [Sun and Abraham (2021)](https://www.sciencedirect.com/science/article/pii/S030440762030378X).
+
+## Summary
+
+{{% summary %}}
+- Conventional TWFE models may fail when estimating treatment effects in a setting with staggered treatment timing. This can happen because of forbidden comparisons (where our control group is actually being treated) or negative weights attached to certain comparisons.
+
+- The Goodman-Bacon decomposition is a useful diagnostic for the performance of TWFE estimators in DiD and event study settings with staggered treatment timing.
+
+- It decomposes the overall TWFE treatment estimate into three distinct components – treated vs never treated comparisons, early treated vs late control comparisons, and late treated vs early control comparisons – and shows us what weight each component has.
+
+- If we detect an important presence of forbidden comparisons or bias arising from negative weights, we can use recently developed robust staggered DiD estimators, such as those proposed by [Callaway and Sant’Anna (2021)](/staggered-did) and [Sun and Abraham (2021)](https://www.sciencedirect.com/science/article/pii/S030440762030378X).
+{{% /summary %}}
+
+## See Also
+
+[Difference-in-differences with variation in treatment timing - Goodman-Bacon (2021)](https://www.sciencedirect.com/science/article/pii/S0304407621001445)
+
+[Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects - de Chaisemartin & D'Haultfœuille (2020)](https://www.aeaweb.org/articles?id=10.1257/aer.20181169)
diff --git a/content/topics/Analyze/causal-inference/did/images/bacondecomp.png b/content/topics/Analyze/causal-inference/did/images/bacondecomp.png
new file mode 100644
index 000000000..27d3f8d64
Binary files /dev/null and b/content/topics/Analyze/causal-inference/did/images/bacondecomp.png differ