Notes on Synthetic Control

Motivation

1. Data feature – only one treatment

  • Treatment is assigned at an aggregate level, to a group of individuals

    • Example: policy interventions often take place at an aggregate level, and affect aggregate entities, such as schools, or geographic or administrative areas
  • Only one or a few groups of treated units, and many more control units

    • Challenge: with only one treated unit, it is hard to learn about the treatment assignment mechanism
  • Long time series both before and after

  • Shape of data matrix (rows are groups, columns are time): short and wide

2. When the parallel trends assumption fails to hold

  • DiD provides a simple estimator of the ATT provided that non-anticipation and parallel trends hold.
  • However, the parallel trends assumption often fails in practice. Synthetic control (SC) extends DiD-type methods to settings without parallel trends. Specifically, SC methods seek to mitigate the bias from failures of parallel trends by carefully reweighting the control units. Intuitively, we use SC to “enforce” parallel trends.
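The reweighting idea can be illustrated with a small noiseless sketch. All numbers are hypothetical: two control units follow different linear trends, and the treated unit's untreated path is constructed as a 0.9/0.1 mix of them, so parallel trends fail against the simple control average. The weights are known by construction here; in practice they have to be estimated.

```python
import numpy as np

t = np.arange(1, 7)          # periods 1..6
T0 = 4                       # treatment starts after period 4 (assumed)
post = t > T0
true_effect = 2.0            # hypothetical treatment effect

# Two controls with different linear trends (slopes 1 and 3).
controls = np.vstack([1.0 * t, 3.0 * t])

# Untreated path of the treated unit: a 0.9/0.1 mix of the controls (slope 1.2).
w_true = np.array([0.9, 0.1])
y_treated = w_true @ controls + true_effect * post


def did(y_treat, y_ctrl):
    """Post-minus-pre change of the treated unit minus that of the comparison series."""
    change_treat = y_treat[post].mean() - y_treat[~post].mean()
    change_ctrl = y_ctrl[post].mean() - y_ctrl[~post].mean()
    return change_treat - change_ctrl


# Equal-weighted controls: parallel trends fail, so DiD is biased.
did_equal = did(y_treated, controls.mean(axis=0))

# Reweighted controls: the comparison series shares the treated unit's trend.
did_reweighted = did(y_treated, w_true @ controls)

print(did_equal, did_reweighted)
```

With equal weights the comparison series trends upward faster than the treated unit, which pushes the DiD estimate below the true effect; with the reweighted controls the trends coincide and the true effect is recovered.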

Formal Setup

  • Data: $J+1$ units observed over periods $t = 1, \dots, T$

  • Treated unit: the first unit ($j = 1$) is treated only after period $T_0$, where $1 \le T_0 < T$

    • Pre-treatment period: $1 \le t \le T_0$

    • Post-treatment period: $T_0 + 1 \le t \le T$

  • Untreated units: units $j = 2, \dots, J+1$ form a collection of untreated units, also called the “donor pool”

  • In the post-treatment period, $t \ge T_0 + 1$,

    • define $Y_{1t}(1)$ to be the potential outcome under the treatment

    • define $Y_{1t}(0)$ to be the potential outcome without the treatment

  • Parameter of interest: $\tau_{1t} = Y_{1t}(1) - Y_{1t}(0) = Y_{1t} - Y_{1t}(0)$ for $t \ge T_0 + 1$

    Remark 1.
    Since $Y_{1t}(1)$ is observed, we have $Y_{1t}(1) = Y_{1t}$ in the post-treatment period. The challenging part is to estimate the counterfactual $Y_{1t}(0)$.

    Remark 2.
    $\tau_{1t}$ depends on time $t$, which allows the effect of the treatment to change over time. This is crucial because treatment effects may not be instantaneous and may accumulate or dissipate as time after the intervention passes.

Theory behind SC

Assumption 1 (Linear factor model for counterfactuals).
$$Y_{it}(0) = \mu_i^\top \lambda_t + \delta_t + X_i^\top \beta + \epsilon_{it}, \tag{1}$$

where

  • $\mu_i$ is a vector of unobserved confounders

  • $\lambda_t$ is the vector of corresponding time-varying coefficients

  • $X_i$ is a vector of observed covariates

Equation (1) generalizes the usual fixed-effects model for DiD, in which $\mu_i^\top \lambda_t$ is replaced by the unit fixed effect $\alpha_i$. Equation (1) is known as the interactive fixed effects model, which is essentially a latent factor model.

Notice that the assumptions on the data-generating process involve $Y_{it}(0)$, but not $Y_{it}(1)$. Since $Y_{1t}(1) = Y_{1t}$ is observed, estimation of $\tau_{1t}$ for $t > T_0$ requires no assumptions on the process that generates $Y_{it}(1)$.
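To make the factor model concrete, here is a minimal simulation sketch of equation (1). The dimensions, the normal distributions, the noise scale, and the constant effect of 1.5 are all assumptions chosen for illustration; they are not part of the model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

J, T, T0 = 20, 30, 20   # 20 donors plus 1 treated unit, 30 periods, 20 pre-treatment
F, K = 2, 3             # number of latent factors and observed covariates (assumed)

mu = rng.normal(size=(J + 1, F))               # unobserved confounders mu_i
lam = rng.normal(size=(T, F))                  # time-varying coefficients lambda_t
delta = rng.normal(size=T)                     # common time effects delta_t
X = rng.normal(size=(J + 1, K))                # observed covariates X_i
beta = rng.normal(size=K)
eps = rng.normal(scale=0.1, size=(J + 1, T))   # idiosyncratic shocks eps_it

# Y0[i, t] = mu_i' lambda_t + delta_t + X_i' beta + eps_it, as in equation (1)
Y0 = mu @ lam.T + delta[None, :] + (X @ beta)[:, None] + eps

# Row 0 plays the role of unit j = 1: it receives a hypothetical effect after T0.
tau = np.zeros(T)
tau[T0:] = 1.5
Y = Y0.copy()
Y[0] += tau

print(Y.shape)
```

Only `Y` would be observed in practice. `Y0[0, T0:]` is the counterfactual path that synthetic control tries to reconstruct from the donor rows `Y[1:]`.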

The key idea of synthetic control is to estimate the unobserved $Y_{1t}(0)$ by a convex combination of the observed outcomes of the control units. Intuitively, the goal is to use past outcomes to build a weighted average of control units that “looks like” the treated unit.

Let $W = (w_2, \dots, w_{J+1})$ with $w_j \ge 0$ and $\sum_{j=2}^{J+1} w_j = 1$. Each choice of $W$ represents a potential synthetic control.

Assumption 2 (Key assumption).
There exist weights $W$ such that the pre-treatment covariates and outcomes of the treated unit are balanced:

$$\sum_{j=2}^{J+1} w_j X_j = X_1, \quad \sum_{j=2}^{J+1} w_j Y_{j1} = Y_{11}, \quad \dots, \quad \sum_{j=2}^{J+1} w_j Y_{jT_0} = Y_{1T_0}$$

  • Assuming the factor model (1) and fairly standard conditions, one can show that $Y_{1t}(0) - \sum_{j=2}^{J+1} w_j Y_{jt} \approx 0$ if the number of pre-treatment periods is large relative to the residual variance

  • An approximately unbiased estimator of $\tau_{1t}$ is

$$\hat{\tau}_{1t} = Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt}, \quad t = T_0 + 1, \dots, T$$
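A sketch of why balance makes the estimator approximately unbiased, assuming the factor model (1), treating the products in (1) as inner products, and using the fact that control units are never treated, so their observed outcomes equal their untreated potential outcomes:

```latex
\begin{aligned}
Y_{1t}(0) - \sum_{j=2}^{J+1} w_j Y_{jt}
  &= \Big(\mu_1 - \sum_{j=2}^{J+1} w_j \mu_j\Big)^{\top} \lambda_t
   + \Big(X_1 - \sum_{j=2}^{J+1} w_j X_j\Big)^{\top} \beta
   + \Big(\epsilon_{1t} - \sum_{j=2}^{J+1} w_j \epsilon_{jt}\Big)
\end{aligned}
```

The common term $\delta_t$ cancels because the weights sum to one. The covariate term vanishes under the balance condition of Assumption 2. The first term involves the unobserved $\mu_i$, which cannot be matched directly. Matching the pre-treatment outcomes over many periods is what makes $\mu_1 - \sum_{j} w_j \mu_j$ small when the shocks are small relative to the number of pre-treatment periods. The last term is noise, assumed here to have mean zero.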

How to find W?

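  • We can generalize the matching problem to a vector of pre-treatment characteristics

  • Pre-treatment covariates: $Z_i = (Y_i, X_i)$

    • lagged outcomes: $Y_i = (Y_{i1}, Y_{i2}, \dots, Y_{i,T_0})$

    • lagged covariates: $X_i = (X_{i1}, X_{i2}, \dots, X_{i,T_0})$

  • Or some subsets or functions of these variables

  • Balance both the lagged outcomes and the pre-treatment covariates:

$$\hat{w} = \arg\min_{w} \Big(Z_1 - \sum_{i=2}^{J+1} w_i Z_i\Big)^{\top} \hat{\Sigma}^{-1} \Big(Z_1 - \sum_{i=2}^{J+1} w_i Z_i\Big)$$

    subject to $\sum_{i=2}^{J+1} w_i = 1$ and $w_i \ge 0$ for all $i = 2, \dots, J+1$, where $\hat{\Sigma}$ is the estimated covariance matrix of $Z_i$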
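The constrained minimization above can be sketched with a general-purpose solver. This is one possible implementation, not a reference one: `fit_sc_weights` is a hypothetical helper, the matrix `V` stands in for $\hat{\Sigma}^{-1}$ and defaults to the identity as a simplifying assumption, and the toy data are noiseless and use lagged outcomes only.

```python
import numpy as np
from scipy.optimize import minimize


def fit_sc_weights(Z1, Z0, V=None):
    """Z1: (p,) treated predictors; Z0: (J, p) donor predictors; V: (p, p) symmetric."""
    Z1 = np.asarray(Z1, dtype=float)
    Z0 = np.asarray(Z0, dtype=float)
    J, p = Z0.shape
    # Assumption: identity weighting unless an estimate of Sigma^{-1} is supplied.
    V = np.eye(p) if V is None else np.asarray(V, dtype=float)

    def loss(w):
        gap = Z1 - w @ Z0
        return gap @ V @ gap

    def grad(w):
        gap = Z1 - w @ Z0
        return -2.0 * Z0 @ (V @ gap)   # gradient of the quadratic form, symmetric V

    res = minimize(
        loss,
        x0=np.full(J, 1.0 / J),        # start from equal weights
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,       # w_i >= 0
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # sum to one
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return res.x


# Toy data: 3 donors, 5 pre-treatment and 2 post-treatment periods (hypothetical).
pre_donors = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                       [5.0, 3.0, 4.0, 1.0, 2.0],
                       [2.0, 2.0, 6.0, 3.0, 1.0]])
post_donors = np.array([[6.0, 7.0], [3.0, 1.0], [2.0, 4.0]])
w_true = np.array([0.7, 0.3, 0.0])
pre_treated = w_true @ pre_donors
post_treated = w_true @ post_donors + 2.0   # hypothetical effect of 2 in each period

w_hat = fit_sc_weights(pre_treated, pre_donors)
tau_hat = post_treated - w_hat @ post_donors   # estimated effect per post period
print(np.round(w_hat, 3), np.round(tau_hat, 3))
```

Because the toy treated unit is an exact convex combination of the donors, the fitted weights should land close to `w_true`. With real data the fit is only approximate, and the choice of `V` affects which predictors are matched most closely.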

Limitations and recommendations for SC

  • Exclude units from the donor pool that may be affected by the treatment (including through indirect effects)

  • Exclude units that received large shocks unrelated to the treatment

  • To avoid interpolation bias, restrict the donor pool to units that are similar to the treated unit

  • Avoid overfitting, which can arise when the donor pool contains too many units

  • SC requires a sufficient number of pre-treatment periods

  • Credibility depends on the ability to match pre-treatment covariates and outcomes

  • SC is not recommended if the pre-treatment fit is poor or if there are only a few pre-treatment periods
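One common way to quantify pre-treatment fit is the root mean squared prediction error (RMSPE) between the treated unit and its synthetic control. The sketch below uses hypothetical helper names and made-up numbers, and it does not propose a threshold for what counts as a poor fit.

```python
import numpy as np


def rmspe(y_treated, y_synth):
    """Root mean squared prediction error between two outcome paths."""
    diff = np.asarray(y_treated, dtype=float) - np.asarray(y_synth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def fit_diagnostics(Y1, Y0, w, T0):
    """Y1: (T,) treated outcomes; Y0: (J, T) donor outcomes; w: (J,) weights;
    T0: number of pre-treatment periods."""
    synth = np.asarray(w) @ np.asarray(Y0)
    return {
        "pre_rmspe": rmspe(Y1[:T0], synth[:T0]),    # quality of the pre-treatment fit
        "post_rmspe": rmspe(Y1[T0:], synth[T0:]),   # size of the post-treatment gap
    }


# Hypothetical data: 2 donors, 4 pre-treatment and 2 post-treatment periods.
Y0 = np.array([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
               [3.0, 3.0, 3.0, 3.0, 3.0, 3.0]])
Y1 = np.array([2.1, 2.4, 3.1, 3.4, 6.0, 6.5])
w = np.array([0.5, 0.5])

diag = fit_diagnostics(Y1, Y0, w, T0=4)
print(diag)
```

A pre-treatment RMSPE that is large relative to the scale of the outcome suggests that the synthetic control does not track the treated unit well, in which case the post-treatment gap is hard to interpret as a treatment effect.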

Reference

Abadie, Alberto (2021), “Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects,” Journal of Economic Literature, 59 (2), 391–425.

Chen Xing