Balancing Weights for Causal Inference
TL;DR
Cohn et al. (2023) introduce the balancing approach to weighting for causal inference in observational studies. Unlike traditional methods that model the propensity score directly, balancing weights are estimated by solving an optimization problem that directly targets covariate balance between treatment groups. The authors demonstrate that this approach offers protection against model misspecification, connects naturally to bias-variance trade-offs, and can be augmented with outcome modeling for improved performance. Applied to the classic LaLonde job training data, balancing methods achieve better covariate balance than standard propensity score approaches while maintaining reasonable effective sample sizes.
What is this paper about?
Covariate balance is fundamental to causal inference: randomized experiments achieve it by design, while observational studies must achieve it through adjustment. This chapter addresses a key challenge in observational causal inference—how to construct weights that remove confounding by balancing observed covariates between treated and control groups. The traditional modeling approach estimates propensity scores (the probability of treatment given covariates) and inverts them to create weights, but this relies heavily on correct model specification. When the propensity score model is wrong, the resulting weights may fail to balance covariates in the sample, leading to biased treatment effect estimates. The chapter explores an alternative: directly finding weights that achieve balance in the observed data, rather than first modeling the propensity score.
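To make the modeling approach concrete, here is a minimal NumPy sketch (not from the chapter; the data are synthetic and the propensity score is taken as known rather than estimated) of inverting propensity scores to form weights and then checking whether the weighted covariate means line up across groups:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                   # a single observed covariate
e = 1 / (1 + np.exp(-0.5 * x))           # true propensity score P(T=1 | x)
t = rng.binomial(1, e)                   # treatment assignment

# Modeling approach: invert the propensity score to form ATE weights,
# 1/e for treated units and 1/(1-e) for controls.
w = np.where(t == 1, 1 / e, 1 / (1 - e))

# If the weights balance the covariate, the weighted group means of x
# should roughly agree (the unweighted means will not).
m1 = np.average(x[t == 1], weights=w[t == 1])
m0 = np.average(x[t == 0], weights=w[t == 0])
```

With an estimated (and possibly misspecified) propensity model in place of the known `e`, this balance check can fail even in-sample, which is the motivation for targeting balance directly.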
What do the authors do?
The authors formalize the balancing approach as an optimization problem that jointly minimizes covariate imbalance and weight dispersion (variance). They show how different choices of the “model class” M—the set of functions of covariates to balance—correspond to different assumptions about the outcome model and lead to different optimization formulations. Using the LaLonde dataset (a constructed observational study where the true treatment effect is known), they compare three designs: balancing main covariate terms only, balancing up to three-way interactions, and balancing an infinite-dimensional reproducing kernel Hilbert space (RKHS). For each design, they evaluate covariate balance using standardized mean differences, examine the effective sample size (a measure of weight dispersion), and explore the bias-variance trade-off by varying regularization parameters. The authors also demonstrate how balancing weights can be augmented with outcome regression to further reduce bias, and they establish asymptotic normality results for inference.
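The core optimization can be sketched in a few lines. The following Python snippet (synthetic data; the variable names and the exact-balance formulation are illustrative, not taken from the chapter) finds control-unit weights that exactly balance the covariate means while minimizing weight dispersion, then computes the two diagnostics discussed above: standardized mean differences and the effective sample size. The chapter's formulations also cover approximate balance via regularization, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n0, p = 200, 800, 3
xt = rng.normal(0.3, 1.0, size=(n1, p))   # treated covariates (mean-shifted)
xc = rng.normal(0.0, 1.0, size=(n0, p))   # control covariates

# Constraints: weighted control means equal treated means; weights sum to 1.
A = np.vstack([xc.T, np.ones(n0)])        # (p + 1) x n0 constraint matrix
b = np.append(xt.mean(axis=0), 1.0)       # balance targets

# Least-norm solution of A @ w = b: exact balance with minimum weight
# dispersion.  Negative weights are allowed in this simple sketch; the
# chapter discusses when to forbid them (i.e., when to rule out extrapolation).
w = np.linalg.lstsq(A, b, rcond=None)[0]

# Design diagnostics: standardized mean differences after weighting,
# and the Kish effective sample size (sum w)^2 / sum w^2.
smd = (xt.mean(axis=0) - xc.T @ w) / xt.std(axis=0)
ess = w.sum() ** 2 / (w ** 2).sum()
```

Balancing richer model classes M (interactions, kernel features) amounts to adding rows to the constraint matrix `A`, which tightens balance but typically shrinks the effective sample size—exactly the bias-variance trade-off the authors examine.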
Why is this important?
This work matters because most observational studies include covariates in their analysis, yet practitioners often don’t carefully consider whether their weighting method actually achieves balance on the relevant covariate functions. The balancing approach makes covariate balance a first-order design criterion rather than a post-hoc diagnostic check. It reveals the implicit bias-variance trade-offs in weighting methods and shows that different assumptions about the outcome model (linear, interactive, nonparametric) lead to fundamentally different weighting strategies. The framework unifies many existing methods (entropy balancing, stable weights, kernel balancing) under one optimization structure and clarifies the connection between the balancing and modeling approaches through Lagrangian duality. Importantly, the chapter provides practical guidance on design choices—what to balance, how much dispersion to tolerate, whether to allow negative weights—that applied researchers face but often lack principled ways to resolve.
Who should care?
Applied researchers in economics, epidemiology, public policy, education, and medicine who use inverse propensity weighting or other covariate adjustment methods in observational studies. Methodologists working on causal inference, especially those developing new weighting estimators or studying properties of existing ones. Graduate students learning causal inference who need to understand the trade-offs between different adjustment strategies and the implicit assumptions behind common practices. Policy evaluators who must justify their modeling choices and demonstrate that their treatment effect estimates are robust to covariate imbalance. Anyone who has struggled with poor covariate balance after propensity score weighting or wondered how to choose between competing adjustment methods would benefit from this framework.
Do we have code?
The chapter does not provide standalone replication code or software packages. However, the authors note that many of the specific balancing methods discussed are available in existing R packages: entropy balancing in the ebal package, covariate balancing propensity scores (CBPS) in the CBPS package, and stable balancing weights in the sbw package. The kernel balancing approach can be implemented using standard kernel methods in R or Python. The LaLonde dataset used throughout the examples is publicly available and widely used in the causal inference literature, making it straightforward to reproduce the analyses with these existing tools.
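As a rough illustration of that last point, the following Python sketch (synthetic data; the median bandwidth heuristic and the choice of 10 components are arbitrary illustrative choices, not the chapter's specification) builds a Gaussian kernel matrix, extracts leading eigen-features as a finite approximation to the RKHS, and reuses the least-norm balancing idea on those features:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n0 = 100, 400
xt = rng.normal(0.3, 1.0, size=(n1, 2))   # treated covariates
xc = rng.normal(0.0, 1.0, size=(n0, 2))   # control covariates
x = np.vstack([xt, xc])

# Gaussian kernel matrix over all units; the bandwidth is a design choice
# (median pairwise squared distance is a common heuristic).
d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / (2 * np.median(d2)))

# Leading eigenvectors of K serve as implicit RKHS features to balance.
vals, vecs = np.linalg.eigh(K)
phi = vecs[:, -10:] * np.sqrt(np.abs(vals[-10:]))   # top 10 components

# Weight controls so their kernel-feature means match the treated means
# (least-norm solution; weights sum to 1).
A = np.vstack([phi[n1:].T, np.ones(n0)])
b = np.append(phi[:n1].mean(axis=0), 1.0)
w = np.linalg.lstsq(A, b, rcond=None)[0]
imbalance = np.abs(phi[:n1].mean(axis=0) - phi[n1:].T @ w).max()
```

Balancing these nonparametric features corresponds to the RKHS design in the chapter's third LaLonde comparison, where the implied outcome model is left essentially unrestricted.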
In summary, this chapter reframes propensity score weighting as a balance-optimization problem rather than a pure modeling exercise. By directly targeting the balancing property of inverse propensity weights, the approach offers robustness to propensity score misspecification while making explicit the bias-variance trade-offs inherent in any covariate adjustment. The LaLonde application demonstrates that balancing weights can substantially reduce covariate imbalance compared to standard methods, though at the cost of reduced effective sample size. The framework provides both theoretical insight (connecting balancing to dual regression, establishing asymptotic properties) and practical guidance (how to choose what to balance, when to augment with outcome modeling, whether to allow extrapolation) that fills an important gap in applied causal inference.
Reference
Cohn, Eric R., Eli Ben-Michael, Avi Feller, and José R. Zubizarreta (2023), “Balancing Weights for Causal Inference,” in Handbook of Matching and Weighting Adjustments for Causal Inference, Chapman and Hall/CRC. https://www.taylorfrancis.com/chapters/edit/10.1201/9781003102670-16/balancing-weights-causal-inference-eric-cohn-eli-ben-michael-avi-feller-josé-zubizarreta