Targeted Learning with an Undersmoothed Lasso Propensity Score Model for Large-Scale Covariate Adjustment in Healthcare Database Studies

    Basic Details

    Healthcare data from routine-care delivery, such as electronic health records (EHRs) and administrative claims, can provide real-world evidence (RWE) on medical product effects. However, estimating causal effects can be challenging due to confounding and poorly measured information on comorbidities. To improve confounding control, data-driven algorithms can be used to identify and adjust for large numbers of variables that indirectly capture information on unmeasured or unspecified confounding factors. Lasso regression is a widely used tool for dimension reduction, but undersmoothing can improve confounding control in sparse high-dimensional datasets. In this study, we evaluate collaborative-controlled targeted learning for data-adaptive undersmoothing when fitting large-scale propensity score (PS) models. We find that cross-fitting was crucial for avoiding non-overlap in covariate distributions and for reducing bias in causal estimates.


    Richard Wyss, Mark van der Laan, Susan Gruber, Xu Shi, Hana Lee, Sarah K. Dutcher, Jennifer C. Nelson, Sengwee Toh, Massimiliano Russo, Shirley V. Wang, Rishi J. Desai, Kueiyu Joshua Lin

    Corresponding Author

    Dr. Richard Wyss; Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Harvard Medical School