Details
Healthcare data from routine care delivery, such as electronic health records (EHRs) and administrative claims, can provide real-world evidence (RWE) on medical product effects. However, estimating causal effects from these data is challenging because of confounding and poorly measured information on comorbidities. To improve confounding control, data-driven algorithms can be used to identify and adjust for large numbers of variables that indirectly capture information on unmeasured or unspecified confounding factors. Lasso regression is widely used for dimension reduction in this setting, but undersmoothing the lasso fit, that is, penalizing less heavily than prediction-focused tuning would suggest, can improve confounding control in sparse, high-dimensional datasets. In this study, we evaluated the effectiveness of collaborative-controlled targeted learning for data-adaptive undersmoothing when fitting large-scale propensity score (PS) models. We found that cross-fitting was crucial for avoiding non-overlap in covariate distributions and for reducing bias in causal estimates.
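To make the two main ingredients concrete, the following is a minimal sketch of a cross-fitted, undersmoothed lasso PS model on simulated data. It is not the study's implementation. The function names, the simulated data, the reference penalty `lam_reference`, and the factor of 0.2 are all assumptions chosen for illustration. In particular, the penalty is reduced by a fixed factor here, whereas the study selects the degree of undersmoothing data-adaptively through collaborative-controlled targeted learning, which this sketch does not attempt.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, a, lam, step=0.5, n_iter=200):
    """L1-penalized logistic regression by proximal gradient descent.

    X is a list of covariate rows, a is the binary treatment indicator,
    and lam is the lasso penalty. The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: fitted treatment probability minus observed treatment.
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - ai
                 for row, ai in zip(X, a)]
        g0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * g0
        beta = [soft_threshold(bj - step * gj, step * lam)
                for bj, gj in zip(beta, grad)]
    return b0, beta


def cross_fit_ps(X, a, lam, n_folds=5, seed=0):
    """Out-of-fold propensity scores: each unit is scored by a model fit without it."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    ps = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, beta = fit_lasso_logistic([X[i] for i in train],
                                      [a[i] for i in train], lam)
        for i in held_out:
            ps[i] = sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, X[i])))
    return ps


# Simulated data (purely illustrative): 200 units, 10 covariates,
# treatment depends on the first two covariates only.
rng = random.Random(1)
n, p = 200, 10
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
a = [1 if rng.random() < sigmoid(0.8 * row[0] - 0.5 * row[1]) else 0 for row in X]

lam_reference = 0.05                      # assumed stand-in for a cross-validated penalty
lam_undersmoothed = 0.2 * lam_reference   # assumed fixed factor; smaller penalty, less shrinkage
ps_cf = cross_fit_ps(X, a, lam_undersmoothed)
```

In this sketch, `ps_cf[i]` is always predicted by a model that never saw unit `i`. A weaker penalty lets more covariates enter the PS model, which can push in-sample fitted scores toward 0 or 1; scoring each unit out of fold is one way to limit that kind of overfitting.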