Electronic Health Record (EHR) Enhanced Signal Detection Using Tree-Based Scan Statistic Methods

    Basic Details
    Date
    Type
    Publication
    Description

    Tree-based scan statistics (TBSS) are data mining methods that screen thousands of hierarchically related health outcomes to detect unsuspected adverse drug effects. TBSS have traditionally been applied to claims data, with outcomes defined via diagnosis codes, and have not previously been applied to the rich clinical information in electronic health records (EHR). We developed approaches for integrating EHR data into TBSS analyses, including outcomes derived from natural language processing (NLP) applied to clinical notes and from laboratory results, related via multipath hierarchical structures. We considered four settings that sequentially add sources of outcomes to the TBSS tree: 1) diagnosis codes, 2) NLP-derived outcomes, 3) binary outcomes from lab results, and 4) continuous lab results. In a comparative cohort study of second-generation sulfonylureas (SUs) versus dipeptidyl peptidase 4 (DPP-4) inhibitors among adults with type 2 diabetes, with hypoglycemia as the a priori expected signal, diagnosis code data alone produced no statistical alerts in the inpatient or emergency department settings. Adding NLP-derived outcomes resulted in an alert for "Headaches" (p=0.047), a nonspecific symptom of hypoglycemia. Progressively adding binary and continuous lab results produced the same alert. Integrating EHR data into TBSS can be useful for detecting safety signals that warrant further investigation.
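
    As an illustration of the general idea, the sketch below implements a minimal unconditional Bernoulli tree-based scan statistic with a Monte Carlo p-value. It is not the authors' implementation: the abstract does not state which TBSS variant or software was used, and the outcome names, counts, node groupings, and the null probability p = 0.5 (as in a hypothetical 1:1 matched cohort) are all invented toy values. A leaf may belong to more than one node, which is one simple way to represent a multipath hierarchy. The sketch handles binary event counts only and does not cover continuous lab results.

```python
import math
import random


def bernoulli_llr(c, n, p):
    """Log-likelihood ratio for c exposed events out of n total events,
    where p is the null probability that an event falls in the exposed group.
    Only excesses in the exposed group (c/n > p) score above zero."""
    if n == 0 or c / n <= p:
        return 0.0
    llr = 0.0
    if c > 0:
        llr += c * math.log(c / (n * p))
    if n - c > 0:
        llr += (n - c) * math.log((n - c) / (n * (1 - p)))
    return llr


def max_llr(exposed, totals, node_leaves, p):
    """Return (node, LLR) for the node with the largest LLR.
    Counts for a node are summed over all leaves assigned to it."""
    best_node, best = None, 0.0
    for node, leaves in node_leaves.items():
        c = sum(exposed[leaf] for leaf in leaves)
        n = sum(totals[leaf] for leaf in leaves)
        llr = bernoulli_llr(c, n, p)
        if llr > best:
            best_node, best = node, llr
    return best_node, best


def tree_scan(exposed, comparator, node_leaves, p=0.5, n_sim=999, seed=0):
    """Scan every node and compute a Monte Carlo p-value for the maximum LLR."""
    rng = random.Random(seed)
    totals = {leaf: exposed[leaf] + comparator[leaf] for leaf in exposed}
    node, observed = max_llr(exposed, totals, node_leaves, p)
    at_least = 0
    for _ in range(n_sim):
        # Under the null, each event is exposed with probability p.
        sim = {leaf: sum(rng.random() < p for _ in range(n))
               for leaf, n in totals.items()}
        _, stat = max_llr(sim, totals, node_leaves, p)
        if stat >= observed:
            at_least += 1
    return node, observed, (at_least + 1) / (n_sim + 1)


# Toy event counts per leaf outcome (hypothetical, not study data).
exposed = {"hypoglycemia_dx": 12, "headache_nlp": 30,
           "dizziness_nlp": 7, "low_glucose_lab": 9}
comparator = {"hypoglycemia_dx": 10, "headache_nlp": 10,
              "dizziness_nlp": 8, "low_glucose_lab": 8}

# Each leaf is its own node; "headache_nlp" also sits under two parents.
node_leaves = {leaf: {leaf} for leaf in exposed}
node_leaves["hypoglycemia_related"] = {"hypoglycemia_dx", "headache_nlp",
                                       "low_glucose_lab"}
node_leaves["neurologic_symptoms"] = {"headache_nlp", "dizziness_nlp"}

node, llr, p_value = tree_scan(exposed, comparator, node_leaves)
print(node, round(llr, 3), p_value)
```

    Because the maximum is taken over all nodes in both the observed and the simulated data, the resulting p-value is adjusted for the multiple overlapping nodes being scanned, which is the property that makes this family of methods suitable for screening many related outcomes at once.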

    Author(s)

    Massimiliano Russo, Sushama Kattinakere Sreedhara, Joshua Smith, Sharon E. Davis, Judith C. Maro, Thomas Deramus, Joyce Lii, Jie Yang, Rishi Desai, José J. Hernández-Muñoz, Yong Ma, Youjin Wang, Jamal T. Jones, Shirley V. Wang

    Corresponding Author

    Massimiliano Russo; Ohio State University

    Email: russo.325@osu.edu