
How Sentinel Gets Its Data

How Sentinel Works: The Sentinel Common Data Model and Sentinel Distributed Database

How does the Sentinel System gather and organize its data, and where does it come from? The Sentinel System is composed of health care organizations, known as Data Partners, that hold medical billing information and electronic health records. This data is collected routinely with every healthcare encounter and is used to answer FDA’s medical product safety questions. Each Data Partner keeps its own data and controls who can access it.

Each Data Partner runs analyses through a data program and sends the deidentified results to the Sentinel Operations Center. The Sentinel System’s distributed approach maintains patient privacy and data security.
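The distributed approach described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Sentinel's actual software: the record fields, the `run_local_query` and `combine` functions, and the sample data are all hypothetical. Each partner runs the same query on its own records and returns only aggregate counts, which the coordinating center adds together without seeing any patient-level rows.

```python
from collections import Counter

# Hypothetical patient-level records held by two Data Partners.
# These rows never leave the partner's own environment.
PARTNER_A = [
    {"patient": "a1", "drug": "drug_x", "event": True},
    {"patient": "a2", "drug": "drug_x", "event": False},
    {"patient": "a3", "drug": "drug_y", "event": False},
]
PARTNER_B = [
    {"patient": "b1", "drug": "drug_x", "event": True},
    {"patient": "b2", "drug": "drug_y", "event": True},
]


def run_local_query(records, drug):
    """Run at the Data Partner: return only aggregate counts."""
    exposed = [r for r in records if r["drug"] == drug]
    return {
        "exposed": len(exposed),
        "events": sum(1 for r in exposed if r["event"]),
    }


def combine(results):
    """Run at the coordinating center: sum the partners' aggregates."""
    total = Counter()
    for result in results:
        total.update(result)
    return dict(total)


local_results = [run_local_query(p, "drug_x") for p in (PARTNER_A, PARTNER_B)]
combined = combine(local_results)
print(combined)
```

In this sketch, only the small dictionaries of counts cross the boundary between a partner and the center.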

A Standardized Data Structure: the Sentinel Common Data Model 

The distributed approach is important for creating larger datasets so FDA can study rare adverse events or drugs used in small populations. It also requires a standardized data structure. This standardized data structure is the Sentinel Common Data Model. Data Partners transform their data locally into the Sentinel Common Data Model format. During this transformation, Data Partners remove or mask directly identifiable patient information. The standardized format lets Data Partners run routine querying tools, so FDA can conduct studies more quickly than if new programs had to be written for each study.
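The local transformation step can be illustrated with a small sketch. Everything named here is an assumption for illustration: the local and common field names, the `to_common_format` function, the code normalization, and the salted hash used for masking. The source does not say how Data Partners mask identifiers, so the hash is just one simple option.

```python
import hashlib

# Hypothetical secret kept by the Data Partner and never shared.
LOCAL_SALT = b"partner-secret"


def mask_id(member_number):
    """Replace a direct identifier with a salted one-way hash."""
    digest = hashlib.sha256(LOCAL_SALT + member_number.encode("utf-8"))
    return digest.hexdigest()[:16]


def to_common_format(local_row):
    """Map one local claims row to a hypothetical common-format row.

    The member number is masked and the name is dropped entirely.
    """
    return {
        "patient_id": mask_id(local_row["member_no"]),
        # Illustrative normalization: upper case, no decimal point.
        "diagnosis_code": local_row["dx"].upper().replace(".", ""),
        "service_date": local_row["date_of_service"],
    }


local_row = {
    "member_no": "000123",
    "name": "Example Person",
    "dx": "i21.9",
    "date_of_service": "2018-05-02",
}
common_row = to_common_format(local_row)
print(common_row)
```

Because every partner would emit rows with the same keys, a single query written against those keys could run unchanged at each site.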

A Combined Collection of Datasets: the Sentinel Distributed Database

The Sentinel Distributed Database is the collection of harmonized datasets from many different Data Partners. These datasets are all in the Sentinel Common Data Model format. The figure below illustrates Sentinel’s distributed data approach.

The Sentinel Distributed Database has quality-checked data from a variety of sources:

  • National health insurance plans
  • Large integrated delivery systems
  • Health care organizations

The data is from patient interactions in the United States healthcare system. Records of those interactions are held by the patients' insurers and providers.

For example, the fictitious patient shown below is a male enrolled in a health plan. He had several medical encounters between 2017 and 2019.

The following information from the patient's medical encounters appears in Sentinel data:

  • Diagnoses
  • Procedures
  • Prescription information 

One of the most important features of the Sentinel System is that it can capture each healthcare encounter, diagnosis, and prescription for patients, no matter where they get their healthcare. This comprehensive data is what makes it possible for FDA to study the side effects of drugs and other medical products.

Sentinel can set up linkages that broaden the kinds of data available for use in the Sentinel System. Complementary data sources include registries and other databases that collect information on vaccines, deaths, cancer, and other drugs or health outcomes using systems expressly designed to capture that information.

Why the Sentinel System Has Complementary Data Sources

Collaboration with registries and prescribing databases supplements core claims and administrative data. Collaboration also creates opportunities to confirm exposures and outcomes used in Sentinel analyses.

The illustration below shows the connection between complementary data sources and Sentinel.

You can find projects exploring the potential use of complementary data sources in the table below.

The data quality review process is a joint effort between the Sentinel Operations Center (SOC) and its Data Partners. The SOC provides detailed guidelines for characterizing the data that Data Partners create using the Sentinel Common Data Model (SCDM) rules.

Sentinel Data Quality Review and Characterization Process

The following steps help ensure that Sentinel data are of the highest quality.

Preparing Data for Quality Review and Characterization: Extract, Transform, and Load (ETL)

Data Partners routinely update the Sentinel Distributed Database (SDD) with new data. This begins with a process called Extract, Transform, and Load (ETL). 

1. The Data Partner extracts data from their internal systems.

These may be insurance claims databases, billing systems, or electronic health record systems. 

2. Data Partners then transform these data into the formats specified by the SCDM.

This ensures that Sentinel System users can compare data across Data Partners, regardless of the source of that data. 

3. Finally, the Data Partner loads the data into the SDD.

There, the data undergo a thorough quality assurance review.
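The three ETL steps above can be sketched as a tiny pipeline. This is a minimal illustration, not a Data Partner's real process: the CSV export, the column names, and the `diagnosis` table layout are hypothetical, and an in-memory SQLite database stands in for the partner's local copy of the data.

```python
import csv
import io
import sqlite3

# Hypothetical export from a Data Partner's internal claims system.
CLAIMS_CSV = """member,dx_code,svc_date
m1,E11.9,2018-03-01
m2,I10,2019-07-15
"""


def extract(raw_csv):
    """Step 1: read rows from the source system's export."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Step 2: rename and reorder fields into the shared layout."""
    return [(r["member"], r["dx_code"], r["svc_date"]) for r in rows]


def load(conn, rows):
    """Step 3: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS diagnosis "
        "(patient_id TEXT, code TEXT, service_date TEXT)"
    )
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(CLAIMS_CSV)))
row_count = conn.execute("SELECT COUNT(*) FROM diagnosis").fetchone()[0]
print(row_count)
```

Keeping the three steps as separate functions means only `extract` would need to change if the source were a billing system or an electronic health record system rather than a claims export.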

Sentinel Data Quality Review and Characterization Process

The Data Partner transforms new data into the SCDM and notifies the SOC that the new data are ready for review.

The SOC distributes a quality assurance computer program. This program checks the data according to the standards specified in the SCDM. There are three levels of data checks, as described below.

Level 1 Check: Is Everything There?

The program automatically checks the data for completeness and validity, looking for problems such as missing variables, missing values, or missing tables.
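A completeness check of this kind might look like the sketch below. The table names, required columns, and the `level1_check` function are hypothetical and do not reproduce the actual quality assurance program or the SCDM specification.

```python
# Hypothetical specification: table name -> required columns.
REQUIRED = {
    "enrollment": ["patient_id", "start_date", "end_date"],
    "diagnosis": ["patient_id", "code", "service_date"],
}


def level1_check(tables):
    """Return a list of completeness problems found in `tables`.

    `tables` maps a table name to a list of row dictionaries.
    """
    problems = []
    for name, columns in REQUIRED.items():
        if name not in tables:
            problems.append(f"missing table: {name}")
            continue
        for i, row in enumerate(tables[name]):
            for col in columns:
                if col not in row:
                    problems.append(f"{name} row {i}: missing variable {col}")
                elif row[col] in (None, ""):
                    problems.append(f"{name} row {i}: missing value for {col}")
    return problems


sample = {
    "diagnosis": [
        {"patient_id": "p1", "code": "I10", "service_date": "2018-01-05"},
        {"patient_id": "p2", "code": "", "service_date": "2018-02-10"},
        {"patient_id": "p3", "code": "E119"},
    ]
}
level1_issues = level1_check(sample)
for issue in level1_issues:
    print(issue)
```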

Level 2 Check: Do the Data Make Sense?

The program also performs this review automatically, evaluating the data for consistency across tables.
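As an illustration of a cross-table check, the sketch below flags diagnoses that do not fit a patient's enrollment records. The two rules, the field names, and the `level2_check` function are assumptions chosen for the example; the source does not list the actual Level 2 rules.

```python
from datetime import date


def level2_check(enrollment, diagnoses):
    """Flag diagnoses that are inconsistent with the enrollment table."""
    # Collect each patient's enrollment spans.
    spans = {}
    for row in enrollment:
        spans.setdefault(row["patient_id"], []).append(
            (row["start_date"], row["end_date"])
        )

    problems = []
    for dx in diagnoses:
        pid, day = dx["patient_id"], dx["service_date"]
        if pid not in spans:
            problems.append(f"{pid}: diagnosis but no enrollment record")
        elif not any(start <= day <= end for start, end in spans[pid]):
            problems.append(f"{pid}: diagnosis on {day} outside enrollment")
    return problems


enrollment = [
    {
        "patient_id": "p1",
        "start_date": date(2017, 1, 1),
        "end_date": date(2019, 12, 31),
    },
]
diagnoses = [
    {"patient_id": "p1", "service_date": date(2018, 6, 1)},
    {"patient_id": "p1", "service_date": date(2020, 2, 1)},
    {"patient_id": "p2", "service_date": date(2018, 6, 1)},
]
level2_issues = level2_check(enrollment, diagnoses)
for issue in level2_issues:
    print(issue)
```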

Level 3 Check: How Does It Look Compared to the Last ETL?

SOC staff carry out this review. They ensure the stability and consistency of the data over time.
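SOC staff perform this review themselves, but a simple comparison like the one sketched below could help point reviewers to tables worth a closer look. The `level3_flags` function, the table names, the row counts, and the 20% tolerance are all hypothetical.

```python
def level3_flags(previous, current, tolerance=0.20):
    """Flag tables whose row count changed by more than `tolerance`.

    `previous` and `current` map table names to row counts from two
    successive ETLs. The 20% default is an arbitrary illustration.
    """
    flags = []
    for table in sorted(set(previous) | set(current)):
        before = previous.get(table, 0)
        after = current.get(table, 0)
        if before == 0:
            if after > 0:
                flags.append(f"{table}: new table with {after} rows")
            continue
        change = (after - before) / before
        if abs(change) > tolerance:
            flags.append(f"{table}: row count changed by {change:+.0%}")
    return flags


previous_etl = {"enrollment": 1000, "diagnosis": 5000, "dispensing": 3000}
current_etl = {"enrollment": 1040, "diagnosis": 2500, "dispensing": 3100}
flags = level3_flags(previous_etl, current_etl)
for flag in flags:
    print(flag)
```

A flag here is a prompt for a question to the Data Partner, not proof of an error: a sharp drop could reflect a real change in the source system.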

1. The Data Partner will then run the quality assurance computer program.

The program performs the Level 1 and Level 2 checks described above. The run is complete once the program has executed all Level 1 and Level 2 checks without failing. The program also produces results that the SOC uses to perform its Level 3 checks.

2. The Data Partner will explain any potential issues identified by the computer program.

The Data Partner then returns a report containing these explanations to the SOC. The report also includes the results produced by the computer program.

3. Once the SOC receives output from the Data Partner, the SOC will conduct a Level 3 review.

This review checks that the data are consistent across ETLs. The SOC then sends the Data Partner a report containing questions about the data.

4. The Data Partner investigates and answers any questions from the SOC.

Then, the SOC will approve the data for use in Sentinel analyses.
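The hand-offs in the review steps above can be summarized as a simple sequence of stages. This sketch is only a reading aid: the stage names, the `ACTOR` table, and the `advance` function are hypothetical and are not part of any Sentinel tool.

```python
from enum import Enum


class Stage(Enum):
    READY_FOR_REVIEW = 1     # Data Partner has notified the SOC
    PROGRAM_RUN = 2          # Level 1 and Level 2 checks have run
    REPORT_RETURNED = 3      # Data Partner returned results and explanations
    LEVEL3_REVIEWED = 4      # SOC reviewed the output and sent questions
    QUESTIONS_ANSWERED = 5   # Data Partner answered the SOC's questions
    APPROVED = 6             # SOC approved the data for analyses


# Who acts next to move the data update out of each stage.
ACTOR = {
    Stage.READY_FOR_REVIEW: "Data Partner",
    Stage.PROGRAM_RUN: "Data Partner",
    Stage.REPORT_RETURNED: "SOC",
    Stage.LEVEL3_REVIEWED: "Data Partner",
    Stage.QUESTIONS_ANSWERED: "SOC",
}


def advance(stage):
    """Move to the next stage; approved data cannot advance further."""
    if stage is Stage.APPROVED:
        raise ValueError("data are already approved")
    return Stage(stage.value + 1)


stage = Stage.READY_FOR_REVIEW
history = [stage]
while stage is not Stage.APPROVED:
    print(f"{stage.name}: next action by {ACTOR[stage]}")
    stage = advance(stage)
    history.append(stage)
```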