Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: Datasets

Project Title
Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: Datasets
Date Posted
Wednesday, April 25, 2018
Status
Complete
Description

Medicare Claims Synthetic Public Use Files (SynPUFs) were created to allow interested parties to become familiar with Medicare claims data while protecting beneficiary privacy. These files are intended to promote development of software and applications that use files in this format, train researchers on the use and complexities of Centers for Medicare and Medicaid Services (CMS) claims, and support safe data mining innovations. The SynPUFs were created by combining randomized information from multiple unique beneficiaries and changing variable values. This randomization and combining of beneficiary information ensures privacy of health information.

Sentinel uses a distributed data approach in which Data Partners maintain physical and operational control over electronic data in their existing environments. The distributed approach is achieved by using a standardized data structure referred to as the Sentinel Common Data Model (SCDM). Sentinel’s Cohort Identification and Descriptive Analysis (CIDA) tool is a set of SAS macros that allows users to select a cohort of interest. The CIDA tool specifically reads data that is structured in the SCDM.

The Sentinel Operations Center (SOC) has transformed the CMS SynPUFs into the SCDM format as part of an ongoing effort to make Sentinel resources available to external investigators, with the goal of creating a community of investigators who can understand, utilize, and contribute to the Sentinel enterprise.

This page contains:

  • SCDM-formatted SynPUFs datasets, provided as zipped files containing 20 subsamples and their related data element tables: death, demographic, diagnosis, dispensing, encounter, enrollment, and procedure.
  • Descriptive statistics of each SynPUFs SCDM subsample and a corresponding data dictionary.
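
As a rough illustration of working with the zipped subsamples listed above, the Python sketch below checks which of the seven SCDM tables appear in a downloaded archive. It assumes, without confirmation from this page, that each file inside an archive has the table name somewhere in its file name; the archive path in the usage comment and the helper names are hypothetical.

```python
import zipfile
from pathlib import Path

# The seven SCDM data element tables named on this page.
SCDM_TABLES = (
    "death", "demographic", "diagnosis", "dispensing",
    "encounter", "enrollment", "procedure",
)


def tables_in_archive(zip_path):
    """Map each SCDM table name to the archive members whose file name contains it.

    Assumption: member file names include the table name (not confirmed here).
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries, which end with a slash.
        members = [name for name in archive.namelist() if not name.endswith("/")]

    found = {table: [] for table in SCDM_TABLES}
    for member in members:
        stem = Path(member).stem.lower()
        for table in SCDM_TABLES:
            if table in stem:
                found[table].append(member)
    return found


def missing_tables(zip_path):
    """Return the SCDM table names with no matching member in the archive."""
    return [table for table, members in tables_in_archive(zip_path).items() if not members]


# Example usage (hypothetical file name):
# print(missing_tables("synpufs_subsample_1.zip"))
```

An empty list from `missing_tables` would suggest the archive holds all seven tables under this naming assumption; the user documentation page remains the authoritative source for the actual file layout.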

Refer to the Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: User Documentation and Example Routine Querying Package page for user documentation, technical specifications, an example routine querying package, and a SynPUFs demonstration report.

Population / Cohort
Individuals 18 years of age or older
Time Period
January 1, 2008 - December 31, 2010
Data Sources
SCDM-formatted SynPUFs