Sentinel has created the Sentinel CMS DataMart, which contains 100% Medicare Fee-For-Service (FFS) administrative claims data housed in the Centers for Medicare and Medicaid Services (CMS) Virtual Research Data Center (VRDC). The Duke University Department of Population Health Sciences (DPHS) serves as the Sentinel Data Partner, accessing the source data in the VRDC, transforming it into a Sentinel Common Data Model (SCDM) compliant database, executing queries, and returning results to the Sentinel Operations Center (SOC). Three components are made available to the public: (1) Program Specifications; (2) Code Pack; and (3) User Guide.
1. Program Specifications: Describes the required extraction, transformation, and loading (ETL) processes and mappings specific to Medicare FFS source data from 2010 through 2018. This document consists of the following sections:
- Medicare FFS Source Data: This section describes the content, structure, and update schedule of the 100% Medicare FFS data stored in the VRDC
- VRDC Environment: This section describes the relevant particulars of the VRDC computing environment
- ETL Specifications: This section describes the different types of information required before starting a new ETL and the different build types that need to be supported
- Source Data Mapping: This section describes the table-specific and field-specific mappings necessary to transform the Medicare FFS data into SCDM-compliant intermediate tables
- Final Production Tables: This section describes the process of combining intermediate tables to create the final tables that will be used in production
Except as related to implementation within the Medicare FFS data, the specification document does not otherwise discuss the rationale or content of the SCDM.
As guiding principles, the processes and programs created to accomplish this ETL should be flexible and extensible. This includes the ability to handle different kinds of ETLs (e.g., incremental build vs. full rebuild), the ability to create intermediate files that can be easily reused in a subsequent ETL, and the ability to easily add new Medicare data sources to the process.
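The difference between a full rebuild and an incremental build can be sketched in SAS as follows. This is a minimal illustration, not code from the Code Pack: the macro name `build_table`, the librefs `interm` and `prod`, and the convention of one intermediate table per source period (e.g., `interm.diagnosis_2018Q4`) are all assumptions made for the example.

```sas
/* Hypothetical sketch: the macro, libref, and table names are illustrative
   assumptions and are not taken from the Sentinel code pack.
   Assumes librefs INTERM (intermediate) and PROD (final) are already assigned. */
%macro build_table(table=, build_type=FULL, periods=);
    %local i period;

    /* A full rebuild discards the existing production table first. */
    %if %upcase(&build_type) = FULL and %sysfunc(exist(prod.&table)) %then %do;
        proc datasets library=prod nolist;
            delete &table;
        quit;
    %end;

    /* Append one reusable intermediate table per listed period.
       An incremental build lists only the newly available periods. */
    %let i = 1;
    %do %while (%length(%scan(&periods, &i, %str( ))) > 0);
        %let period = %scan(&periods, &i, %str( ));
        proc append base=prod.&table data=interm.&table._&period;
        run;
        %let i = %eval(&i + 1);
    %end;
%mend build_table;

/* Example call: add two new quarters to an existing table. */
%build_table(table=diagnosis, build_type=INCREMENTAL, periods=2018Q3 2018Q4);
```

Because the intermediate tables are kept per period in this sketch, a later full rebuild could reuse them by listing every period rather than re-reading the source data.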
2. Code Pack: Includes the following features:
- All parameters relating to each type of source file accessed (e.g., Master Beneficiary Summary File - Base (A/B/D), Inpatient Institutional Claims, Skilled Nursing Facility Claims, Outpatient Institutional Claims, Carrier Claims, and Part D Prescription Drug Events), whether those are annual or quarterly.
- All parameters relating to the use of source data already transformed into SCDM-formatted data files, and/or how to bypass those files when they are unavailable.
- Establishment of SAS data libraries (i.e., LIBNAMEs) for source, intermediate and permanent files.
- Highlights of code that is specific to the CMS VRDC environment (e.g., security settings, remote submits, and standard data libraries) and that may not be applicable to public users of the code pack.
- Sequencing of any program execution.
- A list of included programs and macros to serve as a “packing list,” so that users of the code pack can be sure that their pack is complete.
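As an illustration of two of the features above, the following SAS sketch assigns separate libraries for source, intermediate, and permanent files and then checks a pack against its packing list. All paths, librefs, file names, and the `check_pack` macro are hypothetical placeholders, not names from the Code Pack; within the VRDC, source libraries may already be assigned by the environment.

```sas
/* Hypothetical paths and names: replace with values valid in your environment. */
%let root = /path/to/project;

libname src    "&root/source" access=readonly;  /* source extracts, read-only   */
libname interm "&root/intermediate";            /* reusable intermediate tables */
libname prod   "&root/scdm";                    /* final SCDM-formatted tables  */

/* Packing-list check: report each listed program that is missing from DIR. */
%macro check_pack(dir=, files=);
    %local i file missing;
    %let missing = 0;
    %let i = 1;
    %do %while (%length(%scan(&files, &i, %str( ))) > 0);
        %let file = %scan(&files, &i, %str( ));
        %if not %sysfunc(fileexist(&dir/&file)) %then %do;
            %put WARNING: &file is missing from &dir;
            %let missing = %eval(&missing + 1);
        %end;
        %let i = %eval(&i + 1);
    %end;
    %put NOTE: &missing listed file(s) missing.;
%mend check_pack;

/* Example call with made-up program names. */
%check_pack(dir=&root/programs, files=run_etl.sas map_inpatient.sas);
```

Running a check like this before the first program in the sequence gives an early signal that a download or copy was incomplete.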
3. User Guide: Includes information on how to use the code pack, along with some guidance on managing use by researchers who will likely have different source files available. The target audience for this document is researchers who wish to create SCDM-compliant tables using Medicare FFS data. While the programs in the Code Pack documented above are specific to the processing of the 100% Medicare data within the VRDC, we anticipate that the mapping information, specifically, will be of use to all researchers.
The associated files on this site are for Sentinel CMS ETL version five, which uses SCDM version 7.0.0, approved by Sentinel in August 2019.
- The content on this page is technical and intended for use by scientists, analysts, and programmers in various areas of expertise.
- This SAS program package uses source data from the Centers for Medicare and Medicaid Services (CMS) Medicare 100% Fee-For-Service research identifiable files (RIFs) partitioned by quarter. The SAS program package was designed for execution within CMS’s Virtual Research Data Center (VRDC) environment administered by the Research Data Assistance Center (ResDAC) with the following technical resources:
- VRDC access provisioned with 32 GB of RAM
- SAS version 9.4 or later
- Sufficient disk storage resources for source datasets, SCDM datasets, WORK data library space, and results of program packages
- A SAS Grid of multiple computers enabling simultaneous processing
- Source data files obtained from CMS by other researchers may have different file names, different partitioning schemes (e.g., annual RIFs), different samples of the data (i.e., not 100% Fee-For-Service), and possibly different variables and/or variable names. Users are responsible for adjusting the SAS program package so that it is compatible with the source data they receive from CMS and with their own technical environment.
- There is no mechanism for technical support by Duke University, the Sentinel Operations Center, ResDAC, CMS, or the U.S. Food and Drug Administration (FDA) for use of this SAS program package.
- The SAS program package is distributed “as is” and with no warranties of any kind, whether express or implied, including, without limitation, any warranty of merchantability or fitness for a particular purpose.
- In no event shall any individual, the Duke University Department of Population Health Sciences, the Sentinel Operations Center located at Harvard Pilgrim Health Care Institute, or the FDA be liable for any damages whatsoever relating to the use, misuse, or inability to use this SAS program package (including, without limitation, damages for loss of profits or revenue, business interruption, loss of information, or any other loss).
- The information contained on this website is provided as part of FDA's commitment to place knowledge acquired from the Sentinel System in the public domain.