Data Harmonization | Center for Data to Health

Project Description

Data repositories across CTSA hubs must be semantically and syntactically aligned to support federated queries that impose a minimal maintenance burden on CTSA hub sites. Leveraging the native FHIR application programming interfaces (APIs), now proposed as required for US EHRs by the Centers for Medicare & Medicaid Services (CMS), would mitigate extract, transform, load (ETL) costs and maintenance issues.
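As a rough illustration of what querying a native FHIR API (rather than a separately maintained ETL copy) might look like, the sketch below builds a standard FHIR search URL of the form [base]/[type]?[parameters] and pulls resources out of a searchset Bundle. The endpoint URL, the helper function names, the LOINC code, and the sample Bundle are all assumptions made for illustration; they are not part of the project.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_type, params):
    """Build a FHIR RESTful search URL: [base]/[type]?[parameters]."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"


def resources_from_bundle(bundle):
    """Return the resources carried in a FHIR searchset Bundle."""
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    return [e["resource"] for e in bundle.get("entry", []) if "resource" in e]


# Hypothetical hub endpoint and an illustrative LOINC-coded search.
url = build_search_url(
    "https://fhir.example.org/R4",
    "Observation",
    {"code": "http://loinc.org|2339-0", "_count": 50},
)

# Hand-written stand-in for the JSON a FHIR server would return.
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Observation", "id": "obs-1"}},
        {"resource": {"resourceType": "Observation", "id": "obs-2"}},
    ],
}

observations = resources_from_bundle(sample_bundle)
observation_ids = [obs["id"] for obs in observations]
```

Because every conformant server exposes the same search syntax and Bundle shape, the same two helpers could in principle be pointed at each hub in turn, which is the sense in which a shared API reduces per-site transformation work.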

Harmonizing the data ecosystem will enhance and extend existing work on the National Center for Advancing Translational Sciences (NCATS) Data Translator system, which integrates clinical and translational data at scale for mechanistic discovery, as well as on other emergent systems such as the NIH Commons. We will apply our strengths and existing activities to make data FAIR-TLC (findable, accessible, interoperable, reusable and traceable, licensable, connected). We will help contributors and users develop and apply data standards, common data elements (CDEs), and other commonly used data models such as FHIR and Observational Health Data Sciences and Informatics (OHDSI, pronounced “odyssey”). We will extend and supplement infrastructure, training, and collaborative environments so that data can be shared openly and groups can collaborate on harmonization according to their specific needs or standards. The data ecosystem will provide CTSA-wide quality assurance reports and data quality assessments, as well as gold-standard data sets and synthetic clinical data sets. Fundamentally, we aim to develop an open-science ethos and unite CTSA community data sharing with broader global efforts.

GitHub Repository

Onboard to CD2H

Archived projects do not have active meetings.

Project Leadership

Christopher Chute, MD, DrPH

Johns Hopkins University

Co-Program Director

Project Cores

Next Generation Data Sharing

Harmonizing the data ecosystem and enabling translational EHR analytics across CTSA hubs