Tools & Cloud Infrastructure

Value and Vision

Computational technologies and tools vital to clinical and translational research are sometimes developed, deployed, and managed independently, which can render these processes tedious, costly, heterogeneous, and less secure. The Tools & Cloud Infrastructure Core aims to establish a common tool and cloud computing architecture to provide CTSA hubs with an affordable, easy-to-use, scalable deployment paradigm that can remove boundaries and help translational researchers promote and deploy their own tools as well as adopt those of others.

Research Strategy

Much has been written in the contemporary scientific literature and general media concerning the promise of leveraging advanced computational technologies and methods to enable new paradigms for clinical and translational research. Ultimately, this research can and should generate health benefits at both the patient and population levels, informed by the knowledge generated and disseminated via these efforts. We believe these types of emergent clinical and translational research paradigms can and should be predicated on the collection, analysis, and dissemination of relevant, timely, and comprehensive data and knowledge by a variety of end-users in a highly liquid and democratic manner.

The pursuit of clinical and translational research at a national level represents an exciting inflection point in the history of health and life sciences. Capitalizing on this opportunity requires democratization and widespread use of computational technologies by a broad spectrum of researchers with varying degrees of technical capability and training. To that end, we must:

  • Enable effective end-user adoption and utilization of computational platforms and tools in a variety of settings
  • Ensure technology deployment and user experience are compatible with “real world” workflows and environments
  • Overcome limitations in vendor-specific technologies that make it difficult to leverage systems for integrating and interacting with diverse and complex data types across traditional organizational boundaries
  • Ensure such platforms are elastic, scalable, and sustainable from both a technology and resource perspective

Community Core Objectives

  1. Create a common cloud computing architecture that can enable the rapid deployment and sharing of reusable software components by CTSA hubs
  2. Demonstrate the use of shared tools and platforms for the collaborative analysis of clinical data in a manner that transcends individual CTSA hub “boundaries”
  3. Disseminate a common set of tools that can be employed for both local and collaborative querying of common data warehousing platforms and underlying data models
  4. Pilot the “cloudification” of software artifacts that can be shared across CTSA hubs to address common and recurring information needs

Presentations and Other Materials

Tools & Cloud Infrastructure Core community meetings have been repurposed to meet the needs of the N3C Collaborative Analytics workstream. See the N3C website for more information.

Projects (Phase III)

Cloud-based DUA

This project is based on a pilot with the FDA and will create a cloud-based data use agreement (DUA) toolkit to support the entry of de-identified EHR data from partner institutions into the sandboxes. As a demonstration, the project will leverage a preconfigured FHIR repository maintained either on the CD2H/NCATS cloud or behind the partner institution’s firewall. The team will work with the community to write governance documents, SOPs, and policies for CTSA informatics community collaboration. A pan-sandbox governance group will include CD2H and community representatives who contribute subject matter expertise for specific domains.

Cloud-based Sandbox for Analytics (Natural Language Processing)

A continuation of Phase II collaborative work with the Informatics Enterprise Committee (iEC) working group, this project aims to deploy a suite of natural language processing (NLP) tools and to establish evaluation measures and tools as well as best practices.

Cloud-based Sandbox for Best Practices in Clinical Machine Learning (ML)

This sandbox project is designed to create a best practices platform for deploying and evaluating clinical machine learning tools and algorithms. Goals include provisioning community-vetted solutions to common clinical machine learning challenges, such as data preparation, analysis of bias sources, and evaluation/validation of algorithms.

Cloud-based Sandbox for the Evaluation of Data Quality Assessment Methods

This sandbox project is designed to develop, evaluate, and share tools and methods for data quality assessment. It will include a pilot that leverages Accrual to Clinical Trials (ACT) Network data to understand the quantity and completeness of ACT data and differences in coding practices across institutions.

Tools & Cloud Core Architecture

This ongoing core infrastructure project focuses on establishing a CTSA tool registry to facilitate the discovery, confident adoption, and implementation of tools and algorithms by CTSA investigators, including the provision of standard mechanisms for cloud-based data access and use.

Projects (Phase II)

EHR DREAM Challenge

The EHR DREAM Challenge is a series of community challenges to pilot and develop a predictive analytic ecosystem within the healthcare system.


This project created an open source clinical Enterprise Data Warehouse (EDW) Data Browser to enable querying by data dictionaries, or ontologies, and allow for access to both de-identified and identifiable patient data in a compliant manner.

Peer Review Platform

This project, titled "Competitions," is an open source tool for running NIH-style peer review of competitions, pilot projects, and research proposals on a cloud-based, consortium-wide, single sign-on platform.

Tools & Cloud Architecture

This project was designed to demonstrate the collaboration opportunities provided by deploying CD2H applications in the NCATS cloud.

Core Leads