
5. Harness Diverse Data through Information Sciences to Improve Health Outcomes


FDA receives a vast amount of information from a variety of sources, including product submissions, adverse event reports, de-identified patient data from health care providers, and results from surveys and basic scientific research. Successful integration and analysis of data from these disparate sources would provide knowledge and insight not possible from any one source alone. A few of the many currently untapped opportunities include monitoring adverse event trends and disease outbreaks; combining data from multiple clinical trials and postmarket studies as well as preclinical data; evaluating and comparing the effectiveness and safety of medical and veterinary products in particular sub-populations, including sex/gender and race/ethnicity analysis, and ultimately host genomics and/or genomic response data; large-scale active surveillance for rare events; and data and text mining for a variety of research purposes. 1

FDA is in the early stages of constructing the Information Technology (IT) infrastructure necessary for this type of complex data integration, but full realization of the enormous potential in harnessing these diverse data will require extensive improvements to the current FDA IT environment, as well as new analytic approaches and tools. For example, enhancing FDA’s capability to carry out sophisticated data mining activities requires a networking and computational infrastructure that can support an enormous number of simultaneous queries of a large set of indexed data sources.

Within each FDA Center, ongoing activities illustrate the range of possibilities and the potential public health benefits. Expanding, coordinating, and improving the existing IT infrastructure would enhance these activities.

Implementation Strategy

FDA will develop agency information sciences capability to address the following needs:

  1. Enhance information technology infrastructure development and data mining:
    1. Improve access to large, complex data sets to solve problems faster or to allow solutions of otherwise intractable problems (e.g., multi-dimensional map of Salmonella);
    2. Develop a secure IT network environment (enclave) for scientific computing and collaborative research with internal (FDA) and external colleagues;
    3. Improve ability to access high speed networking and processing to facilitate transfer and application of computational functions to large, complex datasets (e.g., cloud computing); and
    4. Identify computational approaches for rapid search and retrieval.
  2. Develop and apply simulation models for product life cycles, risk assessment, and other regulatory science uses:
    1. Identify opportunities and develop computer simulation and modeling to streamline data analysis and model biological systems and their responses to agents of concern, such as toxins, pathogens, electromagnetic energy, and biomaterials; and
    2. Promote novel clinical trial design using simulation, new statistical models, and novel animal models/animal model alternatives.
  3. Analyze large scale clinical and preclinical data sets:
    1. Continue to refine methods for analysis of post-market data, including data mining of spontaneous reports and analysis of electronic health records from accessible large healthcare databases;
    2. Continue and expand patient-centered outcomes research by compiling datasets converted to a standardized format across critical classes of drugs that are entered into the clinical trials repository and Janus 2; and
    3. Provide FDA access to data from a variety of large patient databases, including Sentinel 3, where FDA is working with multiple partners within government and the private sector. The Mini-Sentinel project is an active prototype of the full system.
  4. Incorporate knowledge from FDA regulatory files into a database integrating a broad array of data types to facilitate development of predictive toxicology models and model validation (also see section 1).
  5. Develop new data sources and innovative analytical methods and approaches:
    1. Lead the development of scientific infrastructure for national and international registries to advance the regulatory science and surveillance of medical products throughout their lifecycle (e.g., International Consortium of Orthopedic Registries - ICOR); and
    2. Advance development of innovative methodological approaches, such as evidence generation, synthesis, and evaluation throughout the device life cycle, through the Medical Device Epidemiology Network (MDEpiNet) Initiative.
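
As an illustration of the data mining of spontaneous reports mentioned in item 3.1 above, the following is a minimal sketch of one widely used disproportionality measure, the proportional reporting ratio (PRR). The plan itself does not specify which methods FDA uses; the function name, the data layout (a list of product/event pairs), and the sample reports are all hypothetical and chosen only for illustration.

```python
def proportional_reporting_ratio(reports, product, event):
    """Return the PRR for (product, event), or None if it is undefined.

    `reports` is an iterable of (product, event) pairs. This layout is an
    assumption for the sketch, not an actual FDA data format.
    """
    a = b = c = d = 0
    for p, e in reports:
        if p == product and e == event:
            a += 1  # reports of the event for the product of interest
        elif p == product:
            b += 1  # reports of other events for the product of interest
        elif e == event:
            c += 1  # reports of the event for all other products
        else:
            d += 1  # reports of other events for all other products
    # The ratio is undefined without reports for the product, or without
    # any reports of the event among the comparison products.
    if a + b == 0 or c == 0:
        return None
    return (a / (a + b)) / (c / (c + d))


# Made-up reports: 4 of 10 "drug_x" reports mention "rash",
# versus 2 of 20 reports for the comparison product.
sample_reports = (
    [("drug_x", "rash")] * 4
    + [("drug_x", "nausea")] * 6
    + [("drug_y", "rash")] * 2
    + [("drug_y", "nausea")] * 18
)

print(round(proportional_reporting_ratio(sample_reports, "drug_x", "rash"), 2))  # prints 4.0
```

A PRR well above 1 suggests that an event is reported disproportionately often for a product relative to other products. It is a screening signal only and does not by itself establish that the product caused the event.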

PACES Initiative
FDA is currently undertaking projects with enormous potential to unlock the data from product applications reviewed by FDA. By integrating and analyzing these data, FDA will be able to provide industry with new information that can be applied to future product development and potentially save billions of dollars in development costs. One of these programs is the FDA-funded academic Partnership in Applied Comparative Effectiveness Science (PACES) project. PACES facilitates pilot projects that conduct advanced analyses to detect clinical trends and determine which interventions will be most effective for which patients under which specific conditions. This should reduce the trial and error involved in finding the right treatment for a particular patient.

Public Health Impact

Expansion and improvement of the existing FDA IT infrastructure and application of IT resources to support sophisticated analyses of data will have a number of positive impacts. Access to these data would make it possible to better predict product failures and to better design future drugs, future and existing devices, and additional studies. This would increase the efficiency and effectiveness of new products and studies, potentially resulting in better products getting to patients faster. Development and testing of novel methodologies for the synthesis and systematic evaluation of all available evidence will allow comprehensive, up-to-date determination of the risk-benefit balance at any point in the product life cycle, so that FDA can make optimally informed decisions and provide more useful information to practitioners, patients, and industry.

1. Additional information on the possibilities of data integration can be found in the PCAST report to the President on Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans, at

2. Janus is an enterprise initiative to improve FDA’s management of structured scientific data about regulated products in support of regulatory decision-making. For more information, please go to

3. Sentinel is a national electronic system that will transform FDA’s ability to track the safety of drugs, biologics, medical devices, and ultimately all FDA-regulated products, once they reach the market. For more information, please go to


Page Last Updated: 05/17/2016