Next Generation Text Analytics for FDA – Relevant Text Mining

CERSI Collaborators: Russ Altman, MD, PhD, Lichy Han, PhD, Steven Bagley, MD (Stanford)

FDA Collaborators: Elaine Johanson, Roselie Bright, ScD, MS, PMP, Larry Callahan, PhD, JD Frank Switzer, PhD

Project Start Date: September 2015

Regulatory Science Challenge

FDA holds a vast amount of documentation on its processes and decisions, which contains valuable information that can help inform future FDA decisions. In addition, the large body of published scientific literature relevant to FDA's mission contains critical information about how drugs, devices, and food affect the human body. However, it is difficult for FDA scientists to draw on this knowledge to support their decision making because the volume of text is far too large for any single person to read. Fortunately, the computer science community has created new tools that let computers “read” text, summarize its contents, and extract key messages. This technology offers an opportunity to analyze both internal FDA text and the external biological and medical literature, and to provide the extracted information to FDA scientists when they need it.

The human gut microbiome is the community of microbes living in our stomach and intestines. Interactions between FDA-regulated products, the human gut microbiome, and human health are very complex and poorly understood. Thus, there is widespread interest across FDA in how the healthy gut is affected by drugs and vaccines, supplements, foods, tobacco use, antibiotics used in medical treatment or in food animals, and medical devices such as endoscopes.

Project Description & Goals

This project focuses on prototyping tools to extract knowledge from publicly available documents to help FDA scientists do their work more quickly and efficiently. Stanford team members are using the DeepDive system and Snorkel, Stanford’s new tool for fast prototyping. Initially, the team is extracting information about how bacteria in the human gut microbiome interact with drugs, either changing how the drugs are absorbed in the body or causing other effects. This is an important emerging area of science. The resulting synthesis of knowledge will help FDA officials who are evaluating new and marketed drugs by giving them a better understanding of how bacteria and drugs interact with each other and what research gaps remain.
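
To make the extraction idea more concrete, the following is a minimal sketch of rule-based weak labeling, the general style of approach that tools like Snorkel support. It is not the project's code and does not use the Snorkel or DeepDive APIs: the vocabularies, cue patterns, function names, and the simple majority vote are all assumptions made for illustration. Snorkel itself combines rule outputs with a learned model rather than a plain vote.

```python
import re
from collections import Counter

# Label values for the toy task: does a sentence describe a
# bacterium-drug interaction? (Illustrative convention, assumed here.)
INTERACTION, NO_INTERACTION, ABSTAIN = 1, 0, -1

# Hypothetical toy vocabularies; a real system would use curated lexicons.
BACTERIA = {"eggerthella lenta", "escherichia coli", "bacteroides"}
DRUGS = {"digoxin", "metformin", "levodopa"}
INTERACTION_CUES = re.compile(
    r"\b(metaboliz\w*|inactivat\w*|reduc\w*|degrad\w*)\b", re.I
)
NEGATION_CUES = re.compile(r"\b(no effect|did not|does not|unaffected)\b", re.I)


def mentions(sentence, vocabulary):
    """Return True if any vocabulary term appears in the sentence."""
    text = sentence.lower()
    return any(term in text for term in vocabulary)


def lf_cue_verb(sentence):
    """Vote INTERACTION when a bacterium, a drug, and a cue verb co-occur."""
    if (
        mentions(sentence, BACTERIA)
        and mentions(sentence, DRUGS)
        and INTERACTION_CUES.search(sentence)
    ):
        return INTERACTION
    return ABSTAIN


def lf_negation(sentence):
    """Vote NO_INTERACTION when the sentence contains a negation cue."""
    return NO_INTERACTION if NEGATION_CUES.search(sentence) else ABSTAIN


def lf_missing_entity(sentence):
    """Vote NO_INTERACTION when a bacterium or a drug is missing."""
    if not (mentions(sentence, BACTERIA) and mentions(sentence, DRUGS)):
        return NO_INTERACTION
    return ABSTAIN


LABELING_FUNCTIONS = [lf_cue_verb, lf_negation, lf_missing_entity]


def weak_label(sentence):
    """Combine the rule votes by majority; abstain on ties or no votes."""
    votes = [lf(sentence) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    top = Counter(votes).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return ABSTAIN
    return top[0][0]


if __name__ == "__main__":
    # Invented toy sentences, used only to exercise the rules.
    for s in [
        "Eggerthella lenta inactivates digoxin in the gut.",
        "Metformin did not change Bacteroides abundance.",
        "Escherichia coli did not metabolize levodopa.",
    ]:
        print(weak_label(s), s)
```

In this sketch, each rule may abstain, so sentences that no rule covers stay unlabeled and conflicting votes are not forced into a label. Those unlabeled or conflicting sentences are the ones a reviewer would inspect when refining the rules.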
