Division Director: Weida Tong, Ph.D.
Artificial Intelligence for Regulatory Science Research
About the Division
The Division of Bioinformatics and Biostatistics develops integrated bioinformatics and biostatistics capabilities to address the growing needs of FDA in areas such as artificial intelligence (AI), drug safety, drug repositioning, biomarker development, precision medicine, genomics, rare diseases, endocrine disruptors, and risk assessment. Additionally, the division supports NCTR through 1) IT infrastructure, 2) bioinformatics support, which includes analyzing data, managing commercial and in-house software tools, and conducting training sessions, and 3) research and scientific outreach.
Branches Within the Division
Research efforts focus on predictive toxicology, precision medicine, biomarker development, drug safety, and drug repositioning. Most research projects are in collaboration with scientists within NCTR, across FDA product centers, and in the larger scientific community. One key endeavor of the branch is to construct knowledge bases in the specific areas of FDA’s responsibility to provide a data-driven decision-making environment for enhanced safety evaluation and precision medicine.
Conducts peer-reviewed research on statistical methods for analyzing toxicological and molecular data, as well as on data-mining techniques for pattern identification and signal detection. The branch also provides statistical support related to FDA’s mission to protect and promote public health.
Provides critical support and enhancement to infrastructure in the areas of software and database development for research support and research management, high performance computing, systems integration, and information system-asset management and procurement.
Strengthens the division, focuses on “knowledge uptake” of the Division’s research products for regulatory application, and enables “data liberation” of regulatory data from the FDA Product Centers. This facilitates regulatory-science research in the Division and increases NCTR's linkages with FDA Product Centers.
NCTR Bioinformatics Tools
ArrayTrack™ HCA-PCA Standalone Package — Hierarchical Cluster Analysis (HCA) and Principal Component Analysis (PCA), powerful data-exploration tools extracted from ArrayTrack™.
The de novo Assembly Quality Evaluation Tool (dnAQET) — Framework designed to evaluate the contigs of a de novo assembly against a trusted reference genome.
Decision Forest — Novel pattern-recognition method for analysis of data from microarray experiments, proteomics research, and predictive toxicology.
Drug-Induced Liver Injury Rank (DILIrank) Dataset — A large reference list of drugs ranked by their risk of causing DILI in humans. This is an updated list from the LTKB Benchmark dataset.
Estrogenic Activity Database (EADB) — Comprehensive set of estrogenic activity data from a variety of data sources and a component of the enhanced Endocrine Disruptors Knowledge Base (EDKB).
Endocrine Disruptor Knowledge Base (EDKB) — Scientific resources for estrogen and androgen activity of potential endocrine disruptor chemicals.
FDALabel — Tool to conduct full-text search of drug labeling.
Liver Toxicity Knowledge Base (LTKB) — Collection of diverse drug-induced liver injury data associated with individual drugs and the use of systems biology analysis.
MicroArray/Sequencing Quality Control (MAQC/SEQC) Project — Project to develop microarray quality control metrics and thresholds.
Mold2 — Software that generates molecular descriptors from two-dimensional structures.
NCTR Liver Cancer Database (NCTRlcdb) — Database of 999 chemicals with assigned liver-toxicity classifications to facilitate the construction of better carcinogenicity models by FDA and other organizations.
2020 Select Accomplishments
- Breakthrough Therapy Designation (BTD) system
- Text mining study of Office of New Drugs (OND) regulatory documents (Meeting Minutes)
- Conducted studies under the ongoing Sequencing Quality Control Phase 2 (SEQC2) project, an NCTR-led consortium effort to assess the technical performance and application of emerging technologies for safety evaluation and clinical application. Papers about studies in the following areas are scheduled for submission in 2020:
- Cancer genomics using whole-genome sequencing
- Cancer genomics using target-gene sequencing
- Reproducibility of whole-genome sequencing
- DILIst, using data from the NCTR-developed Liver Toxicity Knowledge Base, classified about 1,300 drugs known to cause human drug-induced liver injury (DILI)
- Led a CAMDA (Critical Assessment of Massive Data Analysis, a platform that evaluates big data analytics through a crowdsourcing challenge mechanism) Challenge on using artificial intelligence/machine learning to predict DILI with genomics data
Select Projects with Other Centers in 2020
Support DASH (Data Analysis Search Host) Tool
Develop IND (Investigational New Drug) Smart Template to standardize the IND data submission and management
Risk Evaluation and Mitigation Strategy (REMS)
Text Mining Study of Office of New Drugs Regulatory Documents (Meeting Minutes)
Develop Safety Policy Research Team (SPRT) System
Prototyping Automated Laboratory Information System (ALIS)
Artificial Intelligence for Food Safety (two projects)
Tobacco Constituents Knowledge Base (TCKB)
Topic Modeling of Tobacco Documents
Literature Analysis of Five Major Health Endpoints Associated with Smoking
Resources for You
- NCTR Grand Rounds: "Artificial Intelligence for Regulatory Science Research" (Presentation recorded in Adobe Connect on May 14, 2020)
- Annual Reports
- Bioinformatics Tools
- Meet the Principal Investigators
- NCTR Bioinformatics Support
- National Center for Toxicological Research
Food and Drug Administration
3900 NCTR Rd
Jefferson, AR 72079
- (870) 543-7538