Bioinformatics and Biostatistics

Meet the Division Director: Weida Tong, Ph.D.

Dr. Tong is a gifted computational chemist with broad expertise in computational methods for molecular modeling and bioinformatics as applied to systems biology, predictive toxicology, and knowledge management. He is internationally recognized for his leadership in computer modeling and bioinformatics, and serves as a Science Advisory Board (SAB) member for both the Netherlands Toxicogenomics Center and the EU Framework Project on CarcinoGenomics. He received his B.S. in Chemistry (1983) and his Ph.D. in Polymer Chemistry (1990) from Fudan University in China.

Dr. Tong’s efforts and leadership have made a significant impact within FDA and worldwide. He has supervised the FDA-led, community-wide MicroArray Quality Control Consortium, which analyzes the technical performance and practical utility of emerging molecular technologies, and he coordinated the development of the Liver Toxicity Knowledge Base to address public health concerns related to drug-induced liver injury. He played a major leadership role in the conception, design, and development of numerous computational tools in bioinformatics, chemoinformatics, computational toxicology, biostatistics, and systems biology. His work has public health impacts in predictive systems toxicology and risk assessment, and he has authored more than 200 publications in peer-reviewed journals.


The Division of Bioinformatics and Biostatistics develops integrated bioinformatics and biostatistics capability to address increasing needs in biomarker development, drug safety, drug repositioning, personalized medicine, and risk assessment. 

The Division consists of three branches:

  • Bioinformatics: conducts research on statistical methods for analyzing toxicological and molecular data, and develops and applies data-mining methods for pattern identification and signal detection in high-dimensional data.
  • Biostatistics: conducts peer-reviewed research and provides statistical support related to FDA’s mission to protect and promote public health.
  • Scientific Computing: supports and enhances infrastructure for software and database development (for both research support and research management), high-performance computing, systems integration, and information system asset management and procurement.

    Read more about the Division of Bioinformatics and Biostatistics in the Consumer Update titled "The Lab for These FDA Scientists Is a Computer Screen."

NCTR Bioinformatic Tools

  • ArrayTrack
    DNA microarray data management, mining, analysis, and interpretation software
  • atBioNet
    An Integrated PPI (protein-protein interaction) Network Analysis Tool for Systems Biology and Biomarker Discovery
  • Decision Forest
    A novel pattern recognition method for analysis of data from microarray experiments, proteomics research, and predictive toxicology
  • Estrogenic Activity Database (EADB)
    Comprehensive set of estrogenic activity data from a variety of data sources; a component of the enhanced Endocrine Disruptor Knowledge Base (EDKB)
  • Endocrine Disruptor Knowledge Base (EDKB)
    Scientific resources to predict estrogen and androgen activity
  • FDALabel
    Full-Text Search of Product Labeling
  • Gene Ontology for Functional Analysis (GOFFA)
    ArrayTrack™ tool to identify terms in Gene Ontology associated with a list of genes
  • Liver Toxicity Knowledge Base (LTKB)
    Collection of diverse drug-induced liver injury data associated with individual drugs and development of predictive models to assess risk of drug-induced liver injury
  • MicroArray Quality Control (MAQC) Project
    Development of microarray quality control metrics and thresholds
  • Mold2
    Software that generates molecular descriptors from two-dimensional structures
  • NCTR Liver Cancer Database (NCTRlcdb)
    Database of 999 chemicals with assigned liver-toxicity classifications to facilitate the construction of cleaner and better carcinogenicity models by FDA and other organizations
  • SNPTrack
    An integrated solution for the management, analysis, and interpretation of genetic association study data

Recent Division Accomplishments

  • In 2014, the FDA-led consortium completed the third phase of the MicroArray Quality Control (MAQC) project, also known as the Sequencing Quality Control (SEQC) project. The project assessed next-generation sequencing technologies, looking specifically at the approaches and standards needed to enable use of this developing science. The consortium published eight manuscripts, all of which are now available in a Nature Collections special issue. SEQC was a community-wide consortium of 180 participants from 73 organizations and 12 countries, coordinated by the FDA.
  • In 2014, an enhanced version of the Liver Toxicity Knowledge Base (LTKB) was developed and made publicly available. The new version contains a large amount of drug data along with predictive models that can be run online. Most of the accomplishments in this project have been published in peer-reviewed journals with high impact factors. This work continues.
  • Data-mining techniques, including clustering, biclustering, and classification algorithms, continue to be developed for subgroup identification and prediction. These techniques have been applied to the development of predictive models and predictive enrichment classifiers in personalized medicine; to serotype identification and characterization in outbreak investigations; and to the identification of associations between drug subgroups and adverse-event subgroups in the FDA Adverse Event Reporting System database.
  • EMEN2 (Electron Microscopy Electronic Notebook) provides a platform that enables efficient storage, management, and sharing of electron microscopy images and metadata. Staff at NCTR and across FDA will be able to access these images, which will now include nanoparticle images in support of the NCTR/ORA Nanotechnology Core Facility.

2015 Research Projects

The following list is a sample of the research projects being conducted at NCTR in the Division of Bioinformatics and Biostatistics.

  • Biomarker Study to Improve Adjuvant Treatment for ER-positive and HER2-negative Breast Cancer Patients
  • CTP Scientific Enclave, Tobacco Constituents Knowledge Base, and Topic Modeling for Tobacco Industry Documents
  • Data-Mining Strategy to Identify Hepatotoxic Drugs and Sensitive Patients
  • Development and Refinement of the FDA Genomic Tool, ArrayTrack™, for Advancing Pharmacogenomics and Personalized Medicine in the Context of the FDA’s Critical Path Initiative
  • Development of Liver Toxicity Knowledge Base (LTKB) to Empower the FDA Review Process
  • Drug Repositioning with Bioinformatics
  • Further Development and Refinement of the FDA Endocrine Disruptor Knowledge Base for Assessing Endocrine Disrupting Potential of Drugs and Food Additives
  • Integrated Analysis of Single Nucleotide Polymorphism and Copy Number Variation in Genome Association of Breast Cancer
  • Predicting Patient-Specific Treatment Outcomes: Identification and Validation of Molecular Biomarkers Using In Silico Tools
  • QT Interval Correction Via Mixed-Effects Modeling
  • Scientific Enclave, Knowledge Base and Topic Data Mining for Tobacco Products
  • SEQC (MAQC-III): The Sequencing Quality Control Project
  • Study of Translational Biomarkers for Drug-Induced Liver Injury with Next-Generation Sequencing  

Contact Information

For more information, please contact Weida Tong, Ph.D. at 870-543-7142 or weida.tong@fda.hhs.gov.

Page Last Updated: 05/19/2015