Bioinformatics and Biostatistics

Meet the Division Director: Weida Tong, Ph.D.

Dr. Tong has dedicated his career to developing and applying bioinformatics, chemoinformatics, and computational methods in systems biology, predictive toxicology, and clinical applications. He serves as a Science Advisory Board (SAB) member for several large initiatives involving multiple institutes in Europe and the U.S., and he holds several adjunct positions at U.S. universities. His division at FDA’s National Center for Toxicological Research spans IT, software development, biostatistics, and bioinformatics research, with a focus on methodologies and standards that advance a range of areas in regulatory science and precision medicine. Particularly visible research areas currently include: (1) the MicroArray/Sequencing Quality Control (MAQC/SEQC) consortium, which develops standard analysis protocols and quality control metrics for emerging technologies in regulatory science and precision medicine applications; (2) further development of the Liver Toxicity Knowledge Base (LTKB) for drug safety evaluation; (3) in silico drug repositioning for the enhanced treatment of rare or orphan diseases; and (4) development of the FDA bioinformatics system, the ArrayTrack™ suite, software used for FDA review and research in pharmacogenomics. The division also specializes in molecular modeling and quantitative structure-activity relationships (QSARs), with current activities involving estrogen, androgen, and endocrine-active compound screening. His research (>200 publications) appears in prominent peer-reviewed journals.


The Division of Bioinformatics and Biostatistics develops integrated bioinformatics and biostatistics capabilities to address growing needs in biomarker development, drug safety, drug repositioning, personalized medicine, and risk assessment.

The Division comprises three branches:

  • Bioinformatics: conducts research focused on predictive toxicology, precision medicine, biomarker development, drug safety, and drug repositioning. Most research projects are collaborations with scientists within NCTR, across FDA Product Centers, and in the larger scientific community. A key endeavor is constructing knowledge bases in specific areas of FDA’s responsibility to provide a data-driven decision-making environment for enhanced safety evaluation and precision medicine.
  • Biostatistics: conducts peer-reviewed research on statistical methods for analyzing toxicological and molecular data, as well as data-mining techniques for pattern identification and signal detection. The branch also provides statistical support related to FDA’s mission to protect and promote public health.
  • Scientific Computing: provides critical support for and enhancement of infrastructure in software and database development for research support and research management, high-performance computing, systems integration, and information system asset management and procurement.

NCTR Bioinformatic Tools

  • ArrayTrack
    DNA microarray data management, mining, analysis, and interpretation software
  • atBioNet
    An Integrated PPI (protein-protein interaction) Network Analysis Tool for Systems Biology and Biomarker Discovery
  • Decision Forest
    A novel pattern recognition method for analysis of data from microarray experiments, proteomics research, and predictive toxicology
  • Estrogenic Activity Database (EADB)
    Comprehensive set of estrogenic activity data from a variety of data sources and a component of the enhanced Endocrine Disruptor Knowledge Base (EDKB)
  • Endocrine Disruptor Knowledge Base (EDKB)
    Scientific resources to predict estrogen and androgen activity
  • FDALabel
    Full-Text Search of Product Labeling
  • Gene Ontology for Functional Analysis (GOFFA)
    ArrayTrack™ tool to identify terms in Gene Ontology associated with a list of genes
  • Liver Toxicity Knowledge Base (LTKB)
    Collection of diverse drug-induced liver injury data associated with individual drugs and development of predictive models to assess risk of drug-induced liver injury
  • MicroArray/Sequencing Quality Control (MAQC/SEQC) Project
    Development of microarray quality control metrics and thresholds
  • Mold2
    Generate molecular descriptors from two-dimensional structures
  • NCTR Liver Cancer Database (NCTRlcdb)
    Database of 999 chemicals with assigned liver-toxicity classifications to facilitate the construction of cleaner and better carcinogenicity models by FDA and other organizations
  • SNPTrack
    An integrated solution for the management, analysis, and interpretation of genetic association study data

Notable 2015 Division Accomplishments

  • Building on the success of the previous MAQC projects, we started a follow-up study named Sequencing Quality Control Phase 2 (SEQC2), also known as MAQC-IV. This research aims to develop quality control metrics and benchmark bioinformatics approaches for the analysis of whole-genome sequencing data. The project will also examine targeted gene sequencing data to establish best practices and standard analysis protocols that apply to newer methods in regulatory settings and in precision medicine.
  • Statistical and data mining techniques for large-scale data inference, including clustering, biclustering, and classification algorithms, were developed for subgroup identification and prediction. These techniques were applied to (1) the development of prognostic models, predictive models, and predictive enrichment classifiers in precision medicine; (2) serotype identification and characterization in outbreak investigations; and (3) identification of associations between drug subgroups and adverse-event subgroups in the FDA Adverse Event Reporting System.
  • Completed updates to several IT infrastructure components in cooperation with the local Office of Information Management and Technology staff. These included a new Oracle test database environment and new versions of Oracle Application Express (APEX) and Oracle WebLogic, enabling new features and enhanced applications in the production and development environments.

Examples of Ongoing Research Projects

The following list is a sample of research projects under way at NCTR in the Division of Bioinformatics and Biostatistics.

  • Of text and gene: using text mining methods to uncover hidden knowledge in toxicogenomics
  • Developing an intelligent recognition system for storage pest fragments contaminating food products
  • Multi-omics approach to identify an antimicrobial resistance marker of Staphylococcus aureus associated with antimicrobial-coated medical devices in a biofilm reactor
  • Development and evaluation of predictive models for the management of risk of drug-induced liver injury in the investigational new drug phase
  • Assessment of the disparities on drug-host interaction of drug-induced liver injury reported in large electronic medical record system using advanced methodologies for minority populations   

Contact Information

For more information, please contact Weida Tong, Ph.D. at 870-543-7142 or weida.tong@fda.hhs.gov.

Page Last Updated: 06/28/2016