MicroArray/Sequencing Quality Control (MAQC/SEQC)
Microarrays and next-generation sequencing represent core technologies in pharmacogenomics and toxicogenomics; however, before these technologies can be used successfully and reliably in clinical practice and regulatory decision-making, standards and quality measures need to be developed. The MAQC project is helping to improve microarray and next-generation sequencing technologies and to foster their proper application in the discovery, development and review of FDA-regulated products. Everyone is invited to participate in the MAQC project.
The MAQC project is an FDA-led, community-wide consortium effort to address issues relating to the application of constantly evolving high-throughput genomics technologies to assess either the safety and efficacy of FDA-regulated products or the safe and effective use of these technologies in clinical applications as in vitro diagnostic devices. The MAQC consortium completed three projects between 2005 and 2014 (MAQC I, II and III), resulting in ~30 publications, one third of which were published in Nature Biotechnology. Furthermore, two of these papers were among the most cited in Nature Biotechnology in the last 20 years.
The fourth MAQC project is named Sequencing Quality Control Phase 2 (SEQC2). Its primary objective is to develop standard analysis protocols and quality control metrics for fit-for-purpose use of NGS data to enhance regulatory science research and precision medicine. The project consists of three specific aims: (1) to develop quality metrics for reproducible NGS results from both whole genome sequencing (WGS) and targeted gene sequencing (TGS), (2) to benchmark bioinformatics methods for WGS and TGS towards the development of standard data analysis protocols, and (3) to assess the joint effects of key parameters affecting NGS results and interpretation for clinical application.
The third phase of the MAQC project (MAQC-III), also called Sequencing Quality Control (SEQC), aimed at assessing the technical performance of next-generation sequencing platforms by generating benchmark datasets with reference samples and evaluating advantages and limitations of various bioinformatics strategies in RNA and DNA analyses.
This was an FDA-led community-wide consortium consisting of 180 researchers from 73 organizations across 12 countries. The project aimed to:
- examine the latest tools for measuring gene activity (RNA-Seq)
- establish best practices for reproducibility across different technologies and laboratories
- evaluate the utility of these technologies in clinical and safety assessments.
Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were tested at multiple sites for reproducibility, accuracy, and information content. The project also extensively compared RNA-seq to microarray technology and evaluated the transferability of predictive models and signature genes between microarray and RNA-seq data. The impact of various bioinformatics approaches on the downstream biological interpretation of RNA-seq results was also comprehensively examined, and the utility of RNA-seq in clinical application and safety evaluation was assessed. The project was completed by the end of 2014 and generated many manuscripts (visit MAQC Publications under 2014 for a list). Most of the publications are available in a Nature Collections special issue.
The second phase of the MAQC project (MAQC-II) aimed to 1) assess the capabilities and limitations of various data analysis methods in developing and validating microarray-based predictive models and 2) reach consensus on the “best practices” for development and validation of predictive models based on microarray gene expression and genotyping data for personalized medicine.
Thirty-six teams developed classifiers for 13 endpoints (some easy, some difficult to predict) from six relatively large training data sets. These analyses collectively produced >18,000 models that were challenged by independent and blinded validation sets generated for MAQC-II. The cross-validated performance estimates for models developed under good practices are predictive of the blinded validation performance. The achievable prediction performance is largely determined by the intrinsic predictability of the endpoint, and simple data analysis methods often perform as well as more complicated approaches. Multiple models of comparable performance can be developed for a given endpoint, and the stability of gene lists correlates with endpoint predictability. Importantly, similar conclusions were reached when >12,000 new models were generated by swapping the original training and validation sets.
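The comparison between a cross-validated estimate and blinded validation performance can be illustrated with a small sketch. This is not the MAQC-II analysis pipeline: the data are synthetic, the nearest-centroid classifier is just one example of a simple method, and every name below (`make_dataset`, `cross_validate`, the effect size, the number of genes) is an assumption chosen for illustration.

```python
import random
import statistics

def make_dataset(n_samples, n_genes, effect, rng):
    """Synthetic expression profiles (assumption): the first 5 genes are
    shifted by `effect` in class 1; all other genes are pure noise."""
    data = []
    for i in range(n_samples):
        label = i % 2
        profile = [rng.gauss(effect * label if g < 5 else 0.0, 1.0)
                   for g in range(n_genes)]
        data.append((profile, label))
    return data

def centroids(data):
    """Fit a nearest-centroid model: one mean profile per class."""
    model = {}
    for label in (0, 1):
        rows = [p for p, l in data if l == label]
        model[label] = [statistics.fmean(col) for col in zip(*rows)]
    return model

def predict(model, profile):
    """Assign the class whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(profile, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, data):
    return sum(predict(model, p) == l for p, l in data) / len(data)

def cross_validate(data, k, rng):
    """k-fold cross-validation: the model is refit on each training split,
    so held-out samples never influence the model that scores them."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(accuracy(centroids(train), folds[i]))
    return statistics.fmean(scores)

rng = random.Random(0)
train_set = make_dataset(100, 50, 1.5, rng)   # stands in for a training set
blind_set = make_dataset(100, 50, 1.5, rng)   # stands in for a blinded set

cv_estimate = cross_validate(train_set, 5, rng)
blind_score = accuracy(centroids(train_set), blind_set)
# Swap the roles of the two sets, as in the swap analysis described above.
swap_score = accuracy(centroids(blind_set), train_set)

print(f"cross-validated estimate: {cv_estimate:.2f}")
print(f"blinded validation:       {blind_score:.2f}")
print(f"after swapping the sets:  {swap_score:.2f}")
```

In this toy setting the three numbers should land close to one another because both sets are drawn from the same distribution; with real data, agreement depends on how carefully the cross-validation is carried out and on the endpoint itself.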
The first phase of the MAQC project (MAQC-I) aimed to:
- provide quality control (QC) tools to the microarray community to avoid procedural failures
- develop guidelines for microarray data analysis by providing the public with large reference datasets along with readily accessible reference RNA samples
- establish QC metrics and thresholds for objectively assessing the performance achievable by various microarray platforms
- evaluate the advantages and disadvantages of various data analysis methods
MAQC-I involved six FDA Centers, major providers of microarray platforms and RNA samples, EPA, NIST, academic laboratories, and other stakeholders. Two human reference RNA samples were selected, and differential gene expression levels between the two samples were calibrated with microarrays and other technologies (e.g., QRT-PCR). The resulting microarray datasets were used for assessing the precision and cross-platform/laboratory comparability of microarrays, and the QRT-PCR datasets enabled evaluation of the nature and magnitude of any systematic biases that may exist between microarrays and QRT-PCR. The availability of the well-characterized RNA samples combined with the resulting microarray and QRT-PCR datasets, which were made readily accessible to the scientific community, allows individual laboratories to more easily identify and correct procedural failures.
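As a rough illustration of how a systematic bias between two measurement technologies might be quantified, the sketch below compares log2 fold changes between two samples as measured by a microarray and by QRT-PCR. The gene names and intensity values are invented for the example and are not MAQC-I data; the summary statistics (mean difference and a through-origin slope) are one simple choice among many, not the method used in the project.

```python
import math

# Hypothetical measurements for two samples (A and B) on two technologies.
# All gene names and values are invented for illustration only.
microarray = {
    "GENE1": (1200.0, 300.0),
    "GENE2": (300.0, 600.0),
    "GENE3": (800.0, 800.0),
    "GENE4": (150.0, 1200.0),
    "GENE5": (640.0, 160.0),
}
qrt_pcr = {
    "GENE1": (1600.0, 200.0),
    "GENE2": (200.0, 800.0),
    "GENE3": (500.0, 500.0),
    "GENE4": (100.0, 1600.0),
    "GENE5": (800.0, 100.0),
}

def log2_fold_changes(measurements):
    """log2(A / B) per gene: positive means higher expression in sample A."""
    return {gene: math.log2(a / b) for gene, (a, b) in measurements.items()}

array_lfc = log2_fold_changes(microarray)
pcr_lfc = log2_fold_changes(qrt_pcr)
genes = sorted(array_lfc)

# Mean difference: a constant offset between the two technologies.
mean_diff = sum(array_lfc[g] - pcr_lfc[g] for g in genes) / len(genes)

# Least-squares slope through the origin of microarray vs. QRT-PCR fold
# changes: a slope below 1 means the microarray reports smaller fold changes.
slope = (sum(array_lfc[g] * pcr_lfc[g] for g in genes)
         / sum(pcr_lfc[g] ** 2 for g in genes))

# Fraction of genes for which both technologies agree on the direction.
same_direction = sum(
    (array_lfc[g] > 0) == (pcr_lfc[g] > 0) and (array_lfc[g] < 0) == (pcr_lfc[g] < 0)
    for g in genes
) / len(genes)

print(f"mean log2 fold-change difference: {mean_diff:.3f}")
print(f"slope (microarray vs QRT-PCR):    {slope:.3f}")
print(f"directional agreement:            {same_direction:.2f}")
```

A laboratory with its own measurements of the reference samples could run the same kind of comparison against the published datasets to check whether its results fall in the expected range.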
NCTR Bioinformatics Support
Food and Drug Administration
3900 NCTR Rd
Jefferson, AR 72079