Microarrays and next-generation sequencing represent core technologies in pharmacogenomics and toxicogenomics; however, before these technologies can be used successfully and reliably in clinical practice and regulatory decision-making, standards and quality measures need to be developed. The MAQC project is helping to improve microarray and next-generation sequencing technologies and to foster their proper application in the discovery, development, and review of FDA-regulated products. Everyone is invited to participate in the MAQC project.
SEQC2 is an FDA-led, community-wide Sequencing Quality Control (SEQC) consortium effort to develop best practices, with recommended standard analysis protocols and quality control metrics, for whole genome sequencing and target gene sequencing technologies that will support regulatory science research and precision medicine.
The third phase of the MAQC project (MAQC-III), also called Sequencing Quality Control (SEQC), aimed to assess the technical performance of next-generation sequencing platforms by generating benchmark datasets with reference samples, and to evaluate the advantages and limitations of various bioinformatics strategies in RNA and DNA analyses.
This was an FDA-led community-wide consortium consisting of 180 researchers from 73 organizations across 12 countries. The project aimed to:
- examine the latest tools for measuring gene activity (RNA-Seq)
- establish best practices for reproducibility across different technologies and laboratories
- evaluate the utility of these technologies in clinical and safety assessments
Specifically, three RNA-Seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were tested at multiple sites for reproducibility, accuracy, and information content. The project also extensively compared RNA-Seq to microarray technology and evaluated the transferability of predictive models and signature genes between microarray and RNA-Seq data. The impact of various bioinformatics approaches on the downstream biological interpretation of RNA-Seq results was comprehensively examined, and the utility of RNA-Seq in clinical application and safety evaluation was assessed. The project was completed by the end of 2014 and generated many manuscripts (visit MAQC Publications under 2014 for a list). Most of the publications are available in a Nature Collections special issue.
The second phase of the MAQC project (MAQC-II) aimed to 1) assess the capabilities and limitations of various data analysis methods in developing and validating microarray-based predictive models and 2) reach consensus on the “best practices” for development and validation of predictive models based on microarray gene expression and genotyping data for personalized medicine.
Thirty-six teams developed classifiers for 13 endpoints, some easy and some difficult to predict, from six relatively large training data sets. These analyses collectively produced >18,000 models that were challenged by independent and blinded validation sets generated for MAQC-II. The cross-validated performance estimates for models developed under good practices were predictive of the blinded validation performance. The achievable prediction performance was largely determined by the intrinsic predictability of the endpoint, and simple data analysis methods often performed as well as more complicated approaches. Multiple models of comparable performance could be developed for a given endpoint, and the stability of gene lists correlated with endpoint predictability. Importantly, similar conclusions were reached when >12,000 new models were generated by swapping the original training and validation sets.
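The relationship between internal cross-validation and blinded external validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is just one simple method, and the function names are hypothetical. It is not one of the MAQC-II models or protocols.

```python
import random
from statistics import mean

def make_dataset(n, n_genes, shift, rng):
    """Synthetic expression profiles; class 1 is shifted by `shift` in every gene."""
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are balanced
        profile = [rng.gauss(label * shift, 1.0) for _ in range(n_genes)]
        data.append((profile, label))
    return data

def fit_centroids(samples):
    """Nearest-centroid model: the mean profile of each class."""
    centroids = {}
    for label in {lab for _, lab in samples}:
        rows = [x for x, lab in samples if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, profile):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(profile, centroid))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))

def accuracy(centroids, samples):
    return mean(1.0 if predict(centroids, x) == lab else 0.0 for x, lab in samples)

def cross_validated_accuracy(samples, k=5):
    """k-fold cross-validation that uses the training set only."""
    scores = []
    for fold in range(k):
        held_out = samples[fold::k]
        kept = [s for i, s in enumerate(samples) if i % k != fold]
        scores.append(accuracy(fit_centroids(kept), held_out))
    return mean(scores)

def evaluate(train, validation):
    """Return (internal cross-validation estimate, external validation accuracy)."""
    internal = cross_validated_accuracy(train)
    external = accuracy(fit_centroids(train), validation)
    return internal, external

rng = random.Random(0)  # fixed seed so the sketch is repeatable
set_a = make_dataset(60, 20, 1.0, rng)
set_b = make_dataset(60, 20, 1.0, rng)

original = evaluate(set_a, set_b)  # train on A, validate on B
swapped = evaluate(set_b, set_a)   # swap the roles of the two sets

print("original (CV, external):", original)
print("swapped  (CV, external):", swapped)
```

When the internal estimate is computed without touching the validation set, the two numbers in each pair should be close, and the swap gives a second, independent check of the same conclusion.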
The first phase of the MAQC project (MAQC-I) aimed to:
- provide quality control (QC) tools to the microarray community to avoid procedural failures
- develop guidelines for microarray data analysis by providing the public with large reference datasets along with readily accessible reference RNA samples
- establish QC metrics and thresholds for objectively assessing the performance achievable by various microarray platforms
- evaluate the advantages and disadvantages of various data analysis methods
MAQC-I involved six FDA Centers, major providers of microarray platforms and RNA samples, EPA, NIST, academic laboratories, and other stakeholders. Two human reference RNA samples were selected, and differential gene expression levels between the two samples were calibrated with microarrays and other technologies (e.g., QRT-PCR). The resulting microarray datasets were used for assessing the precision and cross-platform/laboratory comparability of microarrays, and the QRT-PCR datasets enabled evaluation of the nature and magnitude of any systematic biases that may exist between microarrays and QRT-PCR. The availability of the well-characterized RNA samples, combined with the resulting microarray and QRT-PCR datasets, which were made readily accessible to the scientific community, allows individual laboratories to more easily identify and correct procedural failures.
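One simple way to look at systematic bias between two technologies is to compare the log2 ratios (sample A over sample B) that each reports for the same genes. The sketch below does this with made-up numbers; the values, variable names, and the choice of summary statistics (mean difference, Pearson correlation, least-squares slope) are assumptions for illustration, not MAQC-I data or the project's actual analysis.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ls_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical log2(A/B) ratios for the same seven genes on two technologies.
qpcr_log2 = [-3.0, -1.5, -0.5, 0.0, 0.8, 2.0, 3.5]
array_log2 = [-2.1, -1.1, -0.3, 0.1, 0.6, 1.5, 2.4]

# Mean paired difference: a constant offset between the two technologies.
bias = mean(a - q for a, q in zip(array_log2, qpcr_log2))
# Correlation: how well the two technologies agree on the ranking of changes.
r = pearson(qpcr_log2, array_log2)
# Slope below 1 would mean the array reports smaller fold changes than QRT-PCR.
b = ls_slope(qpcr_log2, array_log2)

print(f"mean difference: {bias:.3f}")
print(f"Pearson r:       {r:.3f}")
print(f"slope:           {b:.3f}")
```

A high correlation with a slope well away from 1 would point to a scale difference rather than random disagreement, which is the kind of distinction between the "nature" and the "magnitude" of a bias that paired reference datasets make possible.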