MicroArray Quality Control (MAQC)
Microarrays and next-generation sequencing represent core technologies in pharmacogenomics and toxicogenomics; however, before these technologies can be used successfully and reliably in clinical practice and regulatory decision-making, standards and quality measures need to be developed. The MAQC project is helping to improve microarray and next-generation sequencing technologies and to foster their proper application in the discovery, development, and review of FDA-regulated products. Everyone is invited to participate in the MAQC project.
The first phase of the MAQC project (MAQC-I) aims to:
- provide quality control (QC) tools to the microarray community to avoid procedural failures
- develop guidelines for microarray data analysis by providing the public with large reference datasets along with readily accessible reference RNA samples
- establish QC metrics and thresholds for objectively assessing the performance achievable by various microarray platforms
- evaluate the advantages and disadvantages of various data analysis methods
MAQC-I involves six FDA Centers, major providers of microarray platforms and RNA samples, EPA, NIST, academic laboratories, and other stakeholders. Two human reference RNA samples have been selected, and differential gene expression levels between the two samples have been calibrated with microarrays and other technologies (e.g., QRT-PCR). The resulting microarray datasets have been used for assessing the precision and cross-platform/laboratory comparability of microarrays, and the QRT-PCR datasets enabled evaluation of the nature and magnitude of any systematic biases that may exist between microarrays and QRT-PCR. The availability of the well-characterized RNA samples, combined with the resulting microarray and QRT-PCR datasets, which have been made readily accessible to the scientific community, allows individual laboratories to more easily identify and correct procedural failures.
The second phase of the MAQC project (MAQC-II) aims to:
- assess the capabilities and limitations of various data analysis methods in developing and validating microarray-based predictive models
- reach consensus on the “best practices” for development and validation of predictive models based on microarray gene expression and genotyping data for personalized medicine
Thirty-six teams developed classifiers for 13 endpoints (some easy, some difficult to predict) from six relatively large training data sets. These analyses collectively produced >18,000 models that were challenged by independent and blinded validation sets generated for MAQC-II. The cross-validated performance estimates for models developed under good practices are predictive of the blinded validation performance. The achievable prediction performance is largely determined by the intrinsic predictability of the endpoint, and simple data analysis methods often perform as well as more complicated approaches. Multiple models of comparable performance can be developed for a given endpoint, and the stability of gene lists correlates with endpoint predictability. Importantly, similar conclusions were reached when >12,000 new models were generated by swapping the original training and validation sets.
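The comparison described above, a cross-validated estimate on the training set checked against an independent validation set and then repeated with the two sets swapped, can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, the nearest-centroid classifier is a deliberately simple stand-in rather than any method used by the MAQC-II teams, and all function names and parameters (`make_dataset`, `signal`, `evaluate`, and so on) are hypothetical.

```python
import random

def make_dataset(n_samples, n_features, signal, rng):
    """Synthetic two-class data (assumption): class 1 is shifted by `signal`."""
    data = []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(label * signal, 1.0) for _ in range(n_features)]
        data.append((row, label))
    return data

def fit_centroids(data):
    """Nearest-centroid classifier: store the per-class feature means."""
    centroids = {}
    for label in {y for _, y in data}:
        rows = [x for x, y in data if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, data):
    """Fraction of samples in `data` that the fitted centroids classify correctly."""
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

def cross_validated_accuracy(data, k, rng):
    """k-fold estimate: every sample is predicted by a model that never saw it."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in indices if i not in held_out]
        model = fit_centroids(train)
        correct += sum(predict(model, data[i][0]) == data[i][1] for i in fold)
    return correct / len(data)

def evaluate(train, validation, k, rng):
    """Return (internal cross-validated accuracy, external validation accuracy)."""
    internal = cross_validated_accuracy(train, k, rng)
    external = accuracy(fit_centroids(train), validation)
    return internal, external

rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_set = make_dataset(60, 20, signal=0.5, rng=rng)
validation_set = make_dataset(60, 20, signal=0.5, rng=rng)

# Original direction, then the swap of training and validation sets.
original = evaluate(train_set, validation_set, k=5, rng=rng)
swapped = evaluate(validation_set, train_set, k=5, rng=rng)
print("original (cv, external):", original)
print("swapped  (cv, external):", swapped)
```

In this toy setting the `signal` parameter plays the role of endpoint predictability: raising or lowering it moves both the internal and the external accuracy together, which is the kind of agreement the paragraph above reports for models developed under good practices.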
The third phase of the MAQC project (MAQC-III), also called Sequencing Quality Control (SEQC), aims at assessing the technical performance of next-generation sequencing platforms by generating benchmark datasets with reference samples and evaluating advantages and limitations of various bioinformatics strategies in RNA and DNA analyses.