U.S. Department of Health and Human Services

Medical Devices

Ultra High Throughput Sequencing for Clinical Diagnostic Applications - Approaches to Assess Analytical Validity, June 23, 2011

The Food and Drug Administration (FDA) is announcing a public meeting titled "Ultra High Throughput Sequencing for Clinical Diagnostic Applications - Approaches to Assess Analytical Validity."

The purpose of the meeting is to discuss challenges in assessing analytical performance for ultra high throughput genomic sequencing-based clinical applications.


Date, Time and Location

This meeting was held June 23, 2011, beginning at 8:00 a.m. at the following location:

FDA White Oak Campus
10903 New Hampshire Ave
The Great Room (Room 1503), White Oak Conference Center, Bldg 31
Silver Spring, MD, 20903

The event was webcast.

Background

Ultra high throughput genomic sequencing technologies are currently extensively used in research and are entering clinical diagnostic use; they are expected to bring transformative public health applications.  In order to effectively utilize new sequencing technologies for clinical applications, appropriate evaluation tools (e.g., standards, well established criteria) are needed to determine the accuracy of the results. Any regulatory strategy for clinical tests based on ultra high throughput genomic sequencing will benefit from novel and scientifically agreed-upon approaches to analytical validation. FDA is holding this public meeting to start discussion on approaches that can provide the most useful information in establishing safety and effectiveness of genomic sequencing technologies when used clinically.

This public meeting seeks input from academia, government, industry, and other stakeholders on validation methodologies, materials, and bioinformatics approaches needed to address unique analytical validation requirements of ultra high throughput sequencing based molecular diagnostics and confirm the sequencing quality and the accuracy of the tests. The ultimate goal is to accelerate and support the introduction of safe and effective innovative diagnostics in public health applications.

Agenda

8:00-8:05 AM  Welcome and Logistics
8:05-8:15 AM  Opening Remarks – Jonathan Sackner-Bernstein, MD, Associate Center Director, Technology & Innovation, CDRH/FDA
8:15-8:30 AM  Introduction and Purpose of the Meeting – Hui Lee Wong, PhD, OSB/CDRH/FDA
8:30-9:15 AM  Overview of Genomic Sequencing Technologies and Applications – Vincent Magrini, PhD, The Genome Institute at Washington University
9:15-9:30 AM  A Stakeholder’s Proposal for the Analytical Validation of Whole Genome Sequencing – Laurence Kedes, MD, X PRIZE Foundation
9:30-9:45 AM  Q & A
9:45-10:00 AM  Break
10:00 AM-12:00 PM  Panel Discussion: Technical Performance of a Platform
Moderator:
Elizabeth Mansfield, PhD, OIVD/CDRH/FDA
Panelists:
Lisa Brooks, PhD, NHGRI/NIH
David Dimmock, MD, Medical College of Wisconsin
Narjol Gonzalez-Escalona, PhD, CFSAN/FDA
Ira Lubin, PhD, CDC
Tina Hambuch, PhD, Illumina (representing AdvaMed)
Madhuri Hegde, PhD, Emory University School of Medicine (representing AMP)
Vincent Magrini, PhD, The Genome Institute at Washington University
Brad Ozenberger, PhD, NHGRI/NIH
John Pfeifer, MD, PhD, Washington University School of Medicine (representing CAP)
Marc Salit, PhD, NIST
Alexander Wait Zaranek, PhD, Harvard Medical School
12:00-1:00 PM  Lunch
1:00-2:00 PM  Public comments
2:00-2:45 PM  Bioinformatics Tools for Genomic Sequencing – Russ Altman, MD, PhD, Department of Bioengineering, Stanford University
2:45-3:00 PM  Break
3:00-5:00 PM  Panel Discussion: Bioinformatics
Moderator:
Weida Tong, PhD, NCTR/FDA
Panelists:
Russ Altman, MD, PhD, Stanford University
Lynn Bry, MD, PhD, Brigham & Women's Hospital (representing CAP)
Hyun Min Kang, PhD, University of Michigan
James Knight, PhD, Roche Diagnostics (representing AdvaMed)
Elliott Margulies, PhD, NHGRI/NIH
Joshua Sampson, PhD, NCI/NIH
Granger Sutton, PhD, JCVI
Alexander Wait Zaranek, PhD, Harvard Medical School
David Wheeler, PhD, Baylor College of Medicine
6:00 PM  Adjourn

Topics for Discussion

The Ultra High Throughput Sequencing for Clinical Diagnostic Applications - Approaches to Assess Analytical Validity meeting will focus on analytical questions for human whole genome sequencing using current and upcoming genomic sequencing technologies. The questions are intended to focus discussion on how sequencing platforms can be analytically evaluated prior to clinical use as a whole platform (not necessarily by specific claims). The meeting will not address regulatory questions.

The questions and discussion will primarily focus on the evaluation of whole genome sequencing for human genomic applications; however, lessons from sequencing microorganisms are applicable. In order to streamline the discussion, this meeting will not cover questions about clinical validation, clinical significance of findings, how to address incidental findings, etc.

Each session of the meeting will start with a background presentation that provides an overview of ultra high throughput sequencing technologies, including their advantages and disadvantages. Background presentations will provide a starting point for the panel discussion.

Panel Discussion: Technical Performance of a Platform

This discussion will focus on accuracy, rather than on other analytical performance characteristics. The accuracy of the platform may depend upon the application, e.g., gene panels / targeted sequencing, exome and whole genome sequencing (WGS). The following questions address issues in evaluation of accuracy.

  1. What evaluation criteria should be used to assess the accuracy of the sequencing platform / accuracy of generating a raw read, e.g., sequencing fidelity, completeness, quality scores, sequencing depth?
    • Are there performance benchmarks, such as % correct bases, completeness, haplotype error, that would be useful in understanding analytical performance of the platform?
    • Is there a minimum sequencing depth that can be specified to supply reliable measurement results?
    • Is there a minimum percentage of genome that needs to be covered to understand the platform performance as a whole?
  2. What possible comparator methods/measures of truth could be used to evaluate platform performance?
    • Should a composite/consensus of various methods be considered?
    • Would comparators need to include use of orthogonal or alternate techniques (e.g., Sanger, mass spec, SNP arrays)?
  3. Are there critical regions/variations that should be incorporated in accuracy evaluations for the sequencing platform as a whole? Examples to consider include difficult to sequence regions, highly polymorphic regions (e.g., HLA), SNPs, indels, CNVs, inversions, repeats, homologous or redundant sequences, translocations, haplotype assignments.
  4. Could an appropriate validation set of samples/panels be created to evaluate platform accuracy, considering that different types of genomic variation may have different characteristics?
    • Whole reference genomes?
      • Trios, unrelated individuals, duplicate samples, etc.?
      • Can data such as HapMap or 1000 Genomes be incorporated / used, in whole or in part, e.g., HapMap sequences that have been validated by bidirectional sequencing?
    • Validated gene panels, validated exomes?
    • Validation set of samples (e.g. prespecified cell lines)?
    • Do cell lines accurately represent clinical samples in terms of querying accuracy?
  5. How should pre-analytical issues (e.g., preparation of libraries, extraction and QC of nucleic acids, capture methods, amplification) be addressed?
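
Several of the benchmarks named in the questions above (completeness, % correct bases, sequencing depth) have simple arithmetic definitions. The following is a minimal Python sketch, not a method proposed by the meeting. It assumes that a comparator ("truth") sequence is available as a position-to-base mapping and that a no-call is represented by None. The function names accuracy_metrics and mean_depth are hypothetical and used only for illustration.

```python
from typing import Mapping, Optional


def accuracy_metrics(truth: Mapping[int, str],
                     calls: Mapping[int, Optional[str]]) -> dict:
    """Compare platform base calls with a comparator ("truth") sequence.

    truth: position -> base reported by the comparator method
    calls: position -> base called by the platform (None or absent = no-call)
    """
    total = len(truth)
    called = 0
    correct = 0
    for pos, true_base in truth.items():
        base = calls.get(pos)
        if base is None:
            continue  # a no-call lowers completeness, not percent correct
        called += 1
        if base == true_base:
            correct += 1
    return {
        # fraction of comparator positions that received any call
        "completeness": called / total if total else 0.0,
        # percentage of called positions that agree with the comparator
        "percent_correct": 100.0 * correct / called if called else 0.0,
    }


def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average sequencing depth = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


# Toy data (illustrative only): five positions, one miscall, one no-call.
truth = {1: "A", 2: "C", 3: "G", 4: "T", 5: "A"}
calls = {1: "A", 2: "C", 3: "T", 4: None, 5: "A"}
metrics = accuracy_metrics(truth, calls)
depth = mean_depth(num_reads=1_000_000, read_length=100, genome_size=10_000_000)
```

In this sketch, no-calls count against completeness but not against percent correct. That is one possible convention; whether the two should be reported separately or combined is among the questions posed to the panel.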

Panel Discussion: Bioinformatics

This discussion will focus on how the informatics and software elements of the sequencer affect the generation of accurate sequence results. The following questions address issues in data analysis, data formats, and storage.

  1. Questions related to data analysis and bioinformatics pipeline:
    • Sequence evaluation:
      • Different platforms have different approaches to generating quality scores. How can quality scores be evaluated?
      • What criteria should be used for choosing appropriate sequencing depth?
      • What should be evaluated as a measure of informatics performance (e.g., base pair calls, completeness, accuracy, haplotype phasing, other)?
        • If only the final output is evaluated, how can the fidelity of the processing from raw reads to final sequence be verified?
    • Sequence assembly:
      • How can the quality of assembly and sequence alignment algorithms be assessed?
      • How can an informatics pipeline be evaluated for its ability to detect the sequence variations sought (e.g., SNPs, indels, CNVs)?
      • What methods may be appropriate to detect sequence assembly errors?
      • What are the metrics appropriate for assembly quality control? Can the current global assembly accuracy measures be used, e.g., N50, misassembly detection, comparative methods for fine-scale inaccuracy detection?
    • What should be the performance criteria for interpretive algorithms (e.g., variant detection)?
  2. Questions relevant to data format and storage:
    There is a significant tradeoff regarding which intermediate files should be kept and for how long. Additional questions concern read and alignment files, which are large but potentially useful if questions arise about a called variant or if a new variant detection algorithm emerges.
    • Should a unified way to handle this be considered?
    • Should there be standardized data formats (e.g., formats of raw sequence, aligned sequence, variants)?
      Examples include raw reads; FASTQ files (a format used for data sharing among sequencing centers); and the SAM (Sequence Alignment/Map) and BAM formats used by Broad/MIT, Sanger, and the 1000 Genomes Project, which include information on genomic location, can be displayed in viewers and data visualization platforms, and are compatible with certain sequence alignment algorithms.
    • What might be acceptable approaches to data storage, archiving, compression algorithms, and support infrastructure?
    • Do primary data files have intrinsic value for understanding sequence calls and performance, or are intermediate files sufficient?
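
Two of the quantities mentioned in the questions above can be made concrete: per-base quality scores as stored in FASTQ files, and the N50 assembly metric. The following is a minimal Python sketch under stated assumptions. It assumes the common Phred+33 ASCII encoding for FASTQ quality strings (some older platforms used a different offset) and the usual definition of N50 as the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. All function names are hypothetical.

```python
from typing import List


def decode_fastq_quality(quality_line: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string into Phred scores.

    offset=33 is an assumption (Phred+33 encoding).
    """
    return [ord(ch) - offset for ch in quality_line]


def phred_to_error_prob(q: int) -> float:
    """Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


def n50(contig_lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Toy inputs (illustrative only).
scores = decode_fastq_quality("II5+!")
error_at_q20 = phred_to_error_prob(20)
assembly_n50 = n50([80, 70, 50, 40, 30, 20])
```

Because different platforms generate quality scores in different ways, a decoded score is only as meaningful as the platform's calibration. One way to check calibration is to compare predicted error probabilities with error rates observed against a comparator.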

Webcast

An online archive of the webcast for this event is available.

Transcripts

Online transcripts are available in PDF format.

Contacts for Additional Information

For information regarding the program, contact:

  • Zivana Tezak
    Center for Devices and Radiological Health
    Food and Drug Administration
    10903 New Hampshire Avenue, Bldg 66
    Silver Spring, MD 20993
    Phone: 301-796-6206
    Email: Zivana.Tezak@fda.hhs.gov