UNITED STATES OF AMERICA
+ + + + +
FOOD AND DRUG ADMINISTRATION
+ + + + +
CENTER FOR DEVICES AND RADIOLOGICAL HEALTH
+ + + + +
MEDICAL DEVICES ADVISORY COMMITTEE
+ + + + +
MICROBIOLOGY DEVICES PANEL
+ + + + +
OCTOBER 12, 2001
+ + + + +
The panel met in Salons A-C of the Gaithersburg Hilton, 620 Perry Parkway, Gaithersburg, Maryland, at 9:00 a.m., Dr. Michael L. Wilson, Chairman, presiding.
MICHAEL L. WILSON, M.D., Chairman
ELLEN JO BARON, Ph.D., Temporary Voting Member
KATHLEEN G. BEAVIS, M.D., Member
KAREN C. CARROLL, M.D., Consultant
PATRICIA CHARACHE, M.D., Consultant
FRANKLIN R. COCKERILL III, M.D., Consultant
DAVID T. DURACK, M.D., Ph.D., Industry Representative
JANINE JANOSKY, Ph.D., Consultant
DAVID M. LEWINSOHN, M.D., Ph.D., Guest
VALERIE L. NG, Ph.D., Member
FREDERICK C. NOLTE, Ph.D., Temporary Voting Member
L. BARTH RELLER, M.D., Temporary Voting Member
STANLEY M. REYNOLDS, Consumer Representative
NATALIE L. SANDERS, M.D., M.P.H., M.B.A., Member
FREDDIE M. POOLE, Executive Secretary
Opening Remarks
Presentation by Manufacturer, Cellestis Limited
Dr. Jim Rothel
TB Diagnosis: Current Situation and Medical Need
Dr. Antonino Catanzaro
Scientific Basis of the Test
Dr. Paul Wood
Development and Clinical Studies
Dr. Jim Rothel
Questions from Panel Members
Presentation by FDA
QFT Performance Characterization
Roxanne G. Shively
Senior Scientific Reviewer
Bacteriology Devices Branch
Division of Clinical Laboratory Devices
Clinical Use of QFT Results
Leonard V. Sacks, M.D.
Senior Staff Fellow
Divisions of Special Pathogens and Immunologic Drug Products
Statistical Analysis of Data
John L. Dawson, MS, JD
Division of Biostatistics
Office of Surveillance and Biometrics
Questions from Panel Members
JAMES B. McAULEY
Medical Director, Cermak
Cook County Jail, Illinois
STANLEY H. REYNOLDS
State Department of Health
Laboratory in Pennsylvania
On Behalf of:
Director, TB Control Program
Commonwealth of Pennsylvania
Open Committee Discussion
Open Public Hearing
Industry Response
FDA Response
Final Recommendations and Vote
CHAIRMAN WILSON: I'd like everyone to take their seats at this time, please, so we can begin.
Good morning. I'd like to welcome everyone to the meeting of the Microbiology Devices Panel this morning. I'm Dr. Michael Wilson. I'm the Chair of the panel from the University of Colorado and Denver Health Medical Center.
I'd like to begin the meeting this morning by having the panel members introduce themselves. If we could go around, just please state your name and your affiliation. We could start with Dr. Durack.
DR. DURACK: Good morning. I'm David Durack. I'm the industry representative on the panel, and I'm affiliated with Becton Dickinson and also with Duke University.
MR. REYNOLDS: Good morning. I'm Stanley Reynolds. I'm the consumer representative on the panel. I'm Supervisor of Immunology and Virology for the Pennsylvania State Public Health Laboratory.
DR. CHARACHE: I'm Patricia Charache. I'm affiliated with Johns Hopkins University, a former panel member, and a consultant for this panel.
DR. BARON: I'm Ellen Jo Baron with the Department of Pathology and Medicine at Stanford University, and I am the Director of the Microbiology and Virology Laboratory, Stanford University Medical Center.
DR. CARROLL: Good morning. I'm Karen Carroll. I'm Associate Professor of Pathology at the University of Utah School of Medicine and the Director of Microbiology Laboratories at ARUP Labs, Incorporated, in Salt Lake City.
DR. SANDERS: Good morning. I'm Natalie Sanders, a general internist with Southern California Permanente Medical Group, also known as Kaiser, and I am on the clinical faculty at the University of Southern California.
DR. NG: Good morning. I'm Valerie Ng. I'm Professor and Interim Chair of the Department of Laboratory Medicine at UC-San Francisco. I'm also the Director of the clinical laboratories at San Francisco General Hospital.
MS. POOLE: I'm Freddie Poole, the Executive Secretary and Branch Chief for Bacteriology Devices.
DR. BEAVIS: Good morning. I'm Kathleen Beavis, the Director of the Microbiology and Virology Laboratories at Cook County Hospital in Chicago.
DR. JANOSKY: Janine Janosky, Associate Professor, Division of Biostatistics, Department of Family Medicine and Clinical Epidemiology, at the University of Pittsburgh.
DR. NOLTE: Good morning. Frederick Nolte, Emory University School of Medicine, Department of Pathology and Laboratory Medicine, where I'm Director of the Clinical Microbiology and Molecular Diagnostics Lab.
DR. RELLER: Barth Reller, Professor of Medicine, Pathology, Division of Infectious Diseases, and Director of Clinical Microbiology, Duke University Medical Center.
DR. LEWINSOHN: I'm David Lewinsohn. I'm an Assistant Professor at Oregon Health Sciences University and have a laboratory that's focused on TB T-cell immunology.
DR. COCKERILL: I'm Frank Cockerill, Professor and Chair of Microbiology at Mayo Clinic, also Professor of Medicine and Infectious Disease Specialist at Mayo Clinic.
DR. GUTMAN: I'm Steve Gutman. I'm Director of the Division of Clinical Laboratory Devices, FDA.
CHAIRMAN WILSON: Thank you.
At this time I would like to have Ms. Poole read the conflict-of-interest statements.
MS. POOLE: Good morning.
The following announcement addresses conflict-of-interest issues associated with this meeting and is made a part of the record to preclude even the appearance of an impropriety. The conflict-of-interest statute prohibits special government employees from participating in matters that could affect their or their employers' financial interests. To determine if any conflict exists, the agency reviewed the submitted agenda and all financial interests reported by the Committee participants. The agency has no conflicts to report.
In the event that the discussions involve any other products or firms not already on the agenda for which an FDA participant has a financial interest, the participant should excuse him or herself from such involvement, and the exclusion will be noted for the record.
With respect to all other participants, we ask that in the interest of fairness all persons making statements or presentations disclose any current or previous financial involvement with any firm whose products they may wish to comment upon.
CHAIRMAN WILSON: Thank you.
A couple of brief housekeeping items: Please, if you have a cell phone, a pager, or any other device that makes noise, I'd ask you to turn it off during the meeting, so as not to distract the proceedings.
The other thing is that, because of the time required to get to the airports now, several panel members reported they have to leave right at or in the middle of the afternoon session. So we're going to try to keep on schedule as much as we can today. Otherwise, we may lose our quorum late in the afternoon.
The other thing is, when you come to the microphone to speak, if you could please identify yourself.
The item of new business today is a pre-market approval application from Cellestis Limited for the QuantiFERON-TB. This is an in vitro diagnostic device for measuring the release of gamma interferon from sensitized lymphocytes in PPD-stimulated whole blood. This product is intended as an aid in the diagnosis of latent TB infection and to aid in evaluation of individuals suspected of having M. tuberculosis infection.
I would ask all the panel members to hold their questions until after the four presentations by the sponsor. I would like to remind the members of the audience that only the panel can ask questions of the speakers.
So at this time I would like to have Dr. Rothel begin.
DR. ROTHEL: Hi. I'm Jim Rothel. I am employed by Cellestis Limited as the Chief Scientific Officer and Executive Director, and I own stock in Cellestis Limited.
I'd first like to take this opportunity to thank the panel members for coming today, but especially those that had to fly here in these turbulent times. It's not a great experience just at the moment.
This first slide is to give you an outline of our presentation today. I'm going to give a brief introduction, and then Professor Tony Catanzaro is going to talk about the current diagnostic methods for TB and limitations of those methods. Then Professor Paul Wood will get up and talk about the scientific basis behind the QuantiFERON technology, and we'll also talk about the bovine model of the QuantiFERON test. Then I'll come back and talk about the development studies and the clinical studies that we're using to base the PMA submission on.
First, this next slide gives us a history of the development of the test. It was initially developed in the mid to late eighties by the Australian Government's research body, CSIRO, for the detection of TB in cattle. At that time CSIRO got an Australian company called CSL Limited as a commercial partner, and they went on to successfully develop this into a commercial product which is now sold around the world.
Given the success of the bovine test, they then went on to develop a human version of the test, which is called QuantiFERON-TB. There's a large amount of pre-clinical and clinical studies in Australia establishing the test cutoffs and the various parameters of the test.
Then the large-scale, pivotal studies that we're using to support our PMA application were conducted by two U.S. Government bodies, the CDC and the Walter Reed Army Institute of Research, which I'll refer to as WRAIR from now on for simplicity. We'll present that later.
I should say the technology is now owned by Cellestis Limited.
This slide is a very simple schematic of the methodology of the test. It's simple because the test is simple. How the test is conducted is a heparinized blood sample is collected from individuals and four 1 ml aliquots of blood are pipetted into four different wells of a 24-well culture tray.
Then this whole blood -- it's not diluted -- we add the antigens to it. The first antigen is a negative control, which is basically PBS. The second well is tuberculin from Mycobacterium tuberculosis, and we'll refer to that as human PPD from now on. The third well is tuberculin from Mycobacterium avium, and we'll call that avian PPD. The fourth well is a mitogen-positive control for each individual, and that consists of a submaximal amount of phytohemagglutinin.
Once these antigens are added to the blood, the plate is put in an incubator overnight at 37 degrees for 16 to 24 hours. During this period if there are any T-cells present in that blood that respond to mycobacterial antigens -- i.e., if the person has been exposed to mycobacteria -- they respond in many different ways. But the main one that we're talking about is the secretion of gamma interferon into the plasma.
The next day the red cells have settled down in the 24 wells, and what you simply do is take the plasma off the top and then assay for the presence of gamma interferon produced in each of those four plasma samples by a rapid EIA for gamma interferon. We're saying it's rapid. There's only one incubation step where you incubate the plasma and the conjugate at the same time, followed by a washing step and adding substrate.
This slide just depicts the type of test interpretation profile we should get. We'll go into detail a little bit later exactly how the test is interpreted.
But an individual who is negative in the QuantiFERON test will not respond to the nil antigen, to the human PPD, or to the avian PPD to any substantial amount and will have a robust response to the mitogen-positive control.
A person in whom the test indicates MTB infection will have a strong response to human PPD and also some response to avian PPD, but to a lesser extent. This is due to the cross-reactive nature of the tuberculin antigens in general between the two species. And, again, a response to mitogen.
The person who has reactivity to atypical mycobacteria or mycobacteria other than tuberculosis, or MOTT, as we'll refer to them throughout the talk, will have the inverse of that response, where the predominant response to the PPD's will be against avian PPD.
The mitogen-positive control also serves as a control for the quality of the blood sample, its ability to produce gamma interferon, or also for anergy perhaps. For an individual in whom a mitogen response is not detectable, a test result cannot be obtained. That's a very rare event.
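The four-well profile described above can be sketched in code. This is only an illustration of the qualitative pattern the speaker outlines, not the product's interpretation algorithm: the `WellReadings` type, the `interpret_profile` function, the subtraction of the nil value, and the `min_response` threshold are all assumptions introduced here, and the actual cutoffs are discussed later in the presentation.

```python
from dataclasses import dataclass


@dataclass
class WellReadings:
    """Gamma interferon measured in the four plasma samples (same units for all)."""
    nil: float        # negative control well
    human_ppd: float  # M. tuberculosis tuberculin well
    avian_ppd: float  # M. avium tuberculin well
    mitogen: float    # positive control well


def interpret_profile(w: WellReadings, min_response: float) -> str:
    """Classify a profile; min_response is a hypothetical threshold above nil."""
    # Assumption: each response is judged relative to the nil well.
    human = w.human_ppd - w.nil
    avian = w.avian_ppd - w.nil
    mitogen = w.mitogen - w.nil

    # No detectable mitogen response: no result can be obtained.
    if mitogen < min_response:
        return "no result"
    # Neither tuberculin produces a substantial response: negative.
    if human < min_response and avian < min_response:
        return "negative"
    # Otherwise the predominant tuberculin response decides the profile.
    if human >= avian:
        return "M. tuberculosis profile"
    return "MOTT profile"


print(interpret_profile(WellReadings(0.1, 12.0, 4.0, 20.0), min_response=1.5))
```

The mitogen check comes first because the speaker states that no result can be obtained without a mitogen response; a real implementation might order or qualify that check differently.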
So I'll just leave you with the intended use and put it upfront. The QuantiFERON test is intended as an aid, and that's an aid in the detection of infection with Mycobacterium tuberculosis.
After that brief introduction, I would like to pass it over to Tony Catanzaro to talk about the current diagnostic methods.
DR. CATANZARO: Good morning. My name is Tony Catanzaro. I'm with the University of California at San Diego. I've been working on tuberculosis for over 30 years, since my introduction to TB in the TB Branch of CDC. Since then, I have been working with CDC on a number of projects.
One of the projects that I did with CDC recently was to work on QuantiFERON. Because of my work with QuantiFERON, Cellestis asked me to join the Board of Directors, and I'm now a stockholder in the company Cellestis and want to disclose that.
But I'm here to talk to you about the clinical aspects of tuberculosis. I want to start by pointing out that the prestigious Institute of Medicine recently published a very important report. In that report they cite that the greatest need in the control of tuberculosis in the United States is a new diagnostic tool to account for individuals who have latent tuberculosis.
The reason they focused on that is that CDC has led the charge, and that charge has been joined by the public health community in general, pointing out the way that the identification and treatment of latent tuberculosis infection is the best way to interrupt the transmission of tuberculosis, by preventing active cases from developing from those latent infections.
The diagnosis of latent tuberculosis is not a particularly simple task. People have said over and over in talking about this particular test, the QuantiFERON test, that there's no gold standard, and I'm not exactly convinced of that. It's true there's not a gold standard from a diagnostic or a device point of view, but clinicians are, in fact, able to diagnose latent tuberculosis.
They do this by taking into account the history of the patient and the possible exposure of that patient, the epidemiologic status, socioeconomic status, and clinical findings -- all that together with the cell-mediated immune response to tuberculosis. That's what we're talking about here today, one aspect of the diagnosis, specifically, the cell-mediated immune response to tuberculin.
That cell-mediated immune response or that TB sensitivity has been measured for 100 years now by the tuberculin skin test, initially developed by Robert Koch and made better by George Comstock and the CDC by a very specific algorithm that's been used. That's the basis that clinicians use to diagnose tuberculin skin test sensitivity.
However, researchers have been very busy for the past couple of decades developing other aspects, other approaches to identifying T-cell reactivity; specifically, lymphocyte proliferation, cytotoxic lymphocyte assays, and the measurement of cytokine expression.
When we look at the tuberculin skin test, I think that the community has done a great job of taking a very old and imprecise technique and really learning how to use it. But I think when we compare the tuberculin skin test with the QuantiFERON-TB, we have to keep in mind the fact that the tuberculin skin test is a very, very complex thing with a lot of little points that a lot of attention has to be paid to.
You have to be careful about antigen handling, about antigen deposition in the skin, about reading the tuberculin delayed-type hypersensitivity response, which is inherently an inflammatory response locally. It peaks at 48 to 72 hours. The patient has to return for interpretation, and there are almost always reactions to the antigen. That's what it's all about. That reaction is what you read, and occasionally vesiculation and necrosis occur. So there are some adverse effects from that antigen preparation.
We've learned to use the tuberculin skin test to good effect in identifying people who have latent tuberculosis. I think it's important to recognize that there are some shortcomings from a false negative point of view, specifically when we come to the diagnosis of active tuberculosis. Ten to 15 percent of cases of active tuberculosis have a negative tuberculin skin test, giving us a sensitivity not of 100 percent, but in fact closer to 85 or 90 percent -- in part, because tuberculosis is itself an immunosuppressive disease and in part because of some inherent deficiencies of the tuberculin skin test.
Some of those deficiencies revolve around the application. Again, you have to apply it just to the intradermal layer. If it's too deep, the antigen is picked up by blood flow and it's not there 48 hours later for a delayed-type hypersensitivity reaction to occur with.
There are also problems with the handling and storage of PPD. Finally, the immune status of the patient, even patients who appear to be immuno-intact, may be immunosuppressed to some extent. All these cause false negative reactions.
But the major problems with tuberculin skin tests are in another area. Specifically, the test has to be given and patients need to return for a reading. That's a problem. In many settings -- myself, I'm at UCSD Med Center and I run the TB control Lab and I run the skin testing lab. We have well-trained technicians, highly motivated individuals. About 30 percent of our patients do not return for their reading of the tuberculin skin test.
So the antigen is placed. All the costs involved in that are undertaken, but the information is not harvested. That's not a unique experience. That happens in many situations. Patients do not return for tuberculin skin test reading.
Some people say, well, gee, you know, if you're only using it to identify latent TB and they can't come back for the reading, are they going to come back for the treatment? Well, that is a problem, but that's only part of the problem. There's also the epidemiology. There's also understanding how much is tuberculosis a problem in this population. You simply don't know if 30 percent of the readings aren't made. Not to mention the cost implications of not only applying the skin test, but followup and re-followup. These are major problems.
There are a number of inaccuracies in the measurement of induration. Measuring the size of a bump in the skin is inherently imprecise. Often we have inexperienced operators. Anybody feels they can read the amount of induration, but to read it accurately, to read it within the limits that CDC would like of plus or minus 2 millimeters, is not so easy.
But even under the best of circumstances, a 2-millimeter difference is a significant difference. That imparts another problem, which is subjectivity. There's a lot of subjectivity in reading a skin test. This has been demonstrated. There are a number of preferences. Some of these biases are conscious and some of the biases are unconscious and very difficult to control.
There are also false positive tuberculin skin tests due to BCG, mycobacteria other than tuberculosis, particularly avium. These are very common in the populations that we're trying to deal with latent TB.
The whole southeastern United States has a tremendous problem with hypersensitivity to Mycobacterium avium. BCG is very commonly used in many countries from which immigrants come to the United States, and tuberculosis or reactivation of tuberculosis in the immigrant population is a major problem in the U.S. today. At least 50 percent of the new active cases develop in immigrant populations. So this is the target of the latent TB focus, and this is a problem for the identification of patients who have latent tuberculosis infection.
I want to talk about the discordance. I'd like to direct your attention to this slide because it's really quite important. We have two products on the market that are approved by the FDA for tuberculin skin test antigens, specifically Tubersol and Aplisol. They were recently studied by Dr. Villarino from the CDC, and a publication in JAMA describes that these two antigens are equivalent and can be used both to measure tuberculin skin test reactivity.
But look at the results that were obtained here initially in a low-risk population, 1,555 patients. This is with equivalence. We have 10 who had a positive to Aplisol and a positive to Tubersol, and the discordance was 3 and 18 with a Kappa of 0.48. Under most circumstances one would be a little bit concerned about saying that those are equivalent, but in fact they are.
The reason they are is because it's recognized by the manufacturers, by the FDA, by the scientific community, that the tuberculin skin test is not a precise measurement. You cannot get 100 percent agreement. This, a Kappa of 0.48, is considered agreement.
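The Kappa quoted for the low-risk population can be reproduced from the counts given in the testimony. The sketch below assumes the standard Cohen's kappa formula for a 2x2 table and infers the concordant-negative count by subtraction (1,555 minus 10, 3, and 18); the function and variable names are illustrative only. Which product each discordant count belongs to is not stated, but kappa is symmetric, so the result does not depend on that assignment.

```python
def cohen_kappa(both_pos: int, only_a: int, only_b: int, both_neg: int) -> float:
    """Cohen's kappa for two binary tests applied to the same subjects."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n           # observed agreement
    a_pos = both_pos + only_a                      # positives by test A
    b_pos = both_pos + only_b                      # positives by test B
    # Agreement expected by chance from the marginal totals.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / (n * n)
    return (observed - expected) / (1 - expected)


total = 1555
both_pos, discordant_1, discordant_2 = 10, 3, 18
both_neg = total - both_pos - discordant_1 - discordant_2  # inferred, not stated

kappa = cohen_kappa(both_pos, discordant_1, discordant_2, both_neg)
print(round(kappa, 2))  # 0.48
```

Raw agreement here is above 98 percent, yet kappa is only about 0.48, because nearly all of the agreement comes from the large concordant-negative group that chance alone would also produce.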
The same thing is true in another population of patients with current tuberculosis. The Kappa there was 0.5. I only point this out to help you understand that, yes, tuberculin skin test is the best we have, but there's a lot of room for improvement there.
So I'd like to point out to you the advantages that I, as a clinician, see for the QuantiFERON-TB test. I see us moving from the tuberculin skin test to an objective measurement which is a controlled laboratory test, which has a lot of precision built into it and has the opportunity for much better quality control than the whole setup of tuberculin skin test provides for us.
It offers the advantage of a single patient contact. We'll be able to get the information as to what the tuberculin skin test reactivity in our population is with one visit.
There are no adverse reactions to tuberculin, and this may seem trivial, but in the patients who are reactive to tuberculin they always get pain, discomfort, irritation, whatever.
Finally, the test has a built-in control for reactivity to mycobacteria other than tuberculosis, and I think that's a tremendous clinical advantage.
So, in conclusion, the tuberculin skin test is the only currently approved method to identify T-cell reactivity to tuberculin. QuantiFERON-TB solves several important limitations of the tuberculin skin test. QuantiFERON-TB provides an additional tool for clinicians for the identification of T-cell reactivity to tuberculin.
Finally, clinicians need to have all the available information to interpret the clinical significance of T-cell reactivity to tuberculin. I want to emphasize that the diagnosis of latent tuberculosis infection is an exercise in clinical medicine, and by definition it requires incorporation of the patient's history, the patient's epidemiologic and socioeconomic status, the physical examination, and an evaluation of tuberculin skin test sensitivity, which can be done classically with a regular skin test and, alternatively, what we're here to talk about today, the QuantiFERON-TB test.
Thank you for your attention. I'd like to turn the podium over now to my colleague, Paul Wood, who will talk about the scientific basis of the QuantiFERON-TB test.
DR. WOOD: Thank you, Tony.
My name is Paul Wood. I was the original inventor of the technology behind QuantiFERON when I worked for CSIRO back in the 1980's. I now work for CSL Limited, and through them I act as a consultant to Cellestis, and I have stock in the company.
I want to take you back a bit to when we started. What do we know about mycobacterial infections? Well, one of the things we know is that they induce very strong T-cell responses, one of the distinctions about mycobacterial infections. This is the reason that tuberculin or the tuberculin skin test has been used so many years. We also know that that T-cell reactivity is generated fairly early in infection, and generally maintained for the life of the infection.
On the other hand, we know that antibody tends to come up late in infection and it's more mirrored with the mycobacterial load. So when we started off it was obvious for us to look for another measure of T-cell-mediated immunity, in this case to look for an in vitro assay.
Bovine TB is a very good model for human TB. This is a natural infection, respiratory infection, of an organism, M. bovis, closely related to M. tuberculosis, and, of course, in the early part of the last century, a major infectious disease of humans. The immune response in cattle is very similar to what we see in humans, predominantly a cellular response.
Most cattle that we see now, if you class it as having a disease or an infection like LTB, we seldom see generalized TB in animals. The majority of the animals that we detect have single lesions in a lymph node.
Similar to what you see in humans, we do see active TB. Often it's in older animals and in undernourished animals. The tuberculin test has also been used in cattle for over 100 years. In this case we use M. bovis PPD. It's injected intradermally. In Australia we use the caudal fold.
In Europe, in particular, because of the rates of exposure to other mycobacteria, avium is used in comparative tests. So it's comparative testing that's used extensively in Europe. So I contend that we've actually got a very good model for human TB.
So why choose interferon gamma as the molecule we're going to measure? One of the tasks we were given was to find a test with which you could test 500 to 1,000 animals a day. So, obviously, we needed something we could use very rapidly. We know, as I said, that TB induces a strong T-cell response. We know that interferon gamma is a classical CMI cytokine. For those of you familiar with the Type I/Type II concept of T-cells, you know that interferon gamma is a classical Type I cytokine.
We also know it's produced in vitro in response to specific antigens, and it's created in measurable and stable amounts. Very importantly, because we wanted to use whole blood, because, again, as I said, we're looking to test lots of animals in a single day, it's absent from the normal circulation. There's an extensive literature which is growing all the time showing the importance of interferon gamma in TB infection.
The assay in cattle, which we call Bovigam, is very similar to what Jim has just described. It uses heparinized whole blood. In this case we substitute bovine PPD for M. tuberculosis PPD. We use avian PPD as a comparator, and we don't use the mitogen. As I said, this was the earlier version and we were testing whole cattle, and you classify TB in cattle on the basis of herd diagnosis.
So you incubate overnight, and again if there are specific cells there, they respond and secrete interferon gamma, which we harvest the next day and use a sensitive enzyme immunoassay to detect. In this case the monoclonal antibodies are detecting bovine interferon gamma as distinct from human interferon gamma, as is the case with QuantiFERON.
Let me show you some basic raw data. This is the data we generated early, and we've got three groups of animals here. In control animals you see no response or very little response to the PPD's in either of the controls, as you can see in the first two animals there. However, with M. avium-infected animals, you see a distinct response to the M. avium PPD. These are just raw OD's I'm showing you here. It's greater than what we see to the bovine PPD. Of course, if you have M. bovis-infected animals, you see the reverse of that response.
As you can see there, this also shows, the point that Jim made, the cross-reactive nature of these antigens in the sense even in the M. bovis animals you can see quite a strong response to the M. avium. That's why we used this comparative. So it's basically an in vitro comparative assay.
This is the major study that we did in Australia. So in the study we had over 6,000 animals. All of these animals were tested and eventually slaughtered. The beauty of the cattle model is that we are able to post mortem our animals and collect extensive tissues and culture. So our gold standard here was M. bovis culture from those animals. In this case we had 125 culture-positive animals.
As you can see from that data, the interferon gamma assay was significantly more sensitive than the skin test. The figure there -- we got a 65.6 for the skin test -- was equivalent to studies that Francis did in the seventies, where he came up with a figure of about 70 percent.
When we combine the results of the two assays, we slightly increase the sensitivity, but not significantly over and above what we saw with the bovine interferon gamma alone. But I point out again that we're able to actually have a gold standard in this trial.
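Because this trial had a culture gold standard, sensitivity is simply the fraction of culture-positive animals that a test detects. The sketch below works backward from the two figures stated for the skin test (125 culture-positive animals, 65.6 percent sensitivity); the count of detected animals is inferred from those figures rather than quoted, and the function name is illustrative.

```python
def sensitivity(true_positives: int, gold_standard_positives: int) -> float:
    """Fraction of gold-standard positives that the test also calls positive."""
    return true_positives / gold_standard_positives


culture_positive = 125        # stated: M. bovis culture-positive animals
skin_test_sensitivity = 0.656  # stated: skin test sensitivity

# Inferred: the number of culture-positive animals the skin test detected.
skin_test_detected = round(culture_positive * skin_test_sensitivity)

print(skin_test_detected)
print(f"{sensitivity(skin_test_detected, culture_positive):.1%}")
```

The same function applies to the interferon gamma assay once its count of detected animals is known; that count is not given in the testimony, so it is not filled in here.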
More importantly as a scientist, I'm pleased to say that our studies have been confirmed by numerous publications. There are now over 20 published studies in 11 different countries. We have 150,000 animals that have been tested in those studies. We're coming up with an overall sensitivity of approximately 90 percent with a good specificity.
What we see in those studies, people have used different cutoffs because people's programs around the world change. So if you're looking for eradication, which is what we did in Australia, then we maximize sensitivity and we sacrifice a little bit in specificity. In other circumstances where you want high specificity, then you can adjust your cutoffs. But, overall, all of these studies confirm the results we saw in the Australian trials.
So what are the lessons we've learned from the bovine assay? Well, in the bovine assay we have found that in general it's more sensitive than skin testing. It's able to detect animals early in the infection. We did in our studies in Brosboteland in New Zealand and also the British now have shown that generally within four weeks of infection -- this is with a low dose, 10-to-the-4 CFU -- you see a positive response. It's maintained for a significant period. We followed animals for three years, and although the actual level varies, they remain positive for all of those three years.
It's now used in a variety of countries, including here in the USA. With the white-tail deer problem in Michigan and the spread to cattle, it's now being used in the USA.
So, in conclusion, I believe that the whole blood interferon gamma assay is applicable to other mammals. We've now spread it, the technology. We have a primate-based assay that's going through finalization. We've developed a cervine assay we call Cervigam. You see a thing coming through there. And people are now developing it for other species.
But, importantly, what you'll hear about today is the human assay. The QuantiFERON assay uses exactly the same technology that I've just gone through in the bovine. It's my belief that the bovine data gives us a good start and extensive validation of the technology.
I'll now hand it over to Jim Rothel, who will take you through the clinical data on the QuantiFERON assay.
DR. ROTHEL: Thanks, Paul.
Tony's gone through the current situation with TB diagnosis, and Paul's just given us a nice overview of the scientific basis of the test in the bovine model. I'm now going to talk about the initial clinical studies that were conducted largely in Australia and then move on to the pivotal studies, the CDC and the WRAIR studies which we're using as the basis for a PMA application.
A large amount of work was done by CSL in characterizing the performance of the QuantiFERON test, and I'll just go over these points here. The limit of detection of the test was found to be 1.5 international units of gamma interferon above the nil control for any individual set of plasma samples. That is, given the nil for a person, in a plasma sample stimulated with a PPD we can significantly detect 1.5 units above the value in the nil.
The linear range of the EIA is on the order of 200 international units per ml. Looking at reproducibility of the test, which is an important aspect, we looked initially at the blood culture phase, the first phase of the test. Looking at replicate cultures, we found the intraclass correlation coefficient to be greater than .95, indicating excellent reproducibility between the blood culture phase.
Looking just at the EIA phase, again interferon ELISA, that was found to be highly reproducible as demonstrated by both within-plate and between-plate coefficients of variation being less than 10 percent.
Looking at the test overall, looking between blood samples collected and sent to different sites and assayed by different operators, the ICC statistic again was found to be .948, indicating excellent reproducibility.
So after establishing these test parameters, the initial trial we did was that reported by Streeton, et al., in the IJTLD journal. This trial was set up to establish the cutoff for the test. We enrolled 407 individuals who were deemed by the ATS/CDC guidelines as being uninfected with TB -- that is, class 0 individuals by those guidelines -- and 182 individuals deemed as having latent TB infection by those same guidelines.
After testing blood from those individuals in the QuantiFERON assay, we then analyzed the data by ROC curves. This established that the appropriate measure of cutoff is this thing we've called "percentage human response" here, and I'll explain that in a little bit more detail later. We established that should be set at 15 percent. Using this cutoff on that data -- and this data was used to generate the cutoff, but we'll point it out to you anyway -- specificity was found to be 97.6 percent and sensitivity 89.6 percent.
We talked earlier about having avian PPD as well as human PPD in the tests, so it's a comparative-type assay. We have to determine the optimal method of distinguishing between TB infection and reactivity to MAC in this case or MOTT, using MAC as a representative of a MOTT mycobacteria.
For this we obtained blood from 50 individuals with culture-confirmed TB infection and 10 individuals with culture-confirmed MAC lymphadenitis. This graph there on the bottom, which is hard to see no doubt, but you can get the feeling, up the side here is the second cutoff we've chosen -- I hope I don't zap you with this laser pointer over there -- is the percent avian difference, which is the second cutoff we've chosen. Again, I'll explain it in a minute.
These individuals here are TB patients. These are the patients with MAC infection. The line across there, which is set at minus 10 percent, was chosen as the optimal cutoff to discriminate between those with TB infections and those with reactivity to MOTT.
So just to go through how those two cutoffs are chosen, the percentage human response is the response of an individual to human PPD expressed as a percentage of their response to the mitogen control well. These values are both corrected for nil.
The percent avian difference is calculated by subtracting the response to avian PPD from that to human PPD and expressing that as a percentage of the response to human PPD, again corrected for nil. That sounds a little complicated, but it's a very simple calculation. But what that essentially says is, what is the predominant response to? Is it to human PPD or to avian PPD?
One other or two other factors have to be included in the cutoffs used for the tests. As I told you earlier, the limit of sensitivity for the QuantiFERON EIA is 1.5 units per ml. So, therefore, to obtain a valid test result for any individual, their mitogen response has to be at least 1.5 international units above the nil sample for that individual. If it's not, that's an invalid test and we can't obtain a test result for that person. Again, that's a rare event. Similarly, seeing the sensitivity of the EIA is 1.5 units above nil, the human PPD minus nil has to be greater than that level to obtain a positive response in TB.
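The cutoff rules just described can be sketched in Python. This is only an illustrative sketch and not the manufacturer's software: the function name, the argument names, the handling of values that fall exactly on a cutoff, and the assumption that a positive result requires both cutoffs to be met are choices made here for illustration. The 15 percent, minus 10 percent, and 1.5 units per ml figures are the ones quoted in the presentation.

```python
def interpret_qft(nil, mitogen, human_ppd, avian_ppd,
                  human_cutoff_pct=15.0,
                  avian_diff_cutoff_pct=-10.0,
                  detection_limit=1.5):
    """Classify one subject from four interferon-gamma readings (IU/ml).

    Hypothetical helper: names and tie-breaking rules are assumptions.
    """
    # Every response is corrected for the nil (unstimulated) well.
    mitogen_net = mitogen - nil
    human_net = human_ppd - nil
    avian_net = avian_ppd - nil

    # The mitogen control must be at least the detection limit above nil,
    # otherwise no result can be reported for this subject.
    if mitogen_net < detection_limit:
        return "invalid"

    # Human PPD minus nil must exceed the detection limit to be positive.
    if human_net <= detection_limit:
        return "negative"

    # Percentage human response: human PPD as a percentage of mitogen.
    pct_human_response = 100.0 * human_net / mitogen_net

    # Percent avian difference: (human - avian) as a percentage of human.
    pct_avian_difference = 100.0 * (human_net - avian_net) / human_net

    # Assumption: both cutoffs must be met, with cutoffs treated as inclusive.
    if (pct_human_response >= human_cutoff_pct
            and pct_avian_difference >= avian_diff_cutoff_pct):
        return "positive"
    return "negative"
```

Calling the function with `human_cutoff_pct=30.0` applies the stratified cutoff proposed later in the presentation for individuals with no identified risk factors.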
So now that we have established these cutoffs, we went ahead and did some clinical trials, some more clinical trials. What we would have loved to have done is to look at the response of individuals before being infected with M. tuberculosis and then following MTB infection, but ethically that's a very difficult experiment to do. So we did the next best thing and used an MTB complex organism, albeit very attenuated, which is M. bovis BCG.
We recruited 53 low-risk TB individuals, that being medical students in Australia who are routinely BCG vaccinated, at this university at least, and tested them with QuantiFERON both before BCG vaccination and then five months after BCG vaccination. The data showed that 92 percent of these medical students showed an increase in their QuantiFERON response after BCG vaccination, and the amount of this increase was threefold above that found prior to BCG. I should add here that the vast majority of these were still below the 15 percent cutoff that was established for the QuantiFERON-TB assay, but that would be expected, knowing the BCG is a highly attenuated MTB complex organism.
First, we feel that this study demonstrates that an increase in QuantiFERON-TB response is generated following MTB infection. We have now established from the Streeton study and from these other studies that the majority of people who are not infected with TB don't respond in their QuantiFERON-TB test, and the majority of people that have latent TB infection give a positive response in QuantiFERON-TB. But what about those with active TB disease?
To look at this, we conducted a multi-center study in Australia, nine different hospitals around Australia, and recruited 129 individuals with culture-confirmed TB disease. Eighty-one percent of these patients were found to be QuantiFERON-positive, and this established that the test works in cases of active TB disease, where commonly the immune response is quite depressed to tuberculin.
That's a brief outline of the clinical studies that were conducted in Australia. Let's move on to the pivotal studies that were conducted by the CDC and Walter Reed.
First, I want to talk about the constraints of running clinical trials of any test for latent TB infection. There's no gold standard for latent TB. Tony told us about it before, and there just isn't a standard for it. Now TST is an aid to detecting tuberculosis infection. As Tony eloquently put, it's not a gold standard. It's definitely not a definitive indicator for LTB. So, therefore, we didn't have a gold standard. What do we do?
So the data analysis method we used was to recruit individuals with no known risk factors for TB infection and then use these to determine what is termed apparent specificity. We called it apparent specificity because we cannot guarantee that some of those individuals did not have latent TB infection, although the chance of that is very low.
To determine the sensitivity of the test for active TB disease, we do have a gold standard. It's culture of the organism. So for that, we can recruit culture-confirmed TB cases.
But the last group there on that slide is looking at the sensitivity of the test for latent TB infection. Without a gold standard, all we can do is recruit individuals with identified risk factors for latent TB infection and look at the concordance with this suboptimal standard TST. That's the best available to us.
So the CDC study recruited 1,500 subjects, or that was the goal. There were five different sites across the U.S., which was San Francisco, San Diego, Baltimore, Newark, and Boston. The main aim was to look for a concordance between QuantiFERON and the TST.
The four groups enrolled: a low-risk group, 98 individuals, and that was to look for specificity of the test; a medium to high-risk group that included contacts of active TB cases, immigrants from high-risk countries, homeless people, et cetera; TB suspects, people suspected of having active TB disease. And these three groups represent the intended population for QuantiFERON TB.
The fourth group that was included in the CDC study were those individuals that had culture-confirmed TB in the past and had completed their therapy for that within the previous two years. They're not in the intended population. For many reasons, they are not appropriate for us to study and we are not presenting any data from those.
For the Walter Reed study, there were nearly 1,700 recruits at the Great Lakes Navy Station in Illinois. These were stratified into three groups, the first group being 397 individuals with no identified risk factors for TB.
The second group had one limited risk factor, which is they were born in or recruited into the Navy from a U.S. state that had a TB rate of 10 per 100,000 or greater. This is a very low risk factor, I'll acknowledge that. What we were trying to do here was to make group one as TB risk-factor-free as we possibly could, but I'll acknowledge that group two has a very low risk factor as well.
Group three individuals were those who had identified risk factors. The majority of them were born overseas, although there were some recruits that reported contact with active TB in the past.
As for adverse events, there were no adverse events reported in the CDC study for QuantiFERON-TB, whereas 9.4 percent of individuals in the CDC study reported an adverse event for the TST.
Looking at the sensitivity first of QuantiFERON-TB for active TB disease, there were 94 people enrolled into the CDC study group three. They're the TB suspects group. After culture was performed, 54 of these were found to be MTB culture-positive. Forty-four of these, or 81.5 percent, were QuantiFERON-TB-positive, indicating that the sensitivity for QuantiFERON-TB, using that trial 15 percent cutoff we had established, was 81.5 percent.
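The sensitivity figure quoted here is a simple proportion, 44 of 54. The Python sketch below reproduces it and also attaches a Wilson score interval. The interval was not quoted by the speaker; it is added here only as an illustration of how much uncertainty a sample of 54 culture-positive subjects carries, and the function name is hypothetical.

```python
import math


def proportion_with_wilson_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    z=1.96 corresponds to an approximate 95 percent interval.
    """
    p = successes / total
    z2 = z * z
    denom = 1.0 + z2 / total
    centre = (p + z2 / (2.0 * total)) / denom
    half_width = z * math.sqrt(
        p * (1.0 - p) / total + z2 / (4.0 * total * total)
    ) / denom
    return p, centre - half_width, centre + half_width


# 44 QuantiFERON-positive subjects among 54 culture-positive TB suspects.
sensitivity, low, high = proportion_with_wilson_ci(44, 54)
print(f"sensitivity = {100 * sensitivity:.1f}%")  # prints: sensitivity = 81.5%
print(f"approximate 95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```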
Now this has to be the minimum sensitivity of the test for latent TB as well, because it's well acknowledged in the scientific literature that people with culture-confirmed TB disease often have depressed cellular immune responses, including gamma interferon responses.
We now look at the apparent specificity of QuantiFERON-TB in the three low-risk groups, one from the CDC and two from the WRAIR study. Using the TST at 10 millimeters and the QuantiFERON at a trial cutoff of 15 percent, we found the specificity of the TST to be 95.9 percent in the CDC study compared to 91.8 for QuantiFERON, 98.7 compared to 95.5 for the WRAIR low-risk group, and 98 compared to 93.4 for the limited-risk group.
But these individuals, group one individuals, are not recommended by the CDC ATS guidelines to be screened for TB. In reality, they are. The military is a prime example of an institution that routinely screens individuals with no risk factors. So it's important to be able to have a test that works for them.
For the TST, a stratified cutoff of 15 millimeters is used for these individuals. We can do exactly the same thing with the QuantiFERON test, and we have established that a 30 percent cutoff is the optimal cutoff to use in individuals like this with no identified risk factors.
So if we look now at the specificities for these three groups, three study groups, using the TST at the stratified 15 millimeters or QuantiFERON at a proposed 30 percent stratified cutoff for such individuals, you can see that the specificities in general are 98 percent.
Now we're looking at individuals at risk of being infected with latent TB. This is a two-by-two table obviously comparing QuantiFERON with TST. We're looking here at individuals from the CDC study recruited into group one or group two. That's low-risk or high-risk.
For the TST we're using a risk-stratified cutoff where the individuals in group one, we use a 15-millimeter cutoff for the TST, and for group two we use a 10-millimeter cutoff. This is comparing QuantiFERON-TB to the trial cutoff of 15 percent.
You can see that concordance is quite good with 85 percent of the individuals having concordant results with the TST, although there are a significant number of individuals that have discordant results on both sides of the diagonal. Kappa here was .554, indicating moderate, verging on good, agreement.
But if we use the stratified cutoff that we've proposed for group one individuals, what happens to the data? For group one, we use the 30 percent human response cutoff for QuantiFERON and 15 millimeters for the TST, and for group two we use the 15 percent cutoff that was established in the Australian trials and a 10-millimeter cutoff for the TST.
We find that the sensitivity of the test is maintained. The only people that have moved in that two-by-two table are those individuals that were in group one, the low-risk individuals, and we've assumed that they're all negative, that they don't have TB infection.
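The risk-stratified comparison just described can be sketched as a small lookup in Python. This is an illustrative sketch under stated assumptions: the names are hypothetical, cutoffs are treated as inclusive, and the avian difference and validity checks that the full test also applies are left out so that only the stratification is shown.

```python
# Cutoffs quoted in the presentation, keyed by CDC study group.
CUTOFFS = {
    1: {"qft_pct": 30.0, "tst_mm": 15.0},  # group one: low risk
    2: {"qft_pct": 15.0, "tst_mm": 10.0},  # group two: medium to high risk
}


def classify(group, pct_human_response, tst_induration_mm):
    """Return (qft_positive, tst_positive, concordant) for one subject."""
    cut = CUTOFFS[group]
    # Assumption: a reading exactly at the cutoff counts as positive.
    qft_positive = pct_human_response >= cut["qft_pct"]
    tst_positive = tst_induration_mm >= cut["tst_mm"]
    return qft_positive, tst_positive, qft_positive == tst_positive
```

The same pair of readings can land in different cells of the two-by-two table depending on the group: a 20 percent human response with a 5-millimeter induration is negative on both tests in group one but QuantiFERON-positive and TST-negative in group two.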
Kappa for this was .561, again indicating moderate to good agreement. I would point out again that this is a similar, slightly better Kappa than that attained when comparing Aplisol to Tubersol both in low-risk and TB-infected individuals.
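For readers unfamiliar with the statistic, percent concordance and Cohen's kappa can both be computed from the four cells of a two-by-two table, as in the Python sketch below. The function and argument names are hypothetical, and the counts in the usage line are made-up round numbers chosen only to make the arithmetic easy to follow; they are not the study data.

```python
def agreement_stats(both_pos, qft_only, tst_only, both_neg):
    """Return (observed_agreement, kappa) for a 2x2 table of two tests."""
    n = both_pos + qft_only + tst_only + both_neg

    # Observed agreement: the share of subjects on the diagonal.
    observed = (both_pos + both_neg) / n

    # Marginal positive rates for each test.
    qft_pos_rate = (both_pos + qft_only) / n
    tst_pos_rate = (both_pos + tst_only) / n

    # Agreement expected by chance if the two tests were independent.
    expected = (qft_pos_rate * tst_pos_rate
                + (1.0 - qft_pos_rate) * (1.0 - tst_pos_rate))

    # Kappa rescales observed agreement so that chance agreement is 0.
    # (Undefined when expected == 1, i.e. when both tests are constant.)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts, not the CDC study data.
observed, kappa = agreement_stats(both_pos=20, qft_only=5, tst_only=10, both_neg=15)
print(f"concordance = {100 * observed:.0f}%, kappa = {kappa:.2f}")
```

Because kappa subtracts chance agreement, a table dominated by double negatives can show high percent concordance and only moderate kappa, which is the pattern reported in the presentation.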
So what are the potential reasons for the discordance we've just seen? There was random variation, as you'd expect to see; again, as in the Tubersol versus Aplisol story. If we look at the individuals that were positive in the TST but negative in the QuantiFERON test, 13 out of 80 of them demonstrated MOTT reactivity by the QuantiFERON test, and MOTT reactivity is a well-known source of false positive TST reactions.
There was a significant association with individuals being BCG-vaccinated having that same response, being TST-positive, QuantiFERON-negative, suggesting that perhaps the TST is more affected by BCG vaccination than is QuantiFERON-TB.
Two other factors were found, age and gender, and we really don't have any explanation for why they should be associated with discordance.
So now we've shown that QuantiFERON-TB detects M. tuberculosis-specific T-cell responses. We've demonstrated with people that don't have TB infection the vast majority are negative in the test, 98 percent of them. We've shown that individuals that definitely have TB disease, as demonstrated by culture, 81.5 percent were found to be positive. And we've demonstrated good concordance with the TST at 85 percent in those at risk of LTBI.
But although we can explain some of the discordant results found by MOTT reactivity, as demonstrated by QuantiFERON in those TST-positive, what's the best way of demonstrating this? It's looking back, I believe, at the extensive data from the bovine animal model, which is an excellent model for TB for humans.
This slide shows two-by-two tables. The top table here is the data that you've seen before for the CDC group one and two individuals combined. The data down below is a study from the Wood, et al., paper, the key publication that Paul Wood referred to earlier with 86,000 cattle tested.
What I want you to focus on here are the numbers, the percentages in brackets. These percentage values are the percentage of individuals, or in this case individuals, who are positive to one or both of the tests. So 48 percent of individuals were positive to both of the tests as compared to positive to any.
The same thing down here for the animals, the cattle in that study. You'll see there very strong similarity between the percentages of discordant values found between the human test and the bovine test. So the same level of discordance is found in the bovine assay.
But the big thing about the bovine test is that we could kill the animals, we could take out extensive tissues out of these animals, slaughter them in the laboratory, and culture for M. tuberculosis disease, looking for foci of infection.
If you now look at the data based on culture, stratified by positive culture, you'll see that the animals that were positive to both tests, the TST and the bovine equivalent of QuantiFERON, 87 percent of those doubly positive were found to be culture-positive. But for those that were positive just in the TST and negative by the QuantiFERON or the bovine version of it, 53 of them, only two of them were found to be culture-positive. So 4 percent.
So the sensitivity of the TST, when it was positive only by itself, it was very low. Conversely, if we look at the animals that were positive by the bovine gamma interferon assay and negative in the TST, 55 percent of them were found to be culture-positive.
Paul showed you these figures before, but the TST sensitivity from this study was 65.6 and for the gamma interferon assay it was 93.6 percent. So it's reasonable to extrapolate from this bovine model and to suggest that, for discordant results in the human test, those who are gamma interferon-positive are more likely to be truly TB-infected.
Just to go through the conclusions, Tony told us there definitely is a medical need for an improved diagnostic test for latent TB, as indicated by the IoM report that came out last year. Paul told us about the technology, and it's based on sound, very well-established scientific principles. Hopefully, I've just shown to you that QuantiFERON is a very sensitive test and highly specific for the detection of TB infection.
QuantiFERON has a major logistic advantage over TST, and that is people don't have to come back to get a result. As Tony told you, 30 percent, or in many cases many more than 30 percent, of individuals you don't get a result when using the TST. With the QuantiFERON test, you will get a result close to 100 percent of people.
QuantiFERON is a controlled, laboratory-based test. It's not subject to those subjective issues that TST is well-known for. It accounts for reactivity to MOTT, and the initial data says that it appears to be less affected by BCG than is the TST.
I'd just like to conclude by showing this slide. We believe the data provide reasonable assurance of the safety and efficacy of QuantiFERON-TB as an aid in the detection of infection with Mycobacterium tuberculosis.
Thank you for your attention.
CHAIRMAN WILSON: Thank you.
At this time I would like to invite the panel members to begin asking questions. Dr. Durack?
DR. DURACK: Several short questions for Dr. Rothel. If this test becomes widely used, which I'm sure you'd be pleased to see, what is the story about the supply of mitogens? Is it adequate, reliable, quality-controlled, and would there be enough for an extensive application of this test?
DR. ROTHEL: Yes, the mitogen is a commercial product that I bought from Streeton -- I'm just trying to think who -- but it's commercially available and there's no problem with supply of it.
DR. DURACK: Standardized and reproducible?
DR. ROTHEL: Standardized and it's standardized in-house as well.
DR. DURACK: A question about the nil response: Do you see much variation in the nil response? What's the range?
DR. ROTHEL: The general range of the nil response would be from an optical density, if we talk optical densities, from zero to about .07.
DR. DURACK: Okay.
DR. ROTHEL: Occasionally, you do get an individual that has a higher response in the nil, and this is due to competing factors such as heterophile antibodies that are common when you're using an ELISA that uses unique plasma samples. The assay, the gamma interferon EIA, is heavily formulated to reduce heterophile antibody activity, but occasionally perhaps some person has very high reactivity there and we don't compete it all out. But, again, it doesn't affect the result of the test because that variable is subtracted from all the other plasma sample wells.
DR. DURACK: A question regarding the human versus the avian test: How often do you see an equivocal response, if you like, where they're about equal? Would you comment on that? Does it happen? How would you interpret it?
DR. ROTHEL: The cutoff has been fairly extensively backed up by the data we've seen, I must say. In the vast majority of cases -- and this is all off-the-top-of-my-head stuff without having the data in front of me to show you -- but in the vast majority of cases a person who is infected with TB, such as a culture-confirmed TB case, the response to human PPD I would guess would be at least twice that to avian PPD, and the inverse in the few individuals who are seen that have had MAC infection.
DR. DURACK: Have you seen examples where the response is about equal?
DR. ROTHEL: Off the top of my head, I'm sure we have, but I can't come up because I don't really know any of them. I can't think of any specific examples.
DR. DURACK: Right. One last question: You've touched I think several times on this, but the degree of the response, the quantitative response, can you comment on the correlation between active disease versus latent disease and the correlation coefficient?
DR. ROTHEL: Yes, quite often people with active TB disease you get very low responses within the sensitivity of the test to both mitogen and to the human PPD, but they still come out positive, whereas, typically, individuals who would be suspected of having latent TB infection, the responses are much more robust.
DR. DURACK: Thank you.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: Yes, a couple of questions. I know that the datasets are limited, but in the studies you've done and in the studies reported, is there any information regarding the test when applied to children?
DR. ROTHEL: We've excluded, limited our tests to not cover children, but, yes, there is a large body of data available in Australia from specifically one physician, Jonathan Streeton, that original paper, who has been using the test for many years. He routinely uses it in children in contact situations, and that has got excellent results. But we realize we have to do studies in children to be able to gain approval for use in that population.
DR. COCKERILL: And among the patients you studied or other studies that have been done, patients who were leukopenic, any information regarding the validity of the testing in those patients?
DR. ROTHEL: Again, excluded from there, labeled on things, but, yes, we have done studies in HIV-infected individuals both in Kenya -- and I think there was attached to your panel packet a summary of that study. Also, some studies have been initiated in Australia looking at the response to mitogen relative to CD-4 counts in HIV patients.
It's actually quite surprising; quite a number of individuals with CD-4 counts less than 50 give quite strong responses to the mitogen still. Then, again, others don't. But generally, if they have CD-4 counts over 200, 200 per ml., they do have a measurable response. Less than that, it gets a bit equivocal.
CHAIRMAN WILSON: Dr. Reller?
DR. RELLER: Dr. Wood, in the schema you used blood collected in tubes with heparin. What about other anticoagulants and the effect on the test: EDTA, citrate, SPS, et cetera?
DR. RADFORD: Yes, we tried sodium citrate, one or two other anticoagulants. None of them actually work. When you look at it, because we're using whole blood, they actually interfere with the interaction between the antigen-presenting cell and the T-cell. So heparin is the only anticoagulant that will work in the system.
DR. ROTHEL: That's also been validated in the human assay.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: I didn't tell you what I do at Hopkins, but I am an infectious disease consultant as well as a microbiologist, and I'm telling you this to give you my orientation, which is to say that I very strongly agree with the advantages of a laboratory test as opposed to a skin test.
I do see some very basic questions here in its current formulation, not what else may we do, but I think perhaps I can show it best if we look at the concordance. I was, as an example, looking at the concordance in the WRAIR study of all tests. In our book it's on page 77.
Now the WRAIR group does not match the TB population that we usually look at, which is much more diverse in terms of age and underlying pathology, but it has the advantage of being young, Navy recruits, so it can get a good look at the lower-risk group. If we look at all comers in the WRAIR group, looking at the positives, because to me it's the positives that are important, not the negatives, because we're looking for latent disease, there are 18 in which there's concordance. There are 105 in which the QuantiFERON is positive and the skin test is not. So, clearly, they're measuring different things. There's a tenfold difference.
We do know that this group includes the low-risk which has a high percent false positives on the tuberculin test, meaning that many of the 18 that were in the low-risk population are false positives. That raises the very serious question about the false positives with this test.
If we look on the next page at the low-risk group using the 10-millimeter cutoff -- I think it's two pages -- the moderate-risk category, primary-risk individuals, we similarly see a skewing, not quite as bad in this one, but you end up with a 15.1 percent positive rate for the QuantiFERON. Now if we translate that 15 percent into positive per 100,000, which is the way it's expressed generally, that group should have 10 patients or 10 subjects per 100,000 individuals which is positive. If this test were correct, it would come out to something like 15,000 patients per 100,000 positive.
And the same is true as you look at the others. The number that would be called positive, if we use this particular test, particularly in the low-risk population, would be between -- well, in the CDC study it would be 8,300 per 100,000. That's way out of line with what all the epidemiologic studies have said it should be. And your slide used the number 10 per 100,000 for your category two. This comes out, when you add the zeroes, to 15,000.
So I think it's going to be very important that we understand why this is calling so many more people positive or we're going to have a very abrupt jump in our incidence of tuberculosis in the United States that we're going to have to explain.
DR. ROTHEL: Sure. Can I reply to the last bit first, get that out of the way? The 10 per 100,000 is the rate for active TB cases per 100,000 individuals. We're looking at latent TB infection, which is meant to be at least 10 to 100 times higher than that for active TB cases.
Did I understand your question wrong? Is it that --
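The figures in this exchange are quoted on different scales, so it may help to put them on one. The Python sketch below converts a percent-positive rate into a rate per 100,000 and applies the "10 to 100 times" multiple that Dr. Rothel cites for latent infection relative to active cases. The function names are hypothetical and the sketch only restates the arithmetic; it does not settle which prevalence figure is correct.

```python
def percent_to_per_100k(percent):
    """Convert a percentage into a count per 100,000 people."""
    return percent * 1000.0


def latent_range_per_100k(active_per_100k, low_multiple=10, high_multiple=100):
    """Scale an active-case rate by the multiples quoted for latent infection."""
    return active_per_100k * low_multiple, active_per_100k * high_multiple


# Dr. Charache's figure: a 15.1 percent QuantiFERON positive rate.
qft_positive_per_100k = percent_to_per_100k(15.1)

# Dr. Rothel's figure: 10 active cases per 100,000, with latent infection
# said to be at least 10 to 100 times more common.
latent_low, latent_high = latent_range_per_100k(10)

print(f"QuantiFERON positives: about {qft_positive_per_100k:,.0f} per 100,000")
print(f"Latent infection on the quoted multiples: {latent_low:,} to {latent_high:,} per 100,000")
```

On these numbers the quoted multiples correspond to a latent infection rate of 0.1 to 1 percent, while the observed positive rate corresponds to roughly 15,000 per 100,000.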
DR. CHARACHE: Actually, the numbers don't come out quite like that from the literature which I saw.
DR. ROTHEL: Yes.
DR. CHARACHE: But, in any event, what we're seeing is that for those at lower risk the QuantiFERON has eight to ten times the number of positives as the skin test does.
DR. ROTHEL: Sure, sure.
DR. CHARACHE: And then we know that it's not as sensitive as the skin test when we get to the active TB model, where it's less sensitive. So I'm questioning what this problem is that we're seeing with the discrepancies between these tests that is so striking and how do we adjust for them.
I'm interested in knowing what your discordance effects of age and gender are, and in the CDC study they noted there was discordance differences in results according to the particular study site that did the evaluation, in their table, that it mattered whether you were in site E or site A in terms of results. So I'm wondering if you can help us understand some of the factors that we then could modify or adjust, as you have considered adjusting your criteria for the lower-risks, and so on.
DR. ROTHEL: There's a lot of questions in what you have just asked me. I hope I can remember to answer them all.
The first one, if we go back in order, as far as the sensitivity in the low-risk WRAIR individuals, we're proposing to use a 30 percent cutoff in those individuals. We're not proposing to use the 15 percent cutoff. That large number you're talking about of individuals positive in QuantiFERON/negative in TST largely disappears if we use a 30 percent cutoff.
What we can assume in those individuals is none of them are truly infected. That's the assumption we make, and that was the basis of the study. So a similar number of falsely positive by TST is falsely positive by QFT, would be my response to that.
DR. CHARACHE: Now wouldn't that same propensity for false positives perhaps be carried over into the other populations? They're just hidden?
DR. ROTHEL: Sure, sure, and there's wobble in any biological test like this. That's the range of variables I was talking about in my presentation. We're always going to get false positives in any test. It's the nature of biological tests.
Now where were we?
DR. RADFORD: I think Tony's actually got a comment to make to this topic. So I would just ask him to speak.
DR. CATANZARO: I just wanted to talk about the purpose of screening in the Navy, for example. Obviously, you and I are both clinicians, and we're interested in patients with disease. But in that setting the purpose is actually to find individuals who are completely free of any suspicion of disease.
So the fact that a large number of Navy recruits were correctly identified as being free of tuberculin sensitivity is the object of the exercise. Now I grant you that this presents more workload to the clinician to look at these people who are reactors to tuberculin and figure out whether that reactivity is due to tuberculosis disease or due to some other immunologic phenomenon.
But I think as a public health person, and particularly as someone who's going to put young men on a Navy submarine, for example, the fact that you've identified a huge number of individuals who are clearly free of tuberculin reactivity is the purpose of the exercise.
DR. CHARACHE: I'm concerned about how it's going to be used and the toxicity of the drugs that will be applied, if we have false positives. So I'd like to see if we can't get rid of some of them.
DR. CATANZARO: I think you're absolutely right. That was the purpose of my presentation saying that a positive reaction to tuberculin skin test or, for that matter, to QuantiFERON does not result necessarily in the application of therapy. There's a clinician between the two who plays a very important role. There are many people who have tuberculin sensitivity with the tuberculin skin test who are not candidates for INH prophylaxis, and the same will be true for QuantiFERON.
DR. CHARACHE: If you have a positive QuantiFERON, knowing that there may be a very high false positive rate, based on the low-risk group where we can perhaps see it best, what would you tell the doctor to do to prove there was or wasn't latent TB? Would you suggest they do a skin test or --
DR. CATANZARO: No.
DR. CHARACHE: -- how else would you decide whether to use antibiotics or not?
DR. CATANZARO: No. First of all, we propose the gradation of having a 15 percent cutoff and a 30 percent cutoff. This is analogous to what we do with the tuberculin skin test: a 10-millimeter cutoff in certain situations, a 15-millimeter cutoff in other situations.
But what I would recommend to that individual, just like I do with tuberculin skin test reactivity -- and I've been doing this for the past 30 years, and you have as well -- you see an individual who's got a 10-millimeter tuberculin reaction. You get a history.
If that person has, for example, been brought up in Peru and been given BCG three times as he was growing up and now is 25 years old, it's likely that that 10-millimeter reaction was due to BCG. If that individual was raised in Atlanta in a low socioeconomic -- excuse me -- was raised in Atlanta, had a 10-millimeter reaction, chances are that it may well be avium. On the other hand, if that person was raised in California, the son of a mother with active tuberculosis when he was 10 years old, that 10-millimeter reaction is most likely due to tuberculosis.
So you have to apply, I think, clinical judgment to the tuberculin reactivity with tuberculin skin tests and the same is required by QuantiFERON. I do think that there are a large number of false positives. I think that this wobble effect of getting a different reaction to QuantiFERON than you do with tuberculin skin test is exactly the same as we see with the tuberculin using Tubersol versus Aplisol. I don't think you'd identify one of those as false positives, just different.
DR. CHARACHE: Well, yes, I think that obviously suggests that the product has different antigenic properties in terms of stimulating your immunity. Here we have a very different mechanism. I'm satisfied, I think, as all of us are, that if we have a simple test that can be done effectively to screen for experience with Mycobacterium tuberculosis, it would be used very widely. I certainly favor this.
I'm questioning how to make it more precise, because when we do the math in its current form, we would have statistics that are quite disparate from past experience.
DR. CATANZARO: I want to make one more comment, if I may, regarding the question you asked, "Would you do a tuberculin skin test?" I would no more do a tuberculin skin test for a questionable QuantiFERON than I would do a Connaught skin test with a questionable Aplisol. I think that to do that is trying to beat a technology beyond its capabilities.
The disparity between Aplisol and Connaught tuberculin skin test reactivity is not due to measuring very different things. It's due to the inherent error in the biological assessment of tuberculin skin test reactivity. You have to go to a completely different system; for example: history, physical exam, et cetera. That's my point of view anyway.
DR. CHARACHE: Well, I would think it might be helpful to see if we can understand better some of these discrepancies and what it looks like when you use 30 in the most unlikely to have TB. We haven't seen that data. But, also, when we look at the higher groups, we can still see things that we really can't explain too easily.
There was one comment that there were 55 patients tested who had discrepant TB skin test compared to the QuantiFERON, and there were 39 that were retested. Of those 39 that were retested, only 18 were repeat positive with the QuantiFERON. So I think these are some of the questions I have in terms of how we can improve it.
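(Editor's illustration, not part of the record: a minimal sketch of the retest arithmetic just quoted, taking the figures of 18 repeat positives among 39 retested patients at face value. Whether 39 is the right denominator is an assumption, since not all 39 may have been QuantiFERON-positive at first testing. The Wilson score interval is the editor's choice of method for showing how wide the uncertainty is on so small a sample; it is not an analysis presented to the panel.)

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Figures as quoted in the transcript; the choice of denominator is an assumption.
retested = 39
repeat_positive = 18

proportion = repeat_positive / retested
low, high = wilson_interval(repeat_positive, retested)
print(f"{proportion:.1%} repeat positive (95% interval {low:.1%} to {high:.1%})")
```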
DR. ROTHEL: Introduce yourself.
DR. RADFORD: My name's Tony Radford. I'm the Chief Executive Officer of Cellestis, and I also have a shareholding in the company.
I think that I could perhaps best address your question by putting up an overhead which looks at the percentage positive in all of the studies that we've done, all the risk groups from one, two, three and all the Walter Reed studies, using the risk-adjusted cutoff at 30 percent for low-risk or what you might call almost no risk groups. I think if you look at the percentage figures there, you will see that the percentage differences are really quite small and won't lead to major changes in epidemiological beliefs in the incidence of tuberculosis. That slide's just going up behind you.
You'll see that it's the Walter Reed low-risk group on the left, again, using risk-stratified cutoff both for QFT and the tuberculin skin test. What you can see is that the percentages very closely parallel each other in each of the independent groups.
We come up here to, of course, the top. This is the active TB group. You come down here to the at-risk group, and if I go to the Walter Reed primary risk group, what you'll see in that primary risk group, where there is a higher risk of tuberculosis and it is, in fact, a reasonable percentage, they are closely parallel. As we go across into our lower-risk groups, we're applying stratified cutoffs in both cases, and you will see there is no significant change to the incidence of tuberculosis.
So I don't think you'll find there is a major change in the epidemiological beliefs in the country in the incidence of latent tuberculosis using this test.
DR. CHARACHE: Okay, I'm working from the tables in which there are three risk groups rather than six. So I couldn't really relate to this.
DR. ROTHEL: The data I think that you want to see is what I've presented in the talk. On the second one of those specificity slides -- I think you have a copy of the slides -- where we apply the 15-millimeter cutoff to the TST and the 30 percent cutoff for QuantiFERON. I think they're the figures you're wanting to see. Am I correct?
DR. CHARACHE: I'm sure you have it.
DR. ROTHEL: Oh, we do have it, yes. We can put it up.
CHAIRMAN WILSON: Dr. Janosky.
DR. JANOSKY: Just a very quick followup to the question that was asked: Do you have the data to show us the discordance based on age and gender?
DR. ROTHEL: Yes.
DR. JANOSKY: What's the directionality of the discordance of what I'm actually looking --
DR. ROTHEL: Yes, I can talk to it; it is probably easiest. The table that was in your panel pack actually shows the multivariate directional analysis by logistic regression.
Age was associated with a positive TST/negative QuantiFERON. Age greater than 60 was associated with that type of discordance. Male sex was associated with having a positive QuantiFERON/negative TST.
DR. JANOSKY: I did see it in the panel packet, but just to refresh my memory again, you're saying males are more likely to be called positive when they're not and older individuals are more likely?
DR. ROTHEL: Males are more likely to have a QuantiFERON positive/TST negative response than having a concordant response with both tests, either doubly positive or doubly negative. So that was the reference group for all of that discordance analysis: individuals with concordant results.
DR. JANOSKY: Okay. That's what I needed. Thank you.
CHAIRMAN WILSON: Dr. Lewinsohn?
DR. LEWINSOHN: I guess a couple of questions. The first was I think a test that doesn't require coming back to the doctor obviously has some real advantages over the current skin test. So I was just looking over the data that's on page 40 that had to do with exclusions from the trial. I'm just trying to add these up very quickly, but it looked as if about 70 were excluded because of reasons sort of related to the QuantiFERON test; that is, unable to draw blood, insufficient blood, blood clotted, or other QuantiFERON errors, and about 130 were excluded because of TST errors.
I guess my question was, and this is in the context of a clinical trial where things are being done very carefully: What's been your experience with regard to blood being drawn for the QuantiFERON and then ultimately not actually having the test successfully done?
DR. ROTHEL: It's a fairly uncommon event, and a lot of events listed there are quite explainable. One, an incubator went down. I think there were 40 or something blood samples just lost in one event by an incubator going down overnight. Another one was at one of the trial sites and the lady had been there to collect the blood samples, slipped on the snow, on the ice, and broke them all. Yes, there's a few things like that.
You do see occasional blood samples where the people haven't shaken the heparin tube and you get blood clots. There's no point in running that sample.
You do see occasions where a phlebotomist has not collected sufficient blood to do it. Quite commonly, people think, "Oh, I've got a mil in there. We'll take the tube off now and do the next person." That is an occasional thing. It's just a matter of training individuals to say we need at least 5 ml of blood in the tube.
DR. LEWINSOHN: And then some of the requirements are fairly tight. For example, incubating the blood within the first 12 hours, is that an issue for places that don't have a 24-hour lab?
DR. ROTHEL: I think it probably has got some issues in some settings, yes. Situations where there's a path lab associated with a hospital nearby, that's not an issue at all. It's quite a normal sort of practice. If you're out in the middle of -- we call it "the Outback"; I don't know what you call it here -- if you're out in the middle of there --
DR. LEWINSOHN: Oregon.
DR. ROTHEL: Yes, Oregon, okay. If you're out there and you collect a blood sample in some remote country town with no pathology lab, yes, it would probably be an issue to get it to the local town by then. But you've got to remember, too, the screening generally happens at large institutions. It's not something the local GP generally does to you.
DR. LEWINSOHN: I have two more questions.
CHAIRMAN WILSON: Go ahead.
DR. LEWINSOHN: My other question had to do with the issue of BCG. I was just going over the paper that was published where you gave the medical students BCG. While most people had a quantifiable rise, I guess it was about 15 percent that actually would have been interpreted as going from negative to positive in that regard, and that was just one point in time. Your argument is that perhaps QuantiFERON is better able to distinguish BCG exposure.
So my question is, first of all, in those medical students, have you had a chance to look down the road; that is, did their QuantiFERONs come back down as you might expect? Kind of as a corollary to that, at least we know from the skin test that most people, if they've had at-birth BCG vaccination, will have a negative skin test by the time they're 20 or so. So is there a correlation with age and the likelihood of having a test that's TST positive/QuantiFERON negative?
DR. ROTHEL: Yes, that group, I agree, there were about 15 percent positive by QuantiFERON and I think 12 percent or something positive by the TST. Interestingly, though, different people. But, no, we haven't had a chance to follow them up, is the short answer.
To give you a better answer to the question, in the Streeton study, out of 478 in the low-risk group, in the zero group, roughly 200-or-so, off the top of my head, came from Dr. Jonathan Streeton's practice. They're Australian-born individuals of various ages, and BCG vaccination was routinely used in Australians at about 13 years of age or 16 in 1994. So anyone of the appropriate age had been BCG-vaccinated.
Of those 200 that Jonathan recruited into that low-risk group, I think it was around about a third were BCG-vaccinated. There was absolutely no effect of BCG vaccination comparing them to the other individuals that hadn't received BCG. They were looking at a longer timeframe rather than this five-month experimental period we used.
DR. RADFORD: I might also ask Dr. Damien Jolly, who is a consultant statistician for Cellestis, as he has a comment to make on this subject, if that's okay.
DR. JOLLY: My name is Damien Jolly. I'm employed by Deakin University in Melbourne, Australia. I work as a consultant for Cellestis Proprietary Limited. I have purchased shares in that company.
I would like to address the question asked by Professor Charache particularly with respect to the table on page 77 of the provided pack, because I'd like to direct your attention to page 2-196 in the appendix quite a way through, appendix 2, page 196. In this table you'll find the complete breakdown of the WRAIR dataset by cutoff at 10 percent of QuantiFERON in human response, 15 percent QuantiFERON response, 30 percent QuantiFERON response, and also stratified by the various risk groups within the WRAIR dataset.
You'll notice that in these tables all the numbers in the middle column add up to exactly the numbers that are presented on page 77, which was the table which concerned you. The column on the right provides all the data for the cutoff at 30 percent, which provides the actual concordance and discordance data at the level of 30 percent.
I submit this, Mr. Chairman, simply for the point of clarification.
CHAIRMAN WILSON: Thank you.
DR. CHARACHE: I think that's very helpful. As I said, I'm looking for a way of having this available without the false positives. I'm wondering about the possibility of using that same cutoff for all risk factor groups.
The reason for changing the millimeters is based on positive predictive value. If we looked at it from the same perspective, I'm wondering if it would be of value to correct in a similar manner all groups, because you can see, even in the high group, you see a similar degree of change. So that's one of my questions, whether this is really set in a way that would avoid false positives.
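(Editor's illustration, not part of the record: a minimal sketch of why positive predictive value, and with it the case for a stricter cutoff, depends on the prevalence in the group being tested. The sensitivity, specificity, and prevalence figures below are hypothetical placeholders chosen for illustration only; they are not QuantiFERON or TST performance data from the submission.)

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, held fixed across groups.
SENSITIVITY = 0.90
SPECIFICITY = 0.95

# Hypothetical prevalence of infection in low-, moderate-, and high-risk groups.
for prevalence in (0.01, 0.10, 0.30):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

With the test held fixed, the same positive result is far less likely to be a true positive in the low-prevalence group, which is the reasoning behind raising the cutoff for low-risk individuals.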
CHAIRMAN WILSON: Okay. Dr. Ng?
DR. NG: I think my question is for Dr. Catanzaro. One of the arguments in favor of this test is the 30 percent no-show rate for the second reading of the TST. I'm assuming you want -- let me restate this. People who come back to get their TST read is often a surrogate for those people who will continue to be followed and be tracked, et cetera. So my question to you is, of that 30 percent who do not return, how effective is the public health system in identifying these people and following them and being able to track them down, if they don't return for this appointment?
DR. CATANZARO: Well, it depends completely on the clinical situation. As I said, I work at UCSD Med Center. We basically have no ability to follow people up and go out into the community. On the other hand, the Health Department is very much structured to do exactly that. I think, frankly, that's where this really makes a difference because, if you skin test 100 people, you can expect perhaps 10 or 15 percent, depending on the setting, to be reactive. To focus in on those individuals needing followup is to reduce the workload dramatically. I think that that's where this kind of test plays a very strong role.
A similar situation is prisons, where there are a large number of inmates that come through that system and often leave the system fairly promptly, depending on whether you're in a prison, jail, et cetera. Again, it's a matter of following up a small number of individuals rather than following up everybody.
I think you're completely right as returning for a reading being a surrogate of taking the pills on your own. The CDC has been stressing to a very great extent observed therapy under various situations -- in prisons, in substance abuse centers, in mental health situations. In each of these situations, knowing that the population you're dealing with is -- or focusing in on the target population -- is to eliminate a large part of the workload. So that's how I see the applicability of a one-time measurement being better than a two-time measurement, even though I quite agree with you that returning for a reading is a surrogate for whether you'll return for treatment.
DR. NG: So then you have no information, in your example, if you had 100 people skin tested, 30 don't return, how effective the system is at finding those 30 to get the second reading?
DR. CATANZARO: That's correct, I have no information. I submit it will be different in each setting.
CHAIRMAN WILSON: Dr. Baron?
DR. BARON: I just have a quick question about non-tuberculosis mycobacteria other than MAC. We see a lot of kansasii and chelonae and that sort of thing in our setting. Have you looked at the results in those patients?
DR. ROTHEL: No, we haven't looked at that yet. I think it's very difficult to find those patients. I'd be interested in speaking to you later to see if we can do a study. That's a very rare event, from my knowledge.
But the best information we have there is from the bovine model, where we experimentally infected animals with kansasii as well as M. avium, if you remember, and animals with kansasii came out with the avian profile in the QuantiFERON, all above or equivalent in the QuantiFERON test.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: I think this is in the data, but I was trying to determine this. This is an interesting slide here. When clinicians look at patients with tuberculin skin tests, even a 5-millimeter skin test can be considered positive for latent TB based on, I believe, the CDC criteria. It would be interesting to see specificity comparing the QFT to the TST when it is interpreted as a positive based on the CDC criteria, whether it be 5 or 10 millimeters. Fifteen, as I understand, is a positive, regardless of what the patient presents at, the point being that in groups one or low-risk groups you will find a 15-millimeter induration. As we focus on these various groups, I don't want to lose track of what the comparison is to "a standard" that may not be a gold standard, and by virtue of criteria that have to be developed to interpret it, we have some sort of gold standard. How does this stack up compared to the interpretation of 5 versus 10 versus 15?
DR. ROTHEL: All of the data that I've presented for the TST was done by a risk-stratified cutoff, which is the CDC guidelines cutoff. In the panel pack you have data presented to you using a 10-millimeter cutoff. Then there's also something called risk-stratified cutoff. That's precisely using the CDC/ATS-recommended cutoffs for the TST.
DR. COCKERILL: So the positive 5-millimeter in the charts was a 5-millimeter that was interpreted as a true latent state based on the CDC criteria or was it just the measurement?
DR. ROTHEL: The CDC criteria suggest that for people that are TB suspects you use 5 millimeters; for people suspected of latent TB, having risk factors for latent TB infection, you use 10 millimeters; for people with no identified risk factors, you use 15 millimeters. Those are the cutoffs we have used for those respective groups.
DR. COCKERILL: Okay. So a 5-millimeter patient, if they come and they're 5 millimeter induration, if they don't fulfill criteria for a positive interpretation of that CDC criteria, that was not included as a positive?
DR. ROTHEL: No. So individuals at risk of latent TB, if they had a 5-millimeter reaction, would be deemed as negative.
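(Editor's illustration, not part of the record: a minimal sketch of the risk-stratified TST interpretation as Dr. Rothel describes it, with 5, 10, and 15 millimeters for the three groups. The group labels and the function name are hypothetical, and treating a reading exactly at the cutoff as positive is the editor's assumption.)

```python
# Cutoffs in millimeters of induration, as stated in the exchange above.
# The dictionary keys are hypothetical labels chosen for this sketch.
TST_CUTOFF_MM = {
    "tb_suspect": 5,
    "latent_tb_risk": 10,
    "no_identified_risk": 15,
}


def tst_positive(induration_mm: float, risk_group: str) -> bool:
    """Interpret a TST reading against the cutoff for the person's risk group.

    Assumes a reading equal to the cutoff counts as positive.
    """
    return induration_mm >= TST_CUTOFF_MM[risk_group]


# The case discussed above: 5 mm in a person at risk of latent TB is negative.
print(tst_positive(5, "latent_tb_risk"))
print(tst_positive(5, "tb_suspect"))
```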
CHAIRMAN WILSON: Dr. Beavis?
DR. BEAVIS: I had a question about your slide 31. It's also presented in the data packs that we received. It concerns the cutoff for the percent avian difference.
My understanding as to how that was determined is that you've got people with known TB, known MAC, and then drew a line trying to discriminate between the two groups. It was Dr. Wood, he said it beautifully. He said that adjusting the cutoff depends on the goal.
I was wondering what your thoughts were and how you picked the cutoff for this. Because you're not calling any of the people with known MAC positive for TB, but you are leaving a couple of people off who are TB-positive and calling them negative. I realize it's overlapping groups, and no matter where you set the cutoff, you're going to be wrong in some of the patients. But if you could give me a little bit of your thought process with that, please?
DR. RADFORD: We haven't anything in here to sort of address that in the active TB groups, but I would say that the general thrust was to actually include all positive TB cases rather than to diagnose MAC infection. What we're trying to do is to exclude those that we can have a very strong assurance of are, in fact, MOTT-reactive rather than TB.
Now what we've done, and I think it might be in the panel pack as well, or is it?
DR. ROTHEL: No, I don't think it is. Oh, yes, it is, a graph.
DR. RADFORD: I have a graph here that looks at the use of avian at different cutoff levels in patients in the CDC study with active tuberculosis, culture-confirmed.
What we see, applying the minus 10 percent avian in different cutoffs, is that there is in fact a very large threshold. We could, in fact, increase the avian difference a great deal more before we start losing sensitivity for TB. So there is an argument to be made we have not put a stringent enough threshold on, but in the studies we've seen today we believe that it's better to diagnose tuberculosis than to identify a MOTT reactor. So, again, that's another reason for discordance that could occur in the test.
DR. BEAVIS: So are you saying that you would consider changing that cutoff for minus 10 percent?
DR. RADFORD: This is the best cutoff we've had to date, and the data we have supports it, and we believe it does. We have that original study that does support that. It's done in patients which actually have an immune response in many cases; other patients with MAC responses are immunocompromised.
DR. BEAVIS: I just want to be clear, make sure that we're in agreement. I guess it's always the case with any laboratory test, when you have two overlapping groups, that no matter where you put your cutoff, you're going to misclassify some patients. In this particular situation one has the option of calling some TB patients negative or you can call some MAC patients positive for TB. The way that the cutoff was made, it seems that the choice is made to make the error of calling some of the TB patients negative rather than MAC patients positive.
DR. RADFORD: If I can just add something, yes, it does look like that from that original study we did. We set at a minus 10 percent, but that cutoff was then being used in all subsequent studies we have done. The best example is probably when we've got a gold standard, which is individuals with culture-confirmed TB disease. We haven't missed any, from off the top of my head. I have to check the figures. I don't think we've missed any individuals with a culture-confirmed TB disease due to them having an avian difference less than minus 10 percent.
DR. BEAVIS: Of minus 10 percent?
DR. ROTHEL: Of less than minus 10 percent, yes.
DR. BEAVIS: Okay.
DR. RADFORD: Well, to be absolutely correct, if you'll see my graph there, we missed one. If we had gone down to minus 40 percent --
DR. BEAVIS: Exactly.
DR. RADFORD: -- we would have had one more.
DR. BEAVIS: Okay.
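(Editor's illustration, not part of the record: a minimal sketch of the tradeoff Dr. Beavis describes, in which any cutoff drawn through two overlapping groups misclassifies someone. The avian-difference values below are hypothetical, invented only to mirror the shape of the exchange; they are not study data. The rule that a value below the cutoff is read as MOTT-reactive rather than TB, and the handling of a value exactly at the cutoff, are the editor's assumptions.)

```python
def count_errors(tb_values, mac_values, cutoff):
    """Return (TB patients called negative, MAC patients called TB-positive).

    Assumes a percent avian difference below the cutoff is read as
    MOTT-reactive, so the result is not called positive for TB.
    """
    tb_missed = sum(v < cutoff for v in tb_values)
    mac_called_tb = sum(v >= cutoff for v in mac_values)
    return tb_missed, mac_called_tb


# Hypothetical percent avian difference values for two overlapping groups.
tb_patients = [-35.0, -5.0, 10.0, 25.0, 40.0, 60.0]
mac_patients = [-80.0, -60.0, -45.0, -30.0, -15.0]

for cutoff in (-40, -10, 0):
    tb_missed, mac_called_tb = count_errors(tb_patients, mac_patients, cutoff)
    print(f"cutoff {cutoff:>4}%: TB missed = {tb_missed}, MAC called TB = {mac_called_tb}")
```

In this made-up data, moving the cutoff from minus 10 to minus 40 percent recovers the one missed TB patient at the cost of calling two MAC patients positive for TB.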
CHAIRMAN WILSON: In light of our need to stay on a tight schedule today, we only have time for a couple of more questions.
DR. NOLTE: I'd like to follow up on the percent avium difference. Basically, has the data been analyzed if you didn't consider the percent avium? I mean I'm trying to figure out what the effect is on the overall test in terms of having this additional component, because there's little data presented to the panel that documents its effectiveness in avium or MOTT-infected individuals? Do you know what I'm trying to get at?
DR. ROTHEL: Yes. I know where you're coming from. We've got a slide to address that.
DR. NOLTE: I mean, does it contribute?
DR. ROTHEL: Yes, that contributes greatly to specificity.
DR. NOLTE: I'm sorry?
DR. ROTHEL: That contributes greatly to specificity.
DR. NOLTE: Greatly? Yes.
DR. RADFORD: What I have here, I've got more slides, looking at two different groups, three different groups, and illustrating the effect on sensitivity and specificity.
What we can see here -- and we're looking at the high-risk individuals in this top group, and we're looking at it with a range of various percentage differences cutoff, and "no ADCO" there refers to no avian difference applied at all.
What you really have to look at, when you look at those tables, is think about it in terms of those two-by-two tables we described. The number of PPD positive, QuantiFERON positives in the CDC at-risk group up there is reflected on the "no ADCO" line, 158. So that's actually a rise from 145, I think, in the original figure to 158.
What we see, though, here is a TST-positive QuantiFERON negative at 70 percent, where it should be, but the QuantiFERON positive/TST negatives rise substantially from a figure -- actually, I think it was 80, my recollection, 72, sorry, up to 122. So we're getting 50 more QuantiFERON positives if we don't apply the avian difference level.
I think that generally is reflected in most of the data. We lose sensitivity -- sorry, we lose specificity. We do, in fact, maintain concordance. In fact, it's quite interesting to see that you actually can get a better concordance with PPD to some degree by actually doing this, but, of course, you do get these QuantiFERON positives at a higher level.
DR. NOLTE: Again, with the avian difference, the only patients with documented avian infections are the 10 or 15 or so children that were described in the packet insert?
DR. RADFORD: That's correct.
DR. NOLTE: Obviously, this is not meant as a diagnostic aid for MOTT infection?
DR. RADFORD: No, this is not being intended as a diagnostic guide for MAC.
CHAIRMAN WILSON: Okay, time for two more questions.
DR. LEWINSOHN: I was very interested, there was a table that's shown on page 48 that looks at a subgroup of patients, I guess it was 39, who had discordant results and where you were able to retest them. I was actually surprised at the numbers that changed their results on retesting. So, for example, if you were QFT-negative, I think there were a total of nine that on retesting became QFT-positive. Also, if you were QFT-positive, I think there were -- what is it here? -- there was a total of 21 --
DR. CHARACHE: It's on the last three lines on that page.
DR. LEWINSOHN: I think it was a total of 21 that changed. So I'm just interested to know what your thoughts about what accounts for those test changes. Obviously, the TST changed in some cases as well.
DR. ROTHEL: To be honest about my thoughts, I can't really glean anything from it. It's a terribly biased population of individuals. They had discordant results initially to start with. To really do this study, you need to do individuals that had concordant results, both positive and negative concordant results.
There's only a very small number of the individuals that were meant to be done who had this done. The biggest factor is: What is the effect of having a prior TST on the QuantiFERON test? We've done that in cattle, and we've shown that initially it depresses responses to subsequent QuantiFERON tests and then boosts them for a while, and then past 30 days they come back down to normal. We haven't done that in humans, but it's just to me some data we have to present in here because it was in the protocol, but it's somewhat irrelevant.
DR. RADFORD: To perhaps answer your question as to whether or not it actually relates to the stability of the test, we actually do have data presented showing reproduction of the test in individual --
DR. LEWINSOHN: No, I was more interested in the issue of interference between the TST and the QuantiFERON and as to whether you would, as part of your advice to clinicians, tell them to do one or the other, or if they were interested in doing both, to do one first and then the other?
DR. ROTHEL: A good point. I think you've raised something I must admit we hadn't thought of, that you should advise people if they perhaps are going to do both tests. I don't know why you'd want to do that, but if you were going to do both, yes, you'd want to do QuantiFERON before placing the TST.
CHAIRMAN WILSON: A final question, Dr. Reller?
DR. RELLER: I work in North Carolina, where the prevalence is considerably higher than -- we're in the upper quartile nationwide. So it's more than 10 per 100,000. Smear-positive patients, to give some feel for the magnitude of MOTT infections, it's about four or five to one; that is, if we have a smear-positive, it's far more likely to be MOTT. Some of those patients, it's controversial what constitutes disease.
So in that sort of population, how would you expect this test to work? Do you have any experiences, is it even possible by looking at the other side of things, the response to the avium antigen stimulation that one might even be able to, owing to the response, separate out those people who have real disease with MOTT versus those who are simply colonized?
So there's two parts. One is, how would you expect the test to perform in our area and what about its use from a totally different perspective?
DR. ROTHEL: I would expect the test to perform quite well in your area in discriminating between the two infections. As far as looking at disease, that's specifically what that study was done that we used to establish the percentage avian difference cutoff, the paper by Stapledon, et al., which is appended in your panel packet, physicians working in Adelaide. They wanted to use the test to do exactly what you're talking about, discriminate between disease caused by TB or Mycobacterium avium complex.
They found in that data they would use a different cutoff to do that interpretation. That's not what we're proposing the test for, of course, in this situation, but it discriminated 100 percent, I think is the conclusion they drew in that paper.
Again, it's a limited study, and I think for that application we need to do obviously vastly more work, but I think it's got applications there.
CHAIRMAN WILSON: Okay, while the FDA is getting their presentation materials together, let's take a very brief break, about five minutes.
(Whereupon, the foregoing matter went off the record at 10:44 a.m. and went back on the record at 10:54 a.m.)
CHAIRMAN WILSON: Okay, I'd like to reconvene the meeting at this time, please.
At this point we'd like to move on to the FDA's presentation. Again, I'd like to ask the panel members to hold any questions until all three presentations are complete. I'd like to remind the audience that only panel members can ask questions of the speakers.
FDA, the first presentation will be given by Roxanne Shively, who is a Senior Scientific Reviewer for the Bacteriology Devices Branch.
MS. SHIVELY: Good morning. It's kind of hard coming after such good discussions have already opened up a lot of issues.
For FDA, the QuantiFERON-TB application is a multi-level endeavor. Not only is there a bridging of the continents with Australia here, but within FDA we've had cross-center activity with CDER participation, CBER, and of course CDRH on this review.
We really appreciate the company's effort in compiling the panel packages for you and their complete presentation to you this morning.
Because of the public health importance of a test used for diagnosing latent TB infection, FDA review of this application is expedited. We also brought this to the Microbiology Advisory Panel early in the review cycle because we recognize the importance of questions related to evaluating the performance of a new assay when the only current approach, the tuberculin skin test, has considerable limitations. We believe your input will help the company and FDA to most efficiently develop a path for identifying the clinical merits of this assay.
Next slide. The first part of FDA's presentation covers the intended use for the QuantiFERON assay, a brief discussion of in vivo versus in vitro testing, and then some elements of the QuantiFERON analytical performance that we believe are important to the discussion overall today.
The QuantiFERON assay is submitted as an aid in the detection of Mycobacterium tuberculosis infection. This is the same labeled intended use as tuberculin PPD for in vivo use. The proposed labeling does have limitations, as already mentioned, and we would note that the primary clinical studies did not include these groups, either pregnant women, 17-year-olds, or HIV-positives, other immunosuppressed.
I would like to clarify one thing that came up at the end of the discussion, that this assay is not submitted to differentiate individuals with MOTT infection. The avium PPD portion of the assay is intended to control for cross-reactivity, and it hasn't been evaluated for differential capabilities.
We are at the next slide. Much of the data and information available to characterize the QuantiFERON is relative to skin testing. One of our concerns is how to understand differences that would affect who is tested and how we use the results from the new assay.
The areas that I will initially present look at similarities and differences between these two tests. This slide blocks out in a very simple way the basic elements of the skin test versus the QuantiFERON. The company has already discussed differences here at the pre-analytic level; that is, intradermal injection versus collection of the venous whole blood, and performing the test in the clinic versus performing the test in the clinical laboratory.
We would like to point out that one of the cited advantages that the company makes is that the QuantiFERON assay has the benefit of being a lab-based test that will add greater control and standardization. We will want to look and make sure that that control and standardization within the clinical laboratory is possible, too.
The direct common element between the two tests is the human PPD reagent. That is the same reagent as the tuberculin PPD that's used in the skin testing. Although the two tests use different measures, they essentially are measuring an individual's immune response. The TST does have the progressive end-points that have already been discussed. As the company has presented, they are proposing to change the cutoff for the QuantiFERON to a scaled differential cutoff based on risk.
I would like to point out that with this cutoff modification that we have encouraged the company to look at options with the cutoff, and that the new analyses and data supporting this were submitted within the past two weeks, right at the time of your panel packs. So we wanted you to have this available, but we will be focusing on the original data that was submitted to us, looking at the implications of the tools and how we evaluate comparisons between the two assays and overall performance parameters.
Okay, next slide. This slide illustrates the initial immune response at the cellular level and what is being measured by the skin test on top and the QuantiFERON on the bottom. Both assays are detecting components of cell-mediated immunity reacting to antigen that is injected intradermally for the skin test and added to the blood culture for QuantiFERON. The skin test measures a delayed-type hypersensitivity reaction resulting from the interaction of multiple cells, including memory T-cells and the network of cytokines and other immune mediators. The QuantiFERON measures the presence of these memory T-cells, which are down in the dish now, in a venous blood sample by the production of gamma interferon.
One other difference at the cellular level that already came up in the discussion that could affect responses in each of these assays is that, when the PPD is injected intradermally, memory T-cells are recruited to the site of injection; whereas, with whole venous blood the circulating T-cells that are sensitized that are the memory T-cells are already present in the venous draw that is collected. So there's no recruitment.
You have already asked about the differences in white cell levels and the effect of those levels on QuantiFERON results. We would certainly welcome your comments on the need to look at that type of data to qualify and standardize this assay, too.
Next slide. The next few slides highlight some of the things we know about skin testing accumulated from its history of use. Our primary question to you today is going to be, how can we best describe similar attributes for the QuantiFERON and what statistical tools are best to use?
The delayed-type hypersensitivity reaction of the skin test is detectable two to twelve weeks after infection. From available research, we would expect gamma interferon to parallel that.
Sensitivity of skin testing approaches 100 percent in persons with normal immune responsiveness, but up to 25 percent of infected or diseased individuals we know may be falsely negative. Most of these may primarily be due to HIV immunosuppression, but also certainly the other host variables and problems cited by Dr. Catanzaro.
Next slide. Specificity of the TST is improved by increasing the reaction size that separates a positive from a negative reading, and we expect, as Dr. Charache has already pointed out, improved sensitivity using those cut points. We would expect approximately 95 percent specificity when there is common cross-reactivity in the population with non-tuberculous mycobacteria. We are including BCG and NTM together as non-tuberculous mycobacteria in this category as potential cross-reactants. When BCG vaccination or NTM is not common, we would expect the specificity to be higher and about 99 percent.
Our last point: The TST performance overall, both sensitivity and specificity, is affected by other population variables, too, such as age and the prevalence of disease, in addition to BCG vaccination and non-tuberculous mycobacteria.
Next slide. We've already discussed using the progressive cut points. These are the joint CDC/ATS criteria, using 15 millimeters for a low-risk population, 10 millimeters for those with increased or moderate risk, and the smallest cutoff, 5 millimeters -- actually, it's the most stringent -- in the high-risk groups.
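[Editorial illustration, not part of the spoken record: the risk-scaled cut points just described can be sketched as a small lookup. The risk labels "low", "moderate", and "high" and the function name are hypothetical names chosen for this sketch, and the assumption that an induration equal to the cut point counts as positive follows the usual "greater than or equal to" reading of the CDC/ATS criteria.]

```python
# Induration cut points in millimeters, as quoted in the testimony.
# The dictionary keys are hypothetical labels for the three risk tiers.
TST_CUTOFF_MM = {"low": 15, "moderate": 10, "high": 5}


def tst_positive(induration_mm, risk_group):
    """Return True when the induration meets the cut point for the risk tier."""
    return induration_mm >= TST_CUTOFF_MM[risk_group]


# The same 12 mm reaction is read differently depending on risk tier.
for tier in ("low", "moderate", "high"):
    print(tier, tst_positive(12, tier))
```

[The loop shows why the cut points are called progressive: a single reading can be negative in a low-risk person and positive in a higher-risk person.]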
Next slide. Risk assessments on which the cut points are based are from both epidemiological and clinically-defined groups. I am not going to go through all these, but we did want to have them available because it can get confusing, too. I do want to highlight that the ones in red are those that have the highest risk and would be read at the 5 millimeter cutoff. You can refer to Table 7 from the joint statement, too, for the complete listing of these.
Next slide. Using gamma interferon as a marker of host cell-mediated immunity certainly has a solid foundation of research evidence. Besides the importance of gamma interferon in the cell-mediated immune response to MTB infection, reports have shown that production is decreased in patients with active TB, especially those with severe disease. This suppression may last more than a year.
We do want to note a word of caution: that the gamma interferon measurements from published research characterizing responses may not always be comparable, depending upon the host models used, the methods, and types of assay used.
Next. I think I am going to skip this slide because I know we are anxious to get through this.
I am going to go to the basic analytical portion of the QuantiFERON as detecting gamma interferon. Gamma interferon is estimated for each of the four harvested plasma samples, and this is done from an EIA standard curve using the kit standards which are provided in the kit. These are zero, low, medium, and high standard.
There are acceptance criteria for using these standard results. Again, I won't go through these, but they are critical because they are the only controls applied to the EIA portion of the QuantiFERON and there is no independent control material in the kit outside of the kit standards themselves.
Next slide. Okay, the QuantiFERON kit has no external control materials, and also the labeling doesn't recommend any external control materials that could be tested. Instead, the labeling recommends for QC that the acceptance criteria for the standard curve be used and also adherence to recommended procedures, and that following these procedures and using the curve acceptance criteria will contribute to control of the assay.
The design of this assay does, however, have an internal control, and that is the mitogen-cultured sample that is supposed to control for functionality of blood cells to produce gamma interferon. Another design aspect of the assay is the nil control, which essentially would control for background of gamma interferon activity in the patient sample. This value is acceptable whether it is zero, less than zero, or greater than zero.
Although we would expect this value to be almost always zero, the nil result is subtracted out as background regardless. We do understand the importance of both the mitogen and the nil for getting reliable results with this assay, but we do question whether they are sufficient for ensuring reproducibly reliable results in clinical laboratories. We have put that question to you today.
Next slide. Oops, I'm sorry, that's it.
The decision thresholds or cutoffs for the QuantiFERON assay, and how those cutoffs are calculated, have already been described by the company. The discussion has already rapidly moved forward on modifying these cutoffs and looking at variables that affect the cutoffs.
So I am not going to linger here, but I do want to point out that basic principles that are used in these studies may affect the outcome of the study and what cutoff is chosen. The major question is whether the cutoff will be applicable to populations other than the one where the initial study was done.
For the human response cutoff, the Australian guidelines are slightly different than those used in the U.S. Only nil values greater than zero were used in the calculations, and mitogen results less than 0.5 rather than 1.5 were considered indeterminate. We would ask whether any of these factors could affect use of this cutoff in other populations.
The same for the percentage avium difference. The study was originally done to show the difference between a group of children who had been infected with MOTT and a larger group of adults who had had TB disease. Again, we would question whether this cutoff would apply to general other cutoffs for controlling the level of cross-reactivity in populations.
One final cutoff that we consider to be important is the mitogen minus nil because it's this value that distinguishes whether an assay will be indeterminate or will be valuable in the QuantiFERON test.
We would note that, regarding this mitogen, in order to get a 15 percent human response, you would need to have at least a 10 international unit reading with the mitogen.
Next slide. The last area I want to cover this morning is reproducibility. There have been various studies presented by the company to support inter- and intra-assay reproducibility. As pointed out already, there are appreciable difficulties with designing these studies because of the nature of the assay itself.
We are going to look at the one study that we consider to be very good in that it looks at inter-laboratory reproducibility. We did not have inter-laboratory reproducibility established during the clinical studies. So this is an area that concerns us, to be able to ensure that the test can be done reproducibly in different laboratories.
The data is up here, and the table was done using 50 duplicate blood specimens tested at two different sites in Australia. If you look at the table, the majority of the samples tested were positive in the QuantiFERON, gave an agreement of 98 percent, Kappa .89, with an ICC of .94. Even though the agreement is good in this study, we would question whether we would see the same type of data when you have more negative results.
Because of our concerns with controls and not having inter-lab reproducibility across the range of the assay, we also are concerned that results from the clinical studies may possibly be affected by inter-laboratory variations. We would certainly welcome your suggestions in the discussion for bridging that concern.
Next slide. There are additional supportive data from published and unpublished literature with comparisons of QuantiFERON and skin tests. These include testing different or selected populations, and the company has discussed some of these this morning. These also include the Bovigam studies done using the assay that's very similar to the QuantiFERON but does have different reagents and a different methodology.
Also, we would note regarding the studies in animals that there is a different host, a different pathogen, and different tests were used. We would again welcome your comments on how to position these additional studies into the wealth of information that we have from the clinical studies, and even further, how to statistically evaluate those assays and derive some meaningful statistics from that data.
Next slide. Dr. Leonard Sacks will be covering the clinical studies in the next minute. Before ending, I do want to point out that there are differences, very small differences, between the published CDC data and what is being presented here. Also, of course, we are going to be looking at some new data today using the 30 percent cutoff. As I mentioned before, this has been very recently submitted. We would encourage you all to consider how we can best look at this proposal and the new analysis done, and how we should validate new cutoffs to be used in the different calculations.
So I'll turn it over to Dr. Sacks now. Thank you very much.
DR. SACKS: Good morning. My name is Dr. Sacks, Leonard Sacks, from the Division of Special Pathogens, and I will be spending the next approximately ten minutes reviewing the clinical use of QuantiFERON as an assay for tuberculosis. I will be restricting my presentation to the two pivotal studies that were submitted by the applicant.
Can I have the first slide, please?
Just a bit of background, and I think a lot of this has already been covered and most of the audience is familiar with it. But there are several ways in which people respond to exposure to tuberculosis. These may range from no detectable response through simple skin test conversion and self-limited primary complexes developing in the lung with or without positive skin tests. Then there are a couple of responses which may result in overt or active TB, the primary progressive TB, as a result of the initial exposure or reactivation subsequently once exposure has already occurred. It is really in the first three categories that latency becomes an issue. This is the area where QuantiFERON has proposed its utility.
Let's go on to the next slide. These were the intended uses of QuantiFERON as submitted in the original application. It was to be an aid in the detection of latent mycobacterium tuberculosis infection. There were a couple of other points that were included.
First of all, that a negative result does not preclude active tuberculosis. Second of all, that the QuantiFERON tests may be inconclusive in immuno-compromised or immunosuppressed individuals and those with no cellular or impaired cellular immune response to tuberculin. Finally, that the safety and the effectiveness of this test was not established in individuals under 17 years of age and in pregnant women.
Let's go on to the next slide, which again reiterates some of the points that were very adequately made early on, but there is no gold standard for the diagnosis of latent tuberculosis. The tuberculin skin test is one of the methods or one method that is used to detect latency. The tuberculin skin test allows the institution of prophylaxis to prevent reactivation in patients having a positive test, and that's how it is conventionally used.
The tuberculin skin test is fraught with problems. As we know, it is an archaic test. It has problems with sensitivity, particularly in patients who are immunosuppressed, such as HIV-positive patients or patients on steroids, et cetera. It has problems with specificity related to infections with mycobacteria other than tuberculosis, and it has the well-recognized practical limitations of compliance. Patients have to come back for a re-read after 48 to 72 hours. There is some subjectivity in interpretation of the size of the induration. There is some discomfort in the application.
The last point to be made here is that only a small proportion of TST-positive patients will actually develop TB, approximately a 10 percent lifetime risk.
Let's go on to the next slide. Now in the absence of a gold standard, what methods can we use to evaluate a new diagnostic test for latent tuberculosis? What I have done is just put up a couple of suggestions. There are obviously many other different ways in which this can be approached.
First of all, one could contemplate a prospective study to determine the ability of a positive test to predict active tuberculosis. Another method would be to compare with existing diagnostics for the diagnosis of latent tuberculosis. The third suggestion would be to correlate the performance of the diagnostic test with the clinical risk for tuberculosis. It is the latter two approaches that have been used by the applicants.
Let's go on to the next slide. There were two pivotal studies, one performed in collaboration with Walter Reed, one performed in collaboration with the CDC, and these were roughly the inclusion and exclusion criteria. The Walter Reed study included naval recruits. It was a single-site study based in Illinois at a recruiting center, although the actual enrollees were from all over the country. They were to be HIV-negative.
The CDC study, to some extent this was a clinic-based study, a multi-center study on clinic subjects presenting for screening with tuberculin skin tests. It was a five-center U.S.-site study, as was mentioned before, in Massachusetts, Maryland, two sites in California and New Jersey. Patients over 18 years of age, also HIV-negative, and non-immunosuppressed.
So there were a lot of similarities but some differences between these studies. They do seem to reflect the demography of patients who would use this test.
Next slide. Just to give you some idea of the numbers, initially, there were 1,627 enrolled in the CDC study, 1,961 in the Walter Reed study, a total of over 3,000 patients; quite a number of exclusions, 670 in all, leaving approximately 3,000 evaluable patients when both studies were pooled.
Let's move on to the next slide. Just a word about patients excluded from the analysis. We did note that almost 20 percent, 19 percent, of all enrollees were excluded. There were 144 patients excluded at a single site in the CDC study, and this was apparently on the basis of unverifiable informed consent. The other reasons for exclusion were also mentioned earlier. Some of them were technical errors, incubator failure, the TST was not read at the right time or not read at all.
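[Editorial illustration, not part of the spoken record: the enrollment and exclusion figures quoted on these two slides can be checked with simple arithmetic. The variable names are hypothetical; the three input numbers are the ones stated in the testimony.]

```python
# Figures as quoted in the testimony.
enrolled_cdc = 1627
enrolled_walter_reed = 1961
excluded_total = 670

# Pooled enrollment, evaluable patients, and share of enrollees excluded.
enrolled_total = enrolled_cdc + enrolled_walter_reed
evaluable_total = enrolled_total - excluded_total
excluded_percent = 100 * excluded_total / enrolled_total

print(enrolled_total, evaluable_total, round(excluded_percent, 1))
```

[The pooled total and the exclusion share are consistent with the "approximately 3,000 evaluable patients" and "19 percent" quoted by the speaker.]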
Let's move on to the next slide. This just gives you an outline of the demographics in both of these studies. In the CDC studies we see that this was a slightly older population. The mean age was 39 compared to 20 in the Walter Reed study. There were more females in the CDC study, 49 percent, and only 17 percent in the Walter Reed study. There was a higher representation of black persons in the CDC study, whereas 56.3 percent of the patients in the Walter Reed study were white.
Let's move on to the next slide. In practice, there were seven embedded subgroups within these two big studies, each consisting of different risks for development of tuberculosis. What I have done in this slide is I have ranked these subgroups for both studies according to increasing risk for tuberculosis as we go down the table.
So in the first Walter Reed subgroup there were 397 patients with no identified risk for tuberculosis, and a similar group of 98 patients in the CDC study, again with no identified risk. There was a low-risk group of 1,066 patients in the Walter Reed study from the U.S. state with a TB incidence of greater than 10 per 100,000. Then there were two subgroups here which represent the population where TST is often used to decide on prophylaxis. Two thirty-two patients were in the Walter Reed study who were TB contacts who came from countries where TB was prevalent, and a similar group over here, TB contacts, persons from countries where TB was prevalent: patients from shelters, intravenous drug addicts, and others.
Finally, there were two categories at the bottom where the risk of TB was appreciable. In group three these were patients with pulmonary symptoms which were compatible with tuberculosis and who were under evaluation for tuberculosis. In the risk group four these were patients who had had previously culture-confirmed tuberculosis and had completed therapy.
Now the next slide demonstrates the comparable performance of the tuberculin skin test and the QuantiFERON test. Let me just mention that for simplicity and a couple of other practical reasons which I will mention, I have used the 10-millimeter cutoff for the tuberculin skin test across the board. I thought that this was an equitable comparison because QuantiFERON doesn't use a ranked cutoff, so I have used the 10-millimeter for that reason. That is also the cutoff that people would defer to in the risk categories.
Here what we see is that in the low-risk populations up here, these are populations with no risk for tuberculosis. We see a tuberculin skin test positivity of somewhere between 1 and 4 percent, whereas the QuantiFERON is appreciably higher, between 5 and 8 percent.
When we look at the middle risk group, we see that the QuantiFERON and the tuberculin skin test positive rates are somewhat similar. In fact, in this particular group, CDC risk population two, 24 percent and 23 percent. When we move into the higher-risk categories of either confirmed or suspected tuberculosis, it is clear that tuberculin skin tests are much more frequently positive than QuantiFERON tests, 84 percent in the tuberculin skin test group, 70 percent in the QuantiFERON. In patients with previous confirmed tuberculosis, 92 percent positive by TST, 64 percent positive by QuantiFERON.
Let's move on to the next slide, which just shows the same information graphically. What I have done here is I have the increasing risk for TB along the X axis and the percentage positive by each of the two tests on the Y axis. I think it is quite clear that both of these tests correlate with increasing risk for tuberculosis, but there are some differences, and I am going to concentrate on those now.
Let's first take a look at this area of the curve. Let's go on to the next slide. How about the performance in high-risk populations? Well, we can see that there are clearly differences in sensitivity for the two tests in patients with confirmed tuberculosis.
Now it has been mentioned earlier that there are reports that gamma interferon is decreased in patients with active tuberculosis disease. The effect of this finding on the sensitivity of QuantiFERON in other risk groups is really unclear.
How about this section of the curve? Let's move to the next slide. What we are addressing here is the performance in low-risk populations. Now here, although the apparent difference is small, these are the patients who would qualify for TB prophylaxis over here.
Now just bear in mind that, since the lifetime risk of tuberculosis is only 10 percent, many healthy individuals may receive unnecessary therapy with potentially toxic drugs. So our aim would be to maximize the specificity of an assay in this sort of population group.
If we look at population one, which is at the end here, TST was positive in 1 percent of the population, and QuantiFERON was positive in 5 percent of the population. So potentially a fivefold difference in the number of individuals qualifying for treatment.
What about the middle of the curve? Let's move on to the next slide. The performance in the population for intended use, these are patients with risk factors for tuberculosis: patients from countries with a high incidence of tuberculosis, patients from shelters, and drug users.
I would like to draw your attention to population group five. I have mentioned these a little earlier. Here both tests look strikingly similar, and the question we are left with is whether the 23 percent that are positive by QuantiFERON in this group are the same individuals as the 24 percent that are shown to be positive by tuberculin skin tests.
The next slide addresses this in some detail. This may be a little confusing. These are not completely drawn to scale, but let me just orientate you.
This is CDC risk group two, intermediate risk for tuberculosis, 944 patients in total. Tuberculin skin test cutoff has been set at 10 millimeters. What we see here are those positive by QuantiFERON are in this circle; those positive by tuberculin skin tests are in this circle. Those negative on both tests are out here.
So we see that 68 percent of the population are negative on both tests, but we can clearly see that there is a discordance between the patients that are detected positive by TST and those that are detected positive by QFT. What we can see is that, if you did a QFT, a third of the QFT-positive patients would not be TST-positive. Conversely, by TST, a third of the QuantiFERON-positive patients would not be found by TST. So there is a significant discordance even though the absolute percentage of positive tests in both of those groups appears the same.
Let's just look at this sort of analysis for a couple of the other risk groups, the next slide. Now this is the low-risk group, 98 patients with no observable risk or no identifiable risk for tuberculosis. What we see here is TST is picking up less patients; 91 percent or 92 percent approximately are negative by both tests. TST, as I say, is picking up less patients; QuantiFERON is picking up a lot more patients. In fact, almost five-eighths of the patients who are positive by QuantiFERON are not found to be positive by TST. This is in the low-risk group.
Let's look at the flip side, next slide. These are patients with confirmed tuberculosis. Here we see that the tuberculin skin test positivity is much higher than the QuantiFERON positivity. The overlap is pretty good, but QuantiFERON is not picking up almost a third of the patients that are picked up by the tuberculin skin test, a very small number of QuantiFERON-positive patients that are not picked up by the TST.
Okay, I would like to just change gears a little here. Let's move on to the next slide. This was mentioned a little earlier. I am just highlighting it as an issue of interest.
These were the discordant results reinterpreted, or at least retested by both QuantiFERON and TST. As has been pointed out earlier, this was not a randomized sample. This did not include patients who had concordant results. So I guess, treated with that degree of circumspection -- but what we see here is that patients changing from QuantiFERON-negative to QuantiFERON-positive, there were 22 patients who started off QuantiFERON-negative with discordant results and 41 percent of them became positive on retesting. When you do the same thing with the tuberculin skin test in 39 patients who had discordant results, you find that 26 percent of those who are TST-negative changed to TST-positive. So a bigger change in the QuantiFERON.
When we look at the reverse, the percentage of patients who changed from QuantiFERON-positive to QuantiFERON-negative, we see that 54 percent of the 39 patients became negative after an initial positive test, whereas in the tuberculin skin test it was unusual for patients to become negative on a second reading, only 18 percent or 4 out of 22.
Just one other aspect, the next slide, which was also touched upon. These were the results of a subgroup of patients in the CDC study who were identified as being BCG-positive. BCG, as we know, may itself affect the performance or at least affect the positive rates of the tuberculin skin test. It may also be a co-variable for exposure or risk of exposure to tuberculosis.
What we see here in 157 vaccinated individuals was that QuantiFERON was positive in 43 percent; tuberculin skin test was positive in 58 percent. In unvaccinated individuals, the positive rates were the same for both tests.
The next slide just discusses a couple of the thoughts that I had about the qualities of an ideal test for latent tuberculosis. Theoretically, such a test should always be positive in confirmed tuberculosis, should always be negative in patients with no TB risk. It should be negative in other mycobacteria infections. Conversions from negative to positive should correlate with TB exposure. Finally, there should be confirmed value of the test in its ability to predict the development of tuberculosis.
As you will notice, a couple of these points have been addressed by this submission. Several of them have not. That may leave some room for discussion.
The next slide just brings me to my conclusions, which were, first, that the sensitivity of QuantiFERON differs from tuberculin skin tests when it is evaluated in patients with confirmed tuberculosis. I do mention again, or remind you, that interferon production is reported to be inhibited in active tuberculosis. The effect of this on the sensitivity of QuantiFERON in other populations is unclear.
Next slide. Positive rates for QuantiFERON were higher than tuberculin skin tests in low-risk populations. The pivotal clinical studies did not determine whether this was an indication of poor risk specificity or increased sensitivity of QuantiFERON tests.
Finally, just to remind ourselves that the populations identified as positive by QuantiFERON or positive by tuberculin skin test often differed.
CHAIRMAN WILSON: Thank you, Dr. Sacks.
The next presentation will be by Mr. John Dawson, who will present the statistical analyses of the data.
MR. DAWSON: Good morning. Thank you for affording me the opportunity to present the FDA's statistical perspective on this application.
I am going to cover two things in my 10 minutes or so. First, sensitivity/specificity-type evaluation of performance of QFT relative to TST as a gold standard and, secondly, some measures of agreement and some results using them which may be appropriate if TST is, as we I guess generally agree, not a gold standard.
Next, please. The sponsor has in their draft labeling estimates of sensitivity and specificity that derive from the Streeton study, 1998 Streeton study. They estimate sensitivity at 90 percent and specificity at 98 percent. I have a little bit of a worry about using the Streeton numbers rather than the QFT current study, the PMA study estimates in the labeling, because the percent human response cutoff used in this study was derived in the Streeton study and also used to estimate sensitivity and specificity. When the cutoff is arrived at by ROC analysis, the problem is that performance may be overly optimistic, simply a function of trying to optimize or maximize something about the performance in choosing the cutoff.
This a little bit shows up and possibly explains what happens here when I use the PMA data to estimate sensitivity using the TB suspect category patients, and among those, those that are culture-positive, what I get is an 88 percent estimate for sensitivity compared to the 98 percent in the Streeton study.
Specificity using the low-risk group in the PMA data is 92 percent versus the 98 in the Streeton study. I don't know whether this is because of overoptimism, but I am simply pointing out that it is probably not appropriate to use the numbers from the Streeton study in the labeling in place of numbers from the PMA study.
Another problem that we have with these estimates, the sensitivity and the specificity, is that they are based on selected parts of the intended-use population, rather small groups at the two extremes, the low-risk group and the TB-suspect group. The problem there is what we know as spectrum bias can be at work here. The largest group of patients were in the intermediate-risk category. We have no justification for assuming that the estimates of sensitivity from those extreme groups would apply in the intermediate-risk group.
If there is no gold standard, then we have the option of evaluating agreement between QFT and TST. Now I want to move to that topic and talk a little bit about agreement. Next, please.
This is a depiction of the two-by-two table which you have seen numerous ones this morning. I use the term "agreement" to mean literally on a per-case basis, whenever we have QFT-positive and TST-positive, that's an agreement. If one is negative and the other is negative, that is also an agreement, and the overall agreement derived from a two-by-two table is basically the numbers from the main diagonal of the table divided by the table total.
Next, please. Now I want to give you a couple of other definitions very quickly, one of which is expected agreement. The reason for that is that the Kappa agreement statistic, which is the one that the company has chosen as their primary agreement measure, involves both the observed agreement on the main diagonal of the table and expected agreement, and I have to apologize; I have it written as "A plus B over N." It should be "A plus D over N."
What is done in getting an expected number is that you set up basically the null hypothesis that the two tests being compared are mutually-independent, and then you use the marginal frequencies, the proportions on the margins of the two-by-two table, to generate numbers for the four cells of the table, which is what you would expect if the two tests are conditional-independent.
I always have the same problem with this statistic in these kinds of method comparison studies because the null hypothesis simply is not reasonable. It makes it very easy to get a statistically-significant result because inherently the methods being compared have some amount of built-in agreement.
The Kappa correlation coefficient takes the observed numbers of cases on the main diagonal, subtracts out the expected number of cases on the main diagonal, and then is scaled by one minus the expected frequency.
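[Editorial illustration, not part of the spoken record: a minimal sketch of the observed agreement, expected agreement, and Kappa just described, for a two-by-two table with cells A (both tests positive), B and C (discordant), and D (both tests negative). The function name is hypothetical, and the counts in the usage line are made up purely for illustration; they are not data from the submission.]

```python
def agreement_and_kappa(a, b, c, d):
    """Return (observed agreement, expected agreement, Kappa) for a 2x2 table.

    a = both positive, d = both negative, b and c = the two discordant cells.
    """
    n = a + b + c + d
    # Observed agreement: main diagonal divided by the table total, (A + D) / N.
    observed = (a + d) / n
    # Expected agreement under independence, built from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    # Kappa: observed minus expected, scaled by one minus expected.
    # Undefined when expected agreement equals 1 (all cases in one cell).
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Hypothetical counts, for illustration only.
print(agreement_and_kappa(20, 5, 10, 65))
```

[Because the expected term is driven entirely by the marginal proportions, two tables with the same observed agreement can yield quite different Kappa values.]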
Another measure is agreement with the positive skin test; that is, taking those that are given as TST-positive, what percentage of those are also QFT-positive. Agreement with the TST-negative, you take those that are TST-negative and divide that into the number which are also QFT-negative.
We have an agreement index, both a positive and negative variation. What this does that is different than those above is you take the total number of cases that are positive by TST, add that to the total number that are QFT-positive, and call that an overall number of positive results. Then you take the number that are positive by both QFT and TST, multiply that by two, and that ratio then is what we call agreement index positive. In agreement index negative, you get the total number that are negative by either and divide that into the number that are negative by both.
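(Editor's illustration, not part of the testimony: a short Python sketch of the agreement measures just listed. The function and key names are assumptions. The negative index is written here as the mirror image of the positive one, with the both-negative count doubled, which is an assumption about the slide. The counts are inferred from figures quoted in this session for the low-risk group, not copied from the table.)

```python
def agreement_measures(a, b, c, d):
    """Agreement measures for a 2x2 table.

    a = TST+/QFT+, b = TST+/QFT-, c = TST-/QFT+, d = TST-/QFT-
    """
    n = a + b + c + d
    return {
        "overall": (a + d) / n,
        # Of the TST-positives, the share that are also QFT-positive.
        "agreement_with_tst_positive": a / (a + b),
        # Of the TST-negatives, the share that are also QFT-negative.
        "agreement_with_tst_negative": d / (c + d),
        # Twice the double positives over all positive results from either test.
        "index_positive": 2 * a / ((a + b) + (a + c)),
        # Mirror image for negatives (assumed form).
        "index_negative": 2 * d / ((c + d) + (b + d)),
    }


# Inferred low-risk counts: 1 positive by both, 89 negative by both, 98 in all.
measures = agreement_measures(a=1, b=7, c=1, d=89)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

(On these inferred counts, agreement with the positive TST is one case in eight, about 12 percent, while agreement with the negative TST is 89 of 90.)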
Next, please. I am sorry this is such a massive table, but I think I can get you through it pretty quickly.
What this does is to compare the agreement between QFT and TST on the various indices just described. Just to orient you on this table, this first part deals with the low-risk group, using the 15 millimeters induration for the skin test. This little block over here is the array of the 98 cases in the low-risk category. The plus indicates the test positive; the minus is test negative. The columns are for QFT and the rows are for TST.
So we have, for example, 89 cases that are negative by both tests. We have just one case in the low-risk category that's positive by both.
Now I have to deal somehow here with the problem that we have with basically any measure of agreement, which is the dependency on prevalence. That is, prevalence is a confounding factor in any of these measures of agreement.
How do we know that prevalence is a problem? We know that because you can take the two-by-two table and write out a probability model of that table in terms of sensitivity/specificity and the probability of agreement between the two tests being compared and prevalence. So you put all those parameters together in a two-by-two table and it gives you what we call an expected number of the four cells in the two-by-two table that you can compare with the observed.
Once you have done that, then you are free to fix the parameters sensitivity/specificity and agreement and vary the prevalence. Each time you vary the prevalence, get your expected table and calculate your agreement statistics from it and see if they change. You haven't changed the performance. What you have done is changed the prevalence. Unfortunately, all of these measures undergo a change when you vary the prevalence.
Let me just point out the problem that we have with Kappa. It is well-known, established in the literature, and it is easy to show that Kappa, where performance is held fixed, will be very low at the extremes of prevalence. Very low prevalence and very high prevalence, it will be a low value. You see that 17 percent for the low-risk group. That's exactly what we would expect. Then when you go from the low-risk category up to the intermediate-risk and the suspect category, you see that it goes up considerably.
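(Editor's illustration, not part of the testimony: a simplified Python sketch of the exercise described, holding test performance fixed and varying only prevalence. It assumes the two tests err independently once true infection status is known, which drops the separate agreement parameter the speaker mentions. The sensitivity and specificity values are hypothetical placeholders, not estimates from the PMA data.)

```python
def expected_table(prevalence, sens_a, spec_a, sens_b, spec_b):
    """Expected cell proportions for tests A and B at a given prevalence.

    Assumes A and B are independent given true infection status.
    Returns (both positive, A only, B only, both negative).
    """
    p = prevalence
    both_pos = p * sens_a * sens_b + (1 - p) * (1 - spec_a) * (1 - spec_b)
    a_only = p * sens_a * (1 - sens_b) + (1 - p) * (1 - spec_a) * spec_b
    b_only = p * (1 - sens_a) * sens_b + (1 - p) * spec_a * (1 - spec_b)
    both_neg = p * (1 - sens_a) * (1 - sens_b) + (1 - p) * spec_a * spec_b
    return both_pos, a_only, b_only, both_neg


def overall_and_kappa(table):
    """Overall agreement and kappa from a table of cell proportions."""
    both_pos, a_only, b_only, both_neg = table
    observed = both_pos + both_neg
    a_pos = both_pos + a_only
    b_pos = both_pos + b_only
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical performance, identical at every prevalence.
SENS, SPEC = 0.85, 0.95
for prevalence in (0.01, 0.10, 0.50, 0.90, 0.99):
    table = expected_table(prevalence, SENS, SPEC, SENS, SPEC)
    overall, kappa = overall_and_kappa(table)
    print(f"prevalence={prevalence:.2f} overall={overall:.3f} kappa={kappa:.3f}")
```

(Under these assumptions Kappa is small at both extremes of prevalence and largest in the middle, while overall agreement moves much less, even though sensitivity and specificity never change.)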
So when you see a Kappa that looks good, you need to ask, well, are we looking at a high prevalence population here? If it is, then, well, maybe that's just what you should expect because of the relationship with prevalence.
The same thing applies -- let me just quickly say something as a footnote here about the agreement. Where you get Kappa with a large agreement, or the expected number very large, producing a small value of Kappa, is where the numbers are concentrated on that main diagonal in just one cell. So that you see for the low-risk, where you have 90 cases, 89 of them are down there in that lower righthand corner. That is the kind of thing that gives you a large expected number and a small Kappa.
So when you get to the next level of prevalence, the intermediate risk, you see there's a much more even distribution of cases between those two cells, and that sort of lightens you up a little bit on the expected number. Then when you get to the high-risk group, it begins to fall off again because you've got numbers that are concentrated up in that upper lefthand corner.
All of the agreement indices show a pattern with prevalence. What I am going to suggest is the one that we might consider as my basic analysis of agreement between QFT and TST is the overall agreement, simply because it shows the least variation with prevalence.
Next slide, please. What I have done here is to calculate the overall agreement for the three risk groups and calculated the confidence intervals. I want to call your attention to the lower confidence limit, because that's what we like to say is what we know for sure, that the agreement is going to be possibly that low, but it may also be higher.
So for the low and intermediate group we've got 80 percent or more agreement in terms of the lower confidence limit. So I would say that basically is telling me what the chances are of agreement between QFT and TST. For the suspected group, it falls off and the agreement is down around two-thirds. If you are a user of the McNemar test, I would also say that we have significant McNemars in the suspected group. It tends not to support agreement at that level, but it is okay at the low and intermediate levels.
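(Editor's illustration, not part of the testimony: a Python sketch of a confidence interval for overall agreement and of an exact McNemar test. The transcript does not say which interval method FDA used; the Wilson score interval here is an assumption. The counts are inferred from figures quoted in this session for the low-risk group only. Counts for the suspect group are not given in the transcript, so none are shown.)

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95 percent Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant cells.

    b = TST+/QFT-, c = TST-/QFT+. Under the null, each discordant case is
    equally likely to fall in either cell.
    """
    n = b + c
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Inferred low-risk counts: 90 of 98 cases agree; discordant cells are 7 and 1.
lower, upper = wilson_interval(successes=90, n=98)
p_value = mcnemar_exact(b=7, c=1)
print(f"overall agreement 90/98, interval {lower:.3f} to {upper:.3f}")
print(f"exact McNemar p-value {p_value:.4f}")
```

(On the inferred low-risk counts the lower limit is above 80 percent and the exact McNemar p-value is about 0.07, which does not reach the conventional 0.05 level.)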
Thanks for your attention.
CHAIRMAN WILSON: Thank you.
At this time I would like to invite the panel members to ask questions of the FDA's speakers. Dr. Charache?
DR. CHARACHE: I wonder if I could ask a question of Mr. Dawson. Looking at the percent agreement, if we go back to your next-to-last slide for a moment, I think maybe it is the one before it.
MR. DAWSON: No. 7?
DR. CHARACHE: No, it's the complicated one, No. 6. If we look at, instead of the overall agreement, which is the first three columns, if we look at the agreement, just the agreement with the TST-positive and negative, there the agreement is very good for the negatives, but only 12 percent agreement among the positives.
MR. DAWSON: In the low risk, yes.
DR. CHARACHE: In the low risk. Now looking at the WRAIR, it is also 12 percent for the low-risk group, and that's the group that we're targeting. So I'm wondering if, rather than looking at the overall agreement, which certainly in low-risk patients and moderate-risk patients who are the ones where we are really looking for latency in, the important question is agreement of the positives, not the negatives. There will always be more negatives. If we use the overall agreement, we will always see very good agreement, but the group we are concerned about are those who are candidates for therapy.
So I wondered if we could look at that number for the populations for which the test is proposed; namely, those for which there is a test of latency, and just look at the agreement of the positives, the candidates for therapy, which is the purpose of the test. Because it seems to me that for most tests we either want to look at the negative agreement or the positive agreement, and for this test we want to look at the positive agreement, which in the candidate populations for therapy are going to be in the low-risk category, where agreement is extremely poor.
Then we have to decide what to do with it. Maybe it is to increase the agreement by modifying the cutoffs. But I wondered what the comments would be on that thought.
MR. DAWSON: I think it is appropriate to look at agreement with the positive TST results as long as you keep the prevalence groupings broken out, because it is drastically different.
DR. CHARACHE: Yes, I would make the prevalence grouping the candidate population for which the test is targeted.
MR. DAWSON: Are you saying that there is one part or another of this table right here that we are looking at that would be appropriate for that interpretation?
DR. CHARACHE: Well, the low-risk group would. That's not a candidate for skin testing now, according to CDC, because of false positives. But the false positives under that category would be far greater with the QFT test.
So I would want to look for the concordance with that population as opposed to negatives which will always overwhelm your ability to know about the group you want to treat when you're looking at the targeted population.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: Another question regarding statistics: Of course, the negative 99 percent, as you mentioned, is not that remarkable considering it is a very low prevalence group. So you are going to have a very high percent there because the prevalence is so low.
Do you have any idea -- we saw some two-by-two's I think earlier -- if the cutoff is 30 percent versus 15 percent, how that would affect that positive 12 percent result, or can you make any comments about that?
MR. DAWSON: I don't have any intuition about that. We did see that when they raised the cutoff for percent human response from 15 to 30, that the specificity went from 90 up to 98. So it is possible here and now, after the fact, to go through and look at the different cutoffs, which the company has been doing. We encourage them to do that because you want to learn from the PMA studies as well as to get an approval.
We do have analytical means after the fact, a type of cross-validation involving what's known as the bootstrap to validate a different cutoff after the fact, using the clinical trial data. But I'm sorry, I don't have just off the top of my head any idea what that would do for agreement.
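(Editor's illustration, not part of the testimony: a Python sketch of one simple way the bootstrap can be used to gauge how stable an after-the-fact cutoff is. It resamples a low-risk group with replacement and recomputes specificity at the chosen cutoff each time, which is simpler than a full cross-validation. The response values are hypothetical placeholders, not trial data, and treating a response at or above the cutoff as positive is an assumption.)

```python
import random


def specificity(responses, cutoff):
    """Share of presumed-uninfected subjects who test negative (response below cutoff)."""
    return sum(r < cutoff for r in responses) / len(responses)


def bootstrap_interval(responses, cutoff, n_resamples=2000, seed=0):
    """Percentile bootstrap interval (2.5th to 97.5th) for specificity at a cutoff."""
    rng = random.Random(seed)
    n = len(responses)
    estimates = sorted(
        specificity([rng.choice(responses) for _ in range(n)], cutoff)
        for _ in range(n_resamples)
    )
    lower = estimates[n_resamples * 25 // 1000]
    upper = estimates[n_resamples * 975 // 1000 - 1]
    return lower, upper


# Hypothetical percent-human-response values for 20 low-risk subjects.
responses = [2, 5, 8, 11, 12, 14, 17, 21, 26, 33, 4, 6, 9, 3, 1, 7, 10, 13, 16, 35]

for cutoff in (15, 30):
    point = specificity(responses, cutoff)
    low, high = bootstrap_interval(responses, cutoff)
    print(f"cutoff={cutoff}: specificity={point:.2f}, bootstrap interval {low:.2f} to {high:.2f}")
```

(The same resampling loop could wrap any agreement statistic in place of specificity, which is how the effect of a new cutoff on agreement might be examined.)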
CHAIRMAN WILSON: Dr. Janosky?
DR. JANOSKY: The question is either for Mr. Dawson or Dr. Sacks. I want to go back a few levels, sort of thinking about the data and analyzing the data for a second. The sponsor had told us this morning that the values of test performance for the TST are quite low. If we use that as an assumption and we work from that, when we see discordance with these two tests, do we have any hint as to what might be going on?
I ask you, when you answer that, to please think about the fact that the odds ratio for the Asian population that the sponsor reports is about a 5, and the odds ratios for some of these other personal characteristic variables are quite high in the discordance.
MR. DAWSON: I don't have any analysis to offer on the discordance. Sorry.
DR. JANOSKY: Okay. I am still trying to tease apart as to, if we're trying to evaluate this test based on an imperfect test, who are we penalizing when we come up with disagreements? I mean, just think of some ways to sort of try to answer and think through the question, but since you two are very close to the data, I was wondering if either one of you had worked through some of those hypotheses.
MR. DAWSON: If Leonard doesn't have an answer, it may be that the company does because the company always knows the data better than any of us at FDA.
DR. JANOSKY: Well, I would feel comfortable also asking the question for the sponsor.
DR. SACKS: This is nothing really new, but I think the other way in which a clinician would look at the data is in terms of the TB risk. Obviously, in a population where the risk is negligible one would like to see the lowest positive rate; in a population where the TB risk is highest, one would like to see the highest possible rate, bearing in mind the caveats for the different types of tests.
DR. JANOSKY: Yes. When I took a look at one of the tables that you presented today, which I thought was very illuminating, by the way, the one where you were looking at the different populations and the expected prevalence rates in both of those tests, if I think about it from a population perspective, my conclusions of those tests might be that I'm very comfortable with it. If I think about it on an individual basis, that is what I am trying to grapple with because that's really where we are.
DR. SACKS: Yes, I think, as we get down to the level of the individual, not only are the overall prevalences of positive tests in each population group important, but the concordance within those, and that's what I tried to highlight with the Venn diagrams.
Personally, I am not sure how in those groups one does interpret discordant results, a positive QFT with a negative TST, or a positive TST with a negative QFT. You know, all I can say is that with a TST, with all its pitfalls, at least it has some clinical validation over the many years of use. We know the percentage of patients who are going to get TB, if we found a positive TST. We know that TST is likely to convert if patients have been exposed to TB. So we have some sense of how the TST behaves clinically, but I'm not quite sure how to evaluate the QuantiFERON.
DR. JANOSKY: So, in that respect, you are more comfortable sort of putting the onus on the new test as opposed to the TST, just because of the performance and the current approval? Is that what you are concluding?
DR. SACKS: Well, in the absence of data, I think what we would have to do, the way I would phrase it is we would need to see data to validate the discordant results by QFT.
DR. JANOSKY: Okay. Then that goes to the question that I asked. Is there any information available besides seeing some of the discordant personal characteristics data that were presented in the application?
DR. SACKS: I will defer to the company there. I don't have any additional data.
CHAIRMAN WILSON: Would anyone from the sponsor like to comment on that?
DR. RADFORD: First, I will deal with the issue in the low-risk group, which we're actually stressing here because it is the one with the 12 percent.
The thing that we would actually like to make absolutely clear here is that this is an extremely low-risk group. This is a group that has been deleted on every risk factor that we can find. I would note that the FDA noted that there, in fact, in the initial classification we actually had to go back and delete out people who were set perhaps initially. No acquired risk. There is no risk.
So they are at absolutely no risk and put there because there is no reason to believe that any of them have tuberculosis whatsoever. So the point we make there is that we are not really looking at that data for concordance. We're looking at what you might call the random or the background variation of either test. Given that point, that is why we stress the 30 percent is a more effective cutoff in a very low-risk group because you don't want to show up in low-risk groups a large number of individuals.
I can answer the two-by-two table at the 30 percent margin by saying, in fact, there is no concordance. We actually have no double positives and we have two individually positive for the TST and to QuantiFERON at the 37 group, and the rest of them are the negatives.
But I think that is the point that we would like to stress: that if you actually start focusing in on the low-risk groups, the WRAIR one group, the CDC one group, you are looking at a group that is stressed to have no contacts, no possible exposure to anyone with TB, nothing. In fact, you will notice in the WRAIR group we even took out people from an incidence of greater than 10 in 100,000 states of the United States. Now that is a very severe cutback. So we don't expect great concordance in that. Of course, it is a low incidence group, and of course the cutoff will be low, as discussed.
So I don't think we should actually focus in on concordance in low-risk groups because basically none of these people probably have tuberculosis. That is why we say we should raise it up to 30 percent, in our case to get that specificity.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: The concordance is also extremely low. It is not 12 percent. I didn't do the calculation, but it is maybe 15 percent in the secondary risk group at WRAIR. Those two groups, one and two, were added for analysis as being those that were candidates for the test.
DR. RADFORD: Perhaps I'll make this point clearly: In the ATS and the CDC guidelines, it doesn't say: Test people at no risk for tuberculosis. It says: Don't test people with TST with no risk for tuberculosis, but if you must, use the 15-mm cutoff. Low-risk people aren't generally recommended to be tested. The people who are recommended to be tested are those at some risk of latent tuberculosis detection.
The WRAIR two group, again, is in fact a fairly limited risk there because they're the group that's actually incorporated -- the only risk factor incorporated is they came from a U.S. state with greater than 10 cases of TB per 100,000.
Jim, would you like to speak to that?
DR. ROTHEL: Yes, if I could just -- we are not proposing to use the 15 percent cutoff for low-risk people. We are proposing to use the 30 percent. So in both of the low-risk groups from the WRAIR study, specificity is not yet equivalent to the TST.
But I want to come back to your discordance question because I don't know if we totally addressed what you were asking.
DR. JANOSKY: You didn't, so thank you.
DR. ROTHEL: I think it is terribly difficult to try and resolve what the real result is in human studies. They're going to be very long-term studies. They're going to take us a long time to do, confounded by the fact that if you identify an individual as being positive in a test, you may have to prophylaxis them. So, therefore, the possibility of their coming down with disease is vastly reduced.
So it is basically an ethically difficult study to do and a very long-term study. I think the best evidence comes from the bovine data, where we can actually kill the animals and we have a gold standard, or that is about the only conclusion we can draw without getting into terribly complicated, long-term studies that we probably wouldn't be able to ethically do.
CHAIRMAN WILSON: We have time for one more question. Mr. Reynolds?
MR. REYNOLDS: On the retesting of the discordant results, does anyone know how close the initial result was to the cutoff? Anyone from the manufacturer have any idea whether those discordant results have changed on retest, how close they were to the cutoff?
DR. JOLLY: If I might be allowed to address that question, Mr. Chairman?
CHAIRMAN WILSON: Yes.
DR. JOLLY: I can't give you quantitative answers. I can tell you that almost all of the changes were very close to the cutoff. I think this is a characteristic which is inherent in any test where we are trying to find a magic number. I think the strength of any quantitative test -- and this includes the TST as well as the QFT -- is that there is an underlying numeric quantity which allows us to alter the cutoff appropriate to this.
DR. NOLTE: Can I get a clarification on the retesting? For the QFT, that was a second sample drawn at another point in time? Or? Clearly, the skin test was.
DR. ROTHEL: Yes, I think Jerry Mazurek who is here from the CDC might be able to address that accurately, but from my memory anyone with a discordant result in the CDC study was meant to have another blood drawn within two months. Yes, Jerry? Yes. Thank you, Jerry. Retested, a very small percentage of those individuals that had discordant results were done. Some were retested as soon as a week after, and the others were tested up to a month after the initial test.
CHAIRMAN WILSON: Thank you.
At this point I would like to move to the open public hearing.
Two individuals have notified the FDA that they would make a public comment. The first is Dr. James McAuley from Cook County Jail, Illinois, who is going to discuss difficulties with tuberculosis testing.
DR. McAULEY: Thank you. My name is Jim McAuley. I'm the Medical Director at Cermak, which is Cook County Jail, one of the larger jails in the country. I have done TB control for about 10 years. I will make it very brief. I will just give you a quick overview of how we use it.
I do a lot of actually teaching on tuberculosis. I will say that when I teach, I always say that if you're going to use 15 millimeters, you shouldn't have done the test. I mean, that's really functionally how I think of it. I have always worked in high-risk groups. So, for me, when I talk, think of my population as being right in the middle.
I would also say that clinically I am very much a clinician in this regard: I don't use it at the other end either. If they clinically have tuberculosis, I don't use the skin test. I use my clinical and my laboratory. If they have a smear-positive, I see what that organism is.
Prisons and jails are an important environment because there are 2 million people behind bars in the United States with 600,000 in jails. Jails are pre-trial detention centers. So you're awaiting trial, or if you have been incarcerated for less than a year. Prisons are where you go for a longer period of time.
This is a high-risk group. This is a group that is a targeted testing group by the CDC's LTBI guidelines. Six million people pass through our correctional system each year, so a large segment of our population. It is mostly individuals who are high risk for tuberculosis.
It has been growing, so I think it is a population base that needs to be addressed from a public health point of view. This just gives you a sense.
Now, again, I work in a jail setting, which is a passthrough population. In our setting the majority are non-white and usually of lower socioeconomic status.
Again, I am going to go quickly because I just want to give you a flavor of what environment we practice in and then how we use the TB test. We have a lot of public health issues we address. The one we are obviously focusing on is tuberculosis, but there is a lot of HIV and AIDS in the correctional system. In our jail setting 2.5 percent are HIV-infected, but in New York it has been as high as 15 to 20 percent in serosurveys in their jail system. We also have a great deal of hepatitis C.
It is a congregate setting. So there are studies that I will show you real briefly in a second that show that jails amplify tuberculosis transmission in the community. In fact, I will mention it now, but in Tennessee 42 percent of their active tuberculosis had passed through the jail system in the preceding year. So they speculate that their jail was actually the transmission foci. In New York active tuberculosis, one of the independent risk factors for developing active TB in New York City is having spent time in the correctional setting. Again, the case rates for active disease are much higher.
So within that setting we have a fair bit of active disease. Now we want to target, as our cases go down in the U.S., we are really focusing on what to do with LTBI, or latent TB infection. So that is really the focus population.
Again, I don't think either of these tests, to my clinical judgment, are that important for active disease. We use chest x-rays. We use symptoms. We use all of that to determine active tuberculosis, but what we ask ourselves is: Can we identify people who pass through a correctional setting who are at high risk for tuberculosis and can we get them treatment for their LTBI, so that they do not develop tuberculosis down the road? That has been our big focus at our site.
Some of the references you have of the publications that discuss tuberculosis in prisons and jails, and, again, this is the Tennessee study, which basically said that it was very important.
I want to get to the -- maybe I will pass through the immigrants, because I am looking at the time and I know that there are people needing to go on. Again, I want to get to just what we are focusing on here, screening of this high-risk population.
We do also screen employees. So there are two ways in which we look for tuberculosis in our setting. The CDC says that we should have basically an appropriate policy. I also think it is very important to keep in mind that a jail in Chicago is not the same as a jail in Montana as far as TB goes. So in a jail in Montana you might not do either test. Always keep that in mind.
So all TB is local, and I think it is interesting to hear this discussion of 10 cases per 100,000 being the high risk. If you are from Illinois, where we are one of the high rate states, comparable to most of your southeastern states, if you are outside the metropolitan Chicago, your case rates of TB are about 2 per 100,000. So you are actually a low risk. So if you are a military recruit from rural Illinois, you're obviously a low-risk person, very low risk, but you would have been lumped into high risk. Conversely, the alternate would happen if you were from an urban center that was diluted by a rural population -- basically, the imperfections of all this epidemiology.
So at our site we screen 100,000 detainees a year. That's our passthrough population. On any given day, 10,500 detainees live on a 100-acre campus. So we have both geography, a large compound to deal with, and volume, 250 to 300 individuals passing through on a given day.
When you pass through our system, we screen you medically and we look for mental illness, and we do a mini-chest x-ray because active disease is the thing we are worried about from a transmission point of view. We do place a skin test. Frankly, I wonder if I want to place a skin test. I think it is an important public health service, but it is not very important for my institution, if you think about it, because I really need to just look for active disease. I will show you some data in a minute about why I wonder about whether we should place a skin test.
But having said that, many, if not most, states' regulations require correctional facilities to place skin tests because it has been entrenched as one of the things you ought to do to look for tuberculosis in a jail setting. So whether or not I believe it is scientifically valid or valid for the individual patient, I am required to place it.
So we place 250 to 300 tests over a few hours every day, and we try to read them at 48 to 72 hours. We successfully read between 25 and 30 percent of those skin tests. So 75 percent of the skin tests we place are not read.
We do a mini-chest x-ray, which is read within 12 to 16 hours. We read all of those, obviously. This is how you do it: You take the 100 millimeters, you blow it up; you look for tuberculosis.
We have found over the years that, fortunately, our TB case rates are going down. We find most of our cases by chest x-ray, but we do have some people who come in with a normal chest x-ray but give us symptoms that suggest tuberculosis.
As you would expect, our tuberculosis case rates mirror the city a little bit. We believe we have actually significantly contributed to the city's control of tuberculosis because, as an example, 60 percent of people who are homeless in Chicago pass through the jail each year. So we actually probably control a lot of the homeless tuberculosis inadvertently. So we contribute significantly.
Now to the case in point, where I think that skin tests or any blood test is important. Actually, I should take a second -- I didn't explain. I have had nothing to do with the company except they heard my presentation at a TB meeting earlier this year and asked if I would come. So they have paid my way here and for my time today.
So I say that because, obviously, I have been paid by them and they have paid my transportation, but my personal view is I would like to have a good test. I actually don't really care who gives me the good test, but I would like to have a good test.
We started looking at LTBI because we have this problem that we are placing 100,000 skin tests, 25,000 are being read. Then we started them on isoniazid, and only 11 percent completed because they pass through our jail so quickly. So we felt it was somewhat of a futile activity.
So we began using the two-month rifampin-pyrazinamide, and we got our completion rates up to 67 percent. So now I think we are actually doing a good service for the community and for the individual patient, because not only can we identify them with infections, some of them, but we can get them on therapy and actually complete therapy. So now we are a little bit more excited about our latent TB program.
But what are our big challenges left? Well, our biggest challenge is this graph, which is probably better in your handout than on the screen. The next one will show it as well.
That is, when you come to jail, the good news is you get out right away. The bad news is I don't have time to intervene in your health care very well. What this translates into practically speaking is that, as seen on the very last slide, fully 22 percent of people are gone in 48 hours. So 22 percent of the skin tests I have no chance of reading, and then the rest trickle out over time, but then I have the logistics of staffing going to find these people over a 100-acre compound who have been moved around for security reasons, not for medical reasons. That is the other reason why we can't read the skin tests.
So from my point of view, when a person enters, if I draw their blood, which I do already looking for syphilis, because we play a big role in the city's syphilis elimination program, I could at least identify those who are positive. Now can I engage them and complete them in treatment? I think I can complete more of them than I used to because I am completing about two-thirds now. How many more I don't know, but from my point of view it would be significantly improved if I could actually identify quickly, without having to bring that person back.
I think it gets to the point about, if somebody doesn't come back for the reading, doesn't that mean that they are not likely to finish their therapy, which is what I think is inherent in the question. I think in our population what it means is we are just not able to get to them to read it. Now, again, we may not complete all of them because some of them will go again.
So from my point of view, in a correctional setting a test that at least performs comparably to the current test in that intermediate group, which I think is the right group to apply any test, would be of some value to us.
CHAIRMAN WILSON: Thank you, Dr. McAuley.
The second public comment will be given by Mr. Reynolds. I would like to note that Mr. Reynolds is giving a prepared statement from the State Department of Health Laboratory in Pennsylvania.
MR. REYNOLDS: This statement is actually from Mr. William Barry, who is the Director of the TB Control Program for the Commonwealth of Pennsylvania. I will make it very brief.
Thanks for the opportunity to comment on the QuantiFERON TB test. Our hope is that the test will be very useful in the diagnosis of latent tuberculosis infections and would be more accurate than the reported 25 percent false negative rate in some PPD studies.
Our problems with the PPD include ensuring trained staff, placing and reading the test with accuracy and consistency, patients returning within 48 to 72 hours after the test is administered for reading, and difficulty in separating the true latent tuberculosis infection from positive PPD's due to BCG or nontuberculous mycobacterial infections.
Hopefully, these problems could be resolved with an ELISA test. On a practical level, would the test be able to be performed by laboratories across Pennsylvania or just the Bureau of Laboratories? This would be important to us in the rapidity of specimen submission and obtaining results.
My understanding is that JAMA will have a report on the QFT test this week. We're looking forward to reviewing it.
I hope this is of some help to you. Again, thanks for the opportunity to comment. Any questions, please give me a call. Thank you.
CHAIRMAN WILSON: Thank you.
Does any other member of the audience want to make a statement?
If not, the open public hearing session is now closed.
We would like to take our lunch break now. We will reconvene promptly at one o'clock.
(Whereupon, at 12:16 p.m., the proceedings recessed for lunch, to reconvene at 1:00 p.m. the same day.)
CHAIRMAN WILSON: All right, I would like to reconvene the meeting at this time.
This is the open committee discussion portion of the meeting. This portion of the meeting is open to public observers. However, public observers may not participate except at the specific request of the Chair.
We have two primary reviewers for this PMA submission, neither of whom would like to make individual comments. Therefore, I would like the FDA to put up the first question for the panel.
Okay, the first question states: "Did the data from the two U.S. studies provide sufficient information on the performance of the QuantiFERON-TB assay, and are there other types of data or other types of analysis that can supplement those studies?"
So I would like the members of the panel to make any comments regarding those two questions. Dr. Charache?
DR. CHARACHE: The CDC paper emphasized that one of the significant variables that were found on multivariate analysis was the differences between the five sites that did the studies. Apparently, the patients were the same, but there were differences. I wonder about looking at the two-by-two comparative data from each site and then see if we can understand the differences between sites.
Similarly, I would wonder about looking at some of the differences, see if we can understand better the differences between gender and age. I am thinking here whether this is the kind of test that would use different breakpoints by gender or by age rather than a single one for all comers.
I think it would be very helpful to look at the data for all of the groups, not in terms of the overall agreement, but in terms of the population at risk and the purpose of doing the test in a given population to determine which variables should be addressed.
CHAIRMAN WILSON: Does the sponsor have the data divided in those ways, in a way that you could present it now?
DR. JOLLY: Mr. Chairman and Dr. Charache, if I can direct your attention to page 2-189, volume 2, page 189, in this report we compare one measure of agreement between sites in the CDC dataset and also between risk strata in the same dataset.
Now I will mention here that these tables report Kappa statistics, which, as the FDA statistician pointed out, is a measure that is, if anything, biased toward low agreement in the low-prevalence groups. We chose this measure specifically because it did not give any implication of high value. The Kappa statistic is biased toward low values in low-prevalence populations. This is why we chose this statistic.
Now if you look at the first table on page 189 of volume 2, you will see that we have got measures of Kappa broken down by each of the five different sites. All the values there are uniform. There's no particular variation between the sites and the agreement or disagreement status, whereas, as has been pointed out by the FDA statistician, there are differences, as one would expect, between the different risk groups because Kappa does depend upon the prevalence in the data.
I will also point out that on the page after that there are the same figures broken down by site within this group. So we get comprehensive breakdown there, Mr. Chairman, of the measures of agreement by site.
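(Illustration, not part of the proceedings: a minimal Python sketch of the point made above, that Kappa depends on prevalence. The two tables are hypothetical counts chosen only to show the effect; they are not taken from the CDC dataset or the PMA. Both have the same 90 percent raw agreement, yet the low-prevalence table yields a much lower Kappa.)

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a: both tests positive
    b: test 1 positive, test 2 negative
    c: test 1 negative, test 2 positive
    d: both tests negative
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance alone, computed from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical tables of 200 subjects each, both with 90 percent raw agreement.
balanced = cohen_kappa(90, 10, 10, 90)        # roughly half the subjects positive
low_prevalence = cohen_kappa(5, 10, 10, 175)  # very few positives

print(round(balanced, 2), round(low_prevalence, 2))
```

(With equal raw agreement, the chance-expected agreement is far higher when almost everyone is negative, which is what pulls Kappa down in the low-risk strata.)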
DR. CHARACHE: Yes, I think what I was referring to, again, was not the overall agreement. I think Dr. Sacks pointed out that we have to know the relationships between what the overlapping agreement is and how they differ. I was thinking in terms of table 5, the factors associated with negative tuberculin tests and the positive interferon gamma from the CDC paper in which it did vary by location.
DR. JOLLY: This is the JAMA paper?
CHAIRMAN WILSON: Yes.
DR. JOLLY: Yes. Jim, do you have a copy of that?
DR. ROTHEL: Thanks. I just got this, and I copied this, and it looks quite nice.
The only comment I would like to make is that, as far as my reading of the paper and my understanding of the data -- and I wish Jerry Mazurek was here, who actually did the study -- but the discordance associated with different sites is associated with the TST. It wasn't associated with QuantiFERON. It was associated with people -- did give preference, which are just two of the thoughts from memory.
DR. CHARACHE: I was just saying I think this probably would be helpful to look at, and I think might be helpful to look at with the two-by-two tables and see how the sites compared with each other.
DR. ROTHEL: For the individual sites.
DR. CHARACHE: Yes.
DR. ROTHEL: I understand. That is a good comment. I don't believe it is in your panel pack. The only trouble is that in some sites there are as few as 15 or so people in group one, for example. The two-by-two table is not very meaningful with such low numbers, but, yes, we can provide that if you need it at a later date.
CHAIRMAN WILSON: Dr. Baron?
DR. BARON: The only other information that I think would be helpful, which you don't have, and I fully appreciate the difficulty of gathering those data, are the results of your assay and skin tests in patients who are infected with pulmonary disease of mycobacterium other than tuberculosis.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: That reminds me of another question, which has to do with the validity of the M. avium as an overall control for all mycobacteria other than tuberculosis. I think particularly in the cattle studies I would wonder about the mycobacteria that are found in the ruminant sacs of the cows.
I am wondering, if you had somebody with kansasii and you tested with the assay, whether the M. avium would be an adequate control or not. Or, if you just did the study that you did that showed that M. avium control was a very good one, where you looked at the ability of the M. avium to modify the results of the PPD, if some of these false positives you could do the same thing, but instead of using M. avium, use a different mycobacteria. I'm just interested in knowing whether we could extend the control if, instead of just M. avium as a control, there were other mycobacteria that are common causes of human disease included as part of the control. I think the control idea is terrific, however.
CHAIRMAN WILSON: Yes, go ahead.
DR. WOOD: Maybe I can just comment from the veterinary point of view before handing it over to other people to comment on the human. M. avium is actually used worldwide as the distinguisher for comparative testing, probably mostly initially because it was a fast-growing organism and you could make the PPD. But, in practice, it is actually an extremely good antigen to use, as demonstrated by its extensive use.
Obviously, we made a decision in converting to new tests just to stick with the same antigens. The only other antigen that we have extensively looked at in the cattle is for Johne's disease, using paratuberculosis antigens. It answers, I think, the question raised earlier: In the long run would this sort of technology work with MOTT infections? It is working quite well in that circumstance.
So I think you could possibly use other PPD's, but I think in general practice is showing us that M. avium is a pretty good indicator, although not absolute, like anything in these assay systems.
DR. CATANZARO: I wanted to remind us of the work that was done by the Navy when they looked at the various tuberculins from rapid growers, from yellow bacillus, kansasii, PPD-B, and from the radish, the scrofulaceum. That work was done in skin testing. From that came the concept that PPD-B from the Battey bacillus or avium was used as a representative of other mycobacteria. That has been pretty well established in skin testing.
Obviously, it hasn't been looked at by QuantiFERON. But I think that rather than looking at it as a reflection of avium infection, we should look at the response to avium as representative of other mycobacteria.
DR. CHARACHE: Thank you.
CHAIRMAN WILSON: Dr. Nolte?
DR. NOLTE: I know the intended use of this assay is not, the way you have stated it, is not to include HIV-infected patients, but, clearly, the test is going to be used in those populations, either knowingly or unknowingly.
I am wondering, there was data presented, published data presented in the packet that, at least to me, indicated that the test performance, at least agreement with the tuberculin skin test was really not that much different with HIV-infected individuals as it was with uninfected individuals. Is there any way that more data like that could be included in terms of the submission?
DR. ROTHEL: Yes, I agree, we have a fair bit of data on HIV-infected people in here. There is a paper by Converse, et al., Quatamera, et al., and the Mason study that the abstract's reported in your panel pack.
The truth of the matter is we don't believe we have sufficient data to go to the FDA to get approval for it. It is something that we may do as a post-market study to extend their claims in HIV-infected individuals, but it is not a simple study to do and quite an expensive study to do.
DR. NOLTE: I understand.
DR. ROTHEL: Yes.
DR. NOLTE: The other thing that is of concern to me is the intended use. The package insert that you folks included was a little confusing to me. In one place it said, essentially, to be used as an aid in detection of infections with MTB, and in another place in the package insert it said it is an aid in detecting latent TB infections. I am not sure -- I mean there is not a lot of data that you presented in terms of the performance of this test in active disease.
So I guess, where are we going with the intended use here?
DR. ROTHEL: The intended use is not meant to have the "latent" in there. That was a typographical error.
We see no reason not to include it for TB in general. When you are screening individuals for latent TB infection, you are invariably going to turn up, as a very seldom event, someone with active TB. We have sufficient data, we believe, to demonstrate that individuals with active TB disease are detected by the test in the vast majority of cases.
DR. NOLTE: Again, in the data that is included as part of this submission, how many infected patients are --
DR. ROTHEL: There were 54 in there, and the other data we provided in support was 129 from that Australian study.
DR. NOLTE: So we're talking about a total of 200 or so?
DR. ROTHEL: Nearly 200 or so, yes.
DR. NOLTE: Actively-infected individuals?
DR. ROTHEL: Yes, and both studies have come out with a sensitivity of 81 percent. It's not perfect, but --
DR. NOLTE: Sure.
DR. ROTHEL: -- it does definitely have utility for detecting active TB disease.
DR. NOLTE: Okay, thank you.
CHAIRMAN WILSON: Dr. Carroll?
DR. CARROLL: Yes, along those same lines, could the sponsor then clarify in terms of the labeling whether you will then seek approval for both cutoffs, the 30 percent cutoff for the low-risk individual and the 15 percent cutoff for the intermediate? Is that what we're talking about here?
DR. ROTHEL: Yes, that's exactly what we're asserting, yes.
CHAIRMAN WILSON: Dr. Durack?
DR. DURACK: With regard to the question about supplementary data, I'm sure that it's clear from the discussion that the pediatric group is particularly important, and I know you will be working on that. I would personally put that as the first priority as far as supplementary data, and I would make the additional point that this could be a group where it may be important to separate the older children from the younger children, possibly even infants, younger children, and teenagers. So I think it might be better not to just lump everything as zero to 18 for that study.
CHAIRMAN WILSON: Okay. Dr. Beavis?
DR. BEAVIS: My hope, too, is as additional data is being collected that reproducibility be looked at, not repeating a specimen, you know, different time from the same patient, but splitting the specimens and testing them in different laboratories.
CHAIRMAN WILSON: Any other comments on the first question?
Okay, if we could have the second question then?
The second question states: "Testing of control material is not available to compare results between sites in the clinical studies. Are the manufacturer's procedural and specimen controls adequate to ensure reliability and reproducibility of QFT testing between laboratories?"
Any comments or questions from the panel? Dr. Nolte?
DR. NOLTE: If I remember correctly, the only data that we saw was the data that Ms. Shively presented that was new, I mean that wasn't part of the packet in terms of the two laboratories' split sample analysis. Am I correct?
MS. SHIVELY: That was in your packet.
DR. NOLTE: That was in the packet? That's the only data available in terms of interlaboratory reproducibility?
DR. ROTHEL: The full study, yes.
DR. NOLTE: Okay. Like I suggested, that's probably not enough.
CHAIRMAN WILSON: Dr. Charache.
DR. CHARACHE: I think the studies of interlaboratory reproducibility would go a long way in knowing about the ruggedness of the test, and I think there are some questions about the ruggedness of the test, if in fact there are differences between the labs. I think this would be very helpful to us to establish that, and then you could determine the extent to which you needed outside controls.
I'm obviously concerned about the false positives because of the therapeutic implications in the low-risk populations.
DR. LEWINSOHN: What would an outside control be?
DR. CHARACHE: I'm not sure what the outside control would be. That is why I am hoping we won't need them.
DR. LEWINSOHN: I'm sort of struggling with that, I guess, because it seems like your standard curve sort of is the control in a way, I mean unless you're going to ship serum from -- or not -- well, I guess it is serum -- from these assays or it's actually I guess plasma, from other assays as a control.
DR. CHARACHE: Yes, I suppose a surrogate, at least interferon that you should get within a given range in your system if the conditions are right. It's not perfect. It doesn't start with a leukocyte, but you're in better shape.
CHAIRMAN WILSON: If there's a suggestion, a recommendation to the sponsor that they provide additional data on this, the question would be: How much data would suffice?
DR. NOLTE: Well, I mean, basically, I'm trying to remember the data that we have in front of us, but it's two sites and 50 specimens, right? -- almost all of which were positive. I think one of the points that came out in terms of this was the sort of reproducibility of negative as well. So certainly that would be a component. In terms of the numbers, I would sort of leave that up to the statisticians to give me the best sort of estimate of what that should involve. Clearly, I don't feel comfortable that I know what the reproducibility of this test is on the basis of two sites and 50 samples, most of which are positive.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: They should also include some in which the MAI was a factor or that had a high nil, to see how it came out when it was done in different places. So I think it should be just a nice gradient of tests, but I would also prefer the statisticians selected it.
CHAIRMAN WILSON: Okay. Would you like to comment?
DR. CATANZARO: I would like to remind the panel that while there was only that one formal study comparing two laboratories, that the CDC trial was conducted in five separate laboratories in five different cities. The results of the five sites are very uniform. So even though we didn't ship the patients around from one place to another to get them drawn in different labs, I think we can look to that data and see that there are significant, there are large numbers. If there was a significant variation from one lab to another, I think it would have shown up.
DR. NOLTE: You're talking about overall performance?
DR. CATANZARO: I'm talking about overall performance in five different laboratories as a surrogate for how it might work in five different laboratories. I mean it's a demonstration, I should say.
The other comment I wanted to make about reproducibility is that, while perhaps the exact study that was suggested wasn't done, that the hard data on the same individuals being tested over and over again a half a dozen times with no variation over a period of time -- it's not the same thing, but reproducibility is clearly very stable in that way.
CHAIRMAN WILSON: Any further comments? Questions?
DR. NOLTE: One relatively -- I don't know whether it is a small point or not. It is a point that bothers me, but it has to do more with -- I am looking for the slide.
DR. BARON: Could you speak into the microphone, please?
DR. NOLTE: I'll try as soon as I find the material.
DR. BARON: Okay.
DR. NOLTE: Basically, the decision thresholds or the values that are used to determine whether you have a valid test, there is this 1.5 international unit per milliliter for the mitogen versus nil that's the minimum to have an acceptable test? Am I stating that correctly?
Then we talked about being able to measure a 15 percent human response with the limit of detection of the assay being 1.5 international units. I think I posed this to the sponsor in a written form, and I didn't understand your answer, so that's why I'm asking again. It came up on Ms. Shively's slide as well: that if that's the case, then don't you have to have a 10 international unit per ml minimum mitogen versus nil response to be able to reliably measure a 15 percent --
DR. ROTHEL: Yes -- no, because you can have a 1.5 international units per ml for the mitogen and you can have a 1.5 IU per ml for the human PPD, and get a 100 percent response and still be positive. So it doesn't mean that you need to have 10 units in your mitogen sample to get a positive answer, if that is what you are inferring.
The mitogen is --
DR. NOLTE: That's what I'm worried about, is having an acceptable test where you have 1.5 international units per ml and then 15 percent of that being below your detectable limit, so missing a 15 percent response at the low end of your --
DR. ROTHEL: Sure, and that may be the case, but the cutoff that's been used for all the clinical trials, and was established very early on, used those criteria, and the data we have presented was generated using those criteria. Sure, it means that if your mitogen response is less than 10 IU per ml, you need a response greater than 15 percent to be positive in the test, but that same formula has been used for all clinical trials.
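(Illustration, not part of the proceedings: a minimal Python sketch of the arithmetic in this exchange. It assumes the percentage human response is the nil-corrected human PPD value divided by the nil-corrected mitogen value, times 100; that formula is an assumption made here and is not stated in the discussion. The 1.5 IU per ml limit of detection and the 15 percent cutoff are the figures quoted above; the function names are hypothetical.)

```python
DETECTION_LIMIT_IU = 1.5  # IU/ml, limit of detection quoted in the discussion
CUTOFF_PERCENT = 15.0     # percent human response cutoff under discussion


def percent_human_response(human_ppd, mitogen, nil):
    """Assumed formula: nil-corrected human PPD over nil-corrected mitogen."""
    return 100.0 * (human_ppd - nil) / (mitogen - nil)


def smallest_detectable_percent(mitogen_minus_nil):
    """Smallest percent response whose numerator reaches the detection limit."""
    return 100.0 * DETECTION_LIMIT_IU / mitogen_minus_nil


# Mitogen-minus-nil value at which a 15 percent response sits exactly at the
# limit of detection.
break_even = 100.0 * DETECTION_LIMIT_IU / CUTOFF_PERCENT
print(break_even)

# Below the break-even value, only responses above 15 percent are measurable.
for mitogen_minus_nil in (1.5, 5.0, 10.0, 20.0):
    print(mitogen_minus_nil, smallest_detectable_percent(mitogen_minus_nil))
```

(Under these assumptions the break-even mitogen-minus-nil value is 10 IU per ml, the figure raised in the question, and a sample at the 1.5 IU per ml minimum can only register as positive with a response of 100 percent, as in the sponsor's example.)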
DR. NOLTE: How often do you find values that are at that cutoff of 1.5 international units per ml for the mitogen versus nil?
DR. ROTHEL: It would be less than 5 percent of the time, off the top of my head. We could actually give you that figure accurately.
DR. NOLTE: Your colleague over there is --
DR. ROTHEL: Do you know the figure, Tony?
DR. RADFORD: The answer is it's actually a small number.
DR. ROTHEL: Talk into the microphone, Tony.
DR. RADFORD: The answer is it's actually a small number. I can tell you that in the CDC no-risk group there were none. In the other risk groups, we can dig it out, but I think in fact we're talking about two or three. It's a very uncommon event.
DR. NOLTE: Thank you. I had the feeling it was probably a small point, but I just wanted to clarify.
The other thing that I find a problem in terms of the interpretation of your test is the fact that for an avium difference to be significant, it has to be less than minus 10 percent. I actually gave that criteria to some of my colleagues in laboratory medicine and then told them that, "Well, the difference is minus 100 percent. Is that less than minus 10 percent?"
DR. ROTHEL: Yes.
DR. NOLTE: And all of them got it wrong. Now I realize that the absolute -- I mean it's a difference of -- it's an algebraic problem, but I think if you've got people interpreting this, a minus 100 percent difference is a significant difference. At the face of it that is a larger number, not a smaller number, to many people, including myself, and I realize that's wrong mathematically, but conceptually I think you might be better served by having a different set of criteria for that part of the test.
DR. ROTHEL: That's a very easy mathematical calculation. We can change it to a positive value if we want to. The truth of the thing is that we will be preparing software to provide to people who will be using this kit and having to get it approved through the FDA, obviously.
DR. NOLTE: Yes, just don't convert it to logs, okay?
DR. ROTHEL: Yes. Done.
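(Illustration, not part of the proceedings: a minimal Python sketch of the sign confusion described above. The minus 10 percent threshold is the figure quoted in the discussion; the function names and the idea of reporting a flipped, positive quantity are hypothetical and are not the sponsor's software.)

```python
AVIUM_THRESHOLD = -10.0  # percent, as quoted in the discussion


def meets_criterion_signed(avium_difference):
    """Signed form: the difference must be less than minus 10 percent."""
    return avium_difference < AVIUM_THRESHOLD


def meets_criterion_flipped(avium_difference):
    """Same rule on the sign-flipped value: it must be greater than 10."""
    flipped = -avium_difference
    return flipped > 10.0


# A difference of minus 100 percent satisfies the signed criterion, which is
# the case that readers tended to get wrong.
for value in (-100.0, -10.0, -5.0, 25.0):
    print(value, meets_criterion_signed(value), meets_criterion_flipped(value))
```

(Both forms give the same answer for every input; the flipped form simply lets a reader compare a positive number against a positive threshold.)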
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: Yes, I'm just returning again to the question of reproducibility of the test. Again, from the CDC paper, there's only a single variable that was associated with having a negative skin test and a positive interferon assay, and that single variable was if you were enrolled in site C. On the other hand, there were three reasons for having a positive skin and a negative interferon. One was BCG vaccine; one was an avium complex assay, and the third was enrollment in site E. So I do think we really need to know about the relationships between these different labs in terms of reproducibility of testing.
It's not just enough when you add the negative and the positive agreements together. We really should know more about it.
CHAIRMAN WILSON: Thank you.
Yes, Dr. Lewinsohn?
DR. LEWINSOHN: Thank you.
I had a question that sort of related back to what Dr. Catanzaro had said earlier in the sense of this being, in a sense the clinical function being an integrative one. It is true that we tend to look at the intensity of the TST test as being a surrogate for true TB infection, certainly if it's greater than 15 millimeters or not, especially as we have been debating whether to change the cutoff for those people who we would consider to be low risk.
What I'm wondering is, is there a way to report out the test that would give some more information to clinicians? So, for example, you might say it's positive, but like weak, strong, low, so that a strong test might give you greater confidence in the low-risk population that it's a true positive.
DR. CATANZARO: I think that's an absolutely key factor, and, yes, the intention is to report that it's positive and how positive it is. Clinicians are always going to be faced with the problem of having to integrate T-cell reactivity with the rest of the analysis.
We have been talking about those cutoffs of 5, 10, and 15 as if they're written in stone. In fact, those 5, 10, and 15 have changed over my career in medicine a great deal from time to time, and today they're different from place to place. Those are the criteria that we have been using that CDC has been recommending.
I live in the State of California, which has a lot of the TB problem. The State of California says we don't accept those criteria of CDC; we have our own criteria for what we're going to interpret as a positive or negative skin test. I don't want to enunciate what those are. I simply want to say that clinicians and public health officials will change those cutoffs.
So this panel is not going to put those cutoffs in stone now and forever, probably for a week or two.
DR. LEWINSOHN: So the data that you would get back would be like --
DR. CATANZARO: Quantitative.
DR. LEWINSOHN: -- the percentage human response or something like that?
DR. CATANZARO: Yes.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: This kind of goes back to a question I probably wasn't clear about earlier this morning. Is there any data that correlates the positivity of the interferon gamma assay with the raw measurement of the induration, as opposed to the classification of the subject to the risk group and the interpretation? Because that, to me, is probably a better way of looking at this. We're mixing apples and oranges here because, as we just heard from Dr. Catanzaro, over his career, and over mine too -- I'm getting older -- the interpretation of the PPD has changed. That's based on years and years of experience.
So we're comparing two different assays here, but the result for one of the assays is an interpretation based on classification of risk group. Am I on the right track here?
So is there any data that just basically looks at induration? There was some in the handout, I think, some correlative data looking at that agreement, induration compared with the positivity of the gamma interferon assay.
DR. ROTHEL: I think that our best indication of that would be the regression comparing induration versus percentage human response. That's been done in the vast majority of papers that have been published, and just about all of them have found that there is a significant association in that regression. A couple of them have found no great association, but in the vast majority, yes, there is. The higher the induration, the higher the percentage human response you will get.
CHAIRMAN WILSON: Okay, any other comments on the second question?
If not, could we have the third question?
The question states: "In which populations of individuals could a positive or negative QuantiFERON-TB assay provide clinical utility alone or in conjunction with TST? Are there labeling restrictions, if any, that would add to clinical utility for any population groups?"
DR. BARON: Well, Dr. Nolte has already talked about the fact that HIV-infected patients would be another indication for labeling. So we think once that group gets properly assessed, they should be included in here and children as well.
CHAIRMAN WILSON: Other comments or questions?
DR. NOLTE: I think we have touched on this, but I mean the relationship between CD-4 positive cell counts and this assay is known? We haven't seen the data, but I get the impression that that data is available? Is that one way to deal with this problem of using the assay in populations that you have some concerns about in terms of being immunocompromised? I mean the specific immunocompromise that we're worried about is depressed CD-4 positive cell counts?
DR. ROTHEL: I suppose my answer is the same as the answer I gave before. We do have a considerable amount of data showing that it works generally in cases of low CD-4 counts and HIV and other compromised people, but we don't have sufficient data to support its registration and approval by the FDA. So we have to go and get more data. Probably what we will do is a smaller study. We have a lot of data already, but we need to do a working study in the U.S. to extend that claim in the HIV-positive and immunocompromised people.
I should add that --
DR. NOLTE: Is it a realistic way to think about getting around this exclusion of immunocompromised patients is to hang it sort of on the CD-4?
DR. ROTHEL: Yes, I think that's quite an appropriate way to do it. If a person's HIV-infected, it doesn't mean they're immunocompromised.
DR. NOLTE: Right.
DR. ROTHEL: You should be looking at their CD-4 count and relating performance to CD-4 count or some other measure of immuno-activity.
DR. LEWINSOHN: Can I ask another question?
CHAIRMAN WILSON: Dr. Lewinsohn.
DR. LEWINSOHN: So I guess this gets back to that, I know admittedly, small number of patients who got TST's and QuantiFERON tests, which seemed to show more variability than you guys had seen when you just did the repeated testing on an individual over time, which in my mind raises this issue of whether the TST and QuantiFERON tests could interfere with one another or, specifically, whether the skin test interferes with the QuantiFERON test.
So would you propose that that's a part of the labeling, at least to make that suggestion, I mean to suggest to do the QuantiFERON first then?
DR. ROTHEL: Yes, I agree. I think I acknowledged that to you this morning, that we probably should have made the labeling to say that you shouldn't skin test within "X" number of days, probably 30 days, the same as just for a skin test.
CHAIRMAN WILSON: Dr. Reller?
DR. RELLER: Although it's plausible that patients with intact, or reasonably intact, CD-4 counts either before or after therapy would respond like most other individuals, I would think until the data are in hand that one couldn't count on that.
Secondly, do you have any experience with transplant populations? At least in this country a growing number of patients, and a rich source of clinical tuberculosis, sometimes recognized late, at least are recognized in our center. So that, theoretically, either before transplantation or at some point you would want to know that. Do we know what the effect of the whole range of immunosuppressive agents to preserve transplanted organs, what that does to this test?
DR. ROTHEL: No. That's a very good question, and we haven't done it. That's why we've contraindicated or limited the applications for those individuals.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: To address this specifically, which is, in which populations of individuals could a positive or negative assay provide clinical utility alone or in combination -- it seems to me that if you have a negative test for either and they have good, relatively good concordance in people with active tuberculosis, it would be a suggestion that you ought to look for other causes of the patient's pulmonary disease and not assume that it's only tuberculosis.
I think there the caveat is that neither is perfect. So you can't rule out TB. But it would be highly suggestive, based on the data that we have, that there is tuberculosis there.
In terms of looking for latent TB, I think right now we would probably want to see what happens with the change in the end-points moving up on the curve, to get rid of a lot of the false positives, because, hopefully, it would be useful there. But I think right now it could be problematic in causing overtreatment of a very large population.
CHAIRMAN WILSON: Thank you.
Yes, Dr. Cockerill?
DR. COCKERILL: Barth brings up a good point about the transplant patients where they're febrile and we're trying to figure out how the investigation is going to go. If you have a negative tuberculin skin test, the patients may be anergic. So we will check anergy.
Is there any data with the mitogen control with this assay as to mitogen-negative patients? Are they anergic? Were any additional studies done?
The reason I am bringing that up is that, if we have a mitogen-negative result, would it be possible to suggest that the patient have a full complement of anergy skin testing? Is there any data related to that?
DR. CATANZARO: I think before I let Jim answer the question about the mitogen, I want to remind you that CDC specifically recommends against anergy testing to assist in the interpretation of tuberculin skin tests. There's no correlation between those two things, and they recently submitted an MMWR advising people not to do that.
So I don't know if you want to comment about that.
DR. COCKERILL: Thanks. I didn't know that.
DR. ROTHEL: I can give you a little bit of data on that from a study done in Kenya. We did, from memory, I think 100 individuals, and about 16 percent were HIV-positive, with various CD-4 counts ranging down to 6. We looked at the mean mitogen response of the individuals who were HIV-positive compared to those that weren't and also stratified it by CD-4. Yes, there definitely is a dropoff in the mean mitogen response for those individuals with low CD-4 counts and with HIV infection.
But the trouble is there is variability. So a person can have a CD-4 count of 200 and have a decent mitogen response, whereas a person with a CD-4 count of 1,500 can have a lower response than that. So I don't think it's a definitive measure. Definitely if a person hasn't got a mitogen response, yes, you go looking.
CHAIRMAN WILSON: Other comments or questions? Dr. Charache?
DR. CHARACHE: I'm just wondering, I was just thinking about the mitogen as being a very nice side offshoot of this test, knowing about it. Is the mitogen stimulation quantification that's used here adequate to predict anything about the ability of a given patient to respond? Because you've got the data anyway. Can you use it? Or do we know if you can use it to predict responsiveness to mitogenic stimulation? And is that data known for those that were PPD-positive and interferon-test-negative?
DR. ROTHEL: I think Tony can address that question specifically. I will just state that there is something else we see as an application for the QuantiFERON technology, is a totally different test apart from TB, which we're here to talk about today, which is a measure of immune-competence, but we would use antigens other than mitogen.
But Tony can address your question specifically.
DR. RADFORD: Of course, as the ratio is what's used, it's not dependent upon the actual absolute mitogen response. We have, in fact, analyzed the mitogen response and the TST positivity.
One of the interesting facts is that you're twice as likely to be skin test positive if your mitogen response is above 50 international units per ml. However, we still don't believe we actually have enough data on the HIV population to address that.
CHAIRMAN WILSON: Any additional comments?
Okay, let's move to the fourth question.
The question states, "When the QuantiFERON-TB assay is positive or negative and not used in conjunction with TST, can available types of data from the two clinical studies be used to interpret the probability of TB infection for individuals with low, moderate, or high risk?"
DR. BARON: Can I clarify that question? Do you mean all by itself without any other clinical data?
CHAIRMAN WILSON: Steve, do you want to clarify the question?
MR. GUTMAN: Sure. The question, the heart of the question, is: If this product is approved, how to label it, what kind of message to give to people who use it. So, yes, we are looking for advice on how to characterize performance on the labels, and we need to know what advice to give people who might actually buy the test and use it.
DR. BARON: What does the skin test labeling say? I mean, I can't believe it would say: Here's your answer, all by itself. I am sure there must be all kinds of caveats with it that say, "in conjunction with a history" and "physical findings," and all those other things.
DR. ROTHEL: If I can briefly say, yes, it does. Their labeling claims are nearly identical to ours, and the diagnostic -- the detection of infection with MTB, but then they have a whole lot of caveats in interpreting in conjunction with all the clinical findings, history, et cetera.
MR. GUTMAN: I think we have somebody from CBER here who might be able to elucidate labeling because that's obviously from a different shop, but we'd be happy to share that.
CHAIRMAN WILSON: Could you come to the microphone, please, and identify yourself, please?
MR. MORRIS: Yes, I'm Sheldon Morris. I'm the Chief of the Mycobacteria Lab at CBER. Frankly, I don't have these labels memorized, but it basically says, as an aid in the diagnosis of MTB infections, and then it gives some caveats.
MR. GUTMAN: So I guess the question on the table is what you would like to see in this product. Do you want to see less? Do you want to see more?
DR. BARON: Yes, it looks good as they had proposed it in their written proposal with those caveats.
CHAIRMAN WILSON: Any additional comments? Dr. Nolte?
DR. NOLTE: I guess you're asking about the statistics, I mean how to describe the performance?
MR. GUTMAN: Well, I'm asking -- one way to do that is not to describe it. It's to provide just the most general contour of association. Another is to eloquently and extensively describe it. We have experience in the Division with both.
DR. NOLTE: I mean, clearly, they have data that addresses the performance characteristics of the test relative to TST and the three groups that you outlined there.
MR. GUTMAN: And would you like to perhaps --
DR. NOLTE: I think it would be reasonable to include that in the package insert.
CHAIRMAN WILSON: Dr. Ng?
DR. NG: I think the most illuminating way of looking at this data was Dr. Sacks' presentation of Venn diagrams, because I think the user really wants to know what the non-concordance rate is, if you're just using a QuantiFERON assay and you don't have a TST to compare it with.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: But I presume that would be modified based on the 30 percent, which we haven't seen that data.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: I think it should go further than the current physicians' instruction section, which has a paragraph on page 139, "The possibility should not be excluded that a positive QuantiFERON-TB test is due to a prior BCG vaccination." It should also say that false positives exist or something, that it's not only BCG.
CHAIRMAN WILSON: Dr. Carroll?
DR. CARROLL: I would just like to reiterate what Dr. Cockerill said. I would like to see the Venn diagrams with the 30 percent cutoff, particularly in that low-risk group. I think that would be very helpful in terms of our comfort level with that low-risk group and the false positivity rate.
CHAIRMAN WILSON: Okay. Dr. Lewinsohn?
DR. LEWINSOHN: I was trying to think, I mean, the sort of setting, I guess, that it seems like we would most want to have this test would be in the setting of something like a contact investigation where we're really trying to tease out who's been recently infected or not. Obviously, we can't really tell who's truly infected, you know, where there is a discordance between those two data.
So are there settings where it should be recommended that you would do both tests, the hope being that either would be sufficient or would you propose that we would just do one or the other in that kind of a setting? It's a question to you, sure.
DR. ROTHEL: I think the talk we heard from Jim McAuley would say that it was perhaps a waste of time. In a real setting why would you use a skin test and miss half of your results?
DR. LEWINSOHN: Well, but he's looking at a different -- I mean he's screening for active disease where there is high risk of spread. In a contact investigation you're going to use your skin test information to figure out kind of how far to go, because each person who you find who's positive may have been a contact. So that turns out to be very practical there.
I'm just curious to know, would you do both, the idea being that either one would be sufficient to make you think they're a converter or --
DR. ROTHEL: My personal view would be, no, I wouldn't, but I'll let Tony respond too.
DR. CATANZARO: I think it would be tremendously burdensome to suggest to do both, and it would be analogous to say, well, why not do all three? Why not require Connaught, Tubersol, and QuantiFERON? I think that would be a very burdensome thing to do.
I think that very nice data has been presented here to show that the QuantiFERON is at least as good as the tuberculin skin test, and the physicians, the public health people can make a decision based on their circumstances which one to do. Then regardless of which one they do, it's an aid to the diagnosis; it has to be put in the clinical context. Lots of other information has to be collected before you go ahead and prescribe treatment.
So I think there's lots of safety leaving it as it is, as an aid, and I would be horrified if this panel recommended to do two or three tests every time we wanted to ask the question: Does the patient have latent tuberculosis infection?
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: If it's a false positivity specificity issue in your low-risk group and your incidence of a positive result for the QuantiFERON is very low, then confirming that with a second test may be reasonable. I'm not suggesting that, but based on the data for the 15 percent cutoff, we see 7 versus 1, I think, positive. There's a 12 percent agreement. But the total number for that low-risk group is very, very low, I think, in what I'm seeing.
So one could consider a two-tiered approach, not suggesting that, especially if the 30 percent doesn't decrease that "false positivity."
CHAIRMAN WILSON: Dr. Reller?
DR. RELLER: I can see two tests when one is very sensitive but lacks specificity, and there are ample models for this. But in this case I've seen no data that suggests that they're really complementary, and it would be to me defeating the whole purpose to have two tests.
Each has its limitations, but unless there were convincing data that you did one test and then the other one added something to what you already had, and vice versa, I think that would be the wrong way to go, particularly one of the rationales for considering this approach is all of the pitfalls with skin testing in the first place in terms of followup, and quite apart from interpreting, all of the things that have already been discussed. So I think, from what I have heard, the skin test and this test are not of the genre that would be logically done in sequence.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: I'd like to agree with both Dr. Reller and Dr. Cockerill.
I'm going to suggest that in the high-risk group they're close enough. So perhaps in the high-risk group, since the sensitivity is better with the skin test, if I got a negative with the interferon assay, it might be worth doing the skin test, but not on general populations, and that's going to be a small number of people.
I would say the reverse is true with the lower-risk groups one and two of CDC and groups one, two, and three of the WRAIR study. For those, if you got a positive QuantiFERON test, it would be worth confirming that it was really positive with a skin test, because the skin test is going to overcall positives in the low-risk group, and it's, therefore, a safety valve to get rid of the false positives. Otherwise, we are going to have, with this only 12 percent agreement in the low-risk group, if you're doing case studies, surveillance kinds of things, I think it would be helpful to take that small population which give you a positive QuantiFERON and follow it with a skin test.
CHAIRMAN WILSON: Dr. Reller?
DR. RELLER: This is probably the only time I've ever differed with Dr. Charache. To me, there are three groups of patients: the one that we're really worried about, and especially in a patient population that I realize the test is not at this point, would not, if approved, be approved for use in HIV-positive transplant patients. But if I'm really worried and the test is negative, I'm going to pursue other things: bronchoscopy, whatever it is going to take clinically to get the diagnosis excluded comfortably; that is, active disease excluded.
If it's a very low-risk population, I think we're wasting time and effort on patients who shouldn't be tested in the first place. And in the middle group the test is as good or better than skin testing or it shouldn't be approved for use, and if it is, you realize that neither of them is going to be perfect, and you do it. If things change in the patient, you escalate the diagnostic process. But you've got an opportunity, in passing through some of the testing operations that we saw portrayed here, and you do it and act appropriately on the results and get on with things.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: I'm sure this is the only time I've disagreed with Dr. Reller.
Whenever we both have our hands up and he's called first, I don't have to speak.
But I think in this case I'm very concerned about the five- to tenfold increase in use of prophylaxis for the latent TB possibility. Now I don't know what that percentage will be when we look at different breakpoints. That may solve the problem.
But where we have such a low agreement, unless the agreement is 80 percent or more, I think it would be worth, rather than using prophylaxis, to do a skin test, and certainly a lot less cumbersome to the patient than the time they would need to be on therapy.
CHAIRMAN WILSON: Dr. Cockerill?
DR. COCKERILL: Well, I agree with both. I'm trying to maintain my friendship with both.
But I would agree that, first of all, most of us would not be doing risk one testing except for contacts. So if we put it in that context, that this is a contact that we're screening, putting aside the Army and whoever else is screening probably inappropriately for the risk one, you will have six more with a cutoff of 15 versus 1 in this group, and I don't know what that percentage is, that will then, based on current recommendations -- and I'm not up-to-date on all the CDC recommendations -- you would treat with either six months of Isoniazid or two months of the combined.
That, to me, if we would stay at a 15 percent, one would then consider another test to substantiate that result. I don't know what that total incidence is, but it probably is pretty low. Even though we have that discordance and the agreement is only 12, there were very few that actually tested positive, either one, in that risk group one.
Now the 30 percent cutoff, when we see that data, maybe we'll get down to 3 versus 1, and then I agree with what you're saying.
CHAIRMAN WILSON: Dr. Sanders?
DR. SANDERS: Two comments -- actually, two questions. Although I agree with Dr. Charache's recommendations for potentially handling the risk groups, are we saying that this is a recommendation we're actually making and asking that to be printed in the package insert, if ultimately approved, or are we making this as a recommendation that could be considered by physicians treating the patient who actually has that patient in front of them to consider? So that's one question.
And then the other has to do with, we continue to speak about the 15 percent and the 30 percent cutoff, which we have not seen the 30 percent data. So I guess the other question is: Are we going to make a recommendation today, having not seen that data?
CHAIRMAN WILSON: Do you want to comment, Dr. Charache?
DR. CHARACHE: Yes. I think before we make any recommendation of follow-up testing, the most important points to be made are those that you and Dr. Reller have made, which is the individual physician will be assessing the patient. But I think guidance should be given.
Now in terms of making a recommendation of switching the cutoff to this 30 percent rather than 15, I would recommend that the recommendation be made that the cutoff be reviewed for each category of patient and be adjusted to optimize the purpose for which the test is to be performed. So if the purpose of the low-risk group is to determine who would benefit from antibiotic therapy, then the breakpoint should be set to optimize getting that information. But I don't think we're in a position to recommend what the numbers should be.
CHAIRMAN WILSON: Would you like to make a comment?
DR. JOLLY: Thank you, Mr. Chairman.
I would draw the panel's attention to the document which starts on page 192 of volume 2. In that particular analysis we do present all of the data for 15 percent cutoff and for 30 percent cutoff. I refer in particular to page 196, where there's a table which shows precisely the Venn diagrams of the FDA, not Venn diagrams but these tables, at the 15 percent cutoff and at the 30 percent cutoff.
I would like to draw the panel's attention to the fact that the specificity in each of those three risk groups changes to 98 percent, 98 percent, and 94 percent, respectively, when we move the cutoff from 15 to 30 percent. This is precisely the reason that we recommended the change to 30 percent, because we believe it matches these data precisely.
In terms of maximizing, why was that 30 percent chosen? I draw the committee's attention to the rest of that document which says that the 30 percent cutoff is appropriate for -- was chosen as being the appropriate cutoff point based on the CDC data.
So I would submit, Mr. Chairman, that those are right there and we would draw the panel's attention to those data.
CHAIRMAN WILSON: Thank you.
DR. CHARACHE: Just in reading that, the table is set up and I want to be sure that I'm reading it correctly. What this is saying is that, of those that are skin-test-positive -- there are four that were skin-test-positive -- there were 41 that were positive by the QuantiFERON?
DR. JOLLY: That's correct, but that's with a cutoff of 10 percent.
DR. CHARACHE: Then as we go across, the differences are a threefold change?
DR. ROTHEL: If you go --
DR. CHARACHE: Of discrepant results.
DR. JOLLY: -- to 30 percent, then it's six are in the discordant group --
DR. CHARACHE: Right.
DR. JOLLY: -- as opposed to two.
DR. CHARACHE: This is the Army recruits or the Navy recruits?
DR. JOLLY: This is correct.
DR. CHARACHE: So we would also like to see this in the other broader population, but this is exactly the kind of data that I'm sure the FDA and the sponsor will be looking at in terms of selecting the right cutoff.
This is just the low group, and then intermediate group we would be concerned about as well, where there's quite a few discrepants as well.
DR. JOLLY: Thank you.
MR. GUTMAN: Our statistician would like to make a comment.
CHAIRMAN WILSON: Okay.
MR. DAWSON: I have to take exception to the company's analysis arriving at the 30 percent cutoff for human response. It's based on ROC analysis, and, of course, that's a wonderful tool for deciding on a cutoff because you get to look at the whole spectrum of possible cutoffs and pick the one that gives you a desirable balance between sensitivity and specificity.
But the problem with what the company has done is to base their ROC entirely on a comparison with TST as a gold standard. All I can say is we can't interpret the result because we all in this room know or believe, or have certainly heard today, that TST is not a gold standard. So I basically would ask you to disregard anything related to those ROC figures in your panel pack.
As I mentioned this morning, we do have analytical means available to us for evaluating an after-the-fact change in the cutoff. It's a cross-validation method involving a technique known as the bootstrap. So if the company, for whatever reason, wants to change the cutoff, in this case from 15 to 30 percent, we can recommend an appropriate technique, but that technique would be what I would expect would have to be done for justification.
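As a rough illustration of the kind of resampling check described above, the following Python sketch selects a cutoff on each bootstrap resample and then scores that cutoff only on the subjects left out of the resample. This is a minimal sketch under stated assumptions, not the method FDA or the sponsor would necessarily use: the data are synthetic, the function names are hypothetical, the selection criterion (sensitivity plus specificity minus one) is an assumption, and the reference classification is a placeholder, since the discussion stresses that TST is not a gold standard.

```python
import random


def youden(cutoff, data):
    """Sensitivity + specificity - 1 for 'value >= cutoff' against a reference label.

    Returns None when either reference class is absent from the data.
    """
    tp = sum(1 for v, ref in data if ref and v >= cutoff)
    fn = sum(1 for v, ref in data if ref and v < cutoff)
    tn = sum(1 for v, ref in data if not ref and v < cutoff)
    fp = sum(1 for v, ref in data if not ref and v >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        return None
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cutoff(data, candidates):
    """Pick the candidate cutoff with the highest score on the given data."""
    scored = [(youden(c, data), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    return max(scored)[1] if scored else None


def out_of_bag_bootstrap(data, candidates, n_boot=200, seed=0):
    """Choose a cutoff on each resample; score it on the left-out subjects."""
    rng = random.Random(seed)
    n = len(data)
    chosen, scores = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        train = [data[i] for i in idx]
        held_out = [data[i] for i in range(n) if i not in in_bag]
        cutoff = best_cutoff(train, candidates)
        if cutoff is None:
            continue
        score = youden(cutoff, held_out)
        if score is None:
            continue
        chosen.append(cutoff)
        scores.append(score)
    return chosen, scores


# Synthetic (percent response, reference status) pairs; NOT study data.
rng = random.Random(1)
synthetic = [(rng.gauss(45, 15), True) for _ in range(40)]
synthetic += [(rng.gauss(10, 8), False) for _ in range(60)]
candidates = [10, 15, 20, 25, 30, 35]

chosen, scores = out_of_bag_bootstrap(synthetic, candidates)
print("mean held-out score:", sum(scores) / len(scores))
print("how often each cutoff was chosen:",
      {c: chosen.count(c) for c in candidates})
```

Because the held-out subjects play no part in choosing the cutoff, the averaged score is less flattering than one computed on the same data used to pick the cutoff, which is the concern with an after-the-fact change.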
CHAIRMAN WILSON: Thank you.
Any other questions or comments?
Question No. 5, please.
Question No. 5 states: "Could conjunctive or adjunctive use of QFT with TST testing provide additional benefit in any of the above risk groups?" I think we've discussed this to some extent, but are there any additional comments or questions?
DR. LEWINSOHN: I guess if we're still kind of talking about labeling, I mean it seems like having the data certainly is, from both of the American studies along with the Venn diagrams, would be very helpful for the clinician. It seems to me, though, that we don't ultimately really know many of the answers that we would like to know in terms of who's likely to go on to develop active disease after they have either one of these tests turn up positive. I suspect those answers will come out with more study and more clinical evaluation. So that it might be smart just to have data, but a paucity perhaps of specific clinical recommendations in the package insert.
CHAIRMAN WILSON: Dr. Carroll?
DR. CARROLL: Yes, I just wanted to say something similar. As a clinician, I do not think that the labeling should include a recommendation for TST testing in conjunction with this assay. I think that should be left up to the individual physician's decision and the risk stratification of the patient and other data that will be used to decide whether a patient has active disease or is at low risk for disease.
So I would disagree with actually including that in the labeling. I would say, though, that all information should be provided to the clinician or the labeling regarding discordance for each of the risk groups.
CHAIRMAN WILSON: Mr. Reynolds?
MR. REYNOLDS: I again have a question on the current labeling for the PPD. What does that say about testing in low-risk groups? Anyone have any idea?
CHAIRMAN WILSON: Does anyone from FDA want to comment on that?
DR. CATANZARO: I don't know the labeling, but I know CDC's recommendations quite well. They recommend specifically against that. CDC recommends targeted testing, as does the IOM, targeting based on epidemiologic factors. As someone pointed out, that doesn't forbid anybody from using them in a low risk, and that causes problems in interpretation that a clinician has to spend a lot of time on, but CDC recommends targeted testing.
MR. GUTMAN: I do have, compliments of a panel member, the package insert, and the CBER person will quality control me, but it looks relatively nondirective.
CHAIRMAN WILSON: Thank you.
DR. COCKERILL: It does recommend additional testing, culture, chest x-ray based on clinical findings.
CHAIRMAN WILSON: Dr. Charache?
DR. CHARACHE: I think it would be helpful to provide guidance which is accurate with any changes that are being made in breakpoints, because I don't know that the average physician would understand how to use the data. We're struggling with how to interpret it here, and when you emphasize the agreement on the positives and when you emphasize the agreement on the negatives. I think that that's a lot to ask of someone who's, whether he's doing the case study or taking care of a family member of someone who's had TB, or whatever it is. So I think some guidance would be helpful.
But I think, as Dr. Sanders pointed out, this should also be emphasized in terms of the overall responsibility of the physician in deciding what's best for that patient.
CHAIRMAN WILSON: Any other comments? Dr. Cockerill?
DR. COCKERILL: Yes, I would agree with that because, as a clinician as well, we do have guidelines for interpreting the tuberculin skin test which aren't part of the package insert. We don't have guidelines for interpreting this test outside of the package insert. So anything that we can provide, especially if we have two different cutoffs, that information has to be in there as far as, what is a low risk, moderate, high risk, for the clinician to make some sense out of it.
CHAIRMAN WILSON: Okay, again, in an effort to help everyone get to the airport on time today, I'm going to rearrange the agenda somewhat. I would like at this point to go to the open public hearing. If any members of the audience would like to make a comment, please come forward at this time.
There being none, the open public hearing is closed.
I spoke briefly with industry over the lunch hour. They were hoping to have a little bit of time to prepare the industry response. So what I would like to do now is take a break from now until 2:30 to allow them at least 15 minutes to work on that, if you would like to take that time.
DR. ROTHEL: I think we would just like to thank the panel for their considerations today, and we're quite happy. Thank you.
CHAIRMAN WILSON: Okay. Does FDA need time to do anything to prepare their response?
MR. GUTMAN: No, we have no response.
CHAIRMAN WILSON: You have no response? Okay.
At this time let's move forward, then, with the final recommendations and vote. At this time it's the responsibility of the panel to provide final recommendations to the FDA and to vote on the product that is before us today. I would like to remind everyone that only voting and temporary voting members can vote.
Before we get there, I just want to make sure that if there are any last issues that the panel members have that they would like to clarify prior to the final recommendations and vote, we could do that now.
DR. NOLTE: Yes, we were talking about having guidelines or recommendations for how to interpret tests that were outside of the package insert. Clearly, there are guidelines for interpreting tuberculin skin testing that have come from the CDC and other places.
I wonder, since the CDC was so intimately involved with the clinical trial of this particular test, whether there are going to be guidelines forthcoming soon from them in terms of how to interpret such a test, should it be approved.
DR. MAZUREK: Jerry Mazurek, CDC.
Yes, we're working on it.
DR. NOLTE: Okay.
CHAIRMAN WILSON: Does anyone on the panel feel like they need any time to look at any more of the data, particularly the article that was passed out today?
DR. NG: Dr. Mazurek, I would be very interested in seeing the interlaboratory reproducibility before the CDC comes out with its guidelines. In other words, I want to know how reproducible a 15 or a 30 percent cutoff is from lab to lab.
DR. MAZUREK: For additional studies and studies that are coming up for the QuantiFERON, we will try to take that into account and include reproducibility and interlaboratory variations in assessing the test.
CHAIRMAN WILSON: Okay. Ms. Poole?
MS. POOLE: Good afternoon. I'll now read the panel recommendations, all voted options.
"The medical devices amendments to the Federal Food, Drug and Cosmetic Act (the Act) as amended by the Safe Medical Devices Act of 1990 allows the Food and Drug Administration to obtain a recommendation from an expert advisory panel on designated medical devices pre-market approval applications that are filed with the agency.
"The PMA must stand on its own merits, and your recommendations must be supported by safety and effectiveness data in the application or by applicable publicly-available information. Safety is defined in the Act as a reasonable assurance, based on valid scientific evidence, that the probable benefits to health under conditions of intended use outweigh any probable risk. Effectiveness is defined as a reasonable assurance that in a significant portion of the population the use of the device for its intended uses and conditions of use, when labeled, will provide clinically-significant results.
"Your recommendation options for the vote are as follows: approval, if there are no attached conditions; approvable with conditions. The panel may recommend that the PMA be found approvable subject to specified conditions such as physician or patient education, labeling changes, or a further analysis of existing data. Prior to voting, all of these conditions should be discussed by the panel.
"A vote of not approvable, the panel may recommend that the PMA is not approvable if the data do not provide a reasonable assurance that the device is safe or if a reasonable assurance has not been given that the device is effective under the conditions of use prescribed, recommended, or suggested in the proposed labeling.
"Following the vote, the Chair will ask each panel member to present a brief statement outlining the reasons for their vote."
Our voting members are Kathleen Beavis, Valerie Ng, Natalie Sanders, and appointed as temporary voting members -- and we have another citation to read:
"Pursuant to the authority granted under the Medical Devices Advisory Committee charter dated October 27th, 1990, and as amended August 18th, 1999, I appoint the following persons as voting members of the Subcommittee of the Microbiology Advisors Panel for the duration of this panel meeting on October 12th, 2001: Ellen J. Baron, Frederick Nolte, and Barth Reller.
"For the record, these people are special government employees and are either a consultant to this panel or a voting member of another panel under the Medical Devices Advisory Committee. They have undergone the customary conflict-of-interest review. They have reviewed the material to be considered at this meeting."
And it is signed "David W. Feigal, M.D., MPH, Director for the Center for Devices and Radiological Health," on October 10th of this year.
CHAIRMAN WILSON: Thank you.
Are there any questions from members of the panel?
All right, then at this point I will entertain motions regarding this PMA submission. Dr. Baron?
DR. BARON: I move that we vote for approvable with conditions, and I hope the panel will help me with the conditions here.
DR. SANDERS: I'll second that.
CHAIRMAN WILSON: Okay, we need to specify the conditions then.
DR. BARON: Karen has handed me a few.
Attached conditions should be statistical analysis, as suggested by Dr. Dawson and originally by Dr. Charache, about stratification of risk groups and appropriate cutoffs; interlaboratory reproducibility studies previewed and then followed by CDC guidelines for use external to the package insert, independent of the package insert.
CHAIRMAN WILSON: Dr. Gutman, I don't believe we can specify --
MR. GUTMAN: You can recommend that, but don't make that a condition of approval.
DR. BARON: Okay, and one more before I stop: physician recommendations for utilization of the results.
DR. SANDERS: Actually, I would like to modify that last one and ask for a physician education program to educate physicians, treating physicians, about the test. I know that there's probably a program in place for the laboratory physicians in order to be able to ultimately report and interpret the results, but an additional physician or practicing physician education program.
DR. NOLTE: In addition to any CDC recommendations that might be forthcoming?
DR. SANDERS: Well, we can't mandate that part, but we can ask the company to provide physician education.
DR. NOLTE: No, I'm asking you in terms, if there were CDC guidelines forthcoming, would you have the same recommendation?
DR. SANDERS: If there were CDC guidelines forthcoming, I would accept those.
CHAIRMAN WILSON: Okay, we have a motion of approvable with conditions, those conditions being that there be further statistical analysis with stratification of the risk groups by the varying cutoffs; that there be further data provided on the reproducibility, particularly regarding interlaboratory variability in test results, and the third one being recommendations for physician interpretation and education regarding the use of the product.
DR. NG: I would ask that there be expansion in your package insert for people like me, so when I use it, I have the different risk groups and the concordance and non-concordance of the two tests, so I can explain to my users.
CHAIRMAN WILSON: Dr. Baron?
DR. BARON: Dr. Charache is suggesting that we also add that data be presented in the package insert on the agreement of positives.
DR. CHARACHE: I shouldn't be speaking. May I speak? No, I shouldn't speak?
CHAIRMAN WILSON: No, you can't speak.
DR. BARON: Okay, she's suggesting that we add agreement not just on the positives and negatives, but data presented separately.
DR. SANDERS: Mr. Chairman, is it not our usual practice, after we have made our final recommendation and vote, that we then go through the package insert in greater detail? Is that our usual practice or do we do it now?
CHAIRMAN WILSON: We do it now.
DR. SANDERS: Well, if we do it now, I think also we had discussed earlier that we would be careful about the timing, if skin testing had been performed, that there should be perhaps a warning or a limitation indicated in the package insert of a timeframe within which not to perform the QFT. So that should also be added in the package insert.
CHAIRMAN WILSON: Dr. Baron, could you further clarify what additional data that you were suggesting be included?
DR. BARON: Well, it's not my suggestion.
CHAIRMAN WILSON: Yes, but you made the motion.
DR. BARON: I made the motion, but I don't quite understand it.
CHAIRMAN WILSON: We need to know before we can make a recommendation to the manufacturer --
DR. BARON: Can some other committee member agree with it or not, and then --
CHAIRMAN WILSON: Dr. Ng?
DR. NG: If I can interpret what I think Dr. Charache was asking, it's the two-by-two tables, because the agreement is looking at that diagonal axis of what is in boxes A and D in the two-by-two table. What I was asking for was slightly different, which was the overlap and the missed populations between the two tests. But if we include all that information, it would really help with the interpretation of the test result.
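The two summaries distinguished here, agreement along the diagonal of the two-by-two table and the overlap and missed populations of a Venn diagram, can be sketched in a few lines of Python. This is only an illustration: the counts are invented and are not from either clinical study, the class and method names are hypothetical, and the "positive overlap" definition used (both positive divided by positive on either test) is one of several possible definitions, assumed here for the example.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # box A: QFT positive, TST positive
    b: int  # box B: QFT positive, TST negative
    c: int  # box C: QFT negative, TST positive
    d: int  # box D: QFT negative, TST negative

    @property
    def total(self) -> int:
        return self.a + self.b + self.c + self.d

    def overall_agreement(self) -> float:
        """Share of subjects on the diagonal (boxes A and D)."""
        return (self.a + self.d) / self.total

    def positive_overlap(self) -> float:
        """Of subjects positive on either test, the share positive on both."""
        either = self.a + self.b + self.c
        return self.a / either if either else float("nan")

    def negative_overlap(self) -> float:
        """Of subjects negative on either test, the share negative on both."""
        either = self.b + self.c + self.d
        return self.d / either if either else float("nan")

    def discordant(self) -> tuple:
        """(QFT-only positives, TST-only positives): the non-overlapping Venn regions."""
        return (self.b, self.c)


# Invented counts for illustration only.
example = TwoByTwo(a=10, b=5, c=5, d=80)
print("overall agreement:", example.overall_agreement())
print("positive overlap:", example.positive_overlap())
print("negative overlap:", example.negative_overlap())
print("discordant (QFT only, TST only):", example.discordant())
```

With many double-negative subjects, overall agreement can look high even when the two tests rarely agree on who is positive, which is why reporting agreement on positives and on negatives separately, by risk group, is more informative than a single figure.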
DR. RELLER: So what Dr. Ng is talking about is basically the two-by-two tables plus the Venn diagrams?
CHAIRMAN WILSON: Correct. Okay, so we have a motion, then, for approval with conditions, and so far there are, depending on how you slice it, five or six conditions.
DR. RELLER: I'm assuming that in those conditions are the explicit description of the populations for which data are not yet available: transplant, et cetera. I think this is very important because with a new test that is more -- the scientific basis of it is more delineated. You're recalling memory from lymphocytes with a purified protein derivative of what you are seeking to elicit the memory of, that there is maybe an assumption that it's a better test.
With the CDC guidelines and more experience, it may turn out to be that way, but I can envision a situation where the very patients for whom there are no current data would be the very patients for whom Dr. Ng and others, including ourselves, would be pounded upon to do the test. I think that it should be very explicit, and then to come in subsequently, as the data unfold and the guidelines are clarified, but to have that unequivocally spelled out in the package insert, so that there would be a sequenced introduction that was consonant with the database available.
CHAIRMAN WILSON: Thank you. Is there any further discussion of the conditions? Dr. Sanders?
DR. SANDERS: Well, I just want to make a comment that those limitations are actually spelled out, as the company has given it to us, and I would be very surprised if they were not already planning to look at this in those populations.
CHAIRMAN WILSON: We have a motion and we had a second on the original motion. At this point I would need a motion on the amended conditions. Does everyone have firmly set what all the conditions are or would you like me to go over those again?
First is further statistical analysis, particularly regarding stratification of the data by the different risk groups and the varying cutoff points.
Second is the issue of reproducibility, particularly regarding interlaboratory variability.
The third is information regarding interpretation of the tests, by both laboratory physicians or scientists and the practicing clinician.
The next is inclusion of further data, both the Venn diagrams as well as the two-by-two tables.
And the final one is that there be a comment regarding the possible effect of tuberculin skin testing on the QFT test and the need for possibly separating those two.
DR. BARON: Can I clarify that the interlaboratory reproducibility studies should include a lot of negatives? It's the false positives we're concerned about here.
DR. NOLTE: I think it needs to include a whole range, the range of expected values and sort of representative of what you might see in a population that you were screening. I know this is different; we're talking about different populations here, but something more representative of what you might actually wind up testing.
CHAIRMAN WILSON: Okay, thank you.
DR. NOLTE: I need a clarification on this physician education aspect of this and how this becomes a condition to approval. I mean, what are we suggesting when we say this, that the manufacturer contact each and every practicing physician and tell them how to interpret this or what? I mean, to do education programs? What are we buying into here by physician education?
MR. REYNOLDS: I was thinking something more along the line of a little booklet or leaflet or something that could be given out to physicians, explaining in more detail how this test works and how it should be interpreted. I don't know what the other folks on the committee were thinking of.
DR. SANDERS: Since I made that suggestion, actually, that's what I envision. But I envision it in two ways: one, that as this test becomes purchased by entities, there would be an education process for the laboratory and the supervising physician or lab director at that institution.
I would also envision, subsequently, some type of program for instructing the using clinicians, with materials provided by Cellestis. Now I'm not saying that Cellestis has to actually come out and do that education program, but with materials provided by Cellestis. That could actually be done by the lab director or the lab director's staff, because once that test has been purchased by the entity, they're going to want people to use it.
So that is how I had envisioned it. Does that help you, Dr. Nolte?
DR. NOLTE: Yes, I guess it does, but I'm just trying to think: if there really are going to be guidelines coming from CDC, it's hard to see how the information from the sponsor is going to have --
DR. SANDERS: I made that recommendation because I do feel that clinicians will need to be educated on how to use this test.
DR. NOLTE: Yes.
DR. SANDERS: And we could not, for the record, state that we would encourage CDC, another government agency, to do this. So we would have to then make it a recommendation for the sponsor.
DR. NOLTE: We've also asked them to include a lot of that type of information in the package insert. So I'm trying to figure out what this pamphlet from the sponsor is going to say that's not in the package insert.
DR. SANDERS: Well, as a treating physician, I actually never see the package insert for a lab test that I order.
DR. NOLTE: No, I understand that. I understand that, but whose responsibility is it to educate, the sponsor or the offering laboratory?
CHAIRMAN WILSON: Dr. Gutman?
DR. GUTMAN: Yes, we're prepared to work with the company and also to consult with CDC and try and create some path for it. I think you are trying to micromanage. You've made a recommendation. We'll try and take it to heart.
DR. NOLTE: Okay.
CHAIRMAN WILSON: We have a motion for approvable with conditions. I need a second on the conditions as clarified.
DR. NG: Second.
CHAIRMAN WILSON: We have a motion and a second. Is there any further discussion at this time regarding either the main motion or the conditions?
Okay, there being none, then I would like to take the vote. All the voting panel members who are in favor raise their hand.
(Show of hands.)
Do it by voice as well? Shall we do it again?
DR. RELLER: Reller, yes.
CHAIRMAN WILSON: Dr. Nolte?
DR. NOLTE: Nolte, yes.
CHAIRMAN WILSON: Dr. Beavis?
DR. BEAVIS: Beavis, yes.
DR. NG: Ng, yes.
DR. SANDERS: Sanders, yes.
DR. BARON: Oh, Baron, yes.
CHAIRMAN WILSON: The vote is unanimous. Thank you.
Okay, at this point then we would like to move to have each of the voting members state the reason for their vote, beginning with Dr. Reller.
DR. RELLER: I believe the data presented justified the recommendation and the vote that we have just taken.
CHAIRMAN WILSON: Dr. Nolte?
DR. NOLTE: Yes, obviously, I think this test represents an advance in terms of its intended use, and the issues that I have in terms of the data were essentially around the statistics to validate the 30 percent cutoff and the interlaboratory reproducibility, and both of those have been addressed in the conditions we attached.
CHAIRMAN WILSON: Dr. Beavis?
DR. BEAVIS: I want to thank and commend the sponsors for tackling, I think, a very difficult area and a severe public health issue in this country, especially being from Cook County.
Again, I think the data are very strong and that the additional data will only further support the use of this test.
CHAIRMAN WILSON: Dr. Ng?
DR. NG: I voted yes because anything has to be better than the skin test.
CHAIRMAN WILSON: Dr. Sanders?
DR. SANDERS: I would agree with the opinions that have already been expressed by my colleagues. Thank you.
CHAIRMAN WILSON: And Dr. Baron?
DR. BARON: I want this test. Also, I like the idea of having it be a laboratory test that I can charge somebody for.
CHAIRMAN WILSON: All right, thank you.
That concludes the business for today. I would, in particular, like to thank the sponsor. I think this was a very well-done submission, both in terms of the written material as well as their presentations today. I would really like to applaud the efforts that they have made.
I would like to thank all the panel members, particularly our guest, Dr. Lewinsohn, who had to leave a few minutes ago, could not stay, had to make a flight; all the members of the FDA for all the work they've done on this. This has been a very good meeting.
I would like to particularly thank everyone who made the efforts to get here in these trying times. Travel is not easy right now. I know what it's like, and we do appreciate everybody who's willing to fly at a time like this.
Thank you, and the meeting is adjourned.
(Whereupon, at 2:36 p.m., the meeting was adjourned.)