FDA Workshop

Anthrax Vaccines: Bridging Correlates Of Protection In Animals To Immunogenicity In Humans

DEPARTMENT OF HEALTH AND HUMAN SERVICES
FOOD AND DRUG ADMINISTRATION
CENTER FOR BIOLOGICS EVALUATION AND RESEARCH

Grand Ballroom
Hilton Washington North
620 Perry Parkway
Gaithersburg, Maryland



Thursday, November 8, 2007

Introduction and Welcome

Session 1:
Background Information

Session 2:
Animal Models for General Use Prophylaxis

Session 3:
Human Immunogenicity Data

Session 4:
Panel Discussion on General Use Prophylaxis

Adjourn


P R O C E E D I N G S

(8:33 a.m.)

DR. LYNN: For those of you who haven't met me, my name is Freyja Lynn, and I'm with the Office of Biodefense Research Affairs at DMID at NIAID.

And before we start, I'd like to give you a few housekeeping matters. The first is one of the most important -- that's lunch. We are not providing lunch. However, the hotel and Starbucks have both been notified that they're going to have a lot of people to feed, and so we'll have lunch available in the hotel.

There are also some local restaurants fairly close by. I know we have only an hour for lunch, because we have a very tight schedule today. And there are maps and some of that information out front, if you haven't gotten it with your meeting packets.

The meeting is being recorded and will be transcribed, so when you speak please state your name first and speak into a microphone. Transcripts will be available after the meeting.

I think there was one other thing, which I don't remember, Drusilla.

Okay. I think that's it for right now. Again -- oh, I remember what it was. We would appreciate it if you would allow the speakers to complete their talks. We have allowed time at the end of each talk for questions, so we'd appreciate you holding your questions until the end of each talk.

So let's go ahead and start. I'd like to introduce Dr. Karen Midthun, who is the Deputy Director at the Center for Biologics Evaluation and Research. Thank you very much.

DR. MIDTHUN: Thank you, Freyja. I'd like to welcome all of you to this workshop today that really has been a culmination of a lot of work by a lot of people, and I'd really like to acknowledge and thank the co-sponsors of this workshop which, in addition to the Center for Biologics Evaluation and Research, also include the National Institute of Allergy and Infectious Diseases, and also the Department of Health and Human Services Office of Biomedical Advanced Research and Development Authority.

The subject of today's workshop is how to bridge animal efficacy data to humans in support of developing new anthrax vaccines, and, of course, I think we all recognize that this is an important goal for public health preparedness.

In 2002, we held a workshop to discuss efficacy testing of new anthrax vaccines, and that workshop provided a lot of excellent direction for non-clinical and clinical studies that could potentially provide data to support efficacy of new anthrax vaccines.

And since that time, several studies have been conducted, and we now have a much better understanding --

PARTICIPANT: The sound is very bad. We cannot hear you back here.

DR. MIDTHUN: Is this not working? I'm so sorry. Let me speak directly into the microphone. My apologies.

I was just thanking those who have come today and welcoming them, and also thanking those who are sponsoring this workshop together with the Center for Biologics. That is the National Institute of Allergy and Infectious Diseases, and also the Health and Human Services Office of Biomedical Advanced Research and Development Authority.

And the subject of today's workshop is on how to bridge animal efficacy data from animals to humans, and this of course is very important in support of developing new anthrax vaccines, and I think we all recognize this development of new vaccines is an important goal for public health preparedness.

In 2002, we held a workshop, and at that time got excellent directions on the kinds of non-clinical and clinical studies that could be conducted that would help develop efficacy data in animals that could then potentially be bridged to humans. And I think since that time a lot of studies have been conducted, and now we have a much better understanding of the immune response that animals and humans have to anthrax vaccines, and also additional data in animals on efficacy.

And so I think today what we have the opportunity to do is to hear about those data and to further evaluate and get input on the approaches that have been taken on how to bridge data from animal studies to humans, and also figure out what data gaps there might be that would help to further assess this development of new vaccines.

And I guess I'd really like to take this opportunity to say that we really look forward to the scientific input from those who have come to this workshop today. We really appreciate the input and also that people are so willing to share their data, because this is so important to really furthering the discussion and developing good approaches to this very important area.

And with that, I'd just like to say thank you. I'm really looking forward to hearing all of the discussions today. And with no further ado, I'll hand it back to Freyja Lynn.

DR. LYNN: Thank you, Karen.

Unfortunately, our first moderator, Julianne Clifford, we think is stuck in traffic, because we had -- there was an accident on the Beltway. So I'm going to moderate the first session, and our first speaker will be Dr. Drusilla Burns. Sorry, I can't see anything without my glasses.

So, Drusilla?

DR. BURNS: Thanks, Freyja.

What I want to do today is just set some background, so that everybody starts from the same place. Now, I know a lot of you are very familiar with the Animal Rule, but what I wanted to do today is very quickly go over it, for those of you who may not be as familiar with it.

Then, I'd like to just summarize some of the very important points that came out of the 2002 Anthrax Vaccine Workshop that Dr. Midthun just told you about. And so let me start by describing the Animal Rule.

This regulation was first published in the Federal Register in 2002, and it's not called the Animal Rule. It has a much longer name. It's New Drug and Biological Drug Products; Evidence Needed to Demonstrate Effectiveness of New Drugs When Human Efficacy Studies Are Not Ethical or Feasible.

And there's four main criteria that must be fulfilled in order to use the Animal Rule. The first is that there is a reasonably well understood pathophysiological mechanism of the toxicity of the substance and its prevention or substantial reduction by the product.

The second is the effect is demonstrated in more than one animal species expected to react with a response predictive for humans, unless the effect is demonstrated in a single animal species that represents a sufficiently well-characterized animal model for predicting the response in humans.

The third is the animal study endpoint is clearly related to the desired benefit in humans -- generally, the enhancement of survival or prevention of major morbidity.

And finally, the fourth criterion, which actually turns out to be the most difficult to fulfill, is that the data or information on the kinetics and pharmacodynamics of the product or other relevant data or information in animals or humans allows selection of an effective dose in humans.

So what does this mean for anthrax vaccines? It means that the vaccine dose must elicit an immune response in humans that is comparable to the immune response of animals protected by the vaccine. And it's really this fourth criterion that we're going to be spending the next day and a half discussing how to fulfill it.

Now, there's a number of potential misunderstandings about the Animal Rule. The rule does not apply if the product -- if product approval can be based on standards described elsewhere in FDA's regulations. The rule is not an accelerated or fast track approval.

And I think that it's important to know the rule is not a shortcut to approval, as I think many of the people in this audience are now -- now know. In fact, it may take longer. And human studies are still required under the Animal Rule. You need to have safety studies and immunogenicity studies for anthrax vaccines.

The important thing to remember when -- as far as the Animal Rule is concerned is that the product is being developed for use in humans, not in animals. So the animal studies must be designed such that the data generated are relevant to humans.

This really means that the animal studies and the clinical studies need to be developed along a parallel track. That is, you have to have some human clinical data to know what the response in humans is likely to be, so that when you're developing your animal model you can keep that in mind and try and mimic the human response in the animals. Then, you can go back and do the larger clinical trials for the pivotal studies.

When designing the animal studies, the label indication is important -- that is, pre-exposure prophylaxis or post-exposure prophylaxis. And for pre-exposure prophylaxis during this meeting we're going to refer to it as general use prophylaxis or GUP, so you'll be hearing that an awful lot.

When developing your animal model, you should consider route of exposure, appropriate challenge dose, need to have appropriate statistics, and the assays need to be measuring the appropriate parameters, and they should be validated for the pivotal studies.

Now, as you heard, in 2002 there was a workshop that was held, and at that time we were really just starting to develop the strategy for how to implement the Animal Rule in regards to anthrax vaccines.

And this workshop was very, very valuable at getting a lot of good scientific minds together to evaluate the data that were available at that time, and try to come to a consensus on some very important starting points that could be used to move forward, and I just want to summarize those today.

So the workshop had four sessions. First session was review of pathogenic mechanisms, second was the review of animal models, third was possible strategies for the development of correlates or surrogates, and then we had a panel discussion, as we'll have two panel discussions in this workshop, and it's during those panel discussions where a lot of the ideas get kicked around and we can hear from not only our panelists but also people in the audience who might have some good thoughts and good ideas.

So in regards to the 2002 workshop, what were some of the consensus points that were reached? As far as the first criterion for the Animal Rule, that you have to understand, really, the pathogenesis of the organism and the host response and how the host is being protected.

The pathogenic mechanisms of B. anthracis were reviewed and were thought to be reasonably well understood. And that is that the spores are inhaled, they then are taken up by cells such as macrophages, the spores then germinate, and the vegetative cells escape from the macrophages and get into the bloodstream, and they secrete anthrax toxin. And it is believed that the -- it's the result of the toxin that you get the manifestations of the disease.

And the toxin is a tripartite toxin composed of protective antigen, and either -- and a lethal factor and edema factor. Protective antigen binds to eukaryotic cells, oligomerizes, the LF or EF then binds to the PA, it's internalized, the PA when it hits the acidic environment of the endosome forms a pore, allowing entry of either LF or EF. And, again, it's believed that the disease symptoms are caused by the action of this toxin.

So the new generation anthrax vaccines are in general PA-based, with the idea that if you elicit toxin-neutralizing antibodies then that would abolish the effect of the toxin and prevent disease. So at the 2002 workshop it was really felt that there was a sound scientific basis for this.

One of the other things that came out of the 2002 workshop which was very, very important was the choice of the animal species to use in order to mimic the human response. And the animal data from a number of animal species was reviewed, and there was consensus that there were two animal species that would best mimic the human. And the gold standard was thought to be the non-human primate, and sort of the working model where you could get large numbers would be the rabbits.

There was also a discussion about the challenge and what should the challenge dose be, and the consensus was the appropriate challenge dose should be one that might be reasonably expected in an anthrax attack.

And then, finally, we come to the fourth criterion. And at the last workshop people just laid out possible strategies for how to -- what types of studies might help in producing data that would be useful in fulfilling this criterion, and the consensus was that, really, probably both active and passive immunization studies in animals would provide valuable information that would help fulfill this criterion.

So what are we going to do in today's workshop? What we're going to do is review the overall strategy that has evolved since the 2002 workshop, review the data that have been generated since 2002, and then we'd like to obtain input from the panel members and you, the workshop participants, on how best to move forward.

Okay. Thanks so much, and I'll take any questions.

(Applause.)

DR. NASS: My name is Dr. Meryl Nass. I didn't attend the April 2002 meeting, but I have read the transcript, and I -- there must be something wrong with me, but I certainly don't recall that there was consensus regarding acceptance of these two animal models as ideal for anthrax in humans, and I sent comments to FDA several years ago during a comment period when this current anthrax vaccine was relicensed, pointing out why these two animal models were not good.

So I just want to point out for the record that I don't believe there is consensus.

DR. LYNN: Okay. Thank you.

Anybody else?

(No response.)

Thank you, Drusilla.

Our next speaker will be Dr. Bob Kohberger.

DR. KOHBERGER: Okay. Thank you, Freyja.

I'm going to talk about some of the statistical considerations in correlates of protection, sort of to set the stage for the next day and a half, from at least a statistical point of view. And the outline of my talk is, first of all, what is a correlate and what is a surrogate? I think there's some confusion on these terms as to what they mean, and I'd like to get some definitions down.

Second point is: how do we obtain correlates? And then, how do we obtain surrogates? And where do we stand today?

Well, what's a correlate and what's a surrogate? Now, this slide comes from Tom -- it's based on Tom Fleming's publication in Health Affairs. A Level 1 is true clinical efficacy, where we have a clinical endpoint -- survival, whatever your endpoint is.

The second level in the Fleming definition is called a validated surrogate. Vaccines will often refer to these as a surrogate. And this means that the variable, the immune response, explains all of the clinical benefit.

The third level is, in Tom's terms, the non-validated surrogate. It's reasonably likely to protect clinical -- predict clinical efficacy, and in vaccines we can call those predictive correlates. And the key point here is there is no statistical validation of this, but people -- scientists, experts in the field -- feel that it's reasonably likely to predict benefit.

The fourth and lowest level is just a correlate, and here the immune response is related to the clinical endpoint. We'll call that a correlate. Now, since I made this slide, one of our panelists, Dr. Self, kindly published a paper last week that goes into more detail on this.

So the next slide is not in your package, but it takes what Steve and his colleagues have done and basically -- and, of course, Steve will have a chance to rebut this -- it seems to me that what they're doing is a Fleming Level 2, which is a validated surrogate, they break it down into three more refined levels, because after all when we say it explains all of the clinical benefit, what does that really mean, and how do you do it?

Well, in this framework -- and there's the reference from JID, and it was just seven days ago -- starting with a Level 1 surrogate of protection -- and this is statistical -- that means that your immune response is predictive of vaccine efficacy within a defined setting, a defined population, a defined use. Usually it comes from a single large trial, and the typical analyses are the Prentice criteria for surrogates, and we're going to talk about those.

A Level 1 surrogate of protection principal -- same definition, it's within a defined setting, it's usually a single large trial. However, the analysis for causality -- and we're going to speak about this a little bit -- are using principal surrogates, which if you're familiar with this it's also known as the Rubin causal model, and Dr. Rubin is also on our panel. So questions about these kinds of causal models, we have some good people to help you with.

The Level 2 surrogate of protection says that it's predictive of vaccine efficacy in different settings, different populations, different uses, and it comes mainly from multiple trials which means you need a meta-analysis where we would test this in different age groups, we would test it in immune-deficient subjects, and we'd find exactly the same relationship of an immune response to our clinical endpoint. That would be a Level 2.

So from what Dr. Self has done, I think what it really does is take this validated surrogate and help us define what "all of the clinical benefit" really means. And we're going to talk about that a little bit. So we have surrogates, and we have correlates.

Why do we care? First of all, there's a scientific understanding of the process. When you're developing a vaccine, you can't do a Phase 2 trial for efficacy. You're doing immunogenicity. So it helps in our vaccine development if we know how immunogenicity is related to the clinical endpoint.

If we're wrong on this, it's really the risk for the vaccine developer, because when it goes into the Phase 3 and clinical efficacy is tested the vaccine fails. An example of that is the recent Merck HIV vaccine. We use it to predict vaccine efficacy without an efficacy trial.

Very often efficacy trials are not feasible. In vaccines where you have combination vaccines we can't do efficacy trials. We can't have placebo groups. We need to use a surrogate or a correlate to predict efficacy and get products licensed. These are used for formulation changes, different products as I mentioned, combination vaccines.

If we're wrong, where is the risk? Well, the risk now is with public health, because products are getting licensed and used. But that's why we care about it.

How do you get a correlate? First of all, we're going to relate an immune response. And it may not be one response, it may be multiple responses. It could be the same response over time, and I think you're going to hear some of that a little bit later. It could be different responses, such as an acellular pertussis where we have four immune responses that we're measuring. But we want to relate that to some outcome of interest. Generally, it requires paired observations. You need the subject's immune response and the subject's clinical outcome. Some of the examples -- pneumococcal conjugate vaccines, and so on.

I emphasize paired responses in general because for pneumococcal conjugates, for invasive disease, when that product got licensed I believe there were only 20 cases. Just about all of them were in the placebo group, and none in the vaccine, so the vaccine works.

The problem was we didn't know what the immune responses were for those 20 breakthrough cases. We didn't have paired responses. We did, however, in -- I say "we," because I used to work for Wyeth and was involved with that, so that's why I say "we." There were paired responses for otitis media and paired responses for colonization. So we are able to do correlates in those settings.

How do you choose the immune response? Well, IgG ELISA is often used. The reason is it's easy to use, you can do a lot of observations very quickly, rather inexpensively. So the developers would like to see IgG ELISA used.

There are also functional assays, and we're going to hear about that next. Typically, I think most people would prefer a functional assay because of its name. It measures function of the antibody as opposed to the ELISA.

Second point -- when should you measure this immune response? You can measure it right after vaccination, or you can measure it prior to challenge. Now, this gets into duration of protection. If you measure right after vaccination -- and this example is from varicella, Merck's varicella vaccine, which is what they did -- and then follow up for two years, you can measure vaccine efficacy as would typically be done in a vaccine efficacy trial.

Sometimes you can't do that, and you measure prior to challenge whether it's an experimental challenge or whether it's like a household contact, and we'll talk a little bit.

How do you choose the event? Well, the type of the event is the clinical endpoint that you're interested in, and that you really want to use to predict for vaccine efficacy. Typically, it's infection, it's a clinical disease state, it's death. And when it occurs, as I mentioned before, is it over time? Is it a longitudinal follow up? Is it over a two-year period, or is it right after challenge? "Right after" may just -- may be days. So you have to choose your event carefully.

And the consequences are, as I said, when you measure the duration, number one is a typical situation in clinical efficacy. For varicella, you measured it over two years, and that's what vaccine efficacy was. Same thing for pneumo conjugate. This couldn't be done for pertussis, because basically right after vaccination acellular pertussis vaccines had very high immune responses. Efficacy was in the 70 or 80 percent range.

Well, why is that, when the immune response is so high? Well, antibodies decay over time. So in order to get a correlate in pertussis what they needed to do was look at a challenge type experiment, which was a household contact study where the subject -- the index case basically brought the organism into the household and subjects were there challenged. And luckily in Sweden they had multiple immune measurements on subjects and could come up with a pretty good estimate of what the immune response was just prior to exposure in the household.

So for pertussis you could do it, but it's important to remember what our inference is now for pertussis -- that it measures just prior to exposure. So the consequences of when your immune response is and when your event is are important, and you have to keep that in mind.

Well, how do you choose this relationship? There has been several different approaches. One is the step function, and this in a sense is very similar to protective levels, which we know for tetanus, diphtheria, and a whole host of others. And basically step function says that below some level the risk for the clinical event is quite low and constant, above some level it's quite high and constant, and that's a step function.

It's a weak model, in that most of us would think that the probability, the risk of an event, is continuous as a function of an immune response. And the step function says -- steps up. It just changes very quickly. So to look at this continuously, logistic regression has been used, and the formula for the logistic model is there.

Since the probability event -- of an event is equal to that formula, it has been used with a single response. It has been used with multiple responses, as in acellular pertussis, as in responses measured over time. That X there can be more than one variable. It can be a whole host of variables.
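
The formula on the slide is not reproduced in this transcript. As a minimal sketch, assuming the standard logistic form, P(event) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk))), the step function and the logistic model described here can be compared as follows. The function names, the cut-off, and the coefficients are hypothetical values chosen for illustration; they are not estimates from any vaccine study.

```python
import math


def step_probability(response, cutoff, p_below, p_above):
    """Step-function model: the event probability is constant on each side of the cut-off."""
    return p_above if response >= cutoff else p_below


def logistic_probability(responses, intercept, slopes):
    """Logistic model: the event probability changes continuously with the response(s).

    `responses` and `slopes` are equal-length sequences, so one call handles a
    single immune response or several (multiple antigens, or multiple time points).
    """
    linear = intercept + sum(b * x for b, x in zip(slopes, responses))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical numbers for illustration only.
for titer in (1.0, 2.0, 3.0, 4.0):
    step = step_probability(titer, cutoff=2.5, p_below=0.2, p_above=0.9)
    smooth = logistic_probability([titer], intercept=-5.0, slopes=[2.0])
    print(f"response={titer:.1f}  step={step:.2f}  logistic={smooth:.3f}")
```

The step model jumps at the cut-off, while the logistic curve passes smoothly through 0.5 at the response where the linear term is zero (here 2.5). Passing more than one entry in `responses` and `slopes` corresponds to the "X can be more than one variable" case.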

In addition to logistic regression, time to event models have been used. Cox proportional hazard models were used with varicella to look at the hazard ratios of the event occurring as a function of the varicella response.

Case control studies have also been used, and in one particular case -- Group B strep -- case control study where they were estimating protective levels. So there is a -- there's quite a few different ways of relating this immune response to the clinical endpoint. The two most popular or the two most that you see most of the time are this step function, which is just a protective level, and logistic regression.

So the results -- what do you get out of these models? Well, as I said, with the step function you get protective levels; they're cut-offs. And if you're going to take this approach, you need to look at the sensitivity and specificity at the particular level that you choose. And I always get this a little bit backwards, so I have to refer to my notes on sensitivity and specificity, if you'll excuse me for a second.

Here we go. Sensitivity is the probability of being greater than the level given that you have the event. Specificity is the probability of being below the level, given that you don't have the event. In diagnostic testing these two are used most often to determine what the level -- what the cut point should be.

There is also something called a positive predictive value, negative predictive value. Positive predictive value is the probability of the event given that you're above the level. It's the reverse of sensitivity. And negative predictive value is the probability that you don't have the event given that you're below the level. It's a little confusing. You'll see some examples later.

For the most part, positive predictive values and negatives are not used very often, because in epidemiology the positive predictive value depends on the incidence rate of the event in the population, not just on the diagnostic test, whereas sensitivity and specificity, because it's conditional on getting the event, it's not dependent on that. So sensitivity and specificity are most often used.
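
The four quantities can be computed from paired observations once a cut-off is chosen. The sketch below follows the definitions as stated in the talk ("above the level" counts as a positive result); the data, the cut-off, and the function names are made up for illustration. The second function applies Bayes' rule to show why predictive values shift with how common the event is while sensitivity and specificity do not.

```python
def cutoff_performance(responses, events, cutoff):
    """Return sensitivity, specificity, PPV, and NPV for one cut-off.

    A subject is 'above' when response >= cutoff; `events` holds True/False
    per subject. Assumes every denominator below is non-zero.
    """
    pairs = list(zip(responses, events))
    above_event = sum(1 for r, e in pairs if r >= cutoff and e)
    below_event = sum(1 for r, e in pairs if r < cutoff and e)
    above_no_event = sum(1 for r, e in pairs if r >= cutoff and not e)
    below_no_event = sum(1 for r, e in pairs if r < cutoff and not e)
    return {
        "sensitivity": above_event / (above_event + below_event),
        "specificity": below_no_event / (below_no_event + above_no_event),
        "ppv": above_event / (above_event + above_no_event),
        "npv": below_no_event / (below_no_event + below_event),
    }


def ppv_from_rates(sensitivity, specificity, event_rate):
    """Bayes' rule: PPV depends on the event rate, not only on the test."""
    true_positive = sensitivity * event_rate
    false_positive = (1.0 - specificity) * (1.0 - event_rate)
    return true_positive / (true_positive + false_positive)


# Hypothetical paired data: one immune response and one outcome per subject.
responses = [1, 2, 3, 4, 5, 6, 7, 8]
events = [False, False, False, True, False, True, True, True]
print(cutoff_performance(responses, events, cutoff=4.5))

# Same sensitivity and specificity, different event rates.
print(ppv_from_rates(0.75, 0.75, event_rate=0.50))
print(ppv_from_rates(0.75, 0.75, event_rate=0.05))
```

With these made-up data all four measures equal 0.75 at a cut-off of 4.5. Holding sensitivity and specificity fixed at 0.75, the positive predictive value falls from 0.75 to roughly 0.14 when the event rate drops from 50 percent to 5 percent.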

So if you're going to deal with protective levels you need to look at sensitivity and specificity. As I said, some of the examples are in diphtheria, tetanus, polio, Hepatitis B, influenza, meningococcal. They have protective levels.

As I said, we can use continuous functions. These are logistic models, survival models, where the relationship of the immune response to the clinical event is continuous. And some of the examples where this has been done is varicella and pertussis, pneumococcal conjugate and colonization, and in otitis media.

So to summarize, the most simple are a single response, antibody after vaccination, a single outcome, a disease state whether you have the disease or not, and a logistic model or a protective level. You can get more complex. You can get into time-varying, multi-variate immune responses. You can have time-varying longitudinal data series for the events that are happening, and the relationship model is going to depend on how you set this up.

And at the bottom one of my favorite quotations, which I think is attributed to George Box who is a statistician, is that all models are wrong, but some are more useful than others. So when we pick these models it's for their usefulness, knowing that they are not always completely correct.

So how do we obtain a surrogate? What we were talking about before are just correlates. All they do is correlate things. Well, we're looking for causality, and correlation does not mean causality. And this quote is from Tom Fleming that "A correlate does not a surrogate make." Just because we found correlation, it doesn't mean that it's a surrogate.

In general, a surrogate explains all of the relationship, and let's talk about how we define this. Did I skip something? I guess not.

Just as a little thought experiment -- and what I mean by "all" -- suppose we have two groups, and they're randomized to vaccine and placebo. We vaccinate them, then we challenge them, and we measure the immune response prior to challenge.

And the results of this experiment are that in the vaccine group 80 percent survive, 10 percent survive in the placebo group, and the immune response in the survivors is 10, and the immune response in those that died was two.

Did the vaccine cause an increase in survival? The answer is yes, because we've randomized these two groups. Randomization is what gets us to causality.

Did the immune response cause the increase in survival? Well, we have a significant difference in the mean response between those that survived and those that did not. We may have a significant logistic regression here where we can predict immune response and survival.

Is it causal? Did it cause it? Well, maybe. The immune response isn't randomized. It's a post-randomization event. The subjects that got these low responses may be somehow different from the ones that got high responses and that's why they survived.

So from this little thought experiment, the vaccine caused an increase in survival, but without some additional work we can't say that the immune response is a causative factor. So how do we do this? How do we get this causative factor? How do we obtain a surrogate? Whether it's in Level 1, Level 2, but it's causal.
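
The point that a post-randomization measurement can track survival without causing it can be illustrated with a small simulation. This is a sketch under invented assumptions: a hidden host factor (hypothetical, standing in for something like general health) drives both the immune response and survival, and the immune response itself has no effect on survival at all. All names and numbers are illustrative.

```python
import math
import random
import statistics


def simulate_vaccine_group(n_subjects, seed=0):
    """Simulate vaccinees in which a hidden host factor drives response and survival.

    The measured immune response has NO direct effect on survival here;
    any association arises only through the shared hidden factor.
    """
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        host_factor = rng.gauss(0.0, 1.0)  # unmeasured, e.g. general health
        response = 6.0 + 3.0 * host_factor + rng.gauss(0.0, 1.0)
        p_survive = 1.0 / (1.0 + math.exp(-(1.4 + 2.0 * host_factor)))
        survived = rng.random() < p_survive
        subjects.append((response, survived))
    return subjects


subjects = simulate_vaccine_group(20000)
survivor_responses = [r for r, s in subjects if s]
death_responses = [r for r, s in subjects if not s]
print(f"mean response, survivors: {statistics.mean(survivor_responses):.2f}")
print(f"mean response, deaths:    {statistics.mean(death_responses):.2f}")
```

Survivors show a clearly higher mean response than non-survivors even though, by construction, raising the response would not change anyone's outcome. A significant difference in means or a significant logistic regression would appear in these data too, which is why correlation alone cannot establish the immune response as the causal factor.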

Four kinds of approaches. There are causal diagrams, and we're going to go into these. There's the Prentice criteria, there's the principal surrogates or principal stratification. It's known now I guess as the Rubin causal model. And there's Tom Fleming's hierarchy.

Now, causal diagrams are diagrams that demonstrate the causal effects. And this experiment, which is from Judea Pearl's article, referenced here -- we have a fumigant, and we want to estimate its effect on crop yields. Well, the way the fumigant works is we have to worry about last year's worm population, the worms are eating our crops.

We have to worry about the predator -- the worm predator populations, and the worm populations before, after, and end of the season, the growing season. And this structural equation model shows how all of these factors interact for the fumigant, the crop yields, and you can get a causality through something called structural equation models.

My opinion is that these kinds of causal diagrams are very good for looking at causal effects, for visualizing them. Using structural equation models are a little harder, and I have difficulty with them, and I -- I haven't seen too many applications in vaccines of these.

What is more popular and used more often are the Prentice criteria. And I mention these -- this would be a Level 1 surrogate of protection statistic in the recent publication. Four criteria -- references for these are down at the bottom.

The treatment impacts on the surrogate endpoints. The vaccine impacts the immune response. The treatment impacts the clinical endpoint. The vaccine increases survival. Third one is that the surrogate is related to the clinical endpoint in a correlative sense. In other words, immune response is correlated to the clinical endpoint. And the fourth one is that the surrogate contains all of the information about the clinical endpoint. And if you meet these, you've met the Prentice criteria to get a surrogate.

Mathematically, and this is I think about the only little math slide I have in here, the first three are not hard to meet in particular, because they're tests of significance. Is vaccine related to survival? Is immune response related to survival? Is vaccine related to immune response? And that's just a significance test. That's pretty easy.

The last one, does it explain all of the response? That's a lot harder. It's an equivalence test. Basically, what you have to show is that when you have the immune response in our model against the clinical endpoint, this term in here, which is treatment, and I just show it as one, that coefficient has to be zero. We can't prove anything is exactly equal to zero. It's an equivalence test.

So the thought in statistics is to look at proportion of treatment effect explained. In other words, if we explain 90, 95 percent, that's pretty close. By the way, this doesn't necessarily have to be just one factor -- vaccine, yes or no. We can look at things like, is the immune response consistent in the vaccine group and in the control group? Is it consistent across age groups? For the Prentice criteria, these would all have to be zero.
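
As an illustrative sketch (not part of the talk), the fourth criterion and the proportion explained can be written out as follows. The notation is an assumption introduced here: Z is treatment (vaccine, yes or no), S is the immune response, Y is the clinical endpoint, and g is a link function suited to the endpoint.

```latex
% Notation assumed for illustration: Z = treatment, S = immune response, Y = clinical endpoint
\begin{align*}
  g\bigl(\mathrm{E}[Y \mid Z]\bigr)    &= \alpha_0 + \beta\, Z, \\
  g\bigl(\mathrm{E}[Y \mid Z, S]\bigr) &= \alpha_1 + \beta_S\, Z + \gamma\, S, \\
  \text{fourth criterion:}\quad \beta_S &= 0, \\
  \text{proportion of treatment effect explained:}\quad \mathrm{PTE} &= 1 - \frac{\beta_S}{\beta}.
\end{align*}
```

In this notation, explaining "90, 95 percent" corresponds to a PTE of 0.90 to 0.95, that is, a treatment coefficient that shrinks to 5 to 10 percent of its unadjusted size once the immune response is in the model.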

Now, I'm not going to talk about the Rubin causal models in the interest of time, and I think since Don is a panelist, if you want to get into that, you can. But I'd like to go into, where do we stand today?

Just to remind you, as we talk through these things, you're going to hear I think mostly about correlates, what's related to survival. We need to see -- we need to remember, Level 3 is a predictive correlate where the scientific body of knowledge says that, yes, it's reasonably likely to predict it.

Moving up is Level 2, which is our surrogate, and there can be three kinds of surrogates as in that JID paper, where Level 1 is in a specific application, Level 2 is general, and there's different ways of proving that.

So as we go through the next day and a half, there's a couple questions I'll leave on the table. Has a correlate been obtained? How do we move from a Level 4, which is the correlate, to the Level 3, which is a predictive correlate? What information is needed to show that it's reasonably likely to predict clinical benefit?

I think it may be unlikely that we can get the Level 2 surrogacy, that validated surrogate, but maybe we can. And, if so, what do we need to do?

So to summarize, these correlate models, we need to determine the most useful model for relating the immune response to the clinical outcome. We need to consider surrogates, most likely in the Prentice sense, but a realistic goal is to move from this Level 4, which is just correlation, up to Level 3 where we think that it's reasonably likely to predict clinical benefit.

And I haven't mentioned at all pan-species surrogacy. Is it acceptable to infer from rabbits and non-human primates into humans? And how do we do that?

So I will answer any questions that you have.

(Applause.)

MR. SUTER: Mark Suter is my name. I would like to go to the last point. How do you actually do that? Maybe you compare -- is there a logical immune response between the rabbit and the human? The human has four IgG, the rabbit has one. The human has IgD, rabbit has none. Human has two IgA, the rabbit has 12.

DR. KOHBERGER: I'm a statistician, not an immunologist.

(Laughter.)

So I'm going to finesse this question, because I don't know. I mean, from a -- you know, from a statistical point of view, I mean, we think pretty -- I don't want to say simplistically, but we'd like to see efficacy trials in humans. But we can't do it. I mean, you know, it's impossible.

So what we need are the immunologists to come to some sort of an agreement that these immune responses that we obtain in non-human primates and rabbits are reasonably likely to predict efficacy in humans, in the face of all that you've said.

MR. SUTER: Thank you.

DR. KOHBERGER: Anything else?

(No response.)

Thank you.

DR. LYNN: I'll just introduce myself again. I'm Freyja Lynn, and what I want to talk about in the next 20 minutes or so is the status of the assay that you'll be hearing a lot about in the next day and a half.

This is the assay that is a likely candidate to be used as a correlate of protection, and I just want to sort of talk about the assay performance, so that in fact we can reassure ourselves that the assay is a good choice, just from an assay standpoint. So I'm not going to get into any of the correlative stuff. I just want to talk about the assay itself.

Before I go any further, I have to admit that I didn't generate any of the data, didn't do much of the data analysis either. This assay was originally set up in the USAMRIID Laboratory, was transferred to the CDC. The CDC did a tremendous amount of work on standardizing the assay and tech transferring it, and I think some of the data I'll show will show you how successful they were in that effort.

I'll be showing you data from an interlab study. The participants are listed here. Tremendous amount of work from Battelle groups. And, finally, the data analysis was done by Precision Bioassay -- David Lansky is the head of that group -- and his people have done a tremendous job looking at these data for us.

So what do we think about when we think about an assay that's going to be used as a correlate of protection? The toxin neutralization assay that I'll be discussing today actually measures the ability of serum to neutralize the effect of lethal toxin on a cell substrate or a monolayer of cells. And I kind of broke this into three areas that I tend to think of.

The first is the relevance. Obviously, that's the most important in certain regards, and I think we'll hear a lot in the next day and a half about the relevance of this assay. And just to touch on that briefly, we do know that antibodies to toxin are a mechanism of protection. You'll see -- and, again, I'm not going to present any data on the relevance issues right now.

TNA is attractive, as Bob said, because it measures the functional antibody rather than just all of the antibody that's generated, and you'll see data in the next day and a half that show that it correlates quite well with protection in rabbits and non-human primates.

An assay has to be applicable. You have to be able to use it, and it has to be appropriate to answer the questions that you're asking from an assay performance standpoint. And I think what I'm going to try to do today is show you that, in fact, the assay is adequately sensitive, dilutionally linear and precise, and that it is a pan-specific assay, or a pan-species assay. And I think this is critical as we move forward.

The question we just got: how do you compare among species that have different antibody subclasses? Well, I think a functional assay that performs the same across species is a good first step.

Finally, the assay has got to be practical. You can't have an assay that takes you two weeks to run a single sample. And, again, I hope to convince you that we have good precision in this assay that allows for a high throughput, and that it actually is quite robust across multiple laboratories.

For those of you who are unfamiliar with what the assay does, essentially all you do is mix lethal toxin together with your serum sample. Antibodies in the serum will neutralize the toxin. You add that to a cell monolayer, and if there's any free toxin left it will intoxicate the cells and kill them, and you measure the viability of the cells after the intoxication.

The data analysis, for those of you -- just a quick brief for those of you who are not assay geeks like me, just so you have a clue what I'm trying to tell you, you run a series of dilutions of each serum sample, and so you get a titration curve that is just simply plotting the OD, which is the cell viability, against the dilution of the serum.

The data are reduced for each titration curve using a four-parameter logistic fit. The four parameters are the lower asymptote, the upper asymptote, the inflection point, or the ED50 as we call it, and the slope, and I'll be talking about those four parameters in a little bit -- a little bit later.
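
As an illustration of the four-parameter logistic described above, the sketch below uses one common parameterization. The talk names the four parameters but does not give the equation, so the functional form, the function name `four_pl`, and all numeric values are assumptions chosen for illustration.

```python
def four_pl(dilution, lower, upper, ed50, slope):
    """One common four-parameter logistic form (assumed, not taken from the talk).

    dilution: reciprocal serum dilution (e.g. 100 for a 1:100 dilution)
    lower/upper: the two asymptotes of the OD (cell viability) signal
    ed50: dilution at the inflection point, where the signal is halfway between asymptotes
    slope: steepness of the curve around the ED50
    """
    return lower + (upper - lower) / (1.0 + (dilution / ed50) ** slope)


# Hypothetical parameter values chosen only for illustration.
params = dict(lower=0.2, upper=1.4, ed50=400.0, slope=1.5)

# A two-fold dilution series starting at 1:50.
dilutions = [50 * 2 ** i for i in range(8)]
curve = [(d, round(four_pl(d, **params), 3)) for d in dilutions]

# At the ED50 the signal sits midway between the two asymptotes.
midpoint = four_pl(params["ed50"], **params)
```

With this form the signal falls from the upper asymptote toward the lower asymptote as the serum is diluted out, and a sample whose ED50 lies beyond the tested dilutions would show only part of that curve.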

You'll see here that for very high titer samples we get very nice full curves. For lower titer samples we get what we call partial curves. Again, I'll be talking about that a little bit later. The readouts that we're using are the ED50, which again is simply this inflection point, and something we're calling the NF50, which is simply the ED50 of the sample divided by the ED50 of a reference run on the same plate. We've found that this actually normalizes data between assays and between labs, and I'll show you some data on that in a moment.
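
The NF50 ratio described above can be sketched in a few lines. The function name `nf50` and the numbers are hypothetical; only the definition (sample ED50 divided by the ED50 of a reference run on the same plate) comes from the talk.

```python
def nf50(sample_ed50, reference_ed50):
    """NF50 as described in the talk: sample ED50 divided by the reference ED50 from the same plate."""
    if reference_ed50 <= 0:
        raise ValueError("reference ED50 must be positive")
    return sample_ed50 / reference_ed50


# Hypothetical numbers: the same sample and reference measured in two labs,
# where lab B reads everything 1.5-fold higher than lab A (a purely multiplicative shift).
lab_a = nf50(sample_ed50=300.0, reference_ed50=600.0)
lab_b = nf50(sample_ed50=300.0 * 1.5, reference_ed50=600.0 * 1.5)
```

A shift that scales the sample and the reference alike cancels in the ratio, which is one way to see how the NF50 could normalize data between assays and labs. Shifts that are not multiplicative would not cancel this way.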

DMID has been sponsoring a variety of different studies that have been conducted in a variety of different locations. I'm not going to go into all of them. I just wanted to give you an idea of what we've been working on. What I'm going to talk about today are some of the validation that we've been doing for rabbit and non-human primate. We do also have a human validation underway, but I don't have data on that right now.

And mostly I'm going to talk about the -- what we'll call the interlaboratory study, so let's get into that. The interlaboratory study was a tremendous effort from a lot of different people, and it involved seven different laboratories.

And what we did was we put together 108 samples that we sent out as a panel blinded to each of the seven different laboratories. We had a mix of rabbit, non-human primate, and human samples, and we asked the laboratories to provide us with two reportable values, so that we could look at precision as well as agreement.

We included in the panel low, medium, and high samples. We also did spiked samples where we took high samples, we spiked them into negative serum so we could look at dilutional linearity in each of the three species, and we asked each laboratory to run their own assay. And I think it's important to note that six out of seven of the laboratories had participated in a common tech transfer sponsored by the CDC, and those data I think are quite interesting.

For analysis, we were interested in essentially three different areas. One was the pan-specific or pan-species performance. Can we really say that the assay performs the same for each of the three species and allow us to make direct comparisons among antibody levels among the three species? And we looked at titration curves, the individual titration curves, and the dilutional linearity for each species.

We also looked, then, at agreement among the laboratories. We have different laboratories doing different assays that ultimately will have to probably be compared, one laboratory doing clinical samples, another laboratory doing animal samples, and so we need to understand how the data from each of these laboratories are related.

And, finally, we looked at some -- I'm going to show you some precision data, both from the interlaboratory study and from some of the assay validation work.

So right into the data. This is from the interlaboratory study, and we were looking at, again, the similarity of the species. So we're going to talk about the species comparability within the assay. What we have here is we had four different laboratories which provided us with the parameters from the four-parameter logistic curve fit.

We have the lower asymptote, the upper asymptote, and the four-parameter slope. And across the bottom you can see that for each laboratory we have the reference material, which is a human reference, the human samples, the non-human primate, and the rabbit samples. And all we're doing is comparing the parameters for the titrate -- for each of the -- or from all of the titration curves for each species.

And what you can see is that for the lower asymptote and the upper asymptote, for each laboratory each of the three species is essentially behaving exactly the same way. So this provides us evidence that the titration curves themselves among the species are very similar in the assay.

You'll note that you may see a little bit of difference from lab to lab, but within a lab they are very similar. In particular, I find it interesting that the slope looks very good.

This is a very similar analysis, except it combines all of the data from the four laboratories, and it looks at each of the three species with regards to the human reference material. So if we're going to use like a human reference in all species to incorporate the NF50, which I spoke of earlier, then we need to show that the titration curves are the same, so that we are legitimate in making that comparison.

And, again, the lower asymptote, the upper asymptote, and the slope, especially for the human and non-human primate, are very tight. The rabbit may be a little bit different here, but it's under 30 percent, and this is a cell-based assay.

I think it's interesting and probably predictable that the human and the non-human primate would be the most similar, and having a little bit of a different rabbit is not unexpected, and we don't feel that this is a huge, huge difference to raise any real concerns.

The other aspect we looked at was the dilutional linearity. Again, we took a sample and we created a series of spikes from that sample, and then measured the ED50. So if you plot the spike versus the ED50, you should get a straight line with a slope of one.

We had, you know, several samples that we did dilutional linearity for, for each of the three species, and as you can see for the most part the broken line is the ideal, the solid line are the actual data, and for non-human primate and rabbit you can see that they are astonishingly dilutionally linear.

The human may be varying just a little bit. It turns out that this slope is about 1.16. Again, maybe a little steeper than the other two species, but, again, well within what we would expect for a bioassay.
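
The dilutional linearity check described above can be sketched as a slope estimate. The talk does not say on which scale the slope is computed; the sketch assumes a log-log fit of ED50 against spike level, and the spike levels and ED50 values are hypothetical.

```python
import math


def loglog_slope(spikes, ed50s):
    """Ordinary least-squares slope of log10(ED50) against log10(spike level)."""
    xs = [math.log10(s) for s in spikes]
    ys = [math.log10(e) for e in ed50s]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical spike levels (relative concentration of the high sample in negative serum).
spikes = [0.125, 0.25, 0.5, 1.0]

# Perfectly linear case: ED50 scales in direct proportion to the spike level.
ideal = [800.0 * s for s in spikes]
# Slightly steeper case, constructed to have a slope of 1.16 on the log-log scale.
steeper = [800.0 * s ** 1.16 for s in spikes]

ideal_slope = loglog_slope(spikes, ideal)
steeper_slope = loglog_slope(spikes, steeper)
```

Under this assumption a slope of one means that halving the amount of antibody halves the measured ED50, and a slope of 1.16 means the measured ED50 drops somewhat faster than the spike level does.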

So, essentially, this is just a conclusion stating that we think that essentially the species are performing the same within the assay.

The next thing we looked at was the laboratory-to-laboratory agreement in the interlab study. These are modified Bland-Altman plots, where each laboratory is compared to a consensus ED50 that was calculated by using all of the data from all of the laboratories.

And on the Y-axis, here is exact agreement one to one, so perfect agreement would be a straight line at the one to one. This would be a two-fold difference, a four-fold difference, and these are the 95 percent confidence intervals. So as you can see, when you use the ED50 as a readout, we are seeing some systematic shifts like, for example, between Lab A and Lab C.

I think it's interesting to note Lab D was the one laboratory that did not participate in the tech transfer, and they are the ones that may be just a little more different. But if you start looking at the higher ED50 values, they also agree quite well. We lose agreement for the most part at the low end, and that's actually true generally across the board.

And, again, that would be expected. You tend to get your least precise, least accurate values at the lowest ends of the assay. And this is using all reportable values, so we haven't taken into account a limit of detection or a limit of quantitation.
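
The comparison of each laboratory to a consensus value can be sketched as a fold difference. The talk does not say how the consensus ED50 was calculated; the sketch assumes a geometric mean across labs, and the lab values are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean, a natural average for titers that vary on a fold scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fold_difference_from_consensus(lab_ed50s):
    """Return each lab's ED50 as a ratio to the consensus (here: geometric mean across labs)."""
    consensus = geometric_mean(list(lab_ed50s.values()))
    return {lab: ed50 / consensus for lab, ed50 in lab_ed50s.items()}


# Hypothetical ED50s reported by three labs for one sample.
sample = {"Lab A": 200.0, "Lab B": 400.0, "Lab C": 800.0}
ratios = fold_difference_from_consensus(sample)
```

A ratio of one is exact agreement with the consensus, and ratios of 0.5 or 2 correspond to the two-fold lines on a plot of this kind.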

If you do the same analysis but you use the NF50, again, that's the ratio of the ED50 of the sample to the ED50 of the reference, you can see how that starts to normalize the data, so that some of these shifts start to level out, everybody kind of comes closer together in terms of agreement. But overall we think the data show that the labs are very, very close.

This is the same kind of a plot. It's just each laboratory compared head to head to every other laboratory in the study. And I'm just including it; I'm not going to go through it. It essentially gives us the same message.

So essentially if you look at the data we had very good agreement among all seven laboratories, especially when you look at ED50s well into the working range of the assay. And I'll show you some data on LOD and LOQ in a minute. And the six laboratories that participated in this tech transfer had actually quite phenomenal agreement.

And, again, I remind you this is a bioassay. This is not an ELISA. This is a difficult assay to run, but it has been so well characterized and standardized that, in fact, it performs amazingly well.

When we looked at assay precision, again, this is from the interlaboratory study. If we take all of the data from all of the labs, and we say, okay, over everything, between all the species, between all the labs, how precise is this assay, and if you look at the ED50 -- and this is a percent relative standard deviation, which is essentially the same as a coefficient of variation, if you're used to seeing CVs -- the total variation is only 45 percent. That's seven laboratories and three species and a bioassay.

And I'll tell you that when we did the validations our criteria in an individual laboratory was round about 50 percent. So even if we go to seven laboratories we're still under 50 percent.

Here is some nice data about the NF50. If you go to using the NF50, you can see that the laboratory variability goes from 29 percent to 13 percent, which in turn drops the total variability down to 35 percent. And I think at this point, for those who are interested, we are looking into moving forward with an NF50 readout, so that we can normalize data and hopefully make data more comparable as we move forward in the project.
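
The relationship between the laboratory component and the total can be checked with simple arithmetic. The sketch assumes the variability components are independent and that their percent RSDs combine in quadrature, which the talk does not state; only the 45, 29, and 13 percent figures come from the talk.

```python
import math


def combine_rsd(*components):
    """Combine independent %RSD components in quadrature (an assumption, see lead-in)."""
    return math.sqrt(sum(c ** 2 for c in components))


# Figures quoted in the talk for the ED50 readout.
total_ed50 = 45.0   # total %RSD
lab_ed50 = 29.0     # between-laboratory %RSD

# Everything that is not between-lab variability, under the quadrature assumption.
other = math.sqrt(total_ed50 ** 2 - lab_ed50 ** 2)

# Swap in the between-laboratory %RSD quoted for the NF50 readout.
lab_nf50 = 13.0
total_nf50_estimate = combine_rsd(other, lab_nf50)
```

Under these assumptions the estimate comes out near 37 percent, close to but not identical with the 35 percent quoted, so the actual analysis presumably differs in detail from this simplification.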

That was all laboratories, all species. I just thought you might be interested in seeing just one species in a single laboratory. These are the rabbit validation data from Battelle Biomedical Research Center, who is performing validations on all of these laboratories. This is, again, just ED50. The NF50 data are similar. And you can just see the various different components.

One thing to point out here is, as I pointed out in the data slide, we have full curves, partial reactive curves, and non-reactive curves. These are our pretty curves where we get upper and lower asymptotes and everything is really pretty. The partial reactives and the non-reactives rely on the software to do some extrapolation. And, again, these are the lower samples.

And as predicted, your CVs are lower. These are your CVs. The PRSD is lower for full than for your non-reactive. And, again, this is just a reflection of the fact that we're well within the working range of the assay, and you can see how tight -- this is just -- the plate is.

If you go down to the total variability, again, for full and partial reactive, we're running about 25 percent. We get to non-reactive, our lowest values, and our PRSD goes up as expected. But, again, even 37 percent at this low value is quite good.

So I think just in terms of assay precision, within a laboratory it's actually a lot tighter than we thought it was going to be at about 25 to 30 percent for the ED50. And when we move to multiple laboratories, multiple species, we're still only at 45 percent. And, in fact, we can improve that if we go with the NF50 readout.

A little bit about assay sensitivity. Limit of detection, this is calculated looking at the probability of a non-zero ED50. Essentially, if you measure a sample and you get a positive result, a value spit out at you, how confident are you that the next time you assay that sample it will be positive again?

And so this is where we know at about an ED50 of 25 percent -- or, I'm sorry, an ED50 of 25 we have about a 95 percent confidence that if we measure that again we'll get a positive value. So we know that any ED50 above 25 is truly a positive measurable value. Anything below 25 we have a little bit of question as to whether it's truly negative or truly positive.

The limit of quantitation is the point at which you begin to improve your precision, you get more confidence in the precision of your data. The LOQ -- this is showing both rabbit and non-human primate, where the LOD is the same. This is where we show a little bit of difference between the rabbit and the non-human primate, the LOQ for the rabbit being 35, for the non-human primate being 45. This is based on the probability of a reactive curve.

Again, we know that the reactive curves are giving us our best data. So this is actually a pretty conservative estimate for the LOQ, and you can see that in fact it is quite good at 35 and 45. The other point is that the rabbit and non-human primate end up with the same LOD and very similar LOQs, which again is evidence that this assay is measuring very similar things in the two species.

So, in summary, I think our data to date suggest that we don't have any really important differences among the species, and that this is in fact a pan-species assay. We've looked at the performance of the neutralization curves.

Again, one of the issues with the neutralization curve is if you are seeing very different mechanisms, if the avidities were very different among the species, if there were truly differences in character of the antibody, you'd start to see that reflected in the titration curves. And we're not seeing a big difference.

The species are performing the same with regard to dilutional linearity, precision, and the LOD, and the LOQ. I think the other thing is that the laboratories, especially when they cross-calibrated, are performing very similarly and reporting results almost identically. And I think that bodes well for the use of this assay moving forward.

And with that, I'll take any questions.

(Applause.)

MS. VOLKMANN: Ariane Volkmann. You have shown -- I have two questions, actually. You have shown that the standard deviation is much smaller when you look at the rabbit assay as compared to the total where you have the three species. Is that because it was performed in one lab, or is that because there was a real difference between the species?

DR. LYNN: I think that's mostly because it was performed in one laboratory. I'd have to go back and pull up the slide again. I think it's mostly because it was performed in one laboratory. When we finish the non-human primate, we'll know the precision for the non-human primate versus the precision for the rabbit, and right now it looks like those precision values will be very similar.

MS. VOLKMANN: So if you look at the ED50 and compare the species, they have the same values?

DR. LYNN: I don't understand what you mean by "the same values." Each animal has its own ED50 value, but when you --

MS. VOLKMANN: Okay.

DR. LYNN: -- repeatedly measure a sample, the variability that you get among your measurements is the same whether you're measuring a rabbit sample or a non-human primate sample.

MS. VOLKMANN: Okay. Yes, I'm asking because of the comparison.

DR. LYNN: Right. Right.

MS. VOLKMANN: And the second question is: does that functional assay correlate well with the ELISA? Because when you measure by ELISA, because it's easier, you always measure the neutralizing antibodies as well. So if it correlates well, I don't quite see why you do have to do the functional assay, because you know that you have functional antibodies detected in your ELISA.

DR. LYNN: Exactly. And that's a very good point. And, in fact, these two assays do correlate quite well. The problem that's coming to light at this point as we get more data is that depending on when you measure the immune response, whether you measure post first vaccination or post second vaccination, early in the response or late responses, the correlation between those two assays changes.

And so you can't -- you can't universally say that the assays correlate all the time the same way. That correlation changes. And so to me that means that you're going to get a slightly different answer in different studies depending on which assay you use. And my bias is to go with the functional assay where we can, because we think that is probably the more relevant measurement.

MS. VOLKMANN: Okay. Thanks.

DR. CHAWLA: I'm Anil Chawla from Panacea Biotec Limited. In slide 8, when you were showing the similarity of species with the laboratories, you have only used eight laboratories -- four laboratories, data from four laboratories. Why is that so?

DR. LYNN: That was -- that's simply convenience. Those four laboratories were actually the only ones that reported the four parameters directly, so it was very easy to extract for those four laboratories those data. We can go back and get those data, but for this analysis it wasn't worth going back to all seven laboratories. So it was purely convenience. We had the values for those four labs.

DR. CHAWLA: My second question is related to MTT dye. There are issues in using the MTT dyes. So there are better dyes, water-soluble dyes, which are available now. Are you having any move to move toward that -- those kind of dyes?

DR. LYNN: There are some laboratories that are working on developing the assay further, and one of the aspects is using a different dye. And, in fact, one of the laboratories in this study does use a slightly different MTT method, and their data came out looking essentially exactly the same. So I think we could -- we could do that kind of improvement.

DR. FERRIERI: Pat Ferrieri. I wanted to be sure that I understood that the assay done and reflected in your data was based on the Pitt publication in terms of the precise doses of recombinant PA and lethal factor. Is that the case or not?

DR. LYNN: I would have to go back and look at that, but the doses of toxin are very similar among all of the assays. In fact, most of the laboratories that are running the assays are using the same material from List, and CDC is actually the one that took the assay from USAMRIID and did some more fine-tuning in terms of selecting the optimum doses, but, yes, they are essentially the same.

DR. FERRIERI: So it is the rPA.

DR. LYNN: Yes. Oh, absolutely, yes, it is rPA.

DR. BURNS: Drusilla Burns. I just want to get back to the question about can you use ELISA instead of the TNA. And I think one of the big problems in using ELISA is ELISA uses species-specific reagents in order to develop it. So I'm not sure that an ELISA titer from one species could be directly translated to the other. The beauty of the TNA is there are no species-specific reagents.

DR. LYNN: Yes, Conrad.

DR. QUINN: Conrad Quinn, CDC. I apologize, I'm losing my voice, so I'll be squeaking later this afternoon. To address Dr. Ferrieri's question, the assay that we're talking about here was technology transferred from the CDC. It was based on publications by Steve Little and Art Friedlander.

The amount of toxin is titrated to give 95 percent cell death, so we're actually building a model around 100 percent survival and 95 percent cell lysis. So it's titrated to give those values.

MS. BELLE: I'm Archana Belle from Planet Biotechnology. With regards to species-specific, I had two questions, one is with regard to the ELISA and species specificity. We, of course, now allow us to do this without having a second detection agent, so we can bypass the issue of species-specific reagents.

A second question is this TNA does not look at the clearance mechanism of the toxin/anti-toxin complex. Have you any thoughts about differences in species? And are animal models still correlative to humans? Or any thought about, how do we look at that as a big picture?

DR. LYNN: We've not done any looking at the clearance mechanisms, and I know that there are several -- there are several papers out there looking at the clearance and how the toxin is cleared at this point in time. But no, we haven't -- we haven't looked at that.

MR. KAMMANADIMINTI: Srinivas from Cangene. I would like to know what was the reference standard used for NF50 calculation.

DR. LYNN: AVR-801.

MR. KAMMANADIMINTI: 801.

DR. LYNN: Yes. That is -- that's the reference that was developed by the CDC. It is available through the BEI -- NIAID BEI program, and ultimately I think that's probably going to be our gold standard for our work.

DR. QUINN: Conrad Quinn, CDC, again. Regarding ELISA and TNA, or ELISA versus TNA, I think in our perspective it is important to use both, because, yes, the TNA does measure neutralizing antibodies, but in relevance to clearance, antibodies that bind protective antigen are still biologically active. Although they may not neutralize the toxin, they are still part of the clearance process and complex formation, so they should not be excluded from our analysis. So I would suggest that ELISA and TNA are both important.

DR. LYNN: Yes, I would agree.

Okay. We are, amazingly enough, running 15 minutes ahead. So let's go ahead and take our break, and we'll see you back here at the appointed time.

Thank you.

(Whereupon, the proceedings in the foregoing matter went off the record at 9:47 a.m. and went back on the record at 10:26 a.m.)

DR. HEWITT: Okay. We're going to open Session Number 2 on Animal Models for General Use Prophylaxis. Our first speaker from DMID is Ed Nuzum, and he's going to talk about the Rabbit Challenge Model: Interpretation and Implementation of the Animal Rule.

DR. NUZUM: Thanks, Judy. And good morning, everyone.

So in my talk today I'd like to just kind of reflect on some of our experiences and how those experiences impact our interpretation and implementation of the Animal Rule with regard to the rabbit anthrax aerosol challenge model.

PARTICIPANT: We can't hear you.

DR. NUZUM: Okay. Is that better? So I'm going to talk about our experiences with regard to implementation of the Animal Rule. As most of you probably know, this is a critical part of our rPA anthrax vaccine development program, so it has been something that we have given a lot of emphasis to.

So this fairly simple cartoon was made by Scott Winram a few years ago, and it certainly oversimplifies the complexity of everything we're trying to do. But most of you are I'm sure familiar with non-clinical studies and clinical studies that are conducted in support of vaccine development.

There are a couple of pieces I think we tend to overlook or not give enough attention to early enough, the first one being the product itself, the countermeasure that you're working on. And it's important with regard to the animal model because once the natural history studies are done, you have to have a product to put into the model to continue development.

And there's a concept that we try to talk about of quality and maturity of both the model and the product. Early on in the product development stage we think it's fine to have product that -- well, you're not going to have product that's high quality product, and the model itself will also be immature.

But the idea would be that as the development path -- as you go down the development path, and the product matures, the model matures, such that when it's time for pivotal studies that are IND- or BLA-enabling studies, the model would be much better characterized.

The other piece is assays, and Freyja just gave a nice talk on what's involved with that and how complex that whole piece is. And it's absolutely essential, because that's the piece that ties all of the data from multiple species and multiple labs, and it's very complex. We're very fortunate to have Freyja help us run those trap lines.

Our approach is really very straightforward for the GUP indication. We do active immunization with increasing doses of vaccine. We evaluate the vaccine dose-dependent immune response with regard to protection, then determine the immune response at the time of aerosol challenge. Concurrently we conduct clinical studies, and then we evaluate the protective titers in animals with regard to the immune response in humans.

When we first started these studies, or this entire effort, we came up with several areas -- or folks -- those of us in the government, contractors, everyone came up with several areas, large focus areas that we thought would be needed to be looked at during the development cycle for the product. And we probably -- well, we have the first few years concentrated on these first three bullets. We're currently moving our focus into passive immunization and time to protection, and ultimately we'll do high-dose challenge studies and duration of protection, so that -- we'll do those studies when we have a more final drug product that's consistency lot material, it's GLP product made at full scale.

I want to talk about several assumptions, and on the surface they seem very straightforward. And those of us that work with this every day, they have become kind of second nature. But I want to emphasize that these assumptions have either come out of public workshops or considerable internal debate, and there is a lot that has happened and been discussed behind the scenes that goes into these assumptions.

But they have been absolutely critical from the standpoint of starting the studies, and then also for advancing and making progress with the model as we get additional data and perform additional studies.

So the first assumption is that rabbit and non-human primate are relevant model species for prediction of protective behavior of anthrax vaccines in humans, that the protective correlates developed in non-human species will be protective -- predictive of protection for humans, and that the clinical benefit provided by countermeasures in relevant models provides confidence that similar effect of countermeasure in humans will be predictive of clinical benefit in humans.

So that's kind of a mouthful. It seems rather circuitous, but the key piece here is the relevance of the animal models. And I think that's really the -- I want to emphasize that, because I think that's the basis of the Animal Rule. If the animal responses, immune responses, pharmacokinetic responses, the pathophysiology, are similar to what you see in man, then that enables you to make this extrapolation from animal efficacy to human immunogenicity.

Anti-PA antibody mediated neutralization of anthrax toxin is an acceptable correlate because it can be used to reasonably predict clinical benefit, and it is associated with the prevention of known pathological -- pathophysiological mechanisms.

Now, Drusilla touched on these same points, and I guess what I want to emphasize here is that the terminology, you know, regarding clinical benefit, pathophysiological mechanisms, comes straight from the Animal Rule. And what I want to emphasize, we do use the Animal Rule as our guiding principle in this model development, and we're actually implementing it. And I hope that this workshop will show that is in fact what we are doing.

Circulating antibodies to PA at the time of animal challenge are an appropriate predictor of protection. This is a point -- and Bob alluded to this in his talk -- there's -- this is our assumption. There's other ways to do this with regard to timing of when you measure the immune response and the event, and I think CDC will talk about this a little bit. But another option is to look at peak responses soon after vaccination, and then look at how that correlates with protection when challenged at some point in the future.

I don't -- it's not that either is right or wrong. It's just -- I'm just pointing out there's different approaches, and this is the assumption we're using for our models.

Next, there is an antibody threshold above which protection is conventionally adequate, and the antibody threshold is the same in animals and humans. Again, conceptually, on the surface this sounds very simple, but implementation, doing the studies, getting the data to demonstrate this is not so straightforward.

The Animal Rule does not specifically require correlates or surrogates of protection. I mention -- I make this point, because depending on the complexity of the model, the endpoints you're looking at, it simply may not be feasible to obtain actual correlates. So I think that's something we need to keep in mind. But that said, the correlates provide us a very powerful tool.

And if we can get a correlate, we need to do it, and that's the reason this discussion will -- this workshop that will focus a great deal -- well, it is the focus of the workshop. So it's just -- but it's a point that I think we need to be -- to keep in mind what the Animal Rule really says and what it doesn't say, and then keep that perspective.

So the assumption, then, is that -- here is that to the extent that correlates of protection are feasible, attainable, and facilitate implementation of the Animal Rule, every effort is made to develop meaningful correlates.

The Animal Rule requires that the effect is demonstrated in more than one animal species. However, it does not require that one non-human species is predictive of another, or that multiple non-human species are comparable. Again, this is something we have debated, and it was -- this point was really kind of brought to my attention by Bob Kohberger in that establishment of the correlate will be much easier to do if multiple species are comparable, and if they are predictable. And we need to do it or make every effort to show that they are.

So, again, what the Animal Rule says, what it absolutely requires, and what we need to do or try to do maybe a little bit different, and the assumption here, then, is that the demonstration of comparability between non-human species is highly desirable and will be attempted but is not essential for product licensure.

The Animal Rule does not require 100 percent lethal animal models to the extent that human lethality is not 100 percent. Again, we fully appreciate the value of models where all controls die, and we make every effort to develop models where that's the case. But just keep in mind it's not a specific requirement, so, again, I'm trying to give the perspective here of what's desirable, what's nice to have, what's doable, and what you have to have.

So the assumption on this slide is the demonstration of fully lethal animal models will be attempted, but is not essential for product licensure.

The Animal Rule requires adequate and well controlled animal studies. It does not require validated animal models and systems. Again, this is -- this is kind of second nature to most of us working in this now, but it -- in the early stages this was a subject of considerable debate.

And what we've concluded and our approach is, that the -- and the assumption here is that components of an animal model system that can be validated will be validated, and models will be developed to generate data produced from adequate and well controlled animal studies.

So this is usually where the question will come up, "Well, what is good enough? When is it adequately developed, and how do I get there?" And unfortunately there is not a simple answer, and it's not going to be given up front. It will only come with data and numerous studies and numerous discussion -- or lots of discussion and analysis.

This slide kind of follows on to the same point. It's taken from a presentation given by CBER in January of 2006. And what I want to point out here, really, is that studies must be reproducible and predictive for infected negative controls.

I'm not going to talk about all these points here. They basically summarize the kind of things we would normally want to do in the conduct of good science.

I would mention that pivotal or definitive studies must be GLP, so the implication there is that studies prior to pivotal studies do not need to be GLP. However, as the model matures, as the product matures, you want to get the increased quality and rigor in your studies, and you will incorporate GLP studies probably well before pivotal or definitive studies.

However, for initial proof of concept studies you probably wouldn't want them to be GLP. In fact, I would argue they probably shouldn't be GLP from a resource perspective.

Now, as we've -- as we've gotten more data, we've done more studies, you know, other issues crop up, and so I've kind of lumped these under other considerations. You know, there was a question this morning on Ig subclasses, but antibody functionality may be affected by vaccine regimen and/or time since last vaccination.

Antibody from active immunization may be different than antibody that's passively transferred. Purified IgG may be different than unpurified IgG that's passively transferred in plasma. There may be differences between similar vaccines. There may be differences between species. So this is a new area, a relatively new area that we've thought about and we're beginning to explore.

Correlate endpoint levels generated from active and passive immunization studies may be different, and we'll see some data later in the workshop that makes this point.

Initial development of correlates for rPA vaccine will be for the GUP label indication, and -- but with an emphasis on time to protection. And this is probably a difference between HHS and DoD where DoD is probably more concerned about duration of immunity, HHS is more concerned about time to protection with regard to post-event scenarios.

I have several conclusions here. Multiple studies are required in each species for each indication. BSL3 aerosol GLP studies are complex and costly, and, in fact, non-GLP studies are complex and costly in this environment. So there is -- the overall -- because of this complexity and cost the overall model development plan strategy needs to be well thought out.

From the standpoint of cost, staffing, facilities, animal utilization, especially non-human primates, it requires careful thought and planning -- a long-term plan to the extent that that's possible that the right studies are done in the right sequence and that we minimize redundancy.

And, again, the assumptions are important. Because of the cost of these studies, the assumptions we make help us move forward without having to address every possible question. Perfect solutions to specific issues are rare. I kind of modified Bob's quote when I did this, but perfect solutions to specific issues are rare, but good planning, science, and data are essential to address them in the best manner possible.

And these are all kind of related. I mean, there's a theme here you can tell. But it's not -- neither is it feasible to appropriately address all possible questions. It's much easier to ask questions than to do the studies to answer them.

So when these questions come up, and I'm sure there will be many in this workshop, we have to ask ourselves constantly, "Do we really need that answer? If we have that answer, how will it help us? What are the possible outcomes of the study that could be -- including the negative outcomes? And will the study really be worthwhile?"

And I guess, finally, when we're thinking about questions, we have to ask, "Is it attainable?" You know, it may be a very interesting question, it might be potentially very useful, but it simply may not be attainable or not attainable in a practical and feasible manner.

And, finally, one of the main points that I like to make is that animal model and assay development consist of iterative process development studies and data-driven decisions that guide FDA, funding agency, and product sponsor decision-making.

The other statement that I guess I wanted to conclude with -- and it's not on here -- is this: if I can make one statement that is a very high-level, practical way to capture all of this, I would say that unless the animal model is very well developed, it's unrealistic to expect that one study is going to adequately address any specific question or issue that has been raised.

So with that, I will conclude. I'm happy to take any questions.

Thank you.

(Applause.)

DR. CHAWLA: Hi. I'm Anil Chawla from Panacea Biotec. What is the scientific basis of claiming that antibody threshold is same in animals and humans? Because of difference, you could drive the animal threshold to -- you can correlate really, but they cannot be same. What do you say about that?

DR. NUZUM: They cannot be the same in what regard?

DR. CHAWLA: The value. They can be same in terms of value?

DR. NUZUM: Well, if certain levels are protective in animals -- so I think your question is: how do you know that the level that's protective in animals is going to be protective in humans?

DR. CHAWLA: Exactly.

DR. NUZUM: Right. So that's where I think that the Animal Rule is -- we have to rely on the Animal Rule, and the requirements for animal models that are relevant and well characterized.

I mean, at some point it's a leap of faith, it's a prediction. But the data -- the efficacy data that you see in animals, if you get a similar endpoint, a similar threshold level, whatever -- and there's going to be more talk on this, and maybe this point will come out better, because we're going to have clinical and efficacy data. But at some point you make a leap of faith based on your animal efficacy data that those same endpoints in humans will be predictive of clinical benefit.

DR. CHAWLA: My second question is related to the multiple studies that are required in each species for each indication. Can you --

DR. NUZUM: Well, the one slide I listed with regimen studies -- I mean, you do your initial proof of concept, your regimen, the different studies for determining the correlate, time to protection. Those are what I'm referring to.

DR. CHAWLA: Can they be included in one study or -- because one study can have multiple arms or multiple outcomes?

DR. NUZUM: In our study designs, we try to get as much information as we can, you know, address more than one question if we can. The danger with that, of course, is that the studies can become too large, too complex, and that creates a level of risk in itself, so it's a balance.

I would say the short answer to your question is yes, but we do it with caution.

DR. FERRIERI: Ed, one of the leaps of faith for me is the assumption that the spore challenge in the animals is similar to what might be an anticipated exposure in real life. And I wonder if you might comment on that, because in the various papers I've scrutinized some of the spore doses have varied a lot in different experiments within some of the published papers, and I guess if I were in the Metro, in the front somewhere where the ventilation is going to deliver, I don't know, tens of thousands of spores, a million, or if you're farthest away maybe none or 50 to 100 spores, could you reflect on that briefly, please?

DR. NUZUM: I'll try. Well, a couple of things. First of all, as we've -- we've done enough studies now where there is quite a range in spore challenge dose in our different studies. And we have done analyses on more than one study, and our general conclusion is that the effect, at least in animal models, is that the challenge dose does not correlate, does not impact the response.

With regard to what people might actually be exposed to, that's the reason for -- one reason for the high-dose challenge study we'll do down the road. And in our -- our understanding is that there is -- it hasn't been stated anywhere that the vaccine has to protect at 1,000 LD50s or 2,000 LD50s. We're using a target of 200 to 400. We consider that reasonable and practical and it's feasible.

But we will do this high-dose challenge study just to have the information. What happens if there is high-dose exposure?

DR. FERRIERI: Thank you very much.

DR. NASS: I guess my question has to do with how you establish the LD50 when it is species -- anthrax strain-specific, species-specific, animal strain -- you know, there are many factors.

DR. NUZUM: Well, that's a good question, and, I mean, the thing you didn't mention is just -- well, or maybe what you were implying is there is a lot of biological variability associated with different species, with the challenge, with the assays used to measure the actual or calculated inhaled dose and all of that.

But basically it's done by challenging at different doses -- you know, a challenge dose response, as any LD50 study would be done. And you look at the death at low doses going through the high doses.

DR. NASS: I guess what I'm saying is if you go back and look at different LD50s for different species of animal and different strains of anthrax, you'll find very widely varying numbers. And although the number of 8- to 10,000 for a human has been batted around, there is really very little evidence for that. So how are you calculating your LD50, and what's the reliability?

DR. NUZUM: Well, the calculation is no different than how an LD50 has ever been done. I think the point we should make here is the other aspect of what you're saying: LD50s haven't been done that many times. You can't do LD50 studies in NHPs over and over to get a lot of confidence that you have the right number.

I think rather than concentrate on the LD50 value itself, we need to -- we need to talk -- and this is another internal debate we've had. There has been discussion of doing away with the reference to LD50, period, for some of the reasons that you state, and just giving the challenge dose in terms of number of spores.

It's a -- I'm not sure what -- referring to this in terms of LD50 numbers adds that much value. It has historically been done. The main point is that these animals get lots of spores, and that's -- and it protects against them.

MR. SUTER: You said that a certain serological titer between the different species can be correlated to protection. Is there a correlation between the different titers on the LD50 between the different species? So say you have a rabbit of maybe five kilos. You calculate it to a human, and then you say, "This titer and the LD50 correlates what you know from human exposure to the bacteria."

DR. NUZUM: I'm not sure I understand your question. The ED50s -- I'm sorry. I didn't understand.

MR. SUTER: If you normalize the serological titer, you should also be able to predict what this would mean in terms of overcoming a challenge. That is, if you have a titer of X in a rabbit and you say you have an LD50, which this rabbit can support, can you then extrapolate what the dosage is you can tolerate in a human?

DR. NUZUM: The challenge dose?

MR. SUTER: Yes. I mean, you probably know in some of these bad cases how much spores were around, and we can probably extrapolate how many bacteria they had to fight against. So is there a certain correlation between --

DR. NUZUM: Well, I don't -- I don't think it's -- I think it's very difficult to make a direct extrapolation, because we don't -- we don't know the lethal doses to that extent in humans. And, again, that's the reason for giving rabbits lots of spores, and at some point we will have more information on very high challenge doses. And that information I think would give us confidence that that information would extrapolate to protection in humans. I'm not sure I've answered it.

MS. VOLKMANN: I have a comment to the first question, which is that you're assuming that the threshold of titer for protection is the same in all species. And when we look at most titers, for instance, in smallpox using a well characterized vaccine such as Dryvax, we always get much higher titers in mice than we would get in monkeys, and yet those titers, although they are different, they are protective.

So what do you think about, rather than comparing titers directly between species, using a well characterized vaccine, as is available for anthrax as well, as a comparison, and always running that comparison for all assays for all species? Then you have, I guess, a better comparison, don't you think?

DR. NUZUM: Do you mean the same -- a same vaccine as the control --

MS. VOLKMANN: Yes, like a gold standard or well characterized or licensed vaccine as a comparison in all your assays.

DR. NUZUM: In many of our studies we do include BioThrax.

MS. VOLKMANN: Because then you don't have to rely on the same titer in a mouse or rabbit or a human.

DR. NUZUM: Well, it's another reference point, yes. And we do include BioThrax in many of our studies.

DR. NASS: But the obvious problem -- Meryl Nass -- doing that with BioThrax is that the different lots of BioThrax contain different amounts of PA and other proteins and have not been individually characterized. So there really is no gold standard --

DR. NUZUM: Well, we're --

DR. NASS: -- using BioThrax.

DR. NUZUM: Right. Well, we're aware of that, but it is licensed and it provides a reference. And, certainly, for the toxin neutralization assays, ELISA capitalizes -- it's focused on PA. It is in consideration.

DR. HEWITT: Okay. Thank you.

I would like to remind all the questioners to please identify yourselves when you're posing your question.

And our next talk is going to be by John Bigger from Battelle, and he's going to present his data on the rabbit model.

DR. BIGGER: Thank you. I guess the platform can be raised, can't it? Yes, for the altitudely-challenged.

Good morning. Can you hear me in the back if I just leave the microphone right here? Are we okay? Okay, great.

(Laughter.)

Barely? Okay.

I'd like to thank the organizers for allowing me the opportunity to come here today and share this data. We received the task to provide small animal models for Bacillus anthracis vaccine testing using rPA vaccine candidates. And then, having established that model, we were then tasked to test two rPA vaccines that contained Alhydrogel adjuvant for efficacy within this rabbit aerosol challenge model, and then evaluate the immune response to the vaccines.

The test articles themselves were two rPA vaccines in Alhydrogel. They were provided by two separate commercial companies. The route of vaccination was intramuscular. We used New Zealand white rabbits, both male and female, balanced set, and then we challenged them with a target dose of 200 LD50, which comes out to be about 2.1 times 10^7 colony-forming units. That was our target inhaled aerosol challenge dose.
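To make the relationship between the two ways of stating the challenge dose concrete, here is a minimal Python sketch. It assumes only the figures quoted above (a target of 200 LD50 corresponding to about 2.1 times 10^7 colony-forming units); the implied single-LD50 value and the function names are derived for illustration and were not stated by the speaker.

```python
# Figures quoted in the talk: target of 200 LD50 ~ 2.1e7 CFU (inhaled).
TARGET_LD50_MULTIPLE = 200
TARGET_DOSE_CFU = 2.1e7

# Implied size of one LD50 in this model (derived, not stated in the talk).
IMPLIED_LD50_CFU = TARGET_DOSE_CFU / TARGET_LD50_MULTIPLE  # 1.05e5 CFU


def cfu_to_ld50_multiples(dose_cfu: float) -> float:
    """Express an inhaled dose in CFU as a multiple of the implied LD50."""
    return dose_cfu / IMPLIED_LD50_CFU


def ld50_multiples_to_cfu(multiples: float) -> float:
    """Express a dose given in LD50 multiples as CFU."""
    return multiples * IMPLIED_LD50_CFU


if __name__ == "__main__":
    print(f"Implied LD50: {IMPLIED_LD50_CFU:.3g} CFU")
    print(f"400 LD50 = {ld50_multiples_to_cfu(400):.3g} CFU")
```

Stating the dose in CFU alongside the LD50 multiple keeps the numbers comparable even if the LD50 estimate itself is later revised.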

And then, for this study -- the studies that I'm going to present, the endpoints were limited strictly to antibody titer and to survival.

So here is our study design. We had six groups of rabbits, ten rabbits per group, and we vaccinated them with diluted doses of rPA, depending upon the cohort, at week 0 and week 4. We then collected serum from these animals every other week for 10 weeks for ELISA and TNAs, and then at week 10 we then provided an aerosol challenge.

Logistically, we could not challenge 60 animals in a single day, so we spread the challenge out over three days. So the animals were randomized both by cohort and by challenge order and by challenge day. We then monitored the animals for 14 days, and at the end we did the whole thing over again for the second vaccine.

So let's look at the immune response. We're looking at vaccine A and vaccine B for ELISA or TNA, and what we see is that after the first vaccination at week 0 we did get a dose-dependent immune response, and here at week 4 they received a second vaccination which then boosted the immune response in all vaccinated cohorts.

The unvaccinated controls show their ELISA data right here. The ELISA, the antibody titer, peaked at week 6, and then began to decline out here at week 10, which of course is our challenge date. Importantly, the immune response between both vaccines looked very similar, and again, importantly, down here in the week 10 ELISA or TNA titers we also see a dose-dependent immune response by TNA.

I'd point out that the two -- I do not have the same data on the TNA. Over here I'm representing the TNA ED50. Over here I'm representing the TNA NF50, which as you saw earlier the NF50 is normalized to a control serum. But as you can see, again, we get a peak at week 6 and a dose-dependent response at week 10.
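The normalization mentioned here can be sketched as a simple ratio. This is a minimal illustration that assumes the NF50 is the test serum's ED50 divided by the ED50 of the reference (control) serum run in the same assay; the talk says only that the NF50 is "normalized to a control serum," and the function name and example values below are hypothetical.

```python
def nf50(sample_ed50: float, reference_ed50: float) -> float:
    """Neutralization factor: sample ED50 relative to a reference serum ED50.

    Assumes both ED50 values come from the same assay run, so that
    run-to-run variation largely cancels in the ratio.
    """
    if reference_ed50 <= 0:
        raise ValueError("reference ED50 must be positive")
    return sample_ed50 / reference_ed50


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(nf50(sample_ed50=250.0, reference_ed50=1000.0))
```

Because the reference serum is measured alongside each sample, a ratio of this kind is less sensitive to day-to-day or lab-to-lab assay drift than the raw ED50.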

So in deference to some questions that were asked earlier, here is a correlation experiment or analysis looking at ELISA versus TNA results, and what we see in each of these slides is -- in the darker black is a week 6 correlation between ELISA and TNA, and then in the lighter line, lighter colored line we have the week 10 correlation.

Now, keeping in mind the fact that we had -- we did not use as many animal cohorts in the week 6 analysis we still feel confident that the week 6 correlation is a little lower than the week 10 correlation in both vaccines -- again, demonstrating another comment that was made earlier that we do have some evidence that there is a change in the relationship between the ELISA titers and the TNA titers as time progresses in this model.

So a moment to discuss our aerosol challenge. We start with a well characterized challenge material spore lot -- Bacillus anthracis, Ames strain -- characterized for 10 separate areas, including purity, genotype, and phenotype characteristics, virulence, and aerosol performance.

Having done that, then, we use a muzzle-only exposure chamber where the animal is loaded into a real-time plethysmography chamber. Plethysmography, if you're not familiar, is a way to measure the real-time inhalation volume and inhalation rate, and then by comparing that with the predetermined spray factors of the aerosol system we're able to estimate as the challenge is going on how many spores the animal is inhaling.

When we reach the targeted challenge dose, we turn the aerosol challenge off, and then bring the animal out. During the challenge, the aerosol chamber itself is sampled by glass impingement. And then, taking the impinged sample and enumerating the spores by spread plate, we can then back calculate the actual number of spores that the animal actually inhales during the challenge.
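The dose bookkeeping described above can be sketched roughly as follows. This is a simplified illustration, not Battelle's procedure: it assumes the aerosol concentration is the spores recovered in the impinger divided by the volume of air the sampler drew, that the inhaled dose is that concentration times the inhaled volume from plethysmography, and it ignores impinger collection efficiency and deposition. All function names, parameters, and numbers are hypothetical.

```python
from typing import Iterable, Optional


def aerosol_concentration_cfu_per_l(
    impinger_cfu_per_ml: float,
    impinger_volume_ml: float,
    sampler_flow_l_per_min: float,
    duration_min: float,
) -> float:
    """Back-calculate chamber aerosol concentration from the impinger sample."""
    collected_cfu = impinger_cfu_per_ml * impinger_volume_ml
    sampled_air_l = sampler_flow_l_per_min * duration_min
    return collected_cfu / sampled_air_l


def inhaled_dose_cfu(aerosol_cfu_per_l: float, inhaled_volume_l: float) -> float:
    """Inhaled dose = aerosol concentration x total volume the animal inhaled."""
    return aerosol_cfu_per_l * inhaled_volume_l


def breaths_to_reach_target(
    tidal_volumes_l: Iterable[float],
    predicted_cfu_per_l: float,
    target_cfu: float,
) -> Optional[int]:
    """Real-time stop rule: count breaths until the estimated dose hits target.

    predicted_cfu_per_l stands in for the concentration predicted from the
    predetermined spray factor. Returns None if the target is never reached.
    """
    inhaled = 0.0
    for count, volume_l in enumerate(tidal_volumes_l, start=1):
        inhaled += volume_l * predicted_cfu_per_l
        if inhaled >= target_cfu:
            return count
    return None


if __name__ == "__main__":
    conc = aerosol_concentration_cfu_per_l(600.0, 10.0, 6.0, 10.0)
    print(conc, inhaled_dose_cfu(conc, inhaled_volume_l=5.0))
```

The real-time estimate decides when to stop the exposure; the impinger-based back-calculation then gives the dose that is actually reported for each animal.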

So this is the -- a little analysis of our aerosol challenge across both of the experiments. Here we have three days of challenge for vaccine A, three days of challenge for vaccine B, and the take-home point here is that the mean challenge across all six days of challenge across both experiments was very close, very tight within the day and very tight across all six days.

Over here we can see the first experiment graphically. Each one of these numbers represents a rabbit, and then each number on the Y-axis represents the challenge dose that that animal received. So here is challenge day 1, challenge day 2, and challenge day 3 of the rabbits in order. And the important point here is there is no real pattern to how the animals -- to the challenge that the animals receive.

While we believe this is a very tight and very reproducible pattern, we do recognize that there is a range of doses -- for instance, on day 1 -- from about two times 10^7 to four and a half times 10^7. So there is a range of challenge doses that the animals receive, so we did ask the question: did the challenge dose affect survival?

And this was statistically analyzed by our Stats Department, and without going into it the short answer is no, challenge dose did not play a role in survivorship. As long as the animals received a challenge, then the amount of challenge did not affect the endpoint.

Well, as you note here -- okay, you don't note here, that's okay -- we had ELISA titers across a full spectrum. And as it turns out, above 30 micrograms per ml ELISA all of the animals survived challenge. Similarly, if the animal had an ELISA titer below seven, they succumbed to challenge.

So we actually have a range where animals both lived and died, and so we asked the question, okay, if you had a 30 microgram per ml ELISA titer, but you were given a lower challenge dose, did that affect your ability to survive? Maybe you could survive that challenge, where if you had received a high challenge dose you would have succumbed. And, again, the short answer is no, that did not play a role.

Okay. So let's look at the survival data. Here we have, again, vaccine A and vaccine B, 60 animals per experiment, 10 animals per group. Here we have the dark line showing you the survivorship of the unvaccinated or the mock vaccinated controls. The control animals began dying at day 2, and we had 100 percent fatality in both experiments by day 5.

Contrariwise, the animals in the highest vaccine dose groups had either no fatalities, in experiment B, or one fatality out at day 11, in experiment A. The vaccine doses in between showed a very nice dose-dependent survivorship.

And just as importantly, while we received fatalities in the low vaccine groups that statistically were not different than the controls, we did see a dose-dependent time to death change showing that while we -- at the endpoint we still had a statistically insignificant survivorship, we did have a statistically significant protection offered by time to death.

Okay. So we then took a look at the immune response of both of these vaccines, and we compared their ELISA titers and the TNA titers to survivorship in these experiments, and we found that in both experiments there was a strong correlation between immune response and survivorship. And then, comparing those statistically we saw no difference between the two experiments, so the data were combined and then the immune titers were compared to survival and provided in these logistic regression plots.

So each plot represents 100 animals that were vaccinated with rPA. The control animals are not represented here. So we have animals that had higher immune responses on the right-hand side on each of these scales. Here is your ELISA down at the bottom, TNA ED50 here, and TNA NF50 here.

Animals that received a lower -- had a lower immune response to vaccination are here, and then by comparing these to the actual survival data we were then able to show a probability of survival as shown by the dark black line with the 95 percent confidence intervals shown in the dotted gray lines on either side. And then, the actual survivorship and immune response data are binned and pointed out in these black dots here.

So, importantly, each of these plots shows a statistically significant correlation between the immune response and survivorship to the P equals lots of 0001, .0001 level. All of the plots show a very similar curve, and because we're able to do this with 100 animals the 95 percent confidence intervals are very tight in each of these curves. Especially in the mid-range of the curve, the 95 percent confidence intervals are very tight, whereas you get up toward the asymptotes and you've got a little more room here, a little more uncertainty.

So if we then take these data and put them in tabular format, we can provide an estimate of probability of survival at any given ELISA or TNA measurement. Looking at the 75 percent level of -- or probability of survival, we find 25 micrograms per ml ELISA, a TNA ED50 of 131, or a TNA NF50 of .12. So if you had a TNA ED50 titer of 131, and then the animal was challenged in our system, the animal would have a 75 percent probability of surviving.

Again, down in the 95 percent level, we have an ELISA titer of 71 micrograms per ml, or a TNA ED50 of 951, TNA NF50 of .35. In our hands, however, during this experiment the animals that had above an ELISA titer of 29 all survived. So we had 100 percent survival in this experiment at an ELISA titer of 29, as compared to the NF50 of 72 basically.

Now, the confidence interval, the 95 percent confidence intervals are shown in the parentheses, and my statisticians assure me that if we had an infinite number of animals that this value would come up, and we would indeed see only 95 percent survival given that budgetary non-restraint.
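To illustrate how a logistic curve turns a titer into a probability of survival and back, here is a minimal Python sketch. It is not the study's fitted model, which was estimated from the animal-level data. It assumes a logistic curve on log10 ELISA concentration (the scale is an assumption; the talk does not state it) and simply passes that curve through the two ELISA points quoted from the table: 75 percent at 25 micrograms per ml and 95 percent at 71 micrograms per ml. All names are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_two_point(x1: float, p1: float, x2: float, p2: float):
    """Intercept and slope of logit(p) = intercept + slope * log10(x)
    for the curve passing exactly through (x1, p1) and (x2, p2)."""
    slope = (logit(p2) - logit(p1)) / (math.log10(x2) - math.log10(x1))
    intercept = logit(p1) - slope * math.log10(x1)
    return intercept, slope


def survival_probability(titer: float, intercept: float, slope: float) -> float:
    """Predicted probability of survival at a given titer."""
    z = intercept + slope * math.log10(titer)
    return 1.0 / (1.0 + math.exp(-z))


def titer_for_probability(p: float, intercept: float, slope: float) -> float:
    """Inverse of survival_probability: titer giving survival probability p."""
    return 10.0 ** ((logit(p) - intercept) / slope)


# Curve through the two ELISA points quoted in the talk (illustration only).
INTERCEPT, SLOPE = fit_two_point(25.0, 0.75, 71.0, 0.95)

if __name__ == "__main__":
    for p in (0.5, 0.75, 0.95):
        print(p, round(titer_for_probability(p, INTERCEPT, SLOPE), 1))
```

Because the curve approaches 100 percent only asymptotically, a regression of this kind will always report a finite titer for 95 percent survival even when every animal above a lower titer survived in the experiment itself.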

So at this time, hopefully I've convinced you that the rPA vaccines used here did provide a dose-dependent immune response. Our ELISA titers and TNA titers correlated highly, and we showed a change in correlation over time. Our aerosol challenge model is well characterized and reproducible. The survival of the animals was not challenge dose dependent, but was vaccine dose dependent, and the survival of the animals correlated to pre-challenge -- week 10 pre-challenge serological titers.

I'd like to acknowledge Dr. Roy Barnewall who is our aerobiologist and supervised all of the aerosol challenges. Our lead statistician on this was Dr. Greg Stark, who is here with us today, if we have any questions on the statistical analysis.

Many of the Battelle staff that were -- or maybe not many, but some of the Battelle staff that are involved in the spore growth and analyses and the TNA and ELISA experiments are here with us today. Again, if you have any technical questions that I can't address, they are here, and I'd like to point out their extremely hard work in this endeavor, and also the animal studies group.

Thank you very much. At this point, I'll open it up for questions.

(Applause.)

DR. NASS: Meryl Nass. You didn't specify what type of rabbit, but I'm assuming these are all genetically identical.

DR. BIGGER: The rabbits were New Zealand white rabbits, and they are an outbred population. So they are not genetically identical. They are an outbred population.

DR. NASS: Did you try this experiment with any more genetically diverse group of rabbits?

DR. BIGGER: The short answer is no, and while I'm limited and could open up the audience on my knowledge of the model, I believe that New Zealand white rabbits are preferred. Does anybody use any other species of rabbit?

(No response.)

Going once, sold. Sorry. No, we haven't.

Come on down.

(Laughter.)

DR. CHAWLA: Anil Chawla from Panacea Biotec. What are the scientific bases of the schedules of zero to four weeks and then challenge at tenth week?

DR. BIGGER: What is the --

DR. CHAWLA: What is the scientific basis of choosing that schedule?

DR. BIGGER: The scientific basis of choosing zero weeks and four weeks for the vaccination --

DR. CHAWLA: And then challenge at tenth week when you have the peak titer at sixth week.

DR. BIGGER: And then -- okay. And there was lots of discussion when these experiments were being set up as to whether or not we should use zero weeks and two weeks, zero weeks and three weeks, zero weeks and four weeks. Obviously, you know, what you'd like -- prefer to do in a vaccination is to vaccinate them and allow them to rest for a period of time.

And it was discussed in depth and four weeks was arrived at as a consensus. Again, the challenge date of ten weeks was discussed, whether or not we wanted to do it earlier, whether or not we wanted to do it later, and really that was just -- we wanted to do it early enough so that it was not a long-term wait. Okay? So that it was not a six month or year or two year wait to -- we wanted to address the near-term efficacy of the vaccines, not the long-term efficacy.

DR. CHAWLA: When you say that X microgram of rPA was used, do you really check it at the time of vaccination, it was X microgram or it was a predetermined value which was tested maybe at the time, which was manufactured two or three months back?

DR. BIGGER: Yes, that's an outstanding question. And the question was: did we do a dose confirmation on the rPA vaccines at the time of vaccination to ensure that we were really giving them the dose that we had anticipated? And the answer is that at the time -- and I don't believe this has changed -- the rPA mixture with Alhydrogel made it impossible for us to do a back dose titration on the rPA. It binds irrevocably to the Alhydrogel, and that made it impossible for us to assay.

So I don't know if the technology -- and there's an experiment now that can make that happen, but at the time that was not possible. So we had to rely on the manufacturers to assay the amount of rPA, and then let us know what that was, so that we could dilute appropriately.

DR. CHAWLA: Thank you.

DR. NABORS: Hi. Gary Nabors from Emergent Biosolutions. My question is, for the TNA NF50 assay, or that endpoint from the study, was that the same standard that was discussed before the AVR-801, or was that a rabbit standard that was used in the assay?

DR. BIGGER: I believe -- and Chris can give me the nod here -- that was a rabbit standard. And I would like to point out that we've had, you know, several years of assay development and increase in fidelity and changes in platform as we continue to refine that assay since these data were conducted.

DR. NABORS: So just as a quick followup, do you think that these data are translatable to immunogenicity data in any way that you would see in humans, or was this more of a sort of model development effort?

DR. BIGGER: I think this was a model development effort, and as we continue in TNAs, as they continue to refine, we're going to get closer and closer to be able to make that comparison. Still, I think if we were to go back and rerun these samples today, the data are sound. And whether or not they are comparable to humans is part of what this workshop I guess is all about. I'm not going to comment on that.

Sir?

DR. GOTSCHLICH: Emil Gotschlich, Rockefeller University. I must have missed this, but what was the intended dose to be given to these rabbits of antigen?

DR. BIGGER: Sir, I'm not at liberty to discuss the intended -- you're talking about the vaccine dose?

DR. GOTSCHLICH: Yes.

DR. BIGGER: I'm not at liberty to discuss the vaccine dose given to these animals. It's proprietary information for the companies that provided the vaccine. So, I mean, the -- our intent here was to focus more on the immune response that was generated and how that immune response correlated to survivorship.

DR. GOTSCHLICH: I find that an amazing answer.

MS. PASETTI: Marcela Pasetti, University of Maryland. It's a beautiful study. I know you are concentrated on TNA, but did you perhaps freeze cells or did you do some cell-based assays -- cytokines or antibody-secreting cells or B memory? I know there are limited reagents for rabbits, but --

DR. BIGGER: So, yes, the -- you know, the ability to do that kind of work has evolved so much in the past couple of years since these studies were done. But no, at the time we did not try to do any of that work. We have since then brought in -- I'm sorry? Some of my staff members were hinting to me maybe?

But since then, we've brought online B-cell memory assays and we can detect various Ig levels in different species. And I'm not sure if we can do that in rabbits with the IgA, but we can certainly do it in non-human primates at least.

MS. PASETTI: Thank you.

MR. SUTER: Maybe you've done that experiment. Would it be possible to transfer serum from an immunized rabbit into a naive rabbit, get the same titer, and then challenge it? And do you see the same protection?

DR. BIGGER: Later this morning --

MR. SUTER: Okay.

DR. BIGGER: -- Mark Perry is going to present some data on rabbit passive transfer studies, and I think following that -- and the discussions that we'll have with the panel -- the discussion is going to be very exciting, comparing the results of this active vaccination experiment to his passive transfer experiment.

Thank you. Thank you very much.

(Applause.)

DR. HEWITT: We are going to move on. Our next speaker is Louise Pitt from USAMRIID, and she is going to talk to us today about her rabbit model of active immunization.

DR. PITT: Well, good morning. I'm going to present a series of experiments that were carried out at USAMRIID over the last maybe eight to ten years. I'm looking at an in vitro correlate of immunity in the rabbit model.

These studies were initiated, as I said, probably 10 years ago. At that time, there was a scientific opinion that antibodies actually did not correlate with protection. That assumption was based on a series of non-human primate vaccine efficacy studies, which had small numbers of animals, and on varied guinea pig efficacy studies looking at a variety of different vaccines.

But at that time nobody had actually designed a study to look at the question specifically. So at USAMRIID we decided to approach that. And in terms of where we stood at the time we knew that the New Zealand white rabbit was probably an appropriate model.

We had done the disease and pathology comparison with non-human primates and humans, and somewhat understood the differences and comparisons between the human and non-human primates and the rabbit. We also had done vaccine efficacy studies in the rabbit with both rPA and the licensed anthrax vaccine, and knew that it was predictive of efficacy in the non-human primate.

We also came from the standpoint that protective antigen combined with an adjuvant provided complete protection, as I said, and that we did have this quantitative ELISA and the toxin-neutralizing antibody assay that could be used for correlates.

So the approach we took in the first study was to take the licensed anthrax vaccine and dilute it down. We knew that the human dose gave full protection, two doses of the human dose, and so dilute it down in order to start getting survival and non-survivors so we could then compare the responses and see if there was a correlation.

The study design was two doses at zero and four weeks. We bled the animals and looked at the titers at week 6, and then immediately prior to challenge. We chose to challenge the animals at week 10 to match a lot of the vaccine efficacy data that we already had, and have found that that was an appropriate time -- six weeks after the second dose -- to look at efficacy.

There were two studies performed with the licensed vaccine looking at two different lots, and their challenge doses were an average of 133 LD50s or 84 LD50.

So this is a summary of the initial study. Here, as you can see, we did one in four dilutions in groups of animals. The undiluted, as expected, gave the 100 percent efficacy. And as you can see, as you go down, we get excellent efficacy from 100 to 90 percent when you get down to one in 64 dilution. And then, at the one in 256 in this lot, we started to lose animals.

If you look at the week 6 quantitative ELISA, we got a very nice titration if you look at it as a group, both at the six week and at the 10 week. And, indeed, in the TNA ED50s, the titer gave a nice gradation.

In the second study, we increased the numbers in the middle groups in an attempt to increase the non-survivors so as it would improve our statistics. And, again, you can see excellent efficacy at the first two dilutions, and then starting to drop off with zero survival at the one in 256.

The same pattern in terms of the quantitative ELISA quantities, both at the six weeks and at the 10 weeks, and, again, a similar gradation in the toxin-neutralizing antibody levels.

This is just a graph of the actual individual animals with the live in the closed diamonds and the dead in the open to show you exactly the titers of each individual animal. As you can see in the top group, you clearly have a group that is solidly protected. In the lower groups, although they do have some levels of antibody, they are clearly not protected, and then in between you have some that are and some that aren't. And this was true for both lots of vaccine that we did experiments.

So this is the concentration at the time of challenge; the previous slide was at the peak, the six weeks. And this shows you a very similar pattern in terms of the solid protected and the solidly not protected. That gives us the gradation in between. The TNA level at six week again followed the similar pattern of the groups.

So in terms of predictions of survival, both the six-week peak and the 10-week ELISAs were significant predictors of survival, as were the toxin-neutralizing antibody assays.

So in the next series of studies -- and these were led by Steve Little from USAMRIID -- the next logical step was to look at the rPA vaccine and say, "Did this hold true, and was the pattern similar?" Again, a similar study design. In the initial rPA studies we looked at a one-dose vaccination to see if this -- if there would be any correlation following one dose and a challenge.

The doses of rPA that were chosen varied between .08 micrograms and 100 micrograms of rPA. They were combined with .5 milligrams of aluminum, so the amount of aluminum in each dose remained constant. This is different from when we diluted the anthrax licensed vaccine, because we diluted that in PBS.

So the amount of aluminum changed in the initial experiment, but the ratio of antigen to aluminum was maintained. In this experiment, the aluminum remained constant and the rPA was titrated.

The animals were bled at week 2, and then at time of challenge, and, again, we looked at the ELISA and the toxin-neutralizing antibody levels, and the animals were challenged at week 4 with approximately 200 LD50s.

So this is the table that summarizes. Here we have the dose of the rPA going from 100 down to .08. This column just shows you the number of experiments. As you know, we can't do hundreds of rabbits at any given time, and so it had to be split up into several experiments.

And here we have the survival column where at 100 micrograms of rPA, one dose, four weeks later, 93 percent are protected; 65 with 25 micrograms; 43 with five, 16 percent with one, 10 percent with .2, and then zero at .08. So we got a very nice titration in survival that follows the dose of the rPA.
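The one-dose titration just described can be written down as a small table and checked for the monotone dose-response the speaker points to. The sketch below uses only the percentages quoted above; the log-linear interpolation to a 50 percent protective dose is an illustrative device added here, not an analysis reported in the talk, and the function names are hypothetical.

```python
import math

# (rPA dose in micrograms, percent survival) as quoted for the one-dose study.
SURVIVAL_BY_DOSE = [
    (100.0, 93),
    (25.0, 65),
    (5.0, 43),
    (1.0, 16),
    (0.2, 10),
    (0.08, 0),
]


def is_monotone(table):
    """True if survival never increases as the dose decreases."""
    survivals = [pct for _, pct in table]
    return all(a >= b for a, b in zip(survivals, survivals[1:]))


def interpolate_dose(table, target_pct):
    """Interpolate on log10(dose) for the dose giving target_pct survival.

    Assumption: survival is roughly linear in log dose between adjacent
    groups. Returns None when the target lies outside the observed range.
    """
    for (d_hi, s_hi), (d_lo, s_lo) in zip(table, table[1:]):
        if s_hi == s_lo:
            continue  # flat segment, no unique dose to interpolate
        if s_lo <= target_pct <= s_hi:
            frac = (target_pct - s_lo) / (s_hi - s_lo)
            log_dose = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0 ** log_dose
    return None


if __name__ == "__main__":
    print("monotone:", is_monotone(SURVIVAL_BY_DOSE))
    print("interpolated 50% dose (ug):", interpolate_dose(SURVIVAL_BY_DOSE, 50))
```

The interpolated value necessarily falls between the 5 and 25 microgram groups, since those bracket 50 percent survival; it is a reading aid for the table, not a fitted estimate.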

When we look at the ELISAs, we got a similar pattern of gradation at both week 2 and week 4, and the toxin-neutralizing antibodies followed a similar titration pattern.

This is a graph of the actual live, dead animals. This shows you there is quite considerable overlap after the one dose of rPA in terms between 100, 25, and five micrograms, and then the response starts to drop off. And this is the quantitative ELISA at four weeks, which is the time of challenge.

This is looking at the TNA response at two weeks, again showing a similar pattern. Clearly, down here, a group that are solidly not protected, but after one dose you have groups where clearly there is a more mixed group of survival and of non-survival.

But looking at it in terms of significant predictors of survival, the PA ELISA was significant at week 4, and indeed it was also significant at week 2. Looking at the TNA, it was a significant predictor at both week 2 and week 4.

So moving on to the next series of experiments, we then looked at what would happen with two doses of rPA. This slide actually has some mistakes on it, and I'll go through it. The doses were varying from .08 to 10 micrograms of rPA. Again, it was combined with aluminum at the same concentration, so each -- each injection had the same amounts of aluminum and the rPA was titrated.

This sera -- they were bled pretty much weekly, but we concentrated really on looking at the six-week and then the ten-week, the prior to challenge. The aerosol was done at week 10, not week 4, and they were given over 200 LD50s.

So this is the summary table of the survival with the doses going from 10, 1, .2, and .08, two doses zero and four weeks, 100 percent survival with two doses of 10 micrograms rPA. No difference seen in survival with the one microgram or the .2 microgram, and then a drop-off at two doses of .08.

In looking at the ELISA at week 6, again, you can see a nice gradation, but right here you can see there is somewhat difference in terms of the ELISA, but no difference in terms of survival.

At week 10, similar gradation. And when you look at the TNA, again, a nice titration in terms of the assay. This then shows you the individual animals and their levels of ELISA at week 10 just prior to challenge, and you can see here solidly protected group here. These are the controls down here, and your 1, .2, and .08.

The toxin-neutralizing titers at week 8 showing a very similar pattern where you've got the solidly protected and then the mixed in between.

So in terms of significant predictors of survival, the PA ELISA at week 10 was indeed a significant predictor, and the TNA at week 8.

So the last study that I will present approached in a little different fashion. Instead of looking at a short-term challenge, this was looking at a six-month and a 12-month challenge. In this study, the animals received two doses of 50 micrograms of rPA combined with the same amount of aluminum. Blood was drawn at various times through the experiment, and one group was challenged at six months and another group challenged at 12 months.

So this is the efficacy at six months where 74 percent of the animals -- 20 out of 27 -- survived the challenge. You can see this is the weeks 4, 6, 8, 13, and 26 levels between survivors and non-survivors. Week 26, in terms of the ELISA, there was a significant difference between the survivors and the non-survivors, and indeed that was a significant predictor of survival.

In terms of the TNA assay, there was a significant difference at week -- between survivors and non-survivors at weeks 8, 13, and 26, and the week 13 was shown to be a significant predictor of survival.

Looking at the 12-month challenge, at this time we got nine out of 24 survivors. The response -- there was a significant difference between survivors and non-survivors at 26, 39, and 52 weeks, and the 26-week turned out to be the significant predictor of survival.

Looking at the TNA assay, there was a significant difference between survivors and non-survivors weeks 6, 8, 13, 26, 39, and 52. But the week 39 in the TNA was the most significant predictor of survival.
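For readers who want to put the two long-term survival counts side by side (20 of 27 at six months, 9 of 24 at 12 months), the sketch below computes the proportions with approximate 95 percent intervals. The Wilson score interval is a choice made here for illustration because the groups are small; the talk does not report intervals for these counts, and the names used are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Survivors and group sizes as quoted for the two challenge times.
CHALLENGES = {"6-month": (20, 27), "12-month": (9, 24)}


def summarize(challenges):
    """Map each challenge label to (percent survival, lower, upper)."""
    summary = {}
    for label, (survivors, n) in challenges.items():
        low, high = wilson_interval(survivors, n)
        summary[label] = (100.0 * survivors / n, 100.0 * low, 100.0 * high)
    return summary


if __name__ == "__main__":
    for label, (pct, low, high) in summarize(CHALLENGES).items():
        print(f"{label}: {pct:.1f}% survival (approx. 95% CI {low:.1f}-{high:.1f}%)")
```

The six-month proportion rounds to the 74 percent quoted in the talk; the 12-month count works out to 37.5 percent.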

So in summary, looking at this series of experiments, we showed that these two assays -- both the quantitative ELISA and the TNA -- are useful assays to serve for correlates in estimating the immunological status of rabbits. We found that the antibodies to PA are a serological correlate of vaccine-induced immunity in this model. And this could provide a basis of an in vitro test to serve as a correlate.

This data has all been published, and the references are available. And I would like to acknowledge all of the people at USAMRIID who have participated in these studies over the years, particularly Steve Little who did so much work not only in the animal studies but on the assays and the development of the assays.

And as you know, these studies take a large number of people, and I would like to acknowledge them.

Thank you.

(Applause.)

DR. FERRIERI: A quick question, Dr. Pitt. I gather that the rPA is not adsorbed to the aluminum hydroxide. And my question is: is there -- what is your opinion of what it would do, its behavior immunologically, if it had been adsorbed?

DR. PITT: It was adsorbed. It was just --

DR. FERRIERI: It was.

DR. PITT: -- not a formulation. It was adsorbed within 24 hours, and then given to the animals.

DR. FERRIERI: Okay. Thank you.

MS. WILLIAMSON: Louise, I just wanted to ask about the duration of immunity. In terms of anti-PA ELISA, that seems to be fairly standard in these longer-term studies, that week 26 was the critical time point. But the dynamics for the TNA titer seemed to vary much more.

Can you explain why, or any theories why that might be? Because one would like to -- the functional antibody titer, I would have expected the functional antibody titer to follow the ELISA titer. So, you know, you're developing antibodies, and within that you're developing a functional antibody. You might expect that to be a slower process, but it doesn't seem to be from this data necessarily.

DR. PITT: No, I agree, but that's -- I honestly don't have an explanation at this time.

MS. WILLIAMSON: Thank you.

DR. CHAWLA: Anil Chawla from Panacea Biotec. The question is in clarification of the first question she asked. You have used different amounts of rPA starting from 0.8 microgram to 100 microgram on the same amount of aluminum, that is 0.5 milligram. Did you carry out adsorption studies to see how much was adsorbed? Was it 100 percent adsorbed in all cases?

DR. PITT: I don't know that.

DR. CHAWLA: Thank you.

DR. NASS: Did you perform a functional assay of the PA to find out whether it was biologically active?

DR. PITT: In terms of using the TNA to show that PA is active, yes.

DR. CHAWLA: Again, a clarification on the question she asked. When she said that rPA which was used, was it biologically active, I mean, was a macrophage lysis assay done? Or not the TNA, because for checking the biological activity of PA you need to carry out the macrophage lysis assay.

DR. PITT: Yes, that was performed.

DR. CHAWLA: It was. Okay.

DR. HEWITT: Thank you, Louise.

We'll move on to our next talk by David Madigan, and he is going to tell us about his non-human primate analysis.

DR. MADIGAN: Thank you. Good morning. Indeed, I'm going to tell you about the non-human primate anthrax vaccine study run by CDC. And just at the outset, I'm very grateful to Brian Plikaytis and Conrad Quinn, who are here from the CDC, and they are going to answer all of the questions.

I have the wrong slides. Can we take a five-minute break? These are the wrong slides that are loaded on the laptop. Can we take a five-minute break, please?

(Whereupon, the proceedings in the foregoing matter went off the record at 11:40 a.m. and went back on the record at 11:43 a.m.)

DR. MADIGAN: Okay. Can we resume?

So I apologize. The version of the slides that I'm now showing you is an updated version of what's in the handout. And there are some extra slides and some corrections. If you would like a copy of these slides, feel free to e-mail me, and I'll send you this updated copy of the slides.

Okay. Okay. So this particular study was run by the CDC. The goal of the study was to find immunologic markers that endorsed the human clinical trial endpoint, and confirms human vaccine protection, and identifies when protection is achieved, and also it quantifies how long protection lasts.

And this study was heavily scrutinized by an IOM Committee several years ago. Several members of the Committee are here in the audience. Subsequent to that Committee, the Statistical Advisory Committee prepared a statistical analysis plan for this study, and there are a couple members of that Committee here also. And then, it fell to me to actually implement the statistical plan.

Basically, I'm primarily just going to show you some of the data from the study, and very briefly I'll describe some of the statistical methods that we implemented.

So this study was a lot like some of the other studies we've just been hearing about. It was in non-human primates, and we -- there were different doses of the vaccine used, the human dose, 1/5, 1/10, 1/20, 1/40, as well as saline controls. And the human -- the proposed human vaccination schedule was 0 weeks, 4 weeks, and 26 weeks IM.

And so our goal was to build a comprehensive -- or the study did build a comprehensive immunological profile. The animals were challenged at different times, some at 12 months, 32 months, and 42 months, and the statistical goal was to build a model to -- for predicting survival using these assorted -- large number of measurements of the state of the immune system gathered throughout the study. And the longer term goal is to apply this relationship to the human clinical study.

So a little bit more specifically, the question was: are measurable aspects of the state of the immune system predictive of survival? The answer to that is yes, as I'll show you. And the basic statistical problem we had here is that we had -- we had literally hundreds of different assay time points, different assays measured at different time points, but there are fewer than 100 animals in the study.

I'll describe a descriptive analysis, and then briefly I'll talk about some of the fancier statistical things that we also explored.

So here are some basic statis