FOOD AND DRUG ADMINISTRATION
CENTER FOR DRUG EVALUATION AND RESEARCH

EIGHTIETH MEETING OF THE
CARDIOVASCULAR AND RENAL DRUGS ADVISORY COMMITTEE

8:30 a.m.
Thursday, February 27, 1997

Jack Masur Auditorium
Building 10, Clinical Center
National Institutes of Health
9000 Rockville Pike
Bethesda, Maryland

APPEARANCES

COMMITTEE MEMBERS:

BARRY MASSIE, M.D., Chairman (present morning session)
Director, Coronary Care Unit
Department of Medicine
Veterans Administration Hospital
4150 Clement Street
San Francisco, California 94121

JOAN C. STANDAERT, Executive Secretary
Center for Drug Evaluation and Research
Food and Drug Administration
234 Summit Street, Room 117
Toledo, Ohio 43604

ROBERT CALIFF, M.D. (present afternoon session)
Professor of Medicine
Director, Duke Clinical Research Center
Duke University Medical Center
2024 West Main Street, Box 31123
Durham, North Carolina 27707

JOHN DiMARCO, M.D.
Professor of Medicine
Cardiovascular Division
University of Virginia Hospital, Box 158
Hospital Drive, 5th Floor
Private Clinic, Room 3608
Charlottesville, Virginia 22908

CINDY GRINES, M.D.
Director, Cardiac Catheterization
Division of Cardiovascular Disease
William Beaumont Hospital
3601 West Thirteenth Mile Road
Royal Oak, Michigan 48073-6769

MARVIN KONSTAM, M.D.
Professor of Medicine
New England Medical Center
750 Washington Street, Box 108
Boston, Massachusetts 02111

JoANN LINDENFELD, M.D. (present morning session)
Professor of Medicine
Division of Cardiology
University of Colorado Health Science Center
4200 East Ninth Avenue, B-130
Denver, Colorado 80262

LEMUEL MOYE, M.D., PH.D.
Associate Professor of Biometry
University of Texas Health Science Center at Houston
Coordinating Center for Clinical Trials
1200 Herman Pressler Street, Suite 801
Houston, Texas 77030

CYNTHIA RAEHL, PHARM.D.
Consumer Representative
Chair, Pharmacy Department
School of Pharmacy
Texas Technical University Health Science Center
1300 South Coulter Drive
Amarillo, Texas 79106-9711

DAN RODEN, M.D.C.M.
Vanderbilt University
Division of Clinical Pharmacology
532C Medical Research Building-1
23rd and Pierce Avenue
Nashville, Tennessee 37232-6602

UDHO THADANI, FRCP (present morning session)
Professor of Medicine
Division of Cardiology
Oklahoma University Health Sciences Center
920 S.L. Young Boulevard, 5-SP-300
Oklahoma City, Oklahoma 73104

MICHAEL WEBER, M.D.
Chairman, Department of Medicine
Brookville University Hospital Medical Center
1 Brookville Plaza
Brooklyn, New York 11212

COMMITTEE CONSULTANTS:

JEFFREY BORER, M.D.
ROBERT CODY, M.D. (present morning session)
RALPH D'AGOSTINO, PH.D.

FOOD AND DRUG ADMINISTRATION STAFF:

BOB FENICHEL, M.D.
RAYMOND LIPICKY, M.D.
NORMAN STOCKBRIDGE, M.D.
ROBERT TEMPLE, M.D.

MEDCO REPRESENTATIVES:

JAY COHN, M.D.
LLOYD FISHER, PH.D.
CESARE ORLANDI, M.D.
JOSEPH QUINN, M.S.P.H.

SMITHKLINE BEECHAM REPRESENTATIVES:

WILSON COLUCCI, M.D.
LLOYD FISHER, PH.D.
MILTON PACKER, M.D.
ROBERT L. POWELL, PH.D.
NEIL SHUSTERMAN, M.D.
JIM TIEDE, PH.D.

C O N T E N T S

MORNING SESSION

NDA 20-727, BIDIL (hydralazine HCl and isosorbide dinitrate)
to be indicated for congestive heart failure

AGENDA ITEM

OPEN PUBLIC HEARING

MEDCO RESEARCH, INC. PRESENTATION:
Introduction by Dr. Cesare Orlandi
Historical Overview, Clinical Efficacy by Dr. Jay Cohn
Statistical Overview by Dr. Joseph Quinn
Summary/Conclusions by Dr. Jay Cohn

COMMITTEE REVIEW AND DISCUSSION

AFTERNOON SESSION

NDA 20-297 s-001, COREG (carvedilol)
to be indicated for congestive heart failure

AGENDA ITEM

SMITHKLINE BEECHAM PRESENTATION
Introduction - by Dr. Robert Powell
Clinical Program - by Dr. Neil Shusterman

COMMITTEE REVIEW AND DISCUSSION

P R O C E E D I N G S

(8:30 a.m.)

DR.
MASSIE: I want to welcome everybody to the 80th meeting of the Cardio-Renal Advisory Panel which we're going to have today. Before getting started, let me briefly just introduce the members of the committee who are sitting from my left to my right: Dr. Dan Roden, Dr. Marvin Konstam, Dr. Cynthia Raehl, Dr. Michael Weber, Dr. Lemuel Moye, Dr. JoAnn Lindenfeld, our Secretary, Joan Standaert, Dr. DiMarco, Dr. Rob Califf, Dr. Udho Thadani, and not yet but to come later, Dr. Cynthia Grines. Dr. Lipicky representing the Division of Cardio-Renal Drugs is on the far left, and I guess Dr. Temple will be joining us. In addition, we have several outside consultants for today's meeting. Dr. Ralph D'Agostino, who will be a voting member as a special government employee, as will Dr. Jeffrey Borer, and Dr. Robert Cody is our special consultant, but unfortunately not able to vote. The first order of business is that we are open for public comment. If anybody has any comments, we'd be happy to entertain them at this time. In the absence of public comment, we can proceed with our business. Joan Standaert is going to discuss the waivers and potential conflicts of interest of the committee members. MS. STANDAERT: The following announcement addresses the issue of conflict of interest with regard to this meeting and is made a part of the record to preclude even the appearance of such at this meeting. Based on the submitted agenda for the meeting and all financial interests reported by the committee participants, it has been determined that all interests in firms regulated by the Center for Drug Evaluation and Research present no potential for an appearance of a conflict of interest at this meeting, with the following exceptions. In accordance with 18 U.S.C. 208(b), full waivers have been granted to Drs. Barry Massie, Lemuel Moye, and Dr. Robert Califf, which permit them to participate in all official matters concerning Posicor. In addition, Dr. Dan Roden and Dr. 
Udho Thadani are excluded from participating in all official matters concerning Posicor. Further, in accordance with 18 U.S.C. 208(b)(3), a limited waiver has been granted to Dr. Udho Thadani. I'm sorry. I'm reading the wrong announcement. Well, I'll start over again. Sorry, excuse me. We'll do that again tomorrow. This is the announcement for February 27th, 1997. The following announcement addresses the issue of conflict of interest with regard to this meeting and is made a part of the record to preclude even the appearance of such at this meeting. Based on the submitted agenda for the meeting and all financial interests reported by the committee participants, it has been determined that all interests in firms regulated by the Center for Drug Evaluation and Research present no potential for an appearance of a conflict of interest at this meeting, with the following exceptions. In accordance with 18 U.S.C. 208(b)(3) full waivers have been granted to Drs. JoAnn Lindenfeld, Lemuel Moye, Marvin Konstam, and Dr. Dan Roden, which permit them to participate in all official matters concerning BiDil. In addition, Dr. Robert Califf is excluded from participating in all official matters concerning BiDil. Further, in accordance with 18 U.S.C. 208(b)(3), a waiver has been granted to Dr. Marvin Konstam, which permits him to participate in all official matters concerning Coreg. However, Drs. Barry Massie, JoAnn Lindenfeld, and Dr. Udho Thadani are excluded from participating in all official matters regarding Coreg. Copies of the waiver statements may be obtained by submitting a written request to the agency's Freedom of Information Office, Room 12A-30 of the Parklawn Building. We would also like to disclose for the record that Dr. Robert Califf and his employer, the Duke University Medical Center, have interests which do not constitute a financial interest within the meaning of 18 U.S.C. 208(a), but which could create the appearance of a conflict. 
The agency has determined that notwithstanding these involvements, that the interest of the government in Dr. Califf's participation outweighs the concern that the integrity of the agency's programs and operations may be questioned. Therefore, Dr. Califf may participate in all official matters concerning Coreg. With respect to FDA's invited guest expert, Dr. Robert J. Cody has reported interests which we believe should be made public to allow the participants to objectively evaluate his comments. Dr. Cody would like to disclose that he has conducted clinical trials and consulted for SmithKline Beecham, Merck, and Zeneca. He has also given presentations which were sponsored by SmithKline Beecham, and Merck. In the event that the discussions involve any other products or firms not already on the agenda for which an FDA participant has a financial interest, the participants are aware of the need to exclude themselves from such involvement and their exclusion will be noted for the record. With respect to all other participants, we ask in the interest of fairness that they address any current or previous financial involvement with any firm whose products they may wish to comment upon. That concludes the statement for February 27th, 1997. DR. MASSIE: Thank you very much, Joan. Well, as is probably apparent to all the members of this audience, as well as all the committee members, we have a very full agenda today and I'm going to do my best to keep the first half on time. In the interest of trying to proceed smoothly, I'm going to ask the committee members to try not to interrupt the sponsor's presentation midstream because we will allow a block of time for questions thereafter and I think that will allow the information to flow more smoothly. So, I guess we are ready for the presentation for BiDil, NDA 20-727. DR. ORLANDI: Dr. Massie, members of the committee, Dr. Lipicky, ladies and gentlemen, good morning. 
We are here today to present to you BiDil for the treatment of congestive heart failure. BiDil is a formulation of two drugs you are very familiar with, hydralazine and isosorbide dinitrate. Our application is based on two landmark clinical trials conducted in the 80s by the Veterans Administration, the V-HeFT I and V-HeFT II studies. Based on the results of these studies, we propose that BiDil is useful in the treatment of congestive heart failure as an adjunct to digitalis and diuretics. It is also our opinion that the use of this formulation is most appropriate in patients who are not taking ACE inhibitors, which have also become part of standard therapy. Dr. Jay Cohn, who led the V-HeFT trials effort, will provide a historical overview of the trials. Mr. Joe Quinn will then address specific statistical issues that have been raised by the agency. And Dr. Cohn will also conclude our presentation with a brief summary of the findings. I just wanted to mention briefly that we have a number of consultants in the audience to address any specific question that the committee may have. This list includes Dr. Lloyd Fisher, who conducted a re-analysis of the V-HeFT trials, Dr. Uri Elkayam, Dr. Kirk Adams, Dr. Ho-Leung Fung, and Dr. Alan Forrest. Dr. Cohn? DR. COHN: Thank you very much, Cesare, and I'd like to express my appreciation to the FDA and to the committee for giving me the opportunity to review with you the trials that we initiated really almost 20 years ago with the planning of the first V-HeFT trial, the vasodilator heart failure trials, which have continued to date, and the results of these first two trials will be the basis for our discussion this morning. 
What we would like to propose to you at the end of this presentation is that there is a strong basis for approval of BiDil for heart failure and we would propose that this be based on a survival benefit for BiDil as compared to placebo, on the basis of a strong trend for improved exercise tolerance versus both placebo and versus Enalapril in these two trials, on the basis of a sustained increase in ejection fraction that we believe not only confirms the mechanism of action of this drug combination but also confirms that there is a long-term effect of this drug combination. This combination of therapy has a well established rationale and an even better rationale today than at the time these studies were initiated, and we'll go into that in the course of this presentation. The safety of this drug combination of these two long-used agents is well established. This combination is already widely recommended as a treatment option in essentially all of the treatment guidelines that have been published in the last few years. And the approval of this combination is required to provide prescribing information to physicians who have been told to use this drug combination. Now then, hydralazine and isosorbide dinitrate were first used in combination. We did this, and Joe Franciosa, who worked with me at that time, is in the audience here today. We did this on the basis of the potency of this combination as a vasodilator, and the dramatic acute hemodynamic effect that this drug combination produced. At that time we predicted that this favorable hemodynamic effect might be translated into a long-term benefit but there were no long-term data available in order to determine that. V-HeFT, then, was organized as a landmark heart failure study, the first mortality trial undertaken in heart failure, with a goal to assess long-term efficacy of this vasodilator therapy added to conventional therapy, which at that time was digitalis and diuretics. 
ACE inhibitors had not been developed at that time. And it was possible, of course, at that time to include a placebo group because there was no other effective therapy, and this provided the first and, I must say, the only data that will exist either now or in the future of digitalis-diuretic therapy with placebo added in long-term therapy of heart failure. We would suggest that the impact of the findings of V-HeFT are that there is now demonstrated efficacy of chronic therapy, and this was indeed the first therapy which was demonstrated to be effective, and it has provided a new treatment option for the management of the patient with heart failure, which has already been accepted by most guideline committees. Well, V-HeFT I was a trial assessing vasodilator therapy in long-term therapy compared to placebo, added to, as I pointed out, digoxin and diuretic therapy for patients with heart failure. The two vasodilator regimens that were employed in this study were the hydralazine isosorbide dinitrate combination, and an alternate vasodilator, Prazosin, which had a rather similar hemodynamic effect in this patient population when given acutely. The comparison of the survival times between the placebo arm and the vasodilator arms was proposed to use a one-sided hypothesis because there was no reason at that point to consider any adverse effect of this therapy. The question was, is the therapy effective. So, it was a one-sided hypothesis, and therefore one-sided tests were proposed. V-HeFT II was initiated after the completion of V-HeFT I, and it was undertaken to determine whether the effective arm in V-HeFT I, that is the hydralazine-isosorbide dinitrate arm, had an effect comparable or different from that of Enalapril, which at that point in time had already been evaluated for short-term therapy of heart failure and it appeared to be effective. These drugs then were added to pre-existing conventional therapy of digoxin and diuretic. 
No placebo arm was included in V-HeFT II because it was felt by the planning committee that it was unethical after the results of V-HeFT I to have a placebo-treated group for long-term therapy. And since it was not known which of the two treatment arms would be more beneficial, it was a two-sided hypothesis and two-sided tests that were employed. Now, these trials that I'm going to tell you about were both randomized, double-blinded. One was placebo-controlled, the other had a positive control. All patients were followed for at least 6 months after randomization into the trial, and the survival status was confirmed in all patients at the planned date of completion of the trial. Both of these trials were planned to be completed at a specific date, and that date was indeed utilized in termination of the trial. The inclusion criteria were all males. The studies were all performed in Veterans Affairs hospitals. They were males between the ages of 18 and 75. They had a history of heart failure, with limitation of exercise tolerance for at least 3 months prior to screening. They all remained symptomatic, despite the use of digoxin and diuretics, and they had objective measurements that made them eligible. That is, there was cardiac dysfunction, as defined by either an enlarged heart on chest x-ray, greater than 0.55 cardiothoracic ratio, or a radionuclide ejection fraction of less than 45 percent, or a dilated left ventricle on echocardiography with a left ventricular internal dimension in diastole of greater than 2.7 centimeters per meter squared. These criteria were used in both trials. In addition, the patients were all subjected to a bicycle ergometer exercise test with measurement of gas exchange. And they had to have a reduced peak oxygen consumption less than 25 ml per kilogram per minute to be eligible for the trial. The major endpoint in both trials was survival time, and two related endpoints were utilized. 
That is, the overall survival, and of course the 2-year survival. Of course, the reason for doing that is that if one follows patients long enough, everyone will die and it was thought that perhaps a 2-year endpoint might be a more sensitive marker for a favorable effect of the therapy. So, these were both proposed as analytical endpoints. Now, the survival time was proposed to be carried out by the log-rank test, with the addition of a Cox proportional hazards model, using baseline patient characteristics as modifiers for the Cox model. I will, in addition, talk to you about two major endpoints of the trial, secondary endpoints, at least. That is, changes in left ventricular ejection fraction and changes in peak oxygen consumption, both of which are the major determinants of survival in this population. These two endpoints were selected by the FDA with its consultant Milton Packer, and Milton is in the audience if there are any questions about his selection of these two as two of the criteria on which to adjust the mortality with the Cox analysis. These, we all agree, are major endpoints in the management of heart failure, so that these two I will talk to you about in some detail. These were assessed by repeated measures analysis, and the t-test of change from baseline by individual visits was utilized for statistical analysis. Well, these are the patient characteristics in the two trials. I think you can see that the characteristics in the two patient populations are quite similar. That is, the patients were somewhere between 58 and 60 years old, a little older in V-HeFT II. The study was obviously performed later. The ejection fraction ranged around 28 to 29 or 30 percent in both trials. The cardiothoracic ratio exhibited an enlarged heart in both studies. The peak oxygen consumption averaged around 15 or 14 ml per kilogram per minute, and patients all had heart failure for at least 2 or 3 years. The majority of the patients were Caucasian. 
That is, about 70 percent of them in both trials, but there was a fairly sizeable number of African-Americans in the trial. We won't go into that, but we have much data comparing the Caucasian and African-American responses. This is the duration of heart failure, which predominantly was between 6 and 48 months. About 55 percent of the patients had coronary disease as the etiology of their heart failure, and about 45 percent had what was thought to be non-ischemic cardiomyopathy. Now, this was the major endpoint of the trial, which was to monitor mortality, and I plot here the differences between the placebo and the hydralazine-isosorbide dinitrate groups during the 4-plus years in which the patients were followed. You can see that at the initiation there were 273 placebo patients and 186 hydralazine-isosorbide dinitrate patients. That was a planned preference for placebo entrance because we had a third treatment arm, which was Prazosin, which I don't have plotted here. I'll show you the survival curves, including Prazosin, in a moment. But this was an attempt to have a larger placebo group so that we could have more confidence in the placebo arm. You can see at the end of 1 year, there had been a 19.5 percent mortality in the placebo arm, and a 12.1 percent mortality with H/ISDN, and that was a 38 percent mortality reduction. At the end of 2 years, the differences were 34 and 25.6 percent, or a 25 percent reduction. At 3 years the reduction was at 23 percent, and by 4 years it was a little under 10 percent. By 5 years, of course, the numbers became very small. So, after 3 years we really have very little power and obviously an instability of the survival curves at that point in time. This indeed is a plot of the three survival curves in V-HeFT I. H/ISDN in yellow at the top showing that there is a clear reduced mortality or improved survival in this treatment compared to the placebo group in blue. 
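[Illustrative aside, not part of the proceedings: the percentage reductions quoted above appear to be relative reductions, that is, the difference between the placebo and H/ISDN mortality divided by the placebo mortality. The following is a minimal Python sketch of that arithmetic, using only the 1-year and 2-year percentages cited by Dr. Cohn. The function name relative_reduction is a hypothetical helper introduced here for illustration.]

```python
def relative_reduction(placebo_pct, treated_pct):
    """Relative reduction (%) in mortality versus placebo."""
    return (placebo_pct - treated_pct) / placebo_pct * 100.0

# Cumulative mortality percentages as quoted in the presentation.
year1 = relative_reduction(19.5, 12.1)  # 1 year: placebo vs. H/ISDN
year2 = relative_reduction(34.0, 25.6)  # 2 years: placebo vs. H/ISDN

print(f"1-year relative reduction: {year1:.1f}%")
print(f"2-year relative reduction: {year2:.1f}%")
```

[Rounded to whole numbers, these values correspond to the 38 percent and 25 percent reductions stated in the presentation.]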
The Prazosin group in red, superimposes on the placebo group until this very terminal end, where there's great instability in the numbers. I think this was the first evidence that a potent vasodilator, which Prazosin is, is not necessarily effective in heart failure, so the original concept that the two vasodilator arms would behave similarly was contradicted by this study. We now know that the efficacy of the vasodilator chronically is not necessarily related to its hemodynamic effect. This is just a brief summary of the statistical analysis of this trial. You'll hear a good deal more about this later from Joe Quinn, but just to briefly tell you what the statistics were. Using the log-rank test, the 2-year mortality reduction from hydralazine and nitrate compared to placebo was .0279, and the risk ratio .7. And as you can see, the 95 percent confidence intervals did not overlap 1. When the Cox model was employed, using the three variables that were chosen by the FDA and its independent consultant to be employed to adjust the log-rank test, the p value fell to .0168 and again, the confidence interval did not overlap 1. Overall mortality by the log-rank test was a p value of .046, and the confidence interval did include 1. When the Cox model was employed, that fell to .0177 and the confidence interval did not overlap 1. DR. MOYE: One question to clarify. Can you go back to the previous slide, please? DR. COHN: Yes, sure, Lem. DR. MOYE: When you show the log-rank p value of .046, now say again what the threshold was that the investigators determined prospectively for stat significance here. DR. COHN: Oh, we'll get into that a good deal later. I'm just showing you the raw p values with no intent to suggest that this has met any criteria that was established. So, we'll get into that in more detail later on. These are the raw numbers. Now, the other major endpoints that I wanted to bring to your attention were the peak oxygen consumption and the ejection fraction. 
This is the intent-to-treat analysis of changes in peak oxygen consumption over time in the two treatment arms of interest. There was clearly a trend, but no statistical difference between the two. A trend for the H/ISDN group to exhibit sustained improvement in peak oxygen consumption, which did not occur in the placebo arm, but none of these differences were statistically significant. Now, V-HeFT I gas exchange was done by a primitive methodology that we developed ourselves and we put some instruments together. There was no commercial instrument at that point available for use. It was a mixing chamber. It was a pretty crude way to measure peak oxygen consumption. V-HeFT II, that I'll show you in a moment, was carried out with modern technology with breath-by-breath gas exchange data, so that I have more confidence in the V-HeFT II gas exchange data. However, the protocol said that exercise tests would only be included for analysis if they were terminated by dyspnea or fatigue. And therefore, in the protocol analysis, there were fewer patients included because those who had orthopedic reasons, et cetera, for not finishing an exercise test were excluded. When we did that, the data looked pretty similar. That is, the green line, which is H/ISDN, exhibited some improvement, which seemed to be at least unchanged or improved over time. The placebo group in blue exhibited what was a trend toward a decline. And one time point, at 1 year, exhibited a significant p value of less than .05 for the improvement of exercise performance in the H/ISDN group compared to the placebo group. That might be shown even a little more clearly on this next slide, in which we have done an analysis looking at the changes in exercise performance by groups. So, this represents in the placebo group on the left and the H/ISDN group on the right all patients who reached 1 year after randomization and what their exercise tests showed. 
First of all, there were more people who died in the placebo group. This has already been pointed out. So, this excluded them from a repeat exercise test, and more were excluded in the placebo arm. Then we've looked at three different levels of peak oxygen consumption, using 0.7 ml per kilogram per minute as the dividing point because the mean increase in the H/ISDN group was 0.7 at 1 year. You can see that in purple are those whose exercise performance worsened over that 1-year period of time, and there were fewer here than here. In yellow are those whose exercise performance stayed the same between the two, and there were more people in this group than in this group. In this top bar are those whose exercise performance improved over that period of time, and there were more here than there were in the placebo group. And these are the ones who, for administrative reasons, had missing data and they were equal in the two treatment arms. By a chi-square analysis of these two distributions it's significant at the p = .024 level. Now, the ejection fraction changes were very dramatic and consistent. That is, H/ISDN produced a significant and sustained improvement in ejection fraction. These are all measured by radionuclide techniques sequentially. In contrast, the placebo group exhibited no improvement and a progressive decline over time, which did not occur in this group. I personally view the sustained improvement of ejection fraction as a structural alteration in the left ventricle with reduction of the remodeling process, which appears to progress in the placebo group. Now, this slide adds the Prazosin group to this analysis. The Prazosin group now is in blue and the placebo group in yellow. You can see that these two track together, and there was a progressive decline of ejection fraction in both groups, indicating that this vasodilator, Prazosin, did not favorably affect the structural remodeling process in the left ventricle, whereas this vasodilator, H/ISDN, did. 
This indeed tracks directly with the mortality results that I've shown you before, and we have data that I won't have time to go into to suggest to you that the changes in ejection fraction are indeed very powerful predictors of the change in mortality in the individual patient groups. Well, when we completed V-HeFT I, there had been also the publication of data from the CONSENSUS study done in northern Scandinavia, in class 4 heart failure, using Enalapril as the treatment option. That study eventually led to the approval of Enalapril for mortality reduction in heart failure. And that was a 1-year trial, so that at the end of 1 year, in CONSENSUS, this was the mortality in the placebo group, very high because these were class 4 heart failure patients, and this was the mortality in the Enalapril group at 1 year, and that represented a 31 percent reduction. When we looked at the V-HeFT data in terms of 1-year data, this was the reduction from 20 percent to 13 percent which represented a 38 percent reduction in mortality. We thought that these two reductions were quite comparable, but this was indeed a very different patient population than V-HeFT and we therefore asked the question, would Enalapril have the same beneficial effect, or greater beneficial effect in mild to moderate heart failure as did hydralazine and isosorbide dinitrate in V-HeFT I, and that was the basis for the design of V-HeFT II. These were the survival curves from V-HeFT II, Enalapril in green, H/ISDN in yellow, and you can see that there was a clear trend for improved survival with Enalapril, compared to hydralazine and isosorbide dinitrate. This is the statistical analysis in brief of that difference by log-rank test, and a two-sided p value here. The p was .017 for the 2-year mortality difference between the two, favoring Enalapril, with a risk ratio of 1.46. 
The overall mortality difference did not achieve statistical significance, .0828, but a clear trend for a favorable effect of Enalapril compared to H/ISDN. But here the confidence intervals overlap 1.0. The other endpoints, again, in V-HeFT II are striking. This was the intent-to-treat changes in oxygen consumption during exercise, and once again, H/ISDN exhibited a modest but sustained improvement in peak oxygen consumption, certainly for the first two years, and at least three time points these increases, when compared to the changes with placebo, were statistically significant -- not placebo -- Enalapril, I'm sorry, were statistically significant. At no time point during follow-up did Enalapril produce an improvement in peak oxygen consumption. In fact, oxygen consumption tended to decline over time. Now, this was the intent-to-treat analysis, but the protocol analysis, again, defined the changes to be identified only in patients who stopped exercising for dyspnea or fatigue, and this is the protocol analysis, showing pretty much the same thing, that there was a strong trend for an improvement with H/ISDN and not with Enalapril. These were the statistically significant points. I must emphasize to you that at 3 months and at 6 months, which represents the time frame for which the FDA has all existing data on changes in exercise and heart failure therapy, H/ISDN exhibited significant improvement in peak exercise capacity compared to Enalapril. I think that if the study therefore had been terminated at 6 months, as have most other exercise studies, there would have been no question that this therapy was more effective than the ACE inhibitor in symptom relief or exercise performance in heart failure. Now, once again, the ejection fraction changes were very striking with both therapies. That is, both Enalapril in green and H/ISDN in yellow produced a sizeable and sustained improvement in ejection fraction. 
In fact, at 3 months the increase in the H/ISDN group was greater than the increase in the Enalapril group. Thereafter the two were similar, suggesting that both interventions favorably affect the remodeling process in the left ventricle. Now, we were struck when we completed V-HeFT II, and we had the same treatment arm in both trials -- that is the H/ISDN arm was exactly identical and the therapy was identical in the two trials. The survival curves for these two treatment arms were superimposable. That implied to us that there must have been some stability in this response since it was so reproducible. Now, this of course then comes to the placebo arms because we did not repeat a placebo in V-HeFT II and therefore, we are dependent on the placebo group in V-HeFT I. I have given you a list here of the so-called placebo groups in more recent heart failure trials. I have given you data on the number of deaths in these trials, in the placebo arms, the duration of follow-up, and the use of what may be critical co-therapy. That is, the use of nitrates and the use of ACE inhibitors. In V-HeFT I there were 120 deaths, so this is a rather robust sample. The follow-up was 2.3 years. None of the patients in the placebo arm received nitrates, and none of them received ACE inhibitors. This is a true placebo group, added to digoxin and diuretic therapy. CONSENSUS, that I have already alluded to, had only 55 deaths in the placebo arm. The follow-up averaged only 0.5 years. 45 percent of those patients were treated with a nitrate, which obviously potentially contaminates the placebo group. In the SOLVD trial, which is the largest clinical experience in heart failure trials, there were 510 deaths in the placebo arm, and a follow-up of 3.4 years, which makes this a very robust placebo group. But it isn't a placebo group because 45 percent of these patients were treated with nitrates chronically, and 23 percent in the placebo group were given ACE inhibitors as drop-in therapy. 
So, this is certainly not a placebo group. Now, the more recent trials, PROMISE, which exhibited an adverse effect of milrinone in heart failure, had 127 deaths in the placebo group, an average follow-up of only 0.5 years. But 59 percent of that placebo group was treated with nitrates, and essentially all of them at least were by protocol on an ACE inhibitor. We don't have the actual data in the paper. The Vesnarinone trial, which was not replicated by the more recent VEST study, had only 33 deaths in the initial Vesnarinone mortality trial in the placebo arm. The follow-up was only 0.5 years. We don't know about nitrates, but 90 percent of the placebo group were on ACE inhibitors. The more recent Carvedilol American data that you'll be reviewing later this morning, in the placebo group there were 31 deaths. As you know, the follow-up was only 0.5 years, but once again, 32 percent of them were receiving nitrates and essentially all of them were receiving an ACE inhibitor. So, this is to point out to you that we will never again have a placebo arm comparable to V-HeFT I because it is ethically indefensible to any longer treat patients without an ACE inhibitor, and we would like to suggest that after today's meeting it would be equally indefensible to treat them without a nitrate along with hydralazine. Well, if we can use that placebo arm, then, as a comparator, we can put a plot of the five treatment arms from V-HeFT, and in fact this analysis was recommended by the FDA for us to do. So, this is in response to their raising the issue about comparing the V-HeFT I placebo group here in red with Enalapril in blue and with the two hydralazine-isosorbide dinitrate curves in yellow. Now, a 2-year endpoint was indeed a pre-study endpoint, so we have dropped a vertical at 2 years in the placebo arm, and discovered this is the mortality at 2 years and we put a horizontal line over to this mortality, which is about 65 percent.
Then we determined at what time point would you reach that same mortality if you had instead been treated with hydralazine and isosorbide dinitrate, and these two curves, which were just superimposed, show you that there is a prolongation of life by an average of about 10 months. If instead we had used Enalapril, the prolongation of life would have been longer, at maximum probably another 8 months. Obviously the effect is more than 50 percent of this effect, and this effect is enhanced by the fact that there was a little blip on that Enalapril curve there, but be that as it may, it's clear that Enalapril had a more favorable effect than did hydralazine and isosorbide dinitrate. But both are very importantly better than the placebo or Prazosin arms shown here. Well, in summary, then, I've told you that mortality is reduced by H/ISDN, that there is a strong trend for improved exercise tolerance by H/ISDN, and that there is sustained improvement in ejection fraction with H/ISDN. Now, a number of statistical issues have been raised by the FDA, and I'll now turn the podium over to Joe Quinn, who will address some of these issues. Joe? DR. QUINN: Thank you, Dr. Cohn, and good morning to everyone. I would like to discuss several important statistical issues that have been raised by the agency that potentially impact the interpretation of the nominal p values in the application. The first issue is the impact of the interim analysis. This slide summarizes the interim results of the overall survival time that were provided to the V-HeFT I Operations Committee. Note that even though the protocol specified -- sorry. This slide summarizes the interim results of the overall survival time that were provided to the V-HeFT I Operations Committee. Note that even though the protocol specified a one-sided test hypothesis, the p values shown are two-sided p values, as the committee wanted to be conservative in their decisionmaking. 
There were four interim analyses that were conducted using the O'Brien-Fleming criteria. These are shown on the right-hand side with the critical values. Additionally, there were four interim analyses conducted for administrative purposes. The columns represent the protocol-specified ways of comparing the arms. Overall tests between the 3 curves, using a 2 degree of freedom test, a combined active versus placebo arm, and the two pairwise comparisons of the active arms to placebo. Note that the first three looks at the data were performed using an overall test. A trend was observed in February of 1983 comparing the best arm, which was H/ISDN, to the worst arm, placebo. It was after this analysis that the Operations Committee unblinded themselves. In May of 1983 it is important to note that a significant difference using the O'Brien-Fleming stopping boundary was observed between H/ISDN and placebo. However, the trial continued without change. The majority of the protocol specified comparisons were made after the significant interim result was established, pointed out in this area. Note that even though the overall survival time was used for this analysis, the significant results obtained in May of 1983 were more similar to a 2-year endpoint. The O'Brien-Fleming method was not the pre-specified method in the protocol but was used after the method was published in 1979 as it was easier to implement than the Canner method that was a pre-specified method. The next slide I'm going to show you is not included in the committee packet of slides, nor has it been shared with the FDA, due to the recent completion of this simulation, but we feel it shows strong, supportive information that also assesses the sensitivity of the O'Brien-Fleming method that was used. This slide summarizes the simulation of an interim analysis using the protocol specified Canner method. 
The results of this simulation support the findings of the O'Brien-Fleming method, indicating a superior mortality benefit for H/ISDN over placebo in May of 1983, as well as in August 1984. The critical p value that was used for this simulation was a .0125 for the comparison of H/ISDN to placebo, a .0125 for Prazosin versus placebo, and .025 for the combined active versus placebo. There were several reasons that the committee did not stop the study after the significant interim finding in May of 1983 was observed. Importantly, the impact of the differences in the baseline characteristics of the patients upon survival had not yet been assessed, and the committee wanted to establish the length of benefit of the H/ISDN effect. In summary, there was a statistically significant interim analysis in May of 1983, according to the O'Brien-Fleming stopping criteria. The study continued beyond May of 1983 to investigate the length of benefit of effect. The protocol-specified secondary analyses -- that is, the Cox model -- were justified based upon the significant interim results. As the May 1983 analysis met the stopping criteria, no penalty is required for the interim analyses that were conducted after this May 1983 finding. The next issue is the multiple treatment arm comparisons. As previously shown, the interim testing was conducted in a protected fashion. First, the overall test of the three treatment arms, using a 2 degree of freedom test, was employed. Secondly, the best versus the worst arms were compared in February of 1983, and again in May of 1983. Only after a significant result was obtained in May of 1983 were the combined active versus placebo arms and pairwise comparisons made. We would suggest that after significant differences were established in May of 1983, no alpha penalty is warranted for the protocol-planned comparisons performed subsequent to this time. 
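The critical values quoted for the Canner-method simulation can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the three thresholds (.0125, .0125, .025) are a simple split of a total alpha of .05, and it assumes a comparison counts as meeting its threshold when the nominal p value is at or below it. The names `THRESHOLDS`, `total_alpha`, and `meets_threshold` are hypothetical and do not come from the protocol or the application.

```python
import math

# Critical p values quoted for the Canner-method simulation in the presentation.
THRESHOLDS = {
    "H/ISDN vs placebo": 0.0125,
    "Prazosin vs placebo": 0.0125,
    "combined active vs placebo": 0.025,
}


def total_alpha(thresholds):
    """Sum of the per-comparison critical values (assumed to be a simple split)."""
    return sum(thresholds.values())


def meets_threshold(p_value, comparison, thresholds=THRESHOLDS):
    """True if a nominal p value is at or below the critical value (an assumed rule)."""
    return p_value <= thresholds[comparison]


if __name__ == "__main__":
    print(f"total alpha across comparisons: {total_alpha(THRESHOLDS):.4f}")
```

Under these assumptions a pairwise comparison is held to a stricter standard than the combined active versus placebo comparison, which is why the choice of which comparisons were planned matters for the interpretation of a nominal p value.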
The next issue is the stepwise approach to the analysis, that is, a non-significant log-rank test, then a Cox model analysis. The significant log-rank interim analysis in May of 1983 justified the protocol-specified secondary analysis, the Cox model, without alpha penalty. The analysis of covariance method gave a more precise estimate of the true treatment effect, especially for overall survival where the estimate of effect is more variable because of the small number of patients in the trial after 3 years. The next issue is the imputation of missing covariate values. There were a total of 459 patients in the placebo and H/ISDN treatment arms in V-HeFT I. There were 51 of 459 patients that were missing either baseline ejection fraction or baseline peak O2, two of the covariates selected by Dr. Packer as a consultant to FDA. This slide shows the baseline mean values of ejection fraction and max oxygen consumption by survival status. There was a consistent trend independent of treatment group showing that patients dying during the study had a lower baseline EF and lower baseline max O2 than those alive at the end of the study. This slide shows similar data in a slightly different fashion. This slide shows the cumulative mortality by baseline ejection fraction and baseline oxygen consumption during peak exercise. The patients with the lowest ejection fraction and lowest oxygen consumption had the highest mortality during the study. There were incremental advantages in total mortality observed by baseline ejection fraction and oxygen consumption, with patients having the higher baseline ejection fraction and higher baseline oxygen consumption having the lowest mortality. This next slide summarizes a simulation analysis performed by Dr. Jim Hung at FDA, showing alternative methods for imputing missing values of baseline ejection fraction and maximum oxygen consumption.
The first row of this table shows the results when the maximum non-missing value is used to impute the missing values for the patients that died, and the minimum non-missing value is used to impute the missing values for the patients that survived. This approach may not make sense, given the data which I have just shown to you, and what we know about the trials regarding the prognostic significance of ejection fraction and oxygen consumption upon survival. The second row shows the results if one uses the mean value to impute the missing values for all patients with missing ejection fraction or max oxygen consumption, regardless of whether they died during the trial. This is the method that most closely resembles the approach used by Dr. Lloyd Fisher for the analysis submitted in our application. This method leads to a p value of .016 in the Cox model for overall survival and a p value of .013 for 2-year survival. The third row shows the results obtained if one uses the minimum non-missing value for ejection fraction and max O2 as the imputed value for those patients that died during the trial, and the maximum non-missing value of ejection fraction and max O2 for those that did not die during the trial. Based upon the data that I have just shown you, and based upon what we know about the prognostic significance of ejection fraction and oxygen consumption from other trials, this method has strong intuitive appeal. Using this method to impute the missing 51 covariate values, one obtains a p value of .007 for the overall survival, and .01 for the 2-year survival. The true p value probably lies somewhere in between .016 and .007 for the overall survival and probably somewhere between .013 and .01 for the 2-year survival. Finally, I would like to point out regarding the last column, labeled log-rank/Cox, these columns indicate the simulation results for the incremental increase in the p value for conducting the Cox analysis after a non-significant log-rank test.
As previously mentioned, the statistically significant log-rank test in the May 1983 interim analysis provided a rationale for conducting the Cox analysis without an adjustment in the p value for this approach. In summary, the sensitivity analysis conducted by Dr. Hung indicates a range of nominal p values, depending upon the method used for imputing the missing covariates. Use of a minimum value for deaths and maximum value for survivors is reasonable, given the observed findings and what we know about the prognostic significance of these covariates. Use of a mean value may lead to a more conservative p value, especially for overall survival. The next issue is the two protocol-specified primary endpoints. There was a 33 percent reduction in mortality through 2 years, and a 27 percent reduction in overall study mortality for H/ISDN treated patients. The one-sided 95 percent confidence intervals indicate the H/ISDN risk reduction is consistent with the range of observed findings. The observed risk reduction at both time periods was consistent and correlated, and both findings represent different point estimates of one endpoint, that is, survival. In summary, a consistent risk reduction was observed at 2 years and overall study. The protocol specified evaluation at two time points to assess the length of benefit of effect. The estimate of effect may be influenced by the sample size at 2 years, and at the end of study, and it may be reasonable to consider survival data as one endpoint, having two point estimates of effect for the modest alpha penalty imposed upon the nominal p values. The last issue is the issue of the replication of the study findings. There are three questions that have been suggested by the agency that must be addressed for this issue. Would H/ISDN have beaten placebo if it had been studied in V-HeFT II? And what is an appropriate placebo group to use for this comparison?
And is the point estimate of the effect size for H/ISDN less than half the effect size of Enalapril? Because of ethical concerns, the demonstrated mortality benefits observed in V-HeFT I were not replicated. However, the agency has suggested analyses that might be supportive of the mortality benefit and the following is presented as supportive information. We strongly feel that the randomized concurrent control arm of V-HeFT I is the appropriate basis for the mortality benefit. It has been proposed by the agency that the placebo arm from the SOLVD treatment study may be an appropriate arm for comparison. This slide shows the risk ratio for mortality relative to Enalapril for SOLVD treatment, placebo, and V-HeFT II H/ISDN. When the H/ISDN effect is compared to this placebo, there is no observed difference in the risk estimates. This is true at both the 2-year and the overall time points. However, this placebo arm is flawed for the purpose of making this comparison, for the following reasons. This study allowed the active use of vasodilators and nitrates and the study also allowed open-label use of ACE inhibitors. This placebo arm is therefore not an adequate control for making this comparison. And one would not expect to observe a difference between H/ISDN and such an arm. A more appropriate control arm is the placebo arm from V-HeFT I. The placebo arm from V-HeFT I allowed only digitalis and diuretic use. Once V-HeFT I was completed, it was no longer ethical to use this control arm in this patient population. Use of the V-HeFT I placebo group as a control group for V-HeFT II makes sense, given the similarity of the patient populations studied and the conduct and handling of both trials. As previously shown by Dr. Cohn, this slide shows the survival profile for H/ISDN treated patients in V-HeFT I and V-HeFT II. It is clear that this profile is very similar, but this does not allow one to conclude that the risk reduction for H/ISDN is replicated in V-HeFT II.
To do that, one must also consider the data from the second arm of that trial, Enalapril, and how each arm would have performed relative to a placebo group, had there been one. This slide shows the risk reduction and the 95 percent confidence intervals for V-HeFT I and V-HeFT II, as well as the risk reduction for Enalapril compared to V-HeFT I placebo, and H/ISDN from V-HeFT II compared to V-HeFT I placebo. It is important to note the following. The risk reduction observed for H/ISDN in V-HeFT I, .73, is consistent with the observed risk reduction for H/ISDN in V-HeFT II, compared to V-HeFT I placebo, .75. There is a strong suggestion of an overall Enalapril benefit in V-HeFT II, even though the 95 percent confidence interval includes 1. The risk reduction observed for Enalapril, compared to V-HeFT I placebo, .61, is consistent with the expected conclusion of an Enalapril survival benefit. Importantly, the point estimate of the H/ISDN risk reduction, .75, is not less than half of the point estimate of Enalapril, when both are compared to a common placebo. Also, the upper bound of the Enalapril effect overlaps the point estimate of the H/ISDN effect, as does the lower bound of the H/ISDN effect overlap the point estimate of Enalapril. In summary, V-HeFT I was the only study with a true placebo arm. The Enalapril survival benefit versus V-HeFT I placebo was consistent with the expected survival benefit of Enalapril. The H/ISDN survival benefit from V-HeFT I was replicated in V-HeFT II when compared to the V-HeFT I placebo group, and the point estimate of the V-HeFT II H/ISDN survival effect was not less than half the effect size of Enalapril, when compared to a V-HeFT I placebo. In conclusion, it is reasonable to expect little or no impact upon the nominal p values due to the issues described. The extent of the alpha penalty does not impact the interpretation of the observed survival benefit for H/ISDN in V-HeFT I.
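The "not less than half" comparison can be reproduced from the point estimates quoted above. The sketch below is an illustration under stated assumptions, not the sponsor's or the agency's method: it treats the quoted figures (.75 for H/ISDN in V-HeFT II and .61 for Enalapril, each against the V-HeFT I placebo) as risk ratios, and it measures effect size two ways, as the risk reduction 1 - RR and as the log risk ratio. The constant and function names are hypothetical.

```python
import math

# Point estimates quoted in the presentation, assumed here to be risk ratios
# relative to the V-HeFT I placebo arm.
RR_HISDN_VHEFT2 = 0.75
RR_ENALAPRIL = 0.61


def fraction_of_effect_linear(rr_test, rr_reference):
    """Ratio of risk reductions: (1 - rr_test) / (1 - rr_reference)."""
    return (1.0 - rr_test) / (1.0 - rr_reference)


def fraction_of_effect_log(rr_test, rr_reference):
    """Ratio of log risk ratios: ln(rr_test) / ln(rr_reference)."""
    return math.log(rr_test) / math.log(rr_reference)


if __name__ == "__main__":
    linear = fraction_of_effect_linear(RR_HISDN_VHEFT2, RR_ENALAPRIL)
    log_scale = fraction_of_effect_log(RR_HISDN_VHEFT2, RR_ENALAPRIL)
    print(f"fraction of Enalapril effect, linear scale: {linear:.2f}")
    print(f"fraction of Enalapril effect, log scale:    {log_scale:.2f}")
```

On either scale the H/ISDN point estimate comes out above half of the Enalapril point estimate (about 0.64 and about 0.58 respectively). The calculation uses point estimates only; it does not account for the confidence intervals or for the fact that the placebo arm comes from a different trial.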
And it is reasonable to conclude that the H/ISDN survival benefit was replicated in a second study. And now Dr. Jay Cohn will provide a clinical wrap-up to the presentation. DR. COHN: Well, a number of other endpoints were monitored in V-HeFT I and II and time won't allow us to go into all these, but a few of them have been specifically addressed by the agency, and I'll try to provide those data. In V-HeFT I and V-HeFT II, we measured cardiac hospitalizations as well as we could. Quite a different population because these were VA centers and the criteria for admission to a VA hospital are quite different from those to private hospitals. Quality of life was assessed in both trials, but I must point out to you that in 1979 when we planned V-HeFT I, there were really no appropriate quality of life instruments that could be used, so this was truly not a quality of life assessment. We did use a form in V-HeFT II that I will show you in a moment. It was never validated. It has not been re-used. We have subsequently developed a Minnesota Living with Heart Failure Questionnaire, which was not employed in all the centers in V-HeFT II. We monitored heart size, we monitored echocardiograms. We did Holter monitoring, and we measured plasma norepinephrine levels, and time will not allow me to go into these endpoints. The time to death or hospitalization is shown here because the agency asked about hospitalizations. This is the V-HeFT I data showing time to death or hospitalization, and you can see there was a clear trend for the H/ISDN group to fare better than the placebo group, but this was not statistically significant. This is the analysis of the V-HeFT II, that is, time to death or hospitalization. 
In V-HeFT II, and as you might predict, there was a more favorable effect of Enalapril compared to H/ISDN, largely reflecting the mortality difference because when we look at just the time to first hospitalization for any reason in V-HeFT II, the two curves for Enalapril and H/ISDN superimpose and there is no difference at all between them. If one accepts, then, that Enalapril has a significant impact on hospitalizations and reduces it, as it has in other studies, one might conclude that H/ISDN is not different from Enalapril in that regard. This is the quality of life assessment we did in V-HeFT II, called a Heart Condition Assessment Score. This is the changes over time, an increase being an improvement in quality of life, a decrease being a decrease in quality of life, and there is no striking difference between H/ISDN and Enalapril. At the first time point, 3 months, where the agency has almost all of its data on quality of life in heart failure and the effects of therapy, H/ISDN exhibited a significant increase. Enalapril did not. That p value was less than .05 at 3 months. Thereafter, quality of life declined progressively in both groups, which tells you a little bit about the natural history of heart failure. At all time points, though, there was a greater decline in quality of life in the Enalapril group than in the H/ISDN group, suggesting a trend for more favorable effect of H/ISDN, consistent with the trends on exercise performance. Now, the safety of these two drugs I won't go into. You have it in your document, all the side effect data. The safety has been well characterized. We know, and it has been confirmed, that H/ISDN causes headache and that is reduced when the dose is reduced. We know that Enalapril causes cough and that clearly appeared in the database. There were essentially no instances of lupus in V-HeFT I. 
There were two possible cases in V-HeFT II, but it's clear that the incidence of lupus as a complication of hydralazine is exceedingly uncommon in this patient population. Now, the issue of nitrate tolerance has been raised repeatedly, both in the clinical arena and by the agency because of the well-known tolerance that develops to continuous nitrate administration in the treatment of angina. The mechanisms for this nitrate tolerance have in the past not been clarified. There are many mechanisms that have been suggested, but there are recent and perhaps the most exciting data of all, on the role of hydralazine as an inhibitor of nitrate tolerance. It appears that when we serendipitously put these two drugs together in the late 1970s, not knowing at all what the interaction was but knowing that they were both vasodilators, we did something that proved to be remarkably effective, and that is, we added to nitrate a nitrate tolerance inhibitor. I'll show you just briefly the data on that issue. It has been well established in a number of laboratories, laboratories of Munzel and Harrison and Besange, that nitrate tolerance is associated with the generation of superoxides at the endothelial surface. These superoxides chew up nitric oxide and thus inhibit the nitric oxide effect which characterizes the hemodynamic response to nitrates. This is just one slide from a paper by Winslow that was published in the Journal of Clinical Investigation last year, in which superoxides are measured in response to NADH addition as a substrate. This is carried out in ground-up aortas from rabbits, who were either not treated with nitroglycerin or treated with nitroglycerin, or treated with nitroglycerin in addition to hydralazine for 3 days before the aortas were taken out and ground up. You will notice that this is the superoxide production in response to NADH in a control animal that received neither nitroglycerin nor hydralazine.
This second black bar is the increase that is identified when the animal had been treated for 3 days with nitroglycerin, an excess by about two or three-fold of the amount of superoxide that is produced in the vasculature. When hydralazine was added to nitroglycerin in the treatment of these animals for 3 days, there was no excess of superoxide produced, implying that the hydralazine had prevented the generation of the superoxide which causes nitrate tolerance. Now, the in vivo documentation of this combination has been well established. This is a study performed by Dr. Ho-Leung Fung, who is in the audience in case there are any questions raised about this, in which he took rats with myocardial infarction who had an elevated left ventricular end diastolic pressure and infused nitroglycerin continuously. In open circles is the response of the left ventricular end diastolic pressure to nitroglycerin. It comes down and then gradually recovers, despite the fact that the nitroglycerin infusion is continued. This recovery to pre-treatment levels implies tolerance to the hemodynamic effect of the nitroglycerin. When in fact he added hydralazine to the regimen, which in itself did not change LVEDP, the fall was comparable with the nitroglycerin but now the nitroglycerin effect was sustained over 10 hours. This is a significant inhibition of the tolerance that developed in the nitroglycerin-alone treated rats. And then to bring this to the clinic, Dr. Uri Elkayam and his colleagues in Los Angeles -- and Uri is also in the audience in case there are any questions -- did the same trial in humans with heart failure. Infusion of nitroglycerin in these patients with heart failure produced a decline in the pulmonary capillary wedge pressure and then when the nitroglycerin infusion was continued, the wedge pressure rose progressively, implying tolerance to the hemodynamic effects of nitroglycerin.
When hydralazine was co-administered with the nitroglycerin, the favorable effect of nitroglycerin on pulmonary-capillary pressure was sustained. So, there appears to be rather persuasive evidence now that hydralazine is a potent antioxidant which inhibits the tolerance that may develop to nitroglycerin or to isosorbide dinitrate. Now, I am not willing to accede that hemodynamic tolerance is necessarily also implied, that there is tolerance to the anti-remodeling effect of nitrates on left ventricular function. I think these must be viewed as separate endpoints, and we can't assume that one is related to the other. But I think that this is clear evidence that whatever tolerance might develop during chronic administration of isosorbide dinitrate should very much be inhibited by the co-administration of hydralazine. Well, I alluded at the beginning to the fact that the guideline committees have approved this therapy already and I just remind you, and you have in your briefing document the details of these guidelines, and in fact many members of this committee have served on these guideline committees. There are three identified here. That is, the guidelines issued by the American College of Cardiology and the American Heart Association for the treatment of heart failure, the guidelines issued by the Agency for Health Care Policy and Research, and the guidelines for heart failure treatment issued by the World Health Organization. All of these guidelines recommend for therapy of heart failure digoxin and diuretics, the use of ACE inhibitors, and the use of hydralazine and isosorbide dinitrate in patients who are not taking an ACE inhibitor. They do not suggest this should replace ACE inhibitors, only that this should be used in patients who do not take those drugs. Well, I just would like to finish up by putting in context what I have learned from these V-HeFT trials because this has changed the paradigm.
We used to think that heart failure was a syndrome in which there were many endpoints, all of which should be in concert. I think we now know that they are distinct, and that the progressive process in the left ventricle with dilatation, which we call remodeling, and a progressive fall in ejection fraction leads to premature death from arrhythmias or pump failure, and this process may continue and progress to death in the absence of symptoms. In fact, the SOLVD trial, the SOLVD prevention trial, was initiated to identify patients out here with a low EF and no symptoms. So that it is quite possible to go through this whole disease without symptoms. The presence of symptoms relates largely to noncardiac factors which may be variably stimulated by this process in the left ventricle, and may include neurohormonal activation and multiple other factors as well. Most of the data that have been previously reviewed by the FDA for treatment of heart failure for relief of symptoms have involved short-term studies in which symptom relief is really a short-term goal of therapy. In contrast, if one is interested in this process leading to death, one must do a long-term trial and one must use therapies to interfere with this process that may be quite separate from therapies aimed at relieving symptoms. So, I currently view the management of heart failure really with two different goals in mind. One is short-term symptom relief, and for that we often use -- we do use -- diuretics, and vasodilators may favorably affect short-term symptoms by producing a favorable hemodynamic effect. And we even use occasionally positively inotropic drugs like dobutamine and milrinone in order to have a favorable effect on hemodynamics and on symptoms, despite the fact that we know that these drugs shorten life expectancy, apparently, and some of these drugs have no effect on life expectancy and some may shorten it.
So that there is no relationship between the favorable effect of these drugs on symptoms and the potential for therapy to alter the long-term course of the disease. From what we now know, progressive left ventricular dysfunction can be inhibited and therefore mortality reduced by ACE inhibitors, by hydralazine and isosorbide dinitrate, I believe by beta-blockers -- and you are going to be dealing with that contentious issue this afternoon -- and perhaps by other neurohormonal inhibitors which can alter the milieu and influence the rate at which the left ventricle remodels, yet to be determined out here. But I think we have reached the point now where we have to identify specific endpoints for a therapeutic approach. The only agent which appears on both sides of these columns is hydralazine and isosorbide dinitrate because it does relieve symptoms and improves exercise as a potent vasodilator, and it also inhibits the progressive remodeling process in the left ventricle. Well, in summary, then, I hope we have been able to convince you, Mr. Chairman, that there is a strong basis for approval of BiDil for congestive heart failure. 
That is, that the combination of hydralazine and isosorbide dinitrate exhibits a survival benefit compared to placebo; that it exhibits a strong trend for improved exercise tolerance versus both placebo and versus Enalapril in V-HeFT II; that it produces a sustained improvement in ejection fraction, which I believe means that it is inhibiting the remodeling process and it also confirms the long-term effect of these two vasodilators; that this combination therapy has a well-established rationale, even more well-established by the recent data relating to nitrate tolerance; that the safety of this combination is well-established; that it is already widely recommended as a treatment option in all the guidelines issued for the management of heart failure; and that indeed the practicing physicians require prescribing information to properly utilize this remarkably effective therapy. Thank you very much. DR. MASSIE: Thank you very much, Jay. The way I think we should proceed from here is first open up this presentation to questions from the committee and our consultants. We are going to lead off with our reviewers, as we usually do, and our consultants, and if the reviewers from the FDA want to ask some questions at that point, that would also be appropriate. Then we'll ask the reviewers from the FDA for comments and then we'll proceed on to the questions. So, why don't we start. Lem, do you want to start, since you had some statistical questions? DR. MOYE: Sure. Nowhere in the slide presentation that we saw today did I see the -- and if this was here and I missed it, I apologize, but I don't think I saw the log-rank analyses which led to the p value of 0.093. And I wondered if you could comment on that. DR. QUINN: I think you are referring to the two-sided log-rank test that was in the original application? DR. MOYE: That's right. DR.
QUINN: Well, that has been presented as the one-sided p value that corresponds to that two-sided test, as the protocol specified the one-sided p value as the appropriate method. DR. MOYE: And so the one-sided p value is what precisely? DR. QUINN: Can I go back to that slide? It would be the one from Dr. Cohn's presentation of the summary of the V-HeFT I survival. DR. MOYE: That's where I think I first asked the question. It's 0.04. DR. COHN: It's .046, I think. DR. MOYE: Okay. Now, the threshold for significance, which was prospectively specified by the investigators, was at, again one-sided, 0.025. Is that right? DR. QUINN: Well, it's difficult to interpret the protocol actually. The protocol suggested that different alternatives could be employed, depending upon the number of comparisons that were made. And the protocol suggests that if the combined active versus placebo arm was compared, as well as the two individual active arms to placebo, then the individual active arms to placebo could be compared at the .0125. However, the protocol doesn't necessarily lead one to believe that all those tests would be conducted and the pairwise comparisons could also have been tested at .025 and the rationale that I'm trying to make is that the interim analysis of May 1983 that met the O'Brien-Fleming stopping criteria, was the significant log-rank test found for the trial. DR. MOYE: But since the trial was allowed to continue, I think it's also admissible that that might not be the definitive p value because of course, as you get these multiple p values, as you go through the interim analyses, one could choose any p value they wanted and continue to go through the trial, amassing additional p values. There is a problem with that approach, right? Okay. One other question. The protocol is actually quite laudatory of the log-rank test.
I will not read the individual statements from the protocol, but there are I think two locations where they mention the superiority of the log-rank test and that it is distribution-free, and I think they go so far as to say that one of the best tests available to identify small differences between treatments is the log-rank test. Yet, now there is a good deal of emphasis on the Cox analysis approach. I could only find one brief mention of the Cox analysis approach in the protocol and if I compared statements about the log-rank versus statements about the Cox, my view would be that the investigators were hanging their hopes on the log-rank and not the Cox. Yet, we see a good deal of analyses today centered on the Cox regression analysis approach. DR. QUINN: Well, the survival curves become more variable at later time points of the trial, and the Cox model helps to partition out some of that variability and to assess the treatment effect. DR. COHN: Yes, if I could just comment about that, Dr. Moye, because you have to remember, this protocol was planned in 1978 or 1979. There were no data yet on long-term follow-up of heart failure. So, the possible potency of covariates and variables in influencing mortality was completely unknown at that time. I think that all current trials in heart failure are done recognizing those variables and adjusting for them, usually with a Cox analysis. I agree with you. At the time this protocol was written, the Cox analysis was not necessarily identified as an important determinant, for the very reason that we were not very cognizant of how important these were going to be in influencing this ultimate survival. DR. MOYE: So, I guess the crux of the matter here for me is, is there ever a circumstance when the primary statistical prospectively stated analysis plan can be adumbrated, can be substituted by another analysis plan using another stat analysis procedure? DR.
COHN: Again, I think you are entirely right, and that is why we have gone into this intensive analysis of the statistics because that question has come up repeatedly and we can only show you the data as they are. These are the p values. One has to interpret them as one chooses to do. But keep in mind that this is a study designed 20 years ago. This was a VA cooperative study. This was not designed really as a regulatory study so that careful selection of criteria for endpoint were not as precise as one would see in a protocol designed today with the goal to come to this committee and ask for approval. So, one has to look at this a little differently than one might at a more recently organized mega-trial in which p values are clearly defined as the goals for the trial. DR. MOYE: Thank you. DR. MASSIE: Let me just read the statement I think that Dr. Moye was pointing out. This is in the analysis method of the protocol, the PF1 on page 34, where it said that variables which are prognostically important will be identified by comparing survival curves of patients on different levels of baseline variables. The life table regression procedures of Cox will also be used to identify variables that are prognostically important and to obtain estimates of treatment effects adjusted for any inequality in their distribution between treatments. Now, one thing that struck me on the baseline characteristics is there were no inequalities of those prognostically important variables. Was that the case? DR. COHN: Yes. There were no significant differences when one asks are there differences between the two groups, but of course there are subtle differences which may impact upon mortality that don't reach statistical significance when one compares the two groups. It has been the usual approach in V-HeFT to look at all variables and not just confine oneself to variables that show a significant difference between the two treatment arms.
And you can see the degree of adjustment that was required when we switched from a log-rank test to a Cox analysis, albeit using now only those variables identified by the agency and not the variables that we had originally planned on using because they were preselected independently. DR. MASSIE: It's just that in my naivete I was surprised that there was such a substantial difference in the outcome of those analyses, despite the lack of what looked like even trends. I saw a .5 difference in VO2, but everything else looked right-on. I wondered how much of that might have been as a result of imputation of the missing values as opposed to -- DR. COHN: Well, there were only 51 missing values in this whole group out of -- DR. QUINN: 459. DR. COHN: -- 400 and some patients, so it's really a relatively small number. It would probably be appropriate -- Lloyd, do you want to make a comment about that? Because Lloyd has really spent a lot of time going over these data. DR. FISHER: Well, just that the reason you can get a difference is, there are papers out showing in the Cox model, if the Cox model with covariates holds, if it is appropriate -- and that is an if -- then if you leave out other covariates, you bias the estimated effect downward. That is, I think, Piantadosi and Sam Wieand and some other people have published that. So, perhaps the reason there is a change is slightly analogous to in the analysis of variance you can reduce your variability by taking into account factors. It is not that you are correcting for baseline imbalance, but you have a more precise treatment estimate when you also take into account the other factors, and that does not contribute to the variability. That would be my guess that that is how this happens. Now, having said that, how you would actually prove a statement like that I am not sure, but it certainly can happen mathematically. DR. MASSIE: Lem, and then I thought I would ask Dr. D'Agostino to comment after Lem. DR.
MOYE: Just one brief question. I wonder if you would comment on the concerns that have been raised about the lack of fit of the Cox regression model. DR. FISHER: Pardon me. About the lack of fit? DR. MOYE: The difficulty with the believability of the underlying assumptions required by the Cox regression model. DR. FISHER: Yes, I would be happy to comment on that because that's actually how I got involved in this. Things were somewhat down the road and the FDA review said the fit was examined in two ways, minus log plots. And also on the SAS output there was a statistic for fit. One of the p values for goodness of fit was .049 or something. And I came in there and I looked at the plots and I said, hey, this is proportional hazards. I knew that. I mean, I looked at it. Now, this isn't proof, this is Gestalt. So, what I suggested to the agency, I said, let's go to the randomization test. We'll use the Cox statistic but because we're worried about the parametric assumptions, we will go to the randomization test for the treatment effect, which is what we did, the primary thing actually that I did in my analysis. Before that was done that was agreed upon at a meeting with the agency that -- of course, the randomization test is always valid. It doesn't depend on the assumptions. The p value actually turned out to be almost exactly the same. To be perfectly frank, even before we did it, I knew that would happen because I had seen the plots and it looked like proportional hazards. But nevertheless, I think it will alleviate that concern with the agency. I assume that Jim is here, and if the agency still has concerns about that, they can bring them up. I don't think that's much of an issue here. DR. MASSIE: Ralph, do you want to help? DR. D'AGOSTINO: Well, I'm not going to help but I'm going to say something. I guess I'm not overwhelmed with the notion that the protocol says a log-rank test as the major test, and then later on one may want to shift to a Cox.
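(Editorial illustration, not part of the spoken record. Dr. Fisher's remarks above describe a randomization test: keep the observed test statistic, then recompute it over re-shuffled treatment labels so that the p value does not rest on the parametric model. The sketch below shows the general idea only. It swaps in a simple difference in group means for the Cox statistic he describes, uses made-up toy data, and all names are hypothetical.)

```python
import random


def randomization_p_value(outcomes, labels, n_perm=5000, seed=0):
    """One-sided Monte Carlo randomization test.

    outcomes: numeric outcome per patient (toy values in this sketch)
    labels:   1 for active treatment, 0 for control
    Statistic: mean(treated) - mean(control); larger favors treatment.
    """
    rng = random.Random(seed)

    def statistic(lbls):
        treated = [y for y, g in zip(outcomes, lbls) if g == 1]
        control = [y for y, g in zip(outcomes, lbls) if g == 0]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # re-randomize labels, keeping group sizes fixed
        if statistic(shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed assignment itself, so p is never 0.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): 10 treated, 10 control.
toy_labels = [1] * 10 + [0] * 10
separated = [10] * 10 + [0] * 10   # treated clearly better
no_effect = [1] * 20               # identical outcomes in both arms

p_separated = randomization_p_value(separated, toy_labels)
p_no_effect = randomization_p_value(no_effect, toy_labels)
print(p_separated, p_no_effect)
```

(The validity of such a test comes from the randomization itself, which is the point made in the testimony; the choice of statistic affects power, not validity.)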
I have written protocols where the analysis that we actually used wasn't even invented when the protocol was being written. So, the notion of shifting is not too dramatic. But in this case here there is such a heavy reliance on the log-rank test that you sort of say that this is the procedure to be used, and then when they're shifting to the Cox, as the analysis is produced, it does become bothersome in terms of trying to sort out, is it chasing after something that's going to show significance, or is it something that you really believe is the best method. The other point that really bothers me is that I can't sort out what the primary variables are. It seems to me like there are a lot of primary variables, which means that there is a lot of testing that is going to go on. And now it looks like there is only a couple of primary variables, which means maybe there shouldn't be too much adjusting. Could someone really clarify? I thought I heard a presentation that there were a lot of secondary variables, but in the materials I had, there were something like six primary variables. That would lead you to say that you committed to those variables. DR. COHN: The protocol did have, I think, six variables as primary endpoints. I know if I were rewriting the protocol today -- and I can't do that -- and we had in mind a regulatory consideration, we would have more precisely defined what were primary and what were secondary. In 1979, that was not done. We all knew as we were progressing -- and we certainly have learned since then -- that the important variables are the ones that I focused on this morning because we now know those are the important variables in heart failure. How did we learn that? We learned that from V-HeFT. So, this is a self-fulfilling prophecy. You do the study, you learn about the disease by doing the study, and then it would be nice to then go back and redesign your protocol, but we don't have the luxury to do that. So, what you are saying is correct. 
One has to recognize there were a number of variables. The beauty, of course, of this is that every variable went in the same direction. So, we haven't hidden anything. I have alluded to some of those. The trends were all favorable in everything that we looked at. I hope that gives some comfort to the agency in the approval of the drug because there really is consistency across all the variables. DR. D'AGOSTINO: One of the concerns that I think we might have with that is that you then really use the study in an exploratory fashion, which is we learn from studies. But it still then leaves us with the sense that, do we believe the way the variables ultimately were sorted out would, in fact, be confirmed in yet another trial. I think this is where my problems come from. DR. COHN: We have what we believe is strong support for the other variables in V-HeFT II. So, you have seen two trials in which the second -- the other endpoints all went in the same direction, and I think that should give you confidence that V-HeFT I has been replicated. DR. MASSIE: JoAnn? DR. LINDENFELD: Dr. Cohn, I have some concerns about dosing intervals. V-HeFT I and V-HeFT II were both q.i.d., and I understand the approval is for t.i.d., or is it for q.i.d.? DR. COHN: No, the approval should be for q.i.d. The data are q.i.d. DR. LINDENFELD: All right, good. DR. COHN: I think some of the recommendations, at least in one of the guidelines, is for t.i.d., based upon intuition, certainly not based upon data, and we are here with data, not intuition. DR. LINDENFELD: Good. DR. MASSIE: I'm just going to go to our consultants first and then we will open it up to the whole committee. Bob and Jim, any comments, questions? DR. CODY: A couple of questions. Did any patients who participated in V-HeFT I participate in V-HeFT II? How many would you say, what percentage? DR. COHN: Yes. I think about 15 to 20 percent of V-HeFT I patients who survived V-HeFT I were recycled and re-randomized into V-HeFT II. 
This of course is a major reason why we have never merged these two databases because of the overlap of patients. We have done extensive analysis to see whether there was any difference in behavior of those patients who were re-randomized as compared to those new patients entered into V-HeFT II, and there appeared to be no interaction whatsoever. So, we feel comfortable that they can be treated as if they were the same subset -- from the same set of the population, but that does influence a couple of things in terms of age, for instance. They already were a few years older. DR. CODY: In terms of the very sophisticated statistical analyses that have been done and presented today, has this been factored in, or does it need to be factored in? I would have to defer that to people who know a lot more about statistics than I do. DR. QUINN: Actually I can answer that question, that the results were done both ways for V-HeFT II, both using all the patients that were randomized to that trial, as well as looking at the patients that had not been in the V-HeFT I study. The results were absolutely consistent using both methods. DR. CODY: What percentage of the patients in V-HeFT I and II were women? DR. QUINN: There were no women in the trial. It was all conducted in the VA hospital setting. DR. CODY: I raise this because of the current VA and NIH push to include women in heart failure trials in a more representative fashion. This is certainly an issue with at least one of the VA-sponsored heart failure trials that are currently underway. We are assuming that we could extrapolate these findings across genders. Is that a reasonable assumption? DR. COHN: Well, I guess your assumption, Jeff, is as good as mine. Certainly when one has looked at the response in women versus men in the trials where both groups have been included, such as the SOLVD trial, there appeared to be no difference in the therapeutic response. 
Women were not included in V-HeFT because we recognized we would have so few that it would not be possible to analyze them separately, so we confined the study to males. The extrapolation to the female population then is going to be a matter of judgment rather than of data. DR. CODY: I guess a final comment is, I think a very important statement that has been made by the presenters, and that is the need for prescribing information for this combination. What data exists to suggest or to guide people when to use BiDil instead of an ACE inhibitor? When do you use BiDil in addition to an ACE inhibitor, and can these findings of functional class 2 and 3 patients be extrapolated to functional class 4. DR. COHN: Your last comment is a very important one, and since there was an exercise entrance criteria in all of these patients, class 4 patients were substantially eliminated from the trial. There were a smattering of patients who were said to be in class 4 failure, and as you know, a patient might have been in class 4 failure last month and now is ambulatory and functional and gets included in the trial. Is he now a class 4 or is he a class 3? We argue about that all the time. But there really is little data in class 4 patients in this trial. There is a good deal of experience with the combination therapy clinically on hemodynamics in class 4 failure, but they were not included in this trial. I think your first issue was -- DR. CODY: Using BiDil instead of an ACE inhibitor or in addition to? DR. COHN: Yes, the place of this in therapy. Obviously, the labeling that is being requested would point this out as alternative therapy to an ACE inhibitor in patients who were not receiving an ACE inhibitor usually because of intolerance or perceived intolerance. 
We know that the analyses done of the use of ACE inhibitors in patients with heart failure continues to suggest that there is a large number of patients not receiving an ACE inhibitor who, on the basis of the data, should be receiving an ACE inhibitor. So, this would be alternative therapy for that group of patients who physicians choose not to use an ACE inhibitor. We are providing no data on this combination added to an ACE inhibitor, and we would not anticipate that that should be included in the labeling. Many of us in clinical practice use that combination because we have found anecdotally that it is effective. But there have been no systematic studies done of hydralazine-isosorbide dinitrate added to an ACE inhibitor to justify that as a labeling indication. DR. CODY: I agree with you that there are patients where we would use the combination, and generally those would be the patients who aren't doing well. They might be the functional class 4 patients who are not responding to an ACE inhibitor or the hydralazine nitrates, so we combine them. Where clinically, where this piling on concept is used for the sickest patients, do we have to have some special wording or recommendations about that? DR. COHN: Yes, I agree with you completely, Bob. That is really the way we have to focus this therapy based upon the data from V-HeFT. We have to limit the indication to what has been demonstrated in V-HeFT. I appreciate your comments, Bob. DR. MASSIE: Just one qualification and then Ray has a question. When you say in people who are not using ACE inhibitors, would it make more sense in people who have been tried on ACE inhibitors and have not tolerated them?
In order to best serve the educational function that what we are trying to do is get people to use ACE inhibitors and we know there are people in whom you can't, but not just in people who are not on them because that would be any heart failure patient who is newly diagnosed and hasn't yet had a chance to be treated. DR. COHN: Well, you know, you may be right. On the other hand, if you look at the ancillary endpoints such as exercise, and one had a therapeutic goal in an individual in whom prolongation of life, based on whatever other issues might be present in that individual, was not your primary emphasis, and your primary emphasis was to allow the patient to do a little more exercise, one might conceivably feel that in that instance the mortality benefit of Enalapril was not important to this patient. Now, these are judgmental issues that physicians have to cope with, so it is difficult to demand that all physicians give all patients with heart failure an ACE inhibitor. It is important to show them the benefit of ACE inhibitors so that they can choose to use those drugs in the appropriate patient population. So, it is a very nebulous kind of distinction, but I think physicians have to be given choices. DR. MASSIE: Ray? DR. LIPICKY: I have forgotten the operant policy during the studies with respect to how the dose was manipulated with respect -- DR. MASSIE: Can you speak a little louder, please? DR. LIPICKY: Sorry. I have forgotten the operant policy with respect to how dose was manipulated during the studies. Was it titrate to maximum tolerated dose with some upper limit? DR. COHN: Yes. The upper limit was 40 milligrams 4 times a day of isosorbide dinitrate, and 75 milligrams 4 times a day of hydralazine. It was a dual titration, and that is, both drugs were increased at subsequent visits until the patient achieved that higher dose. 
If headache, which was the major side effect, intervened, the dose could be either held or even reduced, and that is why the mean dose in V-HeFT I was about 240 milligrams of hydralazine, not 300, and the mean dose of isosorbide dinitrate was about 110 milligrams and not 160 milligrams reflecting that. DR. LIPICKY: But dose was increased for both. DR. COHN: For both. DR. LIPICKY: They were not changed independently. DR. COHN: No. Although if a side effect occurred, the physicians were encouraged to reduce the dose of the ISDN first because it was our impression that that was the more likely cause of headache. So, they might have reduced one and not the other. And sometimes they discontinued one and not the other. Now, we had a little trouble dealing with that discontinuation of one of the drugs because they were taking one and they weren't taking two. Knowing what we know now, it's possible you need to take both in order to get the beneficial effect. But it was all an intent-to-treat analysis anyway, so that analysis was not influenced by whether they did or didn't take both drugs. DR. LIPICKY: From the vantage point of instructions for use, and based on the experience, do you think it's a problem that one has to take both and does not have a choice in titrating one or the other, depending on adverse symptoms? DR. COHN: I think our data would suggest that if one wants to attain the benefit of this drug combination, one should use the two drugs. We have no way of analyzing what the optimal dose of each of those two combinations is, as you know, and this was not a dose response study. So, we are left with a strategy for therapy, a strategy for reducing the dose if side effects occurred, and when one used that strategy, we reduce mortality. Now, I think from a labeling standpoint, all we can do is recommend that strategy in the labeling, knowing full well that that may not be the optimal strategy or the only strategy, but the only strategy we studied. DR. 
MASSIE: Ray, while you have the microphone, before we open the general discussion, maybe we can get you to clarify something for us. The idea of a combination drug as opposed to the two components of the drug. I think you started hinting on that point a bit. What would the FDA see as a reason for approving a combination drug when we have the two components, and I guess then I would like Jay to follow up and tell us why he thinks it is better to have this combination rather than the two components, what advantage it provides, because ordinarily I know in combination drugs where we have dealt with them for hypertension, you have to show both components are effective and then there is some advantage to having the two together. Maybe, Ray, you could tell us why we should be thinking about this. DR. LIPICKY: It could be a very long discussion but I think the short discussion is that if one has a trial where one thinks there has been documentation of an alteration in irreversible harm, and one knew that, say, it was a single chemical entity, but it was a racemate, nobody would have any problems whatsoever in saying that the drug, a racemate, did it. I think that if you consider this to have been documented, to have an effect on something that is irreversible, well, then you are stuck, or not stuck. It is appropriate to consider the combination as a single drug. Ordinarily one would expect to be able to document that drug A plus drug B has a bigger effect than either drug A or drug B alone at the appropriate doses. But ordinarily one would be concerned about that if one could in fact do studies that would allow one to determine that. It is unlikely to be able to do them for irreversible harm, especially with a study that is 20 years old. DR. TEMPLE: Barry? DR. MASSIE: Can I just let Jay respond? DR. TEMPLE: Barry? I'd like to add something. DR. MASSIE: Go ahead. DR. 
TEMPLE: We have a combination policy that doesn't distinguish really between taking the two drugs separately and putting them in the same tablet. That is theoretically of no concern. It is never a benefit to have them in one tablet except convenience. There can't be a medical benefit from taking them together as both separately. The question is, do they each contribute, as Ray said. The longstanding policy is you have to demonstrate that each component makes a contribution. We have, however, tried to confront the question of, suppose somebody shows you that you've done something important with two drugs and it is really not possible anymore to test the two components because you can't have the placebo group to do it. What we have said is, if there is a plausible basis for having both components, on theoretical grounds we would sort of live with the discomfort of approving the combination if it had an important effect on survival or irreversible morbidity or something like that. DR. COHN: I think the answer to your question, Barry, is a complicated one. Let me put it this way. If one looks at the use of this drug combination in its generic form, out in the community, there is very little use of hydralazine. There is substantial use of nitrates. ISDN is widely employed in heart failure, without labeling, and without indication and without marketing. Hydralazine is not used, perhaps for several reasons. Number one, physicians don't like writing so many prescriptions because they would have to write two separate prescriptions. Patients do not like taking so many pills. And there is no dosage form available of hydralazine which matches the dosage form used in V-HeFT. So, there are several impediments to the use of hydralazine. The nitrate use suggests that physicians are very comfortable using ISDN because they are comfortable with that drug. 
And they are using it for reasons which are mysterious because there is no existing database which suggests that ISDN should be used in patients with heart failure, other than the V-HeFT database, which we now believe strongly suggests, based upon the new data that I showed you, that hydralazine should be used along with ISDN. So, having labeling for BiDil, if it will help physicians to understand the application, the dosing and the usefulness of these two drugs and can do it in a single prescription, with a single tablet that patients will be much more comfortable taking, I think it can have a profoundly favorable effect on the management of this syndrome because, despite the fact that all the guideline committees recommend using this combination, it is not being used. There has to be some explanation for that, and that's the best explanation I have, is what I have given you. DR. MASSIE: Okay. Well, what we are going to do is, Jeff, since he has not gone, and then we will start from the right. DR. BORER: Most of the questions I had have been answered, but I need a clarification here, if I can have one, please, and then based on the response to that I have several questions I would like to pose. First of all, I would like a clear statement of what is being requested for approval here. What is the indication? Are we talking about approving the combination for reducing mortality rate in patients with congestive heart failure, or are we being asked for approval of the combination for the treatment in general of people with heart failure because at least three things look like they get better, mortality rate and maybe exercise tolerance and maybe ejection fraction? What indication is the sponsor seeking here in the approval process? DR. COHN: Well, I guess if you are speaking of the sponsor, maybe we should turn to the sponsor. Cesare, do you want to respond to that as the sponsor? DR. 
ORLANDI: The indication that we are pursuing is for treatment of congestive heart failure in addition to digitalis and diuretics in patients actually not taking ACE inhibitors. This is based, indeed, on data that we feel are convincing, that are mortality data and ejection fraction data. DR. BORER: Okay. So, you are not specifically suggesting that the drug is indicated for reduction of mortality, but rather that it is indicated in general for treatment of patients with heart failure. Is that right? DR. ORLANDI: We feel that we have demonstrated actually an effect on mortality as well. DR. BORER: I think that Jay is absolutely right, of course. You can't be penalized for not doing what you didn't know to do at the time you did it because the data were not available. I come to these data with a sort of a general bias in favor of the combination being good. However, we are being asked to approve the combination for something here. Now I understand that it is for the general treatment of patients with heart failure, particularly because of mortality reduction. And that may be a good thing to do. But if we are going to do that then obviously everybody has to feel comfortable with the consistency and reproducibility of the effects, and therefore there are a number of statistical considerations that I would like some, again, clarification about here. I don't think that the p values are ironclad rules that one must follow because they say this or that, and I know the FDA regulations aren't written that way either. They are guidelines. On the other hand, we have information really from two trials. One of them was placebo-controlled. As I look at the data, the general Gestalt is that ejection fraction clearly is improved when you give the combination. That is a good thing. Exercise tolerance, well, you know, it doesn't really quite make it statistically but it goes in the right way. That is convincing. 
And then we have mortality, which is of course a very compelling argument if it is a reasonable one. But that is where this desire I would have to be able to be convinced of the consistency and reproducibility of the results begins to founder a little bit because of statistical considerations that I am not sophisticated enough to really understand. The way I see it, we have a hypothesis that allowed only a one-directional response, maybe reasonable, so we used a one-tailed t-test. We say that there is no penalty for looking at the data many times if you passed a predetermined boundary that was determined by the Data and Safety Monitoring Committee at an early look. That may be right, but I have never heard that before, but maybe it is right. We have multiple pre-specified endpoints and we have no penalty for looking at those, even though they presumably could have gone either way. And that is okay because mortality is so important. But we only have one placebo-controlled trial and then we use a second trial where the placebo is present but it's a historical control. So, all of that is not the way we are accustomed to seeing data, and I would like to have clarified for me whether it is really legitimate to say we don't have to pay a penalty after a pre-specified stopping rule is passed but we decide to go on anyway because we wanted to see if the result was consistent over time. If it is really legitimate not to pay a penalty when we talk about consistency, if there are multiple pre-specified endpoints but there is one that really looks real good. What's the answer to that? Is there a statistician? Lloyd, perhaps? DR. MASSIE: We've heard the answer from the sponsor. I would like to have the answer from our two committee statisticians. DR. COHN: Could I just add one point here because I didn't bring this up before. I am reading now from the V-HeFT I protocol under Sample Size and Duration. 
"The primary objective of the study is to determine if the survival time is increased on vasodilator therapy as compared to the survival time in the placebo group." That was the primary endpoint. So, don't allow all these other endpoints to dilute that out. It is the primary endpoint. DR. BORER: That is a good point and I accept that. DR. COHN: I would like the response from the agency, but just to remind you, the reason for imposing a penalty for multiple looks is that you always have the opportunity to stop the trial if you surpass the guidelines for the endpoints and the multiple looks. If you surpass the endpoint and don't stop the trial, you really have eliminated the need for any more penalty because you haven't responded to it in the first place. So, the multiple looks have not really contributed to your final decision. That is a nuance, and I would love to hear responses from the statisticians on that, but just intuitively it seems to me that makes sense. DR. MASSIE: Lem? DR. MOYE: I have somewhat a different view. (Laughter.) DR. MOYE: The purpose of corrections for multiple looks is to ensure that you have preserved the type 1 error at an acceptable bound. The type 1 error, I think, is really a cause for lots of concern and lots of confusion. From my way of looking at it, the type 1 error is a matter of population protection. The experimenters have an obligation to protect the population from which they derive their patients and the derived sample. They protect the derived sample, of course, by taking care of the patients as best they can. They protect the population by ensuring that they don't inflict unnecessary false positives or false negatives. The way to provide the insurance for false positives is the alpha level. For every decision that is made concerning a hypothesis test, you have the potential for sampling error propagation, and there are two ways to handle that. The far superior way, again in my opinion, is for the investigators to handle it. 
That is to say, the investigators must say with absolute clarity what they are going to do with the primary endpoint, how they are going to test it, and what they are going to do with secondary endpoints. They must provide, if you will, a decision path, how they are going to work through the collection of endpoints that they have. They are in the best position to do it because they can do it prospectively. They have an excellent fund of knowledge to do it, but I must confess they are not used to doing that, and perhaps the reason is that we have not asked them to do that. Because we haven't, we find ourselves again in the position of trying to make some determination and some post hoc correction of these accumulated decisions. I think if the investigators surrender their mandate, because that's what they have in the beginning, for controlling these alpha errors to us retrospectively, then it is up to us to come up with our own. My personal one is a very conservative one which penalizes investigators for each decision they make, so that in this circumstance, where the type 1 error at the 2-year interim analysis is very small, you nevertheless accumulate some error because that decision may have been wrong. You accumulate alpha from that and you move on, so that as one progresses through the secondary endpoints, the alpha eventually accumulates. You stop when you reach the bound, whatever that bound happens to have been. Typically it's at the .05 level. So, I am arguing, number one, for a prospective plan for the spending of alpha, but in its absence -- and most times I am afraid it is absent -- a very conservative post hoc plan for the accumulation of alpha, and that way we can ensure that the probability of making at least one type 1 error from all of them is acceptably small for the population at large. DR. MASSIE: Ralph? DR. D'AGOSTINO: I agree very much with the spirit of what was just said, but I would like to add a couple of comments to it.
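As an illustration of the conservative post hoc accumulation Dr. Moye describes, the following Python sketch sums the alpha charged to each decision in order and stops before the running total would pass the bound. The function names, the rule of stopping before the bound is exceeded, and every alpha value are hypothetical assumptions for illustration; none of them come from the V-HeFT analyses. The second function shows why the accumulation matters: for independent tests, the chance of at least one type 1 error grows with each added decision.

```python
from typing import List, Sequence, Tuple


def accumulate_alpha(
    decisions: Sequence[Tuple[str, float]], bound: float = 0.05
) -> Tuple[List[str], float]:
    """Walk an ordered list of (label, alpha) decisions and sum the alpha.

    Assumption: a decision is admitted only if the running total stays at or
    below the bound; the first decision that would exceed it ends the walk.
    Returns the admitted labels and the total alpha they consumed.
    """
    total = 0.0
    admitted: List[str] = []
    for label, alpha in decisions:
        # Small tolerance so floating-point rounding does not reject a
        # decision that lands exactly on the bound.
        if total + alpha > bound + 1e-12:
            break
        total += alpha
        admitted.append(label)
    return admitted, total


def familywise_error_independent(alphas: Sequence[float]) -> float:
    """Probability of at least one type 1 error across independent tests."""
    p_no_error = 1.0
    for a in alphas:
        p_no_error *= 1.0 - a
    return 1.0 - p_no_error


if __name__ == "__main__":
    # Hypothetical alpha charges, not taken from any trial.
    plan = [
        ("interim mortality look", 0.005),
        ("final mortality", 0.030),
        ("secondary endpoint A", 0.010),
        ("secondary endpoint B", 0.010),
    ]
    admitted, used = accumulate_alpha(plan)
    print("admitted:", admitted)
    print("alpha used:", round(used, 3))
    print("two tests at .05:", round(familywise_error_independent([0.05, 0.05]), 4))
```

Under these assumed charges the fourth decision would push the total past .05, so the walk ends after the third. A prospective plan would fix the order and the charges before the data are seen, which is the approach Dr. Moye says he prefers.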
I think this idea of saying you cross the boundary and then you no longer pay a penalty, well, say you cross the boundary and you find later on that your mortality for the full study isn't significant. Do you still believe it is significant because you crossed the boundary earlier? Do you start running into making decisions later on that you will change your mind or you will do different things, depending on what those later analyses produce, and you have to have some kind of way of guiding yourself in terms of alpha -- I don't like this notion of alpha spending, but what do you believe about it as you start looking at the data in a further fashion? I think that -- and this is a good example -- you have marginal significance with the 2-year mortality. I mean, why isn't it .001 so there was no confusion? It is hovering around. You fuss with one analysis and it crosses over the significance. Another analysis and it becomes slightly better for you. There is not a very comfortable feeling about that, and these multiple looks at the data really can't be just dismissed as you have protected yourself earlier. So, I think that we really are in a situation that we can't say you crossed the boundary, therefore you forget about the alpha. I don't think that really is the case here. And I do think that this question that you raise -- and I was trying to say somewhat the same. You come into this study with certain notions that you want to look at survival, and you've seen something that looks like a 2-year survival. Will you see it again? I am not sure you will see it again. I am not sure I am convinced by the data I have seen here. And I realize that survival is the major thing, but you still carry with you six primary outcomes and what are you going to do with those? Are you just going to ignore them? Those are all sort of look-see. I am not going to think about any significance on them?
You certainly are, and once you start playing that game, I think you have to say, how am I going to use my alpha, how am I going to be able to sit back and say, I really believe what I have. And I think that we are left in a situation where we see the survival but I would like to see another test of it. DR. MASSIE: Jeff, do you have more? DR. BORER: No. DR. MASSIE: JoAnn, go ahead. DR. LINDENFELD: Just in this same vein, I wonder when we use a second analysis to assess mortality, when we know that the first analysis has been borderline, does the second type of analysis need to be stricter than ordinary criteria, once we know that the initial analysis was of borderline significance? DR. D'AGOSTINO: Are you asking me that? DR. LINDENFELD: Yes. DR. D'AGOSTINO: I'm not sure I know the question. Are you saying if they put a second study together? DR. LINDENFELD: I'm sorry. In the initial study, once you know the initial method of statistical analysis was of borderline significance, and we go to a second, already knowing that the first was borderline. DR. MASSIE: This is the log-rank versus the Cox? DR. LINDENFELD: Versus the Cox, right. DR. D'AGOSTINO: No. This is the notion, I think, that was raised in the question, are you looking for the test that is going to do the best for you? DR. LINDENFELD: Exactly. Shouldn't the second be perhaps stricter than if it was primary -- DR. D'AGOSTINO: I think so and I think that there is real justification for that. Again, I don't see anything wrong with, say, doing an analysis that makes no adjustment for baseline variables, seeing where that goes, and then doing a sharper analysis that includes covariates to get rid of some of the variability. I am not sure it's imbalances that you need to correct, but you want to reduce some of the variability.
But as you progress through that, if it is stated in the protocol that the real analysis that you are going to put your final weight on is the Cox regression that does all the covariates, then I think you can wait to see what that produces. But if your protocol says I am going to look at the log-rank and maybe look at the Cox, or it is unclear what you are going to do with the Cox, and then you really move to the Cox with the hope that it is going to give you some significance because the log-rank didn't, I think you are in a situation where you are beginning to doubt how much certainty you can get from the study. DR. MASSIE: I think the committee has been restive and also very cooperative in not interrupting. We were scheduled for a break, but I would like to try to make a pass-through here and continue the discussion, starting down there. Udho? DR. THADANI: I have a couple of comments and a couple of questions. I want to reiterate, the study of V-HeFT I and V-HeFT II was in class 2 and 3 failures, and only in males, so the application of what we say is only to those groups. There is no data on top