June 23, 2004
Mathematical Statistician (Barbara Krasnicka) HFZ-542
Division of Biostatistics, OSB
Statistical Review of PMA P010012/S026: Expanded indications for use of Guidant Cardiac Resynchronization Therapy-Defibrillation (CRT-D) System, Guidant (04/15/04)
Owen Faris, Ph.D., HFZ-450
Division of Cardiovascular Devices, ODE
Through: Director, Division of Biostatistics, OSB ____
The Comparison study of Medical Therapy, Pacing and Defibrillation in Heart Failure (COMPANION) was conducted to demonstrate the safety and effectiveness of the optimal pharmacologic therapy (OPT) plus cardiac resynchronization therapy without defibrillation (CRT-P; Guidant device CONTAK® TR™ - cardiac resynchronization therapy pacemaker) and OPT plus cardiac resynchronization therapy with defibrillation (CRT-D; device CONTAK® CD® - cardiac resynchronization therapy defibrillator) through the comparison with OPT alone. It was expected that compared to OPT alone, CRT-P or CRT-D could reduce
· combined all-cause mortality and all-cause hospitalization (primary effectiveness endpoint)
· all cause mortality (secondary endpoint)
· cardiac morbidity (secondary endpoint)
and could improve
· exercise performance (sub-study)
in patients suffering moderate to severe chronic heart failure (CHF) with left ventricular dysfunction and intra-ventricular delay.
The study was approved by FDA in October, 1999, under IDE G990214. After that, the protocol was modified three times (June 2000, May 2001 and May 2002).
The Guidant CONTAK CD® CRT-D system was originally approved in May, 2002 (PMA P010012) based on a separate CONTAK CD study with patients who were NYHA class II, III and IV at the time of implant. A subgroup of this original patient population who remained in NYHA Class III/IV at the end of the post recovery period was used to prove the effectiveness of the device in the treatment of heart failure. Also, the results from the CONTAK CD study supported the safety for the whole device system and the effectiveness of the ICD (Implantable Cardioverter Defibrillator). In addition, according to the FDA approval letter, a post-market follow-up study should be conducted to evaluate the long-term safety and effectiveness of the system.
Currently, on the basis of the COMPANION study, the sponsor is seeking approval of an extension in the label for the CRT-D and is proposing the following expanded indication statement [proposed at the time of this review – see lead reviewer memo for the sponsor’s final proposed language]:
‘Guidant Cardiac Resynchronization Therapy Defibrillators (CRT-Ds) are indicated for reduction of all-cause mortality and symptoms of moderate to severe heart failure (NYHA III/IV) in patients who remain symptomatic despite stable, optimal heart failure drug therapy, and have left ventricular dysfunction (EF </= 35%) and QRS duration >/= 120 ms. Guidant CRT-Ds provide ventricular antitachycardia pacing and ventricular defibrillation for the treatment of life-threatening ventricular arrhythmias.’
2. Scope/Design of the COMPANION Study
The COMPANION trial was a prospective, multi-center, randomized study of patients with moderate or severe heart failure (NYHA Class III/IV). As mentioned above in Section 1, the purpose of the study was to demonstrate that CRT-P and CRT-D were safe and effective in reducing combined all-cause mortality/hospitalization and all-cause mortality in patients with moderate to severe heart failure (NYHA Class III/IV) resulting from left ventricular dysfunction (EF </= 35%) and QRS duration >/= 120 ms. However, in this submission the sponsor focused only on the comparison of CRT-D treatment vs. OPT in its statistical analysis and presentation of safety and effectiveness of the device, as described in the agreement meeting (see lead review memo). The clinical trial followed a group sequential design. There were four interim analyses; the number of interim analyses was not specified in advance. The stopping boundaries were defined using the Lan-DeMets alpha-spending function, a modification of the O’Brien-Fleming sequential design. The total alpha spent across repeated analyses did not exceed the nominal type I error, 0.03. The Data and Safety Monitoring Board (DSMB) met 5 times (at approximately six-month intervals) to review the trial progress and the data collected up to a given time point.
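The behavior of an O’Brien-Fleming-type Lan-DeMets spending function can be illustrated with a short sketch. It uses the standard form alpha(t) = 2 - 2*Phi(z_(alpha/2) / sqrt(t)) with the overall two-sided alpha of 0.03 stated above. The information fractions below are hypothetical; the review does not report the actual fractions at the interim looks, so the printed values are illustrative only.

```python
from statistics import NormalDist


def obf_alpha_spent(t, alpha=0.03):
    """Cumulative two-sided alpha spent at information fraction t (0 < t <= 1)
    under the O'Brien-Fleming-type Lan-DeMets spending function."""
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction must be in (0, 1]")
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)  # critical value for the overall alpha
    return 2.0 * (1.0 - nd.cdf(z / t ** 0.5))


# Hypothetical information fractions (ASSUMPTION: not taken from the submission).
fractions = [0.2, 0.4, 0.6, 0.8, 1.0]
spent = [obf_alpha_spent(t) for t in fractions]

for t, a in zip(fractions, spent):
    print(f"t = {t:.1f}  cumulative alpha spent = {a:.6f}")
```

Very little alpha is spent at early looks, and the full 0.03 is reached only at the final analysis, which is why early stopping under this design requires strong evidence.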
The study enrolled 1,638 patients at 128 centers in the United States, but only 1,520 patients were randomized to OPT, CRT-P, and CRT-D, respectively, in a 1:2:2 ratio. Because of changes in patient status, 118 initially enrolled patients were no longer eligible for randomization. The first patient was enrolled into the study in January, 2000 and the last one in November 2002. The Steering Committee and Guidant halted enrollment into the COMPANION trial on November 21, 2002, after being informed by the independent DSMB that the protocol pre-specified boundaries for trial termination (for both the primary mortality/hospitalization endpoint and the secondary mortality endpoint) had been crossed in November 2002. When the trial was stopped, 941 “potential” primary endpoint events had been identified, i.e., the target number of primary events had been approximately reached. Additionally, withdrawal and crossover rates were escalating, and the DSMB was concerned that a longer trial period might increase study contamination (Amendment to PMA, May 14, 2004, Summary Notes, November 18, 2002, page 2).
3. Data Collection and Quality
The results of a clinical trial strongly depend on the quality of study data set. Quality of data is influenced by clear definitions of response variables and methods used for data collection, editing and assessment. Clear definition of the primary effectiveness endpoint should be included in the protocol and written in such a way that all investigators could apply it in a consistent manner throughout the trial.
The primary effectiveness endpoint was modified a few times during the COMPANION trial. Based on the protocol IDE #G990214, the primary effectiveness endpoint was defined as “all-cause mortality and all-cause hospitalization, where all-cause mortality is defined as death from all cause and all cause hospitalization is defined as admission to a hospital for any reason. In addition, this endpoint will include emergency room visits (or unscheduled office visits) that result in treatment with intravenous (IV) inotropes or vasoactive drugs”. This definition was revised later. The last modification of the all-cause hospitalization definition took place about 10 months before the end of the study. All-cause hospitalization was then redefined as a hospitalization for which the discharge date differed from the admission date, or a hospitalization (outpatient admission) longer than 4 hours during which the patient received IV inotrope/vasoactive therapy.
It is worth noting that some hospitalizations of study patients were not accompanied by the Hospitalization Case Report Form (the PMA Amendment, May 14, 2004, page 9) and some of them were adjudicated from the source documents. The question is whether the sponsor was able to find all hospitalizations without case report forms.
Additionally, ‘the independent statistical group recommended and the Steering Committee implemented a policy of approaching withdrawn patients to sign a consent allowing collection of vital status and hospitalizations occurring on or before 11/30/02’. This approach of collecting additional data on the withdrawn patients raises many questions concerning the accuracy of the information.
In summary, the clinical trial data set raises the following concerns.
a. The CRF (Case Report Form), especially for the hospitalization visit, did not exactly reflect the definition of the primary effectiveness endpoint. Also, the definition was changed a few times during the study. Additionally, the collection of hospitalization events was based only on admission and discharge dates, without taking into account the exact time. Therefore, the capture of hospitalization events longer than 4 hours during which the patient received IV inotrope/vasoactive therapy was based on the follow-up CRF and the duration of the IV therapy.
b. Some hospitalization events did not have a CRF (Case Report Form). Therefore, such events may not be captured as hospitalizations and it is unknown whether hospitalization events for the primary effectiveness endpoint are missing.
Therefore, the hospitalization data that is essential for the primary effectiveness variable is of uncertain reliability, and raises a question about the accuracy of the statistical results for the difference between CRT-D and OPT groups on the primary endpoint.
4. Patient disposition
In the study under review, 1520 patients were randomized in the 1:2:2 ratio to OPT alone (optimal pharmacological therapy; 308 patients), OPT + CRT (CRT-P; 617 patients), and OPT + CRT + ICD (CRT-D; 595 patients), respectively. In the CRT-D group, implantation was successful in 91% of cases (541 of 595). Table 1 gives a summary (by treatment group) of patient disposition over time through 12 months after randomization.
Table 1. Patient Follow-up Status over Time (The sponsor’s Table)
Because the study stopped in November, 2002, some patients were followed up for only a few weeks. It is worth noting that a relatively high number of patients were lost to follow-up or withdrew from the study. The withdrawal rate was especially high in the OPT group. At 12 months, it was 21% (64 subjects) in the OPT group, but only 4% in the CRT-D group. The reasons for so many withdrawals were probably the status of the patient’s health, dissatisfaction with the treatment received, and effective marketing of the CRT-D. Because of the many withdrawals and the imbalance between the two treatment groups in the number of withdrawn patients, it was difficult to minimize biases in the statistical analyses. Statistical conclusions drawn from these data may be problematic.
5. Main study endpoints and hypotheses
As mentioned before, the primary effectiveness endpoint was a composite of death from any cause and hospitalization for any cause. According to the final definition, the all-cause hospitalization referred to a hospitalization with different dates of admission and discharge or hospitalization (outpatient admission) longer than 4 hours during which patient received IV inotrope/vasoactive therapy. All-cause mortality and cardiac morbidity were chosen as the secondary endpoints.
The sponsor’s intention was to show that the biventricular pacing with defibrillation leads to a reduction of combined all-cause mortality and all-cause hospitalization, cardiac morbidity, and all-cause mortality. The hypotheses related to the primary effectiveness endpoint were specified, without referring to the event-free rate at any time point of evaluation or event-free distribution during the whole course of study, as follows:
H0: Total mortality and hospitalization in the CRT-D group is equal to total mortality and hospitalization in the OPT group and
Ha: Total mortality and hospitalization in the CRT-D group is not equal to total mortality and hospitalization in the OPT group.
In the sample size calculation, the sponsor did mention that with enrollment of 2200 patients in two years and the follow-up for three years, the study would have at least 80% power to detect a 25% reduction in relative risk (CRT-D group vs. OPT group) with respect to the combined all-cause mortality and hospitalization as well as all-cause mortality. Differences between treatment groups with respect to the primary effectiveness endpoint and mortality would be evaluated using the log-rank tests and the Kaplan–Meier product limit estimator. Cox proportional-hazard regression model would be applied to estimate hazard ratios and 95% confidence intervals. Thus, the study was designed to detect a 25% reduction in relative risk (CRT-D group vs. OPT group) for the combined all-cause mortality and hospitalization as well as all-cause mortality alone with alpha 0.03, but the hypothesis given in the protocol did not precisely state this objective.
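To give a sense of scale for the stated design parameters, the sketch below applies Schoenfeld's approximation for the number of events needed by a two-arm log-rank comparison. It is not the sponsor's calculation, which is not described in this review. It assumes that the "25% reduction in relative risk" is read as a hazard ratio of 0.75, that the CRT-D:OPT allocation is 2:1, and that no adjustment is made for the interim analyses.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha, power, frac_treated):
    """Schoenfeld approximation: events needed for a two-sided log-rank test.

    frac_treated is the share of patients allocated to the treated arm.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1.0 - alpha / 2.0)
    z_beta = nd.inv_cdf(power)
    p1 = frac_treated
    p2 = 1.0 - frac_treated
    return (z_alpha + z_beta) ** 2 / (p1 * p2 * log(hazard_ratio) ** 2)


# ASSUMPTIONS: HR = 0.75 stands in for "25% reduction in relative risk";
# 2:1 allocation (CRT-D vs. OPT); no group-sequential inflation factor.
events_needed = required_events(hazard_ratio=0.75, alpha=0.03,
                                power=0.80, frac_treated=2.0 / 3.0)
print("approximate events needed (CRT-D vs. OPT):", ceil(events_needed))
```

The figure applies only to a single pairwise comparison under these assumptions and should not be compared directly with the trial-wide event target.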
6. Statistical analyses
a.1 Across the centers
The sponsor did not perform any statistical analysis evaluating baseline characteristics of all patients enrolled across the centers. One reason for this was that the number of patients across 116 centers varied from 1 to 25 (with median 6), with one outlier center that enrolled 41 patients. Patients from 16 (14%) centers belonged to only one group of the study, i.e., 4 and 12 centers had patients only from the OPT and CRT-D groups, respectively. The sponsor tested the influence of centers on mortality and claimed that no significant effect of centers was detected.
a.2 Across the treatment groups
It seems that OPT and CRT-D groups were on average well balanced in terms of almost all collected baseline characteristics. OPT group was nominally slightly older (68 vs. 66) and the percentage of patients from NYHA IV class was higher (18% vs. 14%, p= 0.13) in this group.
As mentioned previously, the conduct of the trial led to problems in interpretation of the primary effectiveness endpoint.
The statistical analyses for the primary effectiveness endpoint were performed by this reviewer with patients censored at withdrawal date or at the end of the study. The ‘elective’ hospitalizations were not included as endpoints. By excluding post-withdrawal information (that was collected after stopping the study, see §3) regarding deaths or hospitalizations, some additional noise was avoided. The dataset used by this reviewer contained 202 and 386 primary events in the OPT and CRT-D arms, respectively. In the original data set used by the sponsor, there were 216 and 390 primary events in the OPT and CRT-D groups, respectively.
Based on the Kaplan-Meier estimates, the 360-day event rate of the primary effectiveness endpoint was 66% in the OPT group as compared to 56% in the CRT-D group. Changes in event rate over time (by treatment group) are given in Table 2. The smallest difference (1-2%) in event rate between the two groups occurred at 200 days and during the first several days after randomization, and the largest (10%) occurred at 400 days.
The estimated median times free from a primary effectiveness endpoint event are 274 and 218 days for the CRT-D and OPT groups, respectively.
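The event rates and median event-free times quoted above are Kaplan-Meier quantities. A minimal sketch of the product-limit estimator follows, with patients censored at withdrawal or study end as described. The follow-up times and event indicators are invented toy values, not COMPANION data.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in days; events: 1 = endpoint event, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # events at time t
        leaving = 0  # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def median_event_free_time(curve):
    """First time at which the event-free estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy data (ASSUMPTION: illustrative values only).
toy_times = [30, 60, 60, 90, 120, 200, 250, 300]
toy_events = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(toy_times, toy_events)
median_days = median_event_free_time(curve)
for t, s in curve:
    print(f"day {t}: event-free {s:.3f}, failure {1.0 - s:.3f}")
print("median event-free time:", median_days)
```

The failure function tabulated by treatment group is one minus the event-free estimate at each time point.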
Table 2. Kaplan-Meier estimates of the failure functions for the CRT-D and OPT group (reviewer’s table)
The Kaplan-Meier curves given in Figure 1 demonstrate some separation of both curves over time. However, the hazard functions (intensity functions) (Fig. 2) for the two groups may not be parallel. Fig. 2 shows that the two hazard curves (for CRT-D and OPT groups) do not appear to be clearly different. The curves are clearly separated (in the right sense) only in some period of time about one year after randomization. In all other periods of time, the plots fluctuate and interweave. However, it is worth noting that the hazard estimates for the later years are based on the relatively small number of observations and may be unreliable.
Fig. 1. Estimated survival curves for CRT-D and OPT groups (the reviewer’s figure)
Fig. 2. Life-table estimates of the hazard functions for the CRT-D and OPT groups (reviewer’s figures)
In its statistical analysis of the primary effectiveness endpoint, the sponsor used the log-rank test and the Cox model. Applying the sponsor’s methods to the data set with the previously mentioned censoring, the reviewer obtained the following results:
I. For the log-rank test, the p-value equals 0.014.
II. For the Cox model, the hazard ratio is 0.81 (i.e., a 19% reduction in the relative risk), with 95 percent confidence interval 0.68 to 0.96, p = 0.015.
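For reference, the two-sample log-rank statistic can be sketched as follows. At each distinct event time, the observed number of events in one group is compared with the number expected under equal hazards, and the squared sum is scaled by its hypergeometric variance. The data are invented toy values, and this is a generic textbook form rather than the sponsor's or the reviewer's actual program.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square, p-value) on 1 df."""
    rows = [(t, e, 0) for t, e in zip(times_a, events_a)]
    rows += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in rows if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in rows if u >= t)  # at risk, both groups
        n_a = sum(1 for u, _, g in rows if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in rows if u == t and e == 1)  # events at t
        d_a = sum(1 for u, e, g in rows if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1.0 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Toy data (ASSUMPTION: illustrative values only).
a_times, a_events = [5, 8, 12, 20], [1, 1, 1, 0]
b_times, b_events = [10, 15, 22, 30], [1, 0, 1, 1]

chi2, p_value = logrank_test(a_times, a_events, b_times, b_events)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

The test weights all event times equally and is most powerful when the hazards are proportional.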
The true survivor function curves do not cross if the two hazard functions are proportional. However, non-crossing is a necessary but not a sufficient condition for proportionality of the hazard functions. The sponsor did not provide evidence that an analysis checking this critical assumption was performed. In fact, the hazard functions shown in Fig. 2 and the Schoenfeld residuals (which are not presented here) call into question the proportionality assumption for the primary effectiveness endpoint data. Since the key assumption of proportionality underlying the Cox model and the log-rank test may be questionable, the conclusion that CRT-D reduced the relative risk of the primary endpoint by 19% when compared to OPT may not be correct, even though the survival curves (Fig. 1) for the two groups are different.
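The idea behind a Schoenfeld residual check can be sketched for a single binary treatment covariate. At each event time, the residual is the treatment indicator of the patient who had the event minus its risk-weighted average over patients still at risk. Under proportional hazards the residuals show no trend over time, while a systematic drift suggests a time-varying treatment effect. The sketch computes unscaled residuals at a fixed coefficient. The coefficient (log of 0.81) and the data are illustrative assumptions, not the reviewer's actual computation.

```python
from math import exp, log


def schoenfeld_residuals(times, events, treated, beta):
    """Unscaled Schoenfeld residuals for one binary covariate.

    Returns [(event_time, residual)]; beta is the Cox coefficient (log hazard
    ratio), supplied rather than estimated in this sketch.
    """
    rows = list(zip(times, events, treated))
    residuals = []
    for t, e, x in sorted(rows):
        if e != 1:
            continue
        risk_set = [xj for u, _, xj in rows if u >= t]  # still under observation
        denom = sum(exp(beta * xj) for xj in risk_set)
        expected = sum(xj * exp(beta * xj) for xj in risk_set) / denom
        residuals.append((t, x - expected))
    return residuals


# Toy data (ASSUMPTION: illustrative values only; treated = 1 mimics CRT-D).
toy_times = [5, 8, 10, 12, 15, 20, 22, 30]
toy_events = [1, 1, 1, 1, 0, 0, 1, 1]
toy_treated = [1, 1, 0, 1, 0, 1, 0, 0]

residuals = schoenfeld_residuals(toy_times, toy_events, toy_treated, log(0.81))
for t, r in residuals:
    print(f"day {t}: residual {r:+.3f}")
```

In practice the residuals are scaled and smoothed against time, and the coefficient is the fitted Cox estimate rather than a fixed value.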
In summary, the primary effectiveness hypotheses were not precisely specified, but in the protocol the sponsor wrote that the objective of the study was to demonstrate that OPT combined with the CRT-D device was superior to OPT alone in reducing all-cause mortality and all-cause hospitalization (page 5). In the power calculation paragraph (page 15), the sponsor mentioned a 25% reduction in the relative risk.
Generally speaking, the primary effectiveness endpoint ‘survival functions’ for the CRT-D and OPT groups are different at p = 0.025 (Wilcoxon test). However, the statistical estimates of the magnitude of the difference between the two treatment groups may not be accurate and could be biased for the following reasons:
· The primary effectiveness endpoint definition changed during the study (see §3)
· The proportionality assumption may not be satisfied
· Open-label study
· The censoring may be informative (it may not be independent of the occurrence of the event, because some patients withdrew from the study due to worsening of their health status).
For the purpose of the mortality analysis, data on patients who underwent heart transplant or whose vital status was not known at the end of the study were censored on the day of cardiac transplantation or on the date of the last known contact, respectively. This means that the analysis was performed on an ‘extended’ data set that included data on some withdrawn patients (see §3). Post-withdrawal collection of information on death seems to be ‘cleaner’ than the similar recapture of data on hospitalizations. Table 3 shows the summary of the mortality rate by treatment groups over time.
Table 3. Mortality Rate over Time (Reviewer’s Table)
The mortality rates at 30 days after randomization are similar for the CRT-D and OPT groups: 1.2% (7 pts.) in the CRT-D group, as compared to 1% (3 pts.) in the OPT group (p=0.89). During the one-year follow-up period, 112 (19%) and 74 (24%) patients were censored in the CRT-D and OPT groups, respectively. Based on the Kaplan-Meier estimates, the one-year death rate was 12.11% for the CRT-D group and 18.91% for the OPT group. Therefore, the difference in survival at one year is 6.8 percentage points. The change in death rate over time by treatment group is given in Table 4. During the first 150 days after randomization, the differences in death rates between the two groups are small (about 2%).
Table 4. Kaplan-Meier estimates of the death rate for the CRT-D and OPT group over time (the reviewer’s table)
The effect of the CRT-D therapy on the death from any cause is shown in Fig. 3.
Fig. 3. Kaplan-Meier estimates of the survivor functions and their confidence limits for the CRT-D and OPT groups (reviewer’s figure)
The sponsor stated that the CRT-D was associated with a significant (36%) reduction in the risk of death when compared to the OPT patients (hazard ratio 0.64; 95% confidence interval 0.48 to 0.86; p=0.003). The reviewer performed a statistical analysis on the final updated data and obtained similar results. The plot (versus time) of the scaled Schoenfeld residuals for the treatment (CRT-D vs. OPT) variable is given in Fig. 4 and demonstrates that the sign of the treatment coefficient changes over time. The treatment coefficient is positive for the first 100 days after randomization, which corresponds to an increase in death rate for the CRT-D group. During this period there is almost no separation of the survival curves in Fig. 3. The treatment coefficient in Fig. 4 is roughly equal to zero between 100 and 200 days after randomization. Afterwards, between 200 and 500 days, the treatment coefficient is negative, which corresponds to a decrease in death rate for the CRT-D group. This is the time interval during which the CRT-D arm demonstrates a beneficial effect from the device. It is, however, important to note that for the later days (about 700 days and later) the number of observations is small and the results may be unreliable.
Fig. 4. Schoenfeld residuals (CRT-D versus OPT) for the treatment group (reviewer’s figure)
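The sponsor's reported mortality hazard ratio, confidence interval, and p-value can be checked for internal consistency. The sketch below back-calculates the standard error of the log hazard ratio from the interval limits and derives a two-sided Wald p-value. It assumes the interval is a conventional Wald interval on the log scale with critical value 1.96, which the submission summary does not state explicitly.

```python
from math import erfc, log, sqrt


def wald_check(hr, lower, upper, z_crit=1.959964):
    """Back out SE(log HR) from a 95% CI and return (se, z, two-sided p)."""
    se = (log(upper) - log(lower)) / (2.0 * z_crit)
    z = log(hr) / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return se, z, p


# Reported mortality result: HR 0.64, 95% CI 0.48 to 0.86, p = 0.003.
se, z, p = wald_check(0.64, 0.48, 0.86)
print(f"SE(log HR) = {se:.4f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the derived p-value rounds to the reported 0.003, so the three reported figures agree with one another. This check says nothing about whether the proportional hazards assumption behind them holds.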
The estimates of the hazard ratios for different subgroups are summarized in Figure 5 (the sponsor’s figure). The point estimates are almost always less than 1. Only one subgroup, patients with diastolic BP <= 68 mmHg, has a hazard ratio slightly greater than 1 (hazard ratio 1.08, 95% confidence interval 0.63 to 1.88).
Fig. 5. Hazard ratios for death (the sponsor’s figure)
In summary, while the sponsor states that “COMPANION was designed to detect a 25% reduction in both the primary and in mortality endpoints at alpha = 0.03” (protocol, page 16), the mortality hypothesis evaluated whether a difference of any magnitude was exhibited between the CRT-D and OPT arms. The survival functions for the CRT-D and OPT groups are different at p = 0.003. However, the statistical estimates of the magnitude of the difference between the two treatment groups in all-cause mortality may not be accurate and could be biased for the following reasons:
· The censoring may be informative (it may not be independent of the occurrence of the event, because some patients withdrew from the study due to worsening of their health status).
· The proportionality assumption may not be satisfied.
· Open label study.
Cardiac morbidity was defined in the protocol as the occurrence of any of the following events:
• Worsening heart failure resulting in use of intravenous vasoactive or inotropic therapy exceeding four hours
• Mechanical respiratory or cardiac support
• Any cardiac surgery, including heart transplant
• Resuscitated cardiac arrest or sustained ventricular tachycardia requiring intervention (e.g., chest thump, external cardioversion, or external defibrillation)
• Hospitalization for acute decompensation of heart failure
• Hospitalization that results in death from cardiac causes
• Significant device-related events resulting in permanent disability or hospitalization for pending death or permanent disability.
The sponsor’s intention for the cardiac morbidity secondary endpoint was to establish possible reduction of occurrences of cardiac morbid events in patients receiving combined OPT and CRT-D treatment in comparison to patients treated with OPT alone.
Based on the submission, the morbid events were recorded continuously up to November 30, 2002. It seems that an assumption in the sponsor’s analysis was made that the cardiac morbid events could not occur without hospitalization. It is also unclear why some cardiac adverse events (e.g., ventricular tachycardia) were not included in the statistical analyses.
The number of patients who had in-hospital cardiac morbid events was 193 (32%) and 151 (49%) for the CRT-D and OPT groups, respectively. Table 5 provides descriptive statistics of the hospital morbid events during the first 30 days after randomization.
Table 5. Hospital Cardiac Morbid Events during the first 30 days from the point of randomization (reviewer’s table)
Based on Table 5, during the first 30 days after randomization, the number of morbid events was 32 (in 18 patients) and 72 (in 38 patients) for the CRT-D and OPT groups, respectively. However, multiple cardiac morbid events sometimes occurred during one hospitalization (there were 19 and 42 hospitalizations during the first 30 days in the CRT-D and OPT groups, respectively). The limitations previously discussed with respect to the hospitalization data quality (i.e., possibly missing CRFs and post-withdrawal data collection, discussed in §3) also apply to the morbidity data. It is assumed that the sponsor did capture all hospital-related cardiac morbid events.
Cardiac morbidity based only on hospitalization data does not supply full information on all events. Some events could and did take place outside hospitals. For instance, there were a total of 5 cardiac deaths in the CRT-D group and 3 cardiac deaths in the OPT group during the first 30 days after randomization. However, as described in Table 5, only cardiac deaths that occurred in the hospital (0 for CRT-D and 2 for OPT) were counted in the cardiac morbidity analysis. It is not clear what distinguished deaths in hospital from deaths outside hospital, or why the sponsor did not use both types of death in the analysis.
In summary, the results of sponsor’s statistical analyses for the cardiac morbidity endpoint are questionable and can be biased because of:
1. not taking into account all cardiac morbid events that occurred outside hospitals
The definition and categorization of adverse events were given in the protocol. Adverse events were defined by the sponsor as “undesirable clinical outcomes and included device-related events as well as events related to the patients' general condition”. However, a CRT-D system safety endpoint was not defined in the COMPANION study. In the submission, the sponsor proposed a post hoc system safety endpoint (primary safety outcome), defined hypotheses, and performed a statistical analysis. System safety was assessed by the system-related complication-free rate (SRCFR) up to the 6-month follow-up. The SRCFR was defined as the ratio of the number of patients who did not experience a system-related complication to the total number of patients who were successfully implanted with the CRT-D. System-related observations were reported, but were not included in the study endpoint. The sponsor showed that the complication-free rate was larger than 70%. Based on the sponsor’s submission, 61 patients of the CRT-D group experienced system-related complications, with 81 system-related adverse events. For the system-related complication-free rate, the study yielded a point estimate of 87.4% and a lower limit of the one-sided 95% confidence interval of 85.1%. The complication-free rate was calculated for 540 patients implanted with the CONTAK CD system without taking into account dropouts, deaths, and the short follow-up time for some patients (caused by stopping of the study). Each patient lost to follow-up before 6 months was classified as event-free (between the lost-to-follow-up date and 6 months). Altogether, 14% of patients did not have a follow-up at 6 months. Therefore, the analysis may overestimate the complication-free rate. This reviewer could not perform the system safety analysis because of the lack of a proper data set (there was no indicator of system-related complications in the safety data).
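The reported lower confidence limit can be approximately reproduced with a one-sided normal-approximation bound for a binomial proportion. This is a sketch only: the review does not state which interval method the sponsor used, so the normal approximation is an assumption, and the inputs are the reported point estimate (87.4%) and denominator (540 implanted patients).

```python
from math import sqrt
from statistics import NormalDist


def lower_one_sided_bound(p_hat, n, confidence=0.95):
    """Lower one-sided normal-approximation confidence bound for a proportion."""
    z = NormalDist().inv_cdf(confidence)
    return p_hat - z * sqrt(p_hat * (1.0 - p_hat) / n)


# Reported figures: point estimate 87.4%, 540 implanted patients.
# ASSUMPTION: normal (Wald) approximation; the sponsor's method is not stated.
bound = lower_one_sided_bound(0.874, 540)
print(f"lower one-sided 95% bound: {bound:.3f}")
```

The result is close to the reported 85.1%. Because patients lost to follow-up were counted as event-free, both the point estimate and this bound inherit the optimistic bias noted above.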
An analogous statistical problem to that discussed above was encountered for the CRT-D device- or procedure-related safety endpoint. A total of 498 device- or procedure-related adverse events were reported in 290 (out of 595) patients randomized to the CRT-D. The device- or procedure-related adverse event rate through 6 months was 84% (498 events/595 patients). Given the many withdrawals, this point estimate understates the device- or procedure-related adverse event rate.
Table 6 gives the temporal summary of the device- or procedure-related adverse events through 6 months (the numbers are based on the ‘no-extended’ data). The observed adverse event rate is 0.96 events per patient (496/512) through 6 months.
Table 6. CRT-D device- or procedure-related adverse events through 6 months (reviewer’s table)
In total, there were 4953 adverse events (3726 for the CRT-D and 1227 for the OPT groups) during the study. Of the 595 patients in the CRT-D group, 560 patients experienced at least one adverse event (range from 1 to 118, median equal 5), whereas 245 of 308 patients in the OPT group had at least one adverse event (range from 1 to 23, median equal 4). There were 48 and 9 patients in the CRT-D and OPT groups, respectively, who experienced more than 15 adverse events.
Table 7 presents a summary over time of all adverse events through 6 months (the numbers are based on the ‘no-extended’ data). Using a very conservative approach, the adverse event rate through 6 months was 3.73 events per patient (1909 events/512 patients) for the CRT-D arm, while the rate was 2.80 (632 events/226 patients) for the OPT arm. Assuming that each patient lost to follow-up before 6 months was event-free, the adverse event rates were 3.21 (1909 events/595 patients) and 2.05 (632 events/308 patients) for the CRT-D and OPT arms, respectively. Therefore, the patients in the CRT-D group experienced more adverse events during the 6 months after randomization.
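The two sets of rates above differ only in the denominator. A short sketch of the arithmetic follows, using the event and patient counts quoted in the text. The scenario labels and the rate ratio are added here for illustration and are not part of the sponsor's or the reviewer's analysis.

```python
def events_per_patient(n_events, n_patients):
    """Crude adverse event rate: events divided by patients in the denominator."""
    return n_events / n_patients


# Counts quoted in the review (events through 6 months).
scenarios = {
    "patients with 6-month follow-up": {"CRT-D": (1909, 512), "OPT": (632, 226)},
    "all randomized, lost patients event-free": {"CRT-D": (1909, 595), "OPT": (632, 308)},
}

rates = {}
for label, arms in scenarios.items():
    crt_d = events_per_patient(*arms["CRT-D"])
    opt = events_per_patient(*arms["OPT"])
    rates[label] = (crt_d, opt)
    print(f"{label}: CRT-D {crt_d:.2f}, OPT {opt:.2f}, ratio {crt_d / opt:.2f}")
```

Under either denominator the crude rate is higher in the CRT-D arm. Neither version accounts for differing follow-up time or for correlation of events within a patient.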
Table 7. History of the Adverse Events during 6 Months after Randomization (The reviewer’s table)
The sponsor maintains that the higher rate of some adverse events in the CRT-D group was likely due to the nature of the implantable device, which had the ability to document some events, such as arrhythmias. These events might not be recorded for patients in the OPT group.
In summary, according to both the worst-case and best-case analyses of adverse event rates during the six months after randomization, the OPT patients experienced fewer adverse events. Since multiple adverse events can occur within a patient and are correlated, frailty models may need to be considered. The sponsor performed a general exploratory analysis without taking into account patients lost to follow-up, correlation within a patient, or the time of occurrence of an adverse event. Given these limitations, all analyses should be interpreted with caution.