STATISTICAL REVIEW AND EVALUATION

 

 

NDA NUMBER: 20-386

SERIAL NUMBER: SE1-032

Date RECEIVED BY CENTER: July 25, 2002

DRUG NAME: Cozaar (losartan potassium) tablets

INDICATION: Treatment of left ventricular hypertrophy

SPONSOR: Merck & Co., Inc.

DOCUMENTS REVIEWED: Electronic submission and data sets

STATISTICAL REVIEWER: John Lawrence, Ph.D. (HFD-710)

STATISTICAL TEAM LEADER: Jim Hung, Ph.D. (HFD-710)

BIOMETRICS DIVISION DIRECTOR: George Chi, Ph.D. (HFD-710)

CLINICAL REVIEWER: Thomas Marciniak, M.D. (HFD-110)

PROJECT MANAGER: Edward Fromme (HFD-110)

1 Executive Summary of Statistical Findings
1.1 Conclusions and Recommendations
1.2 Overview of Clinical Program and Studies Reviewed
1.3 Principal Findings
2 Statistical Review and Evaluation of Evidence
2.1 Data Analyzed and Sources
2.2 Statistical Evaluation of Evidence on Efficacy/ Safety
2.3 Findings in Special/Subgroups Populations
3 Conclusions and Recommendations

  1 Executive Summary of Statistical Findings

    1.1 Conclusions and Recommendations

    Based on one randomized study, losartan appears to be superior to atenolol in reducing the rate of the composite endpoint Stroke/ MI/ CV Death in the overall population studied. There appears to be a difference in this effect among races (blacks vs. non-blacks) and there is no evidence of the superiority of losartan in blacks. There were small differences in achieved blood pressure control throughout the study in favor of losartan. A clinical decision will have to be made to determine whether these differences in blood pressure can explain the differences in the rate of the primary endpoint.

       

    1.2 Overview of Clinical Program and Studies Reviewed

    One randomized study was done in hypertensive patients with left ventricular hypertrophy (LVH), comparing losartan with the active control atenolol. The active control, a beta-blocker, is not approved for this indication, but was chosen because it has similar antihypertensive effectiveness to losartan and because beta-blockers have been recognized as a standard therapy for hypertension. A total of 9193 patients between 45 and 83 years old were randomized in the United States and Europe and followed for at least four years. The vast majority (80%) of the patients were European Caucasians between 55 and 80 years old, with roughly equal numbers of males and females. The primary endpoint was the time from randomization to the first event in the composite of cardiovascular death, myocardial infarction, and stroke. The primary analysis was a Cox proportional hazards model with covariates for treatment group and three prognostic factors measured at baseline: Cornell product and Sokolow-Lyon voltage (both measures of the severity of LVH) and Framingham Risk Score (a measure of coronary heart disease risk). The primary hypothesis tested was a superiority hypothesis, and no non-inferiority analysis was planned if the superiority test failed.

       

    1.3 Principal Findings

    The point estimate of the hazard ratio for the primary endpoint was 0.869 with a 95% confidence interval of (0.772, 0.979) and a p-value of 0.021. There were a total of 508 events in the losartan group versus 588 events in the atenolol group. The difference appeared to be largely due to a difference in the time to stroke (232 strokes in the losartan group versus 309 in the atenolol group, p<0.001).

    Although the study drug was titrated in both groups to achieve comparable blood pressure effects and the addition of up to 25 mg of hydrochlorothiazide was allowed, the actual achieved effects on blood pressure throughout the study in the two groups were slightly different, particularly with respect to systolic blood pressure. For example, at year four, the mean systolic blood pressure was 144.9 and 146.4 in the losartan group and the atenolol group, respectively. Moreover, higher systolic blood pressure was associated with a significant increase in the risk of the primary composite endpoint. It is difficult to determine whether the differences observed on the primary endpoint can be explained by differences in effects on systolic blood pressure.

    There was some evidence of lack of internal consistency of the findings on the primary endpoint. In particular, among the 533 Black patients in the study, the results seemed to favor the atenolol group [HR= 1.666, 95% CI = (1.043, 2.661), p=0.033]. Also, there was no evidence of a difference in effectiveness in the subgroup of patients from the United States [HR=0.947, 95% CI = (0.731, 1.23), p=0.68]. Moreover, approximately 1/3 of the 1700 patients from the United States were Black, while there were only a handful of Black patients outside the United States. Hence, it is difficult to separate the effects on Blacks from the effects on United States patients as a whole. Finally, there is also a trend suggesting that losartan is more effective relative to the control in older patients. However, these subgroup comparisons, as always, should be interpreted with caution because the study was not designed or powered to show differences within any specific subgroup.

     

  2 Statistical Review and Evaluation of Evidence

    2.1 Data Analyzed and Sources

    A total of 9193 patients between 45 and 83 years old were randomized in the United States and Europe and followed for at least four years. The demographic information for both groups at baseline appears in Table 1. There does not appear to be an imbalance between the groups.

      Table 1 Patient Disposition and Baseline Demographics [Source: page 14 of Study Report]

       

      Most patients were taking some type of drug for treatment of cardiovascular disease prior to the study (the most common of these were hydrochlorothiazide (HCTZ), beta-blockers, calcium channel blockers, or ACE inhibitors).

      Patients were randomized to receive 50 mg of losartan or 50 mg of atenolol. The protocol permitted increasing the dose of study drug (either losartan or atenolol) and the addition of antihypertensive medication, such as HCTZ, as needed to control blood pressure. More precisely, open label 12.5 mg HCTZ was added during the first 6 months if blood pressure was not adequately controlled. If still not controlled, the dose of study drug was doubled to 100 mg. After the first 6 months, HCTZ could be increased to 25 mg and other antihypertensive drugs could be added (excluding beta-blockers, ACE inhibitors, and AII receptor antagonists). At the end of follow-up or the last visit before a primary event occurred, about half the patients were taking the highest dose of study drug (100 mg) and almost all of these were also taking additional drugs including HCTZ.

      99% of the patients in both groups had complete follow-up until the end of the study. The mean duration of follow-up was 4.8 years. The majority of the patients in both groups continued on randomized study drug for the duration of the study. 22.6% of the patients in the losartan arm and 26.6% of the patients in the atenolol arm discontinued study drug, usually due to an adverse experience (10.9% losartan versus 15.3% atenolol). Other reasons for discontinuing were endpoint other than death, patient required other therapy, withdrew consent, or other administrative reason.

      There were two planned interim analyses, after approximately 1/3 and 2/3 of the total number of events were observed. An O’Brien-Fleming boundary was used, so that the nominal alpha level for the final analysis was 0.046.
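The reason the final analysis is tested at slightly less than 0.05 can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses the classical three-look O'Brien-Fleming critical values for an overall two-sided alpha of 0.05 (final critical value 2.004) with looks at exactly 1/3, 2/3 and all of the events. The review does not state which variant of the boundary or which exact information fractions were used, and the classical values give a final nominal level of about 0.045 rather than the 0.046 quoted above.

```python
import math
import random
from statistics import NormalDist

# Assumed (not taken from the review): classical three-look O'Brien-Fleming
# final critical value for an overall two-sided alpha of 0.05.
FINAL_CRITICAL_VALUE = 2.004
INFO_FRACTIONS = (1 / 3, 2 / 3, 1.0)


def obf_boundaries(final_z, fractions):
    """O'Brien-Fleming z boundaries: final_z / sqrt(information fraction)."""
    return [final_z / math.sqrt(t) for t in fractions]


def overall_type_one_error(boundaries, fractions, n_sim=200_000, seed=1):
    """Monte Carlo estimate of the chance of crossing any boundary under H0."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_sim):
        score, previous_t = 0.0, 0.0
        for t, bound in zip(fractions, boundaries):
            # The score process has independent normal increments with variance dt.
            score += rng.gauss(0.0, math.sqrt(t - previous_t))
            previous_t = t
            if abs(score / math.sqrt(t)) >= bound:
                crossings += 1
                break
    return crossings / n_sim


bounds = obf_boundaries(FINAL_CRITICAL_VALUE, INFO_FRACTIONS)
nominal_final_alpha = 2 * (1 - NormalDist().cdf(bounds[-1]))
estimated_alpha = overall_type_one_error(bounds, INFO_FRACTIONS)
print([round(b, 3) for b in bounds])
print(round(nominal_final_alpha, 4), round(estimated_alpha, 3))
```

The early boundaries are very strict, so little alpha is spent at the interim looks and the final test is only slightly more demanding than an unadjusted 0.05 test.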

       

    2.2 Statistical Evaluation of Evidence on Efficacy/ Safety

    The two groups differed in the time to the primary endpoint (the first event in the composite of cardiovascular death, myocardial infarction, and stroke). The Cox proportional hazards model that was used in the primary analysis included covariates for treatment group, baseline Cornell product and Sokolow-Lyon voltage, and Framingham Risk Score. The Cornell product, Sokolow-Lyon voltage and Framingham Risk Score appeared to be positively associated with the risk for an event. Everything else being equal, a 10 unit increase in the Framingham score appeared to increase the risk by approximately 60%, and a 10 unit increase in the Cornell product or Sokolow-Lyon voltage appeared to increase the risk by about 13% and 17%, respectively. After adjusting for these covariates, the p-value for the treatment effect was 0.021, the point estimate for the hazard ratio was 0.869 and the 95% confidence interval was (0.772, 0.979) [Study Report p. 69-70 and independently verified by the FDA].

    The unadjusted Kaplan-Meier estimates of the event-free survival curves appear in Figure 1. Although the curves appear to cross near the 6 year time point, this is a result of the small number of patients with follow-up of that duration (roughly 200-300 per group) and hence of unreliable estimates in that part of the curve. Overall, there was no evidence to reject the assumption of proportional hazard functions for the unadjusted survival distributions (p-value = 0.786).

    In order to reveal more about the differences in hazard rates over time, the number of events and the amount of patient exposure in each year are provided in Table 2. The figures in the table show that the risk of an event stayed relatively constant over the six years in each arm. Also, in each year of the study except the second and fourth year, the number of events and the event rate were smaller in the losartan arm than in the atenolol arm.
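As a rough plausibility check on the adjusted result quoted in this section (hazard ratio 0.869, 95% CI 0.772 to 0.979, p = 0.021), the standard error and p-value implied by the confidence interval can be recovered on the log scale. This is a sketch that assumes the interval is a Wald interval from the Cox model; the function name is illustrative and the calculation is not part of the sponsor's or the FDA's analysis.

```python
import math
from statistics import NormalDist


def wald_summary(hazard_ratio, ci_lower, ci_upper, level=0.95):
    """Recover the log-scale standard error, z statistic and two-sided p-value
    implied by a hazard ratio and its (assumed Wald) confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se_log_hr = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(hazard_ratio) / se_log_hr
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se_log_hr, z, p_value


se, z, p = wald_summary(0.869, 0.772, 0.979)
print(f"SE(log HR) = {se:.4f}, z = {z:.2f}, p = {p:.4f}")
```

The implied p-value is close to the reported 0.021, so the point estimate, interval and p-value are mutually consistent up to rounding.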

       

      Figure 1 Kaplan-Meier estimates of primary event-free survival in the two treatment arms.

       

       

      Table 2 Number of events and exposure in each arm by Study Year [Source: FDA analysis]

       

      Year of          Atenolol Arm                  Losartan Arm
      Study      Events  Exposure    Rate      Events  Exposure    Rate
      -------    ------  --------   -----      ------  --------   -----
      1             142      4.49   31.59         108      4.53   23.87
      2             104      4.35   23.92         112      4.39   25.54
      3             111      4.20   26.40          74      4.24   17.44
      4             105      4.06   25.87         108      4.11   26.28
      5             103      3.25   31.71          86      3.32   25.89
      6              23      0.74   31.04          20      0.76   26.39
      Total         588     21.09   27.88         508     21.34   23.80

      Exposure is in 1000 patient years (PY); Rate is events per 1000 PY.
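The rates in Table 2 are simply events divided by exposure. The short sketch below recomputes them from the tabulated counts; because the exposures are rounded to two decimals, the recomputed rates agree with the table only to within rounding. The variable and function names are illustrative.

```python
# Events and exposure (in 1000 patient years) by study year, copied from Table 2.
atenolol = {"events": [142, 104, 111, 105, 103, 23],
            "exposure": [4.49, 4.35, 4.20, 4.06, 3.25, 0.74]}
losartan = {"events": [108, 112, 74, 108, 86, 20],
            "exposure": [4.53, 4.39, 4.24, 4.11, 3.32, 0.76]}


def rates_per_1000_py(arm):
    """Yearly event rate: events per 1000 patient years of exposure."""
    return [e / x for e, x in zip(arm["events"], arm["exposure"])]


def overall_rate(arm):
    """Overall event rate pooled over all study years."""
    return sum(arm["events"]) / sum(arm["exposure"])


# Study years in which the losartan rate exceeded the atenolol rate.
years_losartan_higher = [
    year
    for year, (a, l) in enumerate(
        zip(rates_per_1000_py(atenolol), rates_per_1000_py(losartan)), start=1)
    if l > a
]

print([round(r, 2) for r in rates_per_1000_py(atenolol)])
print([round(r, 2) for r in rates_per_1000_py(losartan)])
print(round(overall_rate(atenolol), 2), round(overall_rate(losartan), 2))
print(years_losartan_higher)
```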

       

      The reduction in risk did not appear to be constant across the three components of the primary endpoint. Figure 2 shows the Kaplan-Meier estimates of the event-free survival curves for each of the components. Table 3 shows the number of events for each component in two different ways. One way is to count all events that occur at any time in the study, whether or not they were preceded by a different event, while the other way counts only the events that define the primary endpoint. In the middle of the Table are the components that occurred at any time in the study. If a patient had an MI and then died later during the study, both of these events would get counted in the middle part of the Table, but only the MI would get counted in the bottom section. No hazard ratios or p-values are calculated at the bottom of the table since these counts are provided for descriptive purposes only.

       

      Table 3 Components of primary endpoint by any occurrence and by definition of primary endpoint. [Source: FDA analysis, part of Table appears in p. 69 of Study Report]

                               Number of Events
                               Losartan   Atenolol    Hazard
                               Arm        Arm         Ratio    95% CI            P-value
      Primary Composite        508        588         0.869    (0.772, 0.979)    0.021

      Components that occurred at any time in the Study
      CV Mortality             204        234         0.886    (0.734, 1.069)    0.206
      MI                       198        188         1.073    (0.879, 1.310)    0.491
      Stroke                   232        309         0.752    (0.634, 0.891)    0.001

      Components that occurred that defined the primary endpoint
      CV Mortality             125        134         -        -                 -
      MI                       174        168         -        -                 -
      Stroke                   209        286         -        -                 -

       

      Figure 2 Components of primary endpoint [Source: Study Report p. 71]

       

       

       

      It can be seen from both Figure 2 and Table 3 that the main difference between the two groups on the primary endpoint was due to a difference in the rate of stroke. The bottom of Table 3 shows how many events that were classified as a primary endpoint were of each of the three types. There was almost no difference in the number of deaths or the number of MIs, but a big difference in the number of strokes. The hazard ratios and p-values in the middle section of the table are calculated using a Cox regression model that included covariates for treatment group, Sokolow-Lyon voltage, Cornell product, and Framingham score. Care should be taken in the interpretation of the analyses for many reasons including the possibility of informative censoring. For example, if a patient has an event of one type and has no further follow-up, then the patient is censored at that time point in the analysis of the other components. However, this specific analysis of the components was pre-specified in the data analysis plan.

      The sponsor’s study report also includes an analysis that shows that the apparent difference of the effect among the three components is greater than what would be expected by chance alone (p=0.023 for the test of consistency of the effect, p. 76 of Study Report). The Data Analysis Plan indicates that the effect of losartan was expected to be similar on each of the components, but that this test for heterogeneity would be done to test this assumption. The conclusions regarding the primary composite endpoint are not rendered invalid by the apparent heterogeneity. In fact, whenever such a composite endpoint is used as the primary endpoint and a significant difference is found, the only conclusion that can be drawn is that there is an effect on at least one of the components. Moreover, the effect does not appear to be harmful in any of the components.

      Figure 3 shows the mean blood pressure over time. These blood pressure measurements were obtained at trough during office visits. There was some ambulatory blood pressure monitor data measured for a subset of the patients in the study, but too little to be able to reliably compare the blood pressure control in the two groups throughout the day. The Figure indicates that blood pressure was controlled in both groups, but the control was slightly better in the losartan group. At many different fixed time points, the differences in systolic, diastolic, or pulse pressure between the two groups are statistically significant (see p. 107 of Study Report).

       

      Figure 3 Blood pressure over time [Source: p. 108 of Study Report]

       

       

      In order to adjust for the effect of blood pressure, three separate Cox regression models were fit that included terms for systolic blood pressure, diastolic blood pressure, or pulse pressure. The treatment effect remained significant in each of these models and the point estimate for the hazard ratio remained close to 0.86. Adjusting for a time-dependent covariate, particularly one that is affected by treatment, is problematic for many reasons. Here, we do not know whether the differences at trough correlate with the differences throughout the day.

      There were more drug-related adverse experiences in the atenolol group (45% vs. 37%) and more patients discontinued the study in the atenolol arm for adverse experiences (18% vs. 13%). Only 3% of the patients in both groups reported serious drug-related adverse experiences. A group of adverse experiences were pre-specified as being of particular interest: angioedema, bradycardia, sleep disturbance, hypotension, dizziness, sexual dysfunction, cold extremities, cough, and cancer. Significantly more patients in the atenolol group experienced bradycardia (Los: 1.4% versus Atl: 8.5%, p<0.001), cold extremities (Los: 3.9% versus Atl: 5.9%, p<0.001) and sexual dysfunction (Los: 3.6% versus Atl: 4.7%, p=0.009). Significantly more losartan patients experienced hypotension (Los: 2.6% versus Atl: 1.6%, p=0.001). There were no differences in the frequency of angioedema, sleep disturbance, dizziness, cough, or cancer between the treatment groups [Study Report, p. 66-70].
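The order of magnitude of the adverse experience p-values can be checked with a pooled two-proportion z-test. This is only a sketch: the Study Report's test procedure is not described here, the percentages are rounded, and the arm sizes (4605 losartan, 4588 atenolol) are an assumption obtained by adding the diabetic and non-diabetic denominators reported in Table 4.

```python
import math
from statistics import NormalDist

# Assumed arm sizes: sums of the Table 4 subgroup denominators.
N_LOSARTAN = 586 + 4019
N_ATENOLOL = 609 + 3979


def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided pooled z-test for a difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# (losartan proportion, atenolol proportion), rounded as quoted in the text.
reported = {
    "bradycardia": (0.014, 0.085),
    "cold extremities": (0.039, 0.059),
    "sexual dysfunction": (0.036, 0.047),
    "hypotension": (0.026, 0.016),
}

p_values = {
    name: two_proportion_p_value(p_los, N_LOSARTAN, p_atl, N_ATENOLOL)
    for name, (p_los, p_atl) in reported.items()
}
for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

Under these assumptions the approximate p-values fall in the same range as those quoted from the Study Report.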

       

    2.3 Findings in Special/Subgroups Populations

    Table 4 shows the results for the primary endpoint in different subgroups. In the second and third columns, n represents the total number of events, N represents the total number of patients in the subgroup, and the number in parentheses is the percent of patients with an event (n/N*100). In all cases except one (Non-Diabetics), the hazard ratios, CIs, and p-values are calculated using a Cox regression model adjusting for Cornell product and Sokolow-Lyon voltage and Framingham scores as in the primary analysis. For the Non-Diabetic subgroup, the hazard ratios and p-values are unadjusted because the adjusted values were not available from the Study Report.

     

    Table 4 Primary endpoint results (adjusted as in primary analysis) in special subgroups

                                  n/N (%)
    Subgroup                      Losartan            Atenolol            Hazard ratio   95% CI            p-value
    Diabetic                      103/586 (17.6)      139/609 (22.8)      0.755          (0.585, 0.975)    0.031
    Non-Diabetic †                405/4019 (10.1)     449/3979 (11.3)     0.893          (0.780, 1.021)    0.098
    ISH (SBP ≥ 160 and DBP < 90)  75/660 (11.4)       104/666 (15.6)      0.750          (0.557, 1.011)    0.059
    Country-US                    114/869 (13.1)      115/838 (13.7)      0.943          (0.728, 1.222)    0.658
    Male                          293/2118 (13.8)     327/2112 (15.5)     0.909          (0.776, 1.064)    0.235
    Female                        215/2487 (8.6)      261/2476 (10.5)     0.820          (0.685, 0.983)    0.031
    Race-Black                    46/270 (17.0)       29/263 (11.0)       1.666          (1.043, 2.661)    0.033
    Race-NonBlack                 462/4335 (10.7)     559/4325 (12.9)     0.829          (0.733, 0.938)    0.003
    Race-White                    455/4258 (10.7)     548/4245 (12.9)     0.834          (0.736, 0.944)    0.004
    Age < 65                      130/1748 (7.4)      120/1741 (6.9)      1.151          (0.898, 1.477)    0.266
    Age ≥ 65                      378/2857 (13.2)     468/2847 (16.4)     0.797          (0.696, 0.913)    0.001

    † Unadjusted for measurements of severity of LVH and Framingham scores.

    [Source: FDA analysis and identical to Study Report where available]

    The study was designed to answer a question about the overall effect in the entire population, not to answer questions about smaller subgroups. There appears to be little or no effect in Blacks and a significant effect in Whites. Also, there appears to be little or no effect in those under 65 and a significant effect in those 65 or older. Although there appears to be little evidence of effectiveness in patients from the United States, the analysis of the US subgroup is confounded with the results in the Black subgroup because nearly all of the Blacks were recruited in the US and approximately 1/3 of the US patients were Black. The evidence of effectiveness in Blacks and across Age groups will be examined in more detail in the remainder of this section. These are post hoc analyses and should be interpreted as exploratory.
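One simple way to gauge whether two disjoint subgroups differ is to compare their log hazard ratios, using standard errors recovered from the reported confidence intervals. The sketch below applies this to the race and age rows of Table 4. It is a Wald-type approximation that assumes the two estimates are independent, it makes no adjustment for multiplicity, and it is not an analysis from the Study Report or the FDA review; like the subgroup results themselves, it is exploratory.

```python
import math
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)


def log_hr_and_se(hr, ci_lower, ci_upper):
    """Log hazard ratio and its standard error implied by a 95% Wald interval."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    return math.log(hr), se


def interaction_p_value(subgroup_a, subgroup_b):
    """Two-sided p-value for a difference in log hazard ratios between two
    disjoint subgroups, each given as (HR, lower limit, upper limit)."""
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hazard ratios and 95% confidence limits copied from Table 4.
p_race = interaction_p_value((1.666, 1.043, 2.661), (0.829, 0.733, 0.938))
p_age = interaction_p_value((1.151, 0.898, 1.477), (0.797, 0.696, 0.913))
print(f"Black vs. non-Black: p = {p_race:.4f}")
print(f"Age < 65 vs. >= 65:  p = {p_age:.4f}")
```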

    Figure 4 indicates how the treatment effect (hazard ratio of losartan versus atenolol) changes as a function of age. The circles are the point estimates of the hazard ratio for each age group, with patients categorized in 5-year age intervals. The location on the x-axis is the midpoint of the age interval. The hazard ratios are estimated using the model from the primary analysis, but only the data from that age group. The dashed lines represent the 95% CI for the hazard ratios. The solid line is the estimated hazard ratio as a function of Age from a Cox regression model with Age (and Age*Treatment interaction) included as a continuous covariate (in addition to the other covariates from the primary analysis). The p-value for the Age term in the model was highly significant (less than 0.0001) and the p-value for the Age*Treatment interaction term is 0.06. The figure and the model suggest that the hazard ratio is heterogeneous across age groups.

     

    Figure 4 Hazard ratio for losartan vs. atenolol as a function of Age

    From Table 4, it appears that all of the treatment effect is lost in Blacks because the confidence interval is completely above 1. It is possible to calculate the probability that this would happen under different assumptions regarding the true treatment effect in non-Blacks and how much of this effect is lost in Blacks. For example, the point estimate of the true treatment effect in non-Blacks is a hazard ratio of 0.829 [Study Report, p. 37] or a log-hazard ratio of -0.187. The estimated standard deviation of the log-hazard ratio for Blacks is 0.239. The 95% confidence interval for Blacks lies completely above 1 when the estimated log-hazard ratio exceeds 1.96 × 0.239 = 0.468. Since the log-hazard ratio is approximately normal, the probability that the confidence interval for Blacks would be completely above 1 assuming none of this effect is lost is P(Z > (0.468 + 0.187)/0.239) = P(Z > 2.74), or approximately 0.003. Now, suppose that losartan is still effective in Blacks, but only half as effective as it is for non-Blacks; then the probability that the confidence interval would be completely above 1 is P(Z > (0.468 + 0.0935)/0.239) = P(Z > 2.35), or approximately 0.009. Even if all of the effect is lost but losartan is still no worse than atenolol in Blacks, then the probability is 0.025. All of this together tells us that it is very unlikely that we would have observed this effect in the opposite direction in Blacks if losartan were effective in Blacks.
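The probabilities discussed in this paragraph follow from the normal approximation to the log-hazard ratio and can be recomputed directly. The sketch below uses only the two quantities quoted in the text (log-hazard ratio -0.187 in non-Blacks and standard deviation 0.239 in Blacks); the function and scenario names are illustrative, and the values are recomputed here rather than copied from the review.

```python
from statistics import NormalDist

LOG_HR_NON_BLACK = -0.187  # log of 0.829, as quoted in the text
SE_LOG_HR_BLACK = 0.239    # standard deviation of the log HR in Blacks, as quoted


def prob_ci_entirely_above_one(true_log_hr, se, level=0.95):
    """Probability that the Wald confidence interval for a hazard ratio lies
    completely above 1 when the estimate is normal around true_log_hr."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # The lower confidence limit exceeds 1 when estimate - z_crit * se > 0.
    return 1 - NormalDist(mu=true_log_hr, sigma=se).cdf(z_crit * se)


scenarios = {
    "none of the effect lost": LOG_HR_NON_BLACK,
    "half of the effect lost": LOG_HR_NON_BLACK / 2,
    "all of the effect lost": 0.0,
}
probabilities = {
    label: prob_ci_entirely_above_one(log_hr, SE_LOG_HR_BLACK)
    for label, log_hr in scenarios.items()
}
for label, prob in probabilities.items():
    print(f"{label}: {prob:.4f}")
```

The last scenario reproduces the 0.025 stated in the text, which supports reading the calculation as a one-sided tail probability of a 95% interval.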

    There is some evidence of internal consistency of the overall result in the Black subgroup. This subgroup can be further subdivided by Age and gender. In each of the four subgroups listed, there are relatively few events but the trend favors atenolol. This also lends some support to the perception that the apparent difference in the effectiveness in Blacks was not simply due to chance.

     

    Table 5 Primary endpoint (losartan vs. atenolol) in Black subgroups

                        n/N (%)
    Subgroup            Losartan          Atenolol          Hazard ratio   95% CI           p-value
    Black Females       19/115 (16.5)     10/132 (7.6)      3.10           (1.41, 6.81)     0.005
    Black Males         27/155 (17.4)     19/131 (14.5)     1.21           (0.670, 2.19)    0.526
    Black Age < 65      14/123 (11.4)     7/147 (4.8)       2.52           (1.01, 6.32)     0.048
    Black Age ≥ 65      32/147 (21.8)     22/116 (19.0)     1.31           (0.758, 2.28)    0.329

    [Source: FDA analysis]

    Unlike in the overall results, the components of the primary endpoint in Blacks all appeared to trend in the same direction in favor of atenolol and particularly for the stroke endpoint. The estimates for the hazard ratios for CV Mortality, MI, and Stroke in Blacks were 1.483: 95% CI (0.764, 2.879), 2.074: 95% CI (0.786, 5.437), and 2.179: 95% CI (1.079, 4.401) respectively.

    The Gail-Simon test can be used to test for a qualitative interaction. A qualitative (also called crossover) interaction occurs when one treatment is superior for some subset of patients and the alternative treatment is superior for other subsets. This is different from the situation where there is variation in the magnitude, but not the direction, of the effect among subgroups (called quantitative or non-crossover interactions). The null hypothesis is that the treatment effect in all subgroups is on the same side of 1. The alternative hypothesis is that some are on one side of 1 while at least one is on the opposite side. The test is based on the likelihood ratio. If only the Black and non-Black subgroups are included in the test, then the p-value is 0.016 [Study Report, p. 35]. To use this test, the defined subgroups have to be disjoint (they cannot overlap). Hence this test cannot be used directly to see if there is evidence of a qualitative interaction among subgroups defined by Age, Gender, and Race. In other words, the test cannot be used to test the null hypothesis that Blacks, Whites, Males, Females, patients under 65 years old, and patients 65 or older all have a treatment effect on the same side of 1.
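A minimal sketch of the Gail-Simon calculation for disjoint subgroups is given below. It follows the usual published form of the test: the statistic is the smaller of the two sums of squared standardized effects (one sum over subgroups with positive estimates, one over subgroups with negative estimates), and the p-value is a binomial-weighted mixture of chi-square tail probabilities. The inputs are log hazard ratios and standard errors recovered from the Table 4 confidence intervals under a Wald approximation, which is an assumption; the Study Report may have computed the test differently.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal critical value


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 1:
        tail, a = math.erfc(math.sqrt(half)), 0.5
    else:
        tail, a = math.exp(-half), 1.0
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return tail


def gail_simon(log_effects, standard_errors):
    """Gail-Simon statistic and p-value for a qualitative interaction."""
    q_plus = sum((d / s) ** 2 for d, s in zip(log_effects, standard_errors) if d > 0)
    q_minus = sum((d / s) ** 2 for d, s in zip(log_effects, standard_errors) if d < 0)
    q = min(q_plus, q_minus)
    k = len(log_effects)
    p = sum(math.comb(k - 1, h) * 0.5 ** (k - 1) * chi2_sf(q, h) for h in range(1, k))
    return q, p


def log_hr_and_se(hr, lower, upper):
    """Log hazard ratio and standard error implied by a 95% Wald interval."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Black and non-Black rows of Table 4: (HR, lower limit, upper limit).
subgroups = [(1.666, 1.043, 2.661), (0.829, 0.733, 0.938)]
effects, errors = zip(*(log_hr_and_se(*row) for row in subgroups))
q_stat, p_value = gail_simon(effects, errors)
print(f"Q = {q_stat:.2f}, p = {p_value:.4f}")
```

With only two subgroups the p-value reduces to half of a one-degree-of-freedom chi-square tail probability, and under these assumptions it comes out close to the 0.016 quoted from the Study Report.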

    Since the purpose of the study was not to investigate the relative efficacy of losartan versus atenolol in the Black subgroup, there are many problems with trying to assess the relative efficacy. For example, there are many possible subgroups that could have been examined and the Black subgroup looked like it was one of the worst. Some adjustment for multiple testing seems appropriate, but exactly how it should be done is not clear. If the Gail and Simon test for Blacks vs. non-Blacks had been a pre-specified analysis with a formal plan for handling multiple comparisons, then there would be fewer problems with interpreting the p-value.

    There is some evidence that ACE-inhibitors may not be as effective in the Black subgroup as they are in other races. For example, Exner et al. state that

    "A lesser response to ACE inhibitors in black patients as compared with white patients is not surprising, given previous clinical and experimental evidence. Black patients with hypertension have been shown to have smaller reductions in blood pressure, on average, than white patients... The difference in response has previously been attributed to lower plasma renin activity in black patients."

    One does not need to look any further than the label for losartan to see that there is a concern about differences of effectiveness of losartan in blacks for treating hypertension. The label states "COZAAR was effective in reducing blood pressure regardless of race, although the effect was somewhat less in black patients (usually a low-renin population)". With all of the evidence that is available now, perhaps future studies in populations with a relatively high proportion of blacks should have a pre-specified analysis intended to show that the effect is not substantially worse in Blacks than in non-Blacks, and the study should have adequate power to do this. If this had been a pre-specified analysis, the evidence seen in this trial would be considered strong evidence of a difference in effectiveness among Blacks. At this point, we can probably say credibly that there is no evidence of effectiveness of losartan relative to atenolol in the Black subgroup. To go beyond that and say that there is evidence that atenolol is better than losartan is not possible based on these data alone.

     

  3 Conclusions and Recommendations

Based on one randomized study, losartan appears to be superior to atenolol in reducing the rate of the composite endpoint Stroke/ MI/ CV Death in the overall population studied. There appears to be a difference in this effect among races (blacks vs. non-blacks) and there is no evidence of the superiority of losartan in blacks. There were small differences in achieved blood pressure control throughout the study in favor of losartan. A clinical decision will have to be made to determine whether these differences in blood pressure can explain the differences in the rate of the primary endpoint.