Final Statistical Summary Review for PMA P970003/S50 (Original and Various Amendments), Vagus Nerve Stimulator (VNS) Therapy System for Depression, Cyberonics, Inc.
I. Introduction
The VNS system is indicated for the adjunctive long-term treatment of chronic or recurrent depression in patients who are experiencing a major episode that has not had an adequate response to two or more antidepressant treatments. This review summarizes the important statistical issues and results for the pivotal D02 study (VNS plus Standard of Care), the observational D04 study (Standard of Care alone), and the D02/D04 comparison. The primary efficacy endpoint is the comparison of the average rate of change per month (slope) and the average change from baseline between D02 and D04 patients for the evaluable patient population. The secondary efficacy endpoint is the comparison of proportions of Response (defined as a ≥ 50% decrease in scores from baseline) at 12 months.
The Hamilton Rating Scale for Depression (HRSD24) is the primary efficacy endpoint in the D02 pivotal study. However, since HRSD24 scores were collected only at baseline and at 12 months in the D04 observational study, the Inventory of Depressive Symptomatology Self-Report (IDS-SR) was used as the primary efficacy endpoint in the D02/D04 comparison via repeated-measures linear regression (RMLR) analyses. The RMLR requires both a baseline measurement and multiple post-baseline measurements for each patient to estimate the average rate of change per month (slope) and the difference between the two true slopes for the D02/D04 comparison.
II. Multi-Center Study Data
There are 22 sites (centers) which participated in either the D02 or the D04 study. Of these 22 sites, 12 sites participated in both D02 and D04 studies (called overlapping sites), 9 sites enrolled D02 patients but no D04 patients, and 1 site enrolled D04 patients but no D02 patients. The numbers of patients for the “evaluable” and “12-month completer” patient populations, separately by all participating and overlapping sites, are shown in Table 1.
Table 1. Number of Patients (N)^{a} by “All Sites” and “Overlapping” Sites, D02/D04 Study
Site | D02 Long-Term | D02 Evaluable | D02 12-Month Completers | D04 Evaluable | D04 12-Month Completers
All (22) | 233 | 205 (185 Unipolar, 20 Bipolar) | 177 | 124 | 112
Overlapping (12) | 165 | 147 | 128 | 120 | 108
a. Sample size (N) was justified for the comparison of two response proportions, the secondary efficacy endpoint, not for the comparison of two slopes, the primary efficacy endpoint. The detailed distribution of patients by clinical site (Non-overlapping and Overlapping) is shown in Table 10, Appendix 1.
III. D02 Pivotal Study
The D02 study included patients whose HRSD24 ≥ 18 at any time during the 12-month follow-up, and HRSD24 ≥ 29 at acute phase (at 3 months). The following Table 2 provides a brief summary for D02 group patients.
Table 2. Brief Summary for D02 Study, All Sites
Item | Acute (3 months) | Long-Term (1 year)
Study Design | Double-blind, randomized, parallel, Active VNS versus Sham control, Multicenter (22) | Active VNS → VNS; Sham control → VNS (Delayed-treatment group)
Follow-up | Baseline (2), Implanted, 2 weeks, … 3 months | Monthly in the first year, quarterly thereafter
Clinical Outcome | HRSD24 Score, Primary; IDS-SR Score, Secondary | HRSD24 score (Per-Protocol)
Primary Endpoint | Comparison of two response proportions | Average rate of change per month (slope), Repeated-Measures Linear Regression (RMLR)
Result | HRSD24 (N = 221): 15% (17/111) VNS, 10% (11/110) Sham, p = 0.31 (Fisher’s exact). IDS-SR (N = 215): 17.4% (19/109) VNS, 7.5% (8/106) Sham, p = 0.039 (Fisher’s exact) | Evaluable (N = 205): Slope = -0.45, Standard error = 0.05, 95% CI: (-0.55, -0.34), p < 0.001 for rejecting the null hypothesis that the true slope = 0. 12-Month Completers (N = 177): Slope = -0.47, Standard error = 0.06, 95% CI: (-0.58, -0.36), p < 0.001
IV. D02 and D04 Comparison
Propensity Score (PS) Adjustment
Since D04 is an observational study (Standard of Care alone), evaluation of the true device effect must control for potential bias or confounding arising from differences in individual patient demographic characteristics and clinically important baseline covariates between D02 and D04 group patients.
The PS approach derives an overall summary composite score from 17 sponsor-selected binary or continuous patient covariates [age, gender, bipolar versus unipolar depression, lifetime electroconvulsive therapy (ECT) use, length of current major depressive episode (MDE) in months, average number of lifetime episodes of depression, percent of patients who received ECT in their lifetime, percent of patients who received ECT in the current MDE, percent of patients with a suicide attempt in their lifetime or in the past 12 months, and others]. The purpose of PS analysis is to reduce bias in non-randomized, observational studies, such as the D02/D04 comparison. Logistic regression is used to predict D02 treatment assignment conditional on the individual patient’s covariates. The resulting predicted probabilities of belonging to the active treatment group (D02) rather than the control group (D04) were then ordered and divided into 5 subgroups (quintiles) based on the estimated propensity scores. For example, the first quintile group contains the approximately 20% of patients with the lowest D02 PS and the last group contains the approximately 20% of patients with the highest D02 PS.
PS can only adjust for observed covariates, not for unobserved ones. PS analysis may not eliminate all selection bias, particularly hidden bias.
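The quintile step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the propensity scores and group labels below are hypothetical, the function names are invented for this sketch, and the logistic regression that would produce the scores is not shown.

```python
def assign_quintiles(scores):
    """Return a quintile label (1-5) for each propensity score.

    Patients are ranked by score and split into five groups of nearly
    equal size; quintile 1 holds the lowest scores, quintile 5 the highest.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


def cross_tab(labels, groups):
    """Count patients in each (quintile, treatment group) cell."""
    table = {q: {} for q in range(1, 6)}
    for q, g in zip(labels, groups):
        table[q][g] = table[q].get(g, 0) + 1
    return table


# Hypothetical estimated probabilities of being a D02 patient.
scores = [0.12, 0.85, 0.40, 0.66, 0.91, 0.23, 0.55, 0.78, 0.35, 0.60]
groups = ["D04", "D02", "D04", "D02", "D02", "D04", "D02", "D02", "D04", "D02"]

labels = assign_quintiles(scores)
print(labels)
print(cross_tab(labels, groups))
```

Because the quintiles are formed on the pooled D02 and D04 patients, each quintile has about the same total size, while the split between D02 and D04 within a quintile can differ sharply.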
In previous reviews, this reviewer asked the sponsor to provide the following information regarding their PS analysis:
Justify the selection criteria for the fitted logistic regression model;
Provide a graphical display (e.g., bar chart) of the distribution of PS quintile means (for continuous covariates) or quintile proportions (for binary covariates) for D02 and D04 patients;
For each of the 17 selected patient covariates, prepare statistical analyses comparing D02 and D04 patients both before and after PS adjustment. Explain the degree of covariate imbalance before PS adjustment and covariate balance (or imbalance) after adjustment;
Explain the 2-way analysis of variance, including the main effects (treatment group, PS quintile) and their interaction.
The sponsor has responded to the above comments in the March 17, 2004 Amendment # 4.
Repeated-Measures Linear Regression (RMLR) Analysis
The RMLR analysis is used to evaluate the average rate of change (slope) and the average change in IDS-SR scores from baseline to the 12-month follow-up. SAS PROC MIXED was used to analyze the 12-month follow-up data. No missing data imputation is needed to run SAS PROC MIXED since missing data are assumed to be missing at random (MAR), which means that, given the observed data, the probability that a measurement is missing does not depend on the unobserved values. A last observation carried forward (LOCF) analysis was also prepared by the sponsor for comparison purposes. The patient covariates used in the general mean response mixed model include several fixed-effect study factors [9 pooled sites, with some pooled sites containing only D02 patients (see Table 18.2, March 17, 2004 submission), treatment (D02 versus D04), 5-level grouped PS quintiles, baseline IDS-SR score, indicator variables for follow-up time at 3, 6, 9, and 12 months, and treatment-by-time interactions]. The spatial power covariance structure is used to account for correlation among different follow-up times.
In the March 26, 2004 email, FDA asked the sponsor to respond to the following questions:
Provide an analysis of the IDS-SR primary efficacy endpoint in the RMLR analysis, the HRSD24 secondary efficacy endpoint, and the Response/Non-Response proportions (see following section) using only those sites that enrolled patients in both D02 and D04 (overlapping sites, see Appendix 1), for unipolar and bipolar patients combined.
Repeat the above analyses for censored patients (i.e., additions or changes in either antidepressant drugs or ECT).
The sponsor has responded to the FDA comments above. For the censored-patient approach in the RMLR analyses, the sponsor indicated that “IDS-SR raw scores were censored such that the value from the patient’s IDS-SR measurement obtained prior to their first increase in the antidepressant resistance rating (ARR) score was carried forward and replaced all of the patient’s subsequent, non-missing, IDS-SR measurements.”
Please note that ECT additions or changes during the follow-up were not discussed in the above response.
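The sponsor's censoring rule quoted above can be made concrete with a short sketch. The data layout (one score and one ARR value per visit, `None` for a missed visit) and the reading of "first increase" as an increase over the previous visit's ARR are assumptions made for illustration; the sponsor's actual implementation is not described in this review.

```python
def censor_scores(scores, arr):
    """Apply the carry-forward censoring rule to one patient's visit series.

    scores: IDS-SR values by visit, None where the visit was missed.
    arr:    antidepressant resistance rating (ARR) by visit, same length.
    From the first visit at which ARR exceeds its previous value, every
    non-missing score is replaced by the last non-missing score observed
    before that visit. Missing visits stay missing.
    """
    first_increase = None
    for visit in range(1, len(arr)):
        if arr[visit] > arr[visit - 1]:
            first_increase = visit
            break
    if first_increase is None:
        return list(scores)  # no ARR increase: nothing is censored

    carried = None
    for value in scores[:first_increase]:
        if value is not None:
            carried = value
    if carried is None:
        return list(scores)  # no earlier score available to carry forward

    censored = list(scores[:first_increase])
    for value in scores[first_increase:]:
        censored.append(carried if value is not None else None)
    return censored


# Hypothetical patient: ARR rises at the fourth visit (index 3).
example = censor_scores([44, 40, None, 30, 28], [3, 3, 3, 4, 4])
print(example)
```

In this hypothetical series the later improvements (30 and 28) are replaced by the pre-increase value of 40, which is why censoring tends to shrink the apparent change from baseline.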
Comparison of Proportions of Response (for 12-Month Completers)
The comparison of two response proportions (≥ 50% reduction from baseline in IDS-SR or HRSD24 scores), defined as one of several secondary efficacy endpoints, is discussed in this summary review. However, the direct, simple comparison of two response proportions between D02 and D04 patients, without adjusting for individual patient baseline IDS-SR or HRSD24 scores and other clinically important patient covariates, is subject to potential bias. The 12-month completers are the patients who were in close compliance with the scheduled follow-up, and may provide the “best-case” scenario as compared to those who did not complete the 12-month study. The selected cutoff point (≥ 50% reduction from baseline, response; else, non-response) is also subject to measurement error, variability of the IDS-SR or HRSD24 scores, and serial correlation of repeated-measures data. Logistic regression is more appropriate than a simple comparison of two proportions because it takes important patient covariates (site, baseline IDS-SR or HRSD24, and others) into the model building. Sensitivity analysis may be helpful to evaluate the robustness of Response/Non-Response outcomes across various cutoff points. Appropriate statistical methods for pooling of multicenter data, such as meta-analysis or stratified categorical data analysis, will provide a more precise evaluation of the true treatment effect for the comparison of two response proportions.
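As a rough illustration of the cutoff sensitivity mentioned above, the sketch below classifies responders at several cutoffs. The scores are hypothetical and the function name is invented for this example; it does not reproduce any analysis in the submission.

```python
def response_rate(baseline, month12, cutoff=0.50):
    """Proportion of patients whose score fell by at least `cutoff`,
    expressed as a fraction of the baseline score."""
    responders = sum(
        1 for b, m in zip(baseline, month12) if (b - m) / b >= cutoff
    )
    return responders / len(baseline)


# Hypothetical baseline and 12-month scores for six patients.
baseline = [44, 38, 50, 41, 36, 47]
month12 = [20, 30, 26, 40, 17, 25]

for cutoff in (0.40, 0.50, 0.60):
    print(f"cutoff {cutoff:.0%}: response rate {response_rate(baseline, month12, cutoff):.2f}")
```

With these made-up values, several patients sit just below the 50% line, so a modest shift in the cutoff changes the response proportion considerably. That is the kind of instability a sensitivity analysis is meant to expose.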
Concordance Between HRSD24 and IDS-SR Scores in D02 Study
As discussed above, the IDS-SR score was used in the RMLR analyses to estimate the average rate of change per month (slope) for D02 and D04 patients. The sponsor’s justification is that the HRSD24 score data were collected only at baseline and at 12 months in the D04 study, and that the IDS-SR score had been shown to be a “good predictor” of the HRSD24 score in the published literature.
In FDA’s major deficiency letter of March 1, 2004 and FDA’s email dated March 31, 2004, we did not agree that concordance studies reported in the published literature are sufficient to support IDS-SR as a “good predictor” of HRSD24. Due to wide variability of paired IDS-SR/HRSD24 scores from patient to patient, pooled-patient correlation and regression analyses are not appropriate. FDA asked the sponsor to prepare the following analyses:
Calculate correlation coefficients between IDS-SR and HRSD24 scores for each individual patient and a pooled estimated correlation coefficient over all patients by an appropriate statistical method.
Likewise, calculate corresponding results for estimated slopes and their standard errors for each individual patient.
Provide the fitted linear regression model (intercept, slope, or higher terms), the estimated parameter values, standard errors, 95% confidence intervals, and squared multiple correlation coefficient (R-Square) to show goodness-of-fit of the sponsor’s regression equation to the observed paired IDS-SR/HRSD24 data.
Provide graphical displays of the paired HRSD24/IDS-SR data for all individual patients, showing all observed data pairs and the fitted regression lines.
The sponsor has responded to all of the analyses FDA requested above. However, the linear regression model assumes that all individual patient IDS-SR/HRSD24 pairs are independent, whereas they are actually correlated, as a more proper longitudinal data analysis would recognize. Nevertheless, this reviewer believes that linear regression analyses based on individual patient paired IDS-SR/HRSD24 data may still be applied to verify the sponsor’s claim that IDS-SR is a “predictor” of HRSD24.
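The per-patient quantities requested above (correlation coefficient, regression slope and intercept, and R-Square) can be computed as in the following sketch. It uses ordinary least squares on one patient's paired scores; the data are hypothetical, the function name is invented here, and the independence caveat noted above still applies.

```python
from math import sqrt


def fit_patient(ids_sr, hrsd):
    """Pearson r, OLS slope/intercept, and R-square for one patient's
    paired (IDS-SR, HRSD24) scores, regressing HRSD24 on IDS-SR."""
    n = len(ids_sr)
    mean_x = sum(ids_sr) / n
    mean_y = sum(hrsd) / n
    sxx = sum((x - mean_x) ** 2 for x in ids_sr)
    syy = sum((y - mean_y) ** 2 for y in hrsd)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ids_sr, hrsd))
    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return {
        "r": r,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r_squared": r * r,  # equals R-square for a one-predictor fit
    }


# Hypothetical paired scores for a single patient over five visits.
patient = fit_patient([45, 41, 38, 30, 27], [28, 27, 22, 20, 15])
print(patient)
```

Repeating the fit for every patient gives the distribution of per-patient R-Square and correlation values, which can then be summarized across patients.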
Unipolar and Bipolar Patients
In Question # 9 of FDA’s major deficiency letter of March 1, 2004, we asked the sponsor to prepare separate and combined analyses for unipolar and bipolar patients for both the Response/Non-Response secondary efficacy endpoint and the RMLR analyses for the primary efficacy endpoint. The sponsor responded in their March 17, 2004 submission. The unipolar/bipolar subgroup analyses were not discussed in the original study design. The sample size for bipolar patients is too small to provide any statistically valid conclusion. The distribution of sample size in the D02 and D04 comparison is shown in Table 3.
Table 3. Distribution of Number of Patients in the D02 and D04 Comparison, by Unipolar/Bipolar Patients, IDS-SR (HRSD24) Scores
Group | D02 | D04
Unipolar, Evaluable | 163 (164) | 97 (91)
Unipolar, 12-Month Completer | 156 (157) | 97 (91)
Bipolar, Evaluable | 17 (17) | 15 (13)
Bipolar, 12-Month Completer | 17 (17) | 15 (13)
Combined, Evaluable | 180 (181) | 112 (104)
Combined, 12-Month Completer | 173 (174) | 112 (104)
V. Statistical Analyses Results
PS Analyses
All graphical displays for each of the 17 covariates for D02 and D04 comparisons are shown in Attachment 20 of the March 17, 2004 submission. The graphical displays appear to be acceptable for examining the comparability of D02 and D04 patient populations with respect to these 17 covariates after PS adjustment.
The before-and-after PS comparisons for the 17 covariates are also shown. The statistically significant differences in some covariates between D02 and D04 group patients before PS adjustment were non-significant after PS adjustment. For example, a statistically significant p-value (< 0.001) for percent of patients who received ECT in the current MDE between D02 and D04 groups became non-significant (p = 0.434) after PS adjustment. Although some PS quintile-by-treatment interaction is also shown for percent of patients who received ECT in their lifetime and for length of current MDE, the sponsor’s PS adjustment procedures via the logistic regression model appear to be acceptable. The final PS quintile-by-treatment frequency distribution is shown in Table 4.
Table 4. Treatment (D02/D04) by PS Quintile Frequency Distribution (Evaluable Patients)
PS Quintile Group | D02 (N = 205) | D04 (N = 124)
1 | 22 (10.9%) | 43 (34.7%)
2 | 39 (19.4%) | 26 (21.0%)
3 | 36 (17.9%) | 29 (23.4%)
4 | 48 (23.9%) | 17 (13.7%)
5 | 56 (27.9%) | 9 (7.3%)
Total | 201 (100%)* | 124 (100%)
[*: 4 patients excluded from PS analysis]
The above frequency distributions are statistically acceptable. Membership in each of the above 5 PS quintile groups was coded as a categorical variable (5 levels) in the RMLR analyses.
RMLR Analyses, D02 and D04 Comparison, IDS-SR Scores
The following Figure 1 shows the observed and predicted (by RMLR) mean IDS-SR scores at baseline and at each of the 4 quarters (9 pooled sites, evaluable patients). The predicted mean IDS-SR scores appear to be close to the observed scores. The predicted differences in mean IDS-SR scores between D2 and D4 patients showed smaller improvement than the observed differences. For example, at Quarter 4, the predicted difference (D2 – D4) is -4.8 (33.7 – 38.5) and the observed difference (D2 – D4) is -6.6 (32.6 – 39.2).
The following Figure 2 shows the corresponding mean IDS-SR scores for 12-month completers.
For the primary effectiveness endpoint (IDS-SR), the differences in average rate of change per month (slope) and their 95% confidence intervals (CI) for Evaluable patients and 12-month completers are shown in Table 5.
Table 5. Difference of slope (D2 – D4) and the 95% CI, All Sites, Unipolar/Bipolar Patients Combined
Patient Population | Difference (Std Error) | 95% CI for Difference
Evaluable | -0.397 (0.1) | (-0.59, -0.21)
12-Month Completers | -0.452 (0.1) | (-0.65, -0.26)
Clinical interpretation is needed to decide whether or not the above results are clinically acceptable.
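For orientation, the intervals in Table 5 are close to what a simple normal-approximation (Wald) interval gives from the reported difference and standard error. The sketch below assumes a multiplier of 1.96 and assumes the slope differences are negative (D02 scores declining faster); the review does not state how the sponsor's intervals were computed, and the reported standard errors are rounded.

```python
def wald_ci(estimate, std_error, z=1.96):
    """Normal-approximation 95% confidence interval: estimate +/- z * SE."""
    return estimate - z * std_error, estimate + z * std_error


# Differences in slope (D2 - D4) and standard errors as read from Table 5.
table5 = [("Evaluable", -0.397, 0.1), ("12-Month Completers", -0.452, 0.1)]

for label, diff, se in table5:
    lower, upper = wald_ci(diff, se)
    print(f"{label}: {diff} per month, approx. 95% CI ({lower:.2f}, {upper:.2f})")
```

Small discrepancies from the tabulated limits are expected, since the model-based intervals would use unrounded standard errors and possibly a t multiplier.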
The sample size for the above D2 and D4 comparison is shown in Table 6.
Table 6. Sample Size (N) by D2/D4 Comparison, All Sites
Group | Baseline | Q 1 | Q 2 | Q 3 | Q 4
D2 (N) | 201* | 200 | 195 | 183 | 177
Missing | 0 | 1 | 6 | 18 | 24
D4 (N) | 124 | 120 | 119 | 116 | 112
Missing | 0 | 4 | 5 | 8 | 12
*: 4 evaluable patients did not have IDS-SR and/or PS scores available
I have revised part of the sponsor’s reported proportions of Response/Non-Response for IDS-SR and HRSD24 scores, for 12-month completers, as shown in the sponsor’s Tables 3 and 4, Volume 19, Clinical Summary (see Tables 7A and 7B below).
Table 7A. FDA’s Revised Proportions of Response for 12-Month Completers, IDS-SR Scores, All Pooled Sites
12-Month Data | D02 | D04 | p-value^{a}
Response^{b} | 22% (38/173) | 12% (13/112) | 0.027
LOCF Response | 22% (39/176) | 12% (13/112) | 0.027
Complete Response^{c} | 15% (27/180) | 4% (4/112) | 0.001
LOCF Complete Response | 13% (27/204) | 3% (4/124) | 0.003
a. Fisher’s two-sided exact test
b. ≥ 50% decrease from baseline
c. IDS-SR ≤ 14
Table 7B. FDA’s Revised Proportions of Response for 12-Month Completers, HRSD24 Scores, All Pooled Sites
12-Month Data | D02 | D04 | p-value^{a}
Response^{b} | 30% (54/181) | 13% (13/104) | 0.001
LOCF Response | 27% (55/205) | 13% (13/104) | 0.004
Complete Response^{c} | 17% (31/181) | 7% (7/104) | 0.018
LOCF Complete Response | 16% (32/205) | 7% (7/104) | 0.029
a. Fisher’s two-sided exact test
b. ≥ 50% decrease from baseline
c. HRSD24 ≤ 9
Please note that, in Tables 7A and 7B, p-values calculated from direct pooling of all cell frequencies (Response/Non-Response by treatment group) over all sites (non-overlapping and overlapping), without an appropriate statistical modeling approach or meta-analysis, may be invalid.
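To make the pooled calculation behind these p-values concrete, the following is a minimal sketch of a two-sided Fisher's exact test on a single 2x2 table, written with exact integer arithmetic. It sums the probabilities of all tables with the same margins that are no more likely than the observed one, which is one common definition of the two-sided test; whether the sponsor's software used the same definition is an assumption. The function name is invented for this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment groups, columns are Response / Non-Response.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized hypergeometric probability of x responders in row 1.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Pooled IDS-SR Response counts from Table 7A: 38/173 (D02) versus 13/112 (D04).
p_value = fisher_exact_two_sided(38, 173 - 38, 13, 112 - 13)
print(f"two-sided p = {p_value:.3f}")
```

The calculation treats all patients as one sample and ignores clinical site, which is exactly the limitation noted above; a stratified analysis would need the per-site Response counts, which are not given in this review.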
As noted under the RMLR section, in FDA’s March 26, 2004 email we asked the sponsor to reanalyze the IDS-SR and HRSD24 score data using only those sites that enrolled both D2 and D4 patients (overlapping sites) and for censored patients (i.e., additions or changes in either antidepressant drugs or ECT). The following FDA-revised Tables 8A through 8D correspond to Tables 241 through 244 shown in the sponsor’s responses of April 2, 2004 to FDA’s email dated March 26, 2004 (Amendment # 6).
FDA’s Revised Table 8A
IDS-SR Scores: D02/D04 Comparisons (Overlapping sites for both D02 and D04 patients only), Evaluable Patient Population (Unipolar and Bipolar patients combined)
Measure | D02 | D04 | P-value | 95% CI (Least square mean) | RR (95% CI)^{b}, D2 – D4
N at Baseline | 147 | 120 | | |
Baseline Average | 42.7 | 43.6 | | |
12-Month Data | N = 131 | N = 108 | | |
Average | 33.8 | 39.4 | | |
Average change from baseline (SD) | 8.9 (13.3) | 4.2 (12.1) | 0.003 | (-8.9, -1.8) |
LOCF average change from baseline (SD) | 8.4 (12.8) | 4.6 (12.2) | 0.021 | (-7.1, -0.6) |
Response (% of Subjects)^{c} | 19.8 (26/131) | 11.1 (12/108) | 0.076^{a} | | 1.8 (0.94, 3.4)
LOCF Response (% of Subjects) | 17.7 (26/147) | 11.7 (14/120) | 0.227 | | 1.5 (0.83, 2.8)
Complete Response (% of Subjects)^{d} | 13.0 (17/131) | 2.8 (3/108) | 0.0045 | | 4.7 (1.4, 15.5)
LOCF Complete Response (% of Subjects) | 11.6 (17/147) | 2.5 (3/120) | 0.0048 | | 4.6 (1.4, 15.4)
a. By Fisher’s two-sided exact test
b. Risk Ratio (RR) = [P(Response) for D02]/[P(Response) for D04]
Example: for Response, the estimated RR = (26/131)/(12/108) = 1.78
c. Response: ≥ 50% decrease from baseline
d. Complete Response: IDS-SR ≤ 14
Primary effectiveness endpoint [Difference in two slopes, D2 – D4: -0.32 per month, 95% CI: (-0.52, -0.12), p = 0.002 for rejecting the null hypothesis that the true difference = 0] (see April 2, 2004 Amendment # 6)
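The risk ratio defined in footnote b of Table 8A, together with an approximate confidence interval, can be reproduced as sketched below. The interval uses the usual large-sample formula on the log scale; that the tabulated intervals were computed this way is an assumption, although the result for the Response row is close to the reported 1.8 (0.94, 3.4). The function name is invented for this sketch.

```python
from math import exp, log, sqrt


def risk_ratio_ci(x1, n1, x2, n2, z=1.96):
    """Risk ratio (x1/n1)/(x2/n2) with a log-scale normal-approximation CI."""
    rr = (x1 / n1) / (x2 / n2)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    lower = exp(log(rr) - z * se_log)
    upper = exp(log(rr) + z * se_log)
    return rr, lower, upper


# Response row of Table 8A: 26/131 (D02) versus 12/108 (D04).
rr, lower, upper = risk_ratio_ci(26, 131, 12, 108)
print(f"RR = {rr:.2f}, approx. 95% CI ({lower:.2f}, {upper:.2f})")
```

An interval whose lower limit falls below 1, as here, is consistent with the non-significant Fisher's exact p-value reported for the same row.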
FDA’s Revised Table 8B
HRSD24 Scores: D02/D04 Comparisons (Overlapping sites for both D02 and D04 patients only), Evaluable Patient Population
Measure | D02 | D04 | P-value | 95% CI (Least square mean) | RR (95% CI)^{b}
N at Baseline | 147 | 120 | | |
Baseline Average | 27.4 | 27.7 | | |
12-Month Data | N = 130 | N = 100 | | |
Average | 19.7 | 23.0 | | |
Average change from baseline (SD) | 7.7 (8.8) | 4.7 (7.6) | 0.020 | (-5.1, -0.4) |
LOCF average change from baseline (SD) | 6.9 (8.9) | 4.7 (7.6) | 0.113 | (-4.0, 0.4) |
Response (% of Subjects)^{c} | 27.7 (36/130) | 11.0 (11/100) | 0.0018^{a} | | 2.5 (1.3, 4.7)
LOCF Response (% of Subjects) | 25.2 (37/147) | 11.0 (11/100) | 0.0055 | | 2.3 (1.2, 4.2)
Complete Response (% of Subjects)^{d} | 16.9 (22/130) | 5.0 (5/100) | 0.0065 | | 3.4 (1.3, 8.6)
LOCF Complete Response (% of Subjects) | 15.6 (23/147) | 5.0 (5/100) | 0.013 | | 3.1 (1.2, 7.9)
a. By Fisher’s two-sided exact test
b. Risk Ratio (RR) = [P(Response) for D02]/[P(Response) for D04]
c. Response: ≥ 50% decrease from baseline
d. Complete Response: HRSD24 ≤ 9
FDA’s Revised Table 8C
IDS-SR Scores After D02 Censoring Only: D02/D04 Comparisons (Overlapping sites for both D02 and D04 patients only), Evaluable Patient Population
Measure | D02 | D04 | P-value | 95% CI (Least square mean) | RR (95% CI)^{b}
N (at Baseline) | 147 | 120 | | |
Baseline Average | 42.7 | 43.6 | | |
12-Month Data | N = 131 | N = 108 | | |
Average | 36.0 | 39.4 | | |
Average change from baseline (SD) | 6.7 (13.3) | 4.2 (12.1) | 0.026 | (-7.5, -0.5) |
LOCF average change from baseline (SD) | 6.1 (12.8) | 4.6 (12.2) | 0.160 | (-5.5, 0.9) |
Response (% of Subjects)^{c} | 16.8 (22/131) | 11.1 (12/108) | 0.26^{a} | | 1.5 (0.78, 2.9)
LOCF Response (% of Subjects) | 15.0 (22/147) | 11.7 (14/120) | 0.47 | | 1.3 (0.7, 2.4)
Complete Response (% of Subjects)^{d} | 8.4 (11/131) | 2.8 (3/108) | 0.095 | | 3 (0.86, 10.6)
LOCF Complete Response (% of Subjects) | 7.5 (11/147) | 2.5 (3/120) | 0.097 | | 3 (0.85, 10.5)
a. By Fisher’s two-sided exact test
b. Risk Ratio (RR) = [P(Response) for D02]/[P(Response) for D04]
c. Response: ≥ 50% decrease from baseline
d. Complete Response: IDS-SR ≤ 14
Primary effectiveness endpoint [Difference in two slopes, D2 – D4: -0.18 per month, 95% CI: (-0.38, 0.02), p = 0.079 for rejecting the null hypothesis that the true difference = 0] (see April 2, 2004 Amendment # 6)
RMLR predicted mean IDS-SR (Table 6.2.37, Amendment # 6)
Quarter | D2 (SE) | D4 (SE) | D2 – D4 | 95% CI*
1 | 37.93 (0.59) | 38.47 (0.60) | -0.54 | (-1.70, 0.64)
2 | 37.01 (0.65) | 38.09 (0.69) | -1.08 | (-2.38, 0.24)
3 | 36.35 (0.71) | 37.96 (0.81) | -1.61 | (-3.10, -0.12)
4 | 36.58 (0.77) | 38.72 (0.98) | -2.14 | (-3.84, -0.54)
(*The average of the two standard errors was used as the pooled SE for the 95% CI; N unknown)
FDA’s Revised Table 8D
HRSD24 Scores After D02 Censoring Only: D02/D04 Comparisons (Overlapping sites for both D02 and D04 patients only), Evaluable Patient Population
Measure | D02 | D04 | P-value | 95% CI (Least square mean) | RR (95% CI)^{b}
N (at Baseline) | 147 | 120 | | |
Baseline Average | 27.4 | 27.7 | | |
12-Month Data | N = 130 | N = 100 | | |
Average | 22.5 | 23.0 | | |
Average change from baseline (SD) | 4.9 (9.1) | 4.7 (7.6) | 0.581 | (-3.9, 1.7) |
LOCF average change from baseline (SD) | 4.3 (8.9) | 4.7 (7.6) | 0.910 | (-2.1, 2.4) |
Response (% of Subjects)^{c} | 18.5 (24/130) | 11.0 (11/100) | 0.14^{a} | | 1.7 (0.86, 3.3)
LOCF Response (% of Subjects) | 16.3 (24/147) | 11.0 (11/100) | 0.27 | | 1.5 (0.76, 2.3)
Complete Response (% of Subjects)^{d} | 7.7 (10/130) | 5.0 (5/100) | 0.59 | | 1.5 (0.54, 4.3)
LOCF Complete Response (% of Subjects) | 6.8 (10/147) | 5.0 (5/100) | 0.79 | | 1.3 (0.48, 3.8)
a. By Fisher’s two-sided exact test
b. Risk Ratio (RR) = [P(Response) for D02]/[P(Response) for D04]
c. Response: ≥ 50% decrease from baseline
d. Complete Response: HRSD24 ≤ 9
In Tables 8A through 8D, the 95% confidence interval (CI) for the difference (D2 – D4) of two average changes from baseline, rather than the calculated p-values, would allow better clinical evaluation, because p-values are used only to test the null hypothesis that the difference between the two average changes = 0 against the alternative hypothesis that the difference ≠ 0.
In Tables 8C and 8D for censored patients (overlapping sites), as described previously, “IDS-SR or HRSD24 raw scores were censored such that the value from the patient’s IDS-SR or HRSD24 measurement obtained prior to their first increase in the antidepressant resistance rating score was carried forward and replaced all of the patient’s subsequent, non-missing, IDS-SR measurements.”
In Tables 8C and 8D (censored patients from overlapping sites, see Appendix 1), most of the statistical results failed to support that D2 patients showed IDS-SR or HRSD24 results superior to those for D4 patients, except for the comparison of average change from baseline at 12 months for IDS-SR only.
Concordance between IDS-SR and HRSD24 Scores
Due to wide variability of paired IDS-SR/HRSD24 scores from patient to patient, FDA requested the sponsor to calculate, for each individual patient, the estimated correlation coefficient and its 95% confidence interval (CI), the estimated regression intercept and slope and their 95% CI, and the unadjusted (for degrees of freedom) R-Square (R^{2}), which measures the “proportion of total variation about the mean HRSD24 explained by the fitted regression equation”. The R^{2} evaluates how well the IDS-SR score predicts the HRSD24 score. R^{2} ranges from 0% (worst prediction fit) to 100% (perfect prediction fit). In Figure 3, the histogram of R^{2} shows relatively poor to fair prediction, with a mean R^{2} of 0.55 and a range from 0 to 1, for 235 evaluable patients from the D2 study. In Table 9, the average simple Pearson correlation coefficient of 0.7 with 95% CI (0.67, 0.73) between IDS-SR and HRSD24 scores again indicates that IDS-SR is not a “good predictor” of HRSD24.
Table 9. Summary of Correlation Coefficient and Regression Slope, for 235 Evaluable Patients, Paired IDS-SR/HRSD24 Scores, All Sites
Parameter | N | Mean | SD | Median | Min | Max | Lower 95% CL | Upper 95% CL
Pearson Correlation Coefficient | 235 | 0.7 | 0.25 | 0.77 | -0.26 | 1.0 | 0.67 | 0.73
Slope | 235 | 0.55 | 0.25 | 0.56 | -0.76 | 1.49 | 0.51 | 0.58
VI. Conclusion
The IDS-SR does not appear to be a “good predictor” of the HRSD24 based on the distribution of R^{2} values (Figure 3) and sample correlation coefficients (Table 9) of 235 D2 evaluable patients.
Due to the small sample size (Table 3) for bipolar patients, no valid statistical analyses can be prepared, but a clinical decision is needed.
The sample size analyzed in this PMA was based neither on the comparison of two true slopes (primary effectiveness endpoint) nor on mean responses in these repeated-measures/longitudinal data analyses. No minimum clinically detectable difference in two slopes or in mean HRSD24 or IDS-SR was defined at the study design stage in order to estimate the required sample size with pre-specified power, type I error, estimated variability of the data, number of follow-up visits, and correlation among repeated measures.
A clinical decision is also required on several important issues, such as pooled sites, potentially important hidden covariates in the PS analysis, non-overlapping versus overlapping sites, and censored versus non-censored analyses. The validity of statistical inferences from the comparison of two proportions of Response pooled over all non-overlapping and overlapping sites, without any appropriate statistical modeling approach, such as meta-analysis, is highly questionable. Clinically important patient covariates, such as patient baseline IDS-SR or HRSD24 measurements, clinical site, and others, must be considered in the comparison of two proportions via a statistical modeling approach or stratified categorical data analysis.
For censored patients from overlapping sites, no statistically significant differences were found in the primary effectiveness endpoint (difference in two slopes, IDS-SR, Table 8C) or in the secondary effectiveness endpoint (difference in two proportions of Response or difference in average change from baseline, HRSD24, Table 8D).
Due to the above statistical issues, such as questionable concordance between HRSD24 and IDS-SR, questionable pooling of multicenter data for the comparison of proportions of Response, and statistically non-significant findings from censored patients at overlapping sites (Tables 8C and 8D) for the IDS-SR primary effectiveness endpoint (slope) and the HRSD24 secondary effectiveness endpoint (Response proportions), it is unclear whether the effectiveness claim of D02 over D04 group patients has been demonstrated.
Appendix 1
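Table 10. Distribution of Number of Patients by Clinical Site (10 Non-overlapping and 12 Overlapping Sites), D02 and D04 Study
Site | Long-Term D02 | Evaluable D02 | 12-Month D02 | Evaluable D04 | 12-Month D04
040 | 16 | 15 | 15 | 6 | 5
041 | 10 | 9 | 7 | 3 | 1
042 | 10 | 9 | 7 | - | -
043 | 13 | 12 | 10 | 8 | 7
044 | 17 | 13 | 10 | 12 | 12
045 | 18 | 18 | 15 | 13 | 10
046 | 13 | 11 | 10 | 2 | 2
047 | 9 | 7 | 7 | - | -
048 | 7 | 7 | 7 | - | -
049 | 12 | 10 | 9 | 11 | 11
050 | 10 | 9 | 8 | 8 | 6
051 | 9 | 7 | 3 | - | -
052 | 9 | 8 | 7 | - | -
053 | 4 | 3 | 2 | - | -
054 | 13 | 12 | 12 | 15 | 14
055 | 8 | 5 | 4 | - | -
056 | 9 | 9 | 9 | - | -
057 | 6 | 5 | 5 | 16 | 14
058 | 19 | 17 | 14 | 14 | 14
059 | 18 | 16 | 13 | 12 | 12
060 | 3 | 3 | 3 | - | -
071 | - | - | - | 4 | 4
Total (All) | 233 | 205 | 177 | 124 | 112
Total (Overlapping) | 165 | 147 | 128 | 120 | 108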