Background Information for Advisory Committee for Pharmaceutical Science
Concept and Criteria of BioINequivalence
Bioequivalence is defined as “the absence of a significant difference in the rate and extent to which the active ingredient or active moiety in pharmaceutical equivalents or pharmaceutical alternatives becomes available at the site of drug action when administered at the same molar dose under similar conditions in an appropriately designed study...”. To evaluate bioequivalence, the U.S. Food and Drug Administration (FDA) has employed a testing procedure termed the two one-sided tests procedure ([i]) to determine whether the average values for the pharmacokinetic measures from the test and reference products are comparable. This procedure involves the calculation of a confidence interval for the ratio between the average values of the test and reference products. FDA considers a test product to be bioequivalent to a reference product if the 90% confidence intervals of the geometric mean ratios of AUC and Cmax between the test and reference products fall within 80-125% ([ii]).
Recently, the FDA has received several studies intended to show bio-inequivalence between two drug products. For example, an innovator company might conduct a study to challenge FDA’s approval of generic versions of its drug product. Although the regulations contain no formal definition of bio-inequivalence, the concept is intuitively not hard to perceive, given the well-defined concept of bioequivalence. However, there are no clear criteria to guide sponsors in conducting bio-inequivalence studies or FDA reviewers in assessing the validity of such studies. Because bio-inequivalence lacks a clear definition, there has been some confusion and misunderstanding among the public.
Many questions arise when evaluating a bio-inequivalence claim. A typical question is whether it is appropriate to claim bio-inequivalence when the two-sided 90% confidence intervals for the ratios of the PK endpoints do not fall inside the bioequivalence interval. Numerous literature reports claim bio-inequivalence based on a failed bioequivalence study without identifying the causes of the study failure. There are many ways that a bioequivalence study can fail, including an insufficient number of subjects. Many products that were claimed to be bio-inequivalent in the literature might well be bioequivalent if the studies were conducted appropriately. Therefore, it is imperative to develop and establish a bio-inequivalence criterion to clear up confusion and misunderstanding among the public.
In these presentations, we first introduce the concept of bio-inequivalence and present a statistical explanation for the proposed criterion to assess bio-inequivalence. We then discuss several statistical strategies to assess bio-inequivalence studies with three pharmacokinetic endpoints (Cmax, AUCt and AUC∞). The goal is to propose a set of criteria that are scientifically sound, statistically valid, and easy to use and to provide sufficient information to stimulate discussion on the evaluation of bio-inequivalence.
The concept of bio-inequivalence and test criteria
FDA’s bioequivalence criteria require the 90% confidence interval of the ratio of the geometric means of the test and reference drug products to be within the bioequivalence interval [80%, 125%]. The definition of the bio-inequivalence region then is simply the region that lies outside the bioequivalence interval, i.e., (0, 80%) or (125%, ∞). Now the question is why a study failing to show bioequivalence cannot be used to claim bio-inequivalence. Once this question is answered, it will be a little easier to understand the statistical criteria proposed for bio-inequivalence claims.
To answer this question, we need to understand statistically how the criteria for bioequivalence are formed. To test bioequivalence, the null hypothesis is set to be the bio-inequivalence region and the alternative hypothesis to be the bioequivalence interval. The goal is to see if bio-inequivalence can be rejected so that we may conclude that bioequivalence is true. For this purpose, it is important for the probability of an error that wrongfully rejects bio-inequivalence, and therefore falsely concludes bioequivalence, to be small. This error is usually controlled at the level of 0.05, which is the so-called significance level or the type I error rate. To reject the bio-inequivalence region, we need to perform two one-sided tests, each controlling the type I error rate at the level of 0.05. The maximum error rate across the two tests is in fact controlled at the level of 0.05. The statistical criteria for rejecting bio-inequivalence and claiming bioequivalence are to have two-sided 90% confidence intervals (for the geometric mean ratio for each of the three PK endpoints) that are each within the bioequivalence interval. This procedure based on 90% confidence intervals is identical to carrying out the two one-sided tests described above.
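For reference, the hypotheses behind the two one-sided tests procedure can be written out as below. The notation is an assumption introduced here for illustration and does not appear in the source: θ is the true ratio of geometric means, δ̂ the estimated difference of log means, SE its standard error, and t the 95th percentile of a t distribution with ν degrees of freedom.

```latex
\begin{aligned}
H_0 &: \theta < 0.80 \ \text{ or } \ \theta > 1.25 \quad \text{(bio-inequivalence)} \\
H_1 &: 0.80 \le \theta \le 1.25 \quad \text{(bioequivalence)} \\[4pt]
H_{01} &: \ln\theta < \ln 0.80 \quad \text{vs.} \quad H_{11}: \ln\theta \ge \ln 0.80 \\
H_{02} &: \ln\theta > \ln 1.25 \quad \text{vs.} \quad H_{12}: \ln\theta \le \ln 1.25 \\[4pt]
\text{reject } H_{01} \text{ and } H_{02} \text{ at level } 0.05
&\iff \ln 0.80 \le \hat\delta - t_{0.95,\nu}\,\mathrm{SE}
\ \text{ and } \ \hat\delta + t_{0.95,\nu}\,\mathrm{SE} \le \ln 1.25
\end{aligned}
```

The last line restates the equivalence noted above: both one-sided tests reject exactly when the two-sided 90% confidence interval lies within the bioequivalence interval.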
To address whether failing to show bioequivalence demonstrates bio-inequivalence, we need to understand that in a bioequivalence test we usually do not control the error of wrongfully failing to conclude bioequivalence. Controlling this error at a very low level would be equivalent to having very high power in a bioequivalence test. To keep the significance level low and the power high at the same time, a large sample size will generally be required, which will increase the cost of the study. For example, if we set the power to be 85% and assume the variance is 0.04, the sample size required is about 22, given that the ratio of the two geometric means deviates from 1 by no more than 5%. In this case, the test could have about a 15% chance of failing to show bioequivalence even when the two drugs are truly equivalent. If the variance is larger than 0.04 and the ratio of the two geometric means deviates from 1 by more than 5% but is still within the bioequivalence interval, the power could be much lower than 85% for the given sample size of 22. That is, the chance of failing to show bioequivalence would be much higher than 15% even when the two drugs are equivalent. Therefore, because there is less control over the probability of failing to show bioequivalence, it is inappropriate to use a study that fails to show bioequivalence to claim bio-inequivalence.
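The dependence of power on sample size, variance, and true ratio can be explored with a rough calculation. The sketch below is not the method behind the figures quoted above; it assumes a balanced 2x2 crossover, treats 0.04 as the within-subject variance on the log scale, and replaces the t distribution with a normal approximation, which tends to overstate power slightly. The function name `tost_power` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def tost_power(n_subjects, within_var, true_ratio, alpha=0.05,
               lower=0.80, upper=1.25):
    """Approximate power of the two one-sided tests procedure.

    Assumptions (not from the source): balanced 2x2 crossover, known
    within-subject variance on the log scale, normal approximation.
    """
    norm = NormalDist()
    se = sqrt(2.0 * within_var / n_subjects)  # SE of the log-scale difference
    z = norm.inv_cdf(1.0 - alpha)             # one-sided critical value
    delta = log(true_ratio)
    # Probability that the two-sided CI lies entirely inside [lower, upper].
    p = (norm.cdf((log(upper) - delta) / se - z)
         - norm.cdf((log(lower) - delta) / se + z))
    return max(0.0, p)

# Scenario from the text, then a less favorable one with the same sample size.
print(tost_power(22, 0.04, 1.05))
print(tost_power(22, 0.09, 1.10))
```

Under these assumptions the second scenario gives markedly lower power than the first, which is the point of the paragraph: a failed bioequivalence study may simply be underpowered.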
Why, then, should the bio-inequivalence criterion be that the upper limit of the two-sided 90% CI is less than 80% or the lower limit is greater than 125%? As mentioned before, it is usually not realistic to control both types of error, i.e., wrongfully rejecting bio-inequivalence and wrongfully rejecting bioequivalence. A reasonable study tightly controls only one type of error. Therefore, when testing for bio-inequivalence, we would like the error of wrongfully rejecting bioequivalence to be small. To be consistent with bioequivalence testing, this error rate is also set at the level of 0.05. To reject bioequivalence, we also need to perform two one-sided tests; however, the level of each test may need to be 0.05. For one of the two tests to be significant at the 0.05 level, either the upper limit of the two-sided 90% CI has to be below 80% or the lower limit has to be above 125%.
Theoretically, it is possible for the type I error to reach 0.10 when a two-sided 90% CI is used to assess bio-inequivalence. However, this is true only when the variance of the estimated treatment difference (the ratio of geometric means) is very large. For typical crossover bio-inequivalence trials, such a large variance may not be a realistic possibility. Therefore, the type I error rate should be maintained at the level of 0.05 when a two-sided 90% CI is used.
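The behavior described in this paragraph can be checked numerically. The following is a minimal sketch, assuming the log-scale estimate is normally distributed with a known standard error and that the true ratio sits on the 80% boundary (the least favorable bioequivalent case); the function name `bie_type1_error` is hypothetical.

```python
from math import log
from statistics import NormalDist

def bie_type1_error(se, true_ratio=0.80, alpha=0.05, lower=0.80, upper=1.25):
    """Chance of declaring bio-inequivalence when the true ratio is `true_ratio`.

    Assumes (for illustration) a normally distributed log-scale estimate
    with known standard error `se`.
    """
    norm = NormalDist()
    z = norm.inv_cdf(1.0 - alpha)
    delta = log(true_ratio)
    # Upper limit of the two-sided 90% CI falls below the lower bound.
    p_below = norm.cdf((log(lower) - delta) / se - z)
    # Lower limit of the two-sided 90% CI falls above the upper bound.
    p_above = 1.0 - norm.cdf((log(upper) - delta) / se + z)
    return p_below + p_above

# Error rate at the boundary for increasingly large standard errors.
for se in (0.05, 0.5, 5.0, 50.0):
    print(se, round(bie_type1_error(se), 4))
```

In this simplified model the rate stays at about 0.05 for small standard errors and approaches 0.10 only as the standard error becomes very large relative to the width of the bioequivalence interval.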
The figure illustrates the different possible outcomes. A study with the two-sided 90% confidence interval completely within 80-125% demonstrates bioequivalence and allows market access. A study with the two-sided 90% confidence interval completely outside 80-125% demonstrates bio-inequivalence and may be grounds for market exclusion. A study with the point estimate within 80-125% but the two-sided 90% confidence interval extending outside 80-125% fails to demonstrate bioequivalence. A study with the point estimate outside 80-125% but the two-sided 90% confidence interval overlapping 80-125% fails to demonstrate bio-inequivalence. Both of the failing cases would require studies with larger sample sizes to draw a definitive regulatory conclusion.
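The four outcomes described above can be expressed as a small decision rule. This is an illustrative sketch only; the function name `classify_outcome` and the returned labels are hypothetical, and ratios are given as fractions (0.80 for 80%).

```python
def classify_outcome(ci_lower, ci_upper, point_estimate, lower=0.80, upper=1.25):
    """Classify a study from its two-sided 90% CI for the geometric mean ratio.

    The labels mirror the four cases in the text; they are illustrative,
    not regulatory terms.
    """
    if lower <= ci_lower and ci_upper <= upper:
        return "bioequivalence demonstrated"
    if ci_upper < lower or ci_lower > upper:
        return "bio-inequivalence demonstrated"
    # The interval straddles a limit: neither claim is supported.
    if lower <= point_estimate <= upper:
        return "failed to demonstrate bioequivalence"
    return "failed to demonstrate bio-inequivalence"

print(classify_outcome(0.92, 1.08, 1.00))
print(classify_outcome(0.60, 0.75, 0.67))
print(classify_outcome(0.78, 1.02, 0.89))
print(classify_outcome(0.70, 0.85, 0.77))
```

Strictly speaking, both straddling cases support neither claim; the two labels follow the text in naming them by where the point estimate lies.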
Evaluating the three PK endpoints collectively:
As mentioned earlier, based on the interpretation of the regulation, FDA usually requires three pharmacokinetic endpoints (Cmax, AUCt, and AUC∞) to show bioequivalence. All the two-sided 90% confidence intervals for the ratios of the geometric means for the three pharmacokinetic endpoints must be within the bioequivalence interval to demonstrate bioequivalence. If the 90% confidence interval for just one of the three pharmacokinetic endpoints does not fall completely within the bioequivalence interval, the study has not demonstrated that the two drugs are bioequivalent. However, the statistical criteria for testing bio-inequivalence using all three pharmacokinetic endpoints will not be as simple. Here we discuss several strategies that could be used for assessing bio-inequivalence using three pharmacokinetic endpoints. The evaluation of the strategies is based on both the error rate of wrongfully rejecting bioequivalence and the power for detecting bio-inequivalence under various correlation structures.
One strategy that seems intuitive is to have at least one of the three pharmacokinetic endpoints satisfy the statistical criteria for bio-inequivalence, i.e., the upper (lower) limit of the two-sided 90% CI to be less (greater) than 80% (125%). However, this strategy could potentially inflate the error rate of wrongfully rejecting bioequivalence above the level of 0.05 if the three pharmacokinetic endpoints are not highly correlated.
The second strategy, which is just the opposite of the first, is to require all three pharmacokinetic endpoints to satisfy the statistical criteria for bio-inequivalence. This strategy can certainly control the error rate of wrongfully rejecting bioequivalence under all correlation structures. However, it may not always provide adequate power under alternatives that are of interest.
The third strategy that could protect the error rate of wrongfully rejecting bioequivalence is to pre-specify one pharmacokinetic endpoint for bio-inequivalence testing. For example, one could pre-specify AUCt and completely ignore the results of the other two pharmacokinetic endpoints. However, this strategy has good power only when AUCt is the endpoint most likely to demonstrate bio-inequivalence. If only the Cmax values of the two drugs were bio-inequivalent, then pre-specifying AUCt would give the test essentially no power to detect bio-inequivalence.
It is possible to develop a compromise approach. Instead of requiring all three pharmacokinetic endpoints to satisfy the statistical criteria for bio-inequivalence with two-sided 90% confidence intervals as the measurement, we could allow the confidence intervals to have different widths, while controlling the error rate at the level of 0.05 under all correlation structures. For example, it is possible to have one pharmacokinetic endpoint use a two-sided 92% confidence interval (slightly wider than a 90% confidence interval) to show bio-inequivalence, while the second pharmacokinetic endpoint uses a two-sided 86% confidence interval (narrower than a 90% confidence interval) and the third pharmacokinetic endpoint uses a two-sided 80% confidence interval (much narrower than a 90% confidence interval). For this strategy, it does not matter which pharmacokinetic endpoint uses which confidence interval. The advantage of this strategy is that it uses narrower confidence intervals to increase the power to show bio-inequivalence, although at the cost of slightly widening one pharmacokinetic endpoint’s confidence interval. Note that this strategy is developed under the assumption of a normal distribution. If the normal assumption is inadequate, it is possible to derive slightly different widths of confidence intervals under other distributions.
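One possible reading of this compromise rule is sketched below. The source does not spell out the decision procedure, so the following is an interpretation rather than the proposed method: bio-inequivalence is declared if the three confidence levels can be assigned to the three endpoints, in any order, so that every interval lies entirely outside the bioequivalence interval. The function names, the normal approximation, and the input format (log-scale estimate and standard error per endpoint) are all assumptions.

```python
from itertools import permutations
from math import log
from statistics import NormalDist

CI_LEVELS = (0.92, 0.86, 0.80)  # example widths quoted in the text

def ci_outside(log_est, se, level, lower=0.80, upper=1.25):
    """True if the two-sided CI at `level` lies entirely outside [lower, upper]."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal approximation
    return log_est + z * se < log(lower) or log_est - z * se > log(upper)

def bie_unequal_widths(endpoints, levels=CI_LEVELS):
    """`endpoints` is a list of three (log_estimate, standard_error) pairs.

    Returns True if some assignment of the three CI levels to the three
    endpoints puts every interval outside the bioequivalence interval.
    """
    return any(
        all(ci_outside(est, se, lvl) for (est, se), lvl in zip(endpoints, perm))
        for perm in permutations(levels)
    )

# Hypothetical data: three endpoints with different strengths of evidence.
example = [(log(0.80) - 0.18, 0.1), (log(0.80) - 0.15, 0.1), (log(0.80) - 0.13, 0.1)]
print(bie_unequal_widths(example))
print(bie_unequal_widths(example, levels=(0.90, 0.90, 0.90)))
```

Trying every permutation reflects the statement that it does not matter which endpoint uses which interval. This sketch does not verify the claimed control of the error rate at 0.05.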
1. Does the ACPS agree with the distinction between demonstrating bioINequivalence and failure to demonstrate bioequivalence?
Committee’s comments: The Committee felt that there was a need to establish criteria for bioINequivalence evaluation and that the criteria should not simply be failure of the bioequivalence test. The members argued that it was important to focus on clinical relevance, taking the therapeutic index into account. The Committee discussed both Area under the Curve (AUC) and Cmax as metrics important for bioequivalence and bioINequivalence.
2. Does the ACPS recommend a preferred method for evaluating the three pharmacokinetic endpoints for bioINequivalence?
· If bioINequivalence is demonstrated for any one pharmacokinetic endpoint, then bioINequivalence is demonstrated for the products.
· BioINequivalence must be demonstrated for all three pharmacokinetic endpoints for bioINequivalence to be demonstrated for the products.
· There should be one pre-selected pharmacokinetic endpoint used for bioINequivalence testing. If so, which one?
· The three pharmacokinetic endpoints should be evaluated for bioINequivalence with statistical corrections to the level of significance for each endpoint in order to maintain an overall significance level of 0.05.
Committee’s comments: The Committee agreed on a general understanding of bioINequivalence as a basis for moving forward, recognizing that it is not a simple matter. In addition, the members felt this is an important concept, especially in how it applies to the entire regulatory scenario. There was no consensus at this point on final criteria pertaining to the three pharmacokinetic endpoints.
If any one of the three PK endpoints fails to satisfy the equivalence interval [80%, 125%], then by definition bioequivalence is false and bioinequivalence is true. To test bioinequivalence using three PK endpoints, we would also like to control the error rate of wrongfully rejecting bioequivalence at the level of 5%.
As mentioned before, it is tempting to claim bioinequivalence between two drug products if one of the three endpoints demonstrates inequivalence by showing that a two-sided 90% CI falls in the bioinequivalence regions. This strategy could potentially inflate the overall type I error of wrongfully rejecting bioequivalence. For example, the error rate can be as high as 14.7% if no correlation exists (ρ=0), about 8.2% when the pairwise correlation is 0.90, and 5.9% when the three endpoints are highly correlated (ρ=0.99), provided the variance of the test statistics is not too large. Therefore, this strategy is not acceptable for assessing bioinequivalence using three endpoints.
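The inflation described here can be illustrated with a simple simulation. The sketch below is a simplified model and is not expected to reproduce the quoted percentages exactly: it assumes all three true ratios sit on the 80% boundary, that each test statistic is standard normal there, that every pair of statistics has the same correlation, and that only the lower one-sided test can reject. The function name `simulate_error_rates` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_error_rates(rho, n_sims=100_000, alpha=0.05, seed=0):
    """Monte Carlo error rates for three equicorrelated test statistics.

    Simplifying assumptions (not from the source): all three true ratios sit
    on the 80% boundary, each statistic is standard normal there, every pair
    has correlation `rho` (0 <= rho <= 1), and only the lower one-sided test
    can reject.
    Returns (rate for the any-one rule, rate for the all-three rule).
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(alpha)       # about -1.645
    a, b = sqrt(rho), sqrt(1.0 - rho)
    any_hits = all_hits = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, 1.0)         # common factor induces correlation
        stats = [a * shared + b * rng.gauss(0.0, 1.0) for _ in range(3)]
        rejected = [s < crit for s in stats]
        any_hits += any(rejected)
        all_hits += all(rejected)
    return any_hits / n_sims, all_hits / n_sims

for rho in (0.0, 0.90, 0.99):
    print(rho, simulate_error_rates(rho))
```

With independent statistics the any-one rule has an error rate of 1 - 0.95^3, or about 14.3%, in this model, and the rate falls toward 5% as the correlation approaches 1. The all-three rule stays at or below 5% throughout.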
However, if one is certain that one PK endpoint could have the highest power to demonstrate inequivalence among the three, it is possible to pre-specify the PK endpoint to test for bioinequivalence using one PK endpoint at the level of 5%. This strategy will protect the error rate of wrongfully rejecting bioequivalence at the level of 5% and perform well if we know, at the design stage of study protocol, which PK endpoint is most likely to demonstrate bioinequivalence. Pre-specifying one PK endpoint is important in this strategy in order to protect the error rate. The disadvantage of this strategy is that it could end up having low power if the pre-specified PK endpoint in fact does not have high power to show inequivalence.
If one cannot pre-specify a PK endpoint, all three PK endpoints should be used to demonstrate bioinequivalence. For this scenario, the strategy that requires all three two-sided 90% CIs to fall outside the equivalence interval is not recommended. Though this strategy protects the error rate tightly, it is overly stringent for showing bioinequivalence in most situations. An enhanced approach for this scenario is to use CIs of unequal width, such as 92%, 86%, and 80% for the three CIs. This approach controls the error rate at the 5% level regardless of the correlation structure of the three endpoints, and can have enhanced power in situations where all three PK endpoints may have some power to demonstrate bioinequivalence but it is uncertain which one would have the best power.
In this presentation, we will explain that so far there is no universally powerful approach that can be used to evaluate three PK endpoints for bioinequivalence. The sponsors of bioinequivalence studies should be able to make their own choice of strategy for evaluating bioinequivalence. However, the choice must be pre-specified in the bioinequivalence study protocol before the study is conducted.
What is your preferred method for evaluating the three pharmacokinetic parameters for bioinequivalence?
· If bioinequivalence is demonstrated for any one pharmacokinetic parameter that is prespecified, then bioinequivalence is demonstrated for the products.
· Bioinequivalence must be demonstrated for all three pharmacokinetic parameters for bioinequivalence to be demonstrated for the products, where the error rate is controlled at 5%.
· FDA should allow sponsors of bioinequivalence studies to choose their own strategy; sponsors should prespecify the choice in the study protocol before the study is conducted.
[i] D.J. Schuirmann. A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. J. Pharmacokinet. Biopharm. 15: 657-680 (1987).