Toxicological Principles for the Safety Assessment of Food Ingredients
Chapter IV.B.4. Statistical Considerations in Toxicity Studies
The regulations governing approval for marketing imply that submissions should contain both statistical analyses of toxicology data presented in the submission and documentation of the analyses. The purpose of this section is to guide the submitter in documenting statistical aspects of toxicity studies contained in food ingredient submissions so that CFSAN reviewers can evaluate these studies efficiently. Additional advice in the form of Standard Operational Procedures (SOPs) prepared by the Division of Mathematics of CFSAN's Office of Toxicological Sciences is available upon request from the CSO assigned to the petition.
To ensure the validity of safety assessments of a food ingredient obtained from well-conducted toxicity studies, statistical expertise should be used routinely in the planning, design, execution, analysis, and interpretation of results. This guideline highlights factors that are of primary importance in assessing the validity of evidence from toxicity studies. These factors are 1) study protocol and design, 2) presentation of collected data (individual animal data), 3) presentation and interpretation of analytical results (including tables of summary data), and 4) other considerations.
FDA emphasizes that communication between statisticians and the scientists conducting a particular toxicity study can help ensure that the statistics used are relevant to the biology of the toxicity test. For example, statistical outliers are not always biological outliers, and a "significant" statistical test (p < 0.05) does not always indicate biological significance. FDA encourages petitioners to consult with Agency statisticians during the design and conduct of the study and the interpretation of data from the study, as appropriate.
The following recommendations offer general guidance to the petitioner in organizing and documenting the results of toxicity studies:(1)
- Data should be submitted in a form that will enable FDA reviewers to easily verify the results by duplicating the analysis or, if necessary, performing an alternative analysis. The best way to accomplish this is to submit the data in tabular form in the petition and, at the same time, in a machine-readable form (see Chapter II.B. for additional information about submission of machine-readable data).
- Summary tables of the data also should be submitted.
- The submission should be organized and documented so as to enable Agency reviewers to move easily between the data and the summary tables. (For example, if the report of a bioassay involving 50 rats in a dose group includes a summary table indicating that the incidence of a given tumor is 3/40, there should be auxiliary tables showing which three rats had the tumor, which 37 rats were examined but did not have the tumor, and which ten rats were not examined for the tumor.)
- When outliers are removed for statistical reasons, the statistical test upon which the decision to remove them was based should be specified.
- The description of a statistical inference should include a statement about the model used, summary data appropriate for the model, analysis of the data with estimates of treatment effects, and reasonable statistical checks on the adequacy of the model.
- In presenting tables of summary data that reference statistical tests of hypotheses, a statement should identify the null and alternative hypotheses, the statistical test, the sampling distribution of the test statistic under the null hypothesis, the value of the test statistic, the degrees of freedom of the test statistic (when appropriate), the p-value, and whether the test is one or two tailed.
- Statistical analyses should be directly linked to specific questions regarding the safety of the additive (e.g., comparing results for treated groups with results for a control group, or evaluating the effects of animal characteristics such as sex, species, and age on the results of an experiment).
- Results of the statistical analyses of all toxicity studies (e.g., p-values, confidence intervals) should be tabulated. Additionally, an effort should be made to explain how these results contribute to resolving questions about the safety of the food ingredient.
- The submission should cross-reference related information (e.g., data tabulations, statistical hypotheses tested, models used, etc.) that will facilitate FDA's statistical review of the study.
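The link between a summary table and the individual animal data, as in the 3/40 incidence example above, can be illustrated with a short sketch. This is only an illustration under stated assumptions: the record layout (`TumorRecord`), the animal identifiers, and the helper `incidence` are hypothetical and are not a format prescribed by FDA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TumorRecord:
    animal_id: str      # hypothetical identifier
    examined: bool      # tissue was examined for this tumor
    tumor_found: bool   # only meaningful when examined is True


def incidence(records):
    """Split a dose group into the three auxiliary lists behind a summary incidence."""
    with_tumor = [r.animal_id for r in records if r.examined and r.tumor_found]
    without_tumor = [r.animal_id for r in records if r.examined and not r.tumor_found]
    not_examined = [r.animal_id for r in records if not r.examined]
    return with_tumor, without_tumor, not_examined


# Hypothetical dose group of 50 rats: 3 with the tumor, 37 examined without it,
# and 10 not examined for the tumor.
records = (
    [TumorRecord(f"R{i:02d}", True, True) for i in range(1, 4)]
    + [TumorRecord(f"R{i:02d}", True, False) for i in range(4, 41)]
    + [TumorRecord(f"R{i:02d}", False, False) for i in range(41, 51)]
)

with_tumor, without_tumor, not_examined = incidence(records)
summary = f"{len(with_tumor)}/{len(with_tumor) + len(without_tumor)}"
print(summary, with_tumor, len(not_examined))
```

Deriving the summary figure directly from the individual records, rather than typing it separately, is one way to keep the summary and auxiliary tables consistent.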
The submitted petition should contain the original protocol and a complete account of protocol modifications made during the course of the study. The protocol is a critical document in the evaluation of a bioassay, shaping both the conduct of the study and the ultimate analyses. It sets forth the objectives of the study and relates these objectives to the statistical hypotheses that are tested. It describes critical features of the study's design and execution, such as the purpose of the study, experimental design (subchronic, short-term, multi-generation), selection of species, selection of parameters to be assessed, planned interim analyses of data, planned interim and final sacrifices, events that would trigger early termination of the study, roles and responsibilities of data monitoring boards or quality assurance boards, and proposed statistical methods. By designating in advance the treatment groups and the variables that will be considered to be primary endpoints for statistical analyses, the protocol appropriately defines and limits the hypotheses that the study is able to test.
A well-designed experimental protocol will normally contain, as a minimum, the following items:
- Statement of objectives: In addition to the primary objective(s), secondary objective(s) should be stated explicitly. The precise hypotheses that the study is attempting to prove or disprove also should be stated explicitly.
- Source of test animals: A clear statement about the species, strain, sex, and source of the test animals in the study and how animals are screened out of the study (e.g., will "runts" be eliminated, and if so, why?).
- Experimental design: This should include information about initial baseline periods (if any), the study configuration (short-term, lifetime, etc.), the treatment levels, the control group(s), the number of animals in each group (sample size), and the criteria for terminating the study.
- Randomization procedures: A description of the randomization procedure(s) used to assign animals to experimental groups. Generally, a computer-driven procedure using a random number generator is better than a table of random numbers.
- Administration: A statement about the route of administration and frequency of administration of the test compound.
- Diets: A complete description of any diets used in the study.
- Control of confounding factors: A statement about how the effects of factors that confound the response variables of interest (e.g., caging effects) were minimized. If this is not possible, that fact should be stated along with the reason for the inability to discount these effects and the possible impact on the study.
- Experimental parameters measured: A description of the parameters that will be measured and a statement about how frequently they will be measured.
- Power analysis: If the number of animals being used is within the guidelines given in this publication for the type of study being planned, a power analysis is not necessary. If fewer animals are to be used, then a power analysis or a statement about the differences in study parameters between compared groups that the study is able to detect should be submitted.
- Quality control: A description of the steps taken to ensure accurate, consistent, and reliable data (e.g., standard operating procedures, instruction manuals, data verification).
- Data analysis: A description of planned interim analyses of the data, including monitoring procedures, variables to be analyzed, statistical analyses to be used (including the choice of significance level for each interim analysis), and frequency of analysis.
- Statistical Methods: A description of the statistical methods to be applied to the data. Here, specific questions that the statistical analyses will address in support of the study objectives are identified. For example, a description of the methodology that would be used to detect outliers may be important. The major end-points for analysis should be identified. If multiple comparisons are to be made, they should be pre-planned.
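As an illustration of the computer-driven randomization procedure mentioned above, the following is a minimal sketch. It assumes equal group sizes and simple (unstratified) randomization. The function name, animal identifiers, group names, and seed value are hypothetical. A real protocol might instead stratify by body weight or litter, which is not shown here.

```python
import random


def randomize_to_groups(animal_ids, group_names, seed):
    """Assign animals to equal-sized groups using a seeded random number generator.

    Recording the seed lets a reviewer reproduce the allocation exactly.
    """
    if len(animal_ids) % len(group_names) != 0:
        raise ValueError("number of animals must divide evenly among groups")
    rng = random.Random(seed)          # private generator, independent of global state
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    return {
        name: sorted(shuffled[i * size:(i + 1) * size])
        for i, name in enumerate(group_names)
    }


# Hypothetical study: 80 animals, four dose groups of 20.
animals = [f"A{i:03d}" for i in range(1, 81)]
groups = ["control", "low", "mid", "high"]
allocation = randomize_to_groups(animals, groups, seed=20000101)

for name in groups:
    print(name, len(allocation[name]), allocation[name][:3])
```

Documenting the generator, the seed, and the resulting allocation table together is one way to make the randomization verifiable.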
Information on every animal in the study should be presented. Data should be organized so that the reviewer can easily find all information about any animal used in the study. For example, data should be organized so that the reviewer can view all study parameters for a single animal and a single parameter for all animals. Individual animal records can be presented or data can be tabulated, depending on the study and the type of data collected. The liberal use of data tables and submission of machine-readable data is strongly encouraged (see contact information for electronic submissions). Steps taken to assure the numerical accuracy of the collected data should be documented in detail sufficient to permit the reviewer to judge their accuracy.
As described previously, the identifying number, age upon entry into the study, dose level, sex, initial body weight, and cage identification should be presented for each animal in the study. There also should be a table showing how animals were randomized into their respective dose groups. Other information should include:
- For each animal, length of time in the study, date of death, type of death (e.g., scheduled sacrifice, moribund sacrifice, animal found dead, etc.), and reason for early withdrawal from the study, if this occurred (e.g., escaped from cage).
- Food, water, and test compound consumption at each interval specified in the protocol.
- All measured values for defined parameters and the times at which these measurements were taken. If deviations from standard operating procedures occurred in taking the measurements, the nature of the deviation, the reason for the deviation, and its impact on the study should be discussed.
- For all microscopic lesions: a) type of lesions (neoplastic or non-neoplastic) should be clear from the morphologic diagnosis; b) when appropriate, severity grades (e.g., mild, moderate, marked) for non-neoplastic lesions should be included; c) when appropriate, modifiers such as "metastatic", "invasive" or "systemic" should be used for neoplastic lesions; and d) information indicating when the lesion was first observed (in life or at necropsy) should be included in the individual animal data.
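A machine-readable individual animal record covering the identification and disposition items listed above might look like the following sketch. The field names and values are hypothetical, not a required FDA layout. Days on study are counted as elapsed days from the study start date, which is an assumption chosen to agree with the January 1, 1997 to January 1, 1999 (Day 730) example given in this chapter.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AnimalRecord:
    animal_id: str
    sex: str
    dose_group: str
    cage_id: str
    study_start: date
    date_of_death: Optional[date] = None
    type_of_death: Optional[str] = None      # e.g. "scheduled sacrifice", "found dead"
    withdrawal_reason: Optional[str] = None  # e.g. "escaped from cage"

    def days_on_study(self) -> Optional[int]:
        """Elapsed days from study start to death; None if the animal has no death date."""
        if self.date_of_death is None:
            return None
        return (self.date_of_death - self.study_start).days


# Hypothetical animal matching the dates used in the time-of-death example.
rec = AnimalRecord(
    animal_id="A001",
    sex="M",
    dose_group="high",
    cage_id="C12",
    study_start=date(1997, 1, 1),
    date_of_death=date(1999, 1, 1),
    type_of_death="scheduled sacrifice",
)
print(rec.animal_id, rec.days_on_study())
```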
Presentation of results of statistical analysis should include a description of, and rationale for, all statistical methods used. Unless the method is well-known (e.g., analysis of variance), references should be provided. A thorough discussion of the statistical analysis, including reasons for the use of a particular analysis, assumptions, conduct of the analysis, and validity of the conclusions, will guide FDA in deciding whether re-analysis of the data is needed. For each analysis of a relevant variable that is submitted, the following information should be provided:
- Specific variables and analysis of variance: A statement identifying the specific variable; if not obvious, a discussion of its relevance to the objectives of the study should be included.
- Statistical model: The statistical model underlying the analysis; references should be provided, if necessary.
- Hypothesis: A statement of the hypothesis being tested and of the alternative hypothesis.
- Power calculation: A power calculation for tests that failed to reject the null hypothesis, particularly to justify the adequacy of the sample size.
- Confidence intervals: The statistical methods used to estimate effects, construct confidence intervals, etc.; literature references should be supplied when appropriate.
- Outliers: The methods used to detect outlying data points (outliers) and the reasons why particular methods were selected. Identified outliers should be studied in an attempt to determine the reason for their deviation from other data in the set.
- Assumptions underlying the statistical methods: It should be shown that, insofar as is statistically reasonable, the data satisfy crucial assumptions, especially when such assumptions are necessary to confirm the validity of an inference. For example, in deciding whether to use parametric or non-parametric methods, tests for normality and for equality of variances should be conducted.
- Survival analyses: Such analyses will address the question of whether treated animals died earlier than control animals and will help determine if treated animals lived long enough to enable treatment-related tumors to be detected.
- Analysis of tumors: Analysis of tumors (benign and malignant) and other lesions for each group of test animals. Whether the tumor is an incidental finding upon death or a cause of death should dictate the method of analysis used. The major theoretical difference between these analyses is the manner in which the number of animals at risk in each time interval is defined. This needs to be taken into account in performing tests such as the standard Cox Life Table test.
- Trend test: A trend test, when appropriate. This includes not only a test for linearity, but a test for lack of fit as well.
- Plots or graphs of summary data: Care should be taken to generate plots that will convey the most information. For example, in studies with many animals in each dose group, it may be better to plot the mean with confidence limits, or the mean plus or minus one (±1) standard deviation, than to attempt to plot individual data.
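The power calculation item above can be illustrated with a rough sketch for comparing one treated group with a control group. It assumes a two-sided test, equal group sizes, a common known standard deviation, and a normal approximation. These are simplifying assumptions, and the function name and numerical inputs are hypothetical. For small groups a calculation based on the t distribution would give somewhat lower power.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    delta: true difference in means to be detected
    sigma: common standard deviation (assumed known)
    n_per_group: animals per group (equal group sizes assumed)
    """
    std_normal = NormalDist()
    se = sigma * sqrt(2.0 / n_per_group)            # standard error of the difference
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = abs(delta) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical inputs: detect a difference of one standard deviation.
for n in (10, 20, 50):
    print(n, round(approx_power_two_sample(delta=1.0, sigma=1.0, n_per_group=n), 3))
```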
The following points are also important in the presentation of collected data:
- Transformation of data: Unnecessary data transformations should be avoided. If data transformation has been performed, a rationale for the transformation and an interpretation of the estimates of treatment effects based on transformed data should be provided.
- Parametric and non-parametric analyses of data: Using parametric analysis for a parameter at some time periods and non-parametric analysis for the same parameter at others should be avoided. For example, if equality of variances in a parameter measured over time is tested, and some tests turn out significant and others do not, the statistician should arrive at a single overall conclusion (i.e., does the preponderance of evidence point to equality of variances or not). We recommend that this be done by converting the p-values obtained to standard normal deviates (z-scores) and obtaining the p-value for the average z-score multiplied by the square root of the number of p-values.
- Litter and caging effects: Litter and caging effects should be taken into account in determining the statistical model. If this is not possible, that fact should be stated along with the reason for the inability to account for these effects and its possible impact on the study.
- Repetitive measurements: For parameters that are measured across time, a repeated measures analysis should be considered.
- Dependent experimental parameters: If a given parameter depends biologically on another parameter (e.g., organ weight depends on body weight), then the dependent parameter should be adjusted, as in analysis of covariance.
- Time of death: Time of death should be reported as days from the start of the study. For example, if a study began on January 1, 1997 and the animal died on January 1, 1999, then the animal died on Day 730.
- Reproduction studies: In reproduction studies, if a dam continues in the study after all pups have died, the number of pups in her litter should be counted as 0.
- Statistical comparisons: When statistical comparisons of data were not pre-planned, a statement on how bias was avoided in choosing the particular analysis should be included.
- Statistic: The statistic, the sampling distribution of the test statistic under the null hypothesis, the value of the test statistic, the significance level (i.e., p-value), a statement of whether the test used was one or two tailed, and intermediate summary data should be presented in a format that will enable the reviewer to verify the results of the analysis quickly and easily. In most cases, a copy of the computer output will provide the necessary information. For example, documentation of a two-sample t-test should include the two sample sizes, the mean and variance for each of the samples, the pooled estimate of variance, the value of the t-statistic, the associated degrees of freedom, and the p-value.
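The z-score approach recommended above under "Parametric and non-parametric analyses of data" for reaching one conclusion from several p-values can be sketched as follows. This combination rule is commonly known as Stouffer's method. The sketch assumes one-sided p-values (a small p-value maps to a large positive z-score) and treats the tests as independent. Both are assumptions that the text does not specify, and tests repeated on the same animals over time may not be independent. The function name and the example p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def combine_p_values(p_values):
    """Combine p-values via z-scores: average z times sqrt(number of p-values)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    std_normal = NormalDist()
    # Assumed convention: one-sided p-values, so small p gives large positive z.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    k = len(z_scores)
    combined_z = (sum(z_scores) / k) * sqrt(k)
    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p


# Hypothetical p-values from variance-equality tests at four time periods.
z, p = combine_p_values([0.04, 0.20, 0.08, 0.35])
print(round(z, 3), round(p, 4))
```

Under the null hypothesis and the independence assumption, the combined score is itself a standard normal deviate, which is why its p-value can be read from the normal distribution.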
Computer programs: When possible, commonly available computer programs should be used; please consult with FDA statisticians about appropriate programs. If it is necessary to use a program written by the petitioner itself, the program should be fully documented, including:
- the source code;
- test runs against "known" results; that is, textbook examples, examples worked by hand, or examples run with packaged programs. These test runs should cover every case that could arise in connection with the data in the petition. Test cases should be run both before and after the program is used for the submitted data.
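A test run of a petitioner-written program against an example worked by hand might look like the following sketch. The function `pooled_t_summary` stands in for a hypothetical petitioner-written routine and the data are invented for illustration. It reports the intermediate quantities listed above for a two-sample t-test, except the p-value, which would be obtained from the t distribution using a packaged program (not shown).

```python
from math import isclose, sqrt
from statistics import mean, variance


def pooled_t_summary(sample_a, sample_b):
    """Return the intermediate summary data for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = mean(sample_a), mean(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)   # sample variances (n - 1)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t_stat = (mean_a - mean_b) / sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return {
        "n": (n_a, n_b),
        "mean": (mean_a, mean_b),
        "variance": (var_a, var_b),
        "pooled_variance": pooled_var,
        "t": t_stat,
        "df": df,
    }


# Hand-worked example: means 3 and 5, both variances 2.5, pooled variance 2.5,
# standard error sqrt(2.5 * (1/5 + 1/5)) = 1, so t = (3 - 5) / 1 = -2 with 8 df.
result = pooled_t_summary([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
assert isclose(result["pooled_variance"], 2.5)
assert isclose(result["t"], -2.0)
assert result["df"] == 8
print(result)
```

Keeping such checks alongside the source code makes it straightforward to rerun them both before and after the program is applied to the submitted data.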
In the case of a complex toxicity test or carcinogenicity bioassay, the petitioner is encouraged to consult with CFSAN before conducting the study or submitting the petition to discuss relevant statistical considerations. Requests for comments by statistical reviewers on protocols for proposed toxicity studies can be sent to the CSO assigned to the petition (see Chapter II.A.).
If unusual concerns arise during the conduct of a study, the petitioner may submit preliminary tabulations of the data and materials pertaining to the statistical analysis to CFSAN for advice and guidance.
- Dubey, Satya (June, 1985) Draft Guidelines for the Format and Content of the Statistical Sections of an Application.