Food and Drug Administration
VERSION NO.: 1.4
- 4.2.1 Accuracy, Precision, and Uncertainty
- 4.2.2 Error and Deviation; Mean and Standard Deviation
- 4.2.3 Random and Determinate Error
- 4.2.4 The Normal Distribution
- 4.2.5 Confidence Intervals
- 4.2.6 Populations and Samples: Student's t Distribution
- 4.2.7 References
Statistical procedures used to describe measurements of samples in the ORA laboratory allow regulatory decisions to be made in as unbiased a manner as possible. The following are numerically descriptive measures commonly used in ORA laboratories.
The accuracy of a measurement describes the difference between the measured value and the true value. Accuracy is said to be high or low depending on whether the measured value is near to, or distant from, the true value. Precision is concerned with the differences in results for a set of measurements, regardless of the accuracy. Applied to an analytical method as used in an ORA laboratory, a highly precise method is one in which repeated application of the method on a sample will give results which agree closely with one another. Precision is related to uncertainty: a series of measurements with high precision will have low uncertainty and vice versa. Terms such as accuracy, precision, and uncertainty are not mathematically defined quantities but are useful concepts in understanding the statistical treatment of data. Exact mathematical expressions of accuracy and precision (error and deviation) will be defined in the next section.
As an example of these terms, consider shooting arrows at a target, where the "bull's eye" is considered the true value. An archer with high precision (low uncertainty) but low accuracy will produce a tightly clustered pattern outside the bull's eye; with low precision (high uncertainty) and low accuracy, the pattern will be scattered rather than clustered, with the bull's eye being hit only by chance. The best situation is high accuracy and high precision: in this case a tight cluster is found in the bull's eye area. This example illustrates another important concept: accuracy and precision depend on both the equipment (the bow and arrow) and the archer. Applied to a laboratory procedure, this means that the reliability of results depends on both the apparatus/instruments used and the analyst. It is extremely important to have a well-trained analyst who understands the method, applies it with care (for example by careful weighing and dilution), and uses a calibrated instrument (demonstrated to be operating reliably). Without all of these components in place, it is difficult to obtain the reliable results needed for regulatory analysis.
The concepts of accuracy and precision can be put on a mathematical basis by defining equivalent terms: error and deviation. This will allow the understanding of somewhat more complicated statistical formulations used commonly in the ORA laboratory.
If a set of N replicate measurements x1, x2, x3, …, xN were made (examples: weighing a vial N times, determining HPLC peak area of N injections from a single solution, measuring the height of a can N times, …), then the error, Ei, of each measurement xi is defined by:
Ei = xi - μ
where μ is the true value of the quantity measured.
The definition of error often has little immediate practical application, since in many cases μ, the true value, may not be known. However, the process of calibration against a known value (such as a chemical or physical standard) will help to minimize error by giving us a known value with which to compare an unknown.
The deviation, a measure of precision, is calculated without reference to the true value, but instead is related to the mean of a set of measurements. The mean, X, is defined by:
X = (x1 + x2 + x3 + … + xN) / N
Note: this is the arithmetic mean of a set of observations. There are other types of mean which can be calculated, such as the geometric mean (see the section on "Application of Statistics to Microbiology" below), which may be more accurate in special situations.
Then, the deviation, di, for each measurement is defined by:
di = xi - X
Using the example of the archer shooting arrows at a target, the deviation for each arrow's position is the distance from that arrow's position to the calculated mean of all of the arrows' positions.
Finally, the expression of deviation most useful in many ORA laboratory applications is s, the standard deviation:
s = √[ Σ (xi - X)² / (N - 1) ]
where s = standard deviation, the sum Σ is taken over all N measurements (i = 1 to N), and other terms are as previously defined.
The standard deviation is then a measure of precision of a set of measurements, but has no relationship to the accuracy. The standard deviation may also be expressed in relative terms, as the relative standard deviation, or RSD:
RSD = (s / X) × 100%
Whereas the standard deviation has the same units as the measurement, the RSD is dimensionless, and expressed as a percentage of the mean.
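As an illustration, the mean, the individual deviations, the standard deviation (with N - 1 in the denominator), and the RSD can be calculated step by step. The following is a minimal sketch only: the function name `summarize` and the peak-area values are hypothetical and are not taken from any ORA procedure.

```python
import math

def summarize(values):
    """Return (mean, deviations, s, rsd_percent) for a set of replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicate measurements are needed")
    mean = sum(values) / n                       # arithmetic mean, X
    deviations = [x - mean for x in values]      # di = xi - X
    # Sample standard deviation: sum of squared deviations divided by (N - 1)
    s = math.sqrt(sum(d * d for d in deviations) / (n - 1))
    rsd_percent = 100.0 * s / mean               # RSD as a percentage of the mean
    return mean, deviations, s, rsd_percent

# Hypothetical HPLC peak areas from five injections of a single solution
peak_areas = [100.2, 99.8, 100.5, 99.9, 100.1]
mean, deviations, s, rsd = summarize(peak_areas)

print(f"mean = {mean:.2f}")
print(f"s    = {s:.3f}")
print(f"RSD  = {rsd:.2f}%")
```

Note that the deviations always sum to zero (within rounding), which is why they are squared before being combined into s.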
Standard deviation as defined above is the correct choice when we have a sample drawn from a larger population. This is almost always the case in the ORA laboratory: the sample which has been collected is assumed to be "representative" of the larger population (for example, a batch of tablets, lot of canned goods, field of wheat) from which it has been taken. As it is taken through analytical steps in the laboratory (by subsampling, compositing, diluting, etc.) the representative characteristic of the sample is maintained.
If the entire population is known for measurement, the standard deviation s is redefined as σ, the population standard deviation. The formula for σ differs from that of s in that (N-1) in the denominator is replaced by N. The testing of an entire population would be a rare circumstance in the ORA laboratory, but may be useful in a research project.
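The difference between the two denominators can be seen numerically. This sketch uses the Python standard library, where `statistics.stdev` divides by (N - 1) and `statistics.pstdev` divides by N; the can-height values are hypothetical.

```python
import math
import statistics

# Hypothetical heights (mm) of five cans taken from a lot
can_heights_mm = [101.2, 101.5, 101.1, 101.4, 101.3]
n = len(can_heights_mm)

s = statistics.stdev(can_heights_mm)       # sample standard deviation, denominator N - 1
sigma = statistics.pstdev(can_heights_mm)  # population standard deviation, denominator N

# For the same data, s is larger than sigma by the factor sqrt(N / (N - 1))
ratio = s / sigma
print(f"s = {s:.4f}, sigma = {sigma:.4f}, ratio = {ratio:.4f}")
print(f"sqrt(N / (N - 1)) = {math.sqrt(n / (n - 1)):.4f}")
```

The ratio approaches 1 as N grows, so the choice of denominator matters most for the small numbers of replicates typical of laboratory work.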
Statistical parameters such as mean and standard deviation are easily calculated today using calculators and spreadsheet formulas. Although this is convenient, the analyst should not forget how these parameters are derived.
Recall the definition of error in section 4.2.2 above. Errors in measurement are often divided into two classes: determinate error and non-determinate error. The latter is also termed random error. Both types of error can arise from either the analyst or the instruments and apparatus used, and both need to be minimized to obtain the best measurement, that with the smallest error.
Determinate error is error that remains fixed throughout a set of replicate measurements. Determinate error can often be corrected if it is recognized. Examples include correcting titration results against a blank, improving a chromatographic procedure so that a co-eluting peak is separated from the peak of interest, or calibrating a balance against a NIST-traceable standard. In fact, the purpose of most instrument calibrations is to reduce or eliminate determinate error. Using the example of the archer shooting arrows at a target, calibration of the sights of the bow would decrease the error, leading to hitting the bull's eye.
Random error is error that varies from one measurement to another in an unpredictable way in a set of measurements. Examples might include variations in diluting to the mark during volumetric procedures, fluctuations in an LC detector baseline over time, or placing an object to be weighed at different positions on the balance pan. Random errors are often a matter of analytical technique, and the experienced analyst, who takes care in critical analytical operations, will usually obtain more accurate results.
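The different effects of the two classes of error can be illustrated with a small simulation. This is a sketch under stated assumptions: random error is modeled as normally distributed noise, determinate error as a constant offset, and the function `simulate`, the masses, and the error sizes are all hypothetical.

```python
import random
import statistics

def simulate(true_value, bias, noise_sd, n, seed):
    """Simulate n replicate measurements with a fixed offset (determinate error)
    and normally distributed noise (random error)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

TRUE_MASS_G = 10.0000  # hypothetical true mass of a check weight
OFFSET_G = 0.0050      # hypothetical determinate error of an uncalibrated balance

# Case 1: large determinate error, small random error
biased = simulate(TRUE_MASS_G, bias=OFFSET_G, noise_sd=0.0002, n=200, seed=1)
# Case 2: no determinate error, large random error
noisy = simulate(TRUE_MASS_G, bias=0.0, noise_sd=0.0050, n=200, seed=1)
# Correcting case 1 for the known offset (analogous to calibration or a blank correction)
corrected = [x - OFFSET_G for x in biased]

for label, data in (("biased", biased), ("noisy", noisy), ("corrected", corrected)):
    mean_error = statistics.mean(data) - TRUE_MASS_G
    print(f"{label:9s} mean error = {mean_error:+.5f} g, s = {statistics.stdev(data):.5f} g")
```

In this sketch the determinate error shifts the mean without changing s, and is removed by the correction; the random error leaves the mean near the true value but increases s.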
In the introduction to this chapter, it was briefly mentioned that statistics is derived from the mathematical theory of probability. This relationship can be seen when we consider probability distribution functions, of which the normal distribution function is an important example. The normal distribution curve (or function) is of great value in aiding understanding of measurement statistics, and in interpreting the results of measurements. Although a detailed explanation is outside the scope of this chapter, a brief explanation will be beneficial. The normal distribution curve describes how the results of a set of measurements are distributed as to frequency, assuming only random errors are made. It describes the probability of obtaining a measurement within a specified range of values. It is assumed here that the values measured (i.e. variables) may vary continuously rather than take on discrete values (the Poisson distribution, applicable to radioactive decay, is an example of a discrete probability distribution function; see discussion under "Statistics Applied to Radioactivity"). The normal distribution should be at least somewhat familiar to most analysts as the "bell curve" or Gaussian curve. The curve can be defined with just two statistical parameters that have been discussed: the true value of the measured quantity, μ, and the true standard deviation, σ. It is of the form:
y = [1 / (σ √(2π))] exp[ -(x - μ)² / (2σ²) ]
An example of two normal curves with the same true value, μ, but two different values of σ is shown below (this was calculated using an Excel ® spreadsheet, using the formula above and an array of x values):
Some properties of the normal distribution curve that are evident by inspection of the graph and mathematical function above go far in explaining the properties of measurements in the laboratory:
- In the absence of determinate errors, the measurement with the most probable value will be the true value, μ.
- Errors (i.e. x-μ), as defined previously, are distributed symmetrically on either side of the true value, μ; a positive error of a given size is as likely as a negative error of the same size.
- Large errors are less likely to occur than small errors.
- The curve never reaches the x-axis but approaches it asymptotically: there is a small but nonzero probability of obtaining a measurement at any distance from μ.
- The probability of a measurement falling close to the true value increases as the standard deviation decreases.
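The properties listed above can be checked numerically. The sketch below evaluates the standard Gaussian density, written in terms of μ and σ, for two values of σ; the function name `normal_pdf` and the values μ = 100 and σ = 1 or 2 are hypothetical choices for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal distribution curve at x for true value mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

MU = 100.0  # hypothetical true value

for sigma in (1.0, 2.0):
    print(f"sigma = {sigma}")
    for x in (MU - 3.0, MU - 1.5, MU, MU + 1.5, MU + 3.0):
        print(f"  x = {x:6.1f}  y = {normal_pdf(x, MU, sigma):.5f}")

# The curve is highest at mu, symmetric about mu, and falls off for large errors;
# a smaller sigma gives a taller, narrower curve.
peak_narrow = normal_pdf(MU, MU, 1.0)
peak_wide = normal_pdf(MU, MU, 2.0)
print(f"peak height: sigma=1 -> {peak_narrow:.5f}, sigma=2 -> {peak_wide:.5f}")
```

Evaluating the function over an array of x values, as in this loop, is the same approach as the spreadsheet calculation described above.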
The confidence interval of a measurement or set of measurements is the range of values within which the measurement is expected to fall at a stated level of confidence. Although confidence intervals may be defined for any probability distribution function, the normal distribution function illustrates the concept well.
Approximately 68% of the area under the normal distribution curve is included within ±1 standard deviation of the mean. This implies that, for a series of replicate measurements, approximately 68% will fall within ±1 standard deviation of the true mean. Likewise, 95% of the area under the normal distribution curve is found within about ±2σ (to be precise, ±1.96σ), and approximately 99.7% of the area of the curve is included within a range of the mean ±3σ. A 95% confidence interval for a series of measurements, therefore, is that which includes the mean ±2σ. An example of the application of confidence limits is in the preparation of control charts, discussed in Section 7.6 below.
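These percentages can be reproduced from the normal distribution itself. The following sketch uses the error function from the Python standard library; the function name `coverage` is hypothetical.

```python
import math

def coverage(k):
    """Fraction of the area under a normal curve lying within ±k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1.0, 1.96, 2.0, 3.0):
    print(f"within ±{k}σ: {100 * coverage(k):.2f}%")
```

The result for ±2σ is slightly more than 95%, which is why ±1.96σ is quoted as the exact multiplier for a 95% interval.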
In the above discussion, we are using the true standard deviation, σ (i.e. the population standard deviation). In most real-life situations, we do not know the true value of σ. In the ORA laboratory, we are generally working with a small sample which is assumed to be representative of the population of interest (for example, a batch of tablets, a tanker of milk), and we can only calculate the sample standard deviation, s, from a series of measurements. Because s is only an estimate of σ, confidence limits need to be expanded by a factor, t, to account for this additional uncertainty. The distribution of t is called the Student's t distribution. Further discussion is beyond the scope of this chapter, but tables of t values, which depend on both the confidence level desired and the number of measurements made, are widely published.
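As a sketch of how a t value is applied, the example below computes a 95% confidence interval for the mean of a small set of replicates using the common textbook form X ± t·s/√N. That formula is an assumption here, since this chapter does not state it. The t values are the two-sided 95% entries from standard published tables, and the function name `confidence_interval_95` and the assay results are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student's t values, keyed by degrees of freedom (N - 1),
# as found in standard published tables
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def confidence_interval_95(values):
    """Return (low, high) 95% confidence limits for the mean, assuming X ± t·s/√N."""
    n = len(values)
    df = n - 1
    if df not in T_95:
        raise ValueError("this sketch only tabulates t for 2 to 11 measurements")
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # sample standard deviation (N - 1)
    half_width = T_95[df] * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical assay results (% of label claim) from five replicate preparations
assay_percent = [99.1, 100.4, 99.7, 100.2, 99.6]
low, high = confidence_interval_95(assay_percent)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")

# For comparison: the narrower interval that would result from using 1.96 in place of t
z_half_width = 1.96 * statistics.stdev(assay_percent) / math.sqrt(len(assay_percent))
print(f"half-width with t: {(high - low) / 2:.3f}, with 1.96: {z_half_width:.3f}")
```

With only five measurements, t (2.776) is considerably larger than 1.96, so the interval is wider; as the number of measurements increases, t approaches 1.96.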
The following are general references on statistics and treatment of data that may be useful for the ORA Laboratory:
- Dowdy, S., Wearden, S. (1991). Statistics for research (2nd ed.). New York: John Wiley & Sons.
- Garfield, F.M. (1991). Quality assurance principles for analytical laboratories. Gaithersburg, MD: Association of Official Analytical Chemists.
- Taylor, J. K. (1985). Handbook for SRM users (NBS Special Publication 260-100). Gaithersburg, MD: National Institute of Standards and Technology.