VOLUNTARY NATIONAL RETAIL FOOD REGULATORY PROGRAM STANDARDS

CLEARINGHOUSE WORK GROUP
Questions and Answers May 2002 - July 2004

Standard #5

  1. Final Outbreak Reports

    Question/Problem
    The second paragraph under “Description of Requirement” in Standard 5 discusses follow-up on complaints of alleged food-related illness or injury. At the end of that paragraph, it says “the final report of the investigation is shared with the state epidemiologist and the Centers for Disease Control and Prevention.” Are all complaint follow-ups to be reported to the epidemiologist and CDC or only those that meet the definition of a foodborne illness?

    Clearinghouse Work Group
    The final reports of investigations that meet the definition of a foodborne illness should be shared with the state epidemiologist and CDC.  The paragraph does seem to mix “apples and oranges.”  The intention is for you to record all complaints of alleged illness or injury, to perform an assessment of the complaint to determine appropriate follow-up, and to link that information to the establishment record for retrieval purposes in order to identify patterns and trends.  In some instances an investigation may be performed and a short report written on complaints that do not meet the official definition of a foodborne illness.  You are not required to share work reports of that nature with the state epidemiologist and CDC.  For all investigations that meet the definition of a foodborne illness, a final report is to be written and shared with the state epidemiologist and CDC.

Standard #6

  1. Determining Conformance with a Standard

    Many of the criteria in Standard #6 are predicated on the inspection program containing the risk-based approaches outlined in earlier Standards, in particular Standard #3.  Our initial review of Standard #6 indicates that many of the criteria rely on forms that record and quantify the status of risk factors/interventions and other serious violations.

    Since our current program lacks a definitive step-by-step compliance and enforcement process based on the occurrence and correction of risk factors and interventions, the assessment protocol outlined in Standard #6 cannot be completed as designed.  Assessing staff’s consistency in these areas through file reviews seems premature until all the components are in place.

    Question/Problem
    Our internal self-assessment process has revealed significant gaps in our compliance and enforcement program.  Several risk-based components contained in Standard #6’s criteria must be developed and integrated into our program before a meaningful file review can be performed against all the criteria contained in the Compliance and Enforcement Standard.

    For this initial self-assessment, is it sufficient to note the gaps within our current compliance and enforcement program as the rationale for why we do not meet Standard #6, or must we also complete a file review of randomly selected establishments?  If we are to continue with a file review, against what criteria do we assess compliance with the Compliance and Enforcement program components we already have in place?

    Clearinghouse Work Group
    This question is similar to a prior one concerning Standard 4, and the response applies regardless of the Standard in question.  If a cursory look at a Standard compared to your program is sufficient to reveal gaps that prevent you from meeting the Standard and provides a rationale for your conclusion, then it is not necessary for you to proceed further.  No one wants you to spend time that is not productive.  You need go only as far as necessary to identify the gaps you would need to fill, and then establish a strategic plan for ultimately meeting the Standard.

  2. Risk Categories for Establishments

    Question/Problem:
    Risk categories for prioritizing establishment inspections need to be established and put into the Program Standards Guidelines.

    Clearinghouse Workgroup Response
    The Workgroup agrees. See response to Standard #3, Problem 4.

  3. Spreadsheet with Embedded Math Formulas

    Question/Problem
    There is a need for a standardized reporting and collecting format for Standard #6.

    Clearinghouse Workgroup Response
    Since a ‘format’ is included in the Appendix to the Standard, it appears that the request is for a mathematical spreadsheet that calculates the columns in the worksheet in the Appendix for Standard #6. The Workgroup will recommend that FDA collect existing spreadsheets, if they exist, from the participating jurisdictions.

  4. Files with No Risk Factor/Intervention Violation on the Start-Point Inspection

    Question/Problem
    The process for self-assessment against Standard 6 is unclear. It is not clear from the Appendix F worksheet instructions how to mark files drawn for review that do not have a risk factor or Food Code intervention violation on the ‘start-point inspection.’ Should the self-assessor keep drawing files until he/she finds the requisite number of files with violations on the start-point inspection or are files without a violation considered as ‘passing’ files?

    Clearinghouse Work Group
    While Standard 6 implies the answer, the Appendix F instructions create some confusion. Standard 6 says in item 3 under the essential program elements required to meet the Standard that there must be “documentation on the establishment inspection report form or in the establishment file that compliance and/or enforcement action was taken to achieve compliance at least 80 percent of the time when out-of-control risk factors or interventions are recorded on a routine inspection measured using the procedures in Appendix F.” The ‘when’ in the sentence indicates that the files to be counted are ones where a risk factor/intervention violation exists.

    Appendix F says that in order for an establishment file to ‘pass,’ each column marked with a violation at the start-point inspection must have a subsequent “yes” answer to indicate that at least one type of follow-up action was taken. This again clearly defines what constitutes a passing file. The Appendix also goes on to define a ‘failing’ file as a start-point violation without a final resolution.

    The confusion occurs when the instructions say to divide the number of files that passed by the number of files reviewed to determine the percentage. This assumes that every file reviewed would have a violation on the start-point inspection and would, therefore, be either a ‘pass’ or a ‘fail’ file. This is not the case when the start-point inspection does not reflect a risk factor or intervention violation. Files without a violation at the start-point inspection do not meet either the pass or fail criteria, and using the math formula as described would not demonstrate an 80 percent resolution when violations are identified. The instruction should say to divide the number of files that pass by the number of files showing a risk factor or Food Code intervention violation on the start-point inspection.

    Advice from the FDA Division of Mathematics says that the random draw of files must continue until the assessor finds the requisite number of files that have a start-point violation. Files that do not have a violation at the start-point inspection do not meet the criteria established for the sample. The minimum random draw of files with a start-point violation will be 20 files for jurisdictions with fewer than 400 establishments and five percent (5%) or 70 files, whichever is less, for jurisdictions with 400 or more establishments. The number of alternate files suggested under the Supplemental Sampling heading in Appendix F should also probably be increased to 50 percent of the original sample size to accommodate the number of files that may be excluded from the draw because of no start-point inspection violation.
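
    The file-draw rules and the corrected percentage described above can be sketched in a few lines of Python. This is only an illustration, not an official FDA worksheet: the function names are hypothetical, and rounding fractional file counts upward is an assumption that the response does not address.

```python
import math


def minimum_sample_size(establishments: int) -> int:
    """Minimum number of files WITH a start-point violation to review."""
    if establishments < 400:
        return 20
    # 5 percent or 70 files, whichever is less.
    # Rounding up to a whole file is an assumption, not stated in the response.
    return min(math.ceil(establishments * 5 / 100), 70)


def alternate_files(sample_size: int) -> int:
    """Alternates at the suggested 50 percent of the original sample size
    (rounded up, which is an assumption)."""
    return math.ceil(sample_size / 2)


def meets_80_percent(passing_files: int, files_with_violation: int) -> bool:
    """Divide passing files by files that had a start-point violation,
    not by all files drawn, and compare against 80 percent."""
    if files_with_violation <= 0:
        raise ValueError("no files with a start-point violation were reviewed")
    # Integer comparison avoids floating-point surprises at exactly 80 percent.
    return passing_files * 100 >= 80 * files_with_violation


if __name__ == "__main__":
    n = minimum_sample_size(1000)
    print(n, alternate_files(n), meets_80_percent(41, n))
```

    Files drawn without a start-point violation never enter either argument of `meets_80_percent`; they are simply set aside and replaced from the alternates.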

Standard #7

  1. Strengthen Standard #7

    Question/Problem
    There needs to be more emphasis on interacting with industry and community. How many are feasible? We recommend the Standard be altered to recommend a total of four activities annually (two in each category).

    Clearinghouse Workgroup Response
    The Clearinghouse believes that it is too early to determine whether a Standard should be made more stringent or lenient. Information should be gathered from the current participants, and then recommendations for change can be based on experiential data gathered from across the country from both large and small jurisdictions. The Clearinghouse does not recommend action on this item at the present time.

Standard #8

  1. Full-time Equivalent to inspection Ratio

    Question/Problem
    The FTE-to-inspection ratio is not attainable. Develop more realistic criteria. We recommend that FDA approve the innovative grant for the time study to develop realistic data on this issue. We also recommend that the CFP Program Standards Committee use the information developed in the time study (still to be approved) to re-evaluate the FTE ratio with respect to the number of inspections and/or allowing local staffing formulas.

    Clearinghouse Workgroup Response
    About the grant process, a panel of ten or so judges, including experts from outside the FDA organization, rate the innovative grant proposals independently based on the published rating criteria for the grants. Although the panel meets to present and discuss the merits of each proposal, each judge separately scores each proposal on each of the several rating criteria. After each of the judges rates the proposals, the separate ratings from each of the judges are averaged. A separate grant board (FDA uses the National Institutes of Health Grant Board) then ranks the proposals according to their averaged scores, with the lowest score being the best. Grants are given starting with the proposals at the top of the list and continuing until the pool of grant money available is exhausted.

    We agree that a time study of inspections conducted in accordance with the Standards’ inspectional requirements, and including all the activities in Standard #8’s definition of an inspection, would be most useful. The time study mentioned in the issue was funded by a grant from FDA. As in all studies, its usefulness will be determined by the quality of the study design. The report of the study outcome will be reviewed by FDA with great interest. The Clearinghouse expects that important data will be gathered as well from the jurisdictions currently participating in the self-assessment process. As stated in other responses, the Clearinghouse believes that recommendations for changes will need to be based on solid information and should reflect the best practices in the food safety community. The Clearinghouse Workgroup recommends no action at this time.

Standard #9

  1. Standard Self-Assessment Process for Decentralized Jurisdictions

    I work for a State regulatory food program.  The responsibility for regulatory oversight of retail food and foodservice operations has been passed on to local county agencies through a formal delegation agreement.  The State agency, itself, has very little direct regulatory inspection responsibilities within the retail food sector.

    My agency has enrolled in the Program Standards.  Our staff is able to initiate a self-assessment of our State Program against some of the Standards, including:

    • Standard #1 – Regulatory Foundation
    • Standard #2 – Trained Regulatory Staff
    • Standard #5 – Foodborne Illness Surveillance
    • Standard #7 – Industry and Community Relations

    Some of the other Standards, however, present significant challenges to our successful completion of a self-assessment because they rely heavily on the structure and process related to direct inspection work.  These Standards include:

    • Standard #3 – Incorporating the Principles of HACCP into Regulatory Inspections
    • Standard #4 – Inspection Uniformity
    • Standard #6 – Compliance and Enforcement
    • Standard #8 – Program Resources
    • Standard #9 – Program Assessment

    Question/Problem
    How should State Programs that have delegation agreements with local agencies for direct regulatory inspections of the retail segment of the industry conduct a self-assessment of their own program?  What parameters should be used to assess compliance with those Standards listed above that rely substantially on an assessment of the structure and/or process pertaining to on-site inspection work and related files?

    Clearinghouse Work Group Response
    In the circumstance that you describe, the application/implementation of the Standards falls into two areas, those areas that are a direct part of your program and those areas that you manage.  Legal delegations can be accomplished using several different written instruments such as delegation agreements, contracts for service, or memoranda of understanding.  For those pieces of the program that you manage through delegation, you should establish written criteria to be followed by the delegatee in the performance of those delegated duties.  You can meet the Standards in those areas that you have delegated by demonstrating the following elements:

    1. That criteria exist in your formal delegation document that meet the Standards criteria for those areas,
    2. That you regularly perform a monitoring, oversight, or audit function of retail food programs that have entered into a delegation agreement or contract with your agency to ensure that the criteria are being met (We suggest that you require by delegation document that the delegatee perform self-assessment and develop plans to bridge gaps in order to make oversight/auditing less resource intensive), and
    3. That you require the delegatee to develop and implement action plans for correction if it does not meet the criteria in the delegation agreement or contract.

    For your self-assessment of delegated program areas, you will determine the presence or absence of these three elements (1 through 3 above) for delegated functions.  Of course, you also will perform a self-assessment against the other Standards’ requirements for pieces of the program that you perform directly.

  2. Facility Types to be Included in Baseline Surveys

    Our jurisdiction is considering limiting our baseline survey data collection to only one of the facility types that we regulate. We are thinking of surveying only the full service restaurants since it is the more complex segment of the industry and includes the majority of our permitted establishments. There are two reasons for limiting the scope of our survey. The first reason is to conserve the expenditure of resources during these tight budget times. The second reason is that we would like to gain some experience in the methodology and surveying techniques before we put too many resources into the process only to discover that changes need to be made in the process.

    Question/Problem
    If we limit the scope of our data collection survey to only one of the facility types that we regulate, will we still meet the intent of Standard 9?

    Rationale
    We believe that we will meet the intent of Standard 9 by surveying only one facility type. The Standard does not spell out which facility types must be surveyed. It simply requires that baseline data be collected and that additional data be collected on subsequent three-year cycles. Further, FDA did not collect information on all of the potential facility types in existence or that might be regulated by a jurisdiction. Therefore, we should be free to select the scope of the survey that meets our needs.

    Clearinghouse Work Group Response

    You are correct in stating that Standard 9 does not spell out which facility types must be surveyed. There was recognition by the drafters of the Standard that jurisdictions vary in the number and types of facilities that they regulate. The Standard’s description of the requirement, however, does specify that the intent of the Standard is to measure trends and to determine whether there has been a net change over time in the occurrence of risk factors and the use of Food Code interventions. The stated outcome further clarifies that this Standard is intended to enable program managers to measure their program against national criteria, and to identify program elements that may require improvement or be deserving of recognition.

    The Clearinghouse Workgroup believes that a data collection survey must include, as a minimum, all of the facility types identified in the FDA National Baseline that are regulated by a jurisdiction. As demonstrated in the “Report of the FDA Retail Food Program Database of the Foodborne Illness Risk Factors,” different facility types are likely to have different risk factors in need of priority attention. Surveying only one facility type presents an incomplete picture and will not give a complete measure of trends over time.

    In FDA’s national survey, it chose to survey the major facility types for the three industry segments. A direct focus on these industry segments provided a breadth of coverage of general and highly susceptible populations while also covering the vast majority of establishment types. The nine identified facility types are:

    Institutions

    1. Hospitals
    2. Nursing Homes
    3. Elementary Schools (K-5)

    Restaurants

    1. Fast Food Restaurants
    2. Full Service Restaurants

    Retail Food Stores

    1. Deli Departments
    2. Meat Departments
    3. Seafood Departments
    4. Produce Departments

    These are the nine facility types for which there is national data; and if you regulate any of these nine facility types, they should be included in a data collection study to meet the intent of Standard 9. You may, if you wish, survey facility types in addition to the nine identified types, but you are not required to go further in your data collection efforts.

    Request to Reconsider Previous Interpretation regarding Baseline Surveys. The question of whether all facility types under a jurisdiction’s authority must be included in the baseline survey in order to meet the Standards was answered previously in the affirmative by the Clearinghouse. The Hawaii District Health Office, Placer County Environmental Health, Santa Clara Department of Health, Sonoma County Environmental Health, and the San Diego County Environmental Health asked the Clearinghouse to reconsider the issue. The group presented in writing a number of reasoned arguments in favor of recognizing the accomplishment of surveys of the largest pool of establishments. They reasoned that intervention strategies developed for a program in response to surveys of full-service establishments (or other largest pool of a jurisdiction’s establishments) would have an impact on the other facility types as well, and that future surveys could include other facility types.

    ADDITIONAL INFORMATION AND RESPONSE: While the Clearinghouse stands behind its previous interpretation, here are some additional thoughts. Baseline surveys are now considered a part of Standard 9 for purposes of Standards accomplishments. Failure to meet Standard 9 will not have additional consequences that influence participation or enrollment in the Standards as a whole.

    There is some sympathy for the point of view that staggered facility-type baselines may have utility as far as conservation of resources; however, there is overall support for requiring a definite point in time where all facility types under the jurisdiction’s authority have been included in a survey. Since risk factor surveys need only be completed once every three years (every 5 years as of the 2004 CFP recommendation), there is no reason why the surveys of the various facility types cannot be conducted independently over the 3- or 5-year evaluation period as long as all the facility types under the jurisdiction’s authority are surveyed within the recurring survey cycle. This procedure would meet the intent of Standard 9.

  3. Baseline Survey Sample Size

    At the Program Standards workshop, information was presented related to determining a jurisdiction’s sample size to ensure a valid Baseline measurement of CDC-identified foodborne illness risk factors.  In order to ensure a comparable baseline with FDA, a jurisdiction that has 100 or more establishments in any of the 9 categories was instructed to sample at least 100 of those establishments in each category for a valid sample size.  If a category had fewer than 100, the jurisdiction was expected to sample all the facilities within that category.

    Question/Problem
    Aren’t the sample size parameters presented above unnecessarily high given the fact that FDA’s sample size for any of the nine categories did not exceed 100 and theirs is a national study comprising about one million establishments?  Is there an alternative to this suggested model that would provide a statistically valid confidence level given the much smaller total number of establishments within any given jurisdiction?

    Rationale:  While we are awaiting feedback from the work group, we strongly believe that a statistically valid baseline is achievable from a sample size that is significantly less than what the FDA has presented as a model.

    Clearinghouse Work Group Response
    Statisticians within FDA’s Division of Mathematics have re-examined this issue and determined that smaller sample sizes can be used to attain a statistically valid confidence level for the establishment of a Baseline of Occurrence of Foodborne Illness Risk Factors.  The following presents the Division of Mathematics’ current guidance on assuring that sample sizes for Baseline measurements are statistically meaningful.

    SAMPLE SIZE RECOMMENDATIONS FOR LOCAL GOVERNMENT RETAIL FOOD SAFETY BASELINES

    A Working Paper by W. E. Bing Garthright, Ph.D., HHS/FDA/CFSAN/OSAS/Division of Mathematics

    February 7, 2002

    Many states, counties, and cities are beginning to plan their own retail food safety baseline measurements, based on the FDA project (“Report of the FDA Retail Food Program Database of Foodborne Illness Risk Factors”, 8/10/2000).  These activities will be called “local baselines” for brevity.  This working paper will recommend sample sizes for random selection of facilities to inspect, based on analyses done by Bing Garthright and Jerome Schneidman of FDA/CFSAN’s Division of Mathematics.

    For a local baseline for some facility types, the inventory of establishments is small enough that sample sizes can be smaller than those used in the FDA’s national assessment.  Local requirements should also be satisfied by a slightly less stringent requirement on confidence limits, which will also allow some reduction to sample sizes.  These two facts will lead to the recommendations below.

    John Marcello, an FDA regional retail food specialist, has proposed a theoretical profile of a local government inventory as follows:

    Hospitals                  6
    Nursing homes             36
    Elem. schools             48
    Fast food                420
    Full service             360
    Retail grocery stores    180

    I will recommend sample sizes for inventories of these sizes and bigger.

    The purposes of a local baseline would include these two:

    • compare the locality to FDA’s national baseline profile by risk factors;
    • identify the subset of the 42 items in the baseline that are most in need of  improvement.

    Of course states and local governments will want to see whether compliance with risk-based factors is improving or not over periods of several years.  The local situation is different from FDA’s, however, because local authorities have frequent contact with most of their inventories every year, and so they have many more points for comparison than just a baseline measurement. The locality will observe its improvements and declines in more detail than a periodic baseline, and will know more rapidly how its efforts are succeeding.

    There are many different goals that we could pursue that would lead to different sample size requirements.  Pursuing the most difficult goal will automatically provide big enough samples to satisfy the rest.  The most difficult goal is to identify those specific baseline items, out of FDA’s 42 items, that are most in need of priority attention.  Of course everyone wants every risk-related item to be as in compliance as possible, but with limited resources it is good to tackle the factors that are the least in compliance.  All of FDA’s 42 items are directly connected to risk, so FDA highlighted the least in compliance items in its August 10, 2000 report.  The 9 tables numbered 3 through 11 gave items deserving priority attention for each of the 9 facility types in our baseline.  We expect some degree of similarity in most local baseline results, so we will look at those tables when planning our statistical criteria.

    There is no single correct basis for setting a sampling plan for an operation like baseline measurement.  We determined by consulting FDA’s retail field specialists that some rough guidelines could be derived.  In particular, we view an item that is in compliance more than 80 percent of the time as needing improvement, but not as a priority; an item in compliance less than 60 percent of the time clearly deserves priority attention.

    There is a great body of valuable survey theory that deals with difficulties and complexities in collecting data and getting accurate and precise conclusions.  This theory is necessary when the conclusion will be to describe causality in social relations (e.g., children whose parents read more than 3 books per year earn $10,000 more than the average citizen).  Most of this theory is unnecessary for a baseline measurement, which simply gives a measurement of conditions at one particular time.  We will define as our completely accurate measurement the data that would result if we conducted baseline measurements at the entire inventory of establishments.  We will define the results of sampling a subset of the inventory by how accurately it reflects the data we would get by including the complete inventory.  This bypasses many complexities in sampling theory.

    If we want to give priority attention to items whose compliance (measured by the whole inventory) is less than 60 percent, then we have to decide what a successful measurement will be.  Many approaches are reasonable, but FDA used the following goal when determining its sample sizes relative to prioritizing items:

    If a particular baseline item has a compliance rate of no more than 60 percent, we want to have a high probability that our data will show a compliance rate of no more than 70 percent. 

    This means that we can treat items that score in compliance at less than 60 percent as clear priorities and treat those up to 70 percent as also of special concern.  I will call this objective the “60-70 objective”, for convenience. 

    FDA’s J. Schneidman has used statistical theory (the hypergeometric distribution) to see how well various sample sizes meet the 60-70 objective. 

    I suggest a goal of 95% confidence that a particular item with 60% total compliance would not be found to have more than 70% compliance in the randomly selected sample.  (This is less demanding than the 98.5% confidence of the 60-70 objective required for the national baseline, but we think it is justified by two facts: the consequences of an error are confined to one locality, and the locality would soon discover any such errors by their follow-up activities.)  The table below shows how many compliance observations must result from the sampling in order to achieve this. 

    Note that in this working paper, the term “observations” refers to findings of  “in compliance” or  “out of compliance”, but does not include “not applicable” or “not observed”.  The table below cannot be used directly, since we can’t predict the number of observations that would be achieved if the entire inventory were attempted.

    FOR ONE OF THE 42 ITEMS IN THE BASELINE:

    No. of observations that would       No. of observations
    result if the entire inventory       needed from the
    is attempted                         partial sample

     10                                    9
     20                                   16
     30                                   22
     40                                   28
     50                                   29
     60                                   32
     70                                   38
     80                                   38
     90                                   39
    100                                   42
    110                                   45
    120                                   45
    130                                   48
    150                                   48
    175                                   49
    200                                   52
    225                                   55
    250                                   58
    300                                   58
    350                                   58
    400                                   58
    450                                   58
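
    As an illustration of the kind of hypergeometric calculation behind the 60-70 objective, the Python sketch below computes the probability that a random sample shows no more than 70 percent compliance when the full set of observations is at 60 percent compliance. It is a sketch under stated assumptions, not the Division of Mathematics’ own procedure: the function names are hypothetical, the number of compliant observations is taken as 60 percent of the total rounded down, and the sketch is not guaranteed to reproduce the published table values.

```python
from math import comb


def hypergeom_pmf(total: int, compliant: int, sample: int, k: int) -> float:
    """P(exactly k compliant observations in a sample drawn without replacement)."""
    if k < max(0, sample - (total - compliant)) or k > min(sample, compliant):
        return 0.0
    return comb(compliant, k) * comb(total - compliant, sample - k) / comb(total, sample)


def prob_sample_within(total: int, sample: int,
                       true_pct: int = 60, threshold_pct: int = 70) -> float:
    """P(sample compliance <= threshold_pct) when the whole set is at true_pct.

    Assumption: the compliant count is true_pct of total, rounded down.
    """
    compliant = total * true_pct // 100
    max_k = sample * threshold_pct // 100  # largest count still at or under the threshold
    return sum(hypergeom_pmf(total, compliant, sample, k) for k in range(max_k + 1))


def smallest_sample(total: int, confidence: float = 0.95) -> int:
    """Smallest sample size whose probability reaches the requested confidence."""
    for n in range(1, total + 1):
        if prob_sample_within(total, n) >= confidence:
            return n
    return total  # sampling everything always satisfies the objective


if __name__ == "__main__":
    for total in (10, 50, 100):
        print(total, smallest_sample(total))
```

    Because the threshold count steps in whole observations, the probability does not rise smoothly with sample size, so a “smallest sample” found this way may differ from a figure chosen with additional smoothing or safety margin.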

     

    How can we adapt the above relationship for observations to the relationship for establishments, using the results of the FDA baseline study?  As was noted in Tables 3-to-11 of the FDA baseline study, many items are both applicable and observable at only a fraction of the inspections. This means that, for some particular item in the baseline, the numbers of establishments in the inventory really represent smaller numbers of observations, and so we must take that into account when setting our desired sample sizes.

    Tables 3-to-11 record, for the 9 individual facility types, a total of 55 mentions of baseline items that deserve the most priority for improvement.  I would expect these tendencies to be reflected to a great extent in most localities, and so we will use them as a guide in judging just how much to “oversample” in order to get adequate numbers of observations for making important decisions.

    When an item is much less than 60 percent in compliance, say less than 50 percent, it takes only a very small sample to give a result no more than 70 percent in compliance with 95 percent confidence.  We want to take into account the sampling that will do a good job for items that score very near to 60 percent.

    There were ten mentions of items that appeared to be between 58-62% in compliance, and they were observed at between 72 and 100 percent of the inspections, with an average of 87 percent of inspections.  We want to be able to capture enough observations for all such items, and we know that there will be some sampling error involved that requires that we assume an even lower level of observation to have high assurance of coverage.  Therefore, we will allow for the possibility that only 2/3 (67%) of the inspections yield observations.

    For example, suppose a locality has 90 elementary schools.  For an item of interest, we would suppose that there would exist a potential for 60 observations (2/3 of 90).  For this no. (60) of potential observations, our table above would require a sample of 32 observations. Using the 2/3 rule, we would sample 48 establishments (since 2/3 of 48 is 32).
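
    The arithmetic of the elementary school example can be traced with a short Python sketch. The observation table is copied from the working paper; everything else is an assumption made for illustration: the function names are hypothetical, a potential-observation count that falls between table entries is looked up at the next higher entry, and counts above 450 reuse the last entry.

```python
import math
from bisect import bisect_left

# Observations expected if the entire inventory is attempted ...
WHOLE_INVENTORY_OBS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130,
                       150, 175, 200, 225, 250, 300, 350, 400, 450]
# ... and the observations needed from the partial sample (same order).
NEEDED_OBS = [9, 16, 22, 28, 29, 32, 38, 38, 39, 42, 45, 45, 48,
              48, 49, 52, 55, 58, 58, 58, 58, 58]


def observations_needed(potential: int) -> int:
    """Look up needed observations; between entries, use the next higher one (assumption)."""
    i = bisect_left(WHOLE_INVENTORY_OBS, potential)
    return NEEDED_OBS[min(i, len(NEEDED_OBS) - 1)]


def establishments_to_sample(inventory: int) -> int:
    """Apply the 2/3 rule in both directions, as in the 90-school example."""
    potential = inventory * 2 // 3        # 2/3 of inspections yield an observation
    needed = observations_needed(potential)
    sample = math.ceil(needed * 3 / 2)    # establishments so that 2/3 of them give `needed`
    return min(sample, inventory)         # cannot sample more than the inventory


if __name__ == "__main__":
    print(establishments_to_sample(90))   # 90 schools -> 60 potential -> 32 needed -> 48
```

    This reproduces only the worked example. The establishment table published in the working paper also reflects smoothing and the other considerations discussed in the text, so this sketch should not be expected to match it for every inventory size.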

    But the example above is clearly over-simplified, since our sampling of 48 of the 90 schools could conceivably encounter as many as 48 or as few as 18 observations.  This involves the second layer of sampling errors, the sampling that coincides with observable items and with non-observable ones.  We will accept this oversimplification, however, for several reasons.  First, the probabilities suggest that mistakes will be very few.  Second, we have picked a hardest case to represent the test that our sampling must satisfy.  The FDA baseline items with 58-62% compliance averaged 87 percent observations, much higher than our conservative assumption of 67 percent, and so we have a cushion of over-sampling for these items.  Third, 45 out of 55 of the FDA items of concern were noticeably above or below 60 percent in compliance, and therefore we will not need such large samples in order to characterize them correctly.  Taken together, with a little smoothing at the upper end, these three reasons cause us to support the following table of samplings based on inventory sizes:

    ESTABLISHMENT INVENTORY SAMPLE SIZES

    Inventory size    Sample size
    <9                all
    9                 8
    10-12             9
    13                12
    14-19             14
    20-24             18
    25-28             23
    29-31             24
    32-36             27
    37-43             29
    44-51             33
    52-58             38
    59-73             42
    74-81             44
    82-96             48
    97-103            53
    104-133           57
    134-148           59
    149-163           63
    164-186           68
    187-261           72
    262-291           74
    292-328           78
    329-373           83
    374+              87

    This will give the following sample sizes for the theoretical example posed by John Marcello:

    Type                  Inventory    Sample size
    Hospitals             6            6
    Nursing homes         36           27
    Elem. schools         48           33
    Fast food             420          87
    Full service          360          83
    Retail food stores    180          68
    Totals                1050         304
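
    For readers who want to apply the inventory table mechanically, the lookup can be sketched as follows. The band limits are transcribed from the table of establishment inventory sample sizes; reading the first band as "fewer than 9 establishments, inspect all" is our interpretation, and the function and variable names are hypothetical.

```python
# (largest inventory in the band, sample size); None means "inspect all".
# Assumption: the first band covers inventories of fewer than 9.
BANDS = [
    (8, None), (9, 8), (12, 9), (13, 12), (19, 14), (24, 18), (28, 23),
    (31, 24), (36, 27), (43, 29), (51, 33), (58, 38), (73, 42), (81, 44),
    (96, 48), (103, 53), (133, 57), (148, 59), (163, 63), (186, 68),
    (261, 72), (291, 74), (328, 78), (373, 83),
]
LARGEST_SAMPLE = 87  # inventories of 374 or more


def sample_size(inventory):
    """Look up the sample size for one facility type's inventory."""
    if inventory < 1:
        raise ValueError("inventory must be a positive integer")
    for upper, size in BANDS:
        if inventory <= upper:
            return inventory if size is None else size
    return LARGEST_SAMPLE


# The theoretical example from the text.
example = {
    "Hospitals": 6, "Nursing homes": 36, "Elem. schools": 48,
    "Fast food": 420, "Full service": 360, "Retail food stores": 180,
}
samples = {name: sample_size(n) for name, n in example.items()}
total_inventory = sum(example.values())
total_sample = sum(samples.values())
for name, n in example.items():
    print(f"{name:20s}{n:6d}{samples[name]:6d}")
print(f"{'Totals':20s}{total_inventory:6d}{total_sample:6d}")
```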

        

    This working paper supersedes the sampling scheme that I spelled out in my prepared remarks, delivered in my absence by John Marcello, for the Pacific Northwest Regional Meeting in August of 2001.  (The regional meeting remarks would have recommended 390 inspections for the example above.)  This paper represents CFSAN’s best advice for sample sizes of inspections for local baseline studies.

    Postscript: When the tables are used for Retail food stores, they really represent the numbers of each of the four retail food store departments to be measured.  It will be necessary to visit more than this number of stores in order to achieve coverage of the less frequently encountered departments.  Guidance for this will be developed by FDA’s regional specialists and by the Clearinghouse Workgroup for Program Standards.

  1. Baseline Surveys – Use of lower confidence levels than recommended in the FDA Data Collection Manual
    Question/Problem
    I am the director of a jurisdiction that is participating in the Standards and have completed my self-assessment. Although I would like to conduct a risk factor baseline survey, I have very limited resources. The FDA Data Collection Manual recommends sample sizes that will result in a 95 percent confidence level. It seems that if I am willing to accept a lower confidence level, for example 80 or 90 percent, I can collect fewer samples. This will allow me to conduct my survey using fewer person hours.

    Rationale: I realize that I would not be able to compare my results with the FDA National data. I also realize that the results would not be as reliable using a lower confidence level; however, I think the information I gather will be sufficient to help me tweak my program to gain some improvements. I’m not sure I need the scientific justification of a 95 percent confidence level. If I’m willing to accept the lower confidence levels, are there other reasons why I shouldn’t reduce sampling to stretch my resources?

    Clearinghouse Work Group Response
    There are a number of issues to be considered here. In a nutshell, the statistics show that although you may be able to reduce sample size somewhat, your ability to measure trends over time is greatly compromised. You will lose precision to such a degree that you may not be able to detect increases or decreases in risk factor compliance in future surveys. Indeed, upward trends in compliance may even be mistaken for downward trends. The complete mathematical explanation of this phenomenon, which argues against using confidence levels lower than the 95 percent (95%) outlined in the “FDA Data Collection Manual,” is included as an answer addendum at the end of this Clearinghouse response.

    The surveys are intended to track over time the occurrence of risk factors known to cause or contribute to foodborne illness. The idea is that the information uncovered will allow you to focus your efforts in selected areas where compliance is low in order to achieve significant improvement. Future surveys would then reveal whether your efforts and strategies were successful in changing the occurrence of the selected risk factors. If your survey is conducted in such a way that you are unable to identify changing trends in risk factor occurrence, then the purpose of the survey is defeated. You may conserve resources used to conduct the surveys, but if the information gathered does not serve the intended purpose, then the resources will have been wasted.

    An initial baseline survey and future risk factor surveys can be a tremendously powerful tool to demonstrate the usefulness of your program to the Board of Health, City Council or whatever body has influence over your budget and resources. For the first time, there exists an effectiveness measure for a public health program. It has always been difficult to justify preventive programs, especially during austere economic times. The surveys allow you to identify areas that represent potential problems affecting consumer health and the well-being of the community at large. You can then develop logical strategies to reduce the risk in those specific problem areas and to demonstrate the positive impact of your program. Conducted properly, risk factor surveys can provide tangible justification for your food program in a way never before possible. This being the case, your surveys should be conducted in such a way as to maintain the highest integrity and maximum usefulness of the survey results. For these reasons the Clearinghouse cannot recommend the use of lower confidence levels.

    Answer addendum
    DISCUSSION OF IMPACT OF CONFIDENCE LEVELS ON DATA PRECISION, Prepared by Jerome Schneidman, FDA Division of Mathematics

    Recall that our original sample sizes for state and local baselines, as presented in the Data Collection Manual, were calculated to give 95% confidence that a data item that was 60% or less in compliance would be found to be no more than 70% in compliance in the sample (pages 48-49). We were asked to explore the effect on sample size if we reduced the confidence goal from 95% to 80% or to 90%.

    80% Confidence
    Provided the number of establishments in a facility type is no more than 15,951, an 80% confidence goal yields a sample size of no more than 29 (i.e., 29 or fewer). Assuming nonresponse (not observed or not applicable) similar to what FDA experienced, this could easily lead to only about 20 observations for a data item. Under such a scenario, there would be only 21 possible counts: 0 IN, 1 IN, 2 IN,…,19 IN, 20 IN. Similarly, this yields only 21 possible values for % IN: 0%, 5%, 10%,…, 95%, 100%. Such limited possibilities for the results give too little information to be of much use. With such a small sample, there will be almost no ability to detect small changes from repeated baselines. In fact, there would be a good chance that a small increase in compliance would erroneously show up as a decrease. We cannot recommend such a small sample size and would urge rejection of using only 80% confidence.
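
    The two points in this paragraph, the coarse 5-point steps and the risk of an increase showing up as a decrease, can be illustrated with a short calculation. The sketch below is not the method used to derive the sample sizes. It assumes that each survey's IN count follows an independent binomial distribution, that true compliance rises from 60% to 65% between surveys, and that every sampled establishment yields an observation. The comparison size of 87 is the largest sample size in the 95% confidence table; all function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k IN observations out of n when the true
    compliance rate is p (binomial model, an assumption of this sketch)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_increase_looks_like_decrease(n, p_before, p_after):
    """Probability that the second survey's observed IN count is strictly
    lower than the first survey's, for two independent surveys of size n."""
    return sum(
        binom_pmf(x, n, p_before) * binom_pmf(y, n, p_after)
        for x in range(n + 1)
        for y in range(x)          # y < x: second survey looks worse
    )


n_small = 20
possible_percentages = [100 * k / n_small for k in range(n_small + 1)]
print(possible_percentages)        # the 21 possible values of % IN

flip_small = prob_increase_looks_like_decrease(20, 0.60, 0.65)
flip_large = prob_increase_looks_like_decrease(87, 0.60, 0.65)
print(f"n=20: {flip_small:.2f}   n=87: {flip_large:.2f}")
```

    Under these assumptions the chance of a misleading downward result is noticeably higher with 20 observations than with 87, which is the direction of the effect described above. The exact figures depend on the assumed compliance rates.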

    90% Confidence
    The sample size results are summarized as follows.

    Population Size      Sample Size
    763 or less          57 or less
    764 - 1,311          59
    1,312 - 3,591        63
    3,592 or above       68

    We don’t recommend using this either, because the loss of precision that comes with these smaller sample sizes will make it more difficult to show or detect small changes from repeated baselines. This difficulty cannot be quantified until the particular data has been collected. We can illustrate using example scenarios.

    Example: With these sample sizes, we have 90% confidence that a data item that was 60% or less in compliance would be found to be no more than 70% in compliance in the sample. It is also expected to be more difficult to show a small change from, say, 60% IN to 65% IN or 70% IN to 75% IN than with our original sample sizes. Furthermore, the probability of detecting such changes from repeated baselines is expected to be less than .90.

    Under these sample sizes, jurisdictions will find it more difficult to detect changes from repeated baselines. With full understanding of these caveats, these smaller sample sizes could be used; however, we still do not recommend them. If used at all, this 90% confidence level should probably be restricted to very small states and jurisdictions. It is probably not appropriate for jurisdictions with large populations since public health is at issue.

  2. Survey Reports – The use of the number 32 as a reporting cut off for out of compliance elements.
    Question/Problem
    In the Report of the FDA Retail Food Program Database of Foodborne Illness Risk Factors, it appears that at least 32 out-of-compliance observations was the cut off mark for reporting and prioritizing risk factors. I cannot find an explanation of why 32 was used as the cut off. Is there a statistical significance to this number? And does this figure apply to all jurisdictions conducting risk factor studies as well?

    Clearinghouse Work Group Response
    The rationale for choosing 32 Out-of-Compliance observations as the cut off point for determining which individual data items deserved priority attention is discussed on page 22 of the mentioned report. Basically, FDA analysts sorted the data items by number of OUT-of-Compliance observations, ranking them in order from the one with the highest number to the one with the lowest number of OUT-of-Compliance observations. The analysts then looked for a point in the list of ranked data items after which the number of OUT observations began to decrease more rapidly or were farther apart. This ‘natural break’ was the cut off value used for each facility type.
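
    One possible way to mechanize the search for a ‘natural break’ is sketched below. It is not the procedure the FDA analysts used; the response above describes only a visual inspection of the ranked list. The rule applied here, cutting at the largest gap between consecutive ranked counts, is our own simplification. The item names and counts are invented for illustration, chosen so that the break happens to fall at 32.

```python
def natural_break_cutoff(out_counts):
    """Return the lowest OUT count that lies above the largest gap
    between consecutive counts in the ranked (descending) list."""
    ranked = sorted(out_counts.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two data items to find a break")
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    break_index = gaps.index(max(gaps))   # first occurrence of the largest gap
    return ranked[break_index]


# Hypothetical OUT-of-Compliance counts for one facility type.
out_counts = {
    "item A": 61, "item B": 55, "item C": 47, "item D": 40,
    "item E": 32, "item F": 15, "item G": 12, "item H": 9,
}

cutoff = natural_break_cutoff(out_counts)
priority_items = [
    name
    for name, count in sorted(out_counts.items(), key=lambda kv: -kv[1])
    if count >= cutoff
]
print(cutoff, priority_items)
```

    A jurisdiction with different sample sizes or observation rates would expect a different cutoff for each facility type, so any such rule should be run separately on each facility type's data.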

    This approach appears to have worked well for the FDA Baseline. However, this is not the only approach that can be used and other approaches may be appropriate for individual jurisdictions conducting a risk factor survey. For example, with the possibility of different sample size requirements and different observation rates for different facility types, state and local jurisdictions may decide to choose different cutoff points for highlighting data items in need of priority attention for each of the facility types.

    The recipe used in the FDA Baseline Report for identifying data items needing priority attention does not use the OUT-OF-COMPLIANCE percentage (rate). Instead, the approach considers only the number of OUT-OF-COMPLIANCE observations as the criterion. This means that individual data items with high OUT-OF-COMPLIANCE rates but few observations will not be highlighted using the FDA approach.
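
    The difference between ranking by count and ranking by rate can be seen in a small example. The data items and counts below are invented for illustration, and the function name is hypothetical.

```python
# Hypothetical IN / OUT observation counts for three data items.
items = {
    "item A": {"IN": 60, "OUT": 40},
    "item B": {"IN": 2, "OUT": 8},     # high OUT rate, few observations
    "item C": {"IN": 50, "OUT": 35},
}


def out_rate(record):
    """OUT-OF-COMPLIANCE percentage among IN and OUT observations."""
    return 100 * record["OUT"] / (record["IN"] + record["OUT"])


# Ranking by number of OUT observations (the criterion described above).
by_count = sorted(items, key=lambda name: items[name]["OUT"], reverse=True)

# Ranking by OUT rate, which the described approach does not use.
by_rate = sorted(items, key=lambda name: out_rate(items[name]), reverse=True)

print("by count:", by_count)
print("by rate: ", by_rate)
```

    Item B has the highest OUT rate but the fewest OUT observations, so it ranks last by count and would not be highlighted by a count-based cutoff.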

    The goal of conducting repeated baseline surveys over time is to measure trends in the occurrence of foodborne illness risk factors. Note that progress is measured in terms of the amount of increase in the overall percent of IN COMPLIANCE observations for all data items combined. This is done separately for each facility type. (This is the ratio of total “IN” observations for all data items combined to the total of “IN” observations plus total “OUT” observations for all data items combined.)

    Overall Baseline IN Compliance percentage for a Facility Type =

         (Total number of IN Compliance observations for all data items)
        --------------------------------------------------------------------------------------------------  X 100%
         (Total number of IN Compliance observations + Out of Compliance observations for all data items)

    The reality of this approach is that those individual data items that are seldom observed or are frequently noted as not applicable will have little impact on this score. What affects the overall baseline measurement are the data items that are frequently observed. Practically speaking, this means focusing on the items with the most OUT OF COMPLIANCE observations.
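
    A short calculation illustrates both the formula and the point about seldom-observed items. The counts below are invented for illustration, and the function and variable names are hypothetical.

```python
def overall_in_percentage(items):
    """Overall IN Compliance percentage for one facility type:
    total IN / (total IN + total OUT) * 100, over all data items."""
    total_in = sum(record["IN"] for record in items.values())
    total_out = sum(record["OUT"] for record in items.values())
    if total_in + total_out == 0:
        raise ValueError("no IN or OUT observations recorded")
    return 100 * total_in / (total_in + total_out)


# Hypothetical counts: one frequently observed item, one seldom observed.
baseline = {
    "frequent item": {"IN": 60, "OUT": 40},
    "rare item": {"IN": 1, "OUT": 4},
}
# Scenario 1: the rare item is brought fully into compliance.
rare_fixed = {
    "frequent item": {"IN": 60, "OUT": 40},
    "rare item": {"IN": 5, "OUT": 0},
}
# Scenario 2: the frequent item improves from 60 IN to 70 IN.
frequent_improved = {
    "frequent item": {"IN": 70, "OUT": 30},
    "rare item": {"IN": 1, "OUT": 4},
}

base_score = overall_in_percentage(baseline)
gain_from_rare = overall_in_percentage(rare_fixed) - base_score
gain_from_frequent = overall_in_percentage(frequent_improved) - base_score
print(f"{base_score:.1f} {gain_from_rare:.1f} {gain_from_frequent:.1f}")
```

    In this made-up case, completely fixing the rare item moves the overall score less than a 10-point improvement in the frequently observed item.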

    However, there is no reason why states and local jurisdictions cannot also consider items that have high OUT OF COMPLIANCE percentages but do not have a large number of observations. If you decide that some of these items represent important problems and you have sufficient resources, you may wish to work on improving these items as well as the problematic data items with many observations. Additionally, if you determine that some items may be improved with very little effort, it may be wise to address these, regardless of how often they occur. Be aware, however, that you cannot expect efforts devoted to data items that have low observation rates to have a substantial effect on future baseline measurement trends.

    In conclusion, as a state or local jurisdiction you may list as many data items as you like in your reporting and analysis, selecting them in order of the number of OUT-OF-COMPLIANCE observations, and prioritizing the items to be worked on in the same order, based on your resource constraints. This should be done separately for each facility type. The number of items listed is at your discretion. You are not required to list the same number for each facility type. The number you choose to work on and the amount of effort you wish to expend on each is up to you.
