Standard #5
- Final Outbreak Reports
Question/Problem
The second paragraph under “Description of Requirement” in
Standard 5 discusses follow-up on complaints of alleged food-related
illness or injury. At the end of that paragraph, it says “the
final report of the investigation is shared with the state epidemiologist
and the Centers for Disease Control and Prevention.” Are all
complaint follow-ups to be reported to the epidemiologist and CDC
or only those that meet the definition of a foodborne illness?
Clearinghouse Work Group
The final reports of investigations that meet the definition of a foodborne
illness should be shared with the state epidemiologist and CDC. The paragraph does seem to mix
apples and oranges. The intention is for you to record all complaints of
alleged illness or injury, to perform an assessment of the complaint to determine
appropriate follow-up, and to link that information to the establishment record for
retrieval purposes in order to identify patterns and trends. In some instances an
investigation may be performed and a short report written on complaints that do not meet
the official definition of a foodborne illness. You are not required to share work
reports of that nature with the state epidemiologist and CDC. For
all investigations that meet the definition of a foodborne illness,
a final report is to be written and
shared with the state epidemiologist and CDC.
Standard #6
- Determining Conformance with a Standard
Many of the criteria contained in Standard #6 are predicated on the inspection program
containing the risk-based approaches outlined in earlier Standards, in particular Standard
#3. Our initial review of Standard #6 indicates that many of the criteria rely on
forms that record and quantify the status of risk factors/interventions and other serious
violations.
Since our current program lacks a definitive step-by-step compliance and
enforcement process based on the occurrence and correction of risk factors and
interventions, the assessment protocol outlined in Standard #6 cannot be completed as
designed. Assessing staff’s consistency in these areas through file reviews
seems premature until all the components are in place.
Question/Problem
Our internal self-assessment process has revealed significant gaps in our compliance and
enforcement program. Several risk-based components contained in Standard
#6’s criteria must be developed and integrated into our program before a meaningful
file review can be performed against all the criteria contained in the Compliance and
Enforcement Standard.
For this initial self-assessment, is it sufficient to note the gaps within our current
compliance and enforcement program as the rationale for why we do not meet Standard #6, or
must we also complete a file review of randomly selected establishments? If we are
to continue on with a file review, against what criteria do we assess compliance with the
Compliance and Enforcement program components we already have in place?
Clearinghouse Work Group
This question is similar to a prior one concerning Standard 4, and the response applies
regardless of the Standard in question. If a cursory look at a Standard compared to
your program is sufficient to reveal gaps that prevent you from meeting the Standard and
provides a rationale for your conclusion, then it is not necessary for you to proceed
further. No one wants you to spend time that is not productive. You only need
go as far as necessary to identify the gaps you would need to fill, then establish a
strategic plan for ultimately meeting the Standard.
- Risk Categories for Establishments
Question/Problem:
Risk categories for prioritizing establishment inspections need to
be established and put into the Program Standards Guidelines.
Clearinghouse Workgroup Response
The Workgroup agrees. See response to Standard #3, Problem 4.
- Spreadsheet with Embedded Math Formulas
Question/Problem
There is a need for a standardized reporting and collecting
format for Standard #6.
Clearinghouse Workgroup Response
Since a ‘format’ is included in the Appendix to the Standard,
it appears that a mathematical spreadsheet for calculating the
columns in the worksheet in the Appendix for Standard #6 is being
requested. The Workgroup will recommend that FDA collect existing
spreadsheets, if they exist, from the participating jurisdictions.
- Files with No Risk Factor/Intervention Violation on the
Start-Point Inspection
Question/Problem
The process for self-assessment against Standard 6 is unclear.
It is not clear from the Appendix F worksheet instructions how
to mark files drawn for review that do not have a risk factor
or Food Code intervention violation on the ‘start-point
inspection.’ Should the self-assessor keep drawing files
until he/she finds the requisite number of files with violations
on the start-point inspection or are files without a violation
considered as ‘passing’ files?
Clearinghouse Work Group
While Standard 6 implies the answer, the Appendix F instructions
create some confusion. Item 3 under the essential
program elements required to meet Standard 6 says that there must
be “documentation on the establishment inspection report
form or in the establishment file that compliance and/or enforcement
action was taken to achieve compliance at least 80 percent of
the time when out-of-control risk factors or interventions are
recorded on a routine inspection measured using the procedures
in Appendix F.” The ‘when’ in the sentence
indicates that the files to be counted are ones where a risk
factor/intervention violation exists.
Appendix F says that in order for an establishment file
to ‘pass,’ each
column marked with a violation at the start-point inspection must
have a subsequent “yes” answer to indicate that at
least one type of follow-up action was taken. This again clearly
defines what constitutes a passing file. The Appendix also goes
on to define a ‘failing’ file as a start-point violation
without a final resolution.
The confusion occurs when the instructions say to divide
the number of files that passed by the number of files reviewed
to determine
the percentage. This assumes that every file reviewed would have
a violation on the start-point inspection and would, therefore,
be either a ‘pass’ or a ‘fail’ file. This
is not the case when the start-point inspection does not reflect
a risk factor or intervention violation. Files without a violation
at the start-point inspection do not meet either the pass or fail
criteria, and using the math formula as described would not demonstrate
an 80 percent resolution when violations are identified. The instruction
should say to divide the number of files that pass by the number
of files showing a risk factor or Food Code intervention on the
start-point inspection.
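As an illustration only, the corrected calculation can be sketched in Python. The field names `start_point_violation` and `passed` and the file counts are hypothetical and do not come from Appendix F; the sketch simply leaves files without a start-point violation out of the denominator.

```python
def resolution_rate(files):
    """Percent of files WITH a start-point violation that passed.

    Each file is a dict with two hypothetical boolean fields:
    'start_point_violation' and 'passed'.
    """
    # Only files showing a start-point violation can pass or fail.
    counted = [f for f in files if f["start_point_violation"]]
    if not counted:
        raise ValueError("no files with a start-point violation were drawn")
    passed = sum(1 for f in counted if f["passed"])
    return 100.0 * passed / len(counted)


# Hypothetical draw of 25 files; 5 have no start-point violation.
files = (
    [{"start_point_violation": True, "passed": True}] * 17
    + [{"start_point_violation": True, "passed": False}] * 3
    + [{"start_point_violation": False, "passed": False}] * 5
)
rate = resolution_rate(files)
print(f"{rate:.0f} percent of counted files passed")
```

In this made-up draw, dividing the 17 passing files by all 25 files reviewed would give 68 percent, while dividing by the 20 files that actually had a start-point violation gives 85 percent.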
Advice from the FDA Division of Mathematics says that the random
draw of files must continue until the assessor finds the requisite
number of files that have a start-point violation. Files that do
not have a violation at the start-point inspection do not meet
the criteria established for the sample. The minimum random draw
of files with a start-point violation will be 20 files for jurisdictions
with fewer than 400 establishments and five percent (5%) or 70 files,
whichever is less, for jurisdictions with 400 or more establishments.
The number of alternate files suggested under the Supplemental
Sampling heading in Appendix F should also probably be increased
to 50 percent of the original sample size to accommodate the number
of files that may be excluded from the draw because of no start
point inspection violation.
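The file-count rule in the paragraph above can be sketched as follows. This is a minimal illustration, assuming that five percent is rounded up to a whole file and that the 50 percent alternate pool is also rounded up; neither rounding choice is stated in the response, and the function names are hypothetical.

```python
def required_files(establishments):
    """Minimum number of files with a start-point violation to review."""
    if establishments < 400:
        return 20
    # Five percent of the inventory (rounded up, an assumption) or 70,
    # whichever is less.
    five_percent = (establishments * 5 + 99) // 100
    return min(five_percent, 70)


def alternate_files(sample_size):
    """Suggested alternates: 50 percent of the original sample (rounded up)."""
    return (sample_size + 1) // 2


for inventory in (150, 400, 1000, 5000):
    n = required_files(inventory)
    print(inventory, n, alternate_files(n))
```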
Standard #7
- Strengthen Standard #7
Question/Problem
There needs to be more emphasis on interacting with industry and
community. How many are feasible? We recommend the Standard be altered
to recommend a total of four activities annually (two in each category).
Clearinghouse Workgroup Response
The Clearinghouse believes that it is too early to determine whether
a Standard should be made more stringent or lenient. Information should
be gathered from the current participants, and then recommendations
for change can be based on experiential data gathered from across the
country from both large and small jurisdictions. The Clearinghouse
does not recommend action on this item at the present time.
Standard #8
- Full-time Equivalent to Inspection Ratio
Question/Problem
FTE inspection ratio is not attainable. Develop more realistic
criteria. We recommend that FDA approve the innovative grant for the
time study to develop realistic data on this issue. We also recommend
that CFP Program Standards Committee use the information developed
in the time study (still to be approved) to re-evaluate the FTE ratio
with respect to the number of inspections and/or allowing local staffing
formulas.
Clearinghouse Workgroup Response
About the grant process, a panel of ten or so judges, including
experts from outside the FDA organization, rate the innovative grant
proposals independently based on the published rating criteria for the
grants. Although the panel meets to present and discuss the merits of
each proposal, each judge separately scores each proposal on each of
the several rating criteria. After each of the judges rates the proposals,
the separate ratings from each of the judges are averaged. A separate
grant board (FDA uses the National Institutes of Health Grant Board)
then ranks the proposals in descending order according to their averaged
scores, with the lowest score being the best. Grants are given starting
with the proposals at the top of the list and continuing until the pool
of grant money available is exhausted.
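The averaging and ranking steps described above can be sketched as below. The proposal names, judge scores, requested amounts, and pool size are all hypothetical, and the sketch assumes funding stops at the first proposal the remaining pool cannot cover, which the description does not spell out.

```python
from statistics import mean


def rank_and_fund(judge_scores, requested, pool):
    """Average each proposal's scores, rank them, and fund from the top.

    judge_scores: {proposal: [score from each judge]} (lower is better)
    requested:    {proposal: amount requested}
    pool:         total grant money available
    """
    averaged = {name: mean(scores) for name, scores in judge_scores.items()}
    # Best (lowest) averaged score goes to the top of the list.
    ranked = sorted(averaged, key=averaged.get)
    funded = []
    for name in ranked:
        if requested[name] > pool:
            break  # assumption: stop once the pool cannot cover the next one
        funded.append(name)
        pool -= requested[name]
    return ranked, funded


# Hypothetical data for three proposals scored by three judges.
judge_scores = {"A": [2, 3, 1], "B": [1, 1, 2], "C": [4, 5, 3]}
requested = {"A": 50, "B": 40, "C": 30}
ranked, funded = rank_and_fund(judge_scores, requested, pool=100)
print(ranked, funded)
```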
We agree that a time study of inspections conducted in accordance
with the Standards’ inspectional requirements and including all the
activities in the definition of an inspection in Standard #8
would be most useful. The time study mentioned in the issue was funded
by a grant from FDA. As in all studies, its usefulness will be
determined by the quality of the study design. The report of the
study outcome will be reviewed by FDA with great interest. The Clearinghouse
expects that important data will be gathered as well from the jurisdictions
currently participating in the self-assessment process. As stated
in other responses, the Clearinghouse believes that recommendations
for changes will need to be based on solid information and should reflect
the best practices in the food safety community. The Clearinghouse
Workgroup recommends no action at this time.
Standard #9
- Standard Self-Assessment Process for Decentralized Jurisdictions
I work for a State regulatory food program. The responsibility
for regulatory oversight of retail food and foodservice operations
has been passed on to local county
agencies through a formal delegation agreement. The State agency,
itself, has very little direct regulatory inspection responsibilities
within the retail food sector.
My agency has enrolled in the Program Standards. Our staff is able
to initiate a self-assessment of our State Program against some
of the Standards, including:
- Standard #1 Regulatory Foundation
- Standard #2 Trained Regulatory Staff
- Standard #5 Foodborne Illness Surveillance
- Standard #7 Industry and Community Relations
Some of the other Standards, however, present significant challenges
to our successful completion of a self-assessment because they
rely heavily on the structure and process
related to direct inspection work. These Standards include:
- Standard #3 Incorporating the Principles of HACCP into
Regulatory Inspections
- Standard #4 Inspection Uniformity
- Standard #6 Compliance and Enforcement
- Standard #8 Program Resources
- Standard #9 Program Assessment
Question/Problem
How should State Programs that have delegation agreements with local
agencies for direct regulatory inspections of the retail segment
of the industry conduct a self-assessment of
their own program? What parameters should be used to assess
compliance with those Standards listed above that rely substantially
on an assessment of the structure and/or
process pertaining to on-site inspection work and related files?
Clearinghouse Work Group Response
In the circumstance that you describe, the application/implementation
of the Standards falls into two areas, those areas that are a direct
part of your program and those areas
that you manage. Legal delegations can be accomplished using
several different written instruments such as delegation agreements,
contracts for service, or memoranda of
understanding. For those pieces of the program that you manage
through delegation, you should establish written criteria to be followed
by the delegatee in the performance
of those delegated duties. You can meet the Standards in those
areas that you have delegated by demonstrating the following elements:
a. That criteria exist in your formal delegation document
that meet the Standards criteria for those areas,
b. That you regularly perform a monitoring, oversight,
or audit function of retail food programs that have entered
into a delegation agreement or contract with your agency to
ensure that the criteria are being met (we suggest that you require
by delegation document
that the delegatee perform a self-assessment and develop plans
to bridge gaps in order to make oversight/auditing less resource
intensive), and
c. That you require the delegatee to develop and implement
action plans for correction if it does not meet the criteria
in the delegation agreement or contract.
For your self-assessment of delegated program areas, you
will determine the presence or absence of these three elements
(a. through c. above) for delegated functions. Of
course, you also will perform a self-assessment against the other
Standards requirements
for pieces of the program that you perform directly.
- Facility Types to be Included in Baseline Surveys
Our jurisdiction is considering limiting our baseline survey data collection
to only one of the facility types that we regulate. We are thinking
of surveying only the full service restaurants since it is the more
complex segment of
the industry and includes the majority of our permitted establishments.
There are two reasons for limiting the scope of our survey. The first
reason is
to conserve the expenditure of resources during these tight budget
times. The second reason is that we would like to gain some experience
in the methodology
and surveying techniques before we put too many resources into the
process only to discover that changes need to be made in the process.
Question/Problem
If we limit the scope of our data collection survey to only one of
the facility types that we regulate, will we still meet the intent
of Standard 9?
Rationale
We believe that we will meet the intent of Standard 9 by surveying
only one facility type. The Standard does not spell out which facility
types must be surveyed. It simply requires that baseline data be
collected and that additional data be collected on subsequent three-year
cycles.
Further, FDA did not collect information on all of the potential
facility types in existence or that might be regulated by a jurisdiction.
Therefore,
we should be free to select the scope of the survey that meets
our needs.
Clearinghouse Work Group Response
You are correct in stating that Standard 9 does not spell out which
facility types must be surveyed. There was recognition by the drafters
of the Standard that jurisdictions vary in the number and types of
facilities that they regulate. The Standard’s description of the
requirement, however, does specify that the intent of the Standard
is to measure
trends and to determine whether there has been a net change over time
in the occurrence
of risk factors and the use of Food Code interventions. The stated
outcome further clarifies that this Standard is intended to enable
program managers
to measure their program against national criteria, and to identify
program elements that may require improvement or be deserving of recognition.
The Clearinghouse Workgroup believes that a data collection survey
must include, as a minimum, all of the facility types identified
in the FDA
National Baseline that are regulated by a jurisdiction. As demonstrated
in the “Report of the FDA Retail Food Program Database of the Foodborne
Illness Risk Factors,” different facility types are likely to have
different risk factors in need of priority attention. Surveying only
one facility type presents an incomplete picture and will not give
a complete measure of trends over time.
In FDA’s national survey, it chose to survey the major facility
types for the three industry segments. A direct focus on these industry
segments provided a breadth of coverage of general and highly susceptible
populations while also covering the vast majority of establishment
types. The nine identified facility types are:
Institutions
- Hospitals
- Nursing Homes
- Elementary Schools (K-5)
Restaurants
- Fast Food Restaurants
- Full Service Restaurants
Retail Food Stores
- Deli Departments
- Meat Departments
- Seafood Departments
- Produce Departments
These are the nine facility types for which there is national
data; and if you regulate any of these nine facility types, they
should be included
in a data collection study to meet the intent of Standard 9. You
may, if you wish, survey facility types in addition to the nine
identified types,
but you are not required to go further in your data collection efforts.
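The rule above amounts to intersecting the facility types a jurisdiction regulates with the nine national baseline types. A minimal sketch, in which the function name and the example jurisdiction (including the non-baseline type "Mobile Food Units") are hypothetical:

```python
# The nine facility types listed in the response above.
FDA_BASELINE_FACILITY_TYPES = {
    "Hospitals",
    "Nursing Homes",
    "Elementary Schools (K-5)",
    "Fast Food Restaurants",
    "Full Service Restaurants",
    "Deli Departments",
    "Meat Departments",
    "Seafood Departments",
    "Produce Departments",
}


def required_survey_types(regulated_types):
    """Baseline facility types that the jurisdiction regulates.

    These are the types the survey should cover; other regulated types
    may be surveyed but are optional under this reading.
    """
    return sorted(FDA_BASELINE_FACILITY_TYPES & set(regulated_types))


# Hypothetical jurisdiction regulating two baseline types and one other.
regulated = {"Full Service Restaurants", "Hospitals", "Mobile Food Units"}
print(required_survey_types(regulated))
```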
Request to Reconsider Previous Interpretation regarding Baseline
Surveys.
The question of whether all facility types under a jurisdiction’s
authority must be included in the baseline survey in order to meet
the Standards was answered previously in the affirmative by the Clearinghouse.
The Hawaii District Health Office, Placer County Environmental Health,
Santa Clara Department of Health, Sonoma County Environmental Health,
and
the San Diego County Environmental Health asked the Clearinghouse
to reconsider the issue. The group presented in writing a number
of reasoned arguments
in favor of recognizing the accomplishment of surveys of the largest
pool of establishments. They reasoned that intervention strategies
developed for a program in response to surveys of full-service establishments
(or
other largest pool of a jurisdiction’s establishments) would have
an impact on the other facility types as well, and that future surveys
could include other facility types.
ADDITIONAL INFORMATION AND RESPONSE: While the Clearinghouse stands
behind its previous interpretation, here are some additional thoughts.
Baseline
surveys are now considered a part of Standard 9 for purposes of Standards
accomplishments. Failure to meet Standard 9 will not have additional
consequences that influence participation or enrollment in the
Standards as a whole.
There is some sympathy for the point of view that staggered facility-type
baselines may have utility as far as conservation of resources; however,
there is overall support for requiring a definite point in time where
all facility types under the jurisdiction’s authority have been
included in a survey. Since risk factor surveys need only be completed
once every
three years (every 5 years as of the 2004 CFP recommendation), there
is no reason why the surveys of the various facility types cannot
be conducted
independently over the 3- or 5-year evaluation period as long as
all the facility types under the jurisdiction’s authority are surveyed
within the recurring survey cycle. This procedure would meet the
intent of Standard 9.
- Baseline Survey Sample Size
At the Program Standards workshop, information was presented related
to determining a jurisdiction’s sample size to ensure a valid Baseline
measurement of CDC-identified
foodborne illness risk factors. In order to ensure a comparable
baseline with FDA, a jurisdiction that has 100 or more establishments
in any of the 9 categories was instructed
to sample at least 100 of those establishments in each category for
a valid sample size. If a category had fewer than 100, the jurisdiction
was expected to sample all the facilities within that category.
Question/Problem
Aren’t the sample size parameters presented above unnecessarily
high given the fact that FDA’s sample size for any of the
nine categories did not exceed 100 and theirs is a national study
comprising about one million establishments? Is
there an alternative to this suggested model that would provide a
statistically valid confidence
level given the much smaller total number of establishments within
any given jurisdiction?
Rationale: While we are awaiting feedback
from the work group, we strongly believe that a statistically
valid baseline is achievable from a sample size that is
significantly less than what the FDA has presented as a model.
Clearinghouse Work Group Response
Statisticians within FDA’s Division of Mathematics
have re-examined this issue and determined that smaller sample
sizes can be used to attain a statistically valid
confidence level for the establishment of a Baseline of Occurrence
of Foodborne Illness Risk Factors. The following presents
the Division of Mathematics’ current guidance on assuring that sample
sizes for Baseline measurements
are statistically meaningful.
SAMPLE SIZE RECOMMENDATIONS FOR LOCAL GOVERNMENT
RETAIL FOOD SAFETY BASELINES
A Working Paper by W. E. Bing Garthright, Ph.D.,
HHS/FDA/CFSAN/OSAS/Division of Mathematics
February 7, 2002
Many states, counties, and cities are beginning to plan
their own retail food safety baseline measurements, based on
the FDA
project (Report of the FDA Retail Food
Program Database of Foodborne Illness Risk Factors, 8/10/2000). These
activities will be called “local baselines” for brevity. This
working paper will recommend sample sizes for random selection
of facilities to inspect, based on
analyses done by Bing Garthright and Jerome Schneidman of FDA/CFSAN’s
Division of Mathematics.
For a local baseline for some facility types, the inventory
of establishments is small enough that sample sizes can be smaller
than those used in the FDA’s national
assessment. Local requirements should also be satisfied by
a slightly less stringent requirement on confidence limits, which
will also allow some reduction to sample
sizes. These two facts will lead to the recommendations below.
John Marcello, an FDA regional retail food specialist,
has proposed a theoretical profile of a local government inventory
as follows:
| Type | Inventory |
| Hospitals | 6 |
| Nursing homes | 36 |
| Elem. schools | 48 |
| Fast food | 420 |
| Full service | 360 |
| Retail grocery stores | 180 |
I will recommend sample sizes for inventories of these
sizes and bigger.
The purposes of a local baseline would include these two:
- compare the locality to FDA’s national baseline
profile by risk factors;
- identify the subset of the 42 items in the baseline
that are most in need of improvement.
Of course states and local governments will want to see
whether compliance with risk-based factors is improving or not
over periods
of several years. The local
situation is different from FDA’s, however, because local authorities
have frequent contact with most of their inventories every year,
and so they have many more points for
comparison than just a baseline measurement. The locality will
observe its improvements and declines in more detail than a periodic
baseline, and will know more rapidly how its
efforts are succeeding.
There are many different goals that we could pursue that
would lead to different sample size requirements. Pursuing
the most difficult goal will automatically provide big enough
samples
to satisfy the rest. The most difficult goal is to identify
those specific baseline items, out of FDA’s 42 items, that
are most in need of priority attention. Of course everyone
wants every risk-related item to be as in compliance as possible,
but with limited resources it is good to tackle the factors that
are the least in compliance. All of FDA’s 42 items are
directly connected to risk, so FDA highlighted the least in compliance
items in its August 10, 2000
report. The 9 tables numbered 3 through 11 gave items deserving
priority attention for each of the 9 facility types in our baseline. We
expect some degree of similarity in most local baseline results,
so we will
look at those tables when planning our statistical
criteria.
There is no single correct basis for setting a sampling
plan for an operation like baseline measurement. We determined
by consulting FDA’s retail field
specialists that some rough guidelines could be derived. In
particular, we view an item that is in compliance more than 80
percent of the time to need improvement, but not
as a priority; an item in compliance less than 60 percent of the
time clearly deserves priority attention.
There is a great body of valuable survey theory that deals
with difficulties and complexities in collecting data and getting
accurate
and precise conclusions. This
theory is necessary when the conclusion will be to describe causality
in social relations (e.g., children whose parents read more than
3 books per year earn $10,000 more than the
average citizen). Most of this theory is unnecessary for
a baseline measurement, which simply gives a measurement of conditions
at one particular time. We will
define as our completely accurate measurement the data that would
result if we conducted baseline measurements at the entire inventory
of establishments. We will define the
results of sampling a subset of the inventory by how accurately
it reflects the data we would get by including the complete inventory. This
bypasses many complexities in sampling theory.
If we want to give priority attention to items whose compliance
(measured by the whole inventory) is less than 60 percent, then
we have to decide what a successful measurement
will be. Many approaches are reasonable, but FDA used the
following goal when determining its sample sizes relative to
prioritizing items:
If a particular baseline item has a compliance rate of
no more than 60 percent, we want to have a high probability that
our
data will show a compliance rate of no more than 70
percent.
This means that we can treat items that score in compliance
at less than 60 percent as clear priorities and treat those up
to
70 percent as also of special concern. I will
call this objective the 60-70 objective, for
convenience.
FDA’s J. Schneidman has used statistical theory (the
hypergeometric distribution) to see how well various sample sizes
meet the 60-70
objective.
I suggest a goal of 95% confidence that a particular item
with 60% total compliance would not be found to have more than
70% compliance
in the randomly selected sample.
(This is less demanding than the 98.5% confidence of the 60-70
objective required for the national baseline, but we think it is
justified by two facts: the consequences of an error
are confined to one locality, and the locality would soon discover
any such errors by their follow-up activities.) The table
below shows how many compliance observations must result from the
sampling in order
to achieve this.
Note that in this working paper, the term “observations” refers
to findings of “in compliance” or “out
of compliance,” but does not
include “not applicable” or “not observed.” The
table below cannot be used directly, since we can’t predict
the number of observations that would be achieved if the entire
inventory
were attempted.
FOR ONE OF THE 42 ITEMS IN THE BASELINE:
| If this no. of observations would result if the entire inventory is attempted: | Then this no. of observations is needed from the partial sample: |
| 10 | 9 |
| 20 | 16 |
| 30 | 22 |
| 40 | 28 |
| 50 | 29 |
| 60 | 32 |
| 70 | 38 |
| 80 | 38 |
| 90 | 39 |
| 100 | 42 |
| 110 | 45 |
| 120 | 45 |
| 130 | 48 |
| 150 | 48 |
| 175 | 49 |
| 200 | 52 |
| 225 | 55 |
| 250 | 58 |
| 300 | 58 |
| 350 | 58 |
| 400 | 58 |
| 450 | 58 |
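For readers who want to experiment with the 60-70 objective, the hypergeometric calculation can be sketched as below. This is one reading of the objective, not the Division of Mathematics’ actual procedure: it assumes exactly 60 percent of the full set of observations are in compliance, counts a sample as acceptable when its compliance rate is no more than 70 percent, and returns the first sample size that reaches 95 percent probability. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def prob_sample_at_most(total_obs, sample_obs, true_pct=60, limit_pct=70):
    """P(sample compliance <= limit_pct) under the hypergeometric model."""
    compliant = total_obs * true_pct // 100      # in-compliance observations overall
    noncompliant = total_obs - compliant
    max_ok = sample_obs * limit_pct // 100       # largest count still <= limit_pct
    favorable = sum(
        comb(compliant, k) * comb(noncompliant, sample_obs - k)
        for k in range(min(max_ok, compliant) + 1)
    )
    return favorable / comb(total_obs, sample_obs)


def smallest_sample(total_obs, confidence=0.95):
    """First sample size whose probability meets the confidence level."""
    for n in range(1, total_obs + 1):
        if prob_sample_at_most(total_obs, n) >= confidence:
            return n
    return total_obs


print(smallest_sample(10), smallest_sample(20))
```

Under these assumptions the sketch returns 9 and 16 for 10 and 20 total observations, which agrees with the first two table entries; it is not claimed to reproduce every entry, since the working paper does not state how ties and rounding were handled.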
How can we adapt the above relationship for observations
to the relationship for establishments, using the results of
the FDA baseline
study? As was noted in Tables
3-to-11 of the FDA baseline study, many items are both applicable
and observable at only a fraction of the inspections. This means
that, for some particular item in the baseline,
the numbers of establishments in the inventory really represent
smaller numbers of observations, and so we must take that into
account when setting our desired sample
sizes.
Tables 3-to-11 record, for the 9 individual facility
types, a total of 55 mentions of baseline items that deserve
the most
priority
for improvement. I would expect these
tendencies to be reflected to a great extent in most localities,
and so we will use them as a guide in judging just how much to oversample in
order to get adequate numbers of observations for making important
decisions.
When an item is much less than 60 percent in compliance,
say less than 50 percent, it takes only a very small sample
to give a
result no more than 70 percent in compliance with
95 percent confidence. We want to take into account the
sampling that will do a good job for items that score very
near to 60 percent.
There were ten mentions of items that appeared to be
between 58-62% in compliance, and they were observed at between
72 and
100 percent
of the inspections, with an average of 87
percent of inspections. We want to be able to capture enough
observations for all such items, and we know that there will
be some sampling error involved that requires that
we assume an even lower level of observation to have high
assurance of coverage. Therefore,
we will allow for the possibility that only 2/3 (67%) of the
inspections yield observations.
For example, suppose a locality has 90 elementary schools. For
an item of interest, we would suppose that there would exist
a potential for 60 observations (2/3 of
90). For this no. (60) of potential observations, our table
above would require a sample of 32 observations. Using the 2/3
rule, we would sample 48 establishments (since
2/3 of 48 is 32).
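The arithmetic of this example can be sketched as follows, using the observation table given earlier in the working paper. The handling of values that fall between table entries (linear interpolation, rounded up), of inventories below the table’s range, and the cap at the inventory size are assumptions made for illustration, so results for other inventories need not match the smoothed table that follows.

```python
from bisect import bisect_left
from math import ceil

# Observation table from the working paper: total observations -> needed.
TOTAL_OBS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130,
             150, 175, 200, 225, 250, 300, 350, 400, 450]
NEEDED_OBS = [9, 16, 22, 28, 29, 32, 38, 38, 39, 42, 45, 45, 48,
              48, 49, 52, 55, 58, 58, 58, 58, 58]


def observations_needed(potential):
    """Look up (or interpolate, an assumption) the observations needed."""
    if potential <= TOTAL_OBS[0]:
        return min(potential, NEEDED_OBS[0])
    if potential >= TOTAL_OBS[-1]:
        return NEEDED_OBS[-1]
    i = bisect_left(TOTAL_OBS, potential)  # TOTAL_OBS[i-1] < potential <= TOTAL_OBS[i]
    lo_k, hi_k = TOTAL_OBS[i - 1], TOTAL_OBS[i]
    lo_n, hi_n = NEEDED_OBS[i - 1], NEEDED_OBS[i]
    return ceil(lo_n + (hi_n - lo_n) * (potential - lo_k) / (hi_k - lo_k))


def establishments_to_sample(inventory):
    """Apply the 2/3 rule in both directions."""
    potential = inventory * 2 // 3          # potential observations (rounded down)
    needed = observations_needed(potential)
    sample = (needed * 3 + 1) // 2          # needed / (2/3), rounded up
    return min(sample, inventory)


print(establishments_to_sample(90))
```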
But the example above is clearly over-simplified, since
our sampling of 48 of the 90 schools could conceivably encounter
as many as
48 or as few as 18 observations. This
involves the second layer of sampling errors, the sampling that
coincides with observable items and with non-observable ones. We
will accept this oversimplification, however, for several reasons. First,
the probabilities suggest that mistakes will be very few. Second,
we have picked a hardest case to represent the test that our
sampling must satisfy. The FDA baseline items with 58-62%
compliance averaged 87 percent observations, much higher than
our conservative assumption of 67 percent, and so we have a
cushion of over-sampling for these items. Third, 45 out
of 55 of the FDA items of concern were noticeably above or below
60 percent in compliance, and therefore we will not
need such large samples in order to characterize them correctly. Taken
together, with a little smoothing at the upper end, these three
reasons cause us to support the
following table of samplings based on inventory sizes:
ESTABLISHMENT INVENTORY SAMPLE SIZES
| Inventory size | Sample size |
| < 9 | all |
| 9 | 8 |
| 10-12 | 9 |
| 13 | 12 |
| 14-19 | 14 |
| 20-24 | 18 |
| 25-28 | 23 |
| 29-31 | 24 |
| 32-36 | 27 |
| 37-43 | 29 |
| 44-51 | 33 |
| 52-58 | 38 |
| 59-73 | 42 |
| 74-81 | 44 |
| 82-96 | 48 |
| 97-103 | 53 |
| 104-133 | 57 |
| 134-148 | 59 |
| 149-163 | 63 |
| 164-186 | 68 |
| 187-261 | 72 |
| 262-291 | 74 |
| 292-328 | 78 |
| 329-373 | 83 |
| 374+ | 87 |
This will give the following sample sizes for the theoretical example posed by John
Marcello:
| Type | Inventory | Sample size |
| Hospitals | 6 | 6 |
| Nursing homes | 36 | 27 |
| Elem. schools | 48 | 33 |
| Fast food | 420 | 87 |
| Full service | 360 | 83 |
| Retail food stores | 180 | 68 |
| Totals | 1050 | 304 |
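A jurisdiction could encode the inventory table as a simple lookup. The sketch below is illustrative: the function name is hypothetical, and it assumes that inventories smaller than 9 are sampled in full, which is consistent with the hospital row of the example (inventory 6, sample 6).

```python
# (lowest inventory, highest inventory, sample size) from the table above.
SAMPLE_SIZE_TABLE = [
    (9, 9, 8), (10, 12, 9), (13, 13, 12), (14, 19, 14), (20, 24, 18),
    (25, 28, 23), (29, 31, 24), (32, 36, 27), (37, 43, 29), (44, 51, 33),
    (52, 58, 38), (59, 73, 42), (74, 81, 44), (82, 96, 48), (97, 103, 53),
    (104, 133, 57), (134, 148, 59), (149, 163, 63), (164, 186, 68),
    (187, 261, 72), (262, 291, 74), (292, 328, 78), (329, 373, 83),
]


def sample_size(inventory):
    """Recommended number of establishments to sample for one facility type."""
    if inventory < 0:
        raise ValueError("inventory cannot be negative")
    if inventory < 9:
        return inventory      # assumption: small inventories are sampled in full
    if inventory >= 374:
        return 87
    for low, high, size in SAMPLE_SIZE_TABLE:
        if low <= inventory <= high:
            return size
    raise ValueError(f"no table row covers inventory {inventory}")


# The theoretical inventory proposed by John Marcello.
example = {
    "Hospitals": 6, "Nursing homes": 36, "Elem. schools": 48,
    "Fast food": 420, "Full service": 360, "Retail food stores": 180,
}
sizes = {name: sample_size(count) for name, count in example.items()}
print(sizes, sum(sizes.values()))
```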
This working paper supersedes the sampling scheme that I spelled out in
my prepared remarks, delivered in my absence by John Marcello, for the
Pacific Northwest Regional Meeting in August of 2001. (The regional
meeting remarks would have recommended 390 inspections for the example
above.) This paper represents CFSAN's best advice for sample sizes of
inspections for local baseline studies.
Postscript: When the tables are used for retail food stores, they
really represent the numbers of each of the four retail food store
departments to be measured. It will be necessary to visit more than
this number of stores in order to achieve coverage of the less
frequently encountered departments. Guidance for this will be developed
by FDA's regional specialists and by the Clearinghouse Workgroup for
Program Standards.
- Baseline Surveys – Use of lower confidence levels than recommended
in the FDA Data Collection Manual
Question/Problem
I am the director of a jurisdiction that is participating in the
Standards and have completed my self-assessment. Although I would
like to conduct a risk factor baseline survey, I have very limited
resources. The FDA Data Collection Manual recommends sample sizes that
will result in a 95 percent confidence level. It seems that if I am
willing to accept a lower confidence level, for example 80 or 90
percent, I can collect fewer samples. This will allow me to conduct my
survey using fewer person-hours.
Rationale: I realize that I would not be able to compare
my results with the FDA National data. I also realize that the
results would not
be as reliable using a lower confidence level; however, I think
the information I gather will be sufficient to help me tweak my
program to gain some
improvements. I’m not sure I need the scientific justification
of a 95 percent confidence level. If I’m willing to accept the
lower confidence levels, are there other reasons why I shouldn’t
reduce sampling to stretch my resources?
Clearinghouse Work Group Response
There are a number of issues to be considered here. In a nutshell,
the statistics show that although you may be able to reduce sample
size somewhat, your ability to measure trends over time is greatly
compromised. You will lose precision to such a degree that you may
not be able to detect increases or decreases in compliance for risk
factors in future surveys. Indeed, upward trends in compliance may even
be mistaken for downward trends. The complete mathematical explanation
for this phenomenon, which argues against using confidence levels
lower than 95 percent (95%), as outlined in the “FDA Data Collection
Manual,” is included as an answer addendum at the end of this
Clearinghouse response.
The surveys are intended to track over time the occurrence of
risk factors known to cause or contribute to foodborne illness.
The idea is that the
information uncovered will allow you to focus your efforts in selected
areas where compliance is low in order to achieve significant improvement.
Future surveys would then reveal whether your efforts and strategies
were successful in changing the occurrence of the selected risk
factors. If your survey is conducted in such a way that you are
unable to identify
changing trends in risk factor occurrence, then the purpose of
the survey is defeated. You may conserve resources used to conduct
the surveys,
but if the information gathered does not serve the intended purpose,
then the resources will have been wasted.
An initial baseline survey and future risk factor surveys can
be a tremendously powerful tool to demonstrate the usefulness
of your program to the Board
of Health, City Council or whatever body has influence over your
budget and resources. For the first time, there exists an effectiveness
measure
for a public health program. It has always been difficult to justify
preventive programs, especially during austere economic times.
The surveys allow you to identify areas that represent potential
problems affecting
consumer health and the well-being of the community at large. You
can then develop logical strategies to reduce the risk in those
specific problem areas and to demonstrate the positive impact
of your program.
Conducted properly, risk factor surveys can provide tangible justification
for your food program in a way never before possible. This being
the
case, your surveys should be conducted in such a way as to maintain
the highest integrity and maximum usefulness of the survey results.
For these
reasons the Clearinghouse cannot recommend the use of lower confidence
levels.
Answer addendum
DISCUSSION OF IMPACT OF CONFIDENCE LEVELS ON DATA PRECISION, Prepared
by Jerome Schneidman, FDA Division of Mathematics
Recall that our original sample sizes for state and local baselines,
as presented in the Data Collection Manual, were calculated to
give 95% confidence that a data item that was 60% or less in compliance
would be found to be no more than 70% in compliance in the sample (pages
48-49). We were asked to explore the effect on sample size if we
reduced the confidence goal from 95% to 80% and 90%, respectively.
80% Confidence
Provided the number of establishments in a facility type is no
more than 15,951, this yields a sample size of no more than 29
(i.e., 29 or fewer). Assuming nonresponse (not observed or not
applicable) similar to what FDA experienced, this could easily
lead to only
about
20 observations for a data item. Under such a scenario, there
would be only 21 possibilities: 0 IN, 1 IN, 2 IN,…,19 IN, 20
IN. Similarly, this yields only 21 possibilities for % IN: 0%,
5%, 10%,…, 95%,
100%. Such limited possibilities for the results give too little
information to be of much use. With such a small sample, there
will be almost no
ability to detect small changes from repeated baselines. In fact,
there would be a good chance that a small increase in compliance
would erroneously
show up as a decrease. We cannot recommend such a small sample
size and would urge rejection of using only 80% confidence.
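The coarseness of a 20-observation result, and the risk that a real improvement shows up as a decline, can be illustrated with a short exact calculation. This is a sketch under stated assumptions, not the Division of Mathematics' own computation: observations are treated as independent binomial trials, the true compliance rates of 60% and 65% are illustrative values, and the function names are made up for this example.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n IN observations out of n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def possible_percentages(n):
    """Every % IN value that n observations can produce."""
    return [100 * k / n for k in range(n + 1)]


def prob_apparent_decrease(n, p_first, p_second):
    """Chance the second survey shows strictly fewer IN than the first."""
    first = binomial_pmf(n, p_first)
    second = binomial_pmf(n, p_second)
    total = 0.0
    for k1 in range(n + 1):
        for k2 in range(k1):  # second count lower than first count
            total += first[k1] * second[k2]
    return total


print(possible_percentages(20))  # 21 values in steps of 5 percentage points
# True compliance rises from 60% to 65%; how often does the sample show a drop?
print(prob_apparent_decrease(20, 0.60, 0.65))
print(prob_apparent_decrease(87, 0.60, 0.65))  # larger sample for comparison
```

Under these assumptions the chance of an apparent decrease is noticeably higher with 20 observations than with a larger number, which is the point made in the paragraph above.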
90% Confidence
The sample size results are summarized as follows.
| Population Size | Sample Size |
| 763 or less | 57 or less |
| 764 - 1,311 | 59 |
| 1,312 - 3,591 | 63 |
| 3,592 or above | 68 |
We do not recommend using this either, because these smaller sample
sizes reduce precision and make it more difficult to show or detect
small changes from repeated baselines. This difficulty cannot be
quantified until the particular data have been collected.
We can illustrate using example scenarios.
Example: With these sample sizes, we have 90% confidence that a data
item that was 60% or less in compliance would be found to be no
more than 70% in compliance in the sample. It is also expected to be
more difficult to show a small change from, say, 60% IN to 65% IN or
70% IN to 75% IN than with our original sample sizes. Furthermore,
the probability of detecting such changes from repeated baselines is
expected to be less than 0.90.
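The criterion in this example (a data item that is 60% in compliance is found to be no more than 70% in compliance in the sample) can be explored with a simple binomial model. This is a sketch only: it assumes independent observations, ignores the finite population size and the not-observed and not-applicable adjustments that the actual sample sizes account for, and so it will not reproduce the table values. The function name and the observation counts are illustrative assumptions.

```python
from math import comb


def prob_at_most_70_percent(n_obs, true_rate=0.60):
    """P(sample % IN <= 70%) for n_obs independent observations."""
    k_max = (70 * n_obs) // 100  # largest IN count with % IN <= 70
    return sum(
        comb(n_obs, k) * true_rate**k * (1 - true_rate) ** (n_obs - k)
        for k in range(k_max + 1)
    )


# Confidence achieved for several illustrative numbers of observations.
for n_obs in (20, 40, 60, 80):
    print(n_obs, round(prob_at_most_70_percent(n_obs), 3))
```

Fewer observations give a lower probability of meeting the criterion, which is the loss of precision described above.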
Under these sample sizes, jurisdictions will find it more difficult
to detect changes from repeated baselines. With full understanding of
these caveats, these smaller sample sizes could be used; however, we
still do not recommend them. If used at all, this 90% confidence level
should probably be restricted to very small states and jurisdictions.
It is probably not appropriate for jurisdictions with large populations
since public health is at issue.
- Survey Reports – The use of the number 32 as a reporting
cut off for out of compliance elements.
Question/Problem
In the Report of the FDA Retail Food Program Database of Foodborne
Illness Risk Factors, it appears that a count of at least 32
out-of-compliance observations was used as the cut off for reporting
and prioritizing the results. I cannot find an explanation of why 32
was used as the cut off. Is there a statistical significance to this
number? And does this figure apply to all jurisdictions conducting risk
factor studies as well?
Clearinghouse Work Group Response
The rationale for choosing 32 Out-of-Compliance observations as
the cut off point for determining which individual data items
deserved priority attention is discussed on page 22 of the mentioned
report. Basically, FDA analysts sorted the data items by number of
OUT-of-Compliance observations, ranking them in order from the one
with the highest number to the one with the lowest number of
OUT-of-Compliance observations. The analysts then looked for a point
in the list of ranked data items after which the numbers of OUT
observations began to decrease more rapidly or were farther apart.
This ‘natural break’ was the cut off
value used for each facility type.
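One simple way to look for such a natural break is to rank the counts and find the largest gap between neighbors. The sketch below is only one possible formalization, not the procedure the FDA analysts used (the report describes a judgment made by inspecting the ranked list). The function name and the counts are invented for illustration.

```python
def natural_break_cutoff(out_counts):
    """Return the OUT count just above the largest gap in the ranked list."""
    ranked = sorted(out_counts.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two data items to find a break")
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    break_index = gaps.index(max(gaps))  # first occurrence of the largest gap
    return ranked[break_index]


# Hypothetical OUT-of-Compliance counts for one facility type.
out_counts = {
    "Item A": 58, "Item B": 52, "Item C": 47, "Item D": 41,
    "Item E": 30, "Item F": 17, "Item G": 14, "Item H": 8,
}
cutoff = natural_break_cutoff(out_counts)
priority = [item for item, count in out_counts.items() if count >= cutoff]
print(cutoff)    # 30: the largest drop is from 30 to 17
print(priority)  # Items A through E
```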
This approach appears to have worked well for the FDA Baseline.
However, this is not the only approach that can be used and other
approaches may
be appropriate for individual jurisdictions conducting a risk factor
survey. For example, with the possibility of different sample size
requirements and different observation rates for different facility
types, state and
local jurisdictions may decide to choose different cutoff points
for highlighting data items in need of priority attention for
each of the
facility types.
The recipe used in the FDA Baseline Report for identifying data
items needing priority attention does not use the OUT-OF-COMPLIANCE
percentage (rate). Instead, the approach considers only the number of
OUT-OF-COMPLIANCE observations as the criterion. This means that
individual data items with high OUT-OF-COMPLIANCE rates but few
observations will not be highlighted using the FDA approach.
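The difference between ranking by count and ranking by rate can be seen in a small sketch. The tallies and function names below are hypothetical and are used only to show how an item with a high OUT rate but few observations drops down the list when counts are the criterion.

```python
# Hypothetical tallies: data item -> (IN observations, OUT observations).
tallies = {
    "Item A": (60, 40),
    "Item B": (70, 30),
    "Item C": (2, 8),    # high OUT rate, few observations
    "Item D": (45, 5),
}


def rank_by_out_count(tallies):
    """Rank items by number of OUT observations (the FDA Baseline approach)."""
    return sorted(tallies, key=lambda item: tallies[item][1], reverse=True)


def rank_by_out_rate(tallies):
    """Rank items by OUT percentage, for comparison only."""
    def out_rate(item):
        n_in, n_out = tallies[item]
        total = n_in + n_out
        return n_out / total if total else 0.0
    return sorted(tallies, key=out_rate, reverse=True)


print(rank_by_out_count(tallies))  # Item C is third, behind A and B
print(rank_by_out_rate(tallies))   # Item C is first
```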
The goal of conducting repeated baseline surveys over time is
to measure trends in the occurrence of foodborne illness risk
factors. Note that progress is measured in terms of the amount of
increase in the overall percent of IN COMPLIANCE observations for all
data items combined. This is done separately for each facility type.
(This is the ratio of total “IN” observations for all data items
combined to the total of “IN” observations plus total “OUT”
observations for all data items combined.)
Overall Baseline IN Compliance percentage for a Facility Type =
[(Total number of IN Compliance observations for all data items) /
(Total number of IN Compliance observations + total number of OUT of
Compliance observations for all data items)] X 100%
The reality of this approach is that those individual data items
that are seldom observed or are frequently noted as not applicable
will have
little impact on this score. What affects the overall baseline
measurement are the data items that are frequently observed.
Practically speaking,
this means focusing on the items with the most OUT OF COMPLIANCE
observations.
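A short calculation shows why frequently observed items dominate the overall score. The sketch below applies the ratio of total IN observations to total IN plus OUT observations; the tallies and the function name are hypothetical and chosen only to make the effect visible.

```python
def overall_in_percentage(tallies):
    """Overall % IN for one facility type: total IN / (total IN + total OUT)."""
    total_in = sum(n_in for n_in, _ in tallies.values())
    total_out = sum(n_out for _, n_out in tallies.values())
    if total_in + total_out == 0:
        raise ValueError("no IN or OUT observations recorded")
    return 100.0 * total_in / (total_in + total_out)


# Hypothetical tallies: data item -> (IN observations, OUT observations).
baseline = {"Frequent item": (60, 40), "Seldom item": (2, 8)}

# Scenario 1: the seldom-observed item goes from 20% IN to 100% IN.
seldom_fixed = {"Frequent item": (60, 40), "Seldom item": (10, 0)}

# Scenario 2: the frequently observed item goes from 60% IN to 80% IN.
frequent_improved = {"Frequent item": (80, 20), "Seldom item": (2, 8)}

for name, data in [("baseline", baseline),
                   ("seldom item fixed", seldom_fixed),
                   ("frequent item improved", frequent_improved)]:
    print(name, round(overall_in_percentage(data), 1))
```

In this made-up case, a modest improvement in the frequently observed item raises the overall percentage more than completely correcting the seldom-observed one.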
However, there is no reason why states and local jurisdictions
cannot also consider items that have high OUT OF COMPLIANCE percentages
but do not have a large number of observations. If you decide that
some of these items represent important problems and you have
sufficient resources, you may wish to work on improving these items as
well as the problematic data items with many observations.
Additionally, if you determine that some items may be improved with
very little effort, it may be wise to address these, regardless of how
often they occur. Be aware, however, that you cannot expect efforts
devoted to data items that have low observation rates to have a
substantial effect on future baseline measurement trends.
In conclusion, as a state or local jurisdiction you may list as many
data items as you like in your reporting and analysis, selecting
them in order of the number of OUT-OF-COMPLIANCE observations, and
prioritizing the items to be worked on in the same order, based on your
resource constraints. This should be done separately for each facility
type. The number of items that can be listed is up to your discretion
and preferences. You are not required to list the same number for each
facility type. The number you choose to work on and the amount of
effort you wish to expend on each is up to you.
|