Draft Report of Quantitative Risk and Benefit Assessment of Consumption of Commercial Fish, Focusing on Fetal Neurodevelopmental Effects (Measured by Verbal Development in Children) and on Coronary Heart Disease and Stroke in the General Population: Appendix C, Methodology And Results From Carrington and Bolger (2000)
January 15, 2009
This is a slightly modified excerpt from Carrington and Bolger (2000) that describes the methodology used to model developmental milestone data from Iraq and Seychelles. The excerpt provides modeling details that are referenced in Appendix A but not specifically included.
The concern for exposure to mercury is primarily a result of two poisoning epidemics that occurred in Japan and Iraq. The latter epidemic, that occurred after exposure to contaminated grain, was the subject of an extensive epidemiological investigation that included an effort to relate the magnitude of exposure to methylmercury to health impact (Marsh et al, 1987). Because these were not prospective studies, the reports concerned with the Iraqi episode do not reflect the same degree of experimental control as subsequent studies. For risk assessment purposes, perhaps the major shortcoming of the Iraqi study is the presence of relatively few individuals at low doses. For instance, because there is little data on the extent of normal variation for the observed measures of development, it is difficult to discern whether a slightly higher frequency of "abnormal" responses (e.g. delayed walking) is attributable to mercury effects or normal variation. However, in spite of numerous shortcomings, the Iraqi study has a major advantage over more recent reports - there were high-dose health effects that were unequivocally attributable to methylmercury exposure.
More recent prospective studies have searched for health effects of methylmercury in populations consuming whale or fish with much lower levels of exposure than those encountered in Iraq (Kjellstrom, et al, 1986; Marsh et al, 1995a, Grandjean et al, 1997). For the present analysis, the results of the Iraqi study are combined with a more recent study in the Seychelles Islands (Marsh et al, 1995a), where the exposure to methylmercury is from the consumption of marine fish. Data from the Seychelles study were used because of the presence of some of the same measurements as those collected from Iraq and because the individual subject data was made available to us. The Seychelles study has many more individual subjects and the range of mercury hair levels were much lower and more representative of levels typically found in consumers of fish, but which are still much higher than those typical of infrequent consumers of fish.
To combine results from two or more studies in an analysis, it is necessary that there be a common measure. For the present analysis, two endpoints that were collected in both the Iraqi and Seychelles studies were used as the common measure: 1) Age of Talking (AOT) - the age at which the infants started talking, and 2) Age of Walking (AOW) -- the age at which the infants became toddlers. Not only were these measures available from both studies, they have the advantage of being simple measures of neurological development.
Construction of Cumulative Frequency Tables
The data were used to construct separate (one for each study) two dimensional cumulative frequency tables for each study which tabulated frequency for groups spanning the range of hair levels and observed response. These were constructed by grouping the subjects from each study by dose, and calculating the frequency at which each of series of response levels were exceeded. Tables were then constructed for AOT and AOW. Plots of cumulative frequency tables for both the Iraqi and Seychelles studies for AOT and AOW are shown in Figure AC-2, respectively.
The analysis presented here is an exercise in comparative modeling where a large number of alternative mathematical models are examined with respect to their ability to describe historical data. Analyses of epidemiological data often undertake evaluation that are designed to identify which of a number of different parameters (e.g. confounding variables or modifying factors) are to be included in a final model. The present analysis differs in two important respects. First, it evaluates models that are different in form rather than just complexity. Second, rather than concluding the analysis with a final or best model, a probability tree that employs probabilities for a set of alternative models to characterize the uncertainty associated with an estimation.
To conduct a comparative modeling exercise, the first step is to assemble a list of candidate models. Dose-response models often have multiple sources of theoretical uncertainty. These include the dose-response relationship itself, the influence of factors other than dose on the outcome, and the extent of the variability among individual subjects. In addition, when multiple studies are being used to evaluate the models, it may be desirable to accommodate differences in the studies within the model. As a result, models were formulated from four submodels, each of which had several theoretical alternatives. Each of the four submodels represent a potential source of model uncertainty: 1) A dose-response function (relating hair level to AOT or AOW); 2) a statistical distribution describing population variability; 3) dose-independent factors; and 4) study dependent factors. With several variations of the mathematical form (see Table ) and relative position of each of the submodels (AC-1), a series of 1092 candidate models were assembled. All the models were relatively simple and contained 3 to 7 adjustable parameters (e.g. slope, standard deviation, dose-independent AOT) which could be altered to improve the fit.
As an example of what the dose response equations looked like when assembled, a model that fit the data well was constructed from a linear dose response function, a background response parameter, a background study parameter, and a Weibull distribution to account for population variability at position 4. To predict cumulative frequency as a function of dose and response, this yielded the following function:
= 1 – exp((-Response / (Dose*P1 + P3 + P4) / ((log 2) ^ (1/P2))) ^ P2)
To predict response as a function of dose and frequency, the following function was used:
= (Dose*P1 + P3 + P4) / ((log 2) ^ (1/P2)) (log(1/(1- Frequency)]) ^ (1/ P2)
P1 is the dose-response slope
P2 is the Weibull alpha parameter
P3 is the background response (i.e. age in months)
P4 is a study-dependent background term (also age in months)
The analysis was conducted in Microsoft Excel using procedures written in Visual Basic for Applications, which are available on request.
Fitness was judged by a composite least residual squares measure that gave equal weight to residuals for predicted population percentiles (frequency as a function of dose and response) and for predicted magnitude of effect (response as a function of frequency). The fit for each dose-group was weighted by the original number of observations - which gave the values from the Seychelles considerably more weight in the low dose regions.
The parameters were adjusted to fit the data (minimize the measure of fit) with Excel Solver. Simple equations were used to assign initial estimates for the parameters - some of these used information from the study such as the range of doses and responses. If an obviously poor fit was obtained, different initial estimates were used in order to find a better fit - usually by adopting estimates from simpler models with the same parameters that produced a better fit.
Model Weighting and Model Uncertainty
The models were judged with an algorithm that rewards a model for goodness-of-fit and penalizes for the use of extra parameters:
Model Weight = (((1 + n / Pn) ^ O) * ((1 - gof) ^ H)
n = number of observations
Pn = Number of Model Parameters
gof = Goodness-of-Fit
O = The Parameter Penalty, an arbitrary constant that determines the relative importance of model simplicity
H = The Association factor, an arbitrary constant that determines the relative importance of goodness-of-fit.
In the present analysis, values of 0.3 and 100 were used for O and H, respectively. These values were chosen because they appeared to generate a reasonable balance between fit and model simplicity (see Carrington, 1996 for further discussion of this approach). The uncertainty associated with the predictions made was represented by weighting the 200 best models. The algorithm used for model weighting was also used to select the best models.
Two of the dose-response models employed have a biochemical heritage - the Mass Action model is an equation that is able to describe reversible (ionic) competitive ligand-receptor binding interactions. The first order equation is a function that describes irreversible (covalent) ligand-receptor interactions. Evidence that methylmercury acts by either of these mechanisms could be construed as an increase in the weight (and probability) accorded theories that employ those functions. However, it should be noted that even if a particular biochemical mechanism of action is conclusively established, the in vivo reaction will often be vastly more complicated that a biochemical reaction taking place in vitro. As a result, a model reflecting the wrong mechanism, or no mechanism, may still describe the data and still make a better prediction. Although it would be possible to include theoretical support for a theory in the calculation of each models evidential weight, the biochemical mechanism for methymercury is presently unknown.
Age of Talking. For the AOT endpoint, the best model was comprised of a linear dose-response relationship, a Weibull population distribution, and a background response parameter, and a study-dependent dose parameter (see Figure AC-1). The exponential, hockey stick, and mass action dose-response relations were also heavily represented among the top-rated models (see Table AC-2). The first-order and logistic models tended to not fit as well. The Weibull distribution was clearly the best fitting population distribution - regardless of the dose-response function used. The lognormal distribution consistently provided a better fit than the other two distributions. The poorer fit with either the normal or logistic distribution functions is indicative of a skewed distribution. All the top rated models included parameters for both dose-independent and study-dependent effects, reflecting the notions that a) children do not speak at age 0, and that there are differences in the Iraqi and Seychelles studies that are not attributable to methylmercury.
Age of Walking. For the AOW endpoint, the best model was comprised of a linear dose-response relationship, a Weibull population distribution, a background dose and response parameters, and a study-dependent dose parameter (see AC-3). All the dose-response functions were represented among the top-rated models. The Weibull and lognormal distributions were again the clear favorites for modeling population variability. All the top rated models included two parameters for dose-independent effects. All of the best models also included a study dependent parameter, again reflecting differences in the Iraqi and Seychelles populations.
Function Output. The output of the best model for each of the three endpoints is plotted in Figures AC-2 and AC-3 for both the Iraqi and Seychelles studies. Probability trees comprised of the top 200 models yield uncertainty distributions when used as a predictive tools. As an example, sample output from the AOT function that weights the frequency of use of the best 200 models is given in AC-3. In a two-dimensional Monte-Carlo simulation used to simulate both variability and uncertainty, this function will impact the distribution in both dimensions.
Because the models contain study-dependent variables, the study for which a prediction is required must be specified. If the resulting models are to be used in a risk assessment, this requires a decision about which study population is more representative of the population of concern to the assessment. This decision would revolve around speculation about the source of the differences between the studies (e.g. cultural or genetic), and would be a source of both variability and uncertainty. For instance, the population of concern may be variable with regard to the percentage of the population for which each study is more appropriate, while the extent of that frequency for each may be uncertain.
General structure of the models used to integrate the results from the Iraqi and Seychelles studies. The central component is the dose-response function that relates dose to the magnitude of an individual outcome (i.e. AOT or AOW). The background and study functions add parameters to account for dose-independent influences that are study-independent or study-dependent. The population submodel converts the individual model into a population model by introducing a statistical distribution at one of four positions in the individual model.
Figure AC-2: Age of Talking, Data and a Model
The charts at the top reflect the cumulative incidence tables constructed from the raw data (Age of Talking) from the Iraqi (left) and Seychelles (right). The charts at the bottom reflect a common model fit to both data sets. The Z-axis reflects the percent of the population in each dose group (Y-axis) with a AOT above the X-axis value. The X-axis values are chosen to represent the range of value encountered in the studies, and therefore do not necessarily generate incidences of 0 or 1 at all dose groups.
Figure AC-3: Age of Walking, Data and a Model
The charts at the top reflect the cumulative incidence tables constructed from the raw data (Age of Walking) from the Iraqi (left) and Seychelles (right). The charts at the bottom reflect a common model fit to both data sets. The Z-axis reflects the percent of the population in each dose group (Y-axis) with a AOW above the X-axis value. The X-axis values are chosen to represent the range of value encountered in the studies, and therefore do not necessarily generate incidences of 0 or 1 at all dose groups.
|Dose vs Individual Response||Linear, Hockey Stick, Mass Action, First Order, Exponential, Logistic|
|Population Variability||Normal, Lognormal, Weibull, Logistic|
|Dose Independent Factors||None, Background Dose, Background Effect, Background Dose and Background Effect|
|Study Factors||None, Study Dose, Study Effect, Study Dose and Study Effect|
|Response Submodel||Population Submodel||Position||Background Submodel||Study Submodel||Fit||n||Weight||Map Value|
|Exponential||Weibull||3||dose and response||Dose||0.0071||6||1.3250||0.0314|
|Hockey Stick||Weibull||3||dose and response||Dose||0.0072||6||1.3165||0.0374|
|Linear||Weibull||4||Response||dose and response||0.0078||5||1.3072||0.0614|
|Linear||Weibull||4||dose and response||Response||0.0078||5||1.3072||0.0734|
|Linear||Weibull||4||dose and response||Dose||0.0078||5||1.3035||0.0854|
|Exponential||Weibull||4||Response||dose and response||0.0074||6||1.2844||0.1151|
|Dose(ppm in Hair)||Population Frequency||Study||Average||UncertaintyMedian||0.95|
|10 vs. 1||0.5||Seychelles||0.32||0.31||0.47|
|10 vs. 1||0.95||Seychelles||0.45||0.43||0.73|
|10 vs. 1||0.5||Iraq||0.31||0.31||0.42|
|10 vs. 1||0.95||Iraq||0.44||0.43||0.63|
The average, median, and 95th percentiles for predicted AOT is given for various combinations of dose, population frequency, study population, and likelihood. The values for the doses "10 vs 1" represent the net difference in expected AOT with maternal concentrations of methylmercury in hair at 10 ppm vs 1 ppm.
Descriptions of Figures
- Figure AC-1: Model Assembly from Four Components Figure AC-1 is a flowchart that indicates how the components of the dose-response models are assembled. It indicate that for model fitting purposes, either the response can be predicted from dose and a given cumulative frequency, or frequency can be predicted for the dose and a given response. In the former case, the background dose terms and the individual dose response submodel are used to calculate an input for the population submodel, whereas in the latter case the population model is used to calculate the input for the dose-response model.
- Figure AC-2: Age of Talking, Data and a Model Figure AC-2 is comprised of four 3-dimensional charts, where the x-axis represents the age of talking in months, the y-axis represents mercury levels in maternal hair, and the z-axis represents cumulative frequency. The charts at the top reflect the cumulative incidence tables constructed from the raw data (Age of Talking) from the Iraqi (left) and Seychelles (right). The charts at the bottom reflect a common model fit to both data sets. The charts show that the same model provides a reasonable approximation of both data sets.
- Figure AC-3: Age of Walking, Data and a Model Figure AC-3 is comprised of four 3-dimensional charts, where the x-axis represents the age of walking in months, the y-axis represents mercury levels in maternal hair, and the z-axis represents cumulative frequency. The charts at the top reflect the cumulative incidence tables constructed from the raw data (Age of Walking) from the Iraqi (left) and Seychelles (right). The charts at the bottom reflect a common model fit to both data sets. The charts show that the same model provides a reasonable approximation of both data sets.