AASLD-FDA-NIH-PhRMA Hepatotoxicity Steering Group Meeting, 2006 Presentations: Use and limitations of the RUCAM
James Freston, MD, PhD
University of Connecticut School of Medicine
The Roussel Uclaf Causality Assessment Method (RUCAM) was developed to improve objectivity and consistency in determining a causal relationship between drug administration and liver disease. The method is also referred to as the CIOMS method because the Council for International Organizations of Medical Sciences organized the effort, beginning in 1989. Based on consensus recommendations of an international panel of experts, the RUCAM appears to be the most widely used causality instrument. It uses numerical weighting of key features in 7 domains. These include temporal relationship (latency and dechallenge), concomitant drug use, search for other etiologies, existing information linking the drug to liver injury, and the response to rechallenge. Each key feature is given a numerical weight, and the weights are summed into an overall score intended to reflect the probability of causality. The total score is divided into ranges that represent highly probable (>8), probable (6-8), possible (3-5), unlikely (1-2), and excluded (≤0). The system handles hepatocellular and cholestatic/mixed reactions differently, recognizing that the latter may occur after a longer postcessation period and improve more slowly.
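The score ranges above can be written as a small lookup. This is only an illustrative sketch of the banding quoted in the text, not an implementation of the RUCAM itself; the function name `rucam_category` is hypothetical, and the total score is assumed to be an integer that has already been computed from the individual domains.

```python
def rucam_category(total_score: int) -> str:
    """Map a RUCAM total score to the causality category quoted in the text.

    Assumes the total has already been summed from the individual domains.
    """
    if total_score > 8:
        return "highly probable"
    if total_score >= 6:
        return "probable"
    if total_score >= 3:
        return "possible"
    if total_score >= 1:
        return "unlikely"
    return "excluded"  # total score of 0 or below


if __name__ == "__main__":
    for score in (10, 7, 4, 2, 0, -3):
        print(score, rucam_category(score))
```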
Validity is a major requirement of a rating scale. To be valid, the scores must accurately reflect the phenomenon measured. Two studies have addressed RUCAM’s validity. The first entailed application of the RUCAM to 49 published cases of DILI that included a positive rechallenge and 28 controls. The cases were rated on the basis of information obtained prior to rechallenge. The score was significantly higher in cases than in controls, with high levels of sensitivity (86%) and specificity (89%). The positive predictive value and negative predictive value were 93% and 78%, respectively.
The second study compared the RUCAM against another rating method, the clinical diagnostic scale (CDS), also referred to as the M&V after its inventors, V. Maria and R. Victorino. The CDS is simpler than the RUCAM, scoring factors in only 5 domains. It places more weight on hypersensitivity DILI and, unlike the RUCAM, requires normalization of dechallenge laboratory tests. It also uses a shorter postcessation onset interval and provides less precise criteria for excluding nondrug causes. The CDS was previously validated in 50 cases that were adjudicated as DILI by three experts. The study compared the RUCAM and CDS in 215 cases in a DILI registry in Spain. The cases had been submitted to the registry in a structured reporting form and evaluated by three DILI experts. Absolute agreement between the two scales was observed in only 42 cases (18%). Disagreement of one level (e.g. unlikely vs. possible) was reported in 108 cases (47%), and disagreement of two levels (e.g. unlikely vs. probable) in 70 cases (31%). The closest agreement was in cases with hypersensitivity features, in which disagreement was one level or less in 72% of cases. Agreement was just 6% in cases with features of cholestasis. No agreement was found in fulminant hepatitis or death.
The RUCAM produced better discrimination power and its content validity agreed better with the clinical assessments and with the opinion of the expert DILI panel. The authors concluded that more convincing evidence of RUCAM’s validity is difficult to obtain without a unanimously accepted gold standard of causality.
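The accuracy figures quoted for the first validation study (49 rechallenge-positive cases, 28 controls) can be checked with standard confusion-matrix arithmetic. The sketch below is a consistency check only. The four cell counts are an assumption: the text reports only group sizes and percentages, and the counts used here (42 cases and 25 controls classified correctly) are simply ones that reproduce those percentages after rounding.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard 2x2 diagnostic accuracy measures."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all cases
        "specificity": tn / (tn + fp),  # true negatives among all controls
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# ASSUMED cell counts (not stated in the text), chosen to match the
# reported group sizes: 49 cases = 42 + 7, 28 controls = 25 + 3.
metrics = diagnostic_metrics(tp=42, fn=7, tn=25, fp=3)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under these assumed counts the four values round to 86%, 89%, 93% and 78%, matching the figures reported above.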
While the RUCAM appears to be superior to the CDS, it has several limitations, some of which have become evident from its use by DILIN as an ancillary causality tool. These include considerable variation in scoring among the three expert reviewers who adjudicate each case of suspected DILI collected from the DILIN centers and presented to the reviewers in a standardized fashion. Among the first 16 cases adjudicated, there was complete agreement among the reviewers in only 4 instances. Scores differed by one level of probability in 3 cases and by more than 1 (e.g. possible vs. highly probable) in 10 cases. Because ambiguous instructions were thought to be responsible for the inconsistency, standard operating procedures (SOP) were developed and applied to the interpretation of the elements in the RUCAM, and a re-review of 18 cases was conducted. There were no significant differences on average between the two reviews, but this masked individual scoring changes ranging from –4 to +7. The reliability among reviewers improved from the first to the second review, suggesting that the application of the RUCAM can be improved by more experience in its use by reviewers, the application of the SOP, or both.
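Disagreement "by one level" or "by more than one level" among reviewers can be made concrete by ranking the five probability categories and taking the spread between the highest and lowest rating for a case. The sketch below is an assumption about how such a tally might be made, not DILIN's actual procedure; `LEVELS` and `level_spread` are hypothetical names, and the example ratings are invented for illustration.

```python
# Probability categories ordered from lowest to highest.
LEVELS = ["excluded", "unlikely", "possible", "probable", "highly probable"]


def level_spread(ratings: list) -> int:
    """Number of category levels between the lowest and highest rating."""
    ranks = [LEVELS.index(rating) for rating in ratings]
    return max(ranks) - min(ranks)


# Invented example: three reviewers rating one case.
example = ["possible", "highly probable", "probable"]
print(level_spread(example))  # spread between "possible" and "highly probable"
```

A spread of 0 corresponds to complete agreement, 1 to a one-level difference, and 2 or more to the larger discrepancies described above.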
Another limitation is the arbitrary weighting of factors; because of the absence of evidence, the weighting is based entirely on consensus opinion. The inclusion of only three risk factors (pregnancy, age above 55 years, and use of alcohol) is problematic as well. Two now-common populations at increased risk of DILI, obese patients and those infected with HIV, are likely to receive multiple medications for their concomitant morbidities. Children, in turn, are more susceptible than adults to valproate reactions. Alcohol use per se is not known to increase the risk, but excessive use does. The RUCAM does not differentiate between the two. The DILIN SOP now direct reviewers to add a point to the score if alcohol consumption exceeds one drink per day in women and two in men. The RUCAM is also limited with respect to scoring a case in which two drugs with known hepatotoxicity were administered simultaneously. In such cases 2 points are subtracted, with the potential of reducing the probability assessment by one level. The case may well be DILI, but this fact may not be reflected in the final score. DILIN now adjudicates each drug separately, first establishing the probability of DILI and then the probability of one drug versus another as the responsible agent.
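Two of the points above lend themselves to a short illustration: the DILIN alcohol threshold and the effect of the 2-point concomitant-drug deduction on the final category. The sketch below is a rough illustration under stated assumptions, not the DILIN SOP or the RUCAM worksheet. The function names, the `"female"`/`"male"` keys and the range of starting scores are hypothetical; the category bands are the ones quoted earlier in this summary.

```python
def rucam_category(total_score: int) -> str:
    """Category bands as quoted earlier in this summary."""
    if total_score > 8:
        return "highly probable"
    if total_score >= 6:
        return "probable"
    if total_score >= 3:
        return "possible"
    if total_score >= 1:
        return "unlikely"
    return "excluded"


def alcohol_point(drinks_per_day: float, sex: str) -> int:
    """One point if intake exceeds 1 drink/day (women) or 2 drinks/day (men)."""
    limits = {"female": 1, "male": 2}  # hypothetical keys for illustration
    return 1 if drinks_per_day > limits[sex] else 0


CONCOMITANT_PENALTY = 2  # points subtracted for a second hepatotoxic drug

# Which starting scores change category once the penalty is applied?
dropped = [
    score
    for score in range(3, 12)
    if rucam_category(score) != rucam_category(score - CONCOMITANT_PENALTY)
]
print(dropped)
```

With these bands, most starting scores in the range examined lose one category level after the deduction, which is the effect the text describes.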
The RUCAM clearly provides a level of objectivity and semiquantitation to the assessment of cases of suspected DILI. Given its shortcomings, however, one of DILIN’s goals is to attempt to develop and validate a more reliable assessment methodology.
James Freston is Boehringer Ingelheim Chair of Clinical Pharmacology and Professor of Medicine Emeritus at The University of Connecticut School of Medicine in Farmington, Connecticut. He was Chief of the Divisions of Gastroenterology and Hepatology and Clinical Pharmacology at the University of Utah before his appointment as Professor and Chair of the Department of Medicine at the University of Connecticut, a position he held for 17 years before becoming Director of Clinical Research.
He trained in gastroenterology and clinical pharmacology at the University of Utah and in hepatology for two years under Sheila Sherlock at the Royal Free Hospital in London. Throughout his career, he has focused on the effects of drugs on digestive diseases, with an emphasis on toxicity and safety, and on the effects of digestive diseases on drug disposition. He served twice on FDA GI Drugs Advisory Committees. His recent work has focused on a semiquantitative assessment of the relative consequences of the clinical, pathological and biochemical manifestations of DILI. He is a DILIN co-investigator at the University of Connecticut and serves on the DILIN Causality Committee.
He is Chair of the Foundation for Digestive Health and Nutrition and Past President of the American Gastroenterological Association. He received his M.D. degree from the University of Utah and his Ph.D. degree from the University of London.