Before new drug candidates are approved for use in humans, they must undergo a comprehensive panel of non-clinical safety tests. Among these tests are those that assess a drug’s ability to cause changes to a cell’s DNA sequence (mutagenicity), which is closely linked to the drug’s potential to cause cancer. The mutagenicity of a drug’s active ingredient is assessed using tests in bacteria[1] as well as in mammalian cells and animals. In addition to their active ingredient(s), drug products may contain a wide variety of impurities that can arise as by-products of manufacturing or as degradation products during storage. Until recently, these impurities had to be tested for mutagenicity in bacteria before the drug product could be approved for human use. In addition, drug developers often explore multiple synthetic routes to the active ingredient in parallel, with different synthetic routes producing different impurities. It is therefore desirable to have a rapid way, one that does not require a bacterial lab test, to assess the mutagenicity of all possible impurities and so determine which route would best limit the production of mutagens.
Over the past decade, computer models of increasing sophistication based on chemical structure have been used to help predict the mutagenicity of drug impurities as well as drug substances. These models, which have wide application in many research contexts, are collectively referred to as (quantitative) structure-activity relationship ((Q)SAR) models.
Structure-Activity Correlations and (Q)SAR Models
(Q)SAR[2] models describe the correlation between specific characteristics, or structural features, of molecules and their chemical or biological activities (for example, mutagenicity) under the general assumption that similar structures give rise to similar activities. In constructing a statistical-based QSAR model (Figure 1), one starts with a (typically large) set of molecules whose activity or property of interest is already known. Characteristics or features of the molecules that might be correlated with this activity are encoded numerically as “descriptors.” Descriptors can range from relatively simple ones, like molecular weight or the number of specific atom types, to complex ones, such as mathematical expressions for the distribution of the electrons or the structure or conformation of a chemical group within the molecule. Then, using a variety of possible statistical and machine learning approaches, modelers establish a mathematical relationship between a set of informative descriptors and the activity in question. With this model in hand, one can calculate descriptors for a molecule with unknown activity and make a prediction of its activity without the need for laboratory testing.
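The workflow described above (known molecules, numerical descriptors, a fitted relationship, then prediction) can be illustrated with a very small sketch. It is not the method used in any actual (Q)SAR product: a k-nearest-neighbor vote stands in for the statistical or machine learning step, and the descriptor choices, the training values, and the names `TRAINING`, `distance`, and `predict` are all invented for illustration.

```python
import math

# Hypothetical training set of (descriptor vector, known activity) pairs.
# The three descriptors are assumptions chosen for illustration only:
# (molecular weight / 100, count of one atom-group type, ring count).
# Activity is 1 for mutagenic and 0 for non-mutagenic. Values are made up.
TRAINING = [
    ((1.2, 1, 1), 1),
    ((1.4, 2, 1), 1),
    ((2.3, 1, 2), 1),
    ((0.9, 0, 0), 0),
    ((1.8, 0, 1), 0),
    ((2.6, 0, 2), 0),
]


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(descriptors, k=3):
    """Predict activity by majority vote of the k most similar training molecules.

    This encodes the 'similar structures give rise to similar activities'
    assumption in the simplest possible way.
    """
    neighbors = sorted(TRAINING, key=lambda item: distance(item[0], descriptors))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 > k else 0


if __name__ == "__main__":
    print(predict((1.3, 1, 1)))  # untested molecule close to known positives
    print(predict((1.0, 0, 0)))  # untested molecule close to known negatives
```

A production model would use many more molecules, curated descriptors, and a validated learning algorithm; the sketch only shows where each ingredient named in the text fits.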
(Q)SAR models may also take the form of rule-based SARs, which are often applied by a computer but are based on knowledge (for example, about the reactivity of chemical groups) that was derived manually by human experts. Arranged as a decision tree, the rules can activate a “structural alert” for a given compound and can also identify mitigating features that offset the alert. These modeling systems can thus make predictions of a chemical’s activity based solely on its structure.
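A rule-based SAR of this kind can be sketched as a list of alerts, each with optional mitigating features. The example below is a simplified illustration under stated assumptions: a molecule is represented as a set of named features rather than a real chemical structure, and the rule names, the `Rule` class, and the `assess` function are hypothetical placeholders, not a vetted expert rule set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One expert rule: an alerting feature plus features that mitigate it."""
    alert: str
    mitigators: frozenset = field(default_factory=frozenset)


# Illustrative placeholder rules only; real systems encode expert-curated
# substructure patterns rather than plain feature names.
RULES = [
    Rule("aromatic_nitro", frozenset({"steric_shielding"})),
    Rule("alkyl_halide"),
]


def assess(features):
    """Return (call, fired_alerts) for a molecule given as a collection of features.

    An alert fires when its feature is present and no mitigating feature is.
    """
    features = set(features)
    fired = [
        rule.alert
        for rule in RULES
        if rule.alert in features and not (rule.mitigators & features)
    ]
    return ("positive" if fired else "negative", fired)


if __name__ == "__main__":
    print(assess({"aromatic_nitro"}))
    print(assess({"aromatic_nitro", "steric_shielding"}))
```

Keeping the mitigators attached to each rule mirrors the decision-tree idea in the text: the alert is checked first, and mitigation is only considered for alerts that are present.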
The general performance of QSAR models can be tested in validation experiments with a set of molecules with known activity that were not used for model construction. Statistical measures of a model’s predictive performance as it is applied to this new data set include sensitivity, specificity, positive predictivity and negative predictivity (Figure 2). Depending on its context of use, the model can be tuned to optimize performance according to one or more statistical measures.
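The four performance measures named above follow directly from counts of true and false positives and negatives in the external validation set. The sketch below uses the standard definitions; the function name `performance` and the example labels are assumptions made for illustration, not data from any study.

```python
def performance(actual, predicted):
    """Compute validation statistics from paired known and predicted activities.

    Both inputs are sequences of 1 (mutagenic) and 0 (non-mutagenic).
    A measure whose denominator is zero is returned as NaN.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),            # known positives found
        "specificity": ratio(tn, tn + fp),            # known negatives found
        "positive_predictivity": ratio(tp, tp + fp),  # positive calls that are right
        "negative_predictivity": ratio(tn, tn + fn),  # negative calls that are right
    }


if __name__ == "__main__":
    # Made-up labels for ten validation molecules.
    known = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    called = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    for name, value in performance(known, called).items():
        print(f"{name}: {value:.2f}")
```

Which measure to favor depends on the context of use: a screen meant to avoid missing mutagens would typically be tuned toward high sensitivity and negative predictivity, at some cost in specificity.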
New Guidelines for Assessing Drug Impurities
Over the past decade, CDER has maintained a vigorous program of research in (Q)SAR approaches for mutagenicity prediction, as well as for other drug toxicities, and advances in the field have led to major drug regulatory changes. Specifically, there are now international guidelines, ratified by FDA and its regulatory counterparts in the International Council for Harmonization (ICH), that allow drug developers to substitute predictions based on (Q)SAR models for traditional laboratory tests when determining the mutagenicity of drug impurities.[3] CDER’s collaborative research has improved the models used for this purpose so that their predictions are more reliable.
Building Better (Q)SAR Models to Assist Drug Developers
In a recent effort to improve the performance of (Q)SAR models for mutagenicity prediction, CDER researchers in the Division of Applied Regulatory Science collaborated with (Q)SAR software developers to collect experimental laboratory information for thousands of molecules from multiple databases, carefully reviewing the data and resolving conflicting results.[4] To create a large and robust data set to train models, they combined the experimental results obtained in the two bacteria historically used for testing (Salmonella typhimurium and Escherichia coli) such that a positive result for mutagenicity in either would be considered positive in the model. The researchers then constructed two statistical-based models, which were tested in combination with an expert rule-based model to simulate their real-life application to drug impurities.
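The composite labeling rule described here (positive in either bacterium counts as positive) is simple to express in code. The following is a sketch only: the function name `composite_call` is hypothetical, and the treatment of missing results (a molecule tested in only one species keeps that species’ result) is an assumption that the source does not specify.

```python
def composite_call(salmonella, e_coli):
    """Merge per-species results into one composite training label.

    Each argument is True (mutagenic), False (non-mutagenic), or None
    (not tested). Returns True if either species was positive, False if
    every available result was negative, and None if nothing was tested.
    """
    available = [result for result in (salmonella, e_coli) if result is not None]
    if not available:
        return None
    return any(available)


if __name__ == "__main__":
    print(composite_call(True, False))   # positive in one species
    print(composite_call(False, False))  # negative in both
    print(composite_call(None, True))    # tested in E. coli only
```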
Performance of the models, alone and in combination, was tested using known mutagenic and non-mutagenic chemicals that had not been used in model construction, and the results were compared to previous (Q)SAR models. Combining the models reduced the number of chemicals for which no prediction could be made (that is, chemicals that were “out of domain”[5]), while sensitivity and negative predictivity increased. The use of three models also meant that there was greater confidence in many of the predictions as to whether a molecule was mutagenic.
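One way to see why combining models reduces the number of out-of-domain chemicals is to sketch a simple consensus rule. The rule below is an assumption made for illustration, not the procedure used in the study: any positive call makes the overall call positive, a chemical is out of domain only if every model is, and the count of supporting models serves as a rough indication of confidence. The name `combine` and the call labels are hypothetical.

```python
from collections import Counter

POSITIVE, NEGATIVE, OUT_OF_DOMAIN = "positive", "negative", "out_of_domain"


def combine(calls):
    """Combine per-model calls into (overall_call, supporting_model_count).

    Conservative illustrative rule: one positive is enough for an overall
    positive; the chemical is unpredictable only if no model covers it.
    """
    counts = Counter(calls)
    if counts[POSITIVE]:
        return POSITIVE, counts[POSITIVE]
    if counts[NEGATIVE]:
        return NEGATIVE, counts[NEGATIVE]
    return OUT_OF_DOMAIN, 0


if __name__ == "__main__":
    # One model cannot predict, but the other two can, so a call is still made.
    print(combine([OUT_OF_DOMAIN, NEGATIVE, NEGATIVE]))
    print(combine([POSITIVE, NEGATIVE, OUT_OF_DOMAIN]))
    print(combine([OUT_OF_DOMAIN, OUT_OF_DOMAIN, OUT_OF_DOMAIN]))
```

Under this rule a chemical that falls outside one model’s domain is still covered whenever another model can predict it, and agreement among several models can be reported alongside the call.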
With continued improvement in the performance of (Q)SAR models, drug developers are increasingly interested in applying them not only to impurities but also to drug molecules. Models have been developed and continue to be improved for a wide variety of endpoints relevant to drug safety, such as carcinogenicity, liver toxicity, and cardiac toxicity. When drug developers are selecting candidate structures to investigate as future therapies, they can use (Q)SAR models to predict the relative toxicity of those structures before any lab testing is performed. This enables rapid decisions to be made on which candidates to develop, prioritizing those with the most desirable safety profiles. This state-of-the-art computational screening approach provides a foundation for bringing new drugs to the market.
1. These assays involve exposing billions of bacteria to potentially DNA-damaging agents using strains that cannot grow without certain nutrients. In these strains, mutations can restore the ability to grow without these nutrients, and by counting the number of surviving bacterial colonies one can determine the level of mutation caused by a substance.
2. (Quantitative) structure-activity relationship. The term “(Q)SAR” collectively refers to models that describe quantitative relationships (QSARs), where individual contributing features are weighted, and qualitative relationships (SARs), where the absence or presence of a feature is the basis of the prediction.
3. The ICH M7 guideline describes the process whereby actual and potential impurities or degradation products likely to be present in the drug substance and drug product are identified, and it outlines how a hazard assessment should be performed. For impurities without experimental data, two complementary (Q)SAR models can be used to determine if they are mutagenic or non-mutagenic.
4. Landry, Curran, Marlene T. Kim, Naomi L. Kruhlak, Kevin P. Cross, Roustem Saiakhov, Suman Chakravarti, and Lidiya Stavitskaya. "Transitioning to composite bacterial mutagenicity models in ICH M7 (Q)SAR analyses." Regulatory Toxicology and Pharmacology 109 (2019): 104488.
5. The applicability domain is the area of chemical space within which a model is able to make a prediction with a given reliability. This area can be calculated based on the structural features and properties of the chemicals used to construct the model, whose values define the perimeter of the chemical space and knowledge with which the model is familiar. See Stavitskaya L., Aubrecht J., Kruhlak N.L. (2015) Chemical Structure-Based and Toxicogenomic Models. In: Graziano M., Jacobson-Kram D. (eds) Genotoxicity and Carcinogenicity Testing of Pharmaceuticals. Springer, Cham; and OECD Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models.