BERTox Initiative

Initiative to apply LLMs to facilitate analysis of FDA documents and public literature for improved efficiency and accuracy.

AnimalGAN | SafetAI | BERTox | PathologAI | TranslAI

Objective: To apply large language models (LLMs), such as GPT and Llama, to the analysis of FDA documents and public literature, improving efficiency and accuracy in support of regulatory science and the review process.

Introduction: The FDA regulatory process involves extensive document review, and the FDA itself generates multiple documents during product review. Analyzing the data in these documents supports regulatory science research and informs the product-review process. AI-based Natural Language Processing (NLP) has focused on developing LLMs, trained on large text corpora, that perform a broad range of NLP tasks. This initiative aims to assess the application of LLMs to FDA documents and to develop content-specific LLMs that facilitate regulatory science tasks at FDA, such as information retrieval and text summarization.

Approaches: BERTox is a suite of LLM-based NLP applications whose functions range from information retrieval to sentiment analysis, text classification, and Named Entity Recognition (NER). In several pilot studies, the BERTox approach has been applied to drug-induced liver injury classification based on FDA drug labeling, causal inference on the FDA Adverse Event Reporting System (FAERS) database, AI bias in the interpretation and classification of drug properties (e.g., safety and efficacy), text summarization to provide highlights of labeling sections, and automatic anomaly analysis. The initiative places specific emphasis on developing responsible AI models: customized LLMs that can be operated in a local environment for specific regulatory applications, with an understanding of their bias, context of use, causal inference, and explainability.

Potential impact: Reviewing text documents is a crucial step in assessing the safety and efficacy of FDA-regulated products. However, the current manual process is time-consuming and resource-intensive. BERTox offers a set of LLM-based AI tools and systems that process FDA documents and extract critical information from them to improve and expedite the product-review process. In addition, BERTox can serve as an institutional memory, providing effective access to frequently referenced past documents to ensure consistency and evidence-based decision making in the review of new products.

References

Year	Title	Authors	Full Citation
2025	Leveraging FDA labeling documents and large language model to enhance annotation, profiling, and classification of drug adverse events with AskFDALabel.	Wu L., Fang H., Qu Y., Xu J., and Tong W.	Leveraging FDA labeling documents and large language model to enhance annotation, profiling, and classification of drug adverse events with AskFDALabel. Wu L., Fang H., Qu Y., Xu J., and Tong W. Drug Safety. 2025, 48(6): 655.
2024	Assessing the performance of large language models in literature screening for pharmacovigilance: a comparative study.	Li D., Wu L., Zhang M., Shpyleva S., Lin Y.C., Huang H.Y., Li T., and Xu J.	Assessing the performance of large language models in literature screening for pharmacovigilance: a comparative study. Li D., Wu L., Zhang M., Shpyleva S., Lin Y.C., Huang H.Y., Li T., and Xu J. Frontiers in Drug Safety and Regulation. 2024, 4: 1379260.
2025	Is ChatGPT ready for public use in organ-specific drug toxicity research?	Connor S., Wu L., Roberts R.A., and Tong W.	Is ChatGPT ready for public use in organ-specific drug toxicity research? Connor S., Wu L., Roberts R.A., and Tong W. Drug Discovery Today. 2025: 104297.
2024	50 Shades of AI in Regulatory Science.	Tong W. and Baran S.W.	50 Shades of AI in Regulatory Science. Tong W. and Baran S.W. Drug Discovery Today. 2024, 29(8): 104058. doi:10.1016/j.drudis.2024.104058.
2024	Context is Everything in Regulatory Application of Large Language Models (LLMs).	Tong W. and Renaudin M. GCRSR Interagency LLMs Taskforce.	Context is Everything in Regulatory Application of Large Language Models (LLMs). Tong W. and Renaudin M. GCRSR Interagency LLMs Taskforce. Drug Discovery Today. 2024, 29(4): 103916. doi:10.1016/j.drudis.2024.103916.
2024	Description and Validation of a Novel AI Tool, LabelComp, for the Identification of Adverse Event Changes in FDA Labeling.	Neyarapally G.A., Wu L., Xu J., Zhou E.H., Dang O., Lee J., Mehta D., Vaughn R.D., Pinnow E., and Fang H.	Description and Validation of a Novel AI Tool, LabelComp, for the Identification of Adverse Event Changes in FDA Labeling. Neyarapally G.A., Wu L., Xu J., Zhou E.H., Dang O., Lee J., Mehta D., Vaughn R.D., Pinnow E., and Fang H. Drug Safety. 2024, 47: 1265–1274. doi:10.1007/s40264-024-01468-8.
2024	Text Summarization with ChatGPT for Drug Labeling Documents.	Ying L., Liu Z., Fang H., Kusko R., Wu L., Harris S., and Tong W.	Text Summarization with ChatGPT for Drug Labeling Documents. Ying L., Liu Z., Fang H., Kusko R., Wu L., Harris S., and Tong W. Drug Discovery Today. 2024, 29(6): 104018. doi:10.1016/j.drudis.2024.104018.
2024	A Framework Enabling LLMs into Regulatory Environment for Transparency and Trustworthiness and its Application to Drug Labeling Document.	Wu L., Xu J., Thakkar S., Gray M., Qu Y., Li D., and Tong W.	A Framework Enabling LLMs into Regulatory Environment for Transparency and Trustworthiness and its Application to Drug Labeling Document. Wu L., Xu J., Thakkar S., Gray M., Qu Y., Li D., and Tong W. Regulatory Toxicology and Pharmacology. 2024, 149: 105613. doi:10.1016/j.yrtph.2024.105613.
2023	Bidirectional Encoder Representations from Transformers-like Large Language Models in Patient Safety and Pharmacovigilance: A Comprehensive Assessment of Causal Inference Implications.	Wang X., Xu X., Liu Z., and Tong W.	Bidirectional Encoder Representations from Transformers-like Large Language Models in Patient Safety and Pharmacovigilance: A Comprehensive Assessment of Causal Inference Implications. Wang X., Xu X., Liu Z., and Tong W. Experimental Biology and Medicine. 2023, 248(21):1908-1917. doi:10.1177/15353702231215895.
2023	Classifying Free Texts Into Predefined Sections Using AI in Regulatory Documents: A Case Study with Drug Labeling Documents.	Gray M., Xu J., Tong W., and Wu L.	Classifying Free Texts Into Predefined Sections Using AI in Regulatory Documents: A Case Study with Drug Labeling Documents. Gray M., Xu J., Tong W., and Wu L. Chemical Research in Toxicology. 2023, 36(8): 1290-1299. doi:10.1021/acs.chemrestox.3c00028.
2023	Development of Benchmark Datasets for Text Mining and Sentiment Analysis to Accelerate Regulatory Literature Review.	Wu L., Chen S., Guo L., Shpyleva S., Harris K., Fahmi T., Flanigan T., Tong W., Xu J., and Ren Z.	Development of Benchmark Datasets for Text Mining and Sentiment Analysis to Accelerate Regulatory Literature Review. Wu L., Chen S., Guo L., Shpyleva S., Harris K., Fahmi T., Flanigan T., Tong W., Xu J., and Ren Z. Regulatory Toxicology and Pharmacology. 2023, 137: 105287. doi:10.1016/j.yrtph.2022.105287.
2023	Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science.	Gray M., Samala R., Liu Q., Skiles D., Xu J., Tong W., and Wu L.	Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science. Gray M., Samala R., Liu Q., Skiles D., Xu J., Tong W., and Wu L. Clinical Pharmacology and Therapeutics. 2023, 115(4): 687-697. doi:10.1002/cpt.3117.
2023	RxBERT: Enhancing Drug Labeling Text Mining and Analysis with AI Language Modeling.	Wu L., Gray M., Dang O., Xu J., Fang H., and Tong W.	RxBERT: Enhancing Drug Labeling Text Mining and Analysis with AI Language Modeling. Wu L., Gray M., Dang O., Xu J., Fang H., and Tong W. Experimental Biology and Medicine. 2023, 248(21):1937-1943. doi:10.1177/15353702231220669.
2022	DeepCausality: A General AI-Powered Causal Inference Framework for Free Text: A Case Study of LiverTox.	Wang X., Xu X., Tong W., Liu Q., and Liu Z.	DeepCausality: A General AI-Powered Causal Inference Framework for Free Text: A Case Study of LiverTox. Wang X., Xu X., Tong W., Liu Q., and Liu Z. Frontiers in Artificial Intelligence. 2022, 5:999289.
2022	NeuroCORD: A Language Model to Facilitate COVID-19-Associated Neurological Disorder Studies.	Wu L., Ali S., Ali H., Brock T., Xu J., and Tong W.	NeuroCORD: A Language Model to Facilitate COVID-19-Associated Neurological Disorder Studies. Wu L., Ali S., Ali H., Brock T., Xu J., and Tong W. International Journal of Environmental Research and Public Health. 2022, 19:9974.
2021	AI-Based Language Models Powering Drug Discovery and Development.	Liu Z., Roberts R.A., Lal-Nag M., et al.	AI-Based Language Models Powering Drug Discovery and Development. Liu Z., Roberts R.A., Lal-Nag M., et al. Drug Discovery Today. 2021, 26:2593-2607.
2021	BERT-Based Natural Language Processing of Drug Labeling Documents: A Case Study for Classifying Drug-Induced Liver Injury Risk.	Wu Y., Liu Z., Wu L., et al.	BERT-Based Natural Language Processing of Drug Labeling Documents: A Case Study for Classifying Drug-Induced Liver Injury Risk. Wu Y., Liu Z., Wu L., et al. Frontiers in Artificial Intelligence. 2021, 4:729834.
2021	DICE: A Drug Indication Classification and Encyclopedia for AI-Based Indication Extraction.	Bhatt A., Roberts R., Chen X., et al.	DICE: A Drug Indication Classification and Encyclopedia for AI-Based Indication Extraction. Bhatt A., Roberts R., Chen X., et al. Frontiers in Artificial Intelligence. 2021, 4.
2021	InferBERT: A Transformer-Based Causal Inference Framework for Enhancing Pharmacovigilance.	Wang X., Xu X., Tong W., et al.	InferBERT: A Transformer-Based Causal Inference Framework for Enhancing Pharmacovigilance. Wang X., Xu X., Tong W., et al. Frontiers in Artificial Intelligence. 2021, 4:659622.
