Improving adverse event detection related to biologic immunosuppressant use – a pilot study of the BERT deep learning model adapted to real-world clinical notes

CERSI Collaborator: University of California, San Francisco: Dana Ludwig, MD; Madhumita Sushil, PhD; Mahalakshmi Parakala; James Buchanan, PharmD; Anna Silverman, MD; Balu Bhasuran, PhD; Atul Butte, MD, PhD; Vivek Rudrapatna, MD, PhD (UCSF)

FDA Collaborators: Jawahar Tiwari, PhD; Lauren Choi, PharmD; Nadia Habal, MD; Artur Belov, PhD; Rebecca Racz, PharmD; Samer El-Kamary, MD, MPH; Qi Liu, PhD; Ohenewaa Ahima, MD; Yan Li, Ph.D., B.Pharm

Project Start Date: 02/2021
Project End Date: 08/2022

Regulatory Science Challenge

Accurate and timely discovery of adverse events and undesirable outcomes from a clinical treatment is critical to ensure that clinicians and patients can make well-informed treatment decisions that balance risks with benefits. Immunosuppressants, a class of medications used to treat autoimmune and immune-mediated diseases, are an important area for tracking adverse events. These medications are commonly required for long-term management of these conditions. Many of them are biologics, therapies made from living systems such as cultured cells and organisms. Because they suppress the immune system, immunosuppressants have been associated with toxicities that can include infections, cancer, and other serious conditions. There is great interest in studying these biologic immunosuppressants from a safety standpoint, in large part because patients use them for long periods of time and because they are relatively new therapies.

A rich source of adverse event data comes from clinical notes within electronic health record (EHR) systems because clinicians often document adverse events, as well as the reasons for treatment discontinuation, in their notes about their patients. However, clinical notes have been underutilized because we lack tools to effectively examine them. Recently, there have been impressive advances in natural language processing following the release of context-aware deep learning models such as BERT (Bidirectional Encoder Representations from Transformers). However, efforts to adapt the BERT model for applications in clinical research have historically been hindered by the presence in clinical notes of protected health information (PHI), including information that could be used to discover a patient’s identity.

Project Description and Goals

The goals of this project were to 1) develop a clinical language-specific adaptation of the BERT model by training it using 75 million machine-PHI-redacted clinical notes authored at the University of California, San Francisco (UCSF), and 2) adapt it to the task of serious adverse event detection from clinical notes. For this pilot study, we used outpatient clinical notes from the inflammatory bowel disease (IBD) clinic at UCSF and limited the scope of our model development to serious adverse events that resulted in a hospitalization and were associated with the use of biologic immunosuppressants for IBD. We compared our models against other publicly available, state-of-the-art systems for clinical language inference, both on general benchmarked tasks and on tasks pertaining specifically to adverse event detection.

Research Outcomes/Results

Using 75 million clinical notes authored at UCSF, we successfully trained a dedicated clinical language model from scratch, called UCSF BERT. We evaluated its performance on multiple publicly benchmarked tasks related to clinical language inference. These tasks include identifying important medical concepts in clinical notes and correctly characterizing the relationships between these concepts. For example, the tasks involve identifying drugs and diseases that are mentioned in notes, as well as recognizing relationships such as “Drug A is a treatment for disease B”. On these tasks, our model performed as well as or better than previously published models, including versions of BERT that had been trained on publicly available documents related to biomedical and clinical topics.
We subsequently customized this model to perform tasks specific to serious adverse event detection. These included identifying temporal relationships between medications of interest and subsequent hospitalizations (e.g., the patient was hospitalized after starting drug A) as well as identifying the stated reasons for given hospitalizations (e.g., the patient was hospitalized for acute pancreatitis). These customized BERT models were compared to previously published, state-of-the-art models for adverse event detection. All models were trained and evaluated using a collection of gastroenterology clinical notes written about patients with inflammatory bowel disease, many of whom received biologic treatments for their disease.
Our best-performing models achieved accuracy ranging from 88% to 96% and macro F1 scores ranging from 62% to 68%. Macro F1 is the more stringent metric here because it weights each class equally, so it reflects performance on serious adverse events (defined here as those associated with a hospitalization) despite their general rarity in our collection of notes. These models performed reasonably well despite the significant length of these clinical notes, and generally outperformed state-of-the-art models by up to 10%.
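
To illustrate why macro F1 can sit well below accuracy when events are rare, the following is a minimal Python sketch. The labels are hypothetical toy data (not study data), and the function names `accuracy` and `macro_f1` are illustrative helpers rather than part of the project's code. It shows a degenerate classifier that never flags an adverse event.

```python
def accuracy(y_true, y_pred):
    """Fraction of notes whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so a rare class counts as much as a common one."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 1 = note describes a serious adverse event, 0 = it does not.
y_true = [0] * 95 + [1] * 5   # events are rare (5 of 100 notes)
y_pred = [0] * 100            # a classifier that never predicts an event

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the classifier is correct on 95 of 100 notes, yet its macro F1 is under 0.5 because the F1 for the rare event class is zero. This is the sense in which macro F1 is the stricter of the two metrics for rare-event detection.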
Overall, these results support the feasibility and the value of using advanced clinical language models to automate the detection of adverse events from electronic health records data. These models may someday support the surveillance of drug safety in the post-marketing setting using new data sources.

Research Impacts

Facilitation of strategic relationships with expert groups and stakeholders: This project has established new relationships between the FDA and our university, involving stakeholders that include clinicians, pharmacists, researchers, software engineers, and patients. Our team is currently developing a follow-on grant that involves both FDA and UCSF stakeholders to continue the development of this method.
Scientific publications/citations in literature: We are in the process of submitting one paper describing the UCSF BERT model and are preparing a second (and possibly a third) manuscript to report the results of the serious adverse event detection customizations of our BERT model.
Presentations at conferences/meetings: Our work has been accepted for presentation at the American Medical Informatics Association 2022 annual summit, and we anticipate additional conference presentations. We are also in the process of preparing an education session for the Drug Information Association meeting in 2023.
Data-sharing with public: The underlying data used to train and test this model are deidentified and available to interested members of the public on request.
Technology transfer to stakeholders: Our models will be made publicly available to enable others to build upon and improve these adverse event detection methods.
Catalyst for future research: This pilot study has led to multiple follow-on grant submissions by our team to both internal and external funding sources to support future development of these methods.