Identifying and Measuring Artificial Intelligence (AI) Bias for Enhancing Health Equity
This regulatory science research, part of the Artificial Intelligence (AI) Program in the FDA's Center for Devices and Radiological Health (CDRH), aims to understand and measure bias and to improve the assessment of AI model generalizability.
Overview
Bias has a range of definitions in technical literature, law, and everyday usage. In this Artificial Intelligence Program, we define bias as a systematic difference in treatment of certain objects, people, or groups in comparison to others, where treatment is any kind of action, including perception, observation, representation, prediction, or decision (ISO/IEC TR 24027:2021). Health equity is a priority for CDRH, and we advance it by developing knowledge and safe and effective technologies that meet the needs of all patients and consumers.
There is considerable concern in the AI community that AI models may, typically inadvertently, worsen inequalities in health care delivery. A major regulatory science gap in the regulation of AI-enabled medical devices is the lack of fundamental methods to analyze training and test data, to understand, measure, and minimize bias, and to characterize performance for subpopulations. This gap is closely related to the generalizability and robustness of AI-enabled models, where the goal is to preserve model performance under naturally induced variations, including variations between subpopulations. There is a need to understand the conditions under which AI-enabled medical devices provide generalizable and robust output, to reasonably assure their safety and effectiveness.
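To make the idea of characterizing performance for subpopulations concrete, the sketch below computes sensitivity and specificity of a binary classifier separately for each subgroup, so that gaps between subgroups become visible. This is a minimal illustration only, not an FDA method or any of the tools listed under Resources; the data, subgroup labels, and decision threshold are all hypothetical.

```python
# Illustrative sketch: per-subgroup performance comparison for a binary
# classifier. All data, names, and thresholds are hypothetical; this is one
# common way to surface subgroup performance gaps, not an FDA/DRAGen method.
import numpy as np

def subgroup_performance(y_true, y_score, groups, threshold=0.5):
    """Report sensitivity and specificity for each subgroup.

    y_true  : array of 0/1 ground-truth labels
    y_score : array of model scores in [0, 1]
    groups  : array of subgroup labels (e.g., sex, imaging site, scanner)
    """
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    groups = np.asarray(groups)
    results = {}
    for g in np.unique(groups):
        m = groups == g
        tp = np.sum((y_pred[m] == 1) & (y_true[m] == 1))
        fn = np.sum((y_pred[m] == 0) & (y_true[m] == 1))
        tn = np.sum((y_pred[m] == 0) & (y_true[m] == 0))
        fp = np.sum((y_pred[m] == 1) & (y_true[m] == 0))
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        results[g] = {"n": int(m.sum()),
                      "sensitivity": sens,
                      "specificity": spec}
    return results

# Hypothetical example: a large sensitivity gap between subgroups would flag
# a potential bias worth deeper investigation (e.g., with confidence
# intervals and statistical testing, which this sketch omits).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.25, size=200), 0, 1)
groups = rng.choice(["A", "B"], size=200)
for g, stats in subgroup_performance(y_true, y_score, groups).items():
    print(g, stats)
```

In practice, such point estimates would be accompanied by uncertainty quantification and an analysis of whether subgroup differences reflect the model, the data, or both.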
Projects
- Tackling Sex Bias in AI for Severity Assessment of COVID-19
- Visual Feature Auditing of Imaging Classification Models to Identify Subgroups with Poor Performance
- Unsupervised Deep Clustering for Subgroup Identification within Medical Image Datasets
Resources
- “DRAGen: Decision Region Analysis for Generalizability,” Catalog of Regulatory Science Tools, 2024.
- Burgon A, Petrick N, Sahiner B, Pennello G, Cha K, and Samala RK. "Decision Region Analysis for Generalizability (DRAGen) of AI models: Estimating model generalizability in the case of cross-reactivity and population shift." Journal of Medical Imaging, 11(1):014501, 2024.
- Sidulova M, Sun X, and Gossmann A. "Deep Unsupervised Clustering for Conditional Identification of Subgroups Within a Digital Pathology Image Set." In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2023), pp. 666-675, 2023.
For more information, email OSEL_AI@fda.hhs.gov.