Request for Public Comment: Measuring and Evaluating Artificial Intelligence-enabled Medical Device Performance in the Real World
This Request for Public Comment is intended for discussion purposes only and does not represent draft or final guidance. It is not intended to propose or implement policy changes regarding the evaluation of devices which integrate artificial intelligence (AI), including generative AI (GenAI)-enabled technology. This document is not intended to communicate the FDA's proposed (or final) regulatory expectations but is instead meant to seek early feedback from groups and individuals outside the Agency and advance a broader discussion among the AI healthcare ecosystem on this topic.
The objective of this ‘Request for Public Comment’ is to obtain comment and feedback from interested parties and the public on a series of questions related to current, practical approaches to measuring and evaluating the performance of AI-enabled medical devices in the real world, including strategies for identifying and managing performance drift, such as detecting changes in input and output. Please submit your comments to https://www.regulations.gov, Docket No. FDA-2025-N-4203 for ‘Measuring and Evaluating Artificial Intelligence-enabled Medical Device Performance in the Real World; Request for Public Comment.’ FDA intends to consider all comments related to this topic that are timely submitted to this docket (FDA-2025-N-4203) by December 1, 2025.
Background
AI, including GenAI, presents opportunities to improve patient outcomes, advance public health, and accelerate medical innovation. At the same time, these technologies introduce new considerations for assuring the continued safety and effectiveness of AI-enabled medical devices across the total product life cycle, particularly with respect to assessing their performance, safety, and reliability after deployment in real-world settings.
The U.S. Food and Drug Administration (FDA or the Agency) is seeking information from interested parties and the public on best practices, methodologies, and approaches for measuring and evaluating real-world performance of AI-enabled medical devices. This includes approaches to detect, assess, and mitigate performance changes over time to help assure these medical devices remain safe and effective throughout their life cycle.
This Request for Public Comment builds on insights discussed during the November 2024 meeting of the FDA Digital Health Advisory Committee (Summary and Meeting Materials and Public Comments), where interested parties discussed robust real-world evaluation strategies to assure that AI-enabled medical devices continue to be safe and effective after deployment.
AI system performance can be influenced by changes in clinical practice, patient demographics, data inputs, and health care infrastructure, among other factors. Such changes, commonly referred to as data drift (or concept drift, or model drift), may lead to performance degradation, bias, or reduced reliability. Additional factors such as user behavior, workflow integration, and changes to clinical guidelines may also impact system behavior in practice.
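As an illustration only of what detecting a change in data inputs can look like, the following minimal sketch computes a Population Stability Index (PSI) between a reference sample and a more recent sample of a single numeric input. It is not a method described or endorsed in this Request for Public Comment. The function name `population_stability_index`, the bin count, and the sample values are assumptions made for this example.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a current sample of one numeric input.

    Bins are equal-width over the range of the reference sample (an
    assumption of this sketch; quantile bins are another common choice).
    """
    if len(reference) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if reference is constant

    def bin_fractions(values: Sequence[float]) -> list:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Hypothetical data: a baseline input distribution and a shifted one.
baseline = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
```

A PSI near zero indicates that the two samples are distributed similarly across the bins, and larger values indicate a larger shift. Any alerting threshold is a design choice for the monitoring program. A measure of this kind reflects input changes only and does not by itself show that device performance has degraded.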
Currently, many AI-enabled medical devices are evaluated primarily through retrospective testing or static benchmarks. While these methods may help establish a baseline understanding of a medical device's performance, they are not designed to predict behavior in dynamic, real-world environments. Ongoing, systematic performance monitoring, which observes how systems actually behave during clinical deployment, is increasingly recognized as relevant to maintaining safe and effective AI use.
Through this Request for Public Comment, the FDA seeks to gather information on current, practical approaches to measuring and evaluating the performance of AI-enabled medical devices in the field, including strategies for identifying and managing performance drift, such as detecting changes in input and output. The Agency is particularly interested in information on methods that:
- Are currently deployed at scale in real-world clinical environments,
- Are supported by real-world evidence, and
- Are applied in clinical (patient- or health care worker-facing) settings.
Request for Public Comment
In particular, the FDA's Center for Devices and Radiological Health (CDRH) seeks comment from the public on the following questions:
- Performance Metrics and Indicators
1a. What metrics or performance indicators do you use to measure the safety, effectiveness, and reliability of AI-enabled medical devices in real-world clinical use?
1b. How are these metrics defined and weighted when assessing different dimensions of performance and safety?
1c. What timeframe do you consider when evaluating “real-world clinical use” performance?
- Real-World Evaluation Methods and Infrastructure
2a. What tools, methodologies, or processes are you currently using to proactively monitor AI-enabled medical device performance post-deployment?
2b. How do you balance human expert review and automated monitoring approaches in your evaluation methodology, and what are the pros and cons of each when it comes to practical implementation?
2c. What technical, operational, or organizational infrastructure supports your real-world AI-enabled medical device performance evaluation?
- Postmarket Data Sources and Quality Management
3a. What data sources do you typically use for ongoing performance evaluation (e.g., electronic health records, device logs, patient-reported outcomes)?
3b. How do you address data quality, completeness, and interoperability challenges in your monitoring systems?
3c. What methods have been most effective in incorporating clinical outcomes and user feedback into model updates?
- Monitoring Triggers and Response Protocols
4a. What triggers the need for additional assessments and more intensive evaluation?
4b. How do you define and respond to performance degradation in real-world settings?
- Human-AI Interaction and User Experience
5a. How do clinical usage patterns and user interactions influence AI-enabled medical device performance over time based on your observations?
5b. What design features, user training, or communication strategies have proven most effective for maintaining safe and effective use as systems evolve?
- Additional Considerations and Best Practices
6a. In addition to the factors previously mentioned, what other considerations, best practices, or tools were important in the development and implementation of your real-world validation system?
6b. Please address any implementation barriers encountered, incentives that supported your efforts, and approaches to maintaining patient privacy and data protections.
Please submit all public comments to the docket (FDA-2025-N-4203 for ‘Measuring and Evaluating Artificial Intelligence-enabled Medical Device Performance in the Real World; Request for Public Comment’), available at Regulations.gov. The public comment period will end on December 1, 2025.
Submitters are not required to respond to every question in this Request for Public Comment. They may address only those questions or topics that are relevant to their expertise, experience, or organizational capacity, and may provide partial responses, as appropriate.
Questions
For questions, contact digitalhealth@fda.hhs.gov.