FDA Digital Health and Artificial Intelligence Glossary – Educational Resource
This glossary is a compilation of commonly used terms in the digital health, artificial intelligence, and machine learning space and their definitions. These definitions are either directly from, or adapted from, various public sources, including consensus standard organizations and published literature.
The glossary is intended for general educational purposes. The glossary does not constitute agency guidance, policy, or recommendations. The glossary also does not constitute legally enforceable requirements nor does it affect any requirements of the Federal Food, Drug, and Cosmetic Act or implementing regulations.
A
A machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Artificial intelligence systems use machine- and human-based inputs to perceive real and virtual environments; abstract such perceptions into models through analysis in an automated manner; and use model inference to formulate options for information or action.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
Related Term(s): Machine Learning Model
Refers to the process of regularly collecting and analyzing data on the use of a deployed AI system to evaluate its performance in accomplishing its intended tasks in real-world settings. The assessment of an AI model’s performance involves various performance metrics and criteria depending on the specific application. This monitoring typically aims to assess the performance of these AI systems in practice, detect performance degradation or changes (e.g., due to data drift), identify instances of misuse, and address any safety or usability concerns.
Source: Adapted from:
- Sahiner, B., Chen, W., Samala, R. K., & Petrick, N. (2023). Data drift in medical machine learning: implications and potential remedies. British Journal of Radiology, 96(1150). https://doi.org/10.1259/bjr.20220878
- The White House. (2022). Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People. https://www.whitehouse.gov/ostp/ai-bill-of-rights/
Related Term(s): Data Drift
Any data system, software, hardware, application, tool, or utility that operates in whole or in part using AI.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
AI-enabled products designed to assist human decision making. The AI only provides suggestions, information, or data that helps users make more informed decisions.
Assistive AI and Autonomous AI exist on a spectrum. Examples of Assistive AI might include a wearable device that monitors a patient’s vital signs and alerts the user or a healthcare provider when certain metrics are out of the normal range, or a product that assists radiologists by showing the location of a potential abnormality.
Source: Adapted from Bajwa, J., Munir, U., Nori, A., & Williams, B. (2021). Artificial intelligence in healthcare: transforming the practice of medicine. Future Healthcare Journal, 8(2), e188–e194. https://doi.org/10.7861/fhj.2021-0095
Related Term(s): Autonomous Artificial Intelligence
AI-enabled products that have the ability to perform tasks, operate independently, and make decisions without human intervention. The level of autonomy can vary based on the product.
Assistive AI and Autonomous AI exist on a spectrum. An example of Autonomous AI could be a product that autonomously identifies normal X-rays and creates reports without the need for radiologist intervention.
Source: Adapted from:
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
- Sáenz, A.D., Harned, Z., Banerjee, O., Abràmoff, M. D., & Rajpurkar, P. (2023). Autonomous AI systems in the face of liability, regulations and costs. Npj Digital Medicine, 6(1). https://doi.org/10.1038/s41746-023-00929-1
- Yang, G., Cambias, J., Cleary, K., Daimler, E., Drake, J., Dupont, P. E., Hata, N., Kazanzides, P., Martel, S., Patel, R. V., Santos, V. J., & Taylor, R. H. (2017). Medical robotics—Regulatory, ethical, and legal considerations for increasing levels of autonomy. Science Robotics, 2(4). https://doi.org/10.1126/scirobotics.aam8638
Related Term(s): Assistive Artificial Intelligence
C
The ability of a model to adapt its performance by incorporating new data or experiences over time while retaining prior knowledge/information.
The model changes are implemented such that for a given set of inputs, the output may be different before and after the changes are implemented. These changes are typically implemented and validated through a well-defined process that aims at improving performance based on analysis of new data. In contrast to a locked model, a continual machine learning model has a defined learning process to change its behavior.
Source: Adapted from International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
Synonym(s): Adaptive Model, Lifelong Learning
Related Term(s): Locked Model
A specialized deep neural network architecture that consists of one or more convolution layers and is suited for processing grid-like data, such as images. In a convolution layer, a “filter” (window or template) slides over regions of the input image to identify low-level patterns (e.g., edges) by applying convolution (a mathematical dot-product operation applied to the input data). Different filters can be applied to extract different features, such as edges, textures, or curves in images. Additionally, convolutional neural networks (CNNs) can include pooling layers, whose function is to reduce the feature dimensionality while retaining relevant features. These convolution and pooling layers are stacked on top of each other, which enables the network to build up a hierarchical understanding of patterns and makes CNNs effective at tasks like image recognition and computer vision. An important aspect of this network is its ability to conserve spatial information of the original input while still performing the feature extraction.
Source: Adapted from:
- Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G., Cai, J., & Chen, T. (2018). Recent advances in convolutional neural networks. Pattern Recognition, 77, 354–377. https://doi.org/10.1016/j.patcog.2017.10.013
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
- Li, Z., Liu, F., Yang, W., Peng, S., & Zhou, J. (2022). A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Transactions on Neural Networks and Learning Systems, 33(12), 6999–7019. https://doi.org/10.1109/tnnls.2021.3084827
Related Term(s): Deep Learning, Neural Network
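The convolution and pooling steps described above for convolutional neural networks can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names, the toy 5x5 image, and the vertical-edge filter are all assumptions made for this example, and the filter is applied without flipping (the convention most deep learning libraries use).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Dot product between the filter and the image patch under it.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Toy image (illustrative): dark pixels (0) on the left, bright pixels (1) on the right.
image = [[0, 0, 0, 1, 1] for _ in range(5)]

# Hypothetical filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
pooled = max_pool2d(feature_map)
print(feature_map)
print(pooled)
```

The feature map is largest where the filter overlaps the dark-to-bright boundary, and the pooled output keeps that response while shrinking the grid.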
D
A structured report of relevant characteristics of datasets needed by stakeholders for AI development and evaluation. It contains two sections: a descriptive section, with information such as the number of samples, collection protocols, and associated metadata; and a scorecard section, a quantitative analysis that reports dataset characteristics using relevant criteria and metrics.
Source: Adapted from Pushkarna, M., Zaldivar, A., & Kjartansson, O. (2022). Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22). Association for Computing Machinery. https://doi.org/10.1145/3531146.3533231
Related Term(s): Model Card
Refers to the change in the input data distribution a deployed model receives over time, which can cause the model's performance to degrade. This occurs when the properties of the underlying data change. Data drift can affect the accuracy and reliability of predictive models.
For example, medical AI-enabled products can experience data drift due to statistical differences between the data used for model development and the data used in clinical operation (arising from variations in medical practice or context of use between training and clinical use), and due to changes in patient demographics, disease trends, and data collection methods over time.
Source: Adapted from:
- International Organization for Standardization. (2020). Software and systems engineering — Software testing — Part 11: Guidelines on the testing of AI-based systems (ISO/IEC TR 29119-11:2020). https://www.iso.org/standard/79016.html
- Sahiner, B., Chen, W., Samala, R. K., & Petrick, N. (2023). Data drift in medical machine learning: implications and potential remedies. British Journal of Radiology, 96(1150). https://doi.org/10.1259/bjr.20220878
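One simple way to look for the kind of input-distribution change described above for data drift is to compare a feature's distribution in the development data with its distribution in deployment data. The sketch below uses the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical cumulative distribution functions) as one possible measure. The patient ages, the site names, and the 0.5 threshold are hypothetical values chosen for illustration, not recommended settings.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Return the largest gap between the two empirical CDFs (0 = identical, 1 = no overlap)."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    largest_gap = 0.0
    # Both CDFs are step functions, so checking every observed value is sufficient.
    for value in ref_sorted + cur_sorted:
        ref_cdf = bisect_right(ref_sorted, value) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, value) / len(cur_sorted)
        largest_gap = max(largest_gap, abs(ref_cdf - cur_cdf))
    return largest_gap


# Hypothetical patient ages: model development data versus two deployment sites.
development_ages = [50, 52, 55, 58, 60, 61, 63, 65, 68, 70]
similar_site_ages = [51, 53, 54, 59, 60, 62, 64, 66, 67, 71]
younger_site_ages = [30, 32, 35, 38, 40, 41, 45, 48, 52, 55]

DRIFT_THRESHOLD = 0.5  # illustrative cut-off, not a recommended value

for name, ages in [("similar site", similar_site_ages), ("younger site", younger_site_ages)]:
    gap = ks_statistic(development_ages, ages)
    status = "possible drift" if gap > DRIFT_THRESHOLD else "no drift flagged"
    print(f"{name}: KS statistic = {gap:.2f} ({status})")
```

In practice, a flagged shift is a prompt for further review of model performance rather than proof that performance has degraded.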
A specialized branch of ML that involves training neural networks with multiple intermediary (hidden) layers that operate between an input layer that receives data and an output layer that presents the final network output. Each layer learns to transform its input data into a slightly more abstract and composite representation and produces an output that serves as an input for the next layer. As data propagates through successive layers, these models are able to learn hierarchical feature representations from the input data.
For example, in healthcare, deep learning models can be used to identify tumors or suspicious lesions in medical images to support physicians and radiologists in the evaluation of disease.
Source: Adapted from:
- Esteva, A., Robicquet, A., Ramsundar, B., Kuleshov, V., DePristo, M., Chou, K., Cui, C., Corrado, G., Thrun, S., & Dean, J. (2019). A guide to deep learning in healthcare. Nature Medicine, 25(1), 24–29. https://doi.org/10.1038/s41591-018-0316-z
- International Organization for Standardization. (2020). Software and systems engineering — Software testing — Part 11: Guidelines on the testing of AI-based systems (ISO/IEC TR 29119-11:2020). https://www.iso.org/standard/79016.html
Synonym(s): Deep Neural Network Learning
Related Term(s): Neural Network
A system that uses computing platforms, connectivity, software, and/or sensors for healthcare and related uses. These technologies span a wide range of uses, from applications in general wellness to applications as a medical device. They include technologies intended for use as a medical product, in a medical product, or as an adjunct to other medical products (devices, drugs, and biologics). They may also be used to develop or study medical products.
Source: From U.S. Food and Drug Administration. (2023). Digital Health Technologies for Remote Data Acquisition in Clinical Investigations, Guidance for Industry, Investigators, and Other Stakeholders. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/digital-health-technologies-remote-data-acquisition-clinical-investigations.
A set of information constructs that mimics the structure, context, and behavior of a physical asset, is dynamically updated with data from its physical twin throughout its lifecycle and informs decisions. The bidirectional interaction between the virtual and the physical is central to the digital twin.
Digital twins can enable personalized medicine applications. For example, the digital twin of a patient could inform clinical decisions, such as treatment options and clinical assessments. In addition, digital twins can play a role in assembling large, diverse virtual population cohorts for in silico clinical trials, and in quality assessment and process optimization of drug manufacturing processes.
Source: Adapted from:
- American Institute of Aeronautics and Astronautics (2020). Digital Twin: Definition & Value. An AIAA and AIA Position Paper. https://www.aiaa.org/docs/default-source/uploadedfiles/issues-and-advocacy/policy-papers/digital-twin-institute-position-paper-(december-2020).pdf
- Badano, A., Lago, M. A., Sizikova, E., Delfino, J. G., Guan, S., Anastasio, M. A., & Sahiner, B. (2023). The stochastic digital human is now enrolling for in silico imaging trials—methods and tools for generating digital cohorts. Progress in Biomedical Engineering, 5(4), 042002. https://doi.org/10.1088/2516-1091/ad04c0
- National Academies of Sciences, Engineering, and Medicine. (2023). Foundational Research Gaps and Future Directions for Digital Twins. https://doi.org/10.17226/26894
E
ML techniques that combine multiple models to improve the overall predictive performance compared to using a single model.
This involves training a set of base models, such as neural networks, and then aggregating their predictions to make the final prediction. Some common ensemble methods include bagging (i.e., training multiple models on different subsets of the training data and averaging their predictions), boosting (i.e., training models sequentially where each new model focuses on correcting the errors of the previous model), and stacking (i.e., using the predictions of multiple base models as input features for a higher-level “meta-model” that learns how to best combine them).
Source: Adapted from:
- Dietterich, T.G. (2000). Ensemble Methods in Machine Learning. In: Multiple Classifier Systems. MCS 2000. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45014-9_1
- IEEE Standards. (2022). IEEE Standard for Performance and Safety Evaluation of Artificial Intelligence Based Medical Devices: Terminology (IEEE Std 2802™‐2022). https://standards.ieee.org/ieee/2802/7460/
Synonym(s): Ensemble Learning
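The aggregation step described above for ensemble methods can be illustrated with majority voting, one common way to combine classifier outputs. In this minimal sketch, the three sets of base-model predictions and the ground-truth labels are made-up values; each hypothetical base model is wrong on a different case, so the combined vote corrects the individual errors.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by the most base models."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical ground truth and base-model outputs for five cases (1 = disease, 0 = no disease).
labels = [1, 0, 1, 1, 0]
base_predictions = {
    "model_a": [1, 0, 1, 0, 0],  # wrong on the fourth case
    "model_b": [1, 1, 1, 1, 0],  # wrong on the second case
    "model_c": [0, 0, 1, 1, 0],  # wrong on the first case
}

# Combine the three predictions for each case into one ensemble prediction.
ensemble_predictions = [majority_vote(votes) for votes in zip(*base_predictions.values())]

for name, predictions in base_predictions.items():
    print(name, accuracy(predictions, labels))
print("ensemble", accuracy(ensemble_predictions, labels))
```

The benefit shown here depends on the base models making different errors; models that fail on the same cases would gain little from voting.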
"Refers to a representation of the mechanisms underlying AI systems’ operation." (Source: NIST)
Explainability may help overcome the opaqueness of black-box systems (i.e., systems where the internal workings and decision-making processes are not transparent or readily understandable). These explanations can take various forms, including free-text explanations, saliency maps, SHapley Additive exPlanations (SHAP), or relevant input examples from data. The primary intent is to answer the question of why an AI system made a particular decision. Appropriate Explainable AI (XAI) methods may enable the development of more accurate, fair, interpretable, and transparent AI systems to safely augment human decision-making in healthcare.
Source: Adapted from:
- Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., Chatila, R., & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
Synonym(s): Explainable AI (XAI)
Related Term(s): Interpretability
F
An ML process where attributes from raw data that best represent the underlying patterns are identified for use in training a specific ML model. It involves selecting, transforming, or creating relevant input variables (known as features) to enhance the performance of ML models.
Domain knowledge and data analysis techniques can be used to craft features that capture the inherent relationships in the data. For example, for a model that can predict heart failure, feature engineering on patient data may involve creating a “risk score” by combining relevant features such as age, blood pressure, cholesterol levels, and a history of cardiovascular disease.
Source: Adapted from International Organization for Standardization. (2020). Software and systems engineering — Software testing — Part 11: Guidelines on the testing of AI-based systems (ISO/IEC TR 29119-11:2020). https://www.iso.org/standard/79016.html
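The “risk score” example above for feature engineering could look something like the following sketch. The field names, thresholds, and point values are hypothetical and chosen only to show how several raw variables can be combined into one engineered feature; they are not clinically validated and do not come from the cited sources.

```python
def engineer_risk_score(patient):
    """Combine raw patient fields into a single illustrative feature.

    All thresholds and point values below are assumptions for demonstration only.
    """
    score = 0
    if patient["age"] >= 65:
        score += 1
    if patient["systolic_bp"] >= 140:
        score += 1
    if patient["total_cholesterol"] >= 240:
        score += 1
    if patient["prior_cardiovascular_disease"]:
        score += 2
    return score


# Hypothetical raw records.
patients = [
    {"age": 70, "systolic_bp": 150, "total_cholesterol": 200, "prior_cardiovascular_disease": True},
    {"age": 40, "systolic_bp": 120, "total_cholesterol": 180, "prior_cardiovascular_disease": False},
]

# Add the engineered feature alongside the raw fields so a model could use both.
for patient in patients:
    patient["risk_score"] = engineer_risk_score(patient)
    print(patient["risk_score"])
```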
A decentralized approach to training ML models. Models are trained by each site on data that are kept locally, and model updates are sent to a central server, whereby the central server aggregates these updates to improve a global model. This method is designed to preserve data privacy, as raw data remain at the local sites and are not centralized.
For example, federated learning can allow hospitals to collaborate on a heart disease prediction model without sharing patient data. The model is sent to be trained locally at each hospital, and only the model updates from each hospital, not raw data, are sent back and aggregated. This way, individual patient information remains localized, addressing privacy concerns while still benefiting from a collectively improved model.
Source: Adapted from:
- Darzidehkalani, E., Ghasemi-rad, M., & van Ooijen, P. M. A. (2022). Federated Learning in Medical Imaging: Part I: Toward Multicentral Health Care Ecosystems. Journal of the American College of Radiology, 19(8), 969–974. https://doi.org/10.1016/j.jacr.2022.03.015
- IEEE Standards. (2022). IEEE Standard for Performance and Safety Evaluation of Artificial Intelligence Based Medical Devices: Terminology (IEEE Std 2802™‐2022). https://standards.ieee.org/ieee/2802/7460/
Synonym(s): Federated Machine Learning
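The train-locally, aggregate-centrally loop described above for federated learning can be sketched with a one-parameter model. This is a simplified illustration that assumes a sample-size-weighted average of local updates (in the style of federated averaging); the hospital names, data points, learning rate, and number of rounds are all hypothetical. Only the updated weight and a sample count leave each site; the raw (x, y) pairs do not.

```python
def local_update(weight, local_data, learning_rate=0.1):
    """One gradient-descent step on mean squared error for y ~ weight * x, using local data only."""
    gradient = sum(2 * (weight * x - y) * x for x, y in local_data) / len(local_data)
    return weight - learning_rate * gradient


def federated_average(updates):
    """Average local weights, weighting each site by its number of samples."""
    total_samples = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total_samples


# Hypothetical local datasets that never leave their sites.
hospital_data = {
    "hospital_a": [(1.0, 2.0), (2.0, 4.0)],
    "hospital_b": [(1.0, 2.2), (3.0, 5.8), (2.0, 4.1)],
}

global_weight = 0.0
for _ in range(20):
    # Each site starts from the current global model and returns only (weight, sample count).
    updates = [
        (local_update(global_weight, data), len(data))
        for data in hospital_data.values()
    ]
    global_weight = federated_average(updates)

print(round(global_weight, 4))
```

With a single local step per round, this weighted average matches a gradient step on the pooled data, which is why the global weight settles at the same value a centralized least-squares fit would give.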
AI models trained using large, typically unlabeled datasets and significant computational resources, that are applicable across a wide range of contexts, including some that the models were not specifically developed and trained for (i.e., emergent capabilities). These models can serve as a foundation upon which further models can be built and adapted for specific uses through further training (i.e., fine-tuning). These models can perform a range of general tasks, such as text synthesis, image manipulation, and audio generation. These models are based on deep learning architectures like transformers and can use either unimodal or multimodal input data.
Source: Adapted from:
- Jones, E. (2023). Explainer: What is a foundation model? Ada Lovelace Institute. https://www.adalovelaceinstitute.org/resource/foundation-models-explainer/
- Wornow, M., Xu, Y., Thapa, R., Patel, B., Steinberg, E., Fleming, S., Pfeffer, M.A., Fries, J., & Shah, N. H. (2023). The shaky foundations of large language models and foundation models for electronic health records. Npj Digital Medicine, 6(1). https://doi.org/10.1038/s41746-023-00879-8
Related Term(s): Generative Artificial Intelligence, Large Language Model
G
A deep learning-based model architecture that normally consists of two competing neural networks: a generator and a discriminator. The goal of the “generator” is to synthesize fake data to fool the “discriminator”, while the “discriminator” tries to discriminate between the synthesized examples (generator’s output) and the original training data distribution. The goal of the training is to find a point of equilibrium between the two competing networks, and after the training process, the generator learns to generate new data with the same distribution as the training set. This approach can be used to generate synthetic images.
Source: Adapted from:
- Pan, Z., Yu, W., Yi, X., Khan, A., Yuan, F., & Zheng, Y. (2019). Recent Progress on Generative Adversarial Networks (GANs): A Survey. IEEE Access, 7, 36322–36333. https://doi.org/10.1109/access.2019.2905015
- Singh, N. K., & Raza, K. (2021). Medical Image Generation Using Generative Adversarial Networks: A Review. In Patgiri, R., Biswas, A., Roy, P. (Eds.), Health Informatics: A Computational Perspective in Healthcare. Studies in Computational Intelligence (pp. 77–96). Springer, Singapore. https://doi.org/10.1007/978-981-15-9735-0_5
Related Term(s): Deep Learning, Generative Artificial Intelligence, Synthetic Data
“The class of AI models that emulate the structure and characteristics of input data in order to generate derived synthetic content. This can include images, videos, audio, text, and other digital content.” (Source: E.O. 14110)
This is usually done by approximating the statistical distribution of the input data. For example, in healthcare, generative AI can be used to generate annotations on synthetic medical data (e.g., image features, text labels) to help expand datasets for training algorithms.
Source: Adapted from:
- Meskó, B., & Topol, E. J. (2023). The imperative for regulatory oversight of large language models (or generative AI) in healthcare. Npj Digital Medicine, 6(1). https://doi.org/10.1038/s41746-023-00873-0
- Zohny, H., McMillan, J., & King, M. (2023). Ethics of generative AI. Journal of Medical Ethics, 49(2), 79–80. https://doi.org/10.1136/jme-2023-108909
Related Term(s): Foundation Models, Large Language Model, Multimodal
H
An approach where humans interact with ML models to enhance accuracy and end-user trust in the machine. In human in the loop ML, human interaction is iterative and can lead to continuous performance improvement over time. This interaction is especially relevant in scenarios where the model might be uncertain about its predictions and needs human guidance for verification.
Unlike human in the loop ML, supervised machine learning primarily involves human input during the data labeling phase, after which the algorithm trains independently. Labeling or annotation is the process of attaching descriptive information to data. Data itself are unchanged in the annotation process.
Source: Adapted from Mosqueira-Rey, E., Hernández-Pereira, E., Alonso-Ríos, D., Bobes-Bascarán, J., & Fernández-Leal, Á. (2022). Human-in-the-loop machine learning: a state of the art. Artificial Intelligence Review, 56(4), 3005–3054. https://doi.org/10.1007/s10462-022-10246-w
Related Term(s): Assistive Artificial Intelligence, Supervised Machine Learning
I
“Devices that have at least one transducer (sensor or actuator) for interacting directly with the physical world and at least one network interface (e.g., Ethernet, Wi-Fi, Bluetooth) for interfacing with the digital world.” (Source: NIST)
For example, in healthcare, IoT devices can include wearable devices like smartwatches that can collect vital signs data, such as heart rate and activity levels, and smart inhalers with sensors that can track medication usage for asthma patients. This data can be sent to a mobile app or a central system and transmitted to healthcare providers to enable remote monitoring of patients.
Source: Adapted from Islam, S. M. R., Kwak, D., Kabir, M. H., Hossain, M., & Kwak, K. (2015). The Internet of Things for Health Care: A Comprehensive Survey. IEEE Access, 3, 678–708. https://doi.org/10.1109/access.2015.2437951
The ability to communicate and exchange data accurately, effectively, securely, and consistently with different information technology systems, software applications, and networks in various settings, and exchange data such that clinical or operational purpose and meaning of the data are preserved and unaltered.
Source: From U.S. Food and Drug Administration. (2024). Study Data Technical Conformance Guide- Technical Specifications Document. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/study-data-technical-conformance-guide-technical-specifications-document (This Technical Specifications Document is incorporated by reference into the Guidance for Industry Providing Regulatory Submissions in Electronic Format – Standardized Study Data. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/providing-regulatory-submissions-electronic-format-standardized-study-data)
Refers to the meaning of AI systems’ output in the context of their designed functional purposes.
Source: From National Institute of Standards and Technology. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf.
Related Term(s): Explainability
L
A type of AI model trained on large text datasets to learn the relationships between words in natural language. These models can apply these learned patterns to predict and generate natural language responses to a wide range of inputs or prompts they receive, to conduct tasks like translation, summarization, and question answering. These models are characterized by a vast number of model parameters (i.e., internal learned variables within a trained model).
LLMs build on foundational AI models by developing more comprehensive language understanding beyond basic linguistic patterns. For example, in the context of LLMs, a chatbot is a program that enables communication between the LLM and the human through text or voice commands in a way that mimics human-to-human conversation.
Source: Adapted from:
- Gottlieb, S., & Silvis, L. (2023). How to Safely Integrate Large Language Models Into Health Care. JAMA Health Forum, 4(9), e233909. https://doi.org/10.1001/jamahealthforum.2023.3909
- Thirunavukarasu, A. J., Ting, D. S. J., Elangovan, K., Gutiérrez, L., Tan, T. F., & Ting, D. S. W. (2023). Large language models in medicine. Nature Medicine, 29(8), 1930–1940. https://doi.org/10.1038/s41591-023-02448-8
Related Term(s): Foundation Models, Generative Artificial Intelligence
A model that provides the same output each time the same input is applied to it and does not change with use, as its parameters or configuration cannot be updated. In the case of AI-enabled medical products, locked models can help ensure consistent performance.
Source: Adapted from Benjamens, S., Dhunnoo, P., & Meskó, B. (2020). The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. Npj Digital Medicine, 3(1). https://doi.org/10.1038/s41746-020-00324-0
Related Term(s): Continual Machine Learning
M
A set of techniques that can be used to train AI algorithms to improve performance at a task based on data.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
Step-by-step procedures or set of instructions followed for performing a task or solving a problem.
For example, in ML, algorithms are used to train models using data to solve a specific problem.
Source: Adapted from El Naqa, I., Murphy, M.J. (2015). What Is Machine Learning? In: El Naqa, I., Li, R., Murphy, M. (eds) Machine Learning in Radiation Oncology. Springer, Cham. https://doi.org/10.1007/978-3-319-18305-3_1
Related Term(s): Machine Learning Model
The term “bias” is used in various contexts in different fields and industries. In the context of AI, bias refers to the systematic deviation in model predictions or outcomes for certain data points or groups compared to others. Here we are focusing on algorithmic bias, where such deviations can stem from various sources, such as the characteristics of the training dataset, choices made during model development, data processing irregularities, or biases introduced during data collection or from human decisions. Algorithmic bias can lead to a systematic difference or error in treatment of certain objects, people, or groups in comparison to others, or prediction failures that can result in other risks, where treatment is any kind of action, including perception, observation, representation, prediction, or decision.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
A mathematical construct that generates an inference or prediction for input data. This model is the result of an ML algorithm learning from data. Models are trained by algorithms, which are step-by-step procedures used to process data and derive results. AI systems (e.g., AI-enabled medical devices) employ one or more models to achieve their intended purpose.
Source: Adapted from:
- International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
Related Term(s): Machine Learning Algorithm
The process of adjusting predicted probabilities generated by an ML model to ensure that they accurately reflect the observed frequencies of events or outcomes in the real world.
For example, if a model is well calibrated and predicts 20% probability of breast cancer for a patient, then the observed frequency of breast cancer should be approximately 20 out of 100 patients that were given such a prediction by the model.
Source: Adapted from Chen, W., Sahiner, B., Samuelson, F., Pezeshk, A., & Petrick, N. (2018) Calibration of medical diagnostic classifier scores to the probability of disease. Statistical Methods in Medical Research, 27(5). https://doi.org/10.1177/0962280216661371
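The 20-out-of-100 example above for model calibration suggests a simple check: group cases by predicted probability and compare the average prediction in each group with the observed event rate. The sketch below shows that check only (it does not perform the adjustment itself). The function name, the five equal-width bins, and the example predictions and outcomes are assumptions for illustration.

```python
def calibration_table(probabilities, outcomes, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, outcomes):
        # Equal-width bins over [0, 1]; a probability of exactly 1.0 goes in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        table.append((mean_predicted, observed_rate, len(members)))
    return table


# Hypothetical predictions: 100 patients given 20% risk and 50 patients given 90% risk.
probabilities = [0.2] * 100 + [0.9] * 50
# Hypothetical outcomes (1 = disease present): 20 of the first group, 30 of the second.
outcomes = [1] * 20 + [0] * 80 + [1] * 30 + [0] * 20

table = calibration_table(probabilities, outcomes)
for mean_predicted, observed_rate, count in table:
    print(f"predicted {mean_predicted:.2f}  observed {observed_rate:.2f}  n={count}")
```

In this made-up data, the 20% group is well calibrated, while the 90% group is overconfident because only 60% of those patients had the outcome.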
A structured report of relevant technical characteristics of an AI model and benchmark evaluation results in a variety of conditions, such as across different cultural, demographic, or phenotypic groups and intersectional groups that are relevant to the intended application domains. Model cards also provide information about the context in which models are intended to be used and details of how their performance was assessed.
Source: Adapted from Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I.D., & Gebru, T. (2019). Model Cards for Model Reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). Association for Computing Machinery. https://doi.org/10.1145/3287560.3287596
Related Term(s): Data Card
The process of training an ML model to capture underlying patterns in the data by adjusting the training parameters to make the model’s predictions as close as possible to the target values in the training data.
This adjustment of the parameters enables the model to generalize its understanding of the data, making it useful for making predictions on new, unseen data. A well-fit model does not overfit or underfit but performs well both on the training data and on new, unseen data, due to correctly capturing the relationships between the input and target variables.
Source: Adapted from Géron, A. (2019). Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd ed.). O'Reilly Media, Inc. https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/
Related Term(s): Overfitting, Underfitting
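The contrast between an underfit, a well-fit, and an overfit model described above can be seen by comparing error on the training data with error on new data. In this minimal sketch, the data points and the three simple models are hypothetical choices made for illustration: a constant predictor stands in for underfitting, a least-squares line for a good fit, and a nearest-neighbor "memorizer" for overfitting.

```python
def mean_squared_error(model, data):
    """Average squared difference between model predictions and target values."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def fit_constant(train):
    """Underfit example: always predict the mean training target."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Ordinary least-squares line through the training points."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x, _ in train))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(train):
    """Overfit example: return the target of the nearest training point, noise included."""
    return lambda x: min(train, key=lambda point: abs(point[0] - x))[1]


# Hypothetical data: targets are roughly 2 * x, with small noise in the training set.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
test = [(1.4, 2.8), (2.6, 5.2), (3.4, 6.8), (4.6, 9.2)]

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train)
    results[name] = (mean_squared_error(model, train), mean_squared_error(model, test))
    print(f"{name}: train MSE = {results[name][0]:.3f}, test MSE = {results[name][1]:.3f}")
```

The memorizer has zero training error but a noticeably higher test error than the line, while the constant model is poor on both sets.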
The ability of an ML model to maintain its target or specified level of performance under different circumstances. These circumstances can include noisy data (e.g., data containing errors, inconsistencies, and missing values), unseen data or data drift, or adversarial attacks that manipulate the data to deceive the model.
For example, in healthcare, challenges in model robustness can arise in medical image classification, where variations in imaging conditions like lighting or resolution can affect the performance of a tumor classification model trained on standardized images.
Source: Adapted from:
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
- Li, K., DeCost, B., Choudhary, K., Greenwood, M., & Hattrick‐Simpers, J. (2023). A critical examination of robustness and generalizability of machine learning prediction of materials properties. Npj Computational Materials, 9(1). https://doi.org/10.1038/s41524-023-01012-9
Related Term(s): Data Drift
A numerical parameter within an AI model that helps determine the model’s outputs in response to inputs.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
An approach for processing and integrating multiple different data types, aiming to capture and leverage the relationships between them for a better understanding of the input information or improved prediction performance. These data types may include text, images, audio, video, genomics, sensor data, etc. These different data types may be processed using a single multimodal network (e.g., based on neural networks or other architectures) or through separate unimodal networks (e.g., LLMs for text and CNNs for images) whose outputs are then combined.
For example, in healthcare, data from electronic health records and wearable biosensors can be combined to enable remote monitoring of patients.
Source: Adapted from:
- Acosta, J.N., Falcone, G. J., Rajpurkar, P., & Topol, E. J. (2022). Multimodal biomedical AI. Nature Medicine, 28(9), 1773–1784. https://doi.org/10.1038/s41591-022-01981-2
- Kline, A., Wang, H., Li, Y., Dennis, S., Hutch, M., Xu, Z., Wang, F., Cheng, F., & Luo, Y. (2022). Multimodal machine learning in precision health: A scoping review. Npj Digital Medicine, 5(1). https://doi.org/10.1038/s41746-022-00712-8
Synonym(s): Multimodal Approach, Multimodal Learning
N
A subfield of AI and linguistics that enables computers to understand, process, interpret, and generate human language. NLP systems can perform tasks such as text classification, sentiment analysis, and translation, using techniques from computational linguistics and ML to process and analyze natural language data.
Natural Language Generation is one application of NLP, which involves using AI systems to produce human-readable text outputs like summaries, reports, stories, or responses.
Source: Adapted from:
- International Organization for Standardization. (2022). Information technology — Artificial intelligence — Artificial intelligence concepts and terminology (ISO/IEC 22989:2022). https://www.iso.org/standard/74296.html
- Network of the National Library of Medicine. (n.d.). Natural language processing. National Institutes of Health. Retrieved July 31, 2024, from https://www.nnlm.gov/guides/data-glossary/natural-language-processing
A computational model inspired by the structure of the human brain. It is composed of interconnected nodes, or “neurons,” organized into layers: an input layer that receives data, one or more hidden layers that process and identify patterns in the data, and an output layer that presents the final network output.
Source: Adapted from:
- International Organization for Standardization. (2020). Software and systems engineering — Software testing — Part 11: Guidelines on the testing of AI-based systems (ISO/IEC TR 29119-11: 2020). https://www.iso.org/standard/79016.html
- Network of the National Library of Medicine. (n.d.). Neural Networks. National Institutes of Health. Retrieved July 31, 2024, from https://www.nnlm.gov/guides/data-glossary/neural-networks
Synonym(s): Artificial Neural Network, Neural Net
Related Term(s): Deep Learning
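The layered structure described above can be sketched as a single forward pass through a very small network. This is an illustration only: the weights and biases are hand-picked, hypothetical numbers rather than trained values, and the sigmoid activation is one common choice among many.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One fully connected layer.

    Each neuron computes a weighted sum of its inputs, adds a bias,
    and passes the result through the sigmoid activation.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
HIDDEN_WEIGHTS = [[0.5, -0.5], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -1.0]]
OUTPUT_BIASES = [0.0]


def forward(inputs):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense_layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)


output = forward([1.0, 1.0])
```

Training a network means adjusting the numbers in the weight and bias lists; this sketch only shows how fixed values turn an input into an output.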
O
In ML, overfitting occurs when a model learns the training data too thoroughly, capturing not just the fundamental patterns but also noise or random fluctuations. Such a model might excel on the training data but struggle to generalize to new, unseen data.
Source: Adapted from IEEE Standards. (2022). IEEE Standard for Performance and Safety Evaluation of Artificial Intelligence Based Medical Devices: Terminology (IEEE Std 2802™‐2022). https://standards.ieee.org/ieee/2802/7460/
Related Term(s): Model Fitting, Underfitting
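The contrast described above can be illustrated with a toy comparison. The sketch below is an assumption-laden example, not a benchmark: the data are invented, two training labels are deliberately flipped to act as noise, and the three models (a 1-nearest-neighbor "memorizer", a single learned threshold, and a constant predictor standing in for an underfit model) are hypothetical choices.

```python
# Hypothetical toy data: the true rule is label = 1 when x >= 5.
# The training labels at x = 2 and x = 7 are flipped to simulate noise.
TRAIN_X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
TRAIN_Y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
# Unseen test points, labeled with the true (noise-free) rule.
TEST_X = [0.4, 1.6, 2.4, 3.5, 4.4, 5.6, 6.6, 7.4, 8.5, 9.4]
TEST_Y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def accuracy(predict, xs, ys):
    """Fraction of points for which the prediction matches the label."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)


def memorizer(x):
    """1-nearest-neighbor: copy the label of the closest training point, noise included."""
    nearest = min(range(len(TRAIN_X)), key=lambda i: abs(TRAIN_X[i] - x))
    return TRAIN_Y[nearest]


def fit_threshold(xs, ys):
    """Pick the single cut-off that makes the fewest training errors."""
    def errors(t):
        return sum((x >= t) != bool(y) for x, y in zip(xs, ys))
    return min(xs, key=errors)


THRESHOLD = fit_threshold(TRAIN_X, TRAIN_Y)


def threshold_model(x):
    """Predict 1 at or above the learned cut-off, otherwise 0."""
    return int(x >= THRESHOLD)


def constant_model(x):
    """Ignore the input entirely; a stand-in for an underfit model."""
    return 0


# Map each model name to (training accuracy, test accuracy).
results = {
    name: (accuracy(model, TRAIN_X, TRAIN_Y), accuracy(model, TEST_X, TEST_Y))
    for name, model in [
        ("memorizer", memorizer),
        ("threshold", threshold_model),
        ("constant", constant_model),
    ]
}
```

In this toy setup the memorizer is perfect on the training data but worse on the test data because it reproduces the flipped labels, the threshold model tolerates the noise and generalizes, and the constant model is poor on both sets.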
P
In the context of AI, quantitative or qualitative measures that can be used to assess the ability of a model to produce the desired output for a given task. The choice of metrics depends on the specific task and the model objectives.
Examples of quantitative metrics include accuracy, precision, sensitivity (recall), specificity, F1-score, and Area under the Receiver Operating Characteristic curve (AUC-ROC). Qualitative measures may involve heatmap evaluations or visual interpretations. These metrics enable systematic evaluation, comparison, and refinement of models, and aid in the assessment of whether the model meets its intended objectives.
Source: Adapted from International Organization for Standardization. (2020). Software and systems engineering — Software testing — Part 11: Guidelines on the testing of AI-based systems (ISO/IEC TR 29119-11: 2020). https://www.iso.org/standard/79016.html
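Several of the quantitative metrics listed above can be computed directly from the counts of true and false positives and negatives. The sketch below assumes binary labels (1 = positive, 0 = negative) and uses invented predictions; the function names are hypothetical, and AUC-ROC is omitted because it requires ranked scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Compute common metrics.

    Assumes both classes occur in y_true and at least one positive is
    predicted; otherwise some denominators would be zero.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical reference labels and model predictions for ten cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

scores = metrics(y_true, y_pred)
```

Here there are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, so accuracy is 8/10 while specificity is 5/6; which of these matters most depends on the task.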
Any software or hardware solution, technical process, technique, or other technological means of mitigating privacy risks arising from data processing, including by enhancing predictability, manageability, disassociability, storage, security, and confidentiality. These technological means may include secure multiparty computation, homomorphic encryption, zero-knowledge proofs, federated learning, secure enclaves, differential privacy, and synthetic-data-generation tools.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
Synonym(s): Privacy-preserving technology
R
The best available method for establishing or measuring the true state or property of the phenomenon being examined, often represented in the form of labeled data in AI. It serves as a benchmark against which the outputs of a model are evaluated. In clinical settings and medical research, a reference standard is a diagnostic measure or method that is considered to be the gold standard clinically and is used to validate the results. For instance, a reference standard can indicate the presence, extent, and location of diseases or abnormalities.
Labeling or annotation is the process of attaching descriptive information to data. The data themselves are unchanged in the annotation process.
Source: Adapted from Bossuyt, P. M., Reitsma, J. B., Bruns, D. E., Gatsonis, C.A., Glasziou, P.P., Irwig, L., Lijmer, J. G., Moher, D., Rennie, D., de Vet, H. C.W., Kressel, H. Y., Rifai, N., Golub, R. M., Altman, D. G., Hooft, L., Korevaar, D. A., & Cohen, J. F., for the STARD Group. (2015). STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ. https://doi.org/10.1136/bmj.h5527
Synonym(s): Gold Standard, Ground Truth
An ML approach where a model (or agent) learns by taking actions and getting rewards or penalties through its interactions with an environment. The model learns from the consequences of its actions, rather than from being explicitly taught, and selects its actions based on its past experiences (exploitation) and by making new choices (exploration), which is essentially trial-and-error learning.
For example, in healthcare, reinforcement learning can be used for recommending personalized treatment plans for patients with chronic diseases. The model is given patient data, including their medical history, current health status, and treatment responses, and then suggests a treatment plan. The key is the feedback loop: as patient data is continually updated with information on how well they are responding to the treatment, the model adjusts its recommendations accordingly. This process involves a lot of trial and error, as the model learns from each patient interaction. Over time, through many such interactions, the model becomes more adept at predicting and recommending the most effective treatment plans for individual patients.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Related Term(s): Self-Supervised Machine Learning, Semi-Supervised Machine Learning, Supervised Machine Learning, Unsupervised Machine Learning
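The balance between exploitation and exploration described above can be sketched with an epsilon-greedy agent on a multi-armed bandit, one of the simplest reinforcement learning settings. This is a toy under stated assumptions: the three actions, their success probabilities, the value of epsilon, and the function name `run_bandit` are hypothetical, and the setup is far simpler than any clinical decision problem.

```python
import random


def run_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)    # times each action was taken
    values = [0.0] * len(success_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a randomly chosen action.
            action = rng.randrange(len(success_probs))
        else:
            # Exploitation: take the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)
        # The environment returns a reward of 1 or 0.
        reward = 1 if rng.random() < success_probs[action] else 0
        # Update the running average for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


# Hypothetical environment: the third action succeeds most often.
values, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(values)), key=values.__getitem__)
```

The agent is never told which action is best; it infers this from the rewards it receives, which is the feedback loop the definition refers to.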
S
ML algorithms that generate their own labels from the available unlabeled data. Unlike supervised learning, where labeled data are provided, and unsupervised learning, which uncovers hidden patterns without labels, self-supervised learning leverages the inherent structure within the data to create its own labels. This approach is useful when labeled data are limited or unavailable.
Source: Adapted from:
- Krishnan, R., Rajpurkar, P., & Topol, E. J. (2022). Self-supervised learning in medicine and healthcare. Nature Biomedical Engineering, 6(12), 1346–1352. https://doi.org/10.1038/s41551-022-00914-1
- Shurrab, S., & Duwairi, R. (2022). Self-supervised learning methods and applications in medical imaging analysis: a survey. PeerJ Computer Science, 8, e1045. https://doi.org/10.7717/peerj-cs.1045
Synonym(s): Self-Supervised Learning
Related Term(s): Semi-Supervised Machine Learning, Supervised Machine Learning, Unsupervised Machine Learning
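One way to see how a model can create its own labels from unlabeled data is a masked-token pretext task. The sketch below only builds the training pairs and does not train a model; the mask symbol, the function name, and the example sentence are hypothetical, and masking is just one of several pretext tasks used in practice.

```python
MASK = "[MASK]"


def make_masked_examples(tokens):
    """Turn one unlabeled token sequence into (input, label) pairs.

    Each token is hidden in turn; the hidden token becomes the label,
    so no human annotation is needed.
    """
    examples = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


# Hypothetical unlabeled text, already split into tokens.
unlabeled_note = ["patient", "reports", "mild", "chest", "pain"]
examples = make_masked_examples(unlabeled_note)
```

A model trained to fill in the masked positions has to learn regularities of the data itself, which can later be reused for tasks where labeled data are scarce.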
ML algorithms that leverage both unsupervised and supervised techniques. Supervised learning techniques are trained using labeled data, while unsupervised learning techniques are trained using unlabeled data. Labeling or annotation is the process of attaching descriptive information to data. The data themselves are unchanged in the annotation process.
For example, consider the task of diagnosing lung diseases from chest X-rays. A semi-supervised learning model would initially be trained on a small set of labeled X-ray images, where each image has been marked by radiologists as showing signs of specific lung conditions or being normal. The model then uses this knowledge to start making predictions on a larger set of unlabeled images.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Synonym(s): Semi-Supervised Learning
Related Term(s): Self-Supervised Machine Learning, Supervised Machine Learning, Unsupervised Machine Learning
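The workflow in the chest X-ray example above resembles pseudo-labeling (also called self-training), which is one common semi-supervised technique and not necessarily the only one the definition covers. The sketch below assumes a single numeric feature per case and a nearest-centroid classifier; the feature values, labels, and function names are invented for illustration.

```python
def centroids(points, labels):
    """Mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(class_centroids, x):
    """Assign the class whose centroid is closest to x."""
    return min(class_centroids, key=lambda y: abs(class_centroids[y] - x))


# Hypothetical data: a small labeled set and a larger unlabeled set.
labeled_x = [1.0, 2.0, 8.0, 9.0]
labeled_y = ["normal", "normal", "abnormal", "abnormal"]
unlabeled_x = [0.5, 1.5, 3.0, 7.0, 8.5, 9.5]

# Step 1: train on the labeled data only.
initial = centroids(labeled_x, labeled_y)

# Step 2: use that model to assign pseudo-labels to the unlabeled data.
pseudo_y = [predict(initial, x) for x in unlabeled_x]

# Step 3: retrain on the labeled and pseudo-labeled data together.
final = centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)
```

Practical pipelines typically keep only pseudo-labels the model is confident about, since wrong pseudo-labels can reinforce early mistakes; that filtering step is left out here for brevity.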
ML algorithms where labeled data are provided, and algorithms are trained using the labeled data. Labeling or annotation is the process of attaching descriptive information to data. The data themselves are unchanged in the annotation process.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Synonym(s): Supervised Learning
Related Term(s): Self-Supervised Machine Learning, Semi-Supervised Machine Learning, Unsupervised Machine Learning
Data that have been created artificially (e.g., through statistical modeling, computer simulation) so that new values and/or data elements are generated. Generally, synthetic data are intended to represent the structure, properties and relationships seen in actual patient data, except that they do not contain any real or specific information about individuals.
For example, in healthcare, synthetic data are intended to mimic the properties and relationships seen in real patient data. They are examples that have been partially or fully generated using computational techniques rather than acquired from a human subject by a physical system.
Source: Adapted from:
- Chen, R. J., Lu, M.Y., Chen, T.Y., Williamson, D. F. K., & Mahmood, F. (2021). Synthetic data in machine learning for medicine and healthcare. Nature Biomedical Engineering, 5(6), 493–497. https://doi.org/10.1038/s41551-021-00751-8
- Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
- Giuffré, M., & Shung, D.L. (2023). Harnessing the power of synthetic data in healthcare: innovation, application, and privacy. Npj Digital Medicine, 6(1). https://doi.org/10.1038/s41746-023-00927-3
- Myles, P., Ordish, J., & Tucker, A. (2023). The potential synergies between synthetic data and in silico trials in relation to generating representative virtual population cohorts. Progress in Biomedical Engineering, 5(1), 013001. https://doi.org/10.1088/2516-1091/acafbf
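As a very simple instance of generating synthetic data through statistical modeling, the sketch below fits a normal distribution to a handful of measurements and then samples new values from it. Everything here is an assumption made for illustration: the readings are invented rather than real patient data, a single Gaussian is a crude model, and the function names are hypothetical.

```python
import random
import statistics


def fit_gaussian(values):
    """Summarize the data by its sample mean and standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(mean, stdev, n, seed=0):
    """Draw n new values from the fitted normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical systolic blood pressure readings in mmHg (invented values).
source_values = [118.0, 125.0, 132.0, 110.0, 140.0, 128.0, 121.0, 135.0]

mean, stdev = fit_gaussian(source_values)
synthetic_values = sample_synthetic(mean, stdev, n=1000)
```

The synthetic values follow the overall distribution of the source data without copying individual records, although sampling from a fitted model does not by itself guarantee privacy.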
T
A facility or mechanism equipped for conducting rigorous, transparent, and replicable testing of tools and technologies, including AI and privacy-enhancing technologies, to help evaluate the functionality, usability, and performance of those tools or technologies.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
These data are used to characterize the performance of an AI system. These data are never shown to the algorithm during training and are used to estimate the AI model’s performance after training. Testing is conducted to generate evidence to establish the performance of an AI system before the system is deployed or marketed.
For AI-enabled medical products, test data should be independent of data used for training and tuning.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Related Term(s): Training Data, Tuning Data
These data are used by the manufacturer of an AI system in procedures and training algorithms to build an AI model, including to define model weights, connections, and components.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Related Term(s): Test Data, Tuning Data
A strategic approach within ML wherein a model developed for a particular task is adapted for a second task. This approach leverages the knowledge and patterns acquired from a previously solved problem (source task) to boost the performance and learning efficiency of a model on a subsequent, often similar, problem (target task).
For example, in healthcare, a model trained to identify tumors in lung X-ray images might leverage the learned patterns to improve the identification of abnormalities in liver ultrasound images.
Source: Adapted from Yu, X., Wang, J., Hong, Q., Teku, R., Wang, S., & Zhang, Y. (2022). Transfer learning for medical images analyses: A survey. Neurocomputing, 489, 230–254. https://doi.org/10.1016/j.neucom.2021.08.159
These data are typically used by the manufacturer of an AI system to evaluate a small number of trained models. This process involves exploring various aspects, including different architectures or hyperparameters (i.e., parameters used to tune the model for the task). The tuning phase happens before the testing phase of the AI system and is part of the training process. While the AI and ML communities sometimes use the term “validation” to refer to the tuning data and phase, the FDA will not typically use the word “validation” in this context due to its specific regulatory definition (see 21 CFR 820.3(z)).
Source: Adapted from IEEE Standards. (2022). IEEE Standard for Performance and Safety Evaluation of Artificial Intelligence Based Medical Devices: Terminology (IEEE Std 2802™‐2022). https://standards.ieee.org/ieee/2802/7460/
Related Term(s): Test Data, Training Data
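The separation between training, tuning, and test data described in the entries above can be sketched as a three-way split. This is a minimal example under stated assumptions: the 70/15/15 proportions, the function name `split_dataset`, and the patient identifiers are hypothetical, and splitting by patient is assumed here as one way to keep the test set independent.

```python
import random


def split_dataset(records, train_fraction=0.7, tuning_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint training, tuning, and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_tuning = round(len(shuffled) * tuning_fraction)
    train = shuffled[:n_train]
    tuning = shuffled[n_train:n_train + n_tuning]
    test = shuffled[n_train + n_tuning:]  # whatever remains is held out
    return train, tuning, test


# Hypothetical patient identifiers; each patient lands in exactly one set.
patient_ids = [f"patient-{i:03d}" for i in range(100)]
train_ids, tuning_ids, test_ids = split_dataset(patient_ids)
```

Training data are used to fit the model, tuning data to compare candidate architectures or hyperparameters, and test data only once at the end to estimate performance.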
U
In ML, underfitting happens when a model does not capture the patterns and complexity of the training data, leading to poor performance on both the training and new, unseen data.
Source: Adapted from IEEE Standards. (2022). IEEE Standard for Performance and Safety Evaluation of Artificial Intelligence Based Medical Devices: Terminology (IEEE Std 2802™‐2022). https://standards.ieee.org/ieee/2802/7460/
Related Term(s): Overfitting
ML algorithms that only make use of unlabeled data during training. Unsupervised learning seeks to uncover hidden patterns or structures within the data.
Source: Adapted from International Medical Device Regulators Forum. (2022). Machine Learning-enabled Medical Devices: Key Terms and Definitions. https://www.imdrf.org/documents/machine-learning-enabled-medical-devices-key-terms-and-definitions
Synonym(s): Unsupervised Learning
Related Term(s): Self-Supervised Machine Learning, Semi-Supervised Machine Learning, Supervised Machine Learning
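A small clustering example shows how structure can be found without any labels. The sketch below uses k-means on one-dimensional data, which is one of many unsupervised techniques; the data points, the starting centers, and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Basic k-means on 1-D data: assign points to the nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else old
            for cluster, old in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements that form two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
```

No label tells the algorithm which group a point belongs to; the two clusters emerge from the distances between the points alone.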
W
The act of embedding information, which is typically difficult to remove, into outputs created by AI—including into outputs such as photos, videos, audio clips, or text—for the purposes of verifying the authenticity of the output or the identity or characteristics of its provenance, modifications, or conveyance.
Source: From Executive Order 14110. (October 30, 2023). Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
The FDA plans to routinely update the new Digital Health and Artificial Intelligence Glossary as appropriate.
Citation: U.S. Food and Drug Administration. (2024, September 26). Digital Health and Artificial Intelligence Glossary – Educational Resource. https://www.fda.gov/science-research/artificial-intelligence-and-medical-products/fda-digital-health-and-artificial-intelligence-glossary-educational-resource