Development of a preliminary patient safety classification system for generative AI.
Hose B-Z, Handley JL, Biro J, et al. Development of a preliminary patient safety classification system for generative AI. BMJ Qual Saf. 2025;34(2):130-132. doi:10.1136/bmjqs-2024-017918.
Information on the prevalence of errors in artificial intelligence (AI) applications and their impact on the healthcare system provides important guidance for the development, implementation, and use of these tools. This article describes the development of a preliminary patient safety classification system for two popular uses of generative AI in health care: patient-facing large language models (LLMs) and ambient digital scribes (ADS). Errors were prevalent in both applications, with errors of omission being the most common. Although most errors in the LLM were categorized as having low clinical significance, 25% were categorized as having high clinical significance (e.g., omissions of urgent guidance for conditions such as heart attack symptoms).