Risk-informed design of Software as a Medical Device through Natural Language Processing techniques

Authors: Luschi A., Zazzeri A., Cevenini G., Iadanza E.

Keywords

Deep learning; Failure classification; Health information technology; Medical device design; Natural Language Processing; Risk-informed design; Software as a Medical Device.

Summary

The rapid evolution of Software as a Medical Device (SaMD) for diagnostic and therapeutic applications requires evidence-based design strategies to maximise the benefit-to-risk ratio. This study aims to provide manufacturers with a framework for reducing risks by design. We also propose a novel standard classification of software failures related to Health Information Technologies (HIT), built with a Natural Language Processing (NLP) multinomial classifier, so that the entire SaMD design process can be shaped in an evidence-based, risk-aware manner.
Adverse event reports (2022–2024) were extracted from the FDA MAUDE database. HIT-related reports were identified using a binomial NLP classifier developed by the authors. A preliminary taxonomy of failure modes was derived from the literature and refined using self-supervised learning. K-modes clustering was applied to generate a balanced sample of 1048 records, which were then manually labelled and used to fine-tune the final classifier. Model performance was assessed through 10-fold cross-validation.
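The balanced-sampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the categorical fields, the deterministic initialisation (first k distinct records), the helper names, and the equal-per-cluster draw are all assumptions made for illustration. It shows the core k-modes loop (Hamming dissimilarity, per-attribute mode update) followed by drawing the same number of records from each cluster.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, k, n_iter=10):
    """Minimal k-modes over tuples of categorical values.

    Assumption: initial modes are the first k distinct records
    (a simplification; real implementations use smarter seeding).
    """
    modes = list(dict.fromkeys(records))[:k]
    if len(modes) < k:
        raise ValueError("need at least k distinct records")
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest mode by Hamming dissimilarity.
        labels = [min(range(k), key=lambda c: hamming(r, modes[c]))
                  for r in records]
        # Update step: most frequent value per attribute in each cluster.
        new_modes = []
        for c in range(k):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if not members:
                new_modes.append(modes[c])  # keep the old mode if cluster is empty
                continue
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes

def balanced_sample(records, labels, per_cluster, seed=0):
    """Draw up to per_cluster records from every cluster (hypothetical helper)."""
    rng = random.Random(seed)
    sample = []
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(records, labels) if lab == c]
        sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return sample
```

Sampling the same number of records from each cluster is one simple way to keep rare report patterns from being swamped by frequent ones before manual labelling; the actual balancing rule used in the study is not stated in this summary.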
Nine failure categories were identified. The multinomial classifier achieved cross-validated accuracies between 74.29% and 83.81%, with an F1-score of up to 0.87 for dominant classes. Underrepresented categories exhibited lower performance owing to the limited availability of training data. The classifier enables rapid identification of recurring issues, helping developers prioritise design improvements based on real-world risks.
This study demonstrates the feasibility of integrating deep learning-based failure classification into SaMD design workflows and proposes a standard classification for HIT-related software failures. By leveraging insights from historical data, manufacturers can proactively identify and mitigate potential hazards, thereby enhancing both patient safety and regulatory compliance. This proactive, data-driven approach supports the creation of safer and more reliable biomedical devices and digital health technologies.
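For readers unfamiliar with the reported metrics, the sketch below shows how per-fold accuracy and per-class F1-scores are typically computed in a k-fold evaluation. It is an illustration only: the function names and the unshuffled, contiguous fold split are assumptions, and the class labels used in any example are hypothetical rather than the nine categories of the study.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous test folds (no shuffling, an assumption).

    The first n % k folds receive one extra item.
    """
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """F1 = 2PR / (P + R) for each class, with 0.0 when undefined."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores
```

Reporting F1 per class, rather than a single pooled figure, is what makes the weaker performance on underrepresented categories visible: a class with few or no correct predictions scores near zero even when overall accuracy looks acceptable.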

For further details, see the open access publication.