By Shawn M. Schmitt
Communications Specialist, Enzyme
Data bias in medical devices that are enabled with artificial intelligence (AI) and machine learning (ML) will play a front-and-center role in an upcoming Technical Specification (TS) document from the International Organization for Standardization (ISO). ISO TS 24971-2 will serve as a companion guide to ISO TR 24971:2020, an ISO Technical Report that offers manufacturers guidance on using ISO 14971:2019, a voluntary standard that instructs MedTech companies on how to best put together a risk management program.
“Machine learning requires a model and fine-tuning of the model. And for fine-tuning, you need data that can be actual patient data or synthetic data that is created to simulate patients. And that data is used for training the algorithm – that is the machine learning aspect. When the data is not selected properly, it can lead to bias, and bias is an important aspect that we discuss in the forthcoming Technical Specification,” says Jos van Vroonhoven, a longtime Senior Manager for Standardization for Philips and convener of ISO Technical Committee 210, Joint Working Group 1 (TC210/JWG1), which developed TS 24971-2.
The roughly 40-page Technical Specification, titled “Medical devices – Guidance on the application of ISO 14971 – Part 2: Machine learning in artificial intelligence,” was strongly informed by a Technical Information Report (TIR) published in 2023 by the Association for the Advancement of Medical Instrumentation, AAMI TIR34971, “Application of ISO 14971 to machine learning in artificial intelligence.” The draft of ISO TS 24971-2 is currently under review by TC 210 and a final version will likely be published later in 2025.
“Because artificial intelligence and ML are software-driven, the unique or elevated risks are those around data management, feature extraction, algorithm training, evaluation, bias, health inequity, safety, and cyber and information security,” ISO explains on its website, noting that the forthcoming ISO TS 24971-2 document will provide examples and suggest “strategies for eliminating or mitigating the associated risk.”
When it comes to overcoming the issue of bias, the quality of data used by manufacturers is of the utmost importance, Van Vroonhoven told Enzyme in an interview.
“That data is used for learning the algorithms of the medical device so it can make proper predictions in real-world situations for actual patients,” he explained. “So the testing data needs to be of good quality. For example, when you use data for male patients only, the predictions for female patients might be incorrect. Or when you have data for younger people, or only for older people who visit hospitals more frequently, it could be incorrect for other age groups. The developers need to be aware that there can be a bias, a hidden selection in their data. Sometimes it can be intentional – for example, the data could be collected by a hospital specifically for the elderly or a hospital in a specific region – but in most cases, this unconscious bias can affect the data and thereby the outcome of your medical device that uses machine learning.”
Van Vroonhoven went on: “When you are unaware that your data quality might be insufficient, then you can have unexpected results. So it is mainly to create awareness, and there is a long list of possible things that could be in your data. ISO 14971 gives a long annex of questions that a manufacturer can ask to create awareness, to find the typical characteristics of the medical device or of the machine learning model that can affect the safety and outcome – the diagnostic value – in the end.”
Writing Standards For a Fast-Paced World
When it comes to risk management and AI/ML, Pat Baird takes the concept very seriously. A member of TC210/JWG1 and other industry groups, Baird refers to the forthcoming ISO TS 24971-2 and the already published AAMI TIR34971, the source material for TS 24971-2, as his “kids.” He concedes that it can be a daunting task to keep such widely used documents up to date when it comes to rapidly evolving topics like artificial intelligence and machine learning.
“TIR34971 was a snapshot in time,” said Baird, who is also Regulatory Head of Global Software Standards for Philips. “We’re currently debating whether or not to have Generative AI in TIR34971, because Gen AI really wasn’t around so much when we wrote that TIR.”
The ever-advancing clock of medical technology puts pressure on standards writers, Baird told Enzyme. “I’m constantly arguing about that with myself. Should these documents have a disclaimer on every other sentence that says, for example, ‘This is not a complete list’? I also debate whether it would be better to have an incomplete list or have no list at all.”
Baird doesn’t see many avenues that standards groups could take to speed up the process for high-stakes topics like AI/ML that can change on a dime. Presently, an ISO Consensus Report, or CR, appears to be the best way to get guidance out the door and into the hands of manufacturers more quickly. A CR gives industry guidance that explains what the “majority” on a particular topic agrees on when it comes to issues that are new and evolving.
Standards groups like ISO and the International Electrotechnical Commission (IEC) are using online development tools to ensure that information is centralized and digital, Philips’ Van Vroonhoven said, and he believes that has helped to hasten the standards process – but not by much.
“There still needs to be a careful review of all of the documents by national committees, and that takes time,” he said. “And there must be several rounds of commenting that requires careful consideration, and that obviously takes time. When a committee votes on a draft international standard, that’s going to take 12 weeks for commenting and a balloting period, and I don’t see that changing anytime soon.”
However, he noted an unexpected benefit of the COVID-19 pandemic: with the heightened uptake of video conferencing platforms like Zoom and Teams to conduct meetings, face-to-face get-togethers now happen much faster.
“The working groups can produce documents faster using those electronic means,” Van Vroonhoven said. “Before that, we had to physically meet or spend a lot of time writing lengthy emails. So what might happen in the future as far as standards timing is anyone’s guess.”



