Good machine learning practice for medical device development (fda.gov)
72 points by tomrod on Nov 20, 2022 | 11 comments



Good wishlist, but how do you measure risk in a trained ML algorithm? What do reliability and stability mean? How do you measure bias? How much accuracy in results is enough for a specific purpose? These are all still research questions. Requiring diversity in training data inputs helps, but it is no guarantee of a fair or correct outcome, and there will likely always be weird edge cases where the algorithms go off the rails. Do you need to include Zulu people, Inuit people, and Hawaiian people in large proportions in your training data? Maybe you need to train a different algorithm for each comorbidity a patient might present with? And all the combinations of those comorbidities? Does your training data have to include all the environmental factors that might affect the outcome? "Personalized medicine" recognizes that all people are unique and different in medically important ways, yet machine learning has a basic assumption that all people are the same and you can just dump data from everyone into the training and the algorithm will figure it out...somehow. This is unproven and likely just plain wrong.
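
For what it's worth, "how do you measure bias" does have at least one crude but widely used partial answer: stratified performance reporting across demographic subgroups. A minimal sketch in Python (all names here are illustrative, not from the FDA guidance):

    import numpy as np

    def subgroup_metrics(y_true, y_pred, groups):
        # Per-subgroup sensitivity/specificity for binary labels.
        # y_true, y_pred, groups: 1-D numpy arrays of equal length.
        results = {}
        for g in np.unique(groups):
            m = groups == g
            tp = np.sum((y_true[m] == 1) & (y_pred[m] == 1))
            fn = np.sum((y_true[m] == 1) & (y_pred[m] == 0))
            tn = np.sum((y_true[m] == 0) & (y_pred[m] == 0))
            fp = np.sum((y_true[m] == 0) & (y_pred[m] == 1))
            results[g] = {
                "n": int(m.sum()),
                "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
                "specificity": tn / (tn + fp) if tn + fp else float("nan"),
            }
        return results

A gap between the best- and worst-performing subgroups is a signal, not a guarantee; it says nothing about the weird edge cases the parent worries about.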


I think those are generally good questions, but the majority of submissions that FDA CDRH is reviewing are for imaging (ultrasound, MRI, X-ray, etc.), with a sprinkling of audio AI and maybe some individualized vaccine predictions at CDER.

These questions are generally addressed during the pre-sub for each FDA submission, but I agree that the demographic information requested could be effectively endless. I'm working on at least 5 of these AI submissions per week.

For example, let's say you have an AI submission for identifying cardiac ultrasound images. FDA will ask for very specific demographic information in the training and performance tests, as well as comorbidities (such as hypertension) for each training image. In addition, they will want at least three physicians to annotate the images. The training dataset is likely to contain at least 100-200K images.
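
To make the three-annotator setup concrete, the usual pattern is majority-vote consensus labels plus an agreement statistic such as Fleiss' kappa, with a documented adjudication step for disagreements. A minimal sketch (the kappa formula is standard; the tiny example data is made up):

    import numpy as np

    def fleiss_kappa(ratings):
        # ratings[i, j] = number of annotators assigning image i to category j;
        # assumes the same number of annotators rated every image.
        n_items = ratings.shape[0]
        n_raters = ratings[0].sum()
        p_j = ratings.sum(axis=0) / (n_items * n_raters)   # category prevalence
        P_i = (np.sum(ratings**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
        P_e = np.sum(p_j**2)
        return (P_i.mean() - P_e) / (1 - P_e)

    # Three physicians, binary label (finding present / absent);
    # each row counts the votes per category for one image.
    votes = np.array([[3, 0], [2, 1], [0, 3], [1, 2]])
    consensus = votes.argmax(axis=1)          # majority-vote training labels
    print(consensus, fleiss_kappa(votes))     # [0 0 1 1] 0.333...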


It's cool to see another FDA AI Regulatory person on HN. Drop me a note if you are interested in connecting -> chris@medlaunchnow.com


If three physicians are annotating 100-200k images, how much does that cost?
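
No inside numbers here, but a back-of-envelope makes the scale clear. Every figure below is an assumption for illustration, not a quoted rate:

    images = 150_000          # midpoint of the 100-200K range mentioned above
    annotators = 3
    seconds_per_image = 30    # assumed annotation time per image
    hourly_rate = 300         # assumed physician hourly rate, USD

    hours = images * annotators * seconds_per_image / 3600
    print(f"{hours:,.0f} annotator-hours, ~${hours * hourly_rate:,.0f}")
    # -> 3,750 annotator-hours, ~$1,125,000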


One startup from Hamburg that I know uses medical students for this, and from what I hear, their training dataset is their most valuable asset.


These are all good questions indeed, and ones that companies will have to figure out. If you aren't familiar with the medical space, you may not know that there are already some methods for addressing these issues.

You could ask the same question about how a pharmaceutical agent interacts with a diverse set of patients.


For a medical device company, establishing and documenting your approach to answering these questions can go a long way. It's more of a policy-versus-protocol approach.


I work in this space. It's so random to see this very guidance from 2021 popping up on HN today. Why today? Why this guidance? It's far from being the most interesting/relevant/up-to-date document about AI good practice out there!


Honestly, discoverability for these kinds of guidelines is very poor. I work in healthcare (administrative/financial) and even with domain knowledge, it can be hard to find things on CMS's website. For example, try to find the list of inpatient-only procedures for 2023.


I would love additional references!


All of these points are good advice in any machine learning application. When I interview engineers and researchers for machine learning roles, these are some of the points Google looks for.



