I work in the medical imaging space, specifically on integrating deep learning into clinical practice. I see a lot of people making a fuss about what type of network or loss function to use, and I would argue that this focus is misguided 90% of the time. Sure, a very specific network architecture and a custom loss might edge you out with a 2-3% performance gain, but is that making or breaking the fundamental clinical application? I would argue that it usually is not. Instead, I've seen that much of the success of deep learning in medical imaging is driven by the quality and diversity of the source data, which in the medical space can often be scarce for a number of reasons.
I'm reminded of this tweet, which emphasizes that a lot of your performance is going to be down to how good your datasets are [0].
I agree that most of the value in a clinical application won't come from the often (though not always) small performance gains from tweaking your neural network architecture or fiddling with the loss function. Collecting a high-quality, diverse dataset is important for training and arguably even more important for validation, because you want to show that the deployed model is reliable.
But before deploying a model, I'd argue that it is worth testing a few architectures to determine whether one is substantially better than the rest. It can be a pain to try out a bunch of architectures, but the ones we mention in the article have many freely available implementations (and we provide our own as well!). So you can drop in one of these architectures and test it pretty easily (especially if you skip hyperparameter tuning initially).
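The workflow described above is essentially a loop: hold the data and training settings fixed, train each candidate architecture once with default hyperparameters, and rank them by a single validation metric. A minimal sketch of such a harness is below; the builder names (`build_unet`, `build_resnet`) and the `train_and_score` callable are hypothetical placeholders standing in for whatever implementations you drop in, not real library calls.

```python
# Minimal architecture-comparison harness. Everything except the
# loop itself is a toy stand-in: in practice, the builders would
# construct real networks and train_and_score would run training
# and return a validation metric (e.g. Dice or AUC).

def compare_architectures(builders, train_and_score, val_data):
    """Train each candidate once under identical settings and
    return {name: validation_score}, best score first."""
    scores = {}
    for name, build in builders.items():
        model = build()                       # fresh model per candidate
        scores[name] = train_and_score(model, val_data)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Toy stand-ins so the harness runs end to end (hypothetical names):
builders = {
    "unet": lambda: "unet-model",
    "resnet": lambda: "resnet-model",
}
fake_scores = {"unet-model": 0.91, "resnet-model": 0.88}
ranking = compare_architectures(
    builders,
    train_and_score=lambda model, val: fake_scores[model],
    val_data=None,
)
print(ranking)  # best-scoring architecture listed first
```

The point of keeping hyperparameters at their defaults for this first pass is that you only want to catch the cases where one architecture is *substantially* better; fine-grained tuning can come later, once a front-runner emerges.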
Spending too much time fussing over a 2-3% performance gain is silly, but sometimes, surprisingly, the performance difference between architectures can be much greater. I wish I had more intuition for why some architectures perform well and others don't; it would certainly make R&D easier if you could ignore the architecture choice entirely.
As part of a law firm, I've submitted 50+ AI/ML applications to the FDA and EU regulators on behalf of our clients. I don't think I've ever seen anything but U-Net and ResNet at this point. This article was helpful for me.
What about providing some empirical evidence on how to choose a network? It's not enough to list a few alternative architectures; how are readers supposed to know which ones are worth trying first? This seems to be a general problem in deep learning: too many seemingly important model choices are, more often than not, selected based on author preference.
For another perspective on applying machine learning to medical imaging, I recommend the blog of Luke Oakden-Rayner [1]. He's a radiologist first, so he's in a great position to bring some much-needed skepticism to the conversation. I learned about a lot of complications that I never would have imagined as a layperson.
[0] https://twitter.com/lishali88/status/994723759981453312