I work in the medical imaging space, specifically on implementing deep learning in clinical practice. I see a lot of people fussing over what type of network or loss function to use, and I would argue that this focus is misguided 90% of the time. Sure, a very specific network architecture and a custom loss might buy you a 2-3% performance gain. But does that make or break the fundamental clinical application? Usually it does not. Instead, I've seen that deep learning in medical imaging is driven largely by the quality and diversity of the source data, which in the medical space is often scarce for a number of reasons.

I'm reminded of this tweet, which emphasizes that a lot of your performance comes down to how good your datasets are [0].

[0] https://twitter.com/lishali88/status/994723759981453312




One of the authors here.

I agree that most of the value in a clinical application won't come from the often (but not always) relatively small performance gains by tweaking your neural network architecture or fiddling with the loss function. Collecting a high-quality and diverse dataset is important for training and arguably even more important for validation because you want to show that the deployed model is reliable.

But before deploying a model, I'd argue that it is worth testing a few architectures to determine whether one is substantially better than the rest. It can be a pain to try out a bunch of architectures, but the ones we mention in the article have many freely available implementations (and we provide implementations too!). So you can drop one of these architectures in and test it pretty easily, especially if you skip hyperparameter tuning initially; a rough sketch of that workflow is below.
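
To make that concrete, here is a minimal sketch of what "dropping in" a few architectures might look like. It assumes an image classification task with torchvision backbones; `train_loader`, `val_loader`, and `quick_score` are hypothetical stand-ins for your own pipeline, not the implementations from the article.

    # Minimal architecture-comparison sketch (assumption: a 2-class
    # image classification task; adapt freely for segmentation etc.).
    import torch
    from torch import nn
    from torchvision import models

    # Candidate architectures to screen, all built with default settings.
    candidates = {
        "resnet18": lambda: models.resnet18(weights=None, num_classes=2),
        "densenet121": lambda: models.densenet121(num_classes=2),
        "efficientnet_b0": lambda: models.efficientnet_b0(num_classes=2),
    }

    def quick_score(model, train_loader, val_loader, epochs=3):
        """Train briefly with default hyperparameters, return val accuracy."""
        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = model.to(device)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # no tuning yet
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            model.train()
            for x, y in train_loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in val_loader:
                pred = model(x.to(device)).argmax(dim=1).cpu()
                correct += (pred == y).sum().item()
                total += y.numel()
        return correct / total

    # Usage (train_loader/val_loader are your own DataLoaders):
    # for name, build in candidates.items():
    #     print(name, quick_score(build(), train_loader, val_loader))

The point is only that a coarse screen like this is cheap: a few epochs with default hyperparameters is usually enough to reveal whether one architecture is substantially better before you commit to tuning it.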

Spending too much time fussing over a 2-3% performance gain is silly, but sometimes, surprisingly, the difference in performance from choosing another architecture can be much greater. I wish I had more intuition as to why some architectures perform well and others don't; it would certainly make R&D easier if you could ignore the architecture choice entirely.


Where do you work, if you don't mind my asking?



