
> and that vision/hearing are actually integral parts of language acquisition

Deaf-blind authors would beg to differ.

But yes, a human brain is exposed to lots of other sensory input, and we know from other research that multi-modal models can learn shared representations that benefit from the knowledge of each domain.
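The usual trick there is a CLIP-style contrastive objective that pulls paired samples from two modalities into one embedding space. A rough sketch (toy encoders and made-up dimensions, not any particular model):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoTowerModel(nn.Module):
        """Toy two-tower setup: each modality gets its own encoder,
        and both project into the same shared embedding space."""
        def __init__(self, img_dim=2048, txt_dim=768, shared_dim=256):
            super().__init__()
            self.img_proj = nn.Linear(img_dim, shared_dim)  # stand-in for a vision encoder
            self.txt_proj = nn.Linear(txt_dim, shared_dim)  # stand-in for a text encoder

        def forward(self, img_feats, txt_feats):
            img = F.normalize(self.img_proj(img_feats), dim=-1)
            txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
            return img, txt

    def contrastive_loss(img, txt, temperature=0.07):
        # Paired (image i, caption i) should score higher than every mismatched pair.
        logits = img @ txt.t() / temperature
        targets = torch.arange(img.size(0))
        return (F.cross_entropy(logits, targets) +
                F.cross_entropy(logits.t(), targets)) / 2

Train that on paired data and each tower ends up encoding structure it could never have learned from its own modality alone, which is the "shared representations" benefit in practice.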

In Transformers' favor, at least, they are far closer to tabula rasa than the human brain is, and they likely have to dedicate a lot of their training time to things that are otherwise "baked" into human brains. For example, humans come pre-packaged with V1 and V2 as part of their visual system, but CNNs and ViTs have to learn those filter banks from scratch.
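To make the V1 point concrete, here's a toy sketch (made-up kernel sizes, not from any real vision model): a first conv layer can either start random and spend training budget rediscovering oriented edge detectors, or be seeded with a Gabor-like bank, which is roughly the prior evolution hands V1.

    import numpy as np
    import torch
    import torch.nn as nn

    def gabor_kernel(size=7, theta=0.0, sigma=2.0, lambd=4.0):
        """Crude Gabor filter: an oriented edge detector, roughly what V1 simple cells compute."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        x_t = x * np.cos(theta) + y * np.sin(theta)
        return np.exp(-(x_t**2 + (-x * np.sin(theta) + y * np.cos(theta))**2)
                      / (2 * sigma**2)) * np.cos(2 * np.pi * x_t / lambd)

    # Option 1: learn everything from scratch (what CNNs/ViTs do today).
    learned_conv = nn.Conv2d(1, 8, kernel_size=7, padding=3)

    # Option 2: "bake in" a V1-ish prior by seeding the filters with a Gabor bank.
    baked_conv = nn.Conv2d(1, 8, kernel_size=7, padding=3)
    bank = np.stack([gabor_kernel(theta=t)
                     for t in np.linspace(0, np.pi, 8, endpoint=False)])
    with torch.no_grad():
        baked_conv.weight.copy_(torch.tensor(bank, dtype=torch.float32).unsqueeze(1))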

I agree with you, though. Human brains are able to take single instances of experience and build a wealth of understanding from them in ways that even modern Transformer architectures cannot yet match.



It seems like internal language (thinking in language) is also a way our brains train themselves? I’ve probably thought 100x more words than I’ve spoken.


This would map to a sort of semi-supervised approach. For a lot of problems it has been shown to drastically reduce data requirements, but it can bump up compute.
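Roughly the pseudo-labeling flavor of it, if anyone wants the shape of the loop (hypothetical model, optimizer, and batches; just a sketch):

    import torch
    import torch.nn.functional as F

    def self_training_step(model, optimizer, labeled_batch, unlabeled_batch, threshold=0.95):
        """One step of basic pseudo-labeling: the labeled loss is supervised,
        the unlabeled loss only counts confident predictions.
        Cheap on labels, pricier on compute (extra forward passes)."""
        x_l, y_l = labeled_batch
        x_u = unlabeled_batch

        loss = F.cross_entropy(model(x_l), y_l)

        with torch.no_grad():
            probs = F.softmax(model(x_u), dim=-1)
            conf, pseudo_y = probs.max(dim=-1)
            mask = conf > threshold  # only keep predictions the model is sure about

        if mask.any():
            loss = loss + F.cross_entropy(model(x_u[mask]), pseudo_y[mask])

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()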

All those conversations in the shower were actually regularizers!



