
On Limitations of the Transformer Architecture https://arxiv.org/abs/2402.08164

Theoretical limitations of multi-layer Transformer https://arxiv.org/abs/2412.02975



Only skimmed, but both seem to be referring to what transformers can do in a single forward pass; reasoning models would clearly be a way around that limitation.
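A toy sketch of the distinction (my illustration, not taken from either paper): a compositional query like "A's mother's birthplace" requires two dependent lookups. A single forward pass must perform the whole composition at once at fixed depth, whereas a reasoning model can emit an intermediate answer as tokens and condition on it, resolving one hop per step. The names and dictionaries below are invented for the example.

```python
# Hypothetical facts for illustration only.
mother = {"Alice": "Beth"}
birthplace = {"Beth": "Cairo"}

def answer_one_shot(person):
    # Analogous to a single forward pass: the two-hop composition
    # must be computed in one go.
    return birthplace[mother[person]]

def answer_with_steps(person, relations):
    # Analogous to chain-of-thought decoding: each hop is resolved
    # separately, with the intermediate result "written out" and
    # fed back in before the next hop.
    current = person
    for rel in relations:
        current = rel[current]  # one reasoning step per hop
    return current

print(answer_one_shot("Alice"))                       # Cairo
print(answer_with_steps("Alice", [mother, birthplace]))  # Cairo
```

The point of the papers is that the one-shot path gets provably harder as the number of hops grows relative to model size, while the stepwise path sidesteps that by trading depth for sequential decoding.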

o4 has no problem with the examples from the first paper (appendix A). You can see that its reasoning here is also sound: https://chatgpt.com/share/681b468c-3e80-8002-bafe-279bbe9e18.... Not conclusive, unfortunately, since the paper falls within the date range of its training data. Reasoning models did kill off a large class of "easy logic errors" people discovered in the earlier generations, though.


Your unwillingness to engage with the limitations of the technology explains a lot of the current hype.



