
I think LLMs and CoT are powerful, but I agree with this description of what they do.

The important step is that they demonstrate there are no architectural limits that keep the LLM from acting in these domains, "only" knowledge/planning ones. Once we have a big enough dataset of CoT prompts, the model "just" has to generalize CoT prompting, not CoT following. This decomposes the problem into two halves and demonstrates that if one half (instruction generation) is provided, the other (instruction following) becomes very tractable.
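
To make that decomposition concrete, here is a minimal sketch of the two-stage setup, assuming a hypothetical llm() completion function; the prompts and names are illustrative, not taken from any of the papers discussed here:

    # Half 1 (instruction generation) and half 2 (instruction following) as two
    # separate calls. The argument above is that if the first half is given,
    # the second half is the tractable part.

    def llm(prompt: str) -> str:
        """Hypothetical completion call; swap in whatever client you actually use."""
        raise NotImplementedError

    def solve(task: str) -> str:
        # Instruction generation: the half that still needs a big enough
        # dataset of worked CoT examples to generalize.
        plan = llm(
            f"Task: {task}\n"
            "Write a numbered list of concrete steps for solving it. "
            "Do not solve it yet."
        )
        # Instruction following: the half the demonstrations suggest is very
        # tractable once the steps are provided.
        return llm(
            f"Task: {task}\nSteps:\n{plan}\n"
            "Follow the steps above exactly and state the final answer."
        )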



Thanks for clarifying. That's not the claim Rao is tweeting about, though. Rao, as far as I can tell, is responding to a stream of papers from the last couple of years claiming that LLMs can plan right now, whether using CoT or not. See the paper I linked in my reply to amenhotep (!) below for an example.

As to blocks world planning in particular, LLMs have already seen plenty of examples of block-stacking problems - those are the standard motivational experiment in planning papers, the way solving mazes is for Reinforcement Learning. Google returns 10 pages of results for "blocks world planning". If LLMs could generalise as well as you expect they one day will, they should already be able to solve block-stacking problems without CoT and with no guidance.
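
For anyone who hasn't seen the domain: a blocks world instance is small enough to spell out in a few lines. The sketch below (my own illustrative encoding, not taken from any of the papers) brute-forces a plan for one such instance, which is roughly the kind of move sequence those papers ask the LLM to produce in text:

    # Blocks world as tuples of stacks (last element = top block). A "plan" is
    # a sequence of moves, each taking the top block of one stack and placing
    # it on another stack or on the table (an empty stack slot).
    from collections import deque

    def moves(state):
        """Yield (action, next_state) pairs for a state given as a tuple of stacks."""
        stacks = [list(s) for s in state]
        for i, src in enumerate(stacks):
            if not src:
                continue
            block = src[-1]
            for j, dst in enumerate(stacks):
                if i == j:
                    continue
                nxt = [list(s) for s in stacks]
                nxt[i].pop()
                nxt[j].append(block)
                target = dst[-1] if dst else "the table"
                yield (f"move {block} onto {target}",
                       tuple(tuple(s) for s in nxt))

    def plan(start, goal):
        """Breadth-first search for a sequence of moves from start to goal."""
        start = tuple(tuple(s) for s in start)
        goal = tuple(tuple(s) for s in goal)
        frontier, seen = deque([(start, [])]), {start}
        while frontier:
            state, path = frontier.popleft()
            if state == goal:
                return path
            for action, nxt in moves(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [action]))
        return None

    # A is on B, C is alone; goal: a single tower C on B on A in the first slot.
    print(plan([["B", "A"], ["C"], []], [["A", "B", "C"], [], []]))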


Saying that LLMs should be able to solve problems without CoT is a bit like saying that humans should be able to write programs without thinking, though. CoT is fundamentally an architectural necessity, owing to the finite cognitive effort possible per token. Whether it's CoT, QuietSTaR ( https://arxiv.org/abs/2403.09629 ), pause tokens ( https://arxiv.org/abs/2310.02226 ), or even just dots ( https://arxiv.org/abs/2404.15758 ), the thinking has to happen somewhere.
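
The compute argument is easy to make concrete: every emitted token is one more forward pass the model gets before it has to commit to an answer. A rough sketch, assuming a hypothetical generate() completion call and illustrative prompts (and noting that the pause/dot papers above train the model to exploit the filler tokens rather than just padding at inference time):

    # Each generated token costs one fixed-depth forward pass, so the only way
    # to spend more computation on a question is to emit more tokens before
    # the answer token.

    def generate(prompt: str, max_tokens: int) -> str:
        """Hypothetical LLM completion call; replace with your client of choice."""
        raise NotImplementedError

    question = "Block A is on B, and C is on A. Which block is on top?"

    # Direct answer: a fixed, small budget of computation per token between
    # reading the question and committing to an answer.
    direct = generate(question + "\nAnswer with one word:", max_tokens=1)

    # Chain of thought: the intermediate tokens are extra forward passes whose
    # outputs the model can condition on before answering.
    cot = generate(question + "\nLet's think step by step.", max_tokens=128)

    # Filler-token variants (pause tokens, dots) make the same trade without
    # meaningful intermediate text: the extra tokens exist only to buy more
    # passes of computation, which the linked papers train the model to use.
    padded = generate(question + "\n" + ". " * 50 + "Answer with one word:",
                      max_tokens=1)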

I think, mostly, LLMs at the moment are incredibly uneven. LLM assistants can pull obscure knowledge out of nowhere one second and fail extremely basic reasoning the next. So the fact that some example appears a lot in the training data doesn't mean the LLM has learned it. IMO, that CoT works at all is down more to the luck of the training set than to any inherent capability of the LLM.




