
The point? It's an almost useless way of looking at things.


What is the useful way?


The usual. A computer is a computer, with memory cells, a CPU, and so on, or a Turing machine for certain purposes. A deep neural network is a deep neural network with a certain architecture (which at least lets us reason about training), or a universal approximator (in the same limited sense in which a computer is a Turing machine), or a function (but that's too general).

For neural networks, a circuit complexity model (e.g., the class TC0) is probably better suited.

It's the circuit complexity of one forward pass.
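
(A toy numpy sketch of that intuition, not any particular architecture: the number of layers is fixed when the network is built, so one forward pass is the same constant-depth computation whatever the sequence length, which is why constant-depth circuit classes like TC0 are the natural yardstick.)

    import numpy as np

    rng = np.random.default_rng(0)
    DEPTH, WIDTH = 4, 64  # fixed when the network is built

    layers = [rng.standard_normal((WIDTH, WIDTH)) / np.sqrt(WIDTH)
              for _ in range(DEPTH)]

    def forward(x):
        # x: (seq_len, WIDTH). The depth of the computation is DEPTH,
        # no matter how long the sequence is.
        for w in layers:
            x = np.maximum(x @ w, 0.0)  # linear map + ReLU
        return x

    short = forward(rng.standard_normal((8, WIDTH)))
    long_ = forward(rng.standard_normal((8192, WIDTH)))  # same depth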

And, well, what would a human do when they encounter a problem of recognizing an arbitrarily complex grammar? Would they try to learn it? Would they succeed? Or would they write a program to recognize it? Is writing a program in TC0?
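
(As a toy contrast, here is the "write a program" route for a^n b^n c^n, a classic context-sensitive language; the example is mine, and the point is just that a few lines recognize it exactly at any input length, with no learning involved.)

    def recognizes(s: str) -> bool:
        """Exact recognizer for { a^n b^n c^n : n >= 0 }."""
        n = len(s) // 3
        return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

    assert recognizes("aaabbbccc")
    assert not recognizes("aabbbcc")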

It might be a genuine limitation that guarantees subhuman performance at practically achievable network sizes, but it's not that straightforward.


That's actually a good point I had not thought of before... do you happen to know what this concept is called in the literature?

Indeed, even if someone discovered that LLMs can only ever emit/recognize context-sensitive languages, that is still sufficient to write programs that can then simulate a Turing machine, giving you Turing completeness in a "meta" way. (The catch might be that the program is Turing complete only in an "abstract" sense, just as real-world programs never have infinite tape.)
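
(A minimal sketch of that meta route, assuming a transition table in the format below: a few lines of Python simulate a single-tape Turing machine, so anything that can reliably emit such a program plus a table inherits Turing completeness up to the host machine's memory, i.e., the finite-tape caveat above.)

    def run_tm(delta, tape, state="q0", accept="qA", reject="qR"):
        # delta: (state, symbol) -> (state, symbol, move), move in {-1, +1}
        tape, head = dict(enumerate(tape)), 0
        while state not in (accept, reject):
            sym = tape.get(head, "_")                 # "_" = blank cell
            state, tape[head], move = delta[(state, sym)]
            head += move
        return state == accept

    # Example machine: accept strings of 0s of even length.
    delta = {("q0", "0"): ("q1", "0", +1),
             ("q1", "0"): ("q0", "0", +1),
             ("q0", "_"): ("qA", "_", +1),
             ("q1", "_"): ("qR", "_", +1)}
    assert run_tm(delta, "0000") and not run_tm(delta, "000")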


The term for research on whether LLMs can learn and execute this or that program-synthesis algorithm? I'd guess there is none. If we know the algorithm, it's better to teach the LLM to write a program that outputs a program. If we don't know the algorithm, we can't reason about whether the LLM can do it.
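
(A toy version of "a program that outputs a program", assuming the target language is already known as a regex; the names here are mine. The first stage emits recognizer source, the second stage just runs it.)

    def make_recognizer_source(pattern: str) -> str:
        # First stage: emit Python source for an exact-match recognizer.
        return (f"import re\n"
                f"def recognize(s):\n"
                f"    return re.fullmatch({pattern!r}, s) is not None\n")

    src = make_recognizer_source(r"(ab)+c")
    ns = {}
    exec(src, ns)                 # second stage: run the emitted program
    assert ns["recognize"]("ababc")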

> even if someone discovered that LLMs can only ever emit/recognize context-sensitive languages

https://arxiv.org/abs/2310.07923: their recognition abilities are bounded by the context-sensitive languages if the number of chain-of-thought steps is linear in the input length.

All in all, the asymptotic behavior of a network on an algorithmic task doesn't matter that much for AI. What matters is whether the network can reduce a problem to a manageable length.



