Hacker News | hackinthebochs's comments

Linear regression has well characterized mathematical properties. But we don't know the computational limits of stacked transformers. And so declaring what LLMs can't do is wildly premature.
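To make "well characterized" concrete: ordinary least squares has a closed-form solution, so you can say exactly what it will recover and when. A toy sketch (numpy, with made-up data generated from a known line, y = 2x + 1):

```python
import numpy as np

# OLS closed form: beta = (X^T X)^{-1} X^T y.
# Design matrix: a column of ones (intercept) plus the feature.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])  # exactly y = 2x + 1, no noise

beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [1. 2.] -> intercept 1, slope 2, recovered exactly
```

No comparably complete theory exists for what a stack of transformer layers can or cannot compute, which is the asymmetry the comment is pointing at.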

> And so declaring what LLMs can't do is wildly premature.

The opposite is true as well. Emergent complexity isn’t limitless. Just like early physicists tried to explain the emergent complexity of the universe through experimentation and theory, so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Specifically not pseudoscience, though.


>so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Physicists had the real world to verify theories and explanations against.

So far anyone 'explaining the emergent complexity of LLMs through experimentation and theory' is essentially just making stuff up nobody can verify.


Well that’s why I provided the caveat “specifically not pseudoscience”, which is, as you described, “just making stuff up nobody can verify”.

If you say not pseudoscience and then make up pseudoscience anyway then what's the point? The field has not advanced anywhere near enough in understanding for convoluted explanations about how LLMs can never do x to be anything but pseudoscience.

Sure, that's true as well. But I don't see this as a substantive response given that the only people making unsupported claims in this thread are those trying to deflate LLM capabilities.

So, to review this thread:

  - OP asked for someone to make a logical argument for the separation of “training” from “model”
  - I made the argument
  - You cherry picked an argument against my specific example and made an appeal to emergent complexity
  - I pointed out that emergent complexity isn’t limitless
  - “the only people making unsupported claims in this thread are those trying to deflate LLM capabilities”

You made a pretty nonsensical argument, which seems like the bog standard for these arguments.

What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here. You don't know shit and just make up whatever. You can see people doing the same thing in GPT-1, 2, 3, 4 threads, all telling us why LLMs will never be able to do things they later manage to do.


> You don’t know shit

lol. Why so emotionally charged? Are you perhaps worried that you’ve invested too much time and effort into a technology that may not deliver what influencers have been promising for years? Like a proverbial bagholder?

> What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here.

We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.

> You can see people doing the same thing in GPT-1, 2, 3, 4 threads, all telling us why LLMs will never be able to do things they later manage to do.

I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).

Now, care to provide a counterargument that shows you know a little more than “shit”?


>We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.

Okay, but the brain is also “just a model” of the world in any meaningful sense, so that framing does not really get you anywhere. Calling something a model does not, by itself, establish a useful limit on what it can or cannot do. Invoking Plato here just sounds like pseudo-profundity rather than an actual argument.

>I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).

If a capability appears once training improves, scale increases, or better inference-time scaffolding is added, then it was not demonstrated to be a 'fundamental impossibility'.

That is the core issue with your argument: you keep presenting provisional limits as permanent ones, and then dressing that up as theory. A lot of people have done that before, and they have repeatedly been wrong.


To be clear, you are confusing me with other commenters in this thread. All I want is for those that liken LLMs to stochastic parrots and other deflationary claims to offer an argument that engages with the actual structure of LLMs and what we know about them. No one seems to be up to that challenge. But then I can't help but wonder where people's confident claims come from. I'm just tired of the half-baked claims and generic handwavy allusions that do nothing but short-circuit the potential for genuine insight.

>AlphaGo didn't teach itself that move. The verifier taught AlphaGo that move.

No. AlphaGo developed a heuristic by playing itself repeatedly, the heuristic then noticed the quality of that move in the moment.

Heuristics are the core of intelligence in terms of discovering novelty, but this is accessible to LLMs in principle.
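As a toy illustration of the point (not AlphaGo's actual algorithm, which uses MCTS and deep networks): even a tabular value estimate learned purely from self-play can end up "noticing" which positions are good, with no teacher beyond the game's win condition. The counting game, learning rate, and exploration rate below are all invented for this sketch:

```python
import random

# Game: players alternately add 1 or 2 to a counter; whoever hits TARGET wins.
# Backward induction says counters 1, 4, 7 are lost for the player to move.
TARGET = 10

def moves(c):
    return [m for m in (1, 2) if c + m <= TARGET]

# The learned "heuristic": estimated win probability for the player to move.
value = {c: 0.5 for c in range(TARGET)}

def best_move(c):
    # Hand the opponent the position with the lowest estimated value;
    # a move that reaches TARGET wins outright.
    return min(moves(c), key=lambda m: -1.0 if c + m == TARGET else value[c + m])

random.seed(0)
for _ in range(20000):
    c, history = 0, []
    while c < TARGET:
        history.append(c)
        # Epsilon-greedy self-play: mostly follow the heuristic, sometimes explore.
        m = random.choice(moves(c)) if random.random() < 0.2 else best_move(c)
        c += m
    outcome = 1.0  # the player who moved last (reached TARGET) won
    for s in reversed(history):
        value[s] += 0.1 * (outcome - value[s])  # Monte Carlo backup
        outcome = 1.0 - outcome                 # alternate players going backwards

losing = sorted(c for c in range(TARGET) if value[c] < 0.5)
print(losing)
```

With enough episodes the learned values converge toward the backward-induction answer (1, 4, 7 rated as losing), even though nothing in the training loop was told which positions are bad.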


Why would you want every site on the internet to traffic in government IDs? This is by far the least bad out of all possible ways to implement age checking. The benefit of this is that it can short-circuit support for more onerous age verification. The writing has been on the wall for some time now: the era of completely unrestricted internet is coming to an end. The question is how awful will the new normal be? This implementation is a win all around, a complete nothingburger. We should be celebrating it, not fighting it tooth and nail.

The tech crowd's utter derangement over this minor mandate is truly a sight to behold.


> This is by far the least bad out of all possible ways to implement age checking.

Not quite. The least bad (that I'm aware of) is to mandate RTA headers (or an equivalent more comprehensive self categorization system) and to also mandate that major platforms (presumably OS and browsers, based on MAU or some such) implement support for filtering on those headers.

But sending a binned age as per the California law is the next best thing to that.


In fact, many libraries have computers sectioned off in semi-private areas exactly for this reason...

Are you sure that's a library?

I mean, we have places here like that where you can insert some coins for a private viewing cabin, but we don't call them libraries :)


A law defines the nature of collective action in response to certain violations. Words on paper themselves are impotent. If there is no potential for enforcement, i.e. there is no counterfactual state of collective action, there is no law.

I've always found it strange how Americans like to validate their ideals using their kids as vehicles. Instead of teaching kids how to be successful in a less than ideal world, we teach them our ideal view of the world. Like teaching kids that violence is never the answer, instead of acknowledging that some situations do call for violence. We raise kids for a world that doesn't exist. It's up to the kid/adult to unlearn those obviously bogus ideals after they make contact with the world. It's just odd how we're so practiced at setting up our children for less success in the real world.


How did you arrive at this being uniquely American? I would say it's Western society more generally.


I mainly said America because I only feel qualified to speak on America. But I do think there is something uniquely American about seeing the march of "progress" as an ultimate ideal and stagnation in any form as a defeat. Economic and social progress is basically a founding ideal of American society and is a major driver of our success over the centuries. It permeates our culture in so many ways, e.g. the idea that your kids should have it better than you. So shaping the next generation by way of shaping the views of your kids, despite the potential mismatch between the ideal and the reality is seen as just a part of the march of progress.


Yes, let me send a picture of my ID to every app on the internet. That's so much better than having the device I own attest to my age anonymously.


They want it because it absolves them of responsibility for what their app does to kids. They can then just point to the existence of an already working mechanism for parents to intervene. The alternative would be for each app to implement stringent age verification or redesign itself to avoid addictive patterns. Neither option is good for their earnings.


The internet and the surrounding context changed so fast that it made little sense to cling to old email addresses made in the old context. Gmail represented the 'new internet' and old patterns became obsolete (less subversive, more mainstream/corporate). When there's a seismic shift in usage patterns that's when all bets are off regarding where everyone lands. Being the first mover means little here. If the way people interacted with AI underwent a massive shift, OpenAI would likely get left behind. The only safe bet is to invent your own killer.


What are neurosymbolic systems supposed to bring to the table that LLMs can't in principle? A symbol is just a vehicle with a fixed semantics in some context. Embedding vectors of LLMs are just that.


Pre-programmed, hard and fast rules for manipulating those symbols, that can automatically be chained together according to other preset rules. This makes it reliable and observable. Think Datalog.

IMO, symbolic AI is way too brittle and case-by-case to drive useful AI, but as a memory and reasoning system for more dynamic and flexible LLMs to call out to, it's a good idea.
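A minimal sketch of what that buys you, assuming a toy Datalog-style setup (the relations and rules here are invented for illustration): fixed rules applied to a fact base until nothing new can be derived, so every conclusion is deterministic and traceable back to the rule that produced it.

```python
# Facts are tuples: (relation, arg1, arg2).
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def apply_rules(facts):
    """Forward chaining to a fixpoint, Datalog-style."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        # Rule 1: parent(X, Y) -> ancestor(X, Y)
        for rel, x, y in list(derived):
            if rel == "parent" and ("ancestor", x, y) not in derived:
                derived.add(("ancestor", x, y))
                changed = True
        # Rule 2: ancestor(X, Y), parent(Y, Z) -> ancestor(X, Z)
        for rel1, x, y in list(derived):
            for rel2, y2, z in list(derived):
                if (rel1 == "ancestor" and rel2 == "parent" and y == y2
                        and ("ancestor", x, z) not in derived):
                    derived.add(("ancestor", x, z))
                    changed = True
    return derived

closure = apply_rules(facts)
print(("ancestor", "alice", "carol") in closure)  # True
```

Every derived fact follows mechanically from the rules, which is the reliability and observability being contrasted with an LLM's opaque forward pass.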


Sure, reliability is a problem for the current state of LLMs. But I see no reason to think that's an in principle limitation.


There are so many papers now showing that LLM "reasoning" is fragile and based on pattern-matching heuristics that I think it's worth considering that, while it may not be an in principle limitation — in the sense that if you gave an autoregressive predictor infinite data and compute, it'd have to learn to simulate the universe to predict perfectly — in practice we're not going to build Laplace's LLM, and we might need a more direct architecture as a short cut!

