
What it confirms, I think, is that we are going to need a lot more chips.


Further confirmation, IMO, that the idea that any of this leads to anything close to AGI is people getting high on their own supply (in some cases literally).

LLMs are a great tool for what is effectively collected knowledge search and summary, so long as you accept that you have to verify all of the 'knowledge' they spit back, because they always have the ability to go off the rails. But they have been hitting the limits of how much better that can get without somehow introducing more real knowledge for close to 2 years now, and everything since then has been super incremental: IME mostly benchmark gains and hype, as opposed to being plainly better.

I personally don't believe that more GPUs solve this, like, at all. But it's great for Nvidia's stock price.


I'd put myself on the pessimistic side of all the hype, but I still acknowledge that where we are now is a pretty staggering leap from two years ago. Coding in particular has gone from hints and fragments to full scripts that you can correct verbally and that are very often accurate and reliable.


I'm not saying there's been no improvement at all. I personally wouldn't categorize it as staggering, but we can agree to disagree on that.

I find the improvements to be uneven, in the sense that every time I try a new model I can find use cases where it's an improvement over previous versions, but I can also find use cases where it feels like a serious regression.

Our differences in how we categorize the amount of improvement over the past 2 years may be related to how much the newer models are improving vs regressing for our individual use cases.

When used as coding helpers/time accelerators, I find newer models to be better at one-shot tasks where you let the LLM loose to write or rewrite entire large systems and I find them worse at creating or maintaining small modules to fit into an existing larger system. My own use of LLMs is largely in the latter category.

To be fair, I find the current peak coding-assistant model to be Claude 3.5 Sonnet, which is much newer than 2 years old. But I feel like the improvements to get to that model were pretty incremental relative to the vast amount of resources poured into it, and then Claude 3.7 felt like a pretty big backslide for my own use case, which has recently heightened my skepticism.


Hilarious. Over two years we went from LLMs being slow and not very capable of solving problems to models that are incredibly fast, cheap and able to solve problems in different domains.


Well said. 100% agree


Or, possibly, we're stuck waiting for another theoretical breakthrough before real progress is made.


breakthrough in biology


Eh, no. More chips won't save this right now, or probably in the near future (i.e., barring someone sitting on a breakthrough right now).

It just means either

A. Lots and lots of hard work that gets you a few percent at a time, but adds up to a lot over time.

or

B. Completely different approaches that people actually think about for a while rather than trying to incrementally get something done in the next 1-2 months.

Most fields go through this stage. Sometimes more than once as they mature and loop back around :)

Right now, the AI field seems bad at doing either - at least from the outside of most of these companies, and from watching open source, etc.

While lots of little improvements seem to be released in lots of parts, it's rare to see anywhere that is collecting and aggregating them en masse and putting them into practice. It feels like for every 100 research papers, maybe 1 makes it into something that anyone ends up using by default.

This could be because they aren't really even a few percent (which would be yet a different problem, and in some ways worse), or it could be because nobody has cared to, or ...

I'm sure the very large companies are doing a fairly reasonable job on this, because they historically do, but for everyone else - even frameworks - it's still "here's a million knobs and things that may or may not help".

It's like if compilers had no "-O0/-O1/-O2/-O3" at all and were just like "here's 16,283 compiler passes - you can put them in any order and amount you want". Thanks! I hate it!

It's even worse, because it's like this at every layer of the stack, whereas in the compiler example it's just one layer.

Given the rate of improvement claimed by papers in all parts of the stack, one of two things is true. Either lots and lots is being lost because this is happening, in which case those percentages eventually add up to enough for someone else to use to kill you; or nothing is being lost, in which case people appear to be wasting untold amounts of time and energy, then trying to bullshit everyone else, and the field as a whole appears to be doing nothing about it. That seems, in a lot of ways, even worse. FWIW - I already know which one the cynics of HN believe, you don't have to tell me :P. This is obviously presented as black and white, but the in-betweens don't seem much better.

Additionally, everyone seems to rush half-baked things out the door to get the next incremental improvement released, because they think it will help them stay "sticky" or whatever. History does not suggest this is a good plan, and even if it were a good plan in theory, it's pretty hard to lock people in with what exists right now. There isn't enough anyone cares about, and rushing out half-baked crap is not helping that. Mindshare doesn't really matter if no one cares about using your product.

Does anyone using these things truly feel locked into anyone's ecosystem at this point? Do they feel like they will be soon?

I haven't met anyone who feels that way, even in corps spending tons and tons of money with these providers.

The public companies I can at least understand, given the fickleness of public markets. That was supposed to be one of the serious benefits of staying private. So watching private companies do the same thing - it's just sort of mind-boggling.

Hopefully they'll grow up soon, or someone who takes their time and does it right during one of the lulls will come and eat all of their lunches.


> Completely different approaches that people actually think about for a while

I think this is very likely simply because there are so many smart people looking at it right now. I hope the bubble doesn't burst before it happens.



