
HAHAHAHA

What a load of crap; they completely missed the point of AI. They'll start adding if/then/elses soon as well, maybe a compilation step? Lmao.

I don't know if anyone has the statistic, but I'd guess the vast majority of user queries are around 100 tokens or shorter. Imagine loading 24k tokens to answer 0.1k: roughly 99.6% of the prompt is fixed overhead.
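Back-of-the-envelope, assuming a ~24k-token system prompt and a ~100-token query (both figures are the guesses above):

    system_prompt_tokens = 24_000  # reported size of Claude's system prompt
    query_tokens = 100             # a typical short user query (my guess)

    overhead = system_prompt_tokens / (system_prompt_tokens + query_tokens)
    print(f"{overhead:.1%}")       # -> 99.6%: almost the whole prompt is fixed overhead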

I wish I could just short Anthropic.



> HAHAHAHA

> What a load of crap

Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

Please don't fulminate. Please don't sneer.

Please don't use uppercase for emphasis.

https://news.ycombinator.com/newsguidelines.html


> I don't know if anyone has the statistic, but I'd guess the vast majority of user queries are around 100 tokens or shorter. Imagine loading 24k tokens to answer 0.1k: roughly 99.6% of the prompt is fixed overhead.

That’s par for the course. These things burn GPU time even when they're used as a glorified Google that's prone to inventing things. They are wasteful in the vast majority of cases.

> I wish I could just short Anthropic.

What makes you think the others are significantly different? If all they have is an LLM screwdriver, they're going to spend a lot of effort turning every problem into a screw; that's not surprising. An LLM cannot reason, it can only generate text conditioned on its context, so it's logical to use the context to tell it what to do.


> What makes you think the others are significantly different?

ChatGPT's prompt is on the order of 1k tokens, if the leaks are real. Even that seems a bit high for my taste, but they're the experts, not me.

> It’s logical to use the context to tell it what to do.

You probably don't know much about this, but no worries, I can explain. You can train a model to "become" anything you want. If your default prompt starts to be measured in kilobytes, it might be better to re-train (obviously not re-train the same one, but a v2.1 or whatever, trained with this behaviour in mind) and/or fine-tune, because your model behaves quite differently from what you want it to do.

I don't know the exact threshold, and there might not even be one, since training an LLM takes some sort of artisan skill, but if you need 24k tokens just to boot the thing you're clearly doing something wrong, aside from the waste of resources.
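To make the fine-tuning idea concrete, a minimal sketch (not anyone's actual pipeline; the file name and examples are made up): chat fine-tuning data is just example conversations, so the behaviour a kilobytes-long system prompt spells out can instead be demonstrated across training examples.

    import json

    # Hypothetical SFT examples: the desired behaviour (tone, refusals, formatting
    # habits, ...) is shown in the assistant turns instead of being restated in a
    # huge system prompt on every request.
    examples = [
        {
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},  # short, or even empty
                {"role": "user", "content": "Write me a haiku about the sea."},
                {"role": "assistant", "content": "Grey waves fold and lift / ..."},
            ]
        },
        # ...thousands more, covering everything the long prompt currently describes
    ]

    with open("sft_data.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")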


But this is the solution the most cutting-edge LLM research has yielded; how do you explain that? Are they just willfully ignorant at OpenAI and Anthropic? If fine-tuning is the answer, why aren't the best doing it?


I'd guess the benefit is that it's quicker/easier to experiment with the prompt? Claude has prompt caching; I'm not sure how efficient that is, but they offer a discount on requests that make use of it. So it might be efficient enough that the tradeoff is worth it for them?
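If I'm reading the docs right, caching is opted into by tagging the long prefix with a cache_control block, roughly like this (the model string and prompt variable are placeholders):

    import anthropic

    LONG_SYSTEM_PROMPT = "..."  # imagine the ~24k-token prompt here

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": LONG_SYSTEM_PROMPT,
                # mark the prefix as cacheable; later requests with the same
                # prefix get the discounted "cache read" pricing
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": "Hello!"}],
    )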

Also, I don't think much of this prompt is used in the API, and a bunch of it is enabling specific UI features like Artifacts. So if they reuse the same model for the API (I'm guessing they do, but I don't know), then I guess they're limited in terms of fine-tuning.


Prompt caching is functionally identical to snapshotting the model after it has processed the prompt. And you need the KV cache for inference in any case, so it doesn't even cost extra memory to keep it around if every single inference task shares the same prompt prefix.
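A toy version of that "snapshot" with Hugging Face transformers (gpt2 as a stand-in; real serving stacks do this far more carefully, and some transformers versions mutate the cache in place, so you'd copy it per request):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    # 1. Run the fixed system prompt once and keep its KV cache (the "snapshot").
    prefix_ids = tok("You are a helpful assistant. <24k tokens of rules...>",
                     return_tensors="pt").input_ids
    with torch.no_grad():
        kv_cache = model(prefix_ids, use_cache=True).past_key_values

    # 2. Each request reuses the cache and only pays compute for its own tokens.
    query_ids = tok(" User: hi", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(query_ids, past_key_values=kv_cache, use_cache=True)
    # out.logits covers only the query tokens; the prefix was never recomputed.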


The job of the system is to be useful, not to be AI.

Long prompts are very useful for getting good performance and establishing a baseline for behaviour, which the model can then continue.

Furthermore, you can see this as exploiting an aspect of these models that makes them uniquely flexible: in-context learning.
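A toy example of what in-context learning means here (format illustrative): the model picks up a pattern purely from text in its context, with no weight updates, and a long system prompt leans on the same mechanism to set behaviour.

    # Toy few-shot prompt: the input -> output pattern is "learned" from the context alone.
    few_shot_prompt = """\
    Classify the sentiment of each review as positive or negative.

    Review: "The battery died after two days."
    Sentiment: negative

    Review: "Setup took thirty seconds and it just works."
    Sentiment: positive

    Review: "The screen scratches if you look at it wrong."
    Sentiment:"""
    # A capable model continues with " negative"; system prompts exploit the same
    # mechanism to establish tone, rules and persona for the rest of the conversation.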


Short AI. Hope you are more solvent than a glue factory!

Re: prompt length. Somewhere in the comments people talk about caching; effectively it is zero cost.



