I think you are right in saying that there is some deep intuition that takes mon...

simonw · 2026-01-12T01:38:00 1768181880

My great regret from the past few years is that experimenting with LLMs has been such a huge distraction from my other work! My https://llm.datasette.io/ tool is from that era though, and it's pretty cool.

tymscar · 2026-01-12T01:47:48 1768182468

I do think your Datasette work is fantastic and I genuinely hope you take my previous message the right way. I’m not saying you’re doing anything bad, quite the opposite: I feel like we need more of you, and I’m afraid that because of LLMs we’re getting less of you.

beaker52 · 2026-01-12T06:50:54 1768200654

(Breaking the 4th wall for a minute):

It’s not just Simon that we’re getting less of, it’s YOU we’re getting less of too. And we want you around. Don’t go.

beaker52 · 2026-01-12T07:15:15 1768202115

> because of antipatterns that don’t apply anymore, such as always starting a new chat

I’m keen to understand your reasoning on this. I don’t agree, but maybe I’m just stuck with old practices, so help me?

What’s your justification as to why starting a new chat is an antipattern?

elliotto · 2026-01-12T20:43:40 1768250620

It used to be that the bots had a short context window, and they struggled with getting confused by past context, so it was much better to make a new chat every now and then to keep the thread on track.

The opposite is true now. The context windows are enormous, and the bots are able to stay on task extremely well. They're able to utilize any previous context you've provided as part of the conversation for the new task, which improves their performance.

The new pattern I am using is a master chat that I only ever change if I am doing something entirely different.

beaker52 · 2026-01-12T22:25:59 1768256759

That’s cool. I know context windows are arbitrarily larger now because consumers think that larger window = better, but I think the sentiment that the model can’t even use the window effectively still stands?

I still find LLMs perform best with a potent and focussed context to work with, and performance goes down quite significantly the more context they have.

What’s your experience been?

elliotto · 2026-01-15T02:21:30 1768443690

I worked at a startup experimenting with gemini-2.0-flash (the year-old model), using its full 1m context window to query technical documents. We found it to be extremely successful at needle-in-a-haystack type problems.

As we migrated to newer models (gemini-3.0 and the o4-mini models) we found they performed even better with x00k tokens. Our system prompt grew to about 20k tokens and the bots were able to handle it perfectly. Our issue became time to first token with large context, rather than the quality of the bot's output.

The ultra-large 1m+ llama models were reported to be ineffective at >1m context. But at that point, they become cost-prohibitive to use anyway.

I am continuing to have success using Cursor's Auto model and GPT-5.1 with extremely long conversations. I use different chats for different problems more for my own compartmentalisation of thoughts than as a necessity for the bot.

jcheng · 2026-01-12T02:09:36 1768183776

> 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.

I see what you're saying, but I'm not sure it is true. Take simonw and tymscar, put them each in charge of a team of 19 engineers (of identical capabilities). Is the result "nowhere near the same jump" as simonw vs. tymscar alone? I think it's potentially a much bigger jump, if there are differences in who has better ideas and not just who can code the fastest.

tymscar · 2026-01-12T02:33:07 1768185187

I agree; however, there you’re not comparing technical knowledge alone, you’re also comparing managerial skills.

With LLMs it’s admittedly a bit closer to doing it yourself, because the feedback loop is much tighter.

jcheng · 2026-01-12T18:00:01 1768240801

Yeah... and besides managerial skills, also product (using the word loosely) sense, user empathy, clarity of vision, communication skills. They've always been multipliers for programmers, even more so in this moment.

tymscar · 2026-01-12T18:02:43 1768240963

Are they multipliers when you do less of it and offload more of it to the same tool everyone else uses?