I got Claude 4 Opus to summarize this thread on Hacker News when it had hit 319 ...

mrandish · 2025-05-22T19:18:41 1747941521

Interesting, thanks for doing this. Both summaries are serviceable and quite similar but I had a slight preference for Sonnet 4's summary which, at just ~20% of the cost of Claude 4 Opus, makes it quite the value leader.

This highlights that, as the compute needed for each additional increment of progress on hard problems keeps spiraling upward, the top models for those problems will continue to cost significantly more. I wonder if we'll see something like an automatic "right-sizing" feature that uses a less expensive model for easier problems. Or maybe knowing whether a problem is hard or easy (with sufficient accuracy) is itself hard.

swyx · 2025-05-22T19:26:50 1747942010

this is known as model routing in the lingo, and yes, there are both startups and big labs working on it
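
for the curious, a minimal sketch of the "right-sizing" idea, not how any lab or startup actually does it. everything here is an assumption for illustration: the model names are placeholders, the relative costs just mirror the ~20% figure quoted above, and `estimate_difficulty` is a toy keyword/length heuristic standing in for what would more likely be a trained classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str              # placeholder name, not a real model ID
    relative_cost: float   # cost per request relative to the large model


# Assumption: the cheap model costs ~20% of the expensive one, per the thread.
CHEAP = Model("small-model", 0.2)
EXPENSIVE = Model("large-model", 1.0)

# Hypothetical keywords that hint at a harder task.
HARD_HINTS = ("prove", "refactor", "debug", "multi-step", "optimize")


def estimate_difficulty(prompt: str) -> float:
    """Toy difficulty score in [0, 1] based on keywords and prompt length."""
    text = prompt.lower()
    hint_score = sum(hint in text for hint in HARD_HINTS) / len(HARD_HINTS)
    length_score = min(len(text.split()) / 200, 1.0)
    return max(hint_score, length_score)


def route(prompt: str, threshold: float = 0.2) -> Model:
    """Send prompts scoring at or above the threshold to the expensive model."""
    if estimate_difficulty(prompt) >= threshold:
        return EXPENSIVE
    return CHEAP


if __name__ == "__main__":
    for p in ("Summarize this thread",
              "Debug and refactor this multi-step build pipeline"):
        m = route(p)
        print(f"{p!r} -> {m.name} (relative cost {m.relative_cost})")
```

the hard part is exactly the one raised above: the whole scheme is only as good as the difficulty estimate, and a misrouted hard problem costs you a bad answer rather than a few cents.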

swyx · 2025-05-22T19:26:15 1747941975

analysis as the resident summaries guy:

- sonnet has better summary formatting: "(72.5% for Opus)" vs "Claude Opus 4 achieves '72.5%' on SWE-bench", especially in the Uncommon Perspectives section

- sonnet is a lot more cynical: opus at least included a good recap of performance, capabilities, and pricing, while sonnet reported rapid release fatigue

- overall opus produced marginally better summaries but probably not worth the price diff

i'll run this through the ainews summary harness later for comparison, if that's interesting to folks