Interesting, thanks for doing this. Both summaries are serviceable and quite similar but I had a slight preference for Sonnet 4's summary which, at just ~20% of the cost of Claude 4 Opus, makes it quite the value leader.
This just highlights that, with compute requirements for meaningful traction against hard problems spiraling skyward for each additional increment, the top models on current hard problems will continue to cost significantly more. I wonder if we'll see something like an automatic "right-sizing" feature that uses a less expensive model for easier problems. Or maybe knowing whether a problem is hard or easy (with sufficient accuracy) is itself hard.
Token cost: 22,275 input, 1,309 output = 43.23 cents - https://www.llm-prices.com/#it=22275&ot=1309&ic=15&oc=75&sb=...
Same prompt run against Sonnet 4: https://gist.github.com/simonw/1113278190aaf8baa2088356824bf...
22,275 input, 1,567 output = 9.033 cents https://www.llm-prices.com/#it=22275&ot=1567&ic=3&oc=15&sb=o...