One of the problems with this study is that the field is moving so very fast.

Six months is an eternity for models. Anthropic has released better models since this study was done. Gemini keeps getting better. Grok / xAI isn’t a joke anymore. To say nothing of the massive open-source advancements released in just the last couple of weeks alone.

This is all moving so fast that one already out-of-date report isn’t definitive. It's certainly an interesting snapshot in time, but it has to be understood in context.

Hacker News needs to get better on this. The head-in-the-sand vibe here won’t be tenable for much longer.