One thing I hope to see included is a precursor step when constructing specs where Claude is used to intelligently inquire about gaps to fill that would disambiguate the implementation. If you told an engineer to do something with a set of requirements and outcomes, they'd naturally also have follow-up questions to ensure alignment before executing.
Yes, kind of like OpenAI's deep research tool. I often find that a number of mistakes are made because clarification questions aren't asked or even considered.
> And in the long run, the best way to get what you want is to deserve it.
Love this quote.
And just a comment on agency: it's not necessarily rewarded or even acknowledged in all environments. Some expect that you're literally a worker-robot and just need someone to carry out menial tasks from up top. You don't want to end up at one of these places, regardless of what your life situation is.
May be worth it for founders if you're naturally obsessive about the problem you're solving. No bueno for employees (or anyone with less than say ~5% equity)
While I agree generally with the premise that the silver bullet AI coding has been marketed as has underdelivered (even if it doesn't feel that way), I gotta point out that the experiment and its results don't do a good job of capturing that. One of the biggest parts of using these AI tools is knowing which tasks they're most suitable for (and sometimes that means using them on only certain subtasks of a task). As mentioned, some tasks they absolutely excel at. Flipping a coin to decide whether to use them is crude and unrealistic. Hard to come up with a reliable method though; I also think METR's study has its own glaring issues.
You're exactly right. To be honest, in pretty much every case I've seen, indicating usage of a read-only resource directly in the prompt always outperforms using the MCP for it. Should really only be using MCP if you need MCP-specific functionality imo (elicitation, sampling)
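To make the inlining approach concrete, here's a minimal sketch of what "indicating usage of a read-only resource directly in the prompt" can look like. This is illustrative only: `build_prompt` and the `<resource>` tagging convention are hypothetical, not from any real SDK or the MCP spec; the point is simply that the resource lands in-context up front instead of behind a tool call.

```python
from pathlib import Path

def build_prompt(question: str, resource_path: str) -> str:
    """Embed a read-only resource directly in the prompt text.

    Hypothetical helper: reads the file once and inlines it, so the
    model sees the full resource in-context with no tool round-trips.
    """
    resource = Path(resource_path).read_text()
    return (
        "Use the following reference material to answer.\n"
        f'<resource name="{resource_path}">\n{resource}\n</resource>\n\n'
        f"Question: {question}"
    )
```

The trade-off: this burns context-window tokens on every request, but avoids the latency and failure modes of an extra tool hop, which is why it tends to win for small, static, read-only resources.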
If they do start to become unsustainable, you might see more companies moving to a BYOK or usage-based billing model. If they do that, I don't know whether the use cases for AI would justify the cost for consumers (though perhaps for businesses). There's been a ton of data center build-out, so I do think the cost reductions we've seen so far may continue, but at the expense of more performant models. Hard to tell right now though.