Everything about this feels like what Microsoft should have done. It’s absolutely amazing to me that search is so broken in Windows and yet a free third-party tool can instantly find any file anywhere.
One hypothetical I wonder about is what the Windows ecosystem would look like if third parties could make distributions of Windows — if that could somehow be licensed and enough of the Windows build/packaging process were opened up. It'd be interesting to see whether collaborative projects would form that pull out MS components and substitute their own, presumably under the constraint of maintaining compatibility. I imagine it would take a while for any commercial products considering involvement to figure out sharing, trust, and how to offer it in a way companies or individuals might want to donate to or pay for.
Windows file search has been useless for as far back as I can remember, especially file indexing and the load it puts on the CPU. I usually just disable file indexing on a new Windows install.
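If anyone wants to do the same, here's a minimal PowerShell sketch (run as Administrator; `WSearch` is the name of the Windows Search indexing service — disabling it turns off content indexing system-wide, so this is just one way to do it):

```powershell
# Stop the Windows Search indexer and keep it from starting on boot.
Stop-Service -Name WSearch -Force
Set-Service -Name WSearch -StartupType Disabled

# To undo later:
#   Set-Service -Name WSearch -StartupType Automatic
#   Start-Service -Name WSearch
```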
I genuinely just don't use the Start Menu anymore. It cannot find anything, and every search includes two internet results (Bing only, of course) and a Microsoft Store reference.
Principal engineers! We need architecture! Marketing team, we need ads with celebrities! Product team, we need a roadmap to build on this for the next year! ML experts, get this into the training and RL sets! Finance folks, get me annual forecasts and ROI against WACC! Ops, we'll need 24/7 coverage and a guarantee of five nines. Procurement, lock down contracts. Alright everyone… make this button red!
We have to reject the idea that if Claude can do it with a simple prompt, then everyone can do it. As SWEs, we are not going to pragmatically accept that we are done. https://www.youtube.com/watch?v=g_Bvo0tsD9s
Ha! The default system prompt appears to give the main agent appropriate guidance about only using swarm mode when appropriate (just as it does for entering plan mode). You can further prompt it in your own CLAUDE.md to be even more resistant to using the mode if the task at hand isn't significant enough to warrant it.
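For example, a CLAUDE.md addition along these lines (the wording here is hypothetical, not official guidance — adjust to taste):

```markdown
## Swarm mode
- Default to working single-agent. Do not use swarm mode for small or medium tasks.
- Only use swarm mode when the task clearly decomposes into independent subtasks
  (e.g. touching many unrelated files) AND single-agent work would be materially slower.
- When in doubt, ask before spawning subagents.
```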
I wonder how much of it is due to the model being familiar with the game or parts of it, whether from training on the game itself or on walkthroughs and playthrough videos online.
There was a well-publicised "Claude plays Pokémon" stream where Claude failed to complete Pokémon Blue in spectacular fashion, despite weeks of trying. I think only a very gullible person would assume that future LLMs won't specifically bake this into their training, as they do for popular benchmarks or for penguins riding a bike.
While it is true that model makers are increasingly trying to game benchmarks, it's also true that benchmark-chasing is lowering model quality. GPT-5, 5.1, and 5.2 have been nearly universally panned by almost every class of user, despite being benchmark monsters. In fact, the more OpenAI tries to benchmark-max, the worse their models seem to get.