How often is this happening? If criminals are aware that 401k transfers are done using cheques in the post, and know some of the addresses these are typically sent to, then I'd expect this to be a very common type of fraud. Yet the practice of sending 401k cheques in the mail continues? Weird.
This is simultaneously why most people desperately want to invest in OpenAI and at the same time why all the best gen AI researchers want to work for Anthropic. The less you understand, the more impressive this seems. Conversely, the more you understand, the more embarrassing this seems.
If you game the benchmark then you always get found out by your users. Yet the practice remains common in hardware. Outright lies are uncommon, but misleading and cherry-picked numbers are pretty much standard practice.
The fact that misleading benchmarks don't even drive profit at Meta didn't seem to stop them doing the same thing, but perhaps this isn't very surprising. I imagine internal incentives are very similar.
Unlike the hardware companies though, gaming the benchmark in LLMs seems to involve making the actual performance worse, so perhaps there is more hope that the practice will fade away in this market.
I had the same thought. I'm guessing rigorous and expensive safety certification, a custom-designed steam-driven turbo and alternator, and stripping back and rebuilding the engine carriage? The fact it's got batteries and an alternator and a turbo suggests some stringent requirements.
Each sex can have their own stereotypes if you wish:
The male drunk driver rushing through the same intersection is probably even more common than the unfortunately common screen-distracted Karen.
I'm really not trying to say that OP is misogynistic here, just mentioning that it can come across that way because the "Karen" example is oddly specific and not relevant to the overall point. Yes it's a stereotype and you could pick a male one but that'd also be weird and potentially also come across as sexist.
They used deep neural networks, reinforcement learning, and Monte Carlo tree search. All except the MCTS are critical components of modern LLMs. MCTS is a form of planning which you can argue has parallels to "reasoning" models, although that's pretty tenuous I admit.