I've been using 4.5 for the better part of the day. I also have access to o3-min...

leumon · 2025-02-28T14:07:35 1740751655

This model does have a niche use case: since it's so large, it has a lot more knowledge and hallucinates much less. For example, as a test question I asked it to list the best restaurants in my small town, and all of them existed. None of the other LLMs get this right.

A_D_E_P_T · 2025-02-28T14:14:44 1740752084

I tried the same thing with companies in my industry ("list active companies in the field of X") and it came back with a few that have been shuttered for years, in one case for nearly two decades.

I'm really not seeing better performance than with o3-mini.

If anything, the new results ("list active companies in the field of X") are actually worse than what I'd get with o3-mini, because the 4.5 response is basically the post-SEO Google first page (it appears to default to mentioning the companies that rank most highly on Google), whereas the o3 response was more insightful and well-reasoned.

lolinder · 2025-02-28T14:18:57 1740752337

That's also a use case where the consensus among those in the know is that you shouldn't be relying on the model's size in the first place.

You know what gets the list of restaurants in my hometown right? Llama 3.2 1B Q4 running on my desktop with web search enabled.