Neither a statement for or against Grok or Anthropic: I've now just taken to see...

vessenes · 2025-02-24T19:05:04 1740423904

O1 pro is excellent at figuring out complex stuff that Claude misses. It’s my go to mid level debug assistant when Claude spins

maeil · 2025-02-25T02:20:13 1740450013

Ive found the same but find o3-mini just as good as that. Sonnet is far better as a general model, but when it's an open-ended technical question that isn't just about code, o3-mini figures it out while Sonnet sometimes doesn't. In those cases o3 is less inclined to go with purely the most "obvious" answer when it's wrong.

OsrsNeedsf2P · 2025-02-25T08:16:54 1740471414

I have never, in frontend, backend, or Android, had O1 pro solve a problem Claude 3.5 could not. I've probably tried it close to 20 times now as well

mrcwinn · 2025-02-25T15:26:48 1740497208

What's really the value of a bunch of random anecdotes on HN — but in any case, I've absolutely had the experience of 3.5 falling over on its face when handling a very complex coding task, and o1 pro nailing it perfectly.

Excited to try 3.7 with reasoning more but so far it seems like a modest, welcome upgrade but not any sort of leapfrog past o1 pro.

airstrike · 2025-02-25T15:50:26 1740498626

I've never had o1 figure something out that Claude Sonnet 3.5 couldn't. I can only imagine the gap has widened with 3.7.