
Just want to say nice job and keep it up. Thrilled to start playing with 3.7.

In general, benchmarks seem to be very misleading in my experience, and I still prefer Sonnet 3.5 for _nearly_ every use case, except massive text tasks, for which I use Gemini 2.0 Pro with its 2M token context window.



An update: "code" is very good. Just did a ~4 hour task in about an hour. It cost $3, which is more than I usually spend in an hour, but very worth it.


I find the webdev arena tends to match my experience with models much more closely than other benchmarks: https://web.lmarena.ai/leaderboard. Excited to see how 3.7 performs!



