Hacker News new | past | comments | ask | show | jobs | submit login

At this level, it is very contextual - depending on your tools, prompts, language, libraries, and the whole code base. For example, for one project, I am generating ggplot2 code in R; Claude 3.5 gives way better results than the newer Claude 3.7.

Compare and contrast https://aider.chat/docs/leaderboards/, https://web.lmarena.ai/leaderboard, https://livebench.ai/#/.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: