Hacker News | quantumHazer's comments

The author is an employee of Cursor/Anysphere. I think this should've been declared at the start of the article. It's always like this.

Not unpopular at all. I think it's one of the best Ferraris ever made.


Or maybe, hear me out, we don't need any of this "agent"-first shiny thingy.


I wouldn’t be surprised if this is undisclosed PR from Anthropic


I'd be very surprised if it wasn't. Everything about that company turns me off. I've run across countless YouTube videos that are clearly Anthropic PR pretending to be real videos by regular people just trying it out and discovering how good Claude is. I'll stick with Gemini.


Why are we commenting on the Claude subreddit?

1) It's not impartial.

2) It's useless hype commentary.

3) It's literally astroturfing at this point.


Seems pretty false if you look at the model card and website of Opus 4.5, which is… (checks notes) their latest model.


Building a good model generally means it will do well on benchmarks too. The point of the speculation is that Anthropic is not focused on benchmaxxing, which is why they have models people like to use day to day.

I use Gemini. Anthropic stole $50 from me (they expired and kept my prepaid credits) and I haven't forgiven them for it yet, but people rave about Claude for coding, so I may try the model again through Vertex AI...

I believe the person who made the speculation was talking more about blog posts and media statements than model cards. Most AI announcements come with benchmark touting; Anthropic supposedly does less of this in their announcements. I haven't seen or gathered the data to know what is true.


You could try Codex CLI. I prefer it over Claude Code now, but only slightly.


No thanks, I'm not touching anything Oligarchy Altman is behind.


> it won’t make sense to learn how to code.

Sure. So we can keep paying money to your employer, Anthropic, right?


Last year's models were actually at 50-60% on SWE-bench Verified.


I see 25-29% here https://www.swebench.com/viewer.html for models released in Nov 2024, albeit not on the Verified split. GPT-4o (Aug 2024) was 33% on SWE-bench Verified.

Important point, because people have a bias to underestimate the speed of AI progress.


Do you people think nobody calls your bluff?

Here's the launch card for Sonnet 3.5 from a year and a month ago. Guess the number. OK, I'll tell you: 49.0%. So yeah, the comment you replied to was not really off.

https://www.anthropic.com/news/3-5-models-and-computer-use


There is also Normal Computing[0], who are trying different approaches to chips like that. Anyway, these are very difficult problems, and Extropic has already abandoned some of its initial claims about superconductors to pivot to more classical CMOS circuits[1].

[0]: https://www.normalcomputing.com

[1]: https://www.zach.be/p/making-unconventional-computing-practi...


