Hacker News | esskay's comments

Real world usage suggests otherwise. It's been a known trend for a while. Anthropic even confirmed as much ~6 months ago but said it was a "bug" - one that somehow just keeps happening 4-6 months after a model is released.

Real world usage is unlikely to give you the large sample sizes needed to reliably detect the differences between models. Standard error scales as the inverse square root of sample size, so even a difference as large as 10 percentage points would require hundreds of samples.
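
To put rough numbers on that, here is a minimal sketch using the standard normal-approximation formula for comparing two proportions. The 50% and 60% success rates, the 5% significance level, and the 80% power are illustrative assumptions, and `samples_per_group` is a hypothetical helper name, not anything from a real benchmark harness.

```python
import math
from statistics import NormalDist


def samples_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate tasks needed per model to detect p1 vs p2.

    Uses the normal approximation for a two-sided two-proportion test.
    alpha and power are assumed defaults, not values from the thread.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of Bernoulli variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Assumed scenario: success rate drops from 60% to 50%.
    print(samples_per_group(0.5, 0.6))
    # Halving the gap to 5 points roughly quadruples the requirement.
    print(samples_per_group(0.5, 0.55))
```

Under those assumptions a 10-point gap needs on the order of 385 tasks per model, and because the requirement grows with the inverse square of the gap, a 5-point gap needs roughly four times as many.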

https://marginlab.ai/trackers/claude-code/ tries to track Claude Opus performance on SWE-Bench-Pro, but since they only sample 50 tasks per day, the confidence intervals are very wide. (This was submitted 2 months ago https://news.ycombinator.com/item?id=46810282 when they "detected" a statistically significant deviation, but that was because they used the first day's measurement as the baseline, so at some point they had enough samples to notice that this was significantly different from the long-term average. It seems like they have fixed this error by now.)
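
As a back-of-the-envelope check on "very wide", here is a small sketch of the 95% margin of error for a pass rate measured on 50 tasks. It uses the simple Wald (normal-approximation) interval, and the 50% pass rate is an assumed worst case for illustration, not the tracker's actual figure; `margin_of_error` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def margin_of_error(pass_rate, n_tasks, confidence=0.95):
    """Half-width of a Wald confidence interval for a pass rate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)            # e.g. ~1.96 for 95%
    std_err = math.sqrt(pass_rate * (1 - pass_rate) / n_tasks)  # binomial standard error
    return z * std_err


if __name__ == "__main__":
    # Assumed 50% pass rate; one day of 50 tasks vs. a week of 350.
    for n in (50, 350):
        print(n, round(100 * margin_of_error(0.5, n), 1), "percentage points")
```

With these assumptions a single day of 50 tasks gives roughly plus or minus 14 percentage points, and even pooling a week of runs only narrows that to about 5 points.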


It's hard to trust public, high-profile benchmarks because any change to a specific model (Opus 4.5 in this case) can be rejected if it regresses on SWE-Bench-Pro, so everything that gets released will perform well on this benchmark.

Any other benchmark at that sample size would have similarly huge error bars. Unless Anthropic makes a model that works 100% of the time or writes a bug that brings it all the way to zero, it's going to work sometimes and fail sometimes, and anyone who thinks they can spot small changes in how often it works without running an astonishingly large number of tests is fooling themselves with measurement noise.

In an alternative reality Apple didn't absolutely shit the bed on AI and made this possible. Sadly they've shown they are woefully behind and have utterly useless people leading divisions they shouldn't have been allowed anywhere near.

How many more times is it going to be rebuilt before they grasp the obvious bit - it's dead, Dave.

Notepad.

Seriously, you need a heck of a lot more than a random HN reply to give you Jira alternatives if you've been embedded into its ecosystem for any length of time - and my condolences if you have.


It's fine, just not stellar. It was terrible (UX, speed, consistency) ten years ago. It's better now - mostly gets out of people's way and just works. It doesn't delight me.


Counterpoint - it's a legal requirement in several parts of the world now (and rapidly expanding). How do you think they should handle it whilst, you know... still being able to exist?


It's not a legal requirement in my country. Thus they are volunteering to go the extra mile in implementing repressive laws.

I think they should resist as much as possible. Yes it was a legal requirement to gas the Jews and it was illegal to hide them.

Who do we cheer now? Those who abided by the law or those who broke it?


It was supposed to be decentralised though, meaning there would be no central party to pursue.


Sure, which is why it's perfectly possible to work around those restrictions using any of the alternative apps that show the same data (but don't implement the legal restrictions).


It is. The block is on the client level, not the network.


How long until KYC at Bluesky becomes a centralized requirement?


What makes you think a nation state level entity can't pursue in this case...?


The same reason even nation state actors have not managed to eradicate torrents. You take one tracker down, another pops back up.


It kind of confuses me when people use the term "nation state" instead of just "state". For example, Canada is not a nation-state, but surely it is powerful and important enough that it could also pursue this kind of case.


This has been a pet peeve of mine for some time now. It seems people just feel smarter using fancy terms instead of "a state" or "a government".


Unless you specifically need a pi (unlikely) then they really are awful value now. Hard to really go out of the way to support them now they've stuck two fingers up at the solo/indie/educational community and gone all enterprise.

Second hand mini PCs are a good option. Half the price of a Pi 5 + SD + power and you often get them with 16GB RAM, a decent SSD, etc.

If you need GPIO then many of the rockchip boards are still fairly affordable and easily had.


The Pi isn't great value, but honestly, I'm finding it hard to find a better trade-off between price, performance and software support right now than the compute modules for embedded projects where you can afford to spin a custom PCB. Especially for low-ish volume or prototype stuff.


I also love the compute modules for their size. Stick one on a nano base board and they’re half the size of a Pi 5. TBH the standard Pis are a bit frustrating with all of the IO. I do not believe the average purchaser is using one as a PC replacement and wants 4 USB ports and 2 HDMI ports. I’ve never seen one in use like that. They are mostly servers or driving a single display without any user input.


100% with you on the IO. I've never even wanted two display output ports with any raspberry pi.

You know what I do want though? An actual damn HDMI port! HDMI cables are everywhere, wherever I am I have unlimited options to connect an HDMI device to some kind of screen. But micro HDMI? The literal only thing in my life that uses it is the Raspberry Pi 4 and 5. There have been plenty of times where I've reached for a Pi 3b instead of a 4 or 5 just because I didn't have a micro HDMI cable.

I do not understand what has gone through their head. How could anyone look at the use case for a Raspberry Pi and decide that two micro HDMI ports is a better choice than one HDMI port? I don't understand it. Like you, my experience with the Pi is that they mostly just sit there, headless, so the only reason I need display output is that it's useful during setup (because they don't have a proper serial console port).

I can't set up a Pi 4 or 5 without going hunting for that micro HDMI cable I bought specifically for that purpose and never use for anything else. I can set up a Pi 3b anywhere, at any time.


The micro HDMI thing (which I too loathe) is for digital signage and industrial machinery - we (home users) aren't the audience and haven't been for a long time.

Being able to run two sides of an advertising board, or two control panel screens on a big hunk of metal doing fabrication things in a factory was more important to Raspberry Pi as a business apparently.

Why the heck they didn't just go with 1x normal HDMI and 1x USB-C + DP for the Pi 5 is a mystery; perhaps the SoC doesn't support it or something.


Don't forget the case and fan. I think the RPi 3 was the last one you could comfortably run without a fan and not worry about it frying the SD card


Completely depends on what you're doing. If you're doing a lot of sustained compute, or doing graphics, then yeah you're gonna want some cooling. But it's a useful little machine for all kinds of tasks which don't cause sustained high power consumption.


Two fried on me. One was just running a printserver without a case. It was in summer so ambient temperature was around 32C but still, you telling me you use rpi 5 without even a cooling case?


I have been using a Pi 4 as a desktop computer for a few years (didn't have anything else) with a microSD card and without any fan, heatsink or case. Haven't had any problems. Obviously, this depends on your environment, but it worked fine for me.


I've had an rPi4 running a copy of a forum and server (for reference) in one of the fancy aluminum cases which passively cools for a couple of years now, no issues.


The big chunky aluminum ones do seem pretty good on the pi 4. I had one in the flirc case for a long time and it never seemed to have issues. Obviously adds to the cost though. Also not sure if the Pi 5 works as well in them given its higher thermals, and the Pi 4 didn't exactly run cool so imagine the 5 might throttle occasionally without active cooling.


Yeah I should have been more specific, a fan isn't the only option but you need either a fan or a cooling case. Running them naked is too risky now


They don't need to use an M series. The chip they are using isn't far off (spec wise) a base M4. Its single-threaded performance is damn good.


I didn't realize that. I thought it was a different architecture - or different enough that the two couldn't run the same binaries. Are they indeed binary compatible?


You can run iOS apps on an Apple Silicon Mac if the developer doesn't explicitly prevent you.


Additionally Apple have been sticking M-series chips in iPads for a while now. They appear to be pretty much interchangeable.


Yeah, there's very little difference at this point between the A and M lines. Think of the M as just being the more powerful line, but that doesn't make the A line weak, not by any stretch. Both are completely binary compatible at this point.

The A18 Pro single core performance is on-par with the M4 (a smidge lower but barely anything in it), and outperforms the M3.


Yes! It’s simply a naming convention.


You get what a budget product is right?


You're not the audience, obviously.


It's a show-stopper for the main audience: students.


Costs. Gotta remember this thing is based on iPhone hardware... which doesn't have more than 1 USB port normally.

