
Do you have a link for info on this? If this is a real issue (which I still have my doubts about), Apple can’t just sweep it under the rug, after all.



It's a real issue affecting a small minority of users; I know of one who has already gone through 20% of their SSD lifetime writes.

Here's one of the worse ones back in February: https://twitter.com/marcan42/status/1361847686417190918?s=19

11.4 is in beta so not many people are using it, but at least one of the folks with the issue is running it and said it improved things.

Apple were definitely made aware of it, which is why a fix landing in 11.4 makes sense, though there is no official statement that I'm aware of.

I was never able to reproduce it myself; we never found a specific trigger, but some people have the issue consistently and others (most) don't. I only managed to trigger thrashing with very blatant memory pressure (i.e. allocating most of the system capacity and continuously reading it to keep it hot), which obviously isn't what these users were doing.
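For anyone who wants to try the blatant memory pressure case described above, here is a minimal sketch of the idea, not the exact test that was run. The function names and the 16 KiB page size are assumptions for illustration; pick a total size near your physical RAM at your own risk.

```python
PAGE_SIZE = 16 * 1024  # assumption: 16 KiB pages; adjust for your machine


def allocate_and_dirty(size_bytes, page_size=PAGE_SIZE):
    """Allocate a buffer and write one byte per page so every page is really backed."""
    buf = bytearray(size_bytes)
    for offset in range(0, size_bytes, page_size):
        buf[offset] = 1
    return buf


def touch_pages(buf, page_size=PAGE_SIZE):
    """Read one byte from every page to keep the buffer hot; returns the page count."""
    view = memoryview(buf)
    pages = 0
    for offset in range(0, len(view), page_size):
        pages += view[offset]  # each page was dirtied with the value 1
    return pages
```

Calling `touch_pages` in a loop on a buffer close to the size of physical RAM is what keeps the pages hot; with a small buffer it is harmless and just returns the number of pages touched.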

People have looked at the OSX swapper code, and there were some hints that the algorithm it uses to decide when to swap may have had some bugs; if 11.4 fixes it then I'm sure we'll find out once the XNU source drops and we diff it. Nobody outside of Apple has actually tried to root-cause this (i.e. by running debug XNU builds on an affected workload/user).

Also, we never confirmed that this was an M1-exclusive issue. There's some evidence that this is a Big Sur regression that affected all Macs; it's just that the effects aren't obvious on Intel ones, because the age of the SSD makes it hard to draw conclusions unless you're actively watching lifetime writes over the course of weeks. On M1s, since the machines are young, the problem is obvious with a single data point.


I still don't see how this can be classed as a bug by anyone other than Apple.

You have a single case of 10% lifetime usage (plus a 20% one you mention), along with thousands of reports of people with 2-5% (which you also stated was too high), based on your insistence on using TBW (which can vary by up to 10000x depending on the tech) instead of percentage used (supplied by the manufacturer).

I had an out-of-memory alert on my machine earlier because I opened a TypeScript file in VLC. It was using 26GB of memory (and climbing) when I noticed it and killed it. I have an 8GB RAM machine. The machine remained fully responsive throughout. That simply wasn't possible before.

It's definitely swapping a lot, for sure, but don't you think that there is a possibility that this is by design, sacrificing disk writes (I am at 50TBW and still ONLY 2% "used" on a 256GB drive since launch) to make app switching more responsive?

I guess we will see when you are able to diff the source, and you can shut me up once and for all :)


It is by design, but not that much. That's the point. The machines are designed to use swap and memory compression to greatly enhance responsiveness even with less physical RAM than competitors. And that works well for most users. But there's a bug in the heuristic, and for some users, it starts pathologically swapping.

We've seen the numbers go up in the activity monitor. Even while doing ~nothing. Fast. That is obviously a bug. Even with some Electron apps open and such, I guarantee the working set of active apps was nowhere near the physical RAM size. And so, that's a bug.

Terabytes per day of swap activity is not normal, no matter how much these machines are designed to swap on purpose.

As I said, there's one user with 20% usage as reported by the drive. That's not TBW, that's real (they're at >500 TBW, for what it's worth), and it means that machine is going to have a dead SSD within 2 years if the issue isn't fixed.
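The arithmetic behind that estimate can be sketched as follows. This is a rough linear extrapolation, not how the drive computes its own wear figure; the function names are hypothetical, and the 6 months in service is an assumption (the thread does not state the machine's age).

```python
def projected_total_lifetime_months(percent_used, months_in_service):
    """Linearly extrapolate total drive lifetime from the drive-reported percentage used."""
    if not 0 < percent_used <= 100:
        raise ValueError("percent_used must be in (0, 100]")
    return months_in_service * 100.0 / percent_used


def implied_endurance_tb(tb_written, percent_used):
    """Total writes the drive would reach at 100% used, if wear scales linearly."""
    if not 0 < percent_used <= 100:
        raise ValueError("percent_used must be in (0, 100]")
    return tb_written * 100.0 / percent_used


# Figures from the comment above: 20% used at roughly 500 TB written.
# ASSUMPTION: about 6 months in service; this is not stated in the thread.
total_months = projected_total_lifetime_months(20, 6)
remaining_months = total_months - 6
endurance_tb = implied_endurance_tb(500, 20)
```

Under those assumptions the drive reaches 100% used after 30 months in total, i.e. about 24 months from now, and the implied endurance is about 2500 TB. The reported percentage is coarse and the write rate may not stay constant, so treat this as an order-of-magnitude check only.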


> Terabytes per day of swap activity is not normal, no matter how much these machines are designed to swap on purpose.

1TB is only 62.5 * 16GB. If it's paging out 8GB+ apps (quite easy for Chrome with a number of tabs) it only takes one memory hog to increase the TBW in a few hours of typical app switching for a mobile app developer.
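To put numbers on both sides of this, a quick sketch (decimal units, hypothetical function names) of how many full-RAM page-outs make up a given write volume, and what sustained write rate a per-day volume implies:

```python
GB_PER_TB = 1000          # decimal units, as drive vendors count them
SECONDS_PER_DAY = 86400


def full_ram_pageouts(total_written_tb, ram_gb):
    """How many times the whole of RAM would have to be paged out to write this much."""
    return total_written_tb * GB_PER_TB / ram_gb


def sustained_rate_mb_per_s(tb_per_day):
    """Average write rate needed to reach a given daily volume, in MB/s."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


pageouts_16gb = full_ram_pageouts(1, 16)   # 62.5, matching the figure above
pageouts_8gb = full_ram_pageouts(1, 8)     # 125.0
rate = sustained_rate_mb_per_s(1)          # roughly 11.6 MB/s, around the clock
```

So 1 TB per day means writing out the full contents of a 16GB machine's RAM about 62 times a day, or averaging roughly 11.6 MB/s of swap writes continuously. Whether that counts as normal is exactly what is being argued here.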

This edge case is pretty extreme, sure, but it's still a MINIMUM lifetime of 2 years. It doesn't mean it's suddenly going to die when it hits 100%, and even if it did it should be covered by warranty. And this usage is an order of magnitude more than the vast majority of other reports that were made.

I'm inclined to think it's a non-issue, but totally respect your position.

As an aside, I use tab suspenders on my browsers, a habit from my Intel Mac where Chrome frequently caused memory congestion. It's probably why I get away with running 2 iOS simulators, an Android emulator, Xcode, IntelliJ, 3 VS Code instances, Safari, Firefox and Chrome, and a bunch of utilities and services on an 8GB machine. But I'll still be first in line for a 32GB+ 16+ core machine, because then I'll be able to run VMs :D


Just FYI, here's an Intel user with the issue. 50% lifetime usage in 7 months.

There is no way this is normal :-)

https://twitter.com/VE7FIM/status/1396395431941210118?s=19


A user having high drive usage doesn't make it an issue, let alone the same issue.

That user you linked to is using Catalina (as they mention in their Twitter thread, where they demonstrate a 3% usage increase over 2 weeks), so it will be completely unrelated to the support for Apple silicon, which wasn't added until Big Sur.


Ironically enough, had swapping not been so fast then runaway RAM usage bugs probably would've been found a lot sooner as they crippled the machine.


Tangential aside: I wonder if the kernel was running the swapfest on the efficiency cores, and that's what let the system remain so responsive.

See also: https://eclecticlight.co/2021/05/17/how-m1-macs-feel-faster-...


Swapping isn't CPU-intensive, and Apple also implemented the memory compression as custom CPU instructions. Swapping is I/O intensive, and these machines have stupid fast SSDs which is why they can get away with it.

When you think of swapping as slow it's not because it eats CPU, it's because it blocks on I/O.


It's all very interesting. I wish Apple would be less tight-lipped about how it all works together. There's so much guesswork because we don't fully understand how the new arch is being utilized.


>Apple can’t just sweep it under the rug

Check the history: until a class action lawsuit forces Apple to admit the problem, they will sweep it under the rug or claim you are using it wrong.



