My understanding is that the performance comes from Apple silicon being optimized for the NeXT/ObjC retain/release model: they got the cost of a retain/release down from roughly 30 nanoseconds on Intel to 6.5ns on M1 (14ns when running emulated Intel code).
IMO 8gb is still barely acceptable for most consumers in 2024, but not for any laptop that you'd expect to get more than, say, two years of usefulness out of, so… yeah, it's about time.
I’m not exactly an expert on this, but I do know from experience that Objective-C programming (and anything coming out of an ARC compiler, like Swift) is a constant game of grabbing memory, using it, and releasing it. My impression is that if you can release memory faster, that means greater availability in the pool.
Guesses aside, I spent a lot of time in Xcode on an 8gb M1 while waiting for the M1 Pro to come out three years ago. It was great, performing much better than the 16gb Intel it was replacing. I still don't think 8gb is really a big deal for most people; getting more is mostly important for speculative future needs.
The behavior you describe would be a function of the memory allocator in use (system allocator, custom allocator) and independent of the hardware. It's at a much, much higher level than the hardware or even the page compression.
At the risk that I’ve missed a joke, the unified memory architecture reduces the amount of RAM you have available since “RAM” for any computer with discrete graphics doesn’t include VRAM.
I think they are saying it from the perspective that, rather than two separate banks which are rarely both full, it's one bank which is only ever full when 100% of the hardware is in use. That said, I think many miss that this is how integrated-graphics memory management has worked on x86 systems for many, many years already. I.e. the "dedicated RAM" slider in the boot firmware is a legacy holdover that should be set to a minimum token value, not something which determines the limit the iGPU has access to. macOS also works this way: a token amount of the unified memory is still reserved for the iGPU, you just can't adjust that amount higher, since there's no legacy holdover that would make doing so make sense.
They are saying it from the perspective of the RAM being on the same package as the CPU. This is one of the innovations of the Apple Silicon architecture as it SIGNIFICANTLY reduces memory access latency.
It's not just TSMC's 3nm process. It's also Apple engineering.
This engineering was common in x86 CPUs by 2013, when AMD introduced Heterogeneous System Architecture, which utilized heterogeneous Uniform Memory Access. The approach has its upsides and downsides, the main downside being that a unified bus tends to have much lower overall bandwidth (even in the Max) and runtime scalability issues. The upsides are more obvious for the types of systems people want APUs for in the first place, though, so that's usually fine.
The main bit of engineering Apple should be lauded for in the memory department is the gumption to throw hundreds of GB/s at the mid- and high-end models.
HSA was not “unified” in the modern sense. It still required designating memory as GPU-side or CPU-side, and these implied different cache-coherency rules that meant memory couldn't actually be shared, by default. To actually share memory you had to use the special coherent “Onion” bus, which guaranteed visibility and ordering but massively slowed down performance. Similarly, it was also impossible for the GPU to touch CPU memory unless it was pinned and tagged for the non-coherent “Garlic” bus, but at least that path was relatively fast iirc.
In contrast, Apple actually has everything tied into a single unified address space with a single controller that immediately makes all writes visible regardless of where they happen.
They’ve also got enormously more memory bandwidth to play with. M1 Max is close to PS5 in both shader configuration and memory bandwidth.
Generally it's more about the number of chips you need. If you can get to x GB of RAM with the same number of memory chips, just at higher capacities, then the power difference is truly quite minuscule. If you have to double up chips to reach the capacity then you start drawing more power (though still on the order of a couple watts' difference at most at those sizes). Even then it's not always a constant couple of watts; RAM plus memory controllers use less power when you're not actively writing data.