I thought the memory was one of the more interesting bits here.
My 2-year-old Intel MBP has 64 GB, and 8 GB of additional memory on the GPU. True, on the M1 Max you don't have to copy back and forth between CPU and GPU thanks to integrated memory, but the new MBP still has less total memory than my 2-year-old Intel MBP.
And it seems they just barely managed to get to 64 GiB. The whole processor chip is surrounded by memory chips. That's in part why I'm curious to see how they'll scale this. One idea would be to just have several M1 Max SoCs on a board, but that's going to be interesting to program. And getting to 1 TB of memory seems infeasible too.
Just some genuine, honest curiosity here: how many workloads actually require 64GB of RAM? For instance, I'm an amateur in the music production scene, and I know that sampling-heavy workflows benefit from being able to load more audio clips fully into RAM rather than streaming them from disk. But 64GB seems a tad overkill even for that.
I guess for me I would prefer an emphasis on speed/bandwidth rather than size, but I'm also aware there are workloads that I'm completely ignorant of.
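For a rough sense of scale (the library size and clip lengths below are made up, not measurements), a fully RAM-resident sample library adds up quickly:

```python
# Back-of-envelope: RAM needed to hold a sample library fully decoded in memory.
# The counts and durations are hypothetical, chosen only for illustration.

def sample_ram_gib(num_samples: int, avg_seconds: float,
                   sample_rate: int = 48_000, channels: int = 2,
                   bytes_per_sample: int = 3) -> float:
    """GiB needed for num_samples clips of avg_seconds each, decoded raw (24-bit stereo)."""
    total_bytes = num_samples * avg_seconds * sample_rate * channels * bytes_per_sample
    return total_bytes / 2**30

# e.g. 20,000 articulations averaging 6 seconds each:
print(f"{sample_ram_gib(20_000, 6.0):.1f} GiB")   # ~32 GiB
```

So a big orchestral template can plausibly sit in the tens of GB of samples alone, before the DAW, plugins, and the OS take their share.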
Same, I tend to get everything in 32GB but more and more often I'm going over that and having things slow down. I've also nuked an SSD in a 16GB MBP due to incredibly high swap activity. It would make no sense for me to buy another 32GB machine if I want it to last five years.
Another anecdote from someone who is also in the music production scene - 32GB tended to be the "sweet spot" in my personal case for the longest time, but I'm finding myself hitting the limits more and more as I continue to add more orchestral tracks which span well over 100 tracks total in my workflows.
I'm finding I need to commit and print a lot of these. Logic's little meter in the upper right showing RAM, disk I/O, CPU, etc. also shows that it's getting close to memory limits on certain instruments with many layers.
So as someone willing to dump $4k into a laptop whose main workload is audio production, I would feel much safer going with 64GB, knowing there's no real upgrade path from the 32GB model short of buying a totally new machine.
Edit: And yes, this does show the typical "fear of committing" issue that plagues all of us making music. It's more of a "nice to have" than a necessity, but I would still consider it a wise investment. At least in my eyes. Everyone's workflow varies and others have different opinions on the matter.
I know the main reason why the Mac Pro has options for LRDIMMs for terabytes of RAM is specifically for audio production, where people are basically using their system memory as cache for their entire instrument library.
I have to wonder how Apple plans to replace the Mac Pro - the whole benefit of M1 is that gluing the memory to the chip (in a user-hostile way) provides significant performance benefits; but I don't see Apple actually engineering a 1TB+ RAM SKU or an Apple Silicon machine with socketed DRAM channels anytime soon.
I think we'd probably see Apple use the fast/slow RAM approach that old computers used back in the '90s:
16-32GB of RAM on the SoC, with DRAM sockets for usage past the built-in amount.
Though by the time we see an ARM Mac Pro they might move to stacked DRAM on the SoC. But I'd really think a two-tier memory system would be Apple's method of choice.
I'd also expect a dual-SoC setup.
So I don't expect to see that anytime soon.
I'd love to get my hands on a Mac Mini with the M1 Max.
I went for 64GB. I have one game where 32GB is on the ragged edge - so for the difference it just wasn't worth haggling over. Plus it doubled the memory bandwidth - nice bonus.
And unused RAM isn't wasted - the system will use it for caching. Frankly I see memory as one of the cheapest performance variables you can tweak in any system.
> how many workloads actually require 64gb of ram?
Don't worry, Chrome will eat that up in no time!
More seriously, I look forward to more RAM for some of the datasets I work with. At least so I don't have to close everything else while running those workloads.
As a data scientist, I sometimes find myself going over 64 GB. Of course it all depends on how large the data I'm working with is. 128 GB of RAM helps even with data of "just" 10-15 GB, since I can write quick exploratory transformation pipelines without having to think about keeping the number of copies down.
I could of course chop up the workload earlier, or use samples more often. Still, while not strictly necessary, I regularly find I get stuff done quicker and with less effort thanks to it.
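To make that concrete, here's a minimal sketch (row counts and column names are made up) of how an exploratory pandas session can end up holding several copies of a ~10 GB table at once:

```python
import numpy as np
import pandas as pd

# Hypothetical frame of roughly 10 GB (shrink the row count to try this on a
# smaller machine; the point is the scale, not the exact numbers).
df = pd.DataFrame(np.random.rand(160_000_000, 8), columns=list("abcdefgh"))
print(df.memory_usage(deep=True).sum() / 2**30, "GiB")     # ~9.5 GiB

# Each exploratory step returns a new DataFrame while the earlier ones stay
# referenced in the session, so peak memory is several multiples of the input:
filtered = df[df["a"] > 0.5]                               # ~another 5 GiB
centered = filtered - filtered.mean()                      # ~another 5 GiB
zscored = centered / filtered.std()                        # ~another 5 GiB
```

Chunking, sampling, or in-place operations avoid this, but that's exactly the bookkeeping the extra RAM lets you skip.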
Not many, but there are a few that need even more. My team is running SQL servers on their laptops (development and support) and when that is not enough, we go to Threadrippers with 128-256GB of RAM. Other people run Virtual Machines on their computers (I work most of the time in a VM) and you can run several VMs at the same time, eating up RAM really fast.
On a desktop Hackintosh, I started with 32GB, which would die with out-of-memory errors when I was processing 16-bit RAW images at full resolution. Because it was a Hackintosh, I was able to upgrade to 64GB so the processing could complete. That was the only thing running.
What image dimensions? What app? I find this extremely suspect, but it's plausible if you've way undersold what you're doing. A 24-megapixel 16-bit RAW image would generally pose no problem on a 4GB machine if it's truly the only app running and the app isn't shit. ;)
I shoot timelapse using Canon 5D RAW images, I don't know the exact dimensions off the top of my head but greater than 5000px wide. I then grade them using various programs, ultimately using After Effects to render out full frame ProRes 4444. After Effects was running out of memory. It would crash and fail to render my file. It would display an error message that told me specifically it was out of memory. I increased the memory available to the system. The error goes away.
But I love the fact that you have this cute little theory to doubt my actual experience to infer that I would make this up.
> But I love the fact that you have this cute little theory to doubt my actual experience to infer that I would make this up.
The facts were suspect, and your follow-up is further proof I had good reason to be suspicious. First off, the RAW images from a 5D aren't 16 bit. ;) More importantly, the out-of-memory error had nothing to do with the "16 bit RAW files"; it was rendering video from lots of high-res images that was the issue, which is a very different problem, and of course lots of RAM is needed there. Anyway, notice I said "but it's plausible if you've way undersold what you're doing", which is definitely the case here, so I'm not sure why it bothered you.
>> die with out of memory errors when I was processing 16bit RAW images
> Canon RAW images are 14bit
You don’t see the issue?
> Are you just trying to be argumentative for the fun?
In the beginning, I very politely asked a clarifying question, making sure not to call you a liar, as I was sure there was more to the story. You're the one who's been defensive and combative since, and honestly misrepresenting facts the entire time. Were you wrong at any point? Only slightly, but you left out so many details that were actually important for anyone to get any value out of your anecdata. Thanks to my persistence, anyone who wanted to learn from your experience now can.
>> I was processing 16bit RAW images at full resolution.
>> ...using After Effects to render out full frame ProRes 4444.
Those are two different applications to most of us. No one is accusing you of making things up, just that the first post wasn't fully descriptive of your use case.
Working with video will use up an extraordinary amount of memory.
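For a rough sense of why (the frame size below is an assumption, since the post above only says "greater than 5000px wide"):

```python
# Rough math for a 16-bit compositing workflow on large stills.
width, height = 5760, 3840                  # assumed 5D-class resolution
channels, bytes_per_channel = 4, 2          # RGBA, 16 bits per channel
frame_bytes = width * height * channels * bytes_per_channel
print(f"one frame: {frame_bytes / 2**20:.0f} MiB")                # ~169 MiB
print(f"200 cached frames: {200 * frame_bytes / 2**30:.1f} GiB")  # ~33 GiB
```

After Effects caches frames for preview and render, so a timelapse comp at those dimensions chews through 32GB very quickly.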
Some of the genetics stuff I work on requires absolute gobs of RAM. I have a single process that requires around 400GB of RAM that I need to run quite regularly.
It’s a slight exaggeration, I also have an editor open and some dev process (test runner usually). It’s not just caching, I routinely hit >30 GB swap with fans revved to the max and fairly often this becomes unstable enough to require a reboot even after manually closing as much as I can.
I mean, some of this comes down to poor executive function on my part, failing to manage resources I'm no longer using. But that's also a valid use case for me: I'm much more effective at whatever I'm doing if a larger memory capacity lets me defer that cleanup.
Since applications have virtual memory, it sort of doesn't matter? The OS maps virtual pages to physical ones based on actual demand and on pressure from other processes. So if only one app runs and it wants lots of memory, it makes sense to give it lots of memory - that is the most "economical" decision from both an energy and a performance POV.
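A minimal sketch of that, assuming macOS/POSIX semantics: reserving virtual address space costs almost nothing until the pages are actually touched.

```python
import mmap
import resource

def max_rss_mib() -> float:
    # ru_maxrss is reported in bytes on macOS and in KiB on Linux;
    # this sketch assumes macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 2**20

# Reserve 8 GiB of anonymous virtual memory: no physical pages committed yet.
buf = mmap.mmap(-1, 8 * 2**30)
print(f"after reserving 8 GiB: ~{max_rss_mib():.0f} MiB resident")

# Touch 1 GiB of it, page by page; now the OS has to back those pages.
for offset in range(0, 2**30, mmap.PAGESIZE):
    buf[offset] = 1
print(f"after touching 1 GiB:  ~{max_rss_mib():.0f} MiB resident")
```

Which is why "app X allocates N GB" on its own says little; what matters is how many of those pages are resident and dirty.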
So, M1 has been out for a while now, with HN doom and gloom about not being able to put enough memory into them. Real world usage has demonstrated far less memory usage than people expected (I don't know why, maybe someone paid attention and can say). The result is that 32G is a LOT of memory for an M1-based laptop, and 64G is only needed for very specific workloads I would expect.
Measuring memory usage is a complicated topic, and just adding numbers up overestimates it pretty badly. The different priorities of memory are something like: 1. wired (must stay in RAM), 2. dirty (can be swapped), 3. purgeable (can be deleted and recomputed), 4. file-backed dirty (can be written to disk), 5. file-backed clean (can be dropped and re-read from disk).
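On macOS you can eyeball most of those categories with `vm_stat`; a quick parse (assuming its usual "Pages wired down: N." output format) might look like:

```python
import re
import subprocess

# Dump macOS memory categories from `vm_stat`, converted from pages to MiB.
out = subprocess.run(["vm_stat"], capture_output=True, text=True).stdout
page_size = int(re.search(r"page size of (\d+) bytes", out).group(1))

for line in out.splitlines():
    if not line.startswith("Pages"):
        continue
    name, _, value = line.partition(":")
    pages = int(value.strip().rstrip("."))
    print(f"{name:<35} {pages * page_size / 2**20:>10.1f} MiB")
```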
Also note M1's unified memory model is actually worse for memory use not better. Details left as an exercise for the reader.
Unified memory is a performance/utilisation tradeoff, and I think it's more of an issue at lower memory specs. The fact that you don't have 4GB (or even 2GB) of dedicated memory on a graphics card in a machine with 8GB of main memory is a much bigger deal than not having 8GB on the graphics card in a machine with 64GB of main RAM.
Or like games, even semi-casual ones. Civ6 would not load at all on my mac mini. Also had to fairly frequently close browser windows as I ran out of memory.
I couldn't load Civ6 until I verified game files in Steam, and now it works pretty perfectly. I'm on 8GB and always have Chrome, Apple Music and OmniFocus running alongside.
I'm interested to see how the GPU on these performs, I pretty much disable the dGPU on my i9 MBP because it bogs my machine down. So for me it's essentially the same amount of memory.
From the perspective of your GPU, that 64GB of main memory attached to your CPU is almost as slow to fetch from as if it were memory on a separate NUMA node, or even pages swapped to an NVMe disk. It may as well not be considered "memory" at all. It's effectively a secondary storage tier.
Which means that you can't really do "GPU things" (e.g. working with hugely detailed models where it's the model itself, not the textures, that takes up the space) as if you had 64GB of memory. You can maybe break apart the problem, but maybe not; it all depends on the workload. (For example, you can't really run a TensorFlow model on a GPU with less memory than the model size. Making it work would be like trying to distribute a graph-database routing query across nodes: constant back-and-forth that blows up the runtime. Even though each step is parallelizable, on the whole it's the opposite of an embarrassingly parallel problem.)
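As a rough sanity check (the parameter counts below are illustrative, not any specific published model's numbers), the weights alone already rule out an 8GB card well before activations, gradients, or optimizer state enter the picture:

```python
def param_memory_gib(num_params: float, bytes_per_param: int = 4) -> float:
    """GiB for the weights alone (fp32 by default); activations, gradients,
    and optimizer state usually multiply this several times over."""
    return num_params * bytes_per_param / 2**30

# Illustrative sizes only:
for name, params in [("small vision model", 25e6),
                     ("mid-size transformer", 1.5e9),
                     ("large transformer", 13e9)]:
    print(f"{name:<22} {param_memory_gib(params):6.1f} GiB of weights")
```

So a 13B-parameter model's fp32 weights (~48 GiB) won't fit on an 8GB discrete card no matter how you shuffle things, but would fit in 64GB of unified memory.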
>The SoC has access to 16GB of unified memory. This uses 4266 MT/s LPDDR4X SDRAM (synchronous DRAM) and is mounted with the SoC using a system-in-package (SiP) design. A SoC is built from a single semiconductor die whereas a SiP connects two or more semiconductor dies.
>SDRAM operations are synchronised to the SoC processing clock speed. Apple describes the SDRAM as a single pool of high-bandwidth, low-latency memory, allowing apps to share data between the CPU, GPU, and Neural Engine efficiently.
>In other words, this memory is shared between the three different compute engines and their cores. The three don't have their own individual memory resources, which would need data moved into them. This would happen when, for example, an app executing in the CPU needs graphics processing – meaning the GPU swings into action, using data in its memory.
https://www.theregister.com/2020/11/19/apple_m1_high_bandwid...
I know; I was talking about the computer the person I was replying to already owns.
The GP said that they already essentially have 64GB+8GB of memory in their Intel MBP; but they don't, because it's not unified, and so the GPU can't access the 64GB. So they can only load 8GB-wide models.
Whereas with the M1 Pro/Max the GPU can access the 64GB, and so can load 64GB-wide models.
How much of that 64 GB is in use at the same time though? Caching not recently used stuff from DRAM out to an SSD isn't actually that slow, especially with the high speed SSD that Apple uses.
Right. And to me, this is the interesting part. There's always been that size/speed tradeoff... by putting huge memory bandwidth behind "less" main RAM, it becomes almost half-RAM-half-cache; and by making the SSD fast, it becomes more like a big half-HD-half-cache. It does wear the SSD out, however.
You were (unintentionally) trolled. My first post up there was alluding to the legend that Bill Gates once said, speaking of the original IBM PC, "640K of memory should be enough for anybody." (N.B. He didn't[0])
Video and VFX generally don't need to keep whole sequences in RAM persistently these days because:
1. The high-end SSDs in all Macs can keep up with that data rate (3GB/sec)
2. Real-time video work is virtually always performed on compressed (even losslessly compressed) streams, so the data rate to stream is less than that.
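Back-of-envelope numbers (approximate bitrates, not vendor specs) for how those data rates compare to a ~3GB/sec SSD:

```python
# Approximate figures for illustration only.
ssd_gb_per_s = 3.0                          # the ~3GB/sec claim above

# Uncompressed 4K DCI, 10-bit 4:4:4, 24 fps:
width, height, fps = 4096, 2160, 24
bytes_per_pixel = 3 * 10 / 8                # three 10-bit components
uncompressed = width * height * fps * bytes_per_pixel / 1e9
print(f"uncompressed 4K24: {uncompressed:.2f} GB/s")     # ~0.80 GB/s

# ProRes 4444 at 4K24 is on the order of 1 Gbit/s:
prores = 1.0 / 8
print(f"ProRes 4444 4K24:  {prores:.2f} GB/s")           # ~0.13 GB/s

print(f"uncompressed streams one SSD can feed: {ssd_gb_per_s / uncompressed:.1f}")
```

So even uncompressed 4K streams comfortably off a fast SSD, and ProRes-class streams barely register, which is why keeping whole sequences in RAM is rarely necessary anymore.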