Where’s the Apple M2? (tbray.org)
124 points by tosh on July 14, 2021 | hide | past | favorite | 161 comments


Hmm. Not a very interesting take. First, the M1 is the 12th chip design in the Apple Silicon series (excluding the X variants). They have cranked out a more capable, higher-performing, lower-power chip every year since 2010.

https://en.wikipedia.org/wiki/Apple_silicon

Second, it's not about clock rate. That is only one small part of the story. It is really about instructions per cycle per core. Apple is killing it on that front and running wider at lower rates is a big part of how they are outperforming in performance per watt while still winning in single core performance. We may see some clock rate increase in an M2, but I suspect their basic design philosophy won't change. It is just working too well.

For the curious, see https://travisdowns.github.io/blog/2019/06/11/speed-limits.h....

That table shows that the M1 has a much bigger reorder buffer, large load and store buffers, huge integer and vector register files, way more branches in flight, etc. By eschewing high clock rates, they are able to really go after massive concurrency at the hardware level in a single core. 7 simultaneous integer operations, 4 simultaneous floating point, multiple load and store. It's a beast.

https://www.anandtech.com/show/16226/apple-silicon-m1-a14-de...

Of course they will come out with an M2 soon. They've been doing this year after year for over a decade.
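What running wide buys can be sketched with a toy dependency-depth model. This is illustrative only: the one-cycle latencies and unlimited execution units are assumptions for the sketch, not M1 figures.

```python
# Toy model (not a simulator): why a wide out-of-order core favors
# independent work. Each op lists the earlier ops it depends on; the
# critical path sets the minimum cycle count even with unlimited units.

def critical_path(deps):
    """deps[i] = list of earlier op indices that op i must wait for."""
    depth = {}
    for i, ds in enumerate(deps):
        depth[i] = 1 + max((depth[d] for d in ds), default=0)
    return max(depth.values(), default=0)

# Summing 8 numbers as one serial chain: each add waits on the previous.
chain = [[]] + [[i - 1] for i in range(1, 8)]

# The same sum as a balanced reduction tree: pair-adds are independent.
tree = [[], [], [], [],        # 4 independent pair-adds
        [0, 1], [2, 3],        # 2 adds over the pair results
        [4, 5]]                # final combine

print(critical_path(chain))  # 8 cycles of latency-1 adds
print(critical_path(tree))   # 3 cycles, if the core is wide enough
```

A narrow core gains nothing from the tree form; a core with many integer units and a deep reorder buffer can actually exploit it.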


> 7 simultaneous integer operations, 4 simultaneous floating point

The floating point is going to be an interesting thing to look at - CPUs targeting HPC workloads tend to be FLOPS-heavy, but those FLOPS tend to be starved for memory bandwidth unless you're doing exactly the kind of regular vector processing that Fortran excels at.

So you throw a lot more oomph at the vector side and leave single-operation float arithmetic at 2.

Floating point on smaller-width values (more realistically, quaternions or RGBA) would be the reason the M1 feels a little snappier at basic things like text-layout code or image handling, which don't map well to SIMD but still consume a lot of arithmetic.

I'd suspect that the vertical integration is going to be the secret, because it looks like more profile information of desktop apps going into chip design here.

A similar story is expected of the Graviton series as well, with AWS having a good idea what to build for.
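The bandwidth-starvation point can be made concrete with a back-of-envelope roofline estimate. All the peak and bandwidth numbers below are made-up round figures for illustration, not measurements of any real chip.

```python
# Back-of-envelope roofline: attainable FLOP/s is capped either by the
# vector units or by memory bandwidth, whichever binds first.

def attainable_gflops(peak_gflops, bw_gbytes, flops_per_byte):
    return min(peak_gflops, bw_gbytes * flops_per_byte)

peak, bw = 100.0, 50.0          # hypothetical: 100 GFLOP/s, 50 GB/s

# Streaming daxpy-style code: ~1 flop per 12 bytes moved -> bandwidth bound.
print(attainable_gflops(peak, bw, 1 / 12))   # ~4.2 GFLOP/s

# Blocked matmul reusing cache: ~8 flops per byte -> compute bound.
print(attainable_gflops(peak, bw, 8.0))      # 100.0 GFLOP/s
```

The point being: piling on vector units only pays off for code with high arithmetic intensity; everything else hits the bandwidth ceiling first.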


> They've been doing this year after year for over a decade.

So was Intel for decades, then they weren't.


Intel's stalled mostly at the fabrication side. Since Apple is fabless, this isn't really a good comparison.


I don't think that's the whole story - someone still has to fab the chips, and fabs can stall.


The difference is that if Apple gets stalled, the whole world does since Apple uses whoever is best at the time.


What if the best is intel?


Thank you. I roll my eyes when someone uncritically states that fabless companies don't get stuck on process. TSMC was basically in Intel's shoes pre-finFET, so its customers were SOL.


Right, but apple can switch fabs much more quickly than Intel can fix fabs.


Who are they going to go to if TSMC does not have the next node ready when they need it? I only know of Intel and TSMC as far as latest-greatest nodes go, is there anyone else?


Samsung.


Samsung Foundry had neither the leading-edge node, the yields, nor the capacity for Apple's appetite. Not to mention Apple is extremely wary of IP theft.


Well, no. Intel stalled at the design side. Even with superior processes to AMD they weren't competitive, and their CPU designs had only small changes for a decade.


Weren’t they stalled for nearly a decade on the manufacturing process node (14nm from 2014 on), so design was really the only way they’ve been able to produce any improvements? I’ve tried keeping up to date on this and that has seemed to be the case.


Yes, and they didn't improve at all. AMD outperformed Intel on an even weaker process, GlobalFoundries 14nm.


Sure, but we saw them stalling for several years. Apple doesn’t seem like they’re in that sort of rut.


Apple was also buoyed by iPhone sales during the lull in personal computing that took place about five years ago. Take a look at their income breakdown and you'll see what I mean. Intel bungled the transition to EUV, but that isn't the only challenge they have had to surmount in the past decade.

Personally I don't see the Intel rut as particularly deep or mucky. Intel has good management, and Gelsinger has deep knowledge of how enterprise customers operate due to his experience. They have a road map for some exciting product releases in the next couple of years, and they dominate their game in terms of market share.

Intel made over $20B profit last year, and semiconductor demand is booming across the board, but they still get trashed by the masses. It's really interesting (if you're into stocks) to compare Intel’s P/E ratio against the rest of the sector. Even the market doesn't think particularly highly of them.


Note, Intel had three different CEOs in the past four years.


They didn't have real competition for a decade. Now they do.


Why doesn't anyone ever bother to mention that the M1 is a RISC CPU when discussing that it can do more IPS than x86?

There are 1024 possible Armv8 instructions [1] as opposed to 1,503 x86 instructions [2] and 3,684 x86-64 instructions [3]. There are things x86 and x86-64 can do in a single instruction that would take dozens of instructions to accomplish on Arm.

[1] https://www.csie.ntu.edu.tw/~cyy/courses/assembly/10fall/lec...

[2] https://fgiesen.wordpress.com/2016/08/25/how-many-x86-instru...

[3] https://www.csie.ntu.edu.tw/~cyy/courses/assembly/10fall/lec...


> Why doesn't anyone ever bother to mention that the M1 is a RISC CPU when discussing that it can do more IPS than x86?

This doesn't really matter; RISC is not a real distinction - it basically just means "kind of like MIPS". Both architectures have complex address operands, and 2-operand vs 3-operand encoding is largely aesthetic.

x86's stronger memory model is what matters most, since it allows memory accesses to be reordered less often.


Yes, but many x86 instructions get cracked into multiple RISC-like instructions internally - the things ARM can't do in one instruction likely get converted into what are essentially multiple ARM-like micro-ops.

Theoretically this means x86 instructions are smaller (with better I$ performance), at the expense of larger/slower instruction decode.


I would argue that it is better because it gives you the ability to tier your hardware. Low-end SKUs like mobile parts can get fewer physical components and more abstraction, while high-end SKUs like server parts can get more physical components and less abstraction for greater performance. All of them still run the same highest-common-denominator instruction set.

It is a lot easier to abstract away complex instructions in x86 (when performance isn't needed) than it is to add hardware support for complex operations to ARM (when performance is needed).


More isn't always better in CPU design: more can mean slower clocks or longer pipelines, neither of which necessarily means a faster CPU.

Any particular implementation of an architecture is a study in tradeoffs. Gates are relatively cheap these days, but not much faster, which makes it easier to throw gates at bigger caches and more cores rather than at faster ones. Faster clocks likely mean longer pipelines, which need highly predictable branches to perform well, and that skews an implementation toward particular types of benchmarks/workloads.


In real workloads ARM executes only slightly more instructions than x86.


While true, it does also mean more stress on the reordering machinery, caches, branch predictors, etc. than on x86.


Does the number of instructions even matter on modern chips? Intel/AMD chips don't really execute x86 directly; they break it down into their own internal RISC-like operations via a translation layer.

If you have a CPU that breaks one mega x86 instruction into 100 internal instructions, is that any better than 100 external instructions generated by a compiler?


It does and doesn't. When your CPU is breaking up that instruction, you can optimize at design time for exactly how it gets broken up, because you know what those 100 internal instructions will be and in what order. You pay for that with more decode logic. It's a trade-off, but it does mean there is somewhat less need for the machinery the M1 has more of.
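A toy sketch of the cracking being discussed. The mnemonics and micro-op format here are invented for illustration; they are not real x86 encodings or actual Apple/Intel internals.

```python
# Toy illustration: cracking a CISC-style read-modify-write instruction
# into RISC-like micro-ops, the way a decoder conceptually does it.

def crack(instr):
    op, dst, src = instr
    if op == "add_mem":              # add [dst_addr], src  (RMW in one instr)
        return [("load",  "tmp", dst),
                ("add",   "tmp", src),
                ("store", dst, "tmp")]
    return [instr]                   # simple ops pass through unchanged

program = [("add_mem", "0x1000", "r1"),   # 1 architectural instruction
           ("add",     "r2",     "r3")]   # already RISC-like

uops = [u for instr in program for u in crack(instr)]
print(len(program), len(uops))  # 2 architectural instrs -> 4 micro-ops
```

The fixed expansion is the point: the designer knows exactly which micro-op sequence each complex instruction produces, which is what the parent means by optimizing for it at design time.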


You have to feed those 100 instructions to the CPU, which takes memory bandwidth, which in turn means you need larger instruction caches. Caches take up most of the space and power on modern chips.


Other people have commented on the increasing blurriness of that distinction over the last couple of decades, but I'd also ask why you believe it matters. Yes, the M1 can do more instructions per second, but it also does quite well on almost all high-level benchmarks. If it were just the case that a bunch of simple instructions were juicing IPS counts, you'd see a big discrepancy between those two metrics, like we used to in the 90s. The most likely explanation for it not being mentioned is that it doesn't add much to the conversation.


The biggest difference is that ARM has fixed-length instructions, meaning Apple can predictably feed more decoders than x86, which must finish each decode before knowing where the next instruction boundary is.

AMD has admitted to being at or very near the upper limit for x86 decode width. With ARM, Apple can have, and reliably feed, more decoders from the incoming instruction stream. They have built the whole chip around extracting parallelism from that huge instruction window.
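The decode-width argument can be sketched with a toy model. The byte stream and instruction lengths below are invented, and real x86 length decoding is far messier, but the serial dependence is the same in kind.

```python
# Toy model of why fixed-length instructions make wide decode easy.
# With 4-byte instructions every boundary is known up front, so eight
# decoders can all start in parallel. With variable lengths, each
# boundary depends on having decoded the previous instruction first.

def fixed_boundaries(code_len, width=4):
    # Computable independently: no serial dependence between decoders.
    return list(range(0, code_len, width))

def variable_boundaries(code):
    # code[i] holds that instruction's length; must walk sequentially.
    pcs, pc = [], 0
    while pc < len(code):
        pcs.append(pc)
        pc += code[pc]          # next boundary known only after this decode
    return pcs

print(fixed_boundaries(32))                      # [0, 4, 8, 12, 16, 20, 24, 28]
variable = [3, 0, 0, 1, 6, 0, 0, 0, 0, 0, 2, 0]  # lengths at each start byte
print(variable_boundaries(variable))             # [0, 3, 4, 10]
```

Real x86 decoders speculate on boundaries to claw back some parallelism, but that costs transistors and power, which is the limit AMD is alluding to.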


> That table shows that the M1 has a much bigger reorder buffer, large load and store buffers, huge integer and vector register files, way more branches in flight, etc.

That leaves out the most important thing that enables all of that: the 8-wide symmetric decoder that feeds it. x86-64 CPUs top out around 5-wide, only the first of those decoders can handle multi-uop instructions, and even worse, the most complex instructions are microcoded.


Yes, absolutely right. That decoder is also beastly. Of course, (nearly) fixed-length ARM instructions make the problem a lot easier; the variance in x86 instruction format and length is just ridiculous. Intel has saddled themselves with years of complexity and it has become a big obstacle for them. They have always muscled through it by throwing process innovation at the problem, but recent stumbles leave them trapped in a box of their own creation. Intel is a great company and I expect them to dig out of this, but it is tough right now.


> First, the M1 is the 12th chip design in the Apple Silicon series

How are you counting that?


I'm not the parent, but the A series starts with the A4, which was the first commercially released model. So between the A4 and A14 we have 11 generations, and the M1 is the 12th.


A14 and M1 are from the same series.


I guess you mean same generation. Apple uses the word "series" to identify A Series, M Series etc.

By the way, yes, both A14 and M1 use Firestorm + Icestorm cores: so I guess that they are the same generation from a technical standpoint.


Technically there's all the X versions which are boosted versions and are somewhat different internally.


Yes that's true but parent specified that the count was excluding the X versions.


Pretty sure there was an A12Z as well.


A4,A5,A6,...,A14,M1


The M1 can indeed run more instructions at once and branch those instructions better, but it comes at a cost.

That cost is that it's larger. At the same power, AMD can fit twice the cores, with similar though somewhat lower single-core performance. That also means lower clock speeds, and there are fewer instruction-level guarantees to rely on, so somewhat more complexity is necessary.

And no, it's not really a beast. It's competitive.


>At the same power AMD can stuff twice the cores

Note that "TDP" is meaningless for actual performance-per-watt comparisons: it varies too much between vendors to serve as a controlled parameter. It's just a marketing term.

The Ryzens on 7nm consume much more power than Apple's big cores did in the A12 and A13, both of which were on TSMC 7nm. Those cores were also competitive with, if not equivalent to, the single-core scores of AMD's Zen 2 and Zen 3 cores, despite being older architectures.

The M1's big (Firestorm) cores also consume less power and achieve more performance than a Zen 3 core.

It's safe to say Apple's architecture is largely superior, outright.

https://images.anandtech.com/doci/16226/spec2006_A14.png


I'm not talking about TDP when I talk about stuffing cores; I mean physical size. The M1 cores use more transistors.

I'm sorry, but you're comparing the power draw of a workstation "X" AMD chip with a laptop chip. It's simply not an honest comparison. You must compare the efficiency of a mobile chip with a mobile chip. When you do, you find similar efficiency.

I don't understand why no one posts an actual apples-to-apples comparison. Every time a comparison is posted it's either against a workstation chip (tuned to be power-inefficient) to find power efficiency, or against Intel CPUs only, etc.

Laptop processors from AMD use less than half the power per core of workstation processors while sacrificing only a small amount of performance, partly because the workstation parts put the I/O die on a different, older lithography.

Apple's architecture is simply not superior. If it were, we wouldn't be making these contrived comparisons, and we wouldn't even be comparing 5nm chips to 7nm chips.


You can compare Apple's 7nm chips too, as I just said, if you'd read. And guess what? The power-consumption figures bode poorly for AMD.

The I/O die only adds something like ~15W, by the way - yes, I'm aware it's on a GlobalFoundries node.

Lastly, AMD's mobile chips throttle down to fairly low clocks - and in AMD's case, low-to-modest performance - when not plugged in; it's why you keep hearing tales of the great battery life on Zen 3 laptops.


A fair comparison would probably be Ryzen 5300U vs M1 but I've never seen the 5300U in the wild. I expect M1X to do well against the 5900H when it comes out.


Wouldn't the 5800U be an apt comparison? 25-30W is pretty much what an M1 in a mini will use.


It could be OK as long as you normalize the power limits (which most people can't do) and don't cherry-pick only single-threaded or multi-threaded tests.


The whole premise is flawed. Intel and AMD can release a CPU whenever they want. Even if the M2 (or whatever) is ready now, what machine does Apple put it in? And more importantly, when does that machine get released to consumers? Apple has never had a consistent release cycle for its machines, and redesigns clearly slow that cycle down.

They told us they were on a two-year transition, and we have until WWDC next year at the earliest to finish that transition. And looking at what happened during the first year, it's pretty clear they meant that each machine would be updated once during the transition.

The Macbook Air, Mac Mini and 13-inch Macbook Pro were obviously the machines deemed to skip a redesign, and were released first. The 24-inch iMac was redesigned, and I suspect every other computer will get a redesign to go with their new chip.

I suspect this Fall, after the new iPhones unveil a new set of cores, we will see those cores used to build a new chip that goes in the machines still due for an update: the big iMac and the big Macbook Pro. The M1 machines will not be updated, and we will only see a Mac Pro at WWDC 2022. Then who knows?


And with the long development cycles required for silicon, any conceivable "M1X delay" wouldn't be related to disappointing performance surprises. The M1X would have already been "mostly done" at the moment when the M1 launched.

If Apple is dealing with any surprises relating to the silicon, those would most likely be security issues discovered/reported in the M1 platform after launch. Only Apple knows what's on that list – it's definitely not empty – and whether any of them must be fixed before M1X devices ship.


I think a big part in the order of refresh is also minimizing the time spent in transition.

It's in Apple's best interest to stop selling x64 arch computers as fast as possible.

The logical order would be to do a blazing fast low end chip first in the Air/Mini (tbh I'm surprised they diluted the mbp brand with an iPad chip), with a higher end MBP/iMac chip less than a year later. The iMac Pro was already discontinued and the Mac Pro is vastly less important wrt the architecture transition and can easily be done last.

I expect M>1 iMac and/or MBP in September.


> Apple has never have had a consistent release cycle with their machines, with redesigns clearly slowing down the release cycle.

The history here might not be representative, because many of those delays were caused by Motorola/IBM and Intel struggling to deliver chips in the expected volume or thermal budget. Since that was one of the motivations for Apple to make their own designs, it should be less pronounced going forward, especially without Jony Ive pushing the limits so hard.


"Eight months without perceptibly faster chip!"

Get a load of this guy. Can't even wait a full year; he wants his CPU revolutions every 4 to 8 months.


Apple has frustratingly skipped entire Intel generations in the past, leaving their computers to languish for years without updates.

If the big Macbook Pro is only released this Fall or next Spring, it will be business as usual for Apple.

Same goes for an update to the M1 Macbook Air. I expect that machine to be updated next year at the earliest, with something akin to an M3.


I wonder if they can't ship the same M1 but with some external RAM and GPU. It won't speed up some apps, but it'll help with Pro apps.


If you look at all the limitations of the M1 Macs (fewer ports, only one external screen, except on the Mini where it's one DisplayPort and one HDMI), it's plain that they're paying the price for using a phone SoC in a general-purpose computer: no generic expansion buses, connectors optimized for the phone use case, etc.

And they are banking hard on the unified memory model in all sorts of marketing, but it's unclear whether that's the cause or the effect of being unable to use an external GPU.


He doesn't want a whole new chip.

The M1 was hailed as an entry level chip, and merely the start of greater things to come. If that were true, logic would hold that they would have a higher end version of the existing M1 chip available by now.

It's not a different model architecture.


> The M1 was hailed as an entry level chip

Hailed by whom? I doubt you'll find a single instance of Apple calling M1 an "entry level chip".

It looks like this is the blogosphere getting too high on their own supply, huh?


Not sure about "hailed by", but that perception is probably driven largely by the market position of the Macs Apple put it into first. There is no Mac Mini or Macbook Air that isn't entry level.


Apple announced a two year transition, the two years are not over yet so why should they be out by now?

A new MacBook Pro with M1X will come out this fall according to credible sources.


I think this person has no idea how much work goes into bringing a product up. An incredible amount of work and time goes into it - and that's with code from Intel/AMD/Qualcomm plus reference designs. Designing the silicon, the board, and the firmware in under a few years is absolutely ridiculous. To do so in 8 months is ludicrous.


He's no slouch, and Apple has been working on this for a decade-plus. My guess is a resource shortage prevented them from moving faster.


I do believe there is something behind the scenes not going according to plan, but one thing I've found a little interesting: since Apple now controls all aspects of the system, it would have been an incredible bit of showmanship and a PR coup to announce the refresh of the entire Mac lineup at the November 2020 launch event with a Jobsian "One more thing..." surprise.


I'm going to be honest: while the author certainly has credibility, and whilst there's nothing really factually wrong with any technicals mentioned here, the opinions here seem nonsensical and show a lack of understanding of how chips are made.

Apple's chip design team is working on M3 if not M4 right now.


Yes, this. The M2 design has been finished for some time to allow plenty of time for testing. Same with products. I always find it hilarious when "journalists" write articles about how Apple is still debating X vs Y for some hardware feature on the next gen phones due in 3 months. No chance. Hardware is hard and takes big lead times and huge amounts of testing.


That is typically true. However, when developing risky new hardware, it's not uncommon to build all the variants and make a late-binding decision. It's expensive, but it gives software and hardware engineering more time to flesh out a marquee feature (e.g. I don't doubt that Face ID may have had Touch ID as a fallback until late, although probably not as late as a month or two before the announcement).


We’ve also seen some unexpected stumbles leading to late changes.

The failed release of shaped batteries in the 2016 MBP [0], and the jet-falls-off-the-aircraft-carrier release of AirPower. [1]

I generally agree Apple makes plans in advance and follows through on them. However, the company seems to push very hard on some product release deadlines and sometimes they miss.

[0] https://www.bloomberg.com/news/articles/2016-12-20/how-apple...

[1] https://www.bloombergquint.com/technology/apple-cancels-anti...


I agree on some products, but not at iPhone scale. They do pilot runs months in advance to shake out issues. When you sell several million in the first release weekend you have to build stock and fill the channel. They may choose not to enable something in software, but I really doubt they are making HW changes. Last minute HW changes lead to big failure rates, which is just too costly in terms of reputation and cost.


ASIC lead time is at least two years for a CPU, probably more.


This is a silly take, primarily because it's centered on a single benchmark, Lightroom image import.

Image import is generally I/O bound, so not a good fit for CPU comparisons.

The GPU issue is relevant, we'll see how the M2 does there. Will Apple need a discrete GPU to compete?

All that said, the M1 is most impressive in terms of performance/Watt. We'll see how the M2 holds up against Threadripper/Epyc once the Mac Pro refresh is done and it's benchmarked with many CPU/GPU bound pro workloads.

The M2 should be nice for Macbook Pros though. Looking forward to it!


> primarily because it's centered on a single benchmark, Lightroom image import.

It's not a very interesting article but it specifically talks about the relative non-importance of that particular benchmark, beside the results being largely a wash:

> I sorely miss the benchmark I saw in some other publication but can’t find now, where they measured the interactive performance when you load up a series of photos on-screen. These import & export measurements are useful, but frankly when I do that kind of thing I go read email or get a coffee while it’s happening, so it doesn’t really hold me up as such.

> To date, I haven’t heard anyone saying Lightroom is significantly snappier on an M1 than on a recent Intel MBP. I’d be happy to be corrected.


One man's new CPU that's as fast as the highest-end Intel chips at the entry level is another person's "not snappier". This is the Louis C.K. bit about internet on a flipping plane not working perfectly and people being super entitled about nice things!


The highest-end Intel chips are really only middle-of-the-pack AMD chips, and AMD is the other competitor the M1 is really up against. All in all it's much more of a wash.


The other silly thing is that it presents the M1 as a big change that they could do again, when it's really the result of a series of incremental improvements to the iPhone chips, plus a one-time vertical-integration change on the laptop side that they can't "do again".


I read the article as arguing the low-hanging fruit has been picked, and that performance can't quite make another leap: you can't bump up the clock speed because it'll use more power, or because the CPU design won't support it.

There isn't evidence that Apple's team has run out of architectural improvements however, so I do think performance gains are still out there. Plus there's always the possibility of going to smaller semiconductor processes.


I think he’s actually making the point you are: that a step that large is probably impossible at this point in time.

Not that I would have expected the step function they did manage to pull off either…


>Will Apple need a discrete GPU to compete?

I can already hear the talking heads now

"Most users don't need that power" (but apparently they need the power of the M1?)

If I know Apple, it will be mid-range at extraordinary prices. Post-purchase rationalization will cause users to praise it regardless.


All I want is a 16" laptop with an M1 and 32GB of RAM. And the extra RAM is just so I can comfortably allocate RAM to VMs.


This but 64GB. Docker, Chrome and IDEs alone regularly consume up to 50GB memory on my Intel MBP.

Plus, why the stupid monitor limitation? A "Pro" MacBook needs to be able to have 2-3 monitors with no compromises.


Is that just cache data though? Programs and the OS disk cache will just eat all the RAM you have because there's no point letting it sit empty.


no


How much of your problem is Docker? If you're running more than, say, 6 containers locally, you'd really benefit from spinning up a Linux VM (Parallels Desktop Pro is nice) and putting your Docker containers there. Note Parallels supports port forwarding, and VSCode supports opening projects through ssh.

As it is, if you're running Docker Desktop for Mac, you're spinning up an entirely separate (internal) VM with macOS filesystem syncing for each Docker container you run.


AFAIK it is not possible to run Docker outside of a Linux VM on the mac.


Please correct me if I’m wrong but (s)he’s saying running docker inside a single Linux VM is more efficient than running the Docker beta (multiple VMs). You seem to be saying that Docker runs inside a Linux VM either way which no one is disputing…


What is your memory pressure like? Your RAM will get filled with file system and other caches, even if no programs are actually using the memory.


Cache is not the problem, my IDE uses ~10GB (IntelliJ with very large projects), Chrome can use 20-30GB depending on tabs open. Then you've got Docker which is 5GB on a good day, 10GB on a bad day.

I'm currently using 40GB and I rebooted on Monday.


Sounds like you have memory leaks.


By the time "M2" is released, you'd want 128GB.


And maybe two more ports?


YES please. I'm generally quite happy with my 13 inch MBP. But I do wish it had two more USB-C ports. Would make things less complicated. I don't really even want a bigger MBP (I have a 16 inch as well, if I need screen real estate, but I really like the portability of the 13). Just the same laptop with four ports would be 80% of ideal. The last 20% would be more RAM.


And support for two external displays


I'm running three right now, please.


four


> All I want is a 16" laptop with an M1 and 32GB of RAM. And the extra RAM is just so I can comfortably allocate RAM to VMs.

All I want is a 14" laptop with an M1 and 32GB of RAM. And the extra RAM is just so I can comfortably allocate RAM to Safari tabs.


The article says M1 can only support up to 16GB of RAM. The M2 would be able to support 64GB.


And the ability to run x86 docker images without software emulation.


Why is that so important?

1) In production, hosts like AWS have cheaper options for ARM-based compute.

2) software emulation on m1 is largely comparable to native x86 performance, certainly per watt.


> software emulation on m1 is largely comparable to native x86 performance, certainly per watt.

Where are you getting this from? Hardware emulation is certainly fast, but I very much doubt qemu emulation (which is what Docker is using on the M1) is that fast, although I admit to not having tried.


you read my mind.


> I have a good guess what’s going on: It’s proving really hard to make a CPU (or SoC) that’s perceptibly faster than the M1. Here’s why.

This is daft. Chips are not something you can finalise the design of and have in customers' hands tomorrow, next week, or next month.

Whatever pro-level AS device Apple ends up shipping, its design was finalised by the end of last year at the latest. What's far more likely is that they either never planned to ship it at WWDC, or (more likely) they're affected by the same market dynamics as everyone else in terms of the ongoing semiconductor shortages.


I seriously doubt Apple would take a step as big as switching CPU architectures if they didn't already know how to build the first few CPU generations. That would be way too risky. If I were Tim, I would have wanted a detailed 5-10 year roadmap from the chip designers before making this move.

The idea that Apple made the M1 and somehow got stuck now is a bit silly. Most likely, they know what they are doing. They might hit unexpected problems 3-4 years from now, but not right now.

The M1X and M2 will only be incremental steps forward. That's how the A series has progressed, and that's fine.


> But it’s been eight months since the M1 shipped and we haven’t heard from Apple. I have a good guess what’s going on: It’s proving really hard to make a CPU (or SoC) that’s perceptibly faster than the M1

It’s possible the arguments in the rest of the article are correct, but this isn’t. Hardware cycles take a while and the design for the M1X or M2 was surely settled upon some time ago, probably even before the official M1 release.


> But it’s been eight months since the M1 shipped

A whole eight months?

How often does he want Apple to release new products? What's his problem?


I’ll be very surprised if they abandon the yearly cadence they’ve settled into for the phones.


Macs are not phones, I don’t see why they need to be updated in the same cycle.


There is a global chip shortage. Apple has said the shortage would impact their products.


Yes, this. The high-end M1 iMac shipped with a cooling system more powerful than the M1 can possibly benefit from, and everyone was sure a new MacBook Pro was coming by WWDC. The chip shortage seems like the most likely explanation for both.


No credible leaker (Ming-Chi Kuo, Gurman) predicted a new MacBook Pro at WWDC.


The problem was Jon Prosser, who was telling all the YouTubers it was going to happen - even though he's been wrong seemingly as many times as he's been right.


It’s a bit more nuanced than that. From what I understand it’s ripples from higher up in the supply chain that are causing delays (substrate, power components, etc.). TSMC on its own is doing fine.


It’s been a month since WWDC. Wait until fall, that’s when the hardware announcements come anyway. Why complain so early in the year?


It's always a great time to complain.


> Anyhow, it’d be really surprising if Apple managed to get ahead of GPU makers like NVidia.

I disagree and I think that based on what we've seen with the current constraints (fanless etc.) actually Apple can come out with some very competitive products in the pro/desktop/larger-laptops space.

In fact, I also think that probably the GPU will be the main differentiator between low-end and high-end Apple silicon models.


My hypothesis is a major push on gpu-based compute. Specializing in ML training.


There are two routes Apple could have taken for the 2nd Apple Silicon chip.

1. Apple could take the M1 design and just increase the number of Firestorm CPU cores (along with everything else). It would be an M1X, along exactly the same lines as the A10X and A12X.

2. Apple could create a next-generation design with eight or more of the same Firestorm-next cores we will see in the A15 Bionic. Call it the M2.

The M1X would have been easy enough for Apple, and I think if it existed, we would have seen it by now. But it would have suffered from many of the same flaws that the M1 suffers from. The inability to drive more than two displays being a notable one.

I think Apple decided to skip the M1X and go straight for the M2.

But it's a little too early for the M2 to launch. As it will be using the same CPU and GPU cores as the A15, it kind of needs to launch at around the same time at the earliest; maybe a month or two earlier, as they don't need to stockpile as much silicon for the M2s.

My prediction is that we might see an M2 announced in August or September.


> But [M1X] would have suffered from many of the same flaws that the M1 suffers from. The inability to drive more than two displays being a notable one.

No? Larger chips generally also have more I/O.


It's not actually an I/O problem. An M1 macbook can drive a massive 6K monitor with no issues from either thunderbolt port.

But try to drive two 1080p or 720p monitors. Impossible.

The issue is the M1 only has two CRTCs to drive the video timings.


Weird take: kind of the “640kb should be enough for anybody” but about one specific CPU that we know is about to get replaced with the M1X/M2.

Unsure what the point is here.

And there are plenty of workloads which are easily parallelizable across cores. Like compiling things. Or just running multiple programs at once.

Which you can see by typing “ps”.


I think a lot of the critique here is misguided. The summary at the end seemed to me entirely reasonable: writing good code for parallelism is very hard, and moves the cost upward into software which has to be really well written to work. Boosting IO and cache will help with some things, but absent clockspeed, making an M2 significantly better than an M1 is hard.

I don't see why people complain that it's only one benchmark. Importing 100MB of image data invites all of the IO, cache and CPU to play along, because this is encoded data: it has to be mapped out of one form and transformed into another, as a stream of data, and it's a lot bigger than either a single fetch from memory or a single bus transaction. It's amenable to parallelism within some limits, depending on the nature of the encoding. It's also a real-world test.


> To date, I haven’t heard anyone saying Lightroom is significantly snappier on an M1 than on a recent Intel MBP. I’d be happy to be corrected.

This seems to be throwing the baby out with the bathwater. People have mostly been saying that everything else is snappier on an M1, app launches, task switching, etc.


Everything is really, really fast, and it never heats up, and it's silent, and the battery lasts longer. I've been super impressed; it's the biggest 'jump' I've felt when moving to a new laptop.


Yeah. Everything is so FAST on my M1. At this point I'd say the fact it has Classic in the name explains why it's slow.


> The idea isn’t crazy. The last few releases of Lightroom have claimed to make more use of the GPU

Lightroom keeps claiming this on new releases, and has for years, and users keep finding very little improvement. Adobe clearly wants users to move to the non-Classic cloud version.

> Anyhow, it’d be really surprising if Apple managed to get ahead of GPU makers like NVidia. Now, at this point, based on the M1 we should expect surprises from Apple. But I’m not even sure that’d be their best silicon bet.

I would be very surprised, competition is intense and there is a lot of money on the table for anyone with a better architecture. Nvidia would compete for any acquisitions. Ultimately does Apple even need to compete with Nvidia? They don't care about gaming, just desktop graphics and local ML inference.


The rumors from rumor sources are that the M1X or M2 was expected to be released at WWDC but delayed because of delays sourcing Mini LED panels. Obviously that's based on rumors, but it's a lot more plausible than the hypothesis that Apple, a company which has executed a very similar platform transition before while maintaining internal builds for their target platform, vastly overestimated their own capabilities.


There’s also a global pandemic going on that’s affecting the supply chain for a bunch of components.


There are two theories to what is going on:

1) The best silicon designers in the world have hit a wall and can’t improve on a processor they shipped to production a year ago.

2) There is a massive, global-scale supply chain disaster that is largely hidden from the end consumer which is secretly driving every major manufacturer insane.

It seems that a bunch of industry “insiders” want to go with option 1; personally, I’ll take the second option.


Third theory. There is no problem. Apple is executing their plan on the two year schedule they announced. They are releasing hardware on the schedules they always have.


Not even a year.

All I’ve heard is that the mini LED screens for the next (M1X) MacBooks haven’t been available in volume/to spec quite yet.

I’m really not thinking there’s any real delay of concern here yet. Not enough to write the above article and speculate somewhat needlessly.


> But the returns on memory investment past 16G are, for most people, just not gonna be that dramatic in general and specifically, probably won’t make your media operations feel faster.

This is only true for editing still images. If you're editing video (with tools like Davinci Resolve - which is free and excellent), 16GB is going to be quite inadequate.


If you're editing very high bitrate or high resolution video - or both. 16G is perfectly adequate for 1080p video around 30MB/s, which most people's video will be.


Considering even the cheapest smartphones today record in 4K, and some go up to 60 fps (including iPhone), most people are definitely not editing 1080p videos on their macbooks.


I have a phone capable of 4K60, and I never use it, unless my goal is to fill up my phone's storage as quickly as possible.

And I've seen reports that the M1 can handle RED RAW 4K footage with just 8G of RAM, and only stutters on 8K. Snazzy Labs even claims that it actually scrubs the timeline smoother than this $10,000 Mac Pro ever did, the only failing is 50% longer render times: https://www.youtube.com/watch?v=eY-S9EuJ5Xs&t=474s

So this once again proves that the M1 defies our standard expectations and measurements of computer processing power.


> But it’s been eight months since the M1 shipped and we haven’t heard from Apple.

Sheesh, 8 months is not a long time as far as these sorts of things go. Wasn't the M2 not expected until next year at the earliest? Given the current semiconductor production issues that might even be optimistic.


Geez, does he not know how long it takes just to fab a chip? Find a mistake, fix it, tape out again, and start over.


The biggest issue for me is that the M1 Macbooks can't drive two external displays. A GPU improvement (or whatever else is holding it back here) is sorely needed for a "Pro" device. 32GB RAM would be nice as well.


Assume they had a finalized design for M2 at the same time as M1. If their plan was to produce and roll out machines based on M2 shortly after the release of M1 they likely were not able to do it as planned due to COVID bottlenecks. Almost every company that produces physical products had to focus their resources on producing high volume parts. This has happened quite a bit in trailer manufacturing for example. All of the raw materials are going towards high volume selling trailer models. Niche trailers models are not being made in any significant quantity.


Seems a bit premature, honestly. If you’ve been paying attention to iPhones, Apple has been pushing steady improvements year after year. Some years it’s more, some years it’s less, but every year they have a new microarchitecture and it’s faster than the one from last year. I see no reason for Apple to suddenly stop being able to improve performance in their next generation chips, especially as they clearly have much more thermal and multicore headroom to spend.


I remember having lunch with some google engineers who lamented that they would find clever optimizations and management would save them for future generations. It makes sense from a marketing standpoint, but from an engineering standpoint it drove the engineers nuts. I get it though, it offers a good path to having consistent gains.


> I have a good guess what’s going on: It’s proving really hard to make a CPU (or SoC) that’s perceptibly faster than the M1.

While I don't buy this at all, I don't even want a chip that is perceptibly faster than the M1. Instead give me other "pro" features like more than 2 USB ports, 32GB RAM, multiple external displays, external GPU support, boot camp, larger SSD capacity.


I am constantly between 22-25GB so I definitely need that 32GB RAM. Maybe 64GB for future proofing. I kinda gave up on GPUs on Mac.


But you aren't using an M1. The article says M1s can only go up to 16GB of memory, and the M2 should be able to do 64GB.


That was my point. That’s why I don’t have an M1.


Can we also take a moment to reflect on the odd shape of the M1’s heatspreader? It looks like a normal heatspreader sawn in half.


It's literally designed to fit in the space previously used by the fan. They took all the internal shortcuts they could to avoid redesigning that machine.


came here to say this, kinda.

m1 is a chip designed to fit into their current lineup. it makes sense to launch m2 with redesigned hardware or after a hardware redesign. ergo, the M2 will need to fit in the chassis of the new iMac.

hardware redesign + new in-house chip = a lot of time

I'd be amazed if this even drops this year instead of early 2022 with how long apple's release cycles are. think they'll prob roll out the smaller MBP, the Air, the mini redesigns in the fall/winter, while keeping the current 16" lineup. they'll launch the 16" and larger iMac at next summer's WWDC.


How do you know it was designed for their current lineup?


It's exactly the same design as the A12X.


Perhaps it’s just “We turned out to only need half of the design thermal capacity” meets “let’s break the mold” in half.


there's still a lot of cool stuff that could be done in silicon:

1) cross execution caching of unchanged segments of code: if the ram state doesn't change across runs for a set of instructions, could execution be sped up by caching that ram state with os and silicon support for applying it?

2) assured computing: cryptographic guarantees about exactly which instructions executed, on which physical machine with what inputs

3) hardware support for accelerated vms emulating other architectures

4) more efficiency gains by having more stuff like the low power cores for light workloads

5) more of the cool specialty cores for accelerating ml/dsp/linalg

6) better in-silicon multitenant separation for security

7) in-silicon support for reducing memory costs of running multiple versions that are mostly similar of libraries (hardware support for containers)

are just a few... apple is in a great place because their vertical integration makes some of these possible. (and the good stuff of course would eventually make it to commodity hardware)


> To date, I haven’t heard anyone saying Lightroom is significantly snappier on an M1 than on a recent Intel MBP. I’d be happy to be corrected.

LOL. Has the author even spoken with actual photographers? Also, Lightroom Classic? Seriously? Lightroom CC is blazing fast on an M1. I know because it has been my go-to real-life benchmark with all the M1 Macs I tested so far (all of them). You should of course compare them with similarly priced Intel Macs. But anyway, try working on 1000+ Canon EOS R5 raw files on a MacBook Pro M1 and the latest MacBook Pro Intel, then tell me which one’s faster. I’m not sure I would be able to hear you over the noise of the Intel Mac’s fans, though.

This “perception of speed” theory doesn’t hold up to explain the M1X/M2 delay. I definitely believe TSMC’s bottlenecks are a much more probable explanation.


The speed is pretty much irrelevant right now

All they need to do is support external RAM so they can make a machine with up to 128 or 256GB of memory. It's easy enough, but for their own reasons they have just chosen not to do it yet.


32+ GB would sell me on it quite easily. 16GB is too cramped right now. Bumping the GPU would be useful as well; it doesn't need to match a 3080, but an incremental improvement would be nice.


So many YouTube channels CONFIRMED that laptops with the M2 would land at WWDC.


Not very interesting. This is more like a wishlist than analysis.


Shoot, just make the MBP 16 with two M1s that are designed to run in parallel and only fire up #2 when you need it.


Well, you had to wonder when that 5nm advantage was going to take its toll. Considering that they spent almost an entire node and a half on increasing efficiency, it's unlikely that they'll be selling pro hardware until they can get their hands on 3nm silicon: without some increase in chip density, they're going to continue to languish in M1 elo-hell.



I think the missing premise in this article is that they control the operating system in lockstep with the CPU, and so when they optimize frameworks and Objective-C internals for CPU hardware acceleration, they can require those OS optimizations for each revision of their CPU, with no backward compatibility whatsoever.

The best analogy I can think of is to imagine if epoll or dbus were hardware-accelerated, and every year a silicon update and a kernel update added more or improved acceleration for more and more components.

We don’t yet know what the speed gains possible from reducing OS overhead are, but I bet Apple does, and I bet it doesn’t require frequency bumps at all. If someone has compiled vmlinuz into FPGA tapeout somehow, that would be a good point of comparison. (“Inconceivable!”, except not so much nowadays..)

[I can’t find the tweet linked in the past few weeks about CPU-accelerated ObjC calls, but there’s an HN discussion about it that’s worth reading.]


Apple’s OS optimization for their processors doesn’t really look like what you’re describing, it’s more general-purpose than that except for a handful of hardware accelerators for extremely hot code.


If that were the case, Safari wouldn't perform much better than Chrome on certain benchmarks on Apple ARM based devices. But I think "optimization" is more making commonly used instructions really fast (such as uncontended atomics) and providing custom instruction extensions to accelerate common use cases in hardware (for example, I believe there is an undocumented branch prediction hint capability, whereas afaik for x86 you can only tell your compiler to order branches according to compile-time hints).


It’s definitely only M1, that’s for sure. But I suspect it’ll be more than that over time. We’ll see, I suppose!



