
> Your points don't explain why at 3Ghz this chip can outperform 5Ghz chips in single-threaded workloads.

It's a trade-off. Making the core wider is the reason the IPC is higher, but it's also the reason it is a 3GHz chip instead of a 5GHz chip.
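
To put rough numbers on that trade-off: if single-threaded throughput is approximated as IPC times clock, you can work out how much extra IPC the slower chip needs just to break even. A back-of-the-envelope sketch (the function names and example IPC values are made up for illustration, not measurements of any real chip):

```python
def instructions_per_second(ipc, clock_ghz):
    """Approximate throughput as IPC x clock; ignores memory stalls, turbo, etc."""
    return ipc * clock_ghz * 1e9


def breakeven_ipc_ratio(slow_ghz, fast_ghz):
    """How many times higher the slower chip's IPC must be to match the faster one."""
    return fast_ghz / slow_ghz


# Hypothetical IPC figures, chosen only to show the arithmetic.
wide_core = instructions_per_second(ipc=4.0, clock_ghz=3.0)
narrow_core = instructions_per_second(ipc=2.2, clock_ghz=5.0)
print(breakeven_ipc_ratio(3.0, 5.0))  # about 1.67
print(wide_core > narrow_core)
```

So a 3GHz part has to sustain roughly two thirds more instructions per cycle than a 5GHz part before it pulls ahead at all.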

The author's explanation of why x86 processors can't do this is also not especially compelling. The reason very wide processors are atypical is that common spaghetti code has poor instruction-level parallelism (ILP) -- the processor can't extract what isn't there, so there is a point at which higher clocks become the only way to make bad code run faster. Interestingly, synthetic benchmarks are often less susceptible to this because their code is optimized to maximize ILP. It's a shame we still have so few real-world benchmarks (mostly because the applications still haven't been ported).
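
A toy way to see "can't extract what isn't there": treat a program as a list of instructions, each naming the earlier instructions it depends on, and take ILP as instruction count divided by the length of the longest dependency chain. This is a simplified model with unit latencies and unlimited width, and the two example programs are invented for illustration:

```python
def critical_path(deps):
    """Length of the longest dependency chain, assuming every instruction takes 1 cycle.

    deps[i] lists the indices of earlier instructions that instruction i waits on.
    """
    depth = []
    for d in deps:
        depth.append(1 + max((depth[i] for i in d), default=0))
    return max(depth, default=0)


def ideal_ilp(deps):
    """Upper bound on instructions per cycle for a machine of unlimited width."""
    return len(deps) / critical_path(deps)


# Eight adds into one accumulator: each add waits on the previous one.
serial_chain = [[]] + [[i] for i in range(7)]

# The same eight adds split across four accumulators: each waits on the add 4 back.
four_accumulators = [[], [], [], []] + [[i] for i in range(4)]

print(ideal_ilp(serial_chain))       # 1.0
print(ideal_ilp(four_accumulators))  # 4.0
```

The serial version gains nothing from extra width, while the hand-unrolled version (the kind of thing tuned benchmark code tends to do) can keep four units busy.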

> The DDR4 is faster than average, but you can get those kinds of speeds on enthusiast PCs and it doesn't make a big difference there.

It depends heavily on the task, but the difference in some cases is close to 40%:

https://www.tomshardware.com/reviews/best-ram-speed,5951-6.h...



Exactly. The far larger ROB, great branch predictor, large L1, and significantly decreased memory latency, combined with a lower core frequency, all contribute to keeping the core fed and preventing it from stalling.

Still, sustaining 8-wide (or even just 4-wide) issue is very hard. Apparently most code has an average ILP of about 1.5.
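
As a rough illustration of why width stops paying off at that point, here is a toy scheduler: unit-latency instructions, an unlimited reorder window, no cache misses or branch mispredicts, and greedy issue in program order. The synthetic program is built to have an ILP of exactly 1.5; none of this models a real core:

```python
def cycles_needed(deps, width):
    """Cycles to issue all instructions when at most `width` can issue per cycle.

    deps[i] lists the indices of earlier instructions that instruction i waits on.
    """
    issue_cycle = []
    used = {}  # cycle -> number of issue slots already taken
    for d in deps:
        # Earliest cycle at which all inputs are available (1-cycle latency).
        c = max((issue_cycle[i] + 1 for i in d), default=0)
        # Slide forward until a cycle has a free issue slot.
        while used.get(c, 0) >= width:
            c += 1
        used[c] = used.get(c, 0) + 1
        issue_cycle.append(c)
    return max(issue_cycle) + 1 if issue_cycle else 0


def ilp_1_5_program(blocks):
    """Hypothetical program: 3 instructions per 2 levels of dependency chain."""
    deps = []
    for k in range(blocks):
        a = 3 * k
        deps.append([a - 2] if k > 0 else [])  # A_k waits on B_(k-1)
        deps.append([a])                       # B_k waits on A_k
        deps.append([a])                       # S_k waits on A_k (side work)
    return deps


program = ilp_1_5_program(10)
for width in (1, 2, 4, 8):
    cycles = cycles_needed(program, width)
    print(width, cycles, len(program) / cycles)
```

In this model, going from 1-wide to 2-wide takes the 30 instructions from 30 cycles to 20, and 4-wide and 8-wide stay at 20, i.e. an IPC of 1.5 regardless of how many more issue slots you add.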

I guess the large width helps the core recover from stalls, somehow absorbing the spikes of work that pile up behind them.



