Threadripper 3 with 64 cores is going to be mindblowing! It wasn't that long ago that the Parallella board advertised 64 slow cores, and soon we'll be able to get that many high-end x86/x64 cores!
It is pretty crazy. I felt the same way. Individual x64 cores tend to be so much more powerful than cores of other architectures, and now single chips will effectively have 128 logical cores.
For my purposes (large builds and rendering), I think RAM prices are holding back AMD here. To feed that many cores, you want really big RAM sticks. The CPUs have become a small cost compared to the RAM these days.
I've recently built a TR-based DL/ML workstation and bought 128GB of ECC 2,667MHz UDIMMs for ~$1600, roughly the same price as a 2990WX, but I would have vastly preferred to get 256GB instead. Unfortunately, only Samsung is sampling 32GB ECC DDR4 UDIMMs right now. I haven't seen them anywhere yet, and I expect the price is going to be insanely high :-(
Speaking of insanely high RAM prices, I just came across receipts for a PC I built in 1992. So 26 years ago, I paid $495 for 4MB. Yup, that's MB, not GB.
Admittedly these were AUD rather than USD. So maybe halve that for the USD cost.
When we complain about how expensive memory and compute are, a slightly longer term view shows it's still pretty good value!
You realize that it's not about comparing to 25 years ago though, right? When I looked at the beginning of this year, the same RAM size and speed was about twice as expensive as it was two years ago.
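To put rough numbers on that, here is a quick back-of-the-envelope sketch using only the figures quoted in this thread: $495 AUD for 4MB in 1992, and ~$1600 USD for 128GB recently. The 0.5 AUD-to-USD factor is just the "maybe halve that" guess from above, not a real exchange rate, and nothing here is inflation-adjusted. The function name is made up for illustration.

```python
MB_PER_GB = 1024

# Rough guess from the thread ("maybe halve that"), not a real exchange rate.
AUD_TO_USD_GUESS = 0.5


def price_per_gb(total_price, capacity_gb):
    """Nominal price per GB for a given purchase."""
    return total_price / capacity_gb


# 1992: $495 AUD for 4MB
price_1992_aud = price_per_gb(495, 4 / MB_PER_GB)
price_1992_usd = price_1992_aud * AUD_TO_USD_GUESS

# Recent: ~$1600 USD for 128GB of ECC UDIMMs
price_recent_usd = price_per_gb(1600, 128)

ratio = price_1992_usd / price_recent_usd

print(f"1992:   ~${price_1992_usd:,.0f} USD per GB (nominal)")
print(f"Recent: ~${price_recent_usd:,.2f} USD per GB")
print(f"Ratio:  ~{ratio:,.0f}x cheaper per GB")
```

With those inputs the 1992 purchase works out to tens of thousands of dollars per GB against $12.50 per GB for the recent one, so the long-term trend holds even if prices are up in the short term.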
The most important bit WRT TR3 is going to be the central I/O chiplet instead of dividing memory controllers between individual Zeppelin dies. No more NUMA headaches to deal with on their workstation/enthusiast CPUs. I'm glad that AMD saw that such an approach wasn't going to work long-term (at least not for the time being, when basically anything outside large database systems and hypervisors lacks even basic NUMA-awareness).
Do you know if that would allow all cores to have the same memory access speed like the current (16c in 2990WX) directly connected ones, or if it imposes a penalty (the same?) on all of them?
It's really hard to say what the memory latency is going to be, but at the very least this will mean that latency will remain consistent for access to every installed DIMM regardless of which CCX the request originates from.
On that note, I'm really interested to see if a dedicated I/O chiplet will help with the memory frequency scaling issues we see with the IMC on Zen/Zen+. I'm not sure what made the integrated controller on Zen so finicky compared to Intel's IMC, but this move will at the very least allow AMD to bin memory controllers if they want to, or maybe work around some issues with their design.
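Back on the NUMA point: if you want to see how your current machine presents memory to the OS, here is a minimal sketch for Linux. It assumes the kernel exposes NUMA nodes under /sys/devices/system/node, with a "distance" file per node. The function names are mine, and it says nothing about what TR3 will actually report.

```python
from pathlib import Path


def parse_distance_row(text):
    """Parse one sysfs 'distance' line such as '10 16 16 16' into ints."""
    return [int(field) for field in text.split()]


def read_numa_distances(root="/sys/devices/system/node"):
    """Return {node_id: [distance to node 0, node 1, ...]}.

    Returns an empty dict if the sysfs tree is missing (non-Linux host
    or a kernel built without NUMA support).
    """
    distances = {}
    for node_dir in Path(root).glob("node[0-9]*"):
        distance_file = node_dir / "distance"
        if distance_file.is_file():
            node_id = int(node_dir.name[len("node"):])
            distances[node_id] = parse_distance_row(distance_file.read_text())
    return distances


def is_uniform(distances):
    """True when at most one node is reported, i.e. no remote memory."""
    return len(distances) <= 1


if __name__ == "__main__":
    table = read_numa_distances()
    for node_id, row in sorted(table.items()):
        print(f"node{node_id}: {row}")
    print("uniform memory access:", is_uniform(table))
```

A single reported node means every core sees the same memory distance as far as the OS is concerned. More than one node means some accesses are remote, which is where the NUMA-awareness problem comes in.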