
GPUs are effectively “magic” and can do particular things hundreds or thousands of times faster than CPUs. Offloading a per-vertex or per-pixel task to the GPU, especially if doing so reduces bandwidth requirements, is almost always a big win.


I think reality is closer to this: GPU rasterization is 20-50 times faster than a CPU, and GPU throughput computing maybe 5-20 times faster.

For example, a consumer 12-core Ryzen 3000-series CPU peaks at 32 SP FLOPs/core/cycle. So at 3.8 GHz, that's a peak of ~1.46 SP TFLOPs (or ~0.73 DP TFLOPs). But its 50-60 GB/s of memory bandwidth limits it somewhat.
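The peak-throughput arithmetic above can be checked directly; this sketch assumes the quoted figures (12 cores, 32 SP FLOPs/core/cycle from dual 256-bit FMA units, 3.8 GHz) and that DP rate is half the SP rate on this core:

```python
# Peak theoretical throughput for a 12-core Zen 2-class CPU,
# assuming 32 single-precision FLOPs/core/cycle (2x 256-bit FMA/cycle).
cores = 12
sp_flops_per_core_per_cycle = 32
clock_ghz = 3.8

sp_tflops = cores * sp_flops_per_core_per_cycle * clock_ghz / 1000
dp_tflops = sp_tflops / 2  # DP runs at half the SP rate on this core

print(f"SP peak: {sp_tflops:.2f} TFLOPs")  # ~1.46
print(f"DP peak: {dp_tflops:.2f} TFLOPs")  # ~0.73
```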

Quick googling says a consumer Nvidia RTX 2080 peaks at 10 SP TFLOPs (or 0.314 DP TFLOPs: yes, less than half of the CPU example). Memory bandwidth is 448 GB/s.
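Putting the two sets of numbers side by side gives the actual ratios; a minimal sketch, assuming the figures quoted above (Turing's DP rate is 1/32 of SP, and the CPU bandwidth is taken as the midpoint of the 50-60 GB/s range):

```python
# GPU (RTX 2080) vs. CPU (12-core Zen 2-class) ratios from the quoted specs.
gpu_sp_tflops, gpu_dp_tflops, gpu_bw_gbs = 10.0, 0.314, 448.0
cpu_sp_tflops, cpu_dp_tflops, cpu_bw_gbs = 1.46, 0.73, 55.0

sp_ratio = gpu_sp_tflops / cpu_sp_tflops  # ~7x: GPU ahead, not 100x
dp_ratio = gpu_dp_tflops / cpu_dp_tflops  # <1x: the CPU actually wins
bw_ratio = gpu_bw_gbs / cpu_bw_gbs        # ~8x: the bandwidth gap

print(f"SP compute: {sp_ratio:.1f}x, DP compute: {dp_ratio:.2f}x, "
      f"bandwidth: {bw_ratio:.1f}x")
```

The point this makes concrete: the single-digit compute and bandwidth ratios are nowhere near "hundreds or thousands" of times, and in double precision the consumer GPU is slower than the CPU.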

GPUs win massively at rasterization, because they have huge memory bandwidth, a large array of texture samplers with hardware cache locality optimizations (like HW swizzling), texture compression, specialized hardware for z-buffer tests and compression, a ton of latency hiding hardware threads, etc.

But they're definitely not thousands or even hundreds of times faster.



