
How could it be competitive with GPUs in terms of price per unit of performance? It seems like GPUs are expensive only because you can't rent a small portion of one, but shouldn't that be possible with GPU virtualisation?

Or is it the case that if you virtualised a GPU into tiny pieces, the memory-to-FLOPS ratio would be way off what inference needs? Or that the virtualisation overhead would be too big?

Those are all genuine questions, just to be clear - this is not my area of expertise.



Well, a GPU is a cluster of SIMD units with fast memory.

A per-thread variable on a GPU is just like a SIMD lane, and the warp's copies of that variable, taken together, are just like one SIMD vector register.
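To make that mapping concrete, here is a minimal sketch in host-side C++ with AVX-512 intrinsics (my own illustration, not from the thread): each of the 16 float lanes in a __m512 register plays the role of one GPU thread's private copy of a variable, and the register as a whole plays the role of the warp.

    // Minimal sketch of the lane/thread analogy, assuming a CPU with AVX-512F.
    // Build with e.g.: g++ -O2 -mavx512f lanes.cpp
    #include <immintrin.h>
    #include <cstdio>

    int main() {
        alignas(64) float a[16], b[16], c[16];
        for (int i = 0; i < 16; ++i) { a[i] = float(i); b[i] = 2.0f * i; }

        // "Warp" view: one 512-bit register holds 16 float lanes, the way a
        // warp holds each thread's copy of the same variable.
        __m512 va = _mm512_load_ps(a);
        __m512 vb = _mm512_load_ps(b);

        // One instruction updates every lane at once, like a warp-wide GPU
        // instruction where each thread computes c[i] = a[i] + b[i].
        __m512 vc = _mm512_add_ps(va, vb);

        _mm512_store_ps(c, vc);
        for (int i = 0; i < 16; ++i) printf("%g ", c[i]);
        printf("\n");
    }

Writing the same thing as a plain scalar loop and compiling with -O2 -mavx512f will often auto-vectorise to the same 512-bit add, which is the sense in which the wide unit sits behind ordinary-looking code.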

Nvidia's portable GPU instruction set is called PTX, and its warp-wide arithmetic instructions play much the same role that AVX-512 instructions do on a CPU.

A CPU core with AVX-512 is like a general-purpose core with a small 512-bit-wide GPU built into it.
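A rough sketch of that division of labour (again my own example, assuming the same AVX-512F setup): ordinary scalar C++ on the core handles the loop control and the leftover elements, while the 512-bit unit in the same core does the wide arithmetic.

    // Sketch: general-purpose scalar code and 512-bit vector code running on
    // the same core; assumes AVX-512F (g++ -O2 -mavx512f).
    #include <immintrin.h>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    float sum(const float* x, std::size_t n) {
        __m512 acc = _mm512_setzero_ps();
        std::size_t i = 0;

        // The "GPU-like" part: 16 floats accumulated per instruction.
        for (; i + 16 <= n; i += 16)
            acc = _mm512_add_ps(acc, _mm512_loadu_ps(x + i));

        // The "general-purpose CPU" part: reduce the lanes and handle the
        // tail with ordinary scalar code.
        alignas(64) float lanes[16];
        _mm512_store_ps(lanes, acc);
        float total = 0.0f;
        for (int l = 0; l < 16; ++l) total += lanes[l];
        for (; i < n; ++i) total += x[i];
        return total;
    }

    int main() {
        std::vector<float> v(100, 1.0f);
        printf("%g\n", sum(v.data(), v.size()));  // prints 100
    }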

So paying for a single AVX-512 core is like paying for a slice of a GPU, plus the general-purpose compute you need to keep that slice supplied with work.

If you could divide the GPU up, you would lose most of the parallelism, keep all of the communication latency, and still need the drivers etc.

Would a hypothetical virtualized GPU be competitive with an AVX-512 core in terms of price/performance? I don't know; I haven't done the comparison.



