How could it be competitive with GPUs on price per unit of performance? It seems like GPUs are expensive only because you can't rent a small portion of one. But shouldn't that be possible with GPU virtualisation?
Or is it the case that if you split a GPU into tiny virtualised slices, the memory-to-FLOPS ratio of each slice would be way off what's needed for inference? Or would the virtualisation overhead be too big?
Those are all genuine questions, just to be clear - this is not my area of expertise.