But it's not free: it still costs per-stream latency.
Speculative decoding seems less effective in practice than in theory.
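
For a rough sense of what "in theory" can mean here, below is a minimal sketch of the usual idealized estimate. It assumes each draft token is accepted independently with the same probability, and that one draft-model step costs a fixed fraction of one target-model step. The names `alpha`, `gamma`, and `cost_ratio` are illustrative choices, not taken from any particular library.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes (idealized) that each of the `gamma` draft tokens is accepted
    independently with probability `alpha`, and that the target model
    always contributes one token of its own.
    """
    if alpha >= 1.0:
        return gamma + 1.0
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^gamma
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Idealized speedup over plain autoregressive decoding.

    `cost_ratio` is the assumed cost of one draft step relative to one
    target step; one round costs gamma draft steps plus one target step.
    """
    return expected_tokens_per_step(alpha, gamma) / (gamma * cost_ratio + 1.0)


if __name__ == "__main__":
    # Sweep acceptance rates to see how quickly the estimate degrades.
    for alpha in (0.9, 0.7, 0.5, 0.3):
        speedup = expected_speedup(alpha, gamma=4, cost_ratio=0.1)
        print(f"alpha={alpha:.1f}  estimated speedup={speedup:.2f}x")
```

Even in this simplified model the estimate falls off quickly as the acceptance rate drops, and it ignores things like batching and verification overhead, which may account for part of the gap.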