While I fully agree with you on the absence of good benchmarks and the growing LLM slop ...

"running a model trained in FP8 in 16bit, something noone would do, etc"

I did that because on the RTX 3090 - which can be good bang for the buck for inference - the FP8 support is nerfed at the driver level. So a kernel that upcasts FP8 to FP16 inside SRAM, does the matmul there, then casts the result back down to FP8 can bring massive performance benefits on those consumer cards.
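In PyTorch terms the pattern is just this (a toy sketch with made-up shapes, needs a CUDA box and a PyTorch recent enough to have float8 dtypes; the real win comes from fusing the upcast into the matmul kernel so the FP16 copy only ever exists in SRAM instead of being materialized in VRAM):

    import torch

    # Store the weight in FP8 to save memory/bandwidth, compute in FP16.
    w_fp8 = torch.randn(4096, 4096, device="cuda",
                        dtype=torch.float16).to(torch.float8_e4m3fn)
    x = torch.randn(1, 4096, device="cuda", dtype=torch.float16)

    y = x @ w_fp8.to(torch.float16).T    # upcast, then FP16 matmul
    y_fp8 = y.to(torch.float8_e4m3fn)    # cast the result back down to FP8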

BTW, you can run a good DeepSeek-V3 quant on a single H200.
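For reference, the weights-only napkin math (671B total parameters vs the H200's 141 GB of HBM):

    params = 671e9   # DeepSeek-V3 total parameter count
    hbm = 141e9      # H200 HBM in bytes

    for bits in (8, 4, 2, 1.58):
        gb = params * bits / 8 / 1e9
        print(f"{bits} bits/weight -> ~{gb:.0f} GB of weights "
              f"({'fits' if gb < 141 else 'too big'})")
    # weights alone fit only below ~1.7 bits/weight on average;
    # anything larger needs more GPUs or CPU offload.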




Thanks! I was looking at Blackwell RTX PRO 6000s, 8x 96 GB, for running the full FP8 model (since FP8 is natively supported on Blackwell and presumably fast).

I know AWQ should run, and be pretty snappy and efficient with the new MLA support added, but I wanted to check whether FP8 fits as well, because from simple napkin math it looks pretty tight (it might only work at bs=1 with ctx_len < 8k, which would probably not suit coding tasks).
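Roughly, assuming ~1 byte per weight at FP8 and ignoring scale factors and anything kept in higher precision:

    params = 671e9            # DeepSeek-V3 total parameters
    vram = 8 * 96e9           # 8x 96 GB, in bytes

    weights = params * 1.0    # ~1 byte/param at FP8
    headroom = vram - weights
    print(f"~{headroom / 1e9:.0f} GB left in total, "
          f"~{headroom / 8e9:.0f} GB per GPU")
    # ~97 GB total / ~12 GB per GPU has to cover activations, scale
    # factors, framework overhead and the KV cache, hence "pretty tight".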



