dlewis1788 61 days ago | on: Nvidia Dynamo: A Datacenter Scale Distributed Infe...

Just curious what your issues with Triton were. We've done OK using it to serve LLMs with a classifier head via the HF Transformers pipeline and Flash Attention 2, as well as text generation models with the vLLM back-end.

bytesandbits 58 days ago

Triton is not that bad; TensorRT will give you nightmares.

dlewis1788 48 days ago

100% - probably why vLLM is now the default back-end in Dynamo.
