Hacker News
Der_Einzige | 14 days ago | on: Gemma 3 QAT Models: Bringing AI to Consumer GPUs

Running any HF model on vLLM is as simple as pasting the model name into one command in your terminal.
Zambyte | 14 days ago

What command is it? Because that was not at all my experience.
Der_Einzige | 14 days ago

`vllm serve …` — Hugging Face gives run instructions with vLLM for every model on their website.
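As a rough sketch of that one-command flow (the model ID below is illustrative — substitute any Hugging Face model you have access to; `vllm serve` is vLLM's OpenAI-compatible server entry point, assuming a supported Linux + CUDA setup):

```shell
# Install vLLM (prebuilt wheels target Linux + CUDA).
pip install vllm

# Serve a model by its Hugging Face ID; vLLM downloads it from the Hub
# and exposes an OpenAI-compatible API on port 8000 by default.
vllm serve google/gemma-3-4b-it

# In another terminal: query the OpenAI-compatible chat endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemma-3-4b-it",
       "messages": [{"role": "user", "content": "Hello"}]}'
```

Gated models additionally need a Hugging Face token (e.g. via `huggingface-cli login` or the `HF_TOKEN` environment variable) before the download will succeed.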
Zambyte | 14 days ago

How do I serve multiple models? I can pick from dozens of models that I have downloaded through Open WebUI.
iAMkenough | 13 days ago

I had to build it from source to run on my Mac, and the experimental support doesn't seem to include these latest Gemma 3 QAT models on Apple Silicon.