Running any HF model on vLLM is as simple as pasting a model name into one command in your terminal.



What command is it? Because that was not at all my experience.


It's vllm serve… Hugging Face gives vLLM run instructions for every model on their website.
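
For what it's worth, here's a minimal sketch of the same thing through vLLM's Python API (the CLI route is the vllm serve command above). The model id and sampling settings are just example values, not a recommendation:

    # Minimal sketch: load a Hugging Face model by id and generate text with vLLM.
    # The model id below is only an example; any compatible Hub id works the same way.
    from vllm import LLM, SamplingParams

    llm = LLM(model="google/gemma-3-4b-it")
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(["What does vLLM do?"], params)
    print(outputs[0].outputs[0].text)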


How do I serve multiple models? I can pick from dozens of models that I have downloaded through Open WebUI.
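

As far as I know, a single vLLM process serves one model at a time, so the usual pattern is one server per model, with a front end like Open WebUI pointed at each OpenAI-compatible endpoint. A rough sketch of client-side routing under that assumption (the ports, nicknames, and model set here are made up):

    # Hedged sketch: route prompts to two single-model vLLM servers through their
    # OpenAI-compatible endpoints. Ports, nicknames, and api_key are assumptions.
    from openai import OpenAI

    servers = {
        "gemma": OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY"),
        "llama": OpenAI(base_url="http://localhost:8001/v1", api_key="EMPTY"),
    }

    def chat(name: str, prompt: str) -> str:
        client = servers[name]
        # Each vLLM server lists the single model it is serving.
        model_id = client.models.list().data[0].id
        resp = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    print(chat("gemma", "Hello!"))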


Had to build it from source to run on my Mac, and the experimental support doesn't seem to include these latest Gemma 3 QAT models on Apple Silicon.


