VLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (2023) (vllm.ai)
3 points by telotortium 34 days ago | past

vLLM V1: A Major Upgrade to vLLM's Core Architecture (vllm.ai)
2 points by ozgune 3 months ago | past

vLLM V1: A Major Upgrade to vLLM's Core Architecture (vllm.ai)
5 points by xmo 3 months ago | past

VLLM 2024 Retrospective and 2025 Vision (vllm.ai)
1 point by shenli3514 3 months ago | past

Installing and Developing VLLM with Ease (vllm.ai)
1 point by brethil 3 months ago | past

vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction (vllm.ai)
3 points by xmo 7 months ago | past

VLLM automatic prefix / prompt caching (vllm.ai)
2 points by danielhanchen 8 months ago | past | 1 comment

VLLM hosts local LLMs easily (vllm.ai)
2 points by myprotegeai 9 months ago | past

Llama 3.1 Support in VLLM (vllm.ai)
2 points by e12e 9 months ago | past

vLLM (vllm.ai)
2 points by jonbaer on April 24, 2024 | past

VLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (vllm.ai)
2 points by udev4096 on Jan 7, 2024 | past

Notes on VLLM v.s. DeepSpeed-FastGen (vllm.ai)
3 points by Palmik on Nov 15, 2023 | past

vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (vllm.ai)
295 points by wskwon on June 20, 2023 | past | 42 comments
