
The number of people running llama3 70b on Nvidia gaming GPUs is absolutely tiny. You need at least two of the highest-end 24 GB VRAM cards, and even then you're reliant on 4-bit quantization: the weights alone come to roughly 40 GB, leaving almost nothing for your context window.
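
Back-of-the-envelope math bears this out. A rough sketch: the 4.5 bits/weight figure is an assumption covering quantization scales and overhead, and the KV-cache numbers use Llama 3 70B's published config (80 layers, 8 KV heads under GQA, head dim 128) at fp16. Exact usage varies by runtime.

    # Rough VRAM estimate for Llama 3 70B at 4-bit quantization.
    # Approximations only; actual usage depends on the runtime
    # (llama.cpp, exllama, etc.) and quant format.

    params = 70e9                # parameter count
    bits_per_weight = 4.5        # assumed: ~4-bit quant + scale overhead
    weights_gb = params * bits_per_weight / 8 / 1e9   # ~39.4 GB

    # KV cache per token: K and V, 80 layers, 8 KV heads,
    # head dim 128, 2 bytes (fp16) each.
    kv_bytes_per_token = 2 * 80 * 8 * 128 * 2
    ctx_tokens = 8192
    kv_gb = kv_bytes_per_token * ctx_tokens / 1e9     # ~2.7 GB

    total_vram_gb = 2 * 24       # two 24 GB cards
    print(f"weights : ~{weights_gb:.1f} GB")
    print(f"kv cache: ~{kv_gb:.1f} GB at {ctx_tokens} tokens")
    print(f"headroom: ~{total_vram_gb - weights_gb - kv_gb:.1f} GB "
          "for activations and CUDA overhead")

That leaves on the order of 6 GB of headroom across both cards before the runtime's own overhead, which is why long contexts get tight fast.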


The cognitive dissonance here.

70B models aren't better than 7B models outside roleplay. The reasoning is just as bad either way. No one even cares about 70B models.



