
Resolving that issue would help reduce (not eliminate) the size of the context. The model will still only just barely fit in 16 GB, which is what the parent comment asked about.
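
As a rough sketch of why the context is the squeeze here (the numbers below are illustrative assumptions for a 13B-class model at ~4.5 bits per weight, not figures from this thread):

    # Back-of-the-envelope VRAM estimate: quantized weights plus KV cache.
    # All parameters are illustrative assumptions, not measured values.

    def weights_gib(n_params_b: float, bits_per_weight: float) -> float:
        """Size of the quantized weights in GiB."""
        return n_params_b * 1e9 * bits_per_weight / 8 / 2**30

    def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                     ctx_len: int, bytes_per_elem: int = 2) -> float:
        """KV cache in GiB: 2 tensors (K and V) per layer, fp16 elements."""
        return (2 * n_layers * n_kv_heads * head_dim
                * ctx_len * bytes_per_elem) / 2**30

    w = weights_gib(n_params_b=13, bits_per_weight=4.5)   # ~6.8 GiB
    kv = kv_cache_gib(n_layers=40, n_kv_heads=40,
                      head_dim=128, ctx_len=8192)          # ~6.3 GiB
    print(f"weights ~{w:.1f} GiB + KV cache ~{kv:.1f} GiB")

Under those assumptions you land around 13 GiB before any runtime overhead, so a 16 GB card only barely fits, and pushing the context past ~8K tokens blows the budget.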

Best to have two or more low-end 16 GB GPUs, for 32 GB or more of total VRAM, to run most of the better local models.
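
A minimal sketch of one way to do that split, assuming the Hugging Face transformers + accelerate stack (device_map="auto" shards layers across all visible GPUs); the model id here is a placeholder:

    # Sketch: shard one model across two 16 GB GPUs.
    # "example/some-13b-model" is a hypothetical id, not a real checkpoint.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "example/some-13b-model"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",   # spread layers across both cards
    )

    inputs = tok("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))

llama.cpp can do the same kind of split natively via its --tensor-split flag, if you'd rather stay off the Python stack.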





