samspenc 75 days ago | on: Fine-tune Google's Gemma 3

Fine-tuning takes a significant amount of time (a few hours) on a single consumer GPU, even a 4090 / 5090, on a personal machine. I think most people use online services like RunPod, Vast.ai, etc. to rent high-powered H100s and similar GPUs for a few cents per hour, run the fine-tuning / training there, and use local GPUs only for inference on the resulting fine-tuned models.

danielhanchen 75 days ago

It used to be that way! Interestingly, I find that people in large orgs and general enthusiasts don't mind waiting: memory usage and quality are more important factors!
