I’m trying right now. The combination of small models, QLoRA, and GRPO has made it accessible to experimenters. I’m not using Unsloth yet, but I will probably start checking it out soon so that I can train larger models or increase the number of generations for GRPO.


