
Not effectively, because you probably can't return to language pre-training after multi-task fine-tuning and RLHF. Stage 1 has to go before stages 2 and 3. So they would need to fine-tune a stage 1 model and re-apply stages 2 and 3.
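To make the ordering constraint concrete, here is a minimal Python sketch. The pretrain / finetune / rlhf functions and the dict-based "model" are purely illustrative assumptions, not any real training API; the point is only the dependency between stages.

    # Stage ordering: pre-train -> multi-task fine-tune -> RLHF.
    # Each later stage requires the output of the earlier one.

    def pretrain(corpus):
        """Stage 1: language pre-training on a large text corpus."""
        return {"stage": "pretrained", "data": list(corpus)}

    def finetune(model, tasks):
        """Stage 2: multi-task (instruction) fine-tuning."""
        assert model["stage"] == "pretrained", "stage 1 must come first"
        return {**model, "stage": "finetuned", "tasks": tasks}

    def rlhf(model, preferences):
        """Stage 3: reinforcement learning from human feedback."""
        assert model["stage"] == "finetuned", "stage 2 must come first"
        return {**model, "stage": "aligned", "prefs": preferences}

    def update_with_new_data(base_model, new_corpus, tasks, preferences):
        """Adding new knowledge means going back to the stage 1 model,
        continuing pre-training/fine-tuning there, and then re-applying
        stages 2 and 3 -- you can't bolt new pre-training onto an
        already fine-tuned / RLHF'd checkpoint."""
        refreshed = pretrain(base_model["data"] + list(new_corpus))  # redo/extend stage 1
        refreshed = finetune(refreshed, tasks)                       # redo stage 2
        return rlhf(refreshed, preferences)                          # redo stage 3

    # Usage (hypothetical data):
    base = pretrain(["web text"])
    deployed = rlhf(finetune(base, ["qa", "summarization"]), ["human prefs"])
    refreshed = update_with_new_data(base, ["new documents"], ["qa"], ["human prefs"])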

