
Not an expert, but my high-level understanding is this: if you think of a model as a set of input layers, some middle layers, and a set of output layers, fine-tuning concentrates on only the output layers.

Useful for taking a generic model with a base level of knowledge and tuning it so the output is more useful for an application-specific use case.
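A minimal sketch of that idea (all shapes and numbers here are made up for illustration): a tiny two-layer network where "fine-tuning" updates only the final output weights, while the pretrained hidden layer stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

W_hidden = rng.normal(size=(4, 8))   # pretrained middle layer, frozen
W_out = rng.normal(size=(8, 2))      # output layer, the only thing we train

x = rng.normal(size=(1, 4))          # one toy input
target = np.array([[1.0, 0.0]])      # toy task-specific label

lr = 0.1
for _ in range(500):
    h = np.tanh(x @ W_hidden)        # frozen features from the base model
    y = h @ W_out                    # trainable head
    grad_out = h.T @ (y - target)    # gradient of squared error w.r.t. W_out only
    W_out -= lr * grad_out           # W_hidden is never updated
```

After a few hundred steps the head fits the toy target while the base features are untouched, which is the "tune only the output" picture from the comment above.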



Not strictly true, I think:

- you could add new units throughout and train those while freezing existing units (adapter-based fine-tuning)

- you could train all units and use e.g. low-rank adaptation to limit how much they can change

- you could do prefix tuning and train an input to add at every layer

see e.g. - https://lightning.ai/pages/community/article/understanding-l...
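A hedged sketch of the second bullet, the low-rank adaptation (LoRA) idea: the frozen pretrained weight W is left alone, and the update is constrained to a small trainable low-rank delta B @ A. Dimensions and rank below are illustrative, not from any real model.

```python
import numpy as np

rng = np.random.default_rng(42)

d_out, d_in, rank = 16, 16, 2
W = rng.normal(size=(d_out, d_in))        # pretrained weight, frozen
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero-initialized

def forward(x):
    # Effective weight is W + B @ A; because B starts at zero,
    # the adapted layer is initially identical to the pretrained one.
    return x @ (W + B @ A).T

x = rng.normal(size=(1, d_in))
print(np.allclose(forward(x), x @ W.T))   # True: no change before training

# Only A and B are trained: 2 * (16*2) = 64 parameters
# versus 16*16 = 256 for full fine-tuning of W.
```

The low rank is what "limits how much they can change": any learned update to the layer is confined to a rank-2 subspace, however long you train.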


I think that's more in line with transfer learning, a variant of fine-tuning. If I'm reading this article correctly, they're fine-tuning the LMs end-to-end.



