
In general, most giant LLMs are extremely undertrained at this time. Consider that most of the gains of RoBERTa over BERT came from simply continuing to train.
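For anyone who wants to try continued pretraining themselves, here is a rough sketch using Hugging Face Transformers. This is not the RoBERTa authors' actual setup (they used far more data and compute); the corpus file and hyperparameters below are placeholders:

    # Minimal sketch: continue masked-LM pretraining of RoBERTa on extra raw text.
    from datasets import load_dataset
    from transformers import (
        RobertaTokenizerFast,
        RobertaForMaskedLM,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
    model = RobertaForMaskedLM.from_pretrained("roberta-base")

    # Hypothetical additional corpus; swap in whatever raw text you have.
    dataset = load_dataset("text", data_files={"train": "extra_corpus.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

    # Randomly mask 15% of tokens, the standard BERT/RoBERTa MLM objective.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

    args = TrainingArguments(
        output_dir="roberta-continued",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        learning_rate=1e-5,
    )

    Trainer(
        model=model,
        args=args,
        train_dataset=tokenized,
        data_collator=collator,
    ).train()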


Cases of undertraining can be observed whenever the output degenerates into repeated gibberish or loops. This happened a lot in the GPT-2 AI Dungeon days.
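A crude way to flag that kind of looping output is to count repeated n-grams. This is just my own quick heuristic, not anything AI Dungeon actually shipped:

    # Flag degenerate output: does any n-gram repeat suspiciously often?
    from collections import Counter

    def looks_degenerate(text: str, n: int = 4, max_repeats: int = 3) -> bool:
        tokens = text.split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        if not ngrams:
            return False
        most_common_count = Counter(ngrams).most_common(1)[0][1]
        return most_common_count > max_repeats

    print(looks_degenerate("the cat sat on the mat " * 10))  # True: obvious loop
    print(looks_degenerate("a short varied sentence with no repetition"))  # False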


So can we continue training RoBERTa to get it to, say, GPT-3 Ada's level?



