
I recently found this paper[1] claiming near-GPT-3 performance with only a fraction of the parameters. The authors seem to simply reformulate the input sequence so that classification becomes a sequence generation task.
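To make the idea concrete, here is a rough sketch of what I understand by "reformulating" a classification input as a generation problem: wrap the text in a prompt and let a language model pick the continuation word that stands for each label. The pattern, the label words, and the toy scorer below are all my own assumptions for illustration, not the paper's actual setup, and the scorer is a keyword counter standing in for a real model.

```python
# Hypothetical prompt pattern and label words, chosen only for illustration.
PATTERN = "Review: {text}\nSentiment:"
LABEL_WORDS = {"positive": "great", "negative": "terrible"}


def to_prompt(text):
    """Wrap the raw input in a prompt that the model is asked to continue."""
    return PATTERN.format(text=text)


def classify(text, score_continuation):
    """Pick the label whose word the scorer rates highest as a continuation.

    score_continuation(prompt, word) is assumed to return a number that is
    larger when `word` is a more likely continuation of `prompt`.
    """
    prompt = to_prompt(text)
    scores = {
        label: score_continuation(prompt, word)
        for label, word in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)


def toy_scorer(prompt, word):
    """Stand-in for a language model: counts hand-picked cue words."""
    cues = {
        "great": ("loved", "excellent"),
        "terrible": ("boring", "awful"),
    }
    lowered = prompt.lower()
    return sum(lowered.count(cue) for cue in cues[word])


if __name__ == "__main__":
    print(classify("I loved this film", toy_scorer))
    print(classify("A boring, awful mess", toy_scorer))
```

With a real model, you would swap toy_scorer for something that returns the model's probability of each label word given the prompt; the rest of the wrapper would stay the same.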

Disclaimer: I am not affiliated with any of the authors.

[1] https://arxiv.org/pdf/2009.07118.pdf


