
I did that a few years back: https://arxiv.org/abs/2011.11751


With a large model? How many parameters?

See my other comment here:

https://news.ycombinator.com/item?id=38536178


A couple million, IIRC. Nothing "large" compared to modern transformer models.
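(For anyone who wants to sanity-check a count like that, here's a minimal sketch, assuming a PyTorch model; the layer sizes are made up just to land in the low millions, not taken from the paper:)

    import torch.nn as nn

    # Hypothetical small network, roughly the scale discussed above.
    model = nn.Sequential(
        nn.Linear(512, 1024), nn.ReLU(),
        nn.Linear(1024, 1024), nn.ReLU(),
        nn.Linear(1024, 512),
    )

    # Count trainable parameters (weights + biases).
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"{n_params:,} trainable parameters")  # ~2.1M for this toy stack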


Thanks for getting back to me. That's what I thought. The magic seems to start happening in the low billions of parameters -- and I say "seems" b/c there's no consensus as to whether it's really truly magic! In any case, it's a shame that most of the human brainpower capable of improving SotA AI doesn't have access to large-scale resources.



