They used research credits, and even setting that aside, with their code and training tips you could redo it for $50k on cloud instances, or less on dedicated hardware plus patience. And look at the progress on ImageNet training: after a lot of optimization work, you can train a near-SOTA ImageNet CNN in like a minute for $20-40. We've already seen a lot of improvements in LMs over the past 2 years... (For example, the main barrier to retraining GPT-2 is just the bloody memory use, with the Transformer's attention exploding at runtime, which pushes you onto high-end hardware like cloud TPUs on GCP. Do Sparse Transformers fix that?)
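To make the "memory exploding" point concrete, here's a rough back-of-envelope; the layer/head counts are roughly the 345M model's, and fp32 with no activation checkpointing is my assumption:

```python
# Back-of-envelope for why dense Transformer attention blows up memory at long
# context. Hyperparameters are roughly GPT-2 345M's (24 layers, 16 heads,
# context 1024); fp32 and no activation checkpointing are assumptions.

def dense_attention_bytes(seq_len, n_layers=24, n_heads=16, bytes_per_float=4, batch=1):
    # Memory for the [batch, heads, seq, seq] attention matrices alone,
    # summed over layers; gradients and other activations come on top.
    return batch * n_layers * n_heads * seq_len * seq_len * bytes_per_float

for seq in (1024, 2048, 4096):
    gib = dense_attention_bytes(seq) / 2**30
    print(f"seq={seq}: ~{gib:.1f} GiB of attention matrices per example")
# seq=1024: ~1.5 GiB, seq=2048: ~6.0 GiB, seq=4096: ~24.0 GiB -- quadratic in
# context length. Sparse Transformers attend to roughly O(n*sqrt(n)) positions
# instead, which is where the memory savings would come from.
```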
OK, I exaggerated a little because I was recalling from memory: the old fast.ai approach actually takes <18 minutes (https://www.fast.ai/2018/08/10/fastai-diu-imagenet/). My bad. (I'm sure it's improved since then, but I don't know by how much.) I was also thinking of https://myrtle.ai/how-to-train-your-resnet-8-bag-of-tricks/, which does CIFAR-10 in 26s, but I'm not sure offhand what CIFAR-10's SOTA looks like, so I'm not sure how far away that is.
Actually, this is still very good. Thanks for the links. I'll be timing some of these tricks tomorrow for my ImageNet experiments. By the way, I believe this is the current SOTA for ImageNet: https://arxiv.org/abs/1905.11946 (EfficientNet, ~84% top-1 / 97% top-5). CIFAR-10 appears to be essentially solved (99%).
Hm, maybe. It depends on how easy their training code is to use and how long retraining would take. It will presumably take at least a week, since the 345M model took about a week, but I'm not sure I want to spend the money on a week of a very large cloud instance (which would be what, $300?) for what is probably a substantial but not stunning improvement in generation quality.
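For what it's worth, here's how the $300 guess pencils out; the per-hour rates are my assumptions (roughly a preemptible single accelerator vs. an on-demand 8-GPU box):

```python
# Sanity check on the "$300 for a week" guess. The per-hour rates are assumed:
# ~$1.8/hr is in the ballpark of a preemptible single GPU or TPU, ~$20/hr of an
# on-demand 8-GPU instance.
hours = 7 * 24
for rate in (1.8, 2.5, 20.0):
    print(f"${rate}/hr x {hours} h = ${rate * hours:,.0f}")
# -> $302, $420, $3,360: the estimate only holds if training fits on a single
# preemptible accelerator for the whole week.
```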
I might rather wait for the next leap, from something like a Sparse Transformer approach, which could get global coherence by attending over the entire poem, or from a better poetry corpus with delimited poems (rather than entire books).
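To be concrete about "delimited poems", a minimal sketch of the corpus prep I have in mind, assuming one poem per .txt file and using GPT-2's <|endoftext|> string as the separator (the directory layout and filenames are just for illustration):

```python
# Minimal sketch of the "delimited poems" corpus: join individual poems with an
# explicit end-of-text marker so the model sees poem boundaries rather than raw
# book-length text. The <|endoftext|> string matches GPT-2's tokenizer; the
# one-poem-per-file layout and paths are assumptions.
from pathlib import Path

DELIM = "<|endoftext|>"

def build_corpus(poem_dir, out_file="poetry_corpus.txt"):
    poems = [p.read_text(encoding="utf-8").strip()
             for p in sorted(Path(poem_dir).glob("*.txt"))]
    poems = [p for p in poems if p]  # drop empty files
    # One poem per block, separated by the delimiter, so sampling can stop
    # cleanly at a poem boundary instead of drifting into the next text.
    Path(out_file).write_text(f"\n{DELIM}\n".join(poems) + f"\n{DELIM}\n",
                              encoding="utf-8")

# build_corpus("poems/")  # hypothetical directory of one-poem-per-file .txt files
```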