Based on the background they gave in detail when releasing their research, it's pretty easy to put together that it did take them significantly fewer resources to train this model. Only having specific parameters active at a time, instead of activating everything all at once, is pretty ingenious.
That, and they just happened to be undergoing a large-scale "cyber attack".
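
For anyone wondering what "only some parameters active at a time" looks like in practice, here's a rough toy sketch of top-k expert routing (the general mixture-of-experts idea). It is not their actual code: the names `top_k_route` and `moe_forward`, the eight toy experts, and the hard-coded gate scores are all made up for illustration. In a real model the gate scores would come from a learned router applied to each token.

```python
import math

def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over only the selected experts so their weights sum to 1.
    m = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k routed experts; the rest are never evaluated."""
    routed = top_k_route(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]

# Hypothetical setup: eight toy "experts" that just scale their input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical gate scores (assumption: normally produced by a learned router).
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
print(f"experts used: {used} of {len(experts)}, output: {output:.3f}")
```

Only two of the eight experts do any work for this input, so the compute per token scales with k rather than with the total number of experts, even though all eight sets of parameters still exist.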