There is no need to train this particular model, but adapting it (or any newer model, the field is moving fast) to, say, Dutch, Italian, Hungarian, or Icelandic still requires training. Luckily, pretrained models are provided for most languages (at least in the case of BERT, FastText, or regular skipgram). There is also still quite a bit of leeway in domain-specific adaptation (for example SciBERT for scientific texts, or models for legal and financial documents), since a Reddit/Wikipedia corpus does carry its own bias. Each of these not only requires pretraining the model, but also assembling a huge and fairly well-formatted corpus.
And although the parameters are usually fine-tunable, this sometimes breaks down on the various sub-word tokenizations used.
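To make the tokenization point concrete, here is a minimal sketch (assuming the Hugging Face `transformers` package and the publicly available `bert-base-uncased` and `allenai/scibert_scivocab_uncased` checkpoints; the example sentence is just illustrative) comparing how a general-domain and a domain-specific vocabulary split the same scientific text:

```python
from transformers import AutoTokenizer

# General-domain BERT vocabulary vs. SciBERT's scientific vocabulary.
general = AutoTokenizer.from_pretrained("bert-base-uncased")
scientific = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")

sentence = "The phosphorylation of kinase inhibitors was measured in vitro."

# Rare domain terms tend to be shattered into many sub-word pieces by the
# general vocabulary, while a domain-specific vocabulary covers more of them
# with fewer (or single) tokens. The exact splits depend on the vocabularies.
print(general.tokenize(sentence))
print(scientific.tokenize(sentence))
```

When the sub-word pieces differ this much between vocabularies, embeddings learned on one tokenization do not transfer cleanly to the other, which is exactly where fine-tuning can struggle.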