My experience with GPT-3 is that while it does perform better than the smaller mini-GPT models, the gap doesn't compensate for the fact that the small models are free and unrestricted, and you can use them as much as you like.
As mentioned elsewhere in the thread there are some large models around the 50-200B band that compete directly with GPT-3, but I haven’t used these.