If it takes a model plus a database holding a large chunk of the internet to compete and win, then that says something: that setup is much more expensive and complex than the model alone, because models, just like people, have problems "remembering" correctly.
It's important to have fair and equivalent testing not because that allows people to win, but because it shows where the strengths and weaknesses of people and current AI actually are in a useful way.
I'm not sure how to make sense of this in the context of what we're discussing. Access to the web is exactly what's in question, and emulating the internet so thoroughly that you no longer need to access it to have the information is very expensive in resources, because the dataset is so massive. That is the point I was making.