For what it's worth, I've read the paper's abstract and the discussions around it, and I've seen more than one person rate it as not very interesting. Why all this fuss about a paper restating common knowledge? For example, it says datasets need to be filtered for bias, and that large models consume... duh... a lot. We already knew that; where's the shiny new architecture for fair modelling?
Link to abstract and discussion: https://old.reddit.com/r/MachineLearning/comments/k69eq0/n_t...