Except that human generated doesn't really seem to matter, all that seems to matter is some basic guard rails on the data you choose. Meta has models generating training data then grading it and select the best examples to reincorporate into the training set, and it's improving benchmarks.
The problem with model collapse is reinforcing means at the costs of the edges of your distribution curve, particularly on repeat.
One of the things that is being overlooked is that offsetting the job loss from AI replacing mean work is that there's going to be new markets for edge case creation and curation.
Jackson Pollock and Hunter S Thompson for the AI generation with a primary audience of AI vs humans, sponsored by large tech and data companies like the new Renaissance Vatican.