"However some advantages can disappear when you put constraints on the output such as quality and correctness."
Only if you suppose that the ideal output is superhuman. In the case of OpenAI et al., that's arguably the case, but those aren't the players that are going to get into an arms race with detection anyway. They want it to be relatively easy to detect AI-generated content, because they're not in the plagiarism business, and anti-plagiarism measures will get the public and media off their backs. And nobody who is interested in targeting plagiarism has nearly the funding to build their own LLM on a level that matters.
So if there's an arms race in the near term, I expect it will be with postprocessors instead. These will be much smaller models (small enough to run in your browser, or at least on a modest backend machine) that take the output of ChatGPT and tweak it to fool detectors. They won't care about maximizing quality or accuracy; they'll just care about preserving meaning while erasing the statistical signs of AI generation.
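To make that concrete, here's a toy sketch of the shape such a postprocessor takes. A real one would be a small learned paraphrase model, but even crude synonym substitution (via NLTK's WordNet here; the swap rate is made up) perturbs the token statistics that detectors key on:

    import random
    from nltk.corpus import wordnet  # needs: nltk.download('wordnet')

    def perturb(text, swap_rate=0.2, seed=0):
        """Swap a fraction of words for WordNet synonyms. Meaning roughly
        survives, quality degrades, and the token-level statistics that
        perplexity-style detectors look for get scrambled."""
        rng = random.Random(seed)
        out = []
        for word in text.split():
            if rng.random() < swap_rate:
                synonyms = sorted({
                    lemma.name().replace("_", " ")
                    for syn in wordnet.synsets(word)
                    for lemma in syn.lemmas()
                    if lemma.name().lower() != word.lower()
                })
                if synonyms:
                    out.append(rng.choice(synonyms))
                    continue
            out.append(word)
        return " ".join(out)

Notice the tradeoff is exactly the one described above: the output reads worse than the input, but the attacker doesn't care as long as the meaning survives and the detector's signal doesn't.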
I don't know if the business case for that will be there. It's there for selling papers, and almost certainly some people will try their hand at these models just for the challenge and/or to prove a point.
I’m not sure that’s how it would actually work out.
Most humans can’t write, say, an essay to save their lives.
And those who do write very well tend to have their own signature.
Whilst it’s not 100% accurate, we’ve fairly successfully attributed a lot of unknown works to specific authors based on their known works.
So if you create a generator that produces output equal to, say, the top 1% of human authors, I’m not entirely sure you can get one that doesn’t have its own signature.
Because whilst, as you said, most humans produce output that is statistically indistinguishable from most other humans’, the output that tends to survive selection bias and become known work is, by definition, quite distinguishable.
So you don’t even need to reach superhuman capability; you just need output quality high enough to narrow the statistical search space from billions down to millions or even thousands.
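For a concrete sense of how that attribution is usually done: the classic technique is stylometry over function-word frequencies, along the lines of Burrows’ Delta. A toy sketch (the word list is tiny here, and a proper implementation z-scores each feature across the whole corpus first):

    from collections import Counter

    # A handful of function words; real stylometry uses hundreds.
    FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "for"]

    def freqs(text):
        """Function-word rates per 1000 tokens -- a crude style fingerprint."""
        tokens = text.lower().split()
        counts = Counter(tokens)
        n = max(len(tokens), 1)
        return [1000 * counts[w] / n for w in FUNCTION_WORDS]

    def delta(a, b):
        """Mean absolute difference between two fingerprints. (Proper
        Burrows' Delta z-scores each feature over the corpus first.)"""
        return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    def attribute(unknown_text, corpora):
        """Pick the known author whose fingerprint is closest."""
        u = freqs(unknown_text)
        return min(corpora, key=lambda author: delta(u, freqs(corpora[author])))

    # corpora = {"Author A": "...their collected writing...", "Author B": "..."}
    # print(attribute("...the disputed text...", corpora))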
This may be along the lines of what you’re suggesting, but what if you flipped this around: instead of trying to recognize AI, you recognize the student? You model each student’s quirks so you can tell if they wrote their essay, or if someone else did. Now you don’t care about AI specifically; you just care about whether they wrote what they submitted.
The main failure mode I see here is students dramatically improving and throwing the system off. If someone gets a tutor or goes to writing workshops, you don’t want to accuse them of plagiarism just because they got better. But there may be ways you could deal with that, like having the student submit new samples.
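A sketch of what that verification might look like under the hood, assuming a character-n-gram profile per student and a similarity cutoff (the 0.6 threshold is arbitrary and would need calibrating on real submissions):

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def looks_like_their_writing(known_samples, new_submission, threshold=0.6):
        """Compare a submission against a student's prior essays using
        character 3-5 gram TF-IDF, which picks up punctuation, spelling,
        and phrasing habits. Returns (consistent, similarity score)."""
        vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 5))
        matrix = vec.fit_transform(list(known_samples) + [new_submission])
        profile = np.asarray(matrix[:-1].mean(axis=0))  # centroid of known work
        sim = cosine_similarity(profile, matrix[-1])[0, 0]
        return sim >= threshold, sim

    # ok, score = looks_like_their_writing(["essay 1 ...", "essay 2 ..."], "new essay ...")

Submitting fresh samples, as suggested above, maps naturally onto this: the new samples just get folded into the known set and the profile recomputed.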
That could work, but it changes the problem and moves the goalposts: a plagiarism-detection system trained on individual authors would be able to flag any submission that skews too far from their rolling average.
I’m not even sure ML is absolutely necessary for this.
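It probably isn’t, strictly; a rolling mean and standard deviation over a few cheap stylometric features already gives you a basic drift detector. A toy sketch (the features and the 3-sigma cutoff are illustrative, and it assumes at least two prior submissions):

    import statistics

    def features(text):
        """A few cheap stylometric signals; a real system would use many more."""
        sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
        words = text.split()
        return {
            "avg_sentence_len": len(words) / max(len(sentences), 1),
            "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
            "comma_rate": text.count(",") / max(len(words), 1),
        }

    def drifted(history, new_text, cutoff=3.0):
        """Flag features of a new submission that sit more than `cutoff`
        standard deviations from the author's rolling average. Needs at
        least two prior submissions; no ML involved."""
        past = [features(t) for t in history]
        new = features(new_text)
        flags = {}
        for key in new:
            mean = statistics.mean(f[key] for f in past)
            sd = statistics.stdev(f[key] for f in past) or 1e-9
            flags[key] = abs(new[key] - mean) / sd > cutoff
        return flags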