I don't understand what you're saying about GPT detectors. Are you angry that people are promoting detectors that don't work, or are you angry that OpenAI used to offer one and no longer do?
It was pulled because while it caught on the order of 25% of pure AI output, it missed the rest. But I'm cool with that number: in a world with this much AI mixed into human writing, catching a quarter of it is a total win against low-effort propaganda. Anyone should be cool with that number.
Unfortunately they had to pull it because something like 9% of human-authored text got hit with the AI flag. Again, some people are starting to write like it. It's gonna happen.
This is from memory, so if I've got some of that wrong I'll retract it and leave my other reasons, which are more than sufficient to indicate serious change.
It's an Official AI Detector from The Company That Does AI, so it's going to be treated as completely authoritative. Humans will interpret probabilistic measurements like "this was likely written by" or even "there is an 86% chance this was written by" as meaning "this definitely, factually, without a shadow of a doubt was written by AI". And there are many circumstances where being adjudged to have written something with AI will have serious consequences for the accused.
That tool had to be almost perfect, if not actually perfect, to exist. And it decidedly was not. I think the AI safety people are somewhat contemptuous but this is not the tree to be barking up imo.
They make a little red light go on by having the API answer one question: is this spammy-looking enough to tell people to read carefully and check citations? And just say nothing the rest of the time. That light is a game changer at "sure enough to warn," even if it only comes on the 5% of the time that it's sure.
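Something like this sketch is all I'm asking for; the score, threshold, and function names here are made-up placeholders, not anything OpenAI shipped:

```python
# Hypothetical sketch of a warn-or-stay-silent detector API.
# `spam_likelihood` is an assumed upstream model score in [0, 1];
# the threshold is tuned so the light only comes on when we're sure.
WARN_THRESHOLD = 0.95  # illustrative: fire only on the small slice of cases we're confident about

def review_advice(text: str, spam_likelihood: float) -> str | None:
    """Return a warning when the score clears the bar, otherwise say nothing."""
    if spam_likelihood >= WARN_THRESHOLD:
        return "This looks machine-generated; read carefully and check citations."
    return None  # explicit silence: no verdict, no "probably human" claim
```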
Besides, other people are offering this or will soon, so they have an obligation to push the performance via competition.
And who put them in charge of trying to set society’s policies on this: it’s the wildest overreach by far in an industry known for wild unilateral overreach.
A great deal of harm is being caused by "AI detectors" today.
There is an endless stream of stories about students who have their work flagged as "AI generated" by a detector when it wasn't - and then get failing grades, from teachers who don't understand that AI detectors are inherently inaccurate.
Nah, while the list of good things they’ve done is getting shorter, they have pushed the pace in a few areas utterly consistent with their ostensible mandate and they deserve credit for it:
- whisper is really useful, it has good applications in strictly socially positive settings (accessibility as one example), and its scope for abuse is very consistent with how they’ve opened up the weights. the whisper people either still are, or until recently were, doing the right thing.
- the TTS stuff is a little less cut and dried, but TTS is among the more potentially dangerous capability increases, and I can see an argument for going a little slower there, for a number of reasons. I still think they’re behind on opening that up, but the voice group has a case there, even if it is thin and I personally disagree.
- the detection stuff, they were pushing the pace on tools researchers and institutions need. they deserve the same credit for doing that, which I’m giving them, as blame for pulling it, which I’m giving them. that was consistent with their stated mandate.
If you're going to criticize an institution stridently, to the point of calling it a menace, as I am, you are held to a higher standard of fairness, and I acknowledge the good, positive, non-remunerative work, or at least the headline stuff.
Turning off the good stuff is one more really, really red flag.
Those seem like really bad numbers to me, considering the base rate fallacy. Most of what people test with something like that is probably not going to be AI-generated, which could mean getting massive numbers of false positives.
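To put numbers on it, using the figures quoted above (25% catch rate, 9% false positives) plus an assumed 10% of tested text actually being AI-generated:

```python
# Worked base-rate example; prevalence is my assumption, the other
# two figures are the rough numbers quoted upthread for the pulled detector.
recall = 0.25          # fraction of AI text the detector flagged
false_pos_rate = 0.09  # fraction of human text it wrongly flagged
prevalence = 0.10      # assumed share of tested text that is actually AI

true_flags = recall * prevalence                 # 0.025
false_flags = false_pos_rate * (1 - prevalence)  # 0.081
precision = true_flags / (true_flags + false_flags)

print(f"Share of flags that are correct: {precision:.0%}")  # ~24%
# i.e. roughly three out of four flags land on human-written text.
```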
Then watermark the output so they can say whether they wrote it. Between a binary classifier in the age of adversarial training and any level of watermarking, you'd be able to say which minor version printed it.
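To be concrete about what I mean by watermarking: a statistical bias in token choice, not a literal hidden string. Here's a toy sketch of the detection side of a green-list scheme, in the spirit of published proposals like Kirchenbauer et al.; everything in it is illustrative, not OpenAI's actual method:

```python
import hashlib
import math

def green_list_member(prev_token: str, token: str) -> bool:
    """Toy rule: hash the (prev, current) pair and call half the space 'green'.
    A watermarking sampler would have nudged generation toward green tokens."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0  # ~50% of pairs are green by construction

def watermark_z_score(tokens: list[str]) -> float:
    """How far the observed green fraction sits above the 50% chance baseline."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    greens = sum(green_list_member(a, b) for a, b in zip(tokens, tokens[1:]))
    expected, stddev = 0.5 * n, math.sqrt(0.25 * n)
    return (greens - expected) / stddev

# A few swapped words or typos only flip a few pair memberships, so the
# z-score degrades gradually over a long document rather than vanishing.
```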
I don't think it's possible to watermark AI generated text in a way that can't be easily removed by someone who simply switches a word around or adds a typo.
It'll only spot-catch the people who can beat OpenAI at non-trivial steganography, but sophisticated actors aren't what this is about catching: they're going to get away with some level of abuse no matter what. APTs? They can afford their own LLM programs just fine; some of them have credible quantum computing programs.
But a lot of propaganda is going to take place at the grassroots level by actors who can’t beat OpenAI, even one in decline, at breaking both watermarks and an adversarial model.
But the grand finale is, of course: at this point, how has OpenAI behaved like anything other than an APT itself? It's the friendly, plucky underdog charity that's now manipulating the process of making things illegal without involving Congress.
That’s exactly how advanced actors operate: look at the xz thing.
I don't understand the chain of logic here at all?
Am I correct in thinking you are criticizing OpenAI for taking down their non-working GPT Detector?
Of all the things OpenAI deserve criticism for, this seems to be an odd one. It just didn't work: as you say, it couldn't properly detect GPT-authored text and it incorrectly flagged human text as written by GPT.
Human text being flagged as wholly or partially synthetic is the default now. You move the threshold along the precision/recall curve until it's catching spam, and you report spam only when you're pretty sure. You report "don't know" the rest of the time.
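Concretely, "moving the knob" looks something like this; the labels, scores, and target precision below are placeholders standing in for a real held-out eval set:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Placeholder labels/scores; in practice these come from a held-out eval set.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.15, 0.3, 0.4, 0.35, 0.6, 0.7, 0.85, 0.95])

precision, recall, thresholds = precision_recall_curve(y_true, scores)

TARGET_PRECISION = 0.95  # only speak up when we're this sure
# precision/recall have one more entry than thresholds; align by dropping the last point.
eligible = np.where(precision[:-1] >= TARGET_PRECISION)[0]
threshold = thresholds[eligible[0]] if eligible.size else None

def verdict(score: float) -> str:
    if threshold is not None and score >= threshold:
        return "likely synthetic, read carefully"
    return "don't know"  # abstain rather than guess
```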
> Again, some people are starting to write like it. It’s gonna happen.
Interesting point you touched on here. Let's do the math for OpenAI: 100M users at 10k tokens/user/month means roughly 1 trillion tokens/month are read by people. That has got to influence speech and circulate information faster between all fields.
It’s what makes the precision/recall tradeoff a no-brainer if you’ve worked in spam. I worked in Abuse Detection at FB in 2016: there is a consensus on how to use binary classifiers responsibly in the presence of lame attempts at fraud.