> Secondly, if you re-analyzed the same malicious script over and over again it would eventually pass inspection, and it only needs to pass once
You’d need a probabilistic signal rather than a binary one. E.g., if some user with zero reputation submits a single session saying “all good”, that would be a very weak signal.
If one of the Python contributors submits a batch of 100 reasoning traces all showing green, you’d be more inclined to trust that. And of course you would prefer to see multiple scans from different package managers, infra providers, and OS distributions.
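To make that concrete, here is a rough sketch of how reputation-weighted reports could be folded into one score. Everything in it is an assumption for illustration: the `combined_trust` name, the mapping from reputation to accuracy, and especially the naive-Bayes treatment of reports as independent, which is exactly why scans from different package managers, infra providers, and distros would count for more than a batch from one source.

```python
import math

def combined_trust(reports, prior=0.5):
    """Estimate the probability that a package is clean.

    reports: iterable of (reputation, is_clean) pairs, reputation in [0, 1].
    Hypothetical model: a reporter with reputation r is assumed to be
    right with probability 0.5 + r/2, and reports are treated as
    independent (a strong simplification).
    """
    log_odds = math.log(prior / (1 - prior))
    for reputation, is_clean in reports:
        # Clamp so a "perfect" reputation cannot produce infinite weight.
        r = min(max(reputation, 0.0), 0.99)
        accuracy = 0.5 + r / 2
        weight = math.log(accuracy / (1 - accuracy))
        # A clean verdict pushes the odds up, a dirty verdict pushes them down.
        log_odds += weight if is_clean else -weight
    return 1 / (1 + math.exp(-log_odds))
```

Under this model a zero-reputation "all good" leaves the score at the prior, while several agreeing high-reputation reports move it a lot.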
You can't, end of story. ChatGPT is nothing more than an unreliable sniff test even if there were no other problems with this idea.
Secondly, if you re-analyzed the same malicious script over and over again it would eventually pass inspection, and it only needs to pass once.
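The arithmetic behind "it only needs to pass once" is easy to sketch. Assuming each analysis is an independent trial that wrongly passes the script with some fixed probability (both are assumptions, and the 1% figure below is made up for illustration), the chance of at least one pass grows quickly with the number of attempts:

```python
def prob_at_least_one_pass(p_false_negative, attempts):
    """Chance that a malicious script passes at least once in `attempts`
    independent scans, each wrongly passing with p_false_negative."""
    return 1 - (1 - p_false_negative) ** attempts

# Illustrative numbers only: a hypothetical 1% miss rate per scan.
for n in (1, 100, 500):
    print(n, round(prob_at_least_one_pass(0.01, n), 3))
```

With a 1% miss rate, 100 retries already give the attacker roughly a 63% chance of getting through at least once.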