> presumably LLM output is going into the training data of later LLMs
The LLM vendors go to great lengths to assure their paying customers that this will not be the case. Yes, LLMs will ingest more LLM-generated slop from the public Internet. But as businesses integrate LLMs, a rising percentage of their outputs will not be included in training sets.
The LLM vendors aren't exactly the most trustworthy on this, but regardless, there are still lots of free-tier users who are definitely feeding their output back into the next generation of models.
Please describe these "great lengths". Are they allowing customer audits now?
The first law of Silicon Valley is "Fake it till you make it", with the vast majority never making it past the "Fake it" stage. Whatever the truth may be, it's a safe bet that whatever they've promised verbally is a lie, and one that will carry little consequence even if exposed.
I don't know where they land, but they are definitely telling people they are not using their outputs to train. If they are, it's not clear how big a scandal would result. I personally think it would be bad, but I clearly overindex on privacy; I thought the news of ChatGPT chats being indexed by Google would be a bigger scandal than it turned out to be.
ChatGPT training is (advertised as) off by default for plans above the prosumer level (Team & Enterprise). API results are similarly advertised as not being used for training by default.
Anthropic's policies are more restrictive, stating that customer data is not used for training.