I'm not sure I see how this is meaningfully different than the threat posed by a search engine. It's a very real threat, and I've always done my best to search from a browser context that isn't logged in as a result. But it's not a new threat, or something distinctive to AI.
Because you can't ask the search engine to summarize the user's views, thoughts, or whatever else. You have to scroll through them by the hundreds and see if any obvious nuggets stand out that you might be interested in.
Yes, search engine history is private too and can reveal stuff you want to remain private. But to get close to the same level of data the LLM has about you, you also need the browser history and the contents of those pages, together with the search history, to see what the user was actually interested in reading.
You might be surprised at the number of people who interact with a search engine the same way they do with an LLM, especially now that many engines put an LLM widget at the top of the results for queries like that.
One doesn't have to scroll through them and find the nuggets themselves; it's digital data. It can be copied[1].
Once copied, one can then paste it into an LLM and have it find the nuggets.
[1]: And by "copied," I mean... even a long series of hasty cell phone photos of the screen is enough for ChatGPT to ingest the data with surprising accuracy. It's really good at this kind of thing.
It sounds to me like you’re agreeing with the person above who said ChatGPT isn’t a new threat, but your explanation uses ChatGPT. In other words, “ChatGPT isn’t a new threat because even with a search engine you can use ChatGPT to look through the queries”.
ChatGPT is absolutely a new "threat", at the very least because it trivializes the automation of coarse analysis of unstructured information -- including a user's search history.
To add on to this, people tend to search short words and phrases in Google. Searching "Charlie Kirk assassination", for example, doesn't really tell you much about a person's political leanings. People have full-on conversations with ChatGPT, which makes their thoughts much clearer.
> I've always done my best to search from a browser context that isn't logged in as a result.
It isn't sufficient to avoid being logged in — you have to ensure that the search strings alone, grouped by IP address or some other signal, aren't enough to identify you. When AOL publicly released an archive of 20 million search strings in 2006, many users were exposed this way.
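A minimal sketch of that aggregation risk: even without an account, queries sharing an IP address or cookie cluster into a single profile, and no one query has to be identifying on its own. The log below is illustrative (the IDs and queries are invented, loosely modeled on how AOL searcher No. 4417749 was famously identified from combined queries):

```python
from collections import defaultdict

# Hypothetical anonymized search log: (pseudonymous_id, query) pairs.
# Each query alone is harmless; grouped by the pseudonymous ID, they
# narrow the user down to a town, a neighborhood, and a surname.
log = [
    ("id_001", "landscapers in Lilburn GA"),
    ("id_001", "homes sold in shadow lake subdivision"),
    ("id_001", "people named arnold"),
    ("id_002", "best pizza near me"),
]

# Group queries by pseudonymous identifier — the "grouped by IP
# address or some other signal" step.
profiles = defaultdict(list)
for pseudo_id, query in log:
    profiles[pseudo_id].append(query)

for pseudo_id, queries in profiles.items():
    print(f"{pseudo_id}: {len(queries)} queries -> {queries}")
```

The point of the sketch is that deanonymization needs no clever algorithm at all: a three-line group-by turns "anonymous" strings into a profile.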
There's also the issue of a site's Terms of Service when not logged in, which may allow an AI to be trained on your interactions — which could potentially bleed compromising information into the generative results other people see.
Oh, I know, I'm just adding that detail to say that I'm not dismissive of the threat we're talking about. It's a real threat, I'm just saying it's an old one.
Which search and AI services reliably discard logs?
It's my understanding that if you configure your Google account correctly, logged-in searches will be discarded. However, I'm less certain whether Google retains data for non-logged-in queries that would allow aggregation by IP address, etc.
Then there's DuckDuckGo, which at least the way it's advertised, implies that they discard search strings. Their "duck.ai" service stores prompt history locally, but they claim it's not retained on their machines, nor used for training by the various AI providers that duck.ai connects to[1].
In contrast, ChatGPT by default uses non-logged-in interactions to train their model[2].
I think it's related to but different from a plain search engine, since AI:
- Entices you to "confess" (or overshare) things about yourself, in the form of questions and debate, because the chatbot is built for this. The "conversation" aspect is something you didn't get with search engines.
- Then, the tool itself makes it easier for someone else to draw conclusions and infer things from the "model" the AI built of you, even if you didn't explicitly tell it these things.
Maybe Google can build a profile of me based on my searches and use of their products, but I bet ChatGPT is at least an order of magnitude more useful for drawing inferences about me, my health status, and my opinions about stuff.
In theory you could accomplish this by combing through search history.
In practice, the scenario in OP is unlikely to be practical with search history alone. It’s much less convenient for CBP to ask someone to pull up their Google search history. And even if they did, it doesn’t work as well. Officers don’t have infinite time to assess every person.
They could also take your traditional search and chat history, feed it into an LLM, and ask it the same questions. Once you start doing that for one person... you could just feed everyone's chat and search history into an LLM, and ask it "who is the most dangerous" or whatever you want to ask.
It's just another version of the classic computing problem: "computers might not make a new thing possible, but they make it possible to do an old thing at a scale that fundamentally changes the way it works".
This is the same as universal surveillance... sure, anyone could have followed you in public and watched where you were going, but if you record everything, now you can do it for everyone at any time. That changes how it works.
1. Scale and automation always matter. It wouldn't be the first time something that was previously technically possible went from rarely done to a widespread problem.
2. The whole benefit of using LLMs, especially for search, is their understanding of the logic and intent behind your query. That means people often aren't just sending the half-garbled messes they type into Google search; they're sending queries that make the intent behind them explicit (so the LLM can better answer). That is not information you are guaranteed to obtain by combing through browser history.
3. Today, with ~5 billion users, Google Search handles 8.5 billion searches per day. Today, with some ~800M weekly active users, ChatGPT handles some 2.5 billion messages per day.
Not only are people more revealing per query, they are clearly sending a lot more queries per user.
Re: 3 - Do we know how many of those ChatGPT queries are actually from people? Because I can't think of a use case for automating things with Google searches, but I can think of a million ways to automate bullshit with ChatGPT. How much of that queries-per-user stat is inflated by enterprise accounts making hundreds of queries a minute? How many of those are bot farms automating fake recipe websites, and how many are actual people having real and revealing conversations?
The number is ChatGPT user messages (not requests via the API), so there are no enterprise accounts making hundreds of queries per minute or bot farms automating fake recipe websites.
Based on OpenAI's usage breakdown[0], as of July 2025, ChatGPT processes 1.9B non-work and 716M work messages per day.
The low level of time and effort required increases the likelihood of this happening. It's the same reason there's red-teaming to ensure AI doesn't help bad actors with chemical weapons: lowered barriers to bad things are a concern even if the bad things were possible before.
A conventional keyword-based search engine is unlikely to actively and subtly encourage a user to (A) reveal secrets and blackmail material (B) become entrapped in behavior the Current Authority will punish them for.
A better "some of this isn't new" comparison would be to imagine you're communicating with an idiot-savant human employee, someone who can be tasked with hidden priorities and will do anything to stay employed in their role. What "old" threats could occur there?
I think the most important difference is that chats are rich in context and, depending on how you use them, closer to journal entries than search queries. I also think it doesn't have to be new to be significant, if it is expanding the frontier of an existing vulnerability.
I don't understand how you don't understand. Trying to recreate someone's internal thoughts and attitudes from looking at their search history is a pale imitation of this. Just the thought experiment of a customs officer asking ChatGPT to summarise your political viewpoints was eye opening to me.
How so? You'd have a very, very good understanding of my political viewpoints from the log of my Google searches. I'm asking sincerely, not simply to push back on you.
It seems fairly easy to figure this out with a little thought…
When talking to a chatbot you're likely to type more words per query, as a simple measure. But you're also more likely to have to clarify your queries with logic and intent — to prevent it going off the rails — revealing more about the intentions behind your searches than just stringing together keywords.
It'd be harder to claim purely informational reasons for searching if your prompts betray motive.
Maybe not you in particular, but I expect people to be more forthcoming in their writing towards LLMs vs a raw google search.
For example, a search of "nice places to live in" vs "I'm considering moving from my current country because I think I'm being politically harassed and I want to find nice places to live that align with my ideology of X, Y, Z".
I do agree that, after collecting enough search datapoints, one could piece together the second sentence from the first, and that this is more akin to a new instance of an already existing issue.
It's just that, by default I expect more information to be obtainable, more easily, from what people write to an LLM vs a search box.
Asking Google for details about January 6th is different than telling ChatGPT I think the election was stolen, and then arguing with it for hours about it.
It would be harder to argue in front of a jury that what you typed wasn't an accurate representation of what you were thinking and that you were being duplicitous with ChatGPT.
I don't think it really is in the circumstances we contemplate this threat in. In both the search engine case and the ChatGPT case, we're talking about circumstantial evidence (which, to be clear: is real and legally weighty in the US) --- particularly in the CBP setting that keeps coming up here, a Border Agent doesn't need the additional ChatGPT context you're talking about to draw an adverse conclusion!
I think at this point the fulcrum of the point I'm making is that people might be inadvertently lulling themselves into thinking they're revealing meaningfully less about themselves to Google than to ChatGPT. My claim would be that if there's a difference, it's not clear to me it's a material one.
Ah. Yeah, you're more boned if you confess to ChatGPT that you've killed your wife than if you just googled how to bury a body. But at the edges, where people are using ChatGPT as a therapist, someone disappears, and the person who did it is smart enough to use incognito mode to search how to bury a body so it doesn't show up in court, how everyone felt about the deceased is going to get looked at, including ChatGPT conversations. That's new.
The point is that the data is there from search engines (and more data, from more people anyway). Whether you automate reading it or do it manually, it is 100% unrelated to the topic of ChatGPT being an informant.
Users type a lot more often into search engines, and the largest one keeps files on all of their egresses and correlates it with full advertising profiles and what they do within other google properties (which may include their browser itself.)
Google has all of that and more, right? They control the browser and devices that you use to access an AI app. They control the content shown to you in leisure and work. ChatGPT doesn't have that much exposure and surface area yet
My personal favorite in this genre was the commenter who said that the heart-rate monitoring features of an Apple Watch were irrelevant because they could always check their own.
I think the underlying assumption is that people say very different things to an anthropomorphized (even if in their own parasocial head) chatbot than in other online spaces.
I can see why, mainly because of the parasocial relationship that probably many people tend to form with these things that talk to us like they are humans.