
The long-term memory is in the training. The short-term memory is in the context window.


The comparison misses the mark: unlike humans, LLMs don't consolidate short-term memory into long-term memory over time.


That is easily fixed: ask it to summarize its learnings, store them somewhere, and make them searchable through vector indexes. An LLM is part of a bigger system that needs not just a model, but context and long-term memory. Just like a human needs to write things down.

LLMs are actually pretty good at creating knowledge: if you give one a trial-and-error feedback loop, it can figure things out, then summarize the learnings and store them in long-term memory (markdown, RAG, etc.).
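As a minimal sketch of that "summarize and make it searchable" loop: the names here (MemoryStore, the placeholder embed()) are my own illustrations, not any particular framework; embed() stands in for whatever embedding model or API you actually use, and the "index" is just an in-memory list with cosine similarity.

```python
import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder embedding: replace with a call to a real embedding model/API.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(256)


class MemoryStore:
    """Tiny 'long-term memory' for LLM learnings: store summaries, retrieve by similarity."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, np.ndarray]] = []

    def add(self, summary: str) -> None:
        # Store a summarized learning alongside its embedding vector.
        self.entries.append((summary, embed(summary)))

    def search(self, query: str, k: int = 3) -> list[str]:
        # Return the k stored summaries most similar to the query.
        q = embed(query)

        def cosine(v: np.ndarray) -> float:
            return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

        ranked = sorted(self.entries, key=lambda e: cosine(e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


# Usage: after a trial-and-error session, ask the model to summarize what it
# learned, store the summary, then retrieve it as context for later tasks.
memory = MemoryStore()
memory.add("Retrying the flaky upstream API with exponential backoff fixed the timeouts.")
print(memory.search("How do we handle timeouts from that API?"))
```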


You’re making the assumption that there’s one, and only one, objective summarization, which is entirely different from “writing things down.”


Why do you assume I assume that?


My bad if I misunderstood. I assumed that based on your use of “it” and approximation methods.


This runs into the limitation that nobody has RL'd the models to do this really well.


Over time, though, presumably LLM output is going into the training data of later LLMs. So in a way it is being consolidated into long-term memory - not necessarily with positive results, but depending on how it's curated, it might be.


> presumably LLM output is going into the training data of later LLMs

The LLM vendors go to great lengths to assure their paying customers that this will not be the case. Yes, LLMs will ingest more LLM-generated slop from the public Internet. But as businesses integrate LLMs, a rising percentage of their outputs will not be included in training sets.


The LLM vendors aren't exactly the most trustworthy on this, but regardless of that, there's still lots of free-tier users who are definitely contributing back into the next generation of models.


For sure, although I'm fairly certain there is a difference in kind between the outputs of free and paid users (and then again to API usage).


Please describe these "great lengths". Are they allowing customer audits now?

The first law of Silicon Valley is "Fake it till you make it", with the vast majority never making it past the "Fake it" stage. Whatever the truth may be, it's a safe bet that what they've said verbally is a lie that will likely have little consequence even if exposed.


> great lengths to assure

is not incompatible with

> "Fake it till you make it"

I don't know where they land, but they are definitely telling people they are not using their outputs to train. If they are, it's not clear how big of a scandal would result. I personally think it would be bad, but I clearly overindex on privacy & thought the news of ChatGPT chats being indexed by Google would be a bigger scandal.


You did hear that it did happen (however briefly) though, yeah?

https://techcrunch.com/2025/07/31/your-public-chatgpt-querie...


That's my point. It is a thing that is known and obviously a big negative, but yet failed to leave a lasting mark of any kind.


Ah, the eternal internal corporate search problem.


That's only if you opt out.


ChatGPT training is (advertised as) off by default for their plans above the prosumer level, Team & Enterprise. API results are similarly advertised as not being used for training by default.

Anthropic policies are more restrictive, saying they do not use customer data for training.


Is this not a tool that could be readily implemented and refined?


my knowledge graph MCP disagrees


I think it's more analogous to "intuition", and the text LLMs provide are the equivalent of "my gut tells me".


Humans have the ability to quickly pass things from short term to long term memory and vice versa, though. This sort of seamlessness is currently missing from LLMs.


No, it’s not in the training. Human memories are stored via electromagnetic frequencies controlled by microtubules. They’re not doing anything close to that in AI.


And LLM memories are stored in an electrical charge trapped in a floating gate transistor (or as magnetization of a ferromagnetic region on an alloy platter).

Or they write CLAUDE.md files. Whatever you want to call it.


That was my point: they’re stored in a totally different way. And that matters because being stored in microtubules implies quantum entanglement throughout the brain.


Whether QE is a mechanism in the brain still seems up for debate, based on the quick literature review I tried, but I would love to learn more.

Given the pace of quantum computing, it doesn’t seem out of the realm of possibility to “wire up” to LLMs in a couple of years.


Are ANN memories not also stored in loops, like recurrent nets?


It's not that either.


I don't believe this has really been proven yet.



