There are reasons humans can't report how many books they've read: they simply don't know and never measured. An LLM has no such limitation on knowing where its knowledge came from, or on tallying it up. Unless you're telling me a computer can't count references.
Also, why are we comparing humans and LLMs when the latter comes nowhere close to how we think and operates under entirely different limitations?
The 'knowledge' of an LLM sits in a filesystem and can be queried, studied, exported, and so on. The knowledge of a human being is encoded in neurons and other wetware, with no binary chips dedicated to the job. Decidedly less accessible than coreutils.
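Concretely: a checkpoint is just a file you can open and enumerate. A rough sketch, assuming a locally downloaded safetensors checkpoint (the path here is hypothetical) and the safetensors library:

```python
from safetensors import safe_open

# Hypothetical local path; any downloaded .safetensors checkpoint works.
CHECKPOINT = "gpt2/model.safetensors"

# Open the weight file straight off the filesystem and list what's inside.
with safe_open(CHECKPOINT, framework="pt") as f:
    for name in f.keys():
        t = f.get_tensor(name)
        print(f"{name}: dtype={t.dtype}, shape={tuple(t.shape)}")
```

Every tensor in the model is right there, named and measurable, no wetware required.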
Imagine for just a second that the ability of computers to count “references” has no bearing on this: there is a limitation here, and LLMs suffer from the same issue you do.
Why should I ignore a fact that makes my demand realistic? Most of us here are programmers, I would imagine. What's the technical reason an LLM cannot give me this information?
Bytes can be measured. Sources used to produce the answer to a prompt can be reported. Ergo, an LLM should be able to tell me the full extent of its training: the size of its data corpus, the number of parameters it runs, the words on its disallowed list (and the reasoning behind them), and so on.
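The parameter count, at least, is trivially measurable. A minimal sketch, assuming a public Hugging Face checkpoint (gpt2 here as a stand-in) and the transformers library:

```python
from transformers import AutoModelForCausalLM

MODEL = "gpt2"  # assumption: any public checkpoint illustrates the point

model = AutoModelForCausalLM.from_pretrained(MODEL)

# Count every parameter and the bytes those weights occupy in memory.
n_params = sum(p.numel() for p in model.parameters())
n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"{MODEL}: {n_params:,} parameters, {n_bytes / 1e6:.1f} MB of weights")
```

If a hobbyist can do this in ten lines, the vendor can certainly report it.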
These will conveniently be marked as trade secrets, but I have no use for an information model moderated by business and government. It is inherently NOT trustworthy and will only give answers that steer toward docile or profitable behavior. If it can't be honest about what it is, what it knows, and what it's allowed to tell me, then I can't accept any of its output as trustworthy.
Will it tell me how to build explosives? Can it help me manufacture a gun? How about intercepting and listening to today's radio communications? Social techniques for gaining favor in political conflicts? Getting around financial blocks when you've been flagged as a person of interest? I have my doubts.
These questions might be considered "dangerous", but to whom, and why shouldn't we share these answers?