We can surely fix it and we probably should.
However, I don't think AI is doing any worse here than friends' advice when they hear a one-sided story. The only difference is that it's not getting studied.
Conversely, AI chatbots are great mediators if both parties are present in the conversation.
A few years ago I came up with this simple thought experiment to convince myself that LLMs won't achieve a superhuman level (in the sense of being better than all human experts):
Imagine that we made an LLM out of all dolphin songs ever recorded. Would such an LLM ever reach human-level intelligence? Obviously and intuitively, the answer is NO.
Your comment actually extended this observation for me, sparking hope that systems consuming the natural world as input might avoid this trap. But then I realized that tool use and learning may in fact be all that's needed for a singularity, while consuming raw data streams most of the time might actually be counterproductive.
I mean no offense here, but I really don't like this attitude of "I thought for a bit and came up with something that debunks all of the experts!". It's the same stuff you see with climate denialism, but it seems to be considered okay when it comes to AI. As if the people who have spent all day, every day on this for decades have not thought of it.
Dataset limitations have been well understood since the dawn of statistics-based AI, which is why these models are trained on data and RL tasks that are as broad as possible, and are assessed by generalization performance. Within the last few years, most experts in ML, even the mathematically trained ones, have come to acknowledge that superintelligence (under a more rigorous definition than the one here) is quite possible, even with only the current architectures. This is true even though no senior researcher in the field really wants superintelligence to be possible, hence the dozens of efforts to disprove its potential existence.
> Imagine that we made an LLM out of all dolphin songs ever recorded. Would such an LLM ever reach human-level intelligence? Obviously and intuitively, the answer is NO.
Not so fast. People have built pretty amazing thought frameworks out of a few axioms, a few bits, or a few operations in a Turing machine. Dolphin songs are probably more than enough to encode the game of life. It's just how you look at it that makes it intelligence.
It's basically better than LoRA in all respects and could even be used to speed up inference. I wonder whether the big models aren't using it already... If not, we'll see a blow-up in capabilities very, very soon.
What they've shown is that you can find the subset of parameters responsible for transfer of capability to new tasks.
Does it apply to completely novel tasks? No, that would be magic. Tasks that need new features or representations break the method, but if the task fits in the same domain then the answer is “YES”.
Here's a very cool analogy from GPT 5.1 which hits the nail on the head in explaining the role of the subspace in learning new tasks, by analogy with 3D graphics.
Think of 3D character animation rigs:
• The mesh has millions of vertices (11M weights).
• Expressions are controlled via:
  • “smile”
  • “frown”
  • “blink”
Each expression is just:
mesh += α_i * basis_expression_i
Hundreds of coefficients modify millions of coordinates.
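And here's a minimal NumPy sketch of the same idea (all the names and sizes below are my own, purely illustrative, and scaled down so it actually runs): a big frozen weight vector gets adapted to a new task by learning only a couple hundred coefficients over fixed basis directions.

    import numpy as np

    rng = np.random.default_rng(0)

    n_weights = 100_000   # stand-in for the 11M mesh vertices / weights (scaled down to run quickly)
    n_basis = 200         # dimensionality of the adaptation subspace ("expressions")

    weights = rng.standard_normal(n_weights).astype(np.float32)           # pretrained, frozen
    basis = rng.standard_normal((n_basis, n_weights)).astype(np.float32)  # fixed directions

    def adapt(weights, basis, alpha):
        # weights + sum_i alpha_i * basis_i  (the blend-shape-style update)
        return weights + alpha @ basis

    # "Learning" a new task only means finding ~200 coefficients,
    # even though the update touches every one of the 100k weights.
    alpha = 0.01 * rng.standard_normal(n_basis).astype(np.float32)
    adapted = adapt(weights, basis, alpha)
    print(adapted.shape, alpha.shape)   # (100000,) (200,)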
> Does it apply to completely novel tasks? No, that would be magic.
Are there novel tasks? Inside the limits of physics, tasks are finite, and most of them are pointless. One can certainly entertain tasks that transcend physics, but that isn't necessary if one merely wants an immortal and indomitable electronic god.
I hope I’m wrong, as I didn’t even know who Pavel Durov was until now, but the first thought that came to mind was that it’s a show of power to intimidate Elon Musk.
That’s a strange response. Even if people and organisations generally agree with you and your campaign, that does not make your campaign their campaign. They likely have other things to do that are more important to them, so they do those things.
This article is a perfect example of how science gets led into dark areas by people who didn't learn quantum mechanics properly, or who pretend to understand it but can only blindly follow the formalism without much understanding of what they actually do. Every such article continues to mystify the whole subject by quoting famous scientists who were either puzzled by it at the time, or scientists like John von Neumann who clearly gave a dumbed-down view of what is now called collapse (perhaps on request, to skip the math).
I really appreciate this forum - it is one of the last places I know of where one can have a civil discussion - and therefore I will make the effort to show that pure quantum mechanics - with no additions - essentially explains the process of measurement, which is not at all as sudden as the name "collapse" would suggest. The reasoning comes from von Neumann himself, but nowadays it is sometimes also attributed to Wojciech Żurek.
TL;DR of what follows: all processes in nature, including the measuring process, are unitary; the "collapse" is just an artifact of our ignorance about the exact state of the measuring apparatus.
Here it goes:
For simplicity, let's assume that psi describing our particle is a superposition of two eigenstates:
|psi> = c1 |1> + c2 |2>, with |c1|^2 + |c2|^2 = 1.
Without loss of generality we can pick:
c1 = x and c2 = sqrt(1-x^2) exp(i phi), where x is a real number between 0 and 1.
The density matrix of this pure state can then be written as rho = |psi><psi|, and by writing out its explicit form one can see that the diagonal terms are
x^2 and 1-x^2, while the non-diagonal terms are
x sqrt(1-x^2) exp(i phi) and x sqrt(1-x^2) exp(-i phi).
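For anyone who wants to see this explicitly, here's a quick NumPy check of the statements above (x and phi are arbitrary example values I picked):

    import numpy as np

    x, phi = 0.6, 0.8
    c1 = x
    c2 = np.sqrt(1 - x**2) * np.exp(1j * phi)

    psi = np.array([c1, c2])            # |psi> = c1 |1> + c2 |2>
    rho = np.outer(psi, psi.conj())     # rho = |psi><psi|

    print(np.diag(rho).real)            # diagonal: [x^2, 1 - x^2]
    print(rho[1, 0], rho[0, 1])         # x sqrt(1-x^2) e^{+i phi} and its complex conjugate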
In the most general scenario of a measurement, the density matrix of the system can change in many ways, including its diagonal terms. However, in this simple example, a measurement will by necessity only bring the non-diagonal terms to zero (I hope most interested readers will have enough background to understand why).
Now, the measuring device, being a macroscopic object, will have a number of degrees of freedom far greater than the simple particle whose state we're about to measure. This number will be of the order of the Avogadro number (~10^23) - even the smallest human-visible indicator will be this big. The measurement, by necessity, involves an interaction of our small system with this enormous measuring device.
Before the interaction, the whole system (the particle and the measuring device) can be written as a tensor product of the two wavefunctions:

|Omega_before> = |psi> ⊗ |Xsi> = (c1 |1> + c2 |2>) ⊗ |Xsi>,

where |Xsi> represents the wavefunction of the measuring device and everything it interacts with before the measurement. When the interaction occurs, the state of our measuring device changes unitarily (as does everything in nature) according to the full Hamiltonian of the system, and with some regrouping of terms we can write the state after the interaction as:

|Omega_after> = c1 |1> ⊗ |Xsi_1> + c2 |2> ⊗ |Xsi_2>,

where |Xsi_1> and |Xsi_2> are the states of the measuring device (and its surroundings) correlated with the particle states |1> and |2>.
This is the true state of the system as produced by nature.
The individual subsystems are no longer in pure states, but the whole system |Omega_after> (if we are able to describe it completely) is.
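If it helps, here's a toy numerical version of this construction, with a deliberately tiny "apparatus" (a few dozen dimensions instead of ~10^23 degrees of freedom); all the concrete numbers are of course my own choices:

    import numpy as np

    rng = np.random.default_rng(1)

    x, phi = 0.6, 0.8
    c1, c2 = x, np.sqrt(1 - x**2) * np.exp(1j * phi)

    def random_state(d):
        v = rng.standard_normal(d) + 1j * rng.standard_normal(d)
        return v / np.linalg.norm(v)

    d = 64                        # toy "apparatus" dimension
    xsi1 = random_state(d)        # apparatus state correlated with |1>
    xsi2 = random_state(d)        # apparatus state correlated with |2>

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    omega_after = c1 * np.kron(e1, xsi1) + c2 * np.kron(e2, xsi2)

    # The full state is pure: Tr(rho^2) = 1.
    rho_full = np.outer(omega_after, omega_after.conj())
    print(np.trace(rho_full @ rho_full).real)

    # The particle alone (partial trace over the apparatus) is not: Tr(rho_p^2) < 1.
    omega = omega_after.reshape(2, d)
    rho_particle = omega @ omega.conj().T
    print(np.trace(rho_particle @ rho_particle).real)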
Now comes the final part, which some call the "collapse", but in reality it is just an average over all possible states of the bigger (measuring) system *which we declared a priori not to be the system of interest, and whose states we are not able to follow because we measure with it*:
Tr_{over the degrees of freedom of Xsi} |Omega_after><Omega_after|
As a result, after the measurement we obtain a reduced density matrix whose diagonal elements are
x^2 and 1-x^2, i.e., the probabilities of the two measurement results, and whose non-diagonal terms are equal to zero.
Why are they zero?
Let's inspect one of the non-diagonal elements over which the above trace is taken:
x sqrt(1-x^2) exp(i phi) <Xsi_1|Xsi_2>
It is effectively zero, because the overlap <Xsi_1|Xsi_2> is a multiple integral over the degrees of freedom of Xsi - again, of multiplicity of the order of the Avogadro number - involving a similar number of functions which vary in all sorts of ways. It is enough that even a fraction of these factors have magnitude less than 1 to guarantee that the product is effectively zero.
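To see numerically how quickly that overlap dies, here's a crude sketch (my own toy model, not a real apparatus): treat <Xsi_1|Xsi_2> as a product of one overlap per degree of freedom, each of magnitude at most 1, and watch the product shrink as the number of degrees of freedom grows:

    import numpy as np

    rng = np.random.default_rng(2)

    def single_dof_overlap(dim=4):
        # Overlap of two random states of one small degree of freedom.
        a = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
        b = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
        return np.vdot(a / np.linalg.norm(a), b / np.linalg.norm(b))

    for n_dof in (1, 10, 100, 1000):
        overlap = np.prod([single_dof_overlap() for _ in range(n_dof)])
        # The magnitude drops roughly exponentially with the number of degrees
        # of freedom; long before 10^23 it underflows to exactly 0.
        print(n_dof, abs(overlap))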
And that is all. Any attempt to change this would require rejecting quantum mechanics completely, because this probability calculus is at its heart.
This is just _one_ interpretation of wave function collapse, and the only thing it has going for it is that the dimensionality in which the collapse happens can always require another particle, which adds another complex degree of freedom, and so always remains outside the realm of what we can compute.
Two-particle interactions show nothing like wave function collapse, and neither do three or four. Until you name a concrete number of particles making up the measuring apparatus at which we should theoretically start to see _something_ weird happen, you're not even wrong.
Nothing "weird" starts happening. Unitary evolution is never broken; there are just rules in quantum mechanics that could perhaps be grouped under a supplementary framework related to how we, macroscopic entities, extract information from it.
> Until you name a concrete number of particles making up the measuring apparatus at which we should theoretically start to see _something_ weird happen, you're not even wrong.
Just saying something twice doesn't make it true. The "weird" thing you were perhaps referring to starts at the very beginning of the quantum mechanical framework. The Born rule is just a "conversion" of the predictions of the quantum framework into our classical language. The only option you have is to reject quantum mechanics as a whole, not to try to patch it - because that clearly will not work.
The weird thing would be discontinuities showing up in wave function evolution without putting them there with the potential energy functions. In the few cases where we can get analytical solutions there are no discontinuities that look anything like wave function collapse.
Your whole thesis rests on the claim that this should fall out when we put 6e23 particles together, for reasons.
So far we haven't managed to simulate even 1,000 quantum particles, because the curse of dimensionality means we run out of computers on Earth rather quickly. Which makes anything you're saying pointless, since we can't ever check it, even if we turned the whole observable universe into a computer.
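For a sense of scale (taking the particles to be simple two-level systems, which is already the most generous case), the back-of-the-envelope arithmetic looks like this:

    # Full state vector of n two-level particles: 2^n complex amplitudes.
    n = 1_000
    amplitudes = 2 ** n                  # exact big integer
    bytes_needed = amplitudes * 16       # one complex128 (16 bytes) per amplitude

    print(len(str(amplitudes)) - 1)      # ~301, i.e. about 10^301 amplitudes
    print(len(str(bytes_needed)) - 1)    # ~302, i.e. about 10^302 bytes
    # For comparison, the observable universe contains roughly 10^80 atoms.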
I hope that you read this with a scientific attitude, i.e., critical, but open to the possibility that not only is your position wrong, but that the whole enterprise of solving "the measurement problem" is unproductive.
What I'm trying to explain to you is that:
1) The wavefunction is only the DESCRIPTION of the underlying phenomena.
2) Within this description everything, and I mean ::everything::, evolves unitarily. No exceptions ever.
3) Whenever you decide to measure, i.e., to probe the microscopic system with an object that is not within your quantum description - you know it's huge, but you have no details about all of its phase/amplitude information - you are destined to average/trace over the unknown states. This can be done symbolically (as in my first post here) and shown to always give probabilities in the reduced density matrix. That's always what we're left with when a large system outside our description interacts with a small system within our description.
On the other hand, if you put a small quantum system together with another small quantum system (say, two particles), there's no need to trace/average/apply the Born rule immediately, because your description can be complete both in principle and in practice. You can just unitarily evolve the system for as long as you wish/can compute. However, sooner or later you'll want to measure - because ultimately that's what physics is all about, verifying your predictions with experiment - and then you are back to small vs. big, because that's the only way we humans can perceive this microscopic reality: through probing. The result will be completely analogous to the one before; the only change is that you'll now be able to predict probabilities for a two-particle system.
If you're familiar with electrodynamics, it's quite similar there, though here it's taken to another level by the probabilistic interpretation. What are the similarities?
You can have your complete description in terms of the four-potential, which, as we know from the Aharonov-Bohm effect, carries more information than the electric/magnetic fields alone, even though we only ever measure the fields, not the potentials.
The potentials were a by-product of the formalism that turned out to have real consequences. Similarly, we learned the importance of the wavefunction and its phases in the description, even though we only measure probabilities.
About the curse of dimensionality, the only thing I have to add is: that's true. We have a precise way to describe what is going on down there, but it's insanely expensive to simulate in full detail. That's still a lot to be happy about, in my opinion.
Also, if you feel uneasy with wavefunctions having the status of descriptions of reality,
go and study classical field theory, in which the fields are to be thought of as real physical entities,
then go a step deeper and you're in quantum field theory, in which you deal with descriptions again.
Would a theory in which we deal with "real physical entities" be better than that of "descriptions"?
I'd say, the hell with it. Go with whatever works best, not whatever fits your preconceived notions of reality.
Pink Floyd - Echoes surely is one of the best recipes for frisson.
However, for me Röyksopp - Forever is the absolute number 1 winner in this category.
It feels like those guys have really cracked the code with this one.
Just listen for at least 20s before 3:10: https://youtu.be/nM_txL43iFM?t=170
In exchange for Röyksopp, I present you the ending of Pacific Heights' "Buried by the burden" (the music video features point cloud/LIDAR imaging, a bonus): https://youtu.be/XBUdCBxrhZo?t=168
Thanks, that was a pretty awesome ending. It reminded me of the first minutes of "Makeup and Vanity Set" - `A Glowing Light, A Promise`, which is also frisson-inducing for me.