1. You are conflating the sense in which humans may arrive at a mistaken propositional model by misreading the context with the sense in which the machine lacks any notion of contextual relevance by which to arrive at any specific propositional model at all.
This tactic recurs in these "replies": humans fail for semantic reason A; machines fail for non-semantic reason B; isn't A just B? No.
2. Or you've misunderstood how humans learn.
Though, on the face of it, the sketch of the proof is correct: there are an infinite number of target models (T) which compress to a given representation R. E.g., there are an infinite number of 3D geometries which can produce a given 2D photograph.
Compression (i.e., "low-rank" interpolation through data) yields a function from R-space datasets (e.g., 2D photos) to a model of R which "covers" that space.
It does not yield a function R -> T, which doesn't exist as a formal matter. You need, at the very least, to add information. This is what many misunderstand: light itself is ambiguous and does not "contain" sufficient information. We resolve light into 3D models by a "best guess" based on prior information (a sketch below makes this concrete).
So we require, at least, (R, C) -> T, where 'C' is some sort of contextual model which bridges the infinities between R and T.
Since ML takes Samples(T -> R) -> R, and not (R, C) -> T, it doesn't produce what is required.
QED.
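To make the ambiguity concrete, here is a minimal sketch assuming a pinhole camera model (my illustration, not part of the original argument): the forward map T -> R collapses infinitely many geometries onto one image, so no inverse R -> T can exist without added information.

```python
import numpy as np

# Pinhole projection: the many-to-one forward map T -> R.
# (X, Y, Z) -> (f*X/Z, f*Y/Z); depth is discarded.
def project(points_3d: np.ndarray, f: float = 1.0) -> np.ndarray:
    return f * points_3d[:, :2] / points_3d[:, 2:3]

# An invented toy scene: two 3D points in front of the camera.
scene = np.array([[1.0, 2.0, 4.0],
                  [0.5, -1.0, 2.0]])

# Rescaling the whole scene along the view axis changes the geometry T
# but leaves the photograph R identical.
for k in (1.0, 2.0, 10.0):
    print(k, project(k * scene))
# Every k prints the same 2D coordinates: one R, infinitely many T.
# Choosing among them requires exactly the prior/contextual model C.
```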
3. Word2Vec does not capture hierarchical relationships. He chose hierarchy specifically because it is a discrete constraint, and ML is a continuous interpolation technique that cannot arrive at discrete constraints.
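A minimal sketch of that point, with invented vectors standing in for learned word2vec embeddings: the similarity geometry such embeddings induce is symmetric by construction, whereas a hierarchy is a discrete, asymmetric ordering.

```python
import numpy as np

def cos(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: symmetric in its arguments by definition.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented stand-ins for embedding vectors; real word2vec vectors would
# behave the same way here, since only symmetry is at issue.
poodle = np.array([0.9, 0.3, 0.1])
dog    = np.array([0.8, 0.4, 0.2])

assert cos(poodle, dog) == cos(dog, poodle)
# "poodle is-a dog" holds; "dog is-a poodle" does not. A symmetric,
# continuous similarity has no way to express that discrete, one-way
# constraint, whatever the vectors are.
```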
4. "Actively worked on" means building AGI. Participating in the world with people is how animals acquire relevance, context, etc.