What exactly is incredible (relatively) about the current state of things? I don't know how up-to-date you are on research, but how can you be claiming that we had no results previously? This is the kind of ignorance of previous work that we should be avoiding. We had the same kind of results previously, only with lower numbers. I keep trying to explain that increasing the numbers is not going to get us there because the numbers are measuring the wrong thing. There are other things that we should also focus on improving.
>dismissing our measures of progress and improved generality with ‘nowhere near as robust as [...] humans’ is certainly not the way to figure it out.
It is the way to save this field from wasting so much money and time on coming up with the next small tweak to get that 0.001 improvement in whatever number you're trying to increase. It is not a naive or spiteful dismissal of the measures; it is a critique of the measures, since they should not be the primary goal. The majority of this community is mindlessly tweaking architectures in pursuit of publications. Standards of publication should be higher to discourage this kind of behavior. With this much money and manpower, the field should be exploding in orthogonal directions instead. But that requires taste and vision, which are unfortunately rare.
>People can't multiply thousand-digit numbers in their heads; why should that in any way invalidate their other measures of intelligence?
Is rote multiplication a task that we're interested in achieving with AI? You say that you aren't interested in categorizing for the sake of categorizing, but this is a counterexample for the sake of giving a counterexample. Avoiding this kind of example is precisely why I said "a measure that we care about improving".
These are not just ‘increasing numbers’. These are fucking witchcraft, and if we didn't live in a world with 5-inch blocks of magical silicon that talk to us and giant tubes of aluminium that fly in the sky, the average person would still have the sense to recognize it.
> It is the way to save this field from [...]
For us to have a productive conversation here you need to either respond to my criticisms of this line of argument or accept that it's wrong. Being disingenuous because you like what the argument would encourage if it were true doesn't help when your argument isn't true.
> Is rote multiplication a task that we're interested in achieving with AI?
It's a measure for which improvement would have a meaningful positive impact on our ability to reason, so it's a measure we should wish to improve, all else being equal. Yes, it's marginal, and yes, it's silly; that's the point: failure in one corner does not equate to failure in all of them.
>These are not just ‘increasing numbers’. These are fucking witchcraft, and if we didn't live in a world with 5-inch blocks of magical silicon that talk to us and giant tubes of aluminium that fly in the sky, the average person would still have the sense to recognize it.
What about generative models is really AI, other than the fact that they rely on some of the same ideas from machine learning that are found in actual AI applications? Yes, maybe to the average person these are witchcraft, but any sufficiently advanced technology can appear that way; Deep Blue beating Kasparov was probably witchcraft to the uninitiated. This is curve fitting, and the same approaches in 1999 were also trying to fit curves; we can just fit them far better now. Even the exact methods used to produce your examples are not fundamentally new; they are the same old ideas with the same old weaknesses. What we have right now is a huge hammer, and a hammer is surely useful, but it is not the only thing needed to build AI. Calling these witchcraft is a marketing move that we definitely don't need: it creates unnecessary hype and hides the simplicity and the naivete of the methods used to produce them. If anybody else reads this: these are just increasing numbers, not witchcraft. But as the numbers increase it takes a little more effort and knowledge to debunk them.
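To be concrete about what I mean by curve fitting, here it is in its plainest form (a toy sketch with made-up data, not any particular model): pick a parametric family and minimize error over observed points. Modern networks do the same thing, just with vastly more parameters, data, and compute.

    import numpy as np

    # Toy data: noisy samples of an underlying function (made up for illustration).
    x = np.linspace(0, 1, 50)
    y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(50)

    # "Learning" here is just least-squares fitting of a degree-5 polynomial.
    coeffs = np.polyfit(x, y, deg=5)
    y_hat = np.polyval(coeffs, x)
    print("mean squared error:", np.mean((y - y_hat) ** 2))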
I'm not dismissing things for the fun of it, but it pains me to see this community waste so many resources in pursuit of a local minimum for lack of a better sense of direction. I feel like not much more is to be gained from this conversation, although it was fun, and thank you for responding.
I appreciate that you're trying to wind this down, so I'll try to get to the point, but there's a lot to unpack here.
I'm not evaluating these models on whether they are AGI; I am evaluating them on what they tell us about AGI in the future. They show that even tiny models, some 10,000x to 1,000,000x smaller than what I think are the comparable measures in the human brain, trained with incredibly simple single-pass methods, manage to extract semirobust and semantically meaningful structure from raw data, are able to operate on this data in semisophisticated ways, and do so vastly better than their size-comparable biological controls. I'm not looking for the human; I'm looking for small-scale proofs of concept of the principles we have good reason to expect are required for AGI.
The curve fitting meme[1] has gotten popular recently, but it's no more accurate than calling Firefox ‘just symbols on the head of a tape’. Yes, at some level these systems reduce to enormously high-dimensional mathematical curves, but the intuitions this brings are pretty much all wrong. I believe this meme has gained popularity due to adversarial examples, but those are typically misinterpreted[2]. If you can take a system trained to predict English text, prime it (not train it) with translations, and get out French-English translations of nontrivial quality, dismissing it as ‘just’ curve fitting is ‘just’ the noncentral fallacy.
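To be concrete about what I mean by priming, here is a hypothetical sketch (not any real system's API; `complete` below is only a stand-in for whatever continuation interface a trained text-prediction model exposes): you prepend worked translations to the input and let the model continue the pattern, with no gradient updates at all.

    # Hypothetical stand-in for a trained text-prediction model's continuation call.
    def complete(prompt: str) -> str:
        return "<model continuation would go here>"  # placeholder; no real model behind it

    # Priming, not training: example translations are simply prepended to the input.
    prompt = (
        "French: Bonjour, comment allez-vous ?\n"
        "English: Hello, how are you?\n"
        "French: Le chat dort sur la chaise.\n"
        "English: The cat is sleeping on the chair.\n"
        "French: Je voudrais un café, s'il vous plaît.\n"
        "English:"
    )
    print(complete(prompt))  # the model continues the French-to-English pattern it was primed with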
Fundamental to this risk evaluation is the ‘simplicity and the naivete of the methods used in producing them’. That simple systems, at tiny scales, with only inexact analogies to the brain, based on research younger than the people working on it, are solving major blockers in what good heuristics predict AGI needs is a major indicator of the non-implausibility of AGI. AGI skeptics have their own heuristics instead, with reasons why those blockers should be hard, but when you calibrate against the only existence proof we have of AGI development (human evolution), those heuristics are clearly and overtly bad heuristics that would have failed to trigger. Thus we should ignore them.
[1] Similar comments apply to ‘the same approaches in 1999’, another meme that is only true at the barest surface level. Scale up 1999 models and you get poor results.
[2] See http://gradientscience.org/adv/. I don't agree with everything they say, since I think the issue relates more to the NN's structure encoding the wrong priors, but that's an aside.