The meeting of the minds that launched AI (ieee.org)
173 points by fremden on Sept 11, 2023 | 54 comments



They tried so hard.

I went through Stanford CS in the 1980s, just as it was becoming clear that the predicate-calculus / expert systems approach to AI wasn't going to do much outside of very well defined problem spaces. It was very frustrating. I'd already done machine proof of correctness work, but that's in a problem space you can formalize. So I had a sense of the limits of the approach, and was rather surprised by the broad claims made for it at Stanford.

I took McCarthy's "Epistemological problems in artificial intelligence" class. One day he was describing the missionary-and-cannibals problem (amusingly, in a less violent form). He started out with it as a word problem, and then he set it up so that his new "circumscription" approach could be applied. Turn the crank on that algorithm, and an answer pops out. The word problem to formalism conversion struck me as the step where, as an old New Yorker cartoon showing a mathematician explaining something put it, "a miracle occurs". Formalizing the problem is the hard part.
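
For concreteness, here is a minimal sketch of the "turn the crank" step once the word problem has been formalized. This is not McCarthy's circumscription, just a plain breadth-first search in Python over (missionaries, cannibals, boat) states, but it makes the same point: once the problem is encoded as states and legal moves, the search is mechanical. The encoding is the part the code cannot do for you.

    # Missionaries and cannibals as a plain state-space search (not circumscription).
    # State = (missionaries, cannibals, boat) counted on the left bank.
    from collections import deque

    START = (3, 3, 1)
    GOAL = (0, 0, 0)
    MOVES = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # who rides the boat

    def legal(m, c):
        # Neither bank may have missionaries outnumbered by cannibals.
        return (0 <= m <= 3 and 0 <= c <= 3 and
                (m == 0 or m >= c) and
                (3 - m == 0 or 3 - m >= 3 - c))

    def solve():
        frontier = deque([(START, [START])])
        seen = {START}
        while frontier:
            (m, c, b), path = frontier.popleft()
            if (m, c, b) == GOAL:
                return path
            d = -1 if b == 1 else 1  # boat carries people away from or back to the left bank
            for dm, dc in MOVES:
                nxt = (m + d * dm, c + d * dc, 1 - b)
                if legal(nxt[0], nxt[1]) and nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [nxt]))

    print(solve())  # the classic 11-crossing solution pops out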

The classic critique along those lines is "Artificial Intelligence meets Natural Stupidity", by Drew McDermott.[1] That was written in 1976, and basically describes why that class of AI systems hit a wall in the 1980s. The "AI Winter" followed, the 1980s AI startups went bust, and not much happened for 15 years. Well-known history.

[1] https://cs.fit.edu/~kgallagher/Schtick/Serious/McDermott.AI....


As a 7 year old, I met two of them at IFIP68 in Edinburgh. I don't have much recall of Minsky, but McCarthy was nice. We went out to Turnhouse airport and he flew a light plane around. At dinner he commented that we did the washing up by hand, and when he offered to send us a dishwasher my mother (of course) said "no no John, don't be silly", but she said to me in the kitchen later on: "if he ever offers again, accept".

We also ate at a Chinese restaurant across town and he was surprised when my brothers and sisters and I said bird's-nest soup was disgusting.

Susy McCarthy stayed with us for a while. She was a typical Californian teenager: she ate her cereal with the wrong hand holding the spoon and put her feet up on the chairs, and we younger kids were told in no uncertain terms not to emulate her. She refused to go to school and there were exasperating phone calls about what to do. In the end she went somewhere else. John's wife died climbing Mt Everest; it was really sad. I think it was the first all-women's climbing team to try the route (or even Everest's pinnacle).

Minsky and McCarthy were quite happy to respond to emails in the 80s and 90s. I asked Minsky about writing with Harry Harrison and he said he was frustrated by the ending of the book they did together ("The Turing Option"), and McCarthy talked about his ideas for long-term information ontologies in a notation or language he called "Elephant".

McCarthy was a notable USENET curmudgeon and would be oppositional to any anti-nuclear, anti-oil, or anti-plastic agenda he saw. He also said he cried when he learned the bombs had been dropped and the invasion of Japan was off, because he knew he wouldn't die on a beach in an opposed landing. (It is possible I have this mixed up with memories of what Paul Fussell wrote, but I am pretty sure John said this.)

John played "Alice's Restaurant" on a guitar; he had a very nasal, whiny voice.


> As a 7 year old, I met two of them at IFIP68 in Edinburgh

Can you fill in some background? What was IFIP68 and how did you end up there as a 7 year old?


IFIP68 was the 1968 conference of IFIP (The International Federation for Information Processing) held in Edinburgh.

My dad was one of the conference organisers. Local host, maybe? He was the foundation chair of the computer science department at the University of Edinburgh. As he was involved in the academic programme committee, he had the principal keynote speakers back for socials, dinners and the like. I went to the trade show held alongside the conference and got two cardboard mock PDP-8 computers from the Digital Equipment stand; they had just announced the PDP-8/I configuration and were handing out cardboard blanks to prospective buyers. My memory is that this was in the old exhibition space next to Waverley station, which had a roof with gardens and glass insets; it was removed in the 70s. It was the kind of space that hosted the Ideal Home Show, that kind of thing, but packed out with hardware vendors trying to sell mainframes.

I think I had the only 2-node parallel cardboard PDP-8 in existence.


My father and his friends were academic computer scientists working on AI back in the 60s. I don't know that there's a straightforward path between what they were doing and the popular LLMs today, but I do applaud more stories on what old school comp sci researchers were up to.


LLMs of today display amazing abductive abilities but are limited in inductive and deductive abilities, as well as in the optimization techniques of classical AI and algorithms. These abductive abilities are unique and exciting because we've typically done really poorly with ambiguous and complex semantic spaces like this. However, I think the excitement has obscured the fact that it's just a piece of a larger machine. Why do we care that LLMs are mediocre chess players when we have machine models using more traditional techniques that are the best chess players on earth? Why do we care that they fail at deductive reasoning tests? At mathematical calculations? Those are really well understood areas of computing. Somehow people have fixated on the things we've already done that this new technique fails at, but ignore the abilities LLMs and other generative models demonstrate that we've never achieved before. At the same time, the other camp sees generative AI only as the silver bullet tool to end all other tools. Neither is correct.


>but are limited in inductive and deductive abilities

LLMs are great at induction.

In a broad sense, they are also very good at deduction.

"I define a new word, the podition. A podition is any object that can fit on a podium. Is a computer a podition ? Why ?"

A correct answer is deductive.

LLMs eat these kinds of questions for breakfast. Even the OG 2020 GPT-3 could manage them.

You really do have to stretch deduction to heights most people struggle with before they falter in a major way.


For breakfast, you say?

User:

I define a new word, the podition. A podition is any object that can fit on a podium. Is a computer a podition ? Why ?

ChatGPT:

The term "podition" is a newly coined word, and its definition as "any object that can fit on a podium" is quite specific. Whether or not a computer can be considered a "podition" depends on the size and design of the podium in question, as well as the specific characteristics of the computer.

If the podium is large enough and designed to accommodate objects of various sizes, then a computer could potentially be considered a podition if it fits on the podium without exceeding its space or causing any issues. However, if the podium is small or not designed to hold objects like a computer, then it may not qualify as a podition.

In summary, whether a computer is a podition or not depends on the podium's size and design and whether the computer can comfortably fit on it without causing any problems. The term "podition" itself is a new and specific concept, so its application may vary depending on context.

User:

Using common sense definitions of a computer and a podium, is a computer a podition?

ChatGPT:

Using common sense definitions of a computer and a podium, a computer is typically not considered a "podition." A computer is an electronic device used for processing and storing data, while a podium is a raised platform or stand typically used for holding notes, books, or supporting a speaker during a presentation. These two objects serve very different purposes and have distinct characteristics.

A computer is not designed to be placed on a podium, and it would generally be impractical to do so. A podium is meant for supporting objects like papers, books, or notes during a presentation, not electronic devices like computers. Therefore, in common usage and understanding, a computer would not be classified as a "podition."


>For breakfast, you say?

Yes, and it's honestly pretty funny that you think this proves otherwise. The error is mine: a computer is not always small enough to fit on a podium. It's not about a "common sense" definition or not, and GPT caught that.

Go ahead and replace computer with any object that is always small enough to fit on a podium.


The mental gymnastics are beautiful and entertaining.

Google image "podium". Google image "computer". The vast majority of computers shown can be placed on the vast majority of podiums shown.

But again, as the other commenter points out, this is abductive reasoning.

In addition to getting lost in the irrelevant weeds of "these two objects serve very different purposes and have distinct characteristics", ChatGPT also fails to point out that there are two common definitions of podium: one made for lecturing, on which you could place most modern computers that aren't 30 kg hunks of metal (all of my university lecturers definitely had computers on the podium), and one made for humans to stand on, on which you could definitely place any bulky desktop computer short of a mainframe.

What this really shows though is that LLM output is a mirror in which people will read mostly what they want to read which matches their narrative.


>The mental gymnastics are beautiful and entertaining.

Sure

>But again, as the other commenter points out, this is abductive reasoning.

Deduction is divining the effect from the rule and the cause. The rule is the definition of the podition, the cause is that a computer is small enough to fit on a podium, and the effect is that a computer is a podition.

>Google image "podium". Google image "computer". The vast majority of computers shown can be placed on the vast majority of podiums shown.

Deduction does not deal with probable statements; the rule and cause must both be true, and my cause is not always true.

Like I said, I genuinely do think it's funny that you read all that and decided it showed an absence of reasoning. It just kind of shows the state of things.


This is actually abduction. Induction doesn’t fail, abduction is probabilistic.


These forms of reasoning each involve three components: a rule, a cause, and an effect.

The example I gave is deductive, i.e. divining the effect from the rule and the cause. The rule is the definition of the podition, the cause is that a computer is small enough to fit on a podium, and the effect is that a computer is a podition.

Induction is divining the rule from the cause and effect. It can definitely fail: there could be exceptions to the general rule that aren't captured in the observations. It is the basis of experimental science.

Abduction is divining the cause from the rule and effect.

i.e When it rains, the grass gets wet (rule). The grass is wet (effect). Therefore, I predict that it just rained (cause).

Both induction and abduction deal with probable statements. Only deduction is guaranteed to be true (provided the rule and cause are true)
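
To make the rule/cause/effect framing concrete, here is a toy Python sketch of the three inference directions using the rain/grass example from this thread. It only illustrates the definitions above; it says nothing about how LLMs work internally.

    # Toy illustration of deduction, induction, and abduction over one rule.
    RULE = {"cause": "it rained", "effect": "the grass is wet"}

    def deduce(rule, cause):
        # Deduction: rule + cause -> effect (guaranteed, given true premises).
        return rule["effect"] if cause == rule["cause"] else None

    def induce(observations):
        # Induction: repeated (cause, effect) pairs -> a general rule.
        # Can fail: later observations may contain exceptions.
        causes = {c for c, _ in observations}
        effects = {e for _, e in observations}
        if len(causes) == 1 and len(effects) == 1:
            return {"cause": causes.pop(), "effect": effects.pop()}
        return None

    def abduce(rule, effect):
        # Abduction: rule + effect -> a *probable* cause.
        # Weakest of the three: a sprinkler would also explain wet grass.
        return rule["cause"] if effect == rule["effect"] else None

    print(deduce(RULE, "it rained"))                        # the grass is wet
    print(induce([("it rained", "the grass is wet")] * 3))  # recovers the rule
    print(abduce(RULE, "the grass is wet"))                 # it rained (probably)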


I'm sorry, you're right; I dashed off a quick reply and wrote the wrong word because I was distracted.

However I still stand by the statement that their inductive and deductive reasoning is weaker than abductive. This is why they so easily hallucinate: the very nature of choosing the semantically likely next token is at its root abductive.

GPT is remarkable, but it’s not reasoning in any meaningful sense. It’s not structuring logical constructs and drawing conclusions. I’d hold by my assertion that they are abductively simulating inductive and deductive reasoning.


>However I still stand by the statement that their inductive and deductive reasoning is weaker than abductive

Technically, abduction is the weakest form of reasoning, in the sense that it is the likeliest to reach incorrect conclusions. The conclusions are wrong if you decide on the wrong rule: in the example, there are rules other than rain that could explain the wet grass. It could be a sprinkler.

However, having a good sense of which rule to pick? I agree that is by far the hardest thing to replicate in an artificial system.

>GPT is remarkable, but it’s not reasoning in any meaningful sense. It’s not structuring logical constructs and drawing conclusions. I’d hold by my assertion that they are abductively simulating inductive and deductive reasoning.

Seeing output from GPT that demonstrates intelligence, reasoning, or whatever, and saying it is not real reasoning/intelligence, is like looking at a plane soar and saying the plane is fake flying. And this isn't even a natural-versus-artificial thing either; the origin point is entirely arbitrary.

You could just as easily move the origin to Bees and say, "oh, birds aren't really flying". You could move it to planes and say, "oh, helicopters aren't really flying." It's basically a meaningless statement.

If it can do and say things demonstrating induction or deduction then it is performing induction or deduction.

>It’s not structuring logical constructs and drawing conclusions

I don't think people are structuring logical constructs with every deduction they make


I don’t think people always do deductive reasoning when they attempt to do it. In fact I think people largely do abductive reasoning, even when they attempt deductive reasoning. Machines are better at deductive reasoning because sans some special purpose approach they can do nothing but follow the rules.

This is specifically why I think LLMs are so enchanting to humans: their behavior and logic are less sterile and more human in nature, precisely because the approach is a "most likely" continuation based on training data. With lots of examples of deductive reasoning it can structure a response that is deductively reasoned - until it doesn't. The fact that it can fail partway through deductive reasoning shows it's not actually deductively reasoning. This doesn't mean it can't produce results that are deductive, but it is literally unable to formulate a set of rules and apply them in sequence to arrive at a conclusion from the premise. It formulates a series of most-likely tokens based on its training and context, so while it may quite often arrive at a conclusion that is deductive, it never actually deduced anything.

I feel like you think I'm somehow denigrating the output of the models. I'm not. I'm in fact saying we already have amazing deductive solvers and other reasoning systems that can do impressive proofs far beyond the capability of any human or LLM. But we have never before built something that can abductively reason over an abstract semantic space, and that is AMAZING. Making LLMs perform rigorous deductive reasoning is, IMO, a non-goal. Making a system of models and techniques that leverages the best of breed, and firmly plants the LLM in the space of abstract semantic abductive reasoning as the glue that unites everything, is what we should be focused on. Then, instead of spending 10 years making an LLM that can beat a high school chess champion, we can spend two months integrating a world-class chess AI into a system that delegates to the chess solver when it plays chess.


How are LLMs bad at induction? I thought they were great at induction. This paper doesn’t go into measurements of it, but helps lay out the nature of reasoning well.

https://aclanthology.org/2023.findings-acl.67.pdf#page15


They are great at saying things that sound like the next line of the conversation. That's a certain kind of induction for sure, but probably not the kind you're after.


There are some effective ways to make them a lot more "truthful" than that, as you can tell by how good GPT-4 is at it.

Simple example: sample it a bunch of times. If it's different each time, it likely made it up.
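
A minimal sketch of that idea. The sample_llm() function here is a hypothetical stand-in for whatever model API you call with a nonzero temperature; agreement across samples is treated as a rough confidence signal, disagreement as a hint the answer may be made up.

    # Self-consistency check by repeated sampling (sketch, not a real API).
    from collections import Counter

    def sample_llm(prompt: str) -> str:
        # Placeholder: call your model here with temperature > 0.
        raise NotImplementedError

    def self_consistency(prompt: str, n: int = 5, threshold: float = 0.6):
        answers = [sample_llm(prompt).strip().lower() for _ in range(n)]
        answer, count = Counter(answers).most_common(1)[0]
        return answer, (count / n) >= threshold  # (majority answer, looks reliable?)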


'Duck induction', by analogy to 'duck typing'.


There's some value in planting a flag in the ground. Even if most of the people there were in the symbolic camp, a lot of their critiques of neural networks as they existed then were well founded, and were really only proved obviously _wrong_ after many, many rounds of Moore's law.


The criticism from the beginning was of a fundamental theoretical nature, and it died in the '90s when people proved and demonstrated that neural networks were powerful enough to run any kind of computation.

In fact, I don't recall people criticizing neural networks for being too small to be useful. Ever. There was a lot of disagreement between wide- and deep-network proponents, which deep won by demonstration, but "how large a network do we need to handle X" was always more of a question than a "see, we'll never get there". (Even more so because "we will never get there" is obviously false, since the thing has practically no limit on scaling.)


My favourite detail about that 1956 meeting is this extract from the conference proposal:

> An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.

I think this may be one of the most over-ambitious software estimates of all time.

The whole proposal is on https://en.wikipedia.org/wiki/Dartmouth_workshop


The promise was "a significant advance in one or more of these areas", not the Holy Grail. Not unreasonable!


I forget where I read it, but there was a similar assessment of the time needed for computer vision. Overall, quite an optimistic time.


For those interested in reading Peter Milner's autobiography, here's the pdf.

https://www.sfn.org/-/media/SfN/Documents/TheHistoryofNeuros...


I just learned about this conference a couple weeks ago while watching the Computer History Museum video on AI: https://youtu.be/NGZx5GAUPys?si=aVDZAmpR2ziKq4x9

(Video is from 2014)


IIRC, Simon and Newell arrived with working code despite not yet having a working computer at CMU.


67 years later: https://aiconference.com


And https://ai.engineer/summit ! As this industry builds out, there need to be multiple forums for different groups to get together.


Summer camp for mathematicians


This group of people may have been the first to use the term "AI" prominently in academia, but is this flag planting, or are they truly foundational to the work that bears the same name today? If none of these people had done anything, would we really be far behind?

My sense is that modern AI owes more to Fukushima's neocognitron, Hubel and Wiesel, and the connectionists than to any intellectual descendant of the work mentioned here.


Modern AI depends on one thing above all else: The unimaginable compute power provided by today's GPUs.

All theoretical foundations are trivial by comparison. The basic math can be understood by an interested high school student, but unless you are able to do billions of matrix multiplications per second, it doesn't mean anything. AI was going nowhere before Nvidia.

Calling these people "the minds that launched AI" is like calling Archimedes the father of hot-air balloons, because he recognized the principle of buoyancy.


Are you trolling? Minsky, McCarthy, Newell and Simon all went on to win Turing awards for their work, (as later did several other AI luminaries over the decades). And Claude Shannon?

In the late sixties Minsky and Papert published a book called "Perceptrons", which explored the limits of perceptrons, though it said those limits could be overcome by multilayer networks, if they were ever computationally feasible. And, thanks to Moore's law, they now are.

Almost every one of those attendees is a well-known scholar whose work you depend on every day outside AI (as well as within it, though nobody really depends on NNs yet). Of them, the least known in computing, Ray Solomonoff, might be the smartest of the bunch, though he was never university-affiliated as far as I know.


> And, thanks to Moore's law, they now are.

Not thanks to "Moore's law". Thanks to the countless engineers who made that happen, and whose work is much more important for today's AI systems than the theory that was cooked up in the 60s and 70s.

Your comment is a typical example of the hero-worship towards theorists, and the casual disregard towards engineers, that is so common in today's science culture.

Any above-average grad student could reinvent the perceptron network from scratch. Good luck having a grad student (or even a Nobel laureate) redesign the H100 GPU from scratch.


You got that wrong: the problem with multilayer networks at the time was not about scaling, but about how to train such a network at all.

Using error backpropagation to train multilayer networks was what overcame that problem, not Moore's law or anything else.


Don't you remember what machines were like back then? The PDP-10 was roughly a 0.4 MIPS timesharing computer with an 18-bit address space.

People like Rumelhart kept at it, and eventually the hardware caught up with the requirements.


I know computers were slow, and Moore's law was what made really big networks computationally feasible.

Still, the decisive algorithmic breakthrough for the perceptron was applying backpropagation to multilayer perceptrons. Without multiple layers you can't solve problems that aren't linearly separable, and without error backpropagation you can't train multilayer networks.
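
A minimal numpy sketch of that point: XOR is not linearly separable, so no single-layer perceptron can learn it, but a small multilayer network trained with error backpropagation can. The layer sizes, learning rate, and iteration count here are arbitrary choices for illustration, not anything historical.

    # Two-layer network trained by backpropagation on XOR (sketch).
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

    W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
    W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    for _ in range(10000):
        # forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # backward pass: squared-error loss, chain rule layer by layer
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= 0.5 * h.T @ d_out
        b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
        W1 -= 0.5 * X.T @ d_h
        b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

    # Typically converges to roughly [0, 1, 1, 0]; an unlucky seed may need more steps.
    print(out.round(3).ravel())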




Other than Minsky, I don't think the others (who were nevertheless scientists in their respective fields) are considered to have made significant contributions to modern machine learning or AI. McCarthy's work around this topic culminated in LISP, leading to Emacs, a text editor!

From that period, Rosenblatt’s work was instrumental to modern AI.


Solomonoff's theory of inductive inference (https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_induc...) is about as basic to the theory of intelligent agents as anything gets.

(He's in the pic and I'd guess this article was by a relative.)


If I were an intelligent agent, I would prefer to be based on a theory that was computable without time travel, which this one isn't.


Ah, but time travel (or rather, prediction, but I’m being whimsical here) is the essence of intelligence. Working off your current state and inputs your mind peers forward in time to imagine the ghost of the future, and echoes of this future ripple back to drive your actions.


Claiming that the creator of LISP did not have a significant impact on AI is not a defensible position.


People forget how long Lisp had an impact on AI, even outside GOFAI techniques; LeCun's early neural networks were written in Lisp: https://leon.bottou.org/publications/pdf/sn-1988.pdf


I don’t know - there’s real impact and then there’s inconsequential path dependency. This feels like the latter. The networks turned out to be valuable but LISP did not.


The story goes that John McCarthy was applying for an assistant professorship at MIT. MIT told him that they already had Norbert Wiener, a renowned mathematician at the time who had published Cybernetics some years earlier, in which he talks about agents interacting with the environment and feedback control, something like modern computation-based AI. McCarthy changed the name from cybernetics to AI and focused on symbolic systems and logic. The approach was generally not successful.

Some people consider that the logic-based approach to AI pioneered at this conference contributed to what we now call an AI winter. People like John Pierce of Bell Labs, a very influential figure in government, defunded research in computation-based AI such as speech recognition (he wrote articles saying, basically, that researchers pursuing these techniques were charlatans).

There is no major algorithm or idea in undergrad machine learning textbooks named after these people. There are other people from that era.


Makes sense. I heard that some of Wiener's anti-war sentiment (specifically, opposition to military work during peacetime) may have contributed… cybernetics really collapsed hard as a discipline, even though I find it very helpful from a systems-design perspective. AI has always bothered me as a term because, from a design perspective, the goal should be creating intelligent systems, not necessarily entirely artificial ones.

>There is no major algorithm or idea in undergrad machine learning textbooks named after these people.

Maybe the pandemonium idea from Selfridge?


Shannon did foundational work on the theory of computers playing chess. That stuff might seem 'obvious', but you have to remember that it would have seemed futuristic, if not impossible, when it was first proposed. That work is itself fundamental to a lot of subsequent AI research (especially the concept of 'game playing' as a model for testing approaches).
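
To show what "game playing as a model" looks like in code, here is a minimal minimax sketch. It is not Shannon's chess program; it plays a toy game instead (single-pile Nim: take 1-3 sticks, whoever takes the last stick wins), but the game-tree search is the core idea his chess paper laid out.

    # Minimax on a toy game (Nim), in the spirit of game-tree search.
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def value(sticks: int, maximizing: bool) -> int:
        # +1: the maximizing player wins with perfect play; -1: they lose.
        if sticks == 0:
            # The previous player took the last stick and won.
            return -1 if maximizing else 1
        scores = [value(sticks - take, not maximizing)
                  for take in (1, 2, 3) if take <= sticks]
        return max(scores) if maximizing else min(scores)

    def best_move(sticks: int) -> int:
        # Choose the take whose resulting position is best for us.
        return max((t for t in (1, 2, 3) if t <= sticks),
                   key=lambda t: value(sticks - t, False))

    print(best_move(10))  # 2: leave the opponent a multiple of 4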


If I recall, Shannon returned a proposal to McCarthy for research during his stay on cellular automata (perhaps inspired by von Neumann), not on machines playing games.

Shannon did foundational research in information theory, communications, cryptography, digital relay circuit design, and gambling. Crediting him with an impact on AI and machine learning is a bit of a stretch (even though IEEE Spectrum and Bell Labs have published articles claiming credit, indicating that Shannon played a role).

He was an avid chess player, and he demonstrated machines doing various things: playing, finding their ways in a maze, juggling, etc. It wasn't based on a foundational theory of computers playing games (and the approach is different from the reinforcement learning or optimization of today's models).


Shannon published a paper on chess AI in 1950 ("Programming a Computer for Playing Chess"); just give it a quick Google.


> finding their ways in a maze,

Another beginning building block of the field of AI.

Are you even thinking about what you’re writing here or is this the output of an LLM?


Emacs is so much more than a text editor! But I need to stay on topic...

I believe your assessment of LISP's (and therefore of McCarthy's) impact on AI is unfair. Just a few days ago https://github.com/norvig/paip-lisp was discussed on this site, for example.


Shannon was, I believe, the first person to define an n-gram probabilistic language model of English text and then sample from it!
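
A small sketch of that kind of model: count next-word frequencies in a text and sample from them. The sample text below is an arbitrary stand-in; Shannon built his approximations by hand, at both the letter and word level, from printed books.

    # Word-level bigram model: estimate next-word counts, then sample.
    import random
    from collections import defaultdict

    text = ("the cat sat on the mat and the dog sat on the log and "
            "the cat saw the dog").split()

    bigram = defaultdict(list)
    for w1, w2 in zip(text, text[1:]):
        bigram[w1].append(w2)  # empirical next-word distribution

    def generate(start="the", n=12):
        words = [start]
        for _ in range(n):
            followers = bigram.get(words[-1])
            if not followers:
                break
            words.append(random.choice(followers))
        return " ".join(words)

    print(generate())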



