
What excites me about this and similar work (e.g. https://arxiv.org/abs/1904.12584) is that it augurs well for a new era of AI in which we combine the strength of deep learning (specifically, the teaching signal provided by automatic differentiation) and symbolic GOFAI (with its appeal to interpretability and traditional engineering principles).

Of course we need to keep the gradients, architecture search, hyperparameter tuning, scalable training on massive datasets, etc., but there is a growing sense that the programs we write can encode extremely powerful priors about how the world works, and that failing to encode those priors leaves our learning algorithms subject to attacks, bugs, poor sample efficiency, bad generalization, and weak transfer. Not to mention a host of rickety conclusions that are probably poisoned by hyperparameter hacking.

Conversely, we need to try to avoid the proliferation of black box systems that require heroic efforts of mathematical analysis to understand and debug. Take for example the highly sophisticated activation atlas work by Shan Carter and others, which was needed to reveal that many convnets are vulnerable to an almost childlike kind of juxtapositional reasoning (snorkeler + fire engine = scuba diver). Beautiful work, but to me it would be better if that form of analysis weren't necessary in the first place, because the nets themselves were incapable of reasoning about object identity using distant context.

We need systems that are, by design, amenable to rigorous and lucid scientific analysis, that are debuggable, that admit simple causal models of their behavior, that are provably safe in various contexts, that can be straightforwardly interrogated to explain their poor or good performance, that suggest modification and elaboration and improvement other than adding more neurons. We need to speed the maturation of modern deep learning out of the alchemical phase into something more like aeronautical engineering.

The major innovations in recent years have been along these lines, of course. Attention is a great example, basically supplanting RNNs for a lot of sequence modelling. Convolutions themselves are probably the ur-example. Graph convolutions will be the next major tool to be pushed into wider use. To the interested observer the stream of innovations seems not to end. But the framing that makes this all very natural is precisely that this is the union of computer programming, where coming up with new algorithms for bespoke tasks is commonplace, with automatic differentiation, which allows those algorithms to learn.
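To make that framing concrete, here is a minimal toy sketch of my own (not from the paper): an ordinary hand-written recurrence, an exponential moving average, with one parameter left free and tuned by automatic differentiation in JAX. The task and names are made up for illustration.

    # A hand-coded algorithm whose single free parameter is fit by autodiff.
    import jax
    import jax.numpy as jnp

    def smooth(decay, xs):
        # Ordinary recurrence: y_t = decay * y_{t-1} + (1 - decay) * x_t
        y = xs[0]
        ys = [y]
        for x in xs[1:]:
            y = decay * y + (1.0 - decay) * x
            ys.append(y)
        return jnp.stack(ys)

    def loss(decay, noisy, clean):
        return jnp.mean((smooth(decay, noisy) - clean) ** 2)

    key = jax.random.PRNGKey(0)
    clean = jnp.sin(jnp.linspace(0.0, 6.0, 100))           # target signal
    noisy = clean + 0.3 * jax.random.normal(key, (100,))   # observed signal

    decay = 0.5                                   # the learnable parameter
    grad_fn = jax.grad(loss)
    for _ in range(200):
        decay -= 0.1 * grad_fn(decay, noisy, clean)   # plain gradient descent

    print("learned decay:", decay)

The algorithm stays readable, hand-written code you can reason about; the learning is confined to the parameters you chose to leave free.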

What remains exciting virgin territory is how best we put these new beasts into the harness of reliable AI engineering. That is in its infancy, because how you write and debug a learning program is completely different to the ordinary sort... there are probably 10x and 100x productivity gains to be realized there from relatively simple ideas.




>> We need systems that are, by design, amenable to rigorous and lucid scientific analysis, that are debuggable, that admit simple causal models of their behavior, that are provably safe in various contexts, that can be straightforwardly interrogated to explain their poor or good performance, that suggest modification and elaboration and improvement other than adding more neurons.

You mean specifically machine learning systems with these properties. Such machine learning systems do exist and have a large body of research behind them: I'm talking about Inductive Logic Programming systems.

ILP has been around since the '90s (and even earlier without the name) and it's only the lack of any background in symbolic logic on the part of the most recent generation of neural net researchers that stops them from evaluating ILP systems, and from mining them for ideas to improve their own systems.

For a state-of-the-art ILP system, see Metagol (created by my PhD supervisor and his previous doctoral students):

https://github.com/metagol/metagol


Thanks! I see you are doing your PhD in ILP. As someone pondering the topic of my own future PhD, the obvious question is: have ILP models been enriched with automatic differentiation? How? Did it help?

Either way, can you recommend a good survey article on ILP and the last few years of progress on that front?


Hi. ILP is Inductive Logic Programming, a form of logic-based, symbolic machine learning. ILP "models" are first-order logic theories that are not differentiable.

To put it plainly, most ILP algorithms learn Prolog programs from examples and background knowledge, which are themselves also Prolog programs. Some learn logic programs in other logic programming languages, like Answer Set Programming or constraint programming languages.
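To make the setting concrete, here is a heavily simplified toy sketch of my own (in Python rather than Prolog, and nothing like the search a real ILP system such as Metagol performs): background knowledge and examples are ground facts, and "learning" means finding a clause whose coverage includes every positive example and excludes every negative one.

    # Background knowledge: parent/2 facts.
    parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}

    # Positive and negative examples of the target predicate grandparent/2.
    positives = {("alice", "carol"), ("bob", "dave")}
    negatives = {("alice", "dave"), ("carol", "alice")}

    people = {p for pair in parent for p in pair}

    # A tiny hand-written hypothesis space: each candidate clause is paired
    # with a test saying whether grandparent(X, Z) holds under that clause.
    candidates = {
        "grandparent(X,Z) :- parent(X,Z)":
            lambda x, z: (x, z) in parent,
        "grandparent(X,Z) :- parent(Z,X)":
            lambda x, z: (z, x) in parent,
        "grandparent(X,Z) :- parent(X,Y), parent(Y,Z)":
            lambda x, z: any((x, y) in parent and (y, z) in parent
                             for y in people),
    }

    def consistent(clause):
        # Accept a clause that covers all positives and no negatives.
        return (all(clause(x, z) for x, z in positives)
                and not any(clause(x, z) for x, z in negatives))

    for name, clause in candidates.items():
        if consistent(clause):
            print("learned:", name)

A real system constructs the hypothesis space from the background predicates itself (in meta-interpretive learning, via second-order metarules) rather than testing hand-written candidates, but the learning setting is the same: examples in, logic program out.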

The wikipedia page on ILP has a good general introduction:

https://en.wikipedia.org/wiki/Inductive_logic_programming

The most recent survey article I know of is the following, from 2012:

(ILP turns 20: Biography and future challenges) https://www.doc.ic.ac.uk/~shm/Papers/ILPturns20.pdf

It's a bit old now and misses a few recent developments, like learning in ASP and meta-interpretive learning (which is what I work on).

If you're interested specifically in differentiable models, in the last couple of years there has been a lot of activity, mainly from neural network researchers, on learning in differentiable logics. For an example, see this paper by a couple of people at DeepMind:

(Learning explanatory rules from noisy data) https://arxiv.org/abs/1711.04574

Edit: May I ask why automatic differentiation is the "obvious" question?


What do you think should be a good benchmark for such hybrid models? Here they created a toy dataset of simple geometric shapes with simple relationships. This is fine to start with, but we need to come up with a more realistic and useful scenario. Even MNIST for image classification is both realistic and useful. I wonder what would be an equivalent of MNIST or ImageNet for models which implement reasoning and common sense.



Looks good! So why didn't they use it in this paper?


Just read your comment; it looks like a very promising direction. I'd be glad to hear your response to my comment: https://news.ycombinator.com/item?id=20074750


"GOFAI" is a buzzword used by people who don't know how to make auditable systems, so they declare auditable systems "old fashioned" so-as to excuse themselves from learning them.


I don't know how you see that sentiment in what I wrote.


GOFAI is a slur against people who build systems responsibly; the context doesn't matter for a slur, just don't do it.


I haven't encountered it used with negative connotations, even by deep learning people. Neither does cursory Googling show it having that valence.



