
I may have chosen a bad example in the form of reconstructing 3D geometry. But take sentence parsing instead. If I parse a sentence using classical techniques, I know exactly how that works and exactly which sentences would or wouldn't be accepted. That level of understanding isn't (currently) possible with deep learning techniques, even if in some cases they perform better.
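
To make that concrete with a toy example - the grammar below is made up for illustration, but every accept/reject decision follows mechanically from the listed rules, which is exactly the kind of understanding I mean:

    # A toy CKY recognizer over a tiny hand-written CFG (illustrative only).
    # Accept/reject is fully determined by these rules -- nothing is learned.

    LEXICON = {               # word -> preterminal categories
        "the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"},
    }
    BINARY_RULES = {          # (left child, right child) -> parent categories
        ("Det", "N"): {"NP"},
        ("V", "NP"): {"VP"},
        ("NP", "VP"): {"S"},
    }

    def accepts(sentence, start="S"):
        words = sentence.split()
        n = len(words)
        # chart[i][j] = set of categories spanning words[i:j]
        chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            chart[i][i + 1] = set(LEXICON.get(w, ()))
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for left in chart[i][k]:
                        for right in chart[k][j]:
                            chart[i][j] |= BINARY_RULES.get((left, right), set())
        return start in chart[0][n]

    print(accepts("the dog saw the cat"))   # True  -- derivable from the rules
    print(accepts("dog the saw cat the"))   # False -- no derivation exists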

It's true that some problems are so ill-defined that all you can judge is whether or not a particular technique succeeds over a sample of interesting cases. But not all problems are like that.

>what methods of reconstructing complex 3D geometry from real world photographs have some way of proving that their transformation is correct within certain bounds?

The issue I have with this is that it's essentially giving up on understanding how reconstruction of 3D geometry works. One might at least hope that the techniques that make it possible to do this from real-world photographs are, with some idealization, the same techniques that make it possible to do this (nondeterministically) from a 2D perspective rendering of a 3D scene made of polygons. And we certainly can prove results about the effectiveness of those techniques. I think it's far too early to give up on that possibility and just say "it's all a mess, and whatever methods happen to work happen to work".
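
As a sketch of the idealized end of that spectrum: with known camera matrices and noise-free correspondences, two-view linear triangulation provably recovers the 3D point exactly. The cameras and the point below are made up for illustration:

    import numpy as np

    # Idealized two-view triangulation (DLT): given two known 3x4 camera
    # matrices and a noise-free correspondence, the 3D point is the null
    # vector of a small linear system -- a result we can actually prove.

    def triangulate(P1, P2, x1, x2):
        A = np.vstack([
            x1[0] * P1[2] - P1[0],
            x1[1] * P1[2] - P1[1],
            x2[0] * P2[2] - P2[0],
            x2[1] * P2[2] - P2[1],
        ])
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        return X[:3] / X[3]          # back to inhomogeneous coordinates

    # Two hypothetical cameras: identity pose, and one translated along x.
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

    def project(P, X):
        x = P @ np.append(X, 1.0)
        return x[:2] / x[2]

    X_true = np.array([0.3, -0.2, 4.0])
    X_hat = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
    print(np.allclose(X_hat, X_true))   # True in the noise-free, idealized case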

>we know how to make misleading objects that produce visual illusions to appear to have a much different 3D shape than they do?

But that supports my point, I think. We can prove that those objects have the propensity to give rise to illusions given certain assumptions regarding the algorithms used to reconstruct the scene. We can't (yet) prove what kinds of objects would fool a deep learning model.



Okay, let's take sentence parsing instead; I've got much more background in that. If we're looking at 'classical' in the sense of techniques popularized pre-neural networks, some 10+ years ago - e.g. something like Charniak-Johnson or Nivre's MaltParser, generally augmented with all kinds of tricks: model ensembles, transfer learning, custom preprocessing for structured data such as dates, and a whole NLP pipeline before the syntax part even starts (all of this was pretty much a must-have in any real usage) - then all the same criticisms apply. The factors that the statistical model learns aren't human-understandable, and the concept of "accepted sentences" is meaningless (IMHO rightfully so): the parser accepts everything, and the real question is the proper interpretation/ranking of potentially ambiguous locations. Even simple methods such as lexicalized PCFGs fall into this bucket; pretty much all the "knowledge" is embedded in the learned probabilities, and there isn't much meaningful interpretability lost by switching from a PCFG to neural networks.
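
To make the ranking point concrete, here's a toy sketch: a PCFG doesn't accept or reject "I saw the man with the telescope", it just scores the two attachments. The rule probabilities are invented for the example; a lexicalized model would further condition them on head words:

    # Toy illustration: the PCFG assigns probabilities to both readings of an
    # ambiguous sentence and ranks them. Probabilities below are invented.

    RULE_P = {
        ("VP", ("V", "NP")):        0.5,
        ("VP", ("V", "NP", "PP")):  0.2,   # PP attaches to the verb (instrument)
        ("NP", ("Det", "N")):       0.6,
        ("NP", ("NP", "PP")):       0.2,   # PP attaches to the noun (modifier)
        ("PP", ("P", "NP")):        1.0,
    }

    def tree_prob(rules_used):
        p = 1.0
        for r in rules_used:
            p *= RULE_P[r]
        return p

    # Shared parts of the tree (subject, lexical items) are omitted; they are
    # identical in both readings and don't affect the ranking.
    # Reading 1: [VP saw [NP the man] [PP with the telescope]]   (instrument)
    instrument = [("VP", ("V", "NP", "PP")), ("NP", ("Det", "N")),
                  ("PP", ("P", "NP")), ("NP", ("Det", "N"))]
    # Reading 2: [VP saw [NP [NP the man] [PP with the telescope]]] (modifier)
    modifier = [("VP", ("V", "NP")), ("NP", ("NP", "PP")), ("NP", ("Det", "N")),
                ("PP", ("P", "NP")), ("NP", ("Det", "N"))]

    print(round(tree_prob(instrument), 4))   # 0.072
    print(round(tree_prob(modifier), 4))     # 0.036 -- ranked below "instrument"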

On the other hand, if we think of 'classical techniques' as something a textbook would describe as an example, e.g. a manually, carefully built non-lexicalized CFG, then these fall under the "highly simplified artificial case" I talked about earlier - they provide a clean, elegant, understandable solution to a small subset of the problem, a subset that no one really cares about. They simply are nowhere near competitive on any real-world data: they either "don't accept" a large portion of even very clean data, or produce a multitude of interpretations while lacking the power to rank them anywhere near as well as state-of-the-art pipelines driven by statistical learning or, recently, neural networks.

Furthermore, syntactic parsing of sentences runs into exactly these theoretical limits on provability - there is no good source of "truth", and no good source of truth is possible. If you follow a descriptive linguistic approach, then English (or any other language) can only be defined by lots of examples; and if you follow a prescriptive approach (which could be turned into some finite formal model), then you get plenty of "ungrammatical" sentences even in actual literary English, e.g. reviewed and corrected works by respected authors - and even more so in any real-world text you're likely to encounter. Careful human annotators disagree on some 3% of primitive elements, i.e. roughly once every 2-3 sentences - how can you prove that any model will get the correct answer if, for almost half the sentences, we cannot agree on what exactly the correct answer is?
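
For the record, the "almost half" is simple arithmetic if you assume roughly 20 primitive elements per sentence and independent disagreements - both assumptions are mine, just to illustrate:

    # If annotators disagree on ~3% of primitive elements, and a sentence has
    # ~20 of them (assumed average, independent disagreements), then the chance
    # that a sentence contains at least one disagreement is:
    p_disagree = 1 - 0.97 ** 20
    print(round(p_disagree, 2))   # ~0.46 -- nearly half of all sentences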


Of course it is not possible to prove that a particular algorithm "does the right thing", because "doing the right thing" is an inherently vague notion. The issue with neural networks is that we often don't understand how they work even under idealized conditions. In the case of the PCFG, we can characterize precisely which sentences will or won't be parsed for a given parsing algorithm. We are never going to have an explanation of how real world parsing works because the real world is too complicated. But we might hope to figure out how the techniques that work in the real world can be understood as extensions of techniques that work in idealized conditions. The PCFG is a good example of that. There's nothing to understand about the probabilities. As you say, they're amalgamations of an indefinite number of real-world factors. But there is a core to the parsing algorithm that we do understand.
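
And that core is essentially a dynamic program like Viterbi CKY: whatever real-world mess the learned probabilities encode, the algorithm provably computes the probability of the best parse under those rules. A minimal sketch with a toy grammar and invented probabilities:

    from collections import defaultdict

    # Sketch of the understandable core: Viterbi CKY over a toy PCFG.
    # The probabilities are invented; the algorithm is the part we understand.

    LEX_P = {                       # P(word | preterminal)
        ("Det", "the"): 1.0, ("N", "dog"): 0.5, ("N", "cat"): 0.5,
        ("V", "saw"): 1.0,
    }
    RULE_P = {                      # P(children | parent), binary rules only
        ("NP", ("Det", "N")): 1.0,
        ("VP", ("V", "NP")): 1.0,
        ("S", ("NP", "VP")): 1.0,
    }

    def best_parse_prob(words, start="S"):
        n = len(words)
        best = defaultdict(float)   # (i, j, category) -> best probability
        for i, w in enumerate(words):
            for (cat, word), p in LEX_P.items():
                if word == w:
                    best[i, i + 1, cat] = max(best[i, i + 1, cat], p)
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for (parent, (l, r)), p in RULE_P.items():
                    for k in range(i + 1, j):
                        score = p * best[i, k, l] * best[k, j, r]
                        best[i, j, parent] = max(best[i, j, parent], score)
        return best[0, n, start]

    print(best_parse_prob("the dog saw the cat".split()))   # 0.25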


People have used genetic algorithms to evolve sets of images that fool DL machine vision models. Your point stands, though: I don't know of any deterministic method that can generate images optimized to fool DL models, and that does point to these being black-box models.
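
Roughly the shape of those experiments, as a sketch: treat the network as a black box, score candidate images only by the confidence it reports, and keep mutating the best ones. The "model" below is a stand-in stub I made up to show the loop, not the actual papers' setup:

    import random

    # Black-box evolutionary search of the kind used to fool image classifiers:
    # no gradients, no proofs -- just mutate pixels and keep whatever raises the
    # model's reported confidence. The "model" here is a stand-in stub; a real
    # experiment would query an actual network's softmax output instead.

    SIZE = 8 * 8                      # tiny grayscale "image" for illustration

    def model_confidence(image):
        # Hypothetical stand-in for a classifier's confidence in a target class.
        return sum(image) / len(image)

    def mutate(image, rate=0.1):
        return [min(1.0, max(0.0, px + random.uniform(-0.3, 0.3)))
                if random.random() < rate else px for px in image]

    def evolve(generations=200, pop_size=20):
        population = [[random.random() for _ in range(SIZE)]
                      for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=model_confidence, reverse=True)
            parents = population[: pop_size // 4]       # keep the best quarter
            population = parents + [mutate(random.choice(parents))
                                    for _ in range(pop_size - len(parents))]
        return max(population, key=model_confidence)

    best = evolve()
    print(round(model_confidence(best), 3))   # climbs toward 1.0 for this stub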



