Yours is a common sentiment, but it is not very accurate. Deep learning is not separate from, nor a more evolved form of, machine learning. It's a learning algorithm with its own structural biases, arising from things like architecture and optimization method. There is a duality between optimization/search and sampling; they're two sides of the same coin. Instead of assigning labels, I prefer to think about the shared and unshared properties and assumptions of each algorithm. What is unique to deep learning is its ability to use gradients and credit assignment through a hierarchy, when that's possible.
Deep learning, like all learners, must make assumptions, and there are trade-offs. You trade time spent thinking up features (which reduce learning time and improve sample efficiency) for time spent thinking up architectures and the appropriate biases (graph conv, RNN, convnet, etc.). Though still limited to a particular domain, you hope to gain an architecture with broader applicability than a feature-based method. You also take on more complex training, longer waits between experiments, and a level of indirection between the problem and the learning algorithm. This means that gains from new knowledge and understanding of a domain filter through more slowly for deep learning.
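To make that trade-off concrete, here is a minimal sketch (assuming PyTorch and a toy 28x28 image task; the hand-crafted feature, layer sizes, and shapes are purely illustrative, not from any particular system). The effort goes either into the feature function feeding a shallow model, or into an architecture whose convolutional structure bakes in the bias; gradients do the credit assignment in both cases.

```python
import torch
import torch.nn as nn

# Option A: effort goes into the features. A toy "feature" here is just
# per-row and per-column intensity sums; a linear model then learns
# quickly and sample-efficiently, but only as well as the features allow.
def handcrafted_features(images: torch.Tensor) -> torch.Tensor:
    # images: (batch, 1, 28, 28) -> (batch, 56) row/column sums
    rows = images.sum(dim=3).flatten(1)   # (batch, 28)
    cols = images.sum(dim=2).flatten(1)   # (batch, 28)
    return torch.cat([rows, cols], dim=1)

shallow_model = nn.Linear(56, 10)

# Option B: effort goes into the architecture. The convolution layers bake
# in locality and translation equivariance instead of hand-picked features.
deep_model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),
)

# Same end-to-end credit assignment either way: gradients flow from the
# loss back through whatever structure you chose.
x = torch.randn(8, 1, 28, 28)           # fake batch of images
y = torch.randint(0, 10, (8,))          # fake labels
loss_a = nn.functional.cross_entropy(shallow_model(handcrafted_features(x)), y)
loss_b = nn.functional.cross_entropy(deep_model(x), y)
loss_a.backward()
loss_b.backward()
```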
Deep learning is a very experimental science. Theory usually comes after the fact and is not very filling. People try this or that and then write a show-and-tell paper, but knowledge still helps. Whereas in shallow methods learning something about how images work can directly inform your algorithm, in deep learning you now have to think about what sort of transformation will best capture the invariant you are looking for.
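For example (a hedged sketch, assuming torchvision; the rotation range is arbitrary): if you know your labels are stable under small rotations, a shallow pipeline might encode that directly as a rotation-tolerant descriptor, whereas in deep learning you typically state it indirectly, as a training-time transformation, and let the network absorb it.

```python
import torch
from torchvision import transforms

# The domain fact "small rotations don't change the label" enters as a
# data transformation rather than as a hand-built rotation-invariant feature.
augment = transforms.RandomRotation(degrees=15)

image = torch.rand(1, 28, 28)   # fake single-channel image tensor (C, H, W)
rotated = augment(image)        # same assumed label; the network must learn the invariance
```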
Deep learning (especially when you can get it end to end) has removed the expertise threshold for impressive results (the bigger your budget, the better). But if it were really as magical as some people make it out to be, it would already be obsolete as a field, since there would be minimal need for human expertise.