Hacker News

Gradient descent is used to solve optimization problems, and those arise in many, many cases unrelated to ML. Look into the history of this field a little, and notice how it predates ML by decades (nearly 180 years, in the case of gradient descent specifically).

A great deal of applied mathematics comes down to finding the minimum or maximum of some quantity. Constructive methods don't always exist; sometimes (often) there is no better way than to step through a generic iterative optimization method.
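To make "stepping through a generic optimization method" concrete, here is a minimal gradient descent sketch on a toy objective f(x) = (x - 3)^2, whose minimizer we already know in closed form (the function, step size, and iteration count are illustrative, not from the comment):

```python
def grad(x):
    return 2.0 * (x - 3.0)   # derivative of f(x) = (x - 3)^2

x = 0.0                      # arbitrary starting point
for _ in range(200):
    x -= 0.1 * grad(x)       # step against the gradient
# x is now very close to the minimizer 3.0
```

The same loop works unchanged for any differentiable objective; only `grad` changes.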

Some quick examples clearly unrelated to ML, and very common as they relate to CAD (everywhere from in silico studies to manufacturing) and computer vision:

- projecting a point onto a surface

- fitting a parametric surface through a point cloud
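The first example above can be sketched directly as gradient descent: project a point onto a parametric surface by minimizing the squared distance over the surface parameters. The paraboloid S(u, v) = (u, v, u² + v²), the target point, and the step size are all illustrative assumptions:

```python
import numpy as np

# Hypothetical parametric surface (a paraboloid): S(u, v) = (u, v, u^2 + v^2).
def surface(u, v):
    return np.array([u, v, u**2 + v**2])

def project_point(p, lr=0.05, steps=5000):
    """Project p onto the surface by gradient descent on the
    squared distance f(u, v) = ||S(u, v) - p||^2."""
    u, v = 0.0, 0.0          # arbitrary starting parameters
    for _ in range(steps):
        d = surface(u, v) - p
        # Chain rule: grad f = 2 * J^T d, with J the surface Jacobian
        # dS/du = (1, 0, 2u), dS/dv = (0, 1, 2v).
        du = 2 * (d[0] + d[2] * 2 * u)
        dv = 2 * (d[1] + d[2] * 2 * v)
        u -= lr * du
        v -= lr * dv
    return u, v

p = np.array([0.5, -0.3, 2.0])
u, v = project_point(p)
foot = surface(u, v)         # foot of the projection on the surface
```

No ML anywhere, yet the workhorse is the same descent loop.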

Another example is non-linear PDEs. Notable cases are the Navier-Stokes equations, non-linear elasticity, and reaction-diffusion; these are used across many industries. To solve a non-linear PDE, a residual is minimized using, typically, quasi-Newton methods (gradient descent's buff cousin). This is because numerical schemes only exist for linear equations, so you must first recast the problem as something linear (or a succession of linear problems, as it were).
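The "succession of linear problems" idea is exactly Newton's method: each iteration solves the linearized model around the current iterate. A scalar sketch (the equation x = cos x is a toy stand-in, not one of the PDEs above; in the PDE setting the division becomes a linear system solve):

```python
import math

def newton(F, dF, x0, tol=1e-12, max_iter=50):
    """Newton's method: each step solves the *linear* model
    F(x_k) + F'(x_k) * (x - x_k) = 0 for the next iterate."""
    x = x0
    for _ in range(max_iter):
        step = F(x) / dF(x)   # the linear solve (scalar case: a division)
        x -= step
        if abs(step) < tol:
            break
    return x

# Toy non-linear equation: x = cos(x)  <=>  F(x) = x - cos(x) = 0.
root = newton(lambda x: x - math.cos(x), lambda x: 1 + math.sin(x), x0=1.0)
```

Quasi-Newton methods follow the same pattern but approximate dF instead of computing it exactly.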

By the way, I might add that most PDEs can be equivalently recast as optimization problems.
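As an illustration of that equivalence, the 1D Poisson problem -u'' = f with zero boundary values can be recast as minimizing a discrete energy J(u) = ½ uᵀAu - bᵀu, whose gradient is the residual Au - b; gradient descent on J then solves the PDE. The grid size and step size below are illustrative:

```python
import numpy as np

# Toy 1D Poisson problem -u'' = f on (0, 1), u(0) = u(1) = 0, recast as
# minimizing J(u) = 0.5 * u^T A u - b^T u with A the finite-difference
# Laplacian; grad J(u) = A u - b.
n = 10                      # interior grid points
h = 1.0 / (n + 1)
A = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
b = np.ones(n)              # right-hand side f = 1

u = np.zeros(n)
lr = 0.003                  # must stay below 2 / lambda_max(A)
for _ in range(2000):
    u -= lr * (A @ u - b)   # gradient descent on the energy

u_direct = np.linalg.solve(A, b)   # same answer via a direct linear solve
```

In practice one would use a direct or Krylov solver here, but the point is that the optimization view and the equation view coincide.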

Yet another is inverse problems: imaging (medical, non-destructive testing...), parameter estimation (e.g. subsoil imaging), and even shape optimization. Similarly, optimal control (similar in that it minimizes a quantity under PDE constraints).
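A stripped-down parameter estimation sketch: recover an unknown decay rate k from observations of y(t) = exp(-k t) by gradient descent on the data misfit. The model, true value, and step size are illustrative, and the data are noise-free for simplicity:

```python
import numpy as np

# Hypothetical inverse problem: estimate the decay rate k of y(t) = exp(-k t)
# from measurements, by minimizing the sum of squared residuals.
t = np.linspace(0.0, 3.0, 20)
k_true = 1.3
y_obs = np.exp(-k_true * t)        # synthetic, noise-free measurements

def misfit_grad(k):
    r = np.exp(-k * t) - y_obs     # residuals model - data
    return np.sum(2 * r * (-t) * np.exp(-k * t))

k = 0.5                            # initial guess
for _ in range(2000):
    k -= 0.05 * misfit_grad(k)     # gradient descent on the misfit
```

Real inverse problems add regularization and far larger parameter spaces, but the misfit-minimization structure is the same.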

To summarize: almost every time you seek to solve a non-linear equation of any kind (and there are many completely unrelated to ML), numerical optimization is right around the corner. And whenever you seek "the best", "the least", or "the most" of something, that is optimization. Clearly, this comes up all the time.

I think I've covered a broad enough set of fields with ubiquitous applications to make it clear that optimization is omnipresent and used considerably more often than ML is. As you can see, there is no association from optimization to ML or AI, although there is one the other way around (much as a bird is not a chicken).



Right, but gradient descent is not used for non-linearity. The neural net is linear. Gradient descent is used because of sheer complexity. That's why you know it's AI.



