I started doing a lot better in calculus once I switched to longer notation (e.g. f = x -> x³ instead of f(x) = x³) and made sure that things "type checked". For instance, the habit of using f(x) to refer to a function, rather than just f, was very confusing to me, because f(x) is an element of the co-domain while f is a function (typically from real to real in my undergrad classes). I had to figure this out by myself because both my textbook and my prof used the notation that wouldn't type check. When I finally realized that dy/dx should instead be read as (d/dx)(f), things became a lot clearer to me: differentiation takes a function and returns a function, and f is a function, so everything checks out.
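To make the "type checking" idea concrete, here is a minimal Python sketch. The names RealFn, derivative and cube are made up for illustration, and the forward difference with a fixed eps is only a numerical stand-in for the real derivative, not a definition of it.

```python
from typing import Callable

# A real-valued function of one real variable.
RealFn = Callable[[float], float]

def derivative(f: RealFn, eps: float = 1e-6) -> RealFn:
    """Take a function and return a function (forward-difference approximation)."""
    def df(x: float) -> float:
        return (f(x + eps) - f(x)) / eps
    return df

cube: RealFn = lambda x: x ** 3   # f = x -> x³

d_cube = derivative(cube)   # a function, like (d/dx)(f)
value = d_cube(2.0)         # a number, close to 3 * 2² = 12
```

Passing cube(2.0), which is a number, to derivative is exactly the kind of mistake a type checker such as mypy would flag: the argument has to be the function itself.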
It's good to think of dy/dx as (d/dx)y. But it is also possible to make some sense of dy/dx as an actual quotient. Here's one very hand-wavy way of looking at it.
Let ε be something very small, and define the difference operator d so that (df)(x) = f(x + ε) - f(x). Usually we don't want to handle the functions dx and dy by themselves, because they are so small, and their exact values depend on ε. But when we divide dy by dx we get something that is no longer ε-sized, and doesn't (in a limit sense) depend on the value of ε.
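A quick numerical sketch of this, in Python. The helper names d and ratio, the choice y = x³ and the point x0 = 2 are all just assumptions for illustration. Here x is treated as the identity function, so (dx)(x0) is simply ε.

```python
def d(f, eps):
    """Difference operator: (d f)(x) = f(x + eps) - f(x)."""
    return lambda x: f(x + eps) - f(x)

def ratio(f, x, eps):
    """The quotient dy/dx at the point x for a given eps."""
    identity = lambda t: t          # the function "x"
    return d(f, eps)(x) / d(identity, eps)(x)

y = lambda x: x ** 3
x0 = 2.0

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    dy = d(y, eps)(x0)              # shrinks along with eps
    print(eps, dy, ratio(y, x0, eps))
```

The dy column keeps shrinking as eps does, while the ratio column settles near 12, which is the derivative of x³ at 2.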
And why think this way? When I learned the chain rule dy/dx = dy/du * du/dx, I was told that even though the du's appear to cancel out, this is just abuse of notation and basically a meaningless coincidence. I understand that the teachers just wanted students to be careful; they didn't want people "simplifying" dx/dy to x/y. However, I was never really satisfied with this explanation. I finally realized that if you think about it using the difference operator above, it is not a meaningless coincidence: the du's actually do, in a sense, cancel out.
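Here is a small Python sketch of that cancellation. The function chain_rule_factors and the example y = sin(u), u = x² are hypothetical choices for illustration. For a finite ε, dx, du and dy are ordinary numbers, so dy/dx = (dy/du) * (du/dx) holds as plain arithmetic, as long as du is not zero.

```python
import math

def chain_rule_factors(f, g, x, eps):
    """Return (dy/dx, dy/du, du/dx) built from finite differences."""
    u = g(x)
    dx = eps                      # (dx)(x) = (x + eps) - x
    du = g(x + eps) - u           # change in u caused by dx
    dy = f(g(x + eps)) - f(u)     # change in y caused by that same du
    if du == 0:
        raise ZeroDivisionError("du is zero, so dy/du is undefined here")
    return dy / dx, dy / du, du / dx

f = math.sin                      # y = sin(u)
g = lambda x: x ** 2              # u = x²

total, outer, inner = chain_rule_factors(f, g, 1.0, 1e-6)
print(total, outer * inner)       # equal up to floating-point rounding
```

The du == 0 check is the one place where the hand-waving needs care: if u does not change at all, there is nothing to cancel.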