Since you mentioned FFT, I wondered whether GPU-based computation would be useful for audio DSP. But then, lots of audio DSP operations run perfectly well even on a low-end CPU. Also, as one tries to reduce latency, the buffer size decreases, thus reducing the potential to exploit parallelism, and I'm guessing that going to the GPU and back adds latency. Still, I guess GPU-based audio DSP could be useful for batch processing.
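To put rough numbers on the buffer-size point: a buffer of N frames at sample rate fs has to be produced within N / fs seconds. A quick sketch, where the 48 kHz rate and the buffer sizes are illustrative assumptions rather than figures from this thread:

```python
# Time budget per audio callback: a buffer of N frames at sample rate fs
# must be produced within N / fs seconds, or the output glitches.
SAMPLE_RATE_HZ = 48_000  # assumed for illustration


def buffer_duration_ms(frames: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    # Illustrative buffer sizes, smallest (lowest latency) first.
    for frames in (64, 128, 256, 512, 1024):
        print(f"{frames:5d} frames -> {buffer_duration_ms(frames):6.2f} ms")
```

At 64 frames the whole budget is about 1.3 ms, which leaves very little room for a round trip to the GPU, and 64 samples is not much work to parallelize in the first place.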
Yes. Probably the single most effective use of the GPU for audio is convolution reverb, for which a number of plugins exist. The problem is that GPUs are optimized for throughput rather than latency, and it gets worse when multiple applications (including the display compositor) are contending for the same GPU - it's not uncommon for a dispatch to wait multiple milliseconds just to be scheduled.
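For anyone wondering why reverb of this kind suits the GPU: it is the dry signal convolved with a room's impulse response, and doing that through an FFT reduces it to transforms plus a large elementwise multiply, which is bulk data-parallel work. Below is a minimal pure-Python sketch of FFT-based convolution. The function names are my own, it runs on the CPU, and it processes the whole signal in one go, whereas low-latency implementations typically use partitioned convolution, which this does not attempt.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized). len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(signal, impulse_response):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    out_len = len(signal) + len(impulse_response) - 1
    size = 1
    while size < out_len:  # pad to a power of two to avoid circular wrap-around
        size *= 2
    a = fft(list(signal) + [0.0] * (size - len(signal)))
    b = fft(list(impulse_response) + [0.0] * (size - len(impulse_response)))
    # Convolution in time is an elementwise product in frequency.
    product = [p * q for p, q in zip(a, b)]
    y = fft(product, inverse=True)
    # The inverse transform above is unnormalized, so divide by size here.
    return [v.real / size for v in y[:out_len]]


def direct_convolve(signal, impulse_response):
    """Reference O(N*M) convolution, used only to check the FFT version."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


if __name__ == "__main__":
    dry = [1.0, 2.0, 3.0]   # toy "dry" signal
    ir = [0.0, 1.0, 0.5]    # toy impulse response: a delay plus a quieter echo
    print(fft_convolve(dry, ir))
    print(direct_convolve(dry, ir))
```

The two printed lists should agree up to floating-point rounding. The catch for real-time use is the same one described above: the larger the FFT blocks, the better the throughput, but the longer you wait before you have a block to process.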
I think there's potential for interesting things in the future. There's nothing inherently preventing a more latency-optimized GPU implementation, and I personally would love to see that for a number of reasons. That would unlock vastly more computational power for audio applications.
There's also machine learning, of course. You'll definitely be seeing (and hearing) more of that in the future.