It's definitely a work in progress - but it's where active development is currently focused.
The way this is being handled in an upcoming update involves a few steps: an OCR tool identifies math formulas, applies a bounding box to each one, and crops it out as an image. That image is then sent to a multimodal LLM, which attempts to "describe" the formula in reasonable natural language.
While not yet perfect, I anticipate this will improve quite a bit soon.
The same approach will be applied to tables, graphs, figures, and images.
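The detect-crop-describe pipeline described above could be sketched roughly as follows. This is only an illustration of the control flow, not the actual implementation: the OCR detector, the cropping step, and the multimodal-LLM call are all stubbed out with placeholder functions I've made up, each returning dummy data where a real model would run.

```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    """Pixel rectangle around a detected formula."""
    x: int
    y: int
    w: int
    h: int

def detect_formulas(page_image: bytes) -> list[BoundingBox]:
    # Stub: a real OCR/layout-analysis model would locate math regions here.
    return [BoundingBox(x=40, y=120, w=200, h=30)]

def crop(page_image: bytes, box: BoundingBox) -> bytes:
    # Stub: an image library would cut out the boxed region; we pass the
    # bytes through unchanged for illustration.
    return page_image

def describe_formula(image_bytes: bytes) -> str:
    # Stub: a multimodal LLM call would go here, prompted to produce a
    # natural-language reading of the formula rather than raw symbol names.
    return "the integral from zero to one of x squared dx"

def formulas_to_speech_text(page_image: bytes) -> list[str]:
    """Detect each formula, crop it, and ask the model to describe it."""
    crops = [crop(page_image, box) for box in detect_formulas(page_image)]
    return [describe_formula(img) for img in crops]

print(formulas_to_speech_text(b"<page image bytes>"))
```

The output list would then be spliced back into the TTS stream in place of the original formula text.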
I once listened to a (human-made) audiobook where the narrator read all the mathematical notation as the names of the symbols ("open parenthesis open parenthesis" etc., in a discussion of lambda calculus!). So knowing how to convert the notation into natural language requires domain knowledge beyond that of regular TTS. Maybe LLMs could help, but it's a problem to use an LLM for something where 100% accuracy is important, and there's no easy way to validate the output.
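The failure mode in that anecdote is easy to reproduce: a symbol-by-symbol reading is mechanical, while a good reading requires understanding what the expression means. Here's a toy illustration of the naive approach (the symbol table is my own invention, just to show the contrast):

```python
# Naive reading: name each symbol in turn, exactly as the narrator did.
SYMBOL_NAMES = {
    "(": "open parenthesis",
    ")": "close parenthesis",
    "λ": "lambda",
    ".": "dot",
    " ": "",
}

def naive_read(expr: str) -> str:
    """Read an expression aloud one symbol at a time, with no understanding."""
    words = [SYMBOL_NAMES.get(ch, ch) for ch in expr]
    return " ".join(w for w in words if w)

print(naive_read("(λx.x) y"))
# → "open parenthesis lambda x dot x close parenthesis y"
```

A domain-aware reading of the same expression would be something like "the identity function, lambda x to x, applied to y" - and producing that requires parsing the expression, not just naming its characters, which is exactly the knowledge gap the comment is pointing at.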