ElevenLabs has a feature for "full cast"-style generation, where different characters get different voices. It's certainly not automatically sensitive to dialect, though.
It's probably possible to do with current systems, though. I believe there are TTS systems that can use context/prompting to change emphasis and other speech qualities, though I'm not sure how reliably.
I’m sure it’s doable. I think you’d want to break it into a few discrete steps for the best quality. First, process the book and identify key info like genre and tone. Use that to pick the best voice(s) and reading style, and assign voice actors to the different characters/subjects. Maybe output some samples to spot-check for approval, tweak based on that, then generate the full audio. There are probably a couple of other steps in there, and maybe a bit of custom work to optimize in key areas. If someone wants to do this as a side project I can help scope the architecture and process, but I don’t want to code it. :p
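For anyone who does pick this up, here is a rough sketch of how those steps could fit together. Everything in it is an assumption for illustration: the names (`analyze_book`, `plan_casting`, `script_lines`, `render`), the "Name: line" dialogue format, and the `synthesize` callable, which stands in for whatever TTS service you end up using. It is not any real product's API, and the analysis step is a placeholder where an LLM or classifier would probably go.

```python
from dataclasses import dataclass, field


@dataclass
class BookProfile:
    genre: str
    tone: str
    characters: list[str]


@dataclass
class CastingPlan:
    narrator_voice: str
    reading_style: str
    character_voices: dict[str, str] = field(default_factory=dict)


def analyze_book(text: str) -> BookProfile:
    """Step 1: identify genre, tone, and speaking characters.

    Placeholder: genre and tone are hard-coded, and speakers are guessed
    from an assumed 'Name: dialogue' line format.
    """
    characters: list[str] = []
    for line in text.splitlines():
        name, sep, _ = line.partition(":")
        name = name.strip()
        if sep and name.istitle() and name not in characters:
            characters.append(name)
    return BookProfile(genre="unknown", tone="neutral", characters=characters)


def plan_casting(profile: BookProfile, voice_pool: list[str]) -> CastingPlan:
    """Step 2: pick narrator voice and reading style, assign a voice per character."""
    if not voice_pool:
        raise ValueError("voice_pool must contain at least one voice")
    narrator, *rest = voice_pool
    actors = rest or [narrator]  # fall back to the narrator voice if the pool is tiny
    plan = CastingPlan(narrator_voice=narrator, reading_style=profile.tone)
    for i, name in enumerate(profile.characters):
        plan.character_voices[name] = actors[i % len(actors)]  # reuse voices if needed
    return plan


def script_lines(text: str, plan: CastingPlan):
    """Yield (voice, text) pairs; non-dialogue lines go to the narrator."""
    for line in text.splitlines():
        if not line.strip():
            continue
        name, sep, speech = line.partition(":")
        if sep and name.strip() in plan.character_voices:
            yield plan.character_voices[name.strip()], speech.strip()
        else:
            yield plan.narrator_voice, line.strip()


def render(text: str, plan: CastingPlan, synthesize, limit=None) -> list[bytes]:
    """Steps 3-4: call synthesize(voice, style, text) per line.

    Pass a small `limit` to produce spot-check samples before a full run.
    """
    clips = []
    for i, (voice, content) in enumerate(script_lines(text, plan)):
        if limit is not None and i >= limit:
            break
        clips.append(synthesize(voice, plan.reading_style, content))
    return clips
```

The idea would be to run `render(..., limit=3)` first, listen to the samples, adjust the `CastingPlan` by hand if something sounds off, and only then render the whole book.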