Yes, it is good, but not good enough for many applications. You're also left with the issue that one kind of "apple" is more common than the other kind of "apple", so the baseline accuracy of something that always assumes the more common kind might be surprisingly good.
That said, text-to-speech is a system where it's important to disambiguate a particular set of words. For instance,
"I read the news today, oh boy", "read" sounds like "red"
"I read the news every day", "read" sounds like "reed"
You need to disambiguate the word sense to correctly read the word "read". There are maybe 20 or so very common words like this, so a modest amount of work in this area would be part of a good TTS system.
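To make that concrete, here is a toy sketch of what the "modest amount of work" could look like for the single word "read". Everything in it is an assumption for illustration: the cue-word lists, the function name pronounce_read, and the choice to fall back to "reed" on a tie are all made up, and a real TTS front end would more likely use a part-of-speech tagger or a trained classifier.

```python
import re

# Hypothetical cue words, picked by hand for this illustration only.
PAST_CUES = {"yesterday", "today", "ago", "already", "had", "have", "has", "last"}
PRESENT_CUES = {"every", "always", "usually", "often", "will", "to", "can", "never"}


def pronounce_read(sentence: str) -> str:
    """Guess whether "read" in the sentence sounds like "red" or "reed".

    Counts past-tense and present-tense cue words anywhere in the sentence
    and picks the majority. Ties fall back to "reed" (an arbitrary default).
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    past = sum(token in PAST_CUES for token in tokens)
    present = sum(token in PRESENT_CUES for token in tokens)
    return "red" if past > present else "reed"


if __name__ == "__main__":
    for text in ("I read the news today, oh boy", "I read the news every day"):
        print(f"{text!r}: 'read' sounds like {pronounce_read(text)!r}")
```

This gets the two example sentences right, but it is easy to fool ("I read to my kids yesterday" is a tie and falls back to the default), which is exactly why the baseline question matters.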