Did you even watch the video? It's just baffling how I have to spell this out. ...

barrell · on May 14, 2024

Yes, I watched the demo. True, those things were not possible before, so if that’s what’s blowing you away then fair enough I guess. For me, that doesn’t affect anything I have ever used voice for or probably ever will use voice for.

I’ve voice chatted with ChatGPT for hundreds of hours and never once thought “can you modulate your tone please?”, so those improvements are a far cry from magic or revolutionary imho. Again, that’s not to say they aren’t cool tech, forward advancements, or impressive, but magic or revolutionary are pretty high bars.

To each their own though.

famouswaffles · on May 14, 2024

Few people are going to say "modulate your tone" in a vacuum, sure, but that doesn't mean that ability, along with being able to manipulate all other aspects of speech, isn't an incredible advance that is going to be very useful.

Language learning, audiobook narration that is far more involved, you could probably generate an audio drama, actual voice acting, even just not needing to get all my words in before it prompts the model with the transcribed text, conversation that doesn't feel like someone is reading a script.

And that's just voice.

This is the kind of interaction that's possible now. https://www.youtube.com/watch?v=_nSmkyDNulk

And no, thumbing the pause button, sending an image and going back does not even begin to compare in usability.

Great leaps in usability are a revolution in themselves. GPT-3 existed for years, so why did ChatGPT explode when it did? You think it was intelligence? No. It was the usability of the chat interface.