
This is interesting, though a lot of the claims are highly debatable. One really stood out to me:

> the tasks you do are more complex. That’s where voice input shines.

I don't know how far into the future this is supposed to be implemented, but I have yet to find a form of voice input that's remotely accurate enough for anything "more complex". Current voice input seems to rely heavily on limiting the scope of commands, and even then it often breaks down on names and the like.



I can only imagine the joy of an open office floor plan where employees also depend on a voice based interface for their computers.


To say nothing of accessibility.


This ad ran last night during the Finals and really hit home how audio can improve accessibility:

https://www.youtube.com/watch?v=aqoXFCCTfm4&feature=youtu.be


If a talking interface becomes a popular option in addition to a visual interface, that would actually improve accessibility.


Solving that is simply a matter of developing a way for the computer to pick up inaudible or mouthed commands. Not a trivial task, but it does seem doable with current technology.


Okay, so basically control the user input with muscle movements. Hm, I wonder if some other group of muscles might be more suitable for that. How about trying, say, our fingers for starters?

edit: I've read that NASA developed a microphone that doesn't need audible sound, though, so it is doable.


Just hurry up with the neural interface please :). Plug that baby into the port on the back of the neck, close your eyes, and get things done!


Myo tried this a few years ago; looks like they're shutting down now.

https://support.getmyo.com/hc/en-us


Language is much more expressive than your fingers for certain tasks (and vice-versa). This design advocates for the use of _both_, not one or the other.


Language is more expressive, but I can press hundreds of buttons and type commands faster than I can yell expressive commands at my computer.


I think it's very unlikely that any significant number of people can type faster than they can speak.

Depending on the complexity of the task, you might very well have to press "hundreds of buttons" to achieve the same result as a single subvocalized voice command. Not to mention that speech can be far more intuitive than keyboard shortcuts or nested sub-menus for certain tasks.

Again though, this isn't about replacing keyboard input, it's about supplementing it.


Yeah, but I remember spending 20 minutes driving on the highway trying to get Google Assistant to play the next episode of my podcast (Not the most recent! The NEXT ONE. Gaah)

It would take 4 taps to do, yet it's impossible via voice.


Similarly, I still have no idea how to get Siri to add a stop on a route instead of replacing the current destination.

Even worse, if you even use the word "stop" in a sentence it doesn't understand, it cancels the current driving directions.


Chorded typists can write faster than speech; that's how court sessions get transcribed, for example.

Also, math is an example where notation lets you express so much more than you can with spoken language.

The way I see it, a shell interface is a different language in its own right, and it's so much more expressive.

Or, for example, imagine playing an FPS with voice commands.


As seen in the movie Her.


Combining an engine to interpret voice commands with textual keyboard input seems like the best approach here. You get all the useful fuzziness of voice with none of the transcription errors (or revolting workplace noise pollution).

I'm fairly well sold on some sort of "omnibar" interface concept where you just tell the computer what you want with a keyboard. Alfred, Spotlight, Google search, Wolfram Alpha, iCal's smart add (you can just type "2pm next Friday"), the Action palette in IntelliJ, VS Code and Sublime's command palette, and so on. That, just... everywhere. And if you're alone, sure, use voice instead. Just don't deprecate the keyboard: still the most reliable way to get accurate text into a computer.
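As a sketch of what that matching could look like (the command registry here is hypothetical, and a real omnibar would also rank by recency and parse parameters), simple fuzzy matching over a flat registry already goes a long way:

    import difflib

    # Hypothetical command registry: name -> action. A real omnibar would
    # also index synonyms, recency, and parameters ("2pm next Friday").
    COMMANDS = {
        "open file": lambda: print("opening file picker..."),
        "toggle dark mode": lambda: print("toggling theme..."),
        "new calendar event": lambda: print("creating event..."),
    }

    def run_omnibar(query: str) -> None:
        """Fuzzy-match free text against the registry and run the best hit."""
        hits = difflib.get_close_matches(query.lower(), list(COMMANDS), n=1, cutoff=0.4)
        if hits:
            COMMANDS[hits[0]]()
        else:
            print(f"no command matches {query!r}")

    run_omnibar("drak mode")  # the typo still resolves to "toggle dark mode"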


Exactly, voice works amazingly well as a "3rd interface" complementing keyboards and mice (which aren't completely accurate either, fwiw). Voice gives us more speed, reduces UI clutter, and can free up our hands.

I'm working on a tool that explores this area more: https://www.lipsurf.com


Oops, I was a bit unclear. I meant: keep the engine that matches English statements to commands and parameters, but hook it up to a keyboard-driven text box instead of a microphone. So it's:

written text->'AI'->command vs. speech->'AI'->command

(Edit: and sure, voice as an adjunct per my first comment.)
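A minimal sketch of that split, with a toy interpret() function standing in for the 'AI' stage (all names and the command format here are invented for illustration):

    def interpret(utterance: str) -> tuple[str, dict]:
        """Toy stand-in for the 'AI' stage: free text -> (command, params)."""
        if utterance.startswith("play next "):
            return "play_next", {"what": utterance.removeprefix("play next ")}
        return "unknown", {"raw": utterance}

    def from_keyboard(text: str):
        return interpret(text)        # written text -> 'AI' -> command

    def from_speech(transcript: str):
        return interpret(transcript)  # speech (after STT) -> 'AI' -> command

    print(from_keyboard("play next podcast episode"))
    # -> ('play_next', {'what': 'podcast episode'})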


Yes, please don't make voice input mandatory. At one place I worked, the guy in the next cube used Dragon speech recognition for navigation and typing (he's not disabled; he just thought it was hip). It was beyond annoying to hear all the command input.


Take a look at Apple's new Voice Control feature coming to iOS and macOS. You can see it on YouTube or on Apple's home page.


> Current voice input seems to rely heavily on limiting the scope of commands, and even then it often breaks down on names and the like.

Voice commands are more like a CLI (short, limited commands) than a GUI, and the CLI has been proven to be much more composable and scriptable than GUIs. AFAIK it hasn't been done, but adding stdio and pipes to a voice interface could make it shine for complex workflows where a GUI fails.
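As a toy sketch of that idea, spoken connector words could compile down to a shell pipeline; the phrase-to-command vocabulary below is entirely made up:

    import subprocess

    # Made-up mapping from spoken phrases to shell stages.
    VOCAB = {
        "list files": "ls",
        "keep pngs": "grep '\\.png$'",
        "count them": "wc -l",
    }

    def voice_to_pipeline(utterance: str) -> str:
        """Split a spoken command on 'then' and join the stages with pipes,
        mirroring stdio composition in a shell."""
        return " | ".join(VOCAB[p.strip()] for p in utterance.split(" then "))

    cmd = voice_to_pipeline("list files then keep pngs then count them")
    print(cmd)  # ls | grep '\.png$' | wc -l
    subprocess.run(cmd, shell=True)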


Eye tracking could be useful for providing the necessary context for limiting the scope of commands.

I also believe that minimizing latency and eliminating the need for hot words will make a big difference in the usefulness of voice commands for more common tasks: https://twitter.com/Google/status/1125815241026166784


Yeah, voice-based interfaces have huge potential, but maybe it's easier if we adapt ourselves to match what they're expecting. Like a shorthand for voice. We already learn seemingly random words/numbers/special characters as programming syntax; we could do that for voice too. Or maybe a 'grunt'-based interface? A grunt is pretty universal, right? :P


What do they imagine voice really doing for CAD programs, Excel, ERP, video editing...? I guess in Photoshop you could switch tools by voice without moving the brush, but that's not very compelling as an example of voice-for-complex-tasks.


It depends on the task and the UI. In some tools like CAD I need to switch tools a lot: add an element to the sketch, constrain it, repeat. Switch back into 3D, pick a new construction plane, etc... most tools have pretty distinct names and could be called out faster than the mouse acrobatics.

We aren't used to talking to things instead of humans. But I think that voice input would be the next logical step beyond the search bars for program features that have been showing up in the last few years. These search bars are halfway to freeform textual commands. Entering such commands by voice is then mostly bashing together existing technology to make something new.


<select item> "dimension as 20 millimeters and make reference" <select next item> "rotate 90 degrees and scale to radius 100 millimeters"

Sounds to me like it would make CAD far less dependent on context switches between mouse/puck/pen and keyboard than it is today.
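As a rough sketch of what parsing those commands could look like, assuming an invented grammar and operation names:

    import re

    # Invented grammar: "<verb> [as] <value> <unit>" clauses joined by "and".
    CLAUSE = re.compile(
        r"(dimension|rotate|scale to radius)\s+(?:as\s+)?([\d.]+)\s*(millimeters|degrees)?"
    )

    def parse_cad_command(utterance: str) -> list[dict]:
        """Turn one spoken phrase into operations on the current selection."""
        ops = []
        for clause in utterance.split(" and "):
            m = CLAUSE.search(clause)
            if m:
                verb, value, unit = m.groups()
                ops.append({"op": verb, "value": float(value), "unit": unit})
            elif clause.strip() == "make reference":
                ops.append({"op": "make_reference"})
        return ops

    print(parse_cad_command("dimension as 20 millimeters and make reference"))
    # [{'op': 'dimension', 'value': 20.0, 'unit': 'millimeters'},
    #  {'op': 'make_reference'}]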



