If anyone is wondering how this might be done without Flash/Java, the WHATWG has spec'd recording for the HTML5 Stream API, but I don't think it's properly defined or implemented in any browser yet:
Among other things, it doesn't yet define what codecs would be used in the audio capture. There are also a lot of comments in the draft that amount to "we'll figure out how this part would work later."
Phono is intended to be a unified jQuery API for real time communications (audio, video, chat, screen sharing). The Flash movie is an implementation detail -- the API is intended to be implementable with other media capture methods when they're available, including HTML5.
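To make the "implementation detail" point concrete, here's a rough sketch of how a library could hide the capture method behind one interface and pick whichever backend is available at runtime. Every name here (`CaptureBackend`, `makeBackend`, `pickBackend`) is made up for illustration and is not Phono's actual API; the availability flags are hard-coded stand-ins for real feature detection.

```typescript
// Hypothetical sketch: none of these names come from Phono itself.
interface CaptureBackend {
  name: string;
  isAvailable(): boolean;
  start(): string;
}

// Builds a stand-in backend. A real one would detect the Flash plugin
// or a browser capture API instead of taking a hard-coded flag.
function makeBackend(name: string, available: boolean): CaptureBackend {
  return {
    name: name,
    isAvailable: () => available,
    start: () => "capturing via " + name,
  };
}

// Returns the first backend, in preference order, that reports itself available.
function pickBackend(backends: CaptureBackend[]): CaptureBackend | undefined {
  for (const backend of backends) {
    if (backend.isAvailable()) {
      return backend;
    }
  }
  return undefined;
}

// Prefer HTML5 when it exists, otherwise fall back to Flash.
const chosen = pickBackend([
  makeBackend("html5", false),
  makeBackend("flash", true),
]);

if (chosen !== undefined) {
  console.log(chosen.start());
}
```

Calling code only ever sees `CaptureBackend`, so an HTML5 backend could be put first in the list later without changing anything that uses it.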
They're referring to the fact that Flash is the only way to record audio from the user's mic at this time. The spec linked by the grandparent outlines an HTML5 API for accessing the microphone.
Flash is also the only way to record video from a webcam at this time, as far as I know.
Technically Java applets (or custom browser plugins) should be able to work as well, but I don’t see much of an advantage over Flash (well, probably compatibility with 64-bit browsers). Certainly Flash is more common.
Java's security model means you'd need a HUGE Java applet to do this. And a custom plugin requires that you get people to install it. Works if you're Google (that's how they do Voice in Gmail), but probably not if you're Joe's Bait Shop wanting to add a click-to-call application.
I tried it. The website said it was ringing before my phone rang, then it said connected before my phone rang. Then my phone rang and I picked up, but no voice was transmitted in either direction. Interesting concept but it doesn't work for me.
I am very excited about the prospect of Phono Gateway :). I have a service that's begging for Phono (as a resell/addon), but I need Phono Gateway to deploy it internally as a proof-of-concept.
Has "jQuery plugin" just become a euphemism for "this project requires jQuery and adds itself to the $ object"?