This is incredible. I took a talk recording I made for NeurIPS and passed it through this tool. The improvement in audio quality was night and day. It went from clearly being recorded in a bedroom to a studio-like experience.
The result is nowhere near studio quality. It's highly compressed and full of artifacts. I found the original to be more pleasant to listen to. It at least sounds natural.
I was very excited about this, and then I tried it. I gave a talk at a local meetup 8 years ago [0] that wasn't mic'd properly, but I uploaded it anyways. It was never the most pleasant thing to listen to, but the audio was understandable enough that for years it was a top search result for "javascript promises" on YouTube and accumulated 39K views.
However, when I ripped that video's audio and put it through the linked AI filter (Adobe Podcast's Enhance Speech), it went from being unpleasant-but-understandable to perfectly clear gibberish [1]. For example, about 15 seconds in I said:
> [...] at Second Street, the company whose office you are currently sitting in. Uh, you can find me on the web on GitHub, on Twitter, or my own site at Kerrick Long Dot Com.
In the "enhanced" version, it sounds like a person who isn't me said the following in soundproofed studio (with a head cold):
> [...] at Second Street, the cubbany od office you are currently city-aly. You gidhi me on the web on GitDub, on Twitter, or my own side at Kerri Flow Dot Cowf.
This is a 20-year-old video shot at Pearl Harbor on the deck of the USS Missouri with a low-end consumer Hi-8 camera using its built-in mics. Notice the flags and clothes rippling in the wind. This filter works pretty well in this case. Maybe it was trained on a New York accent?
Here's another video, shot at a brewery, where this filter helped clarify the brewmaster's voice over a noisy restaurant and outdoor machinery (again, built-in mic). Mixing in the enhanced track as needed works well. Using only the enhanced track may sound artificial at times.
> Mixing in the enhanced track as needed works well. Using only the enhanced track may sound artificial at times.
Funnily enough, when using this sort of model for speech-recognition tasks, the literature often recommends mixing some of the original noisy audio back in. This reduces the impact of artifacts introduced by the enhancement, which would otherwise hurt ASR quality due to domain shift in the data.
Guess humans and computers have similar needs in this case. :)
These are impressive results; the audio mostly sounds like you gave the guy a lapel mic. :P
Having an unlisted version on YouTube from before the audio was filtered would make it easier to hear the true difference, but I agree the quality is impressive as is.
Thanks! Honestly, I liked the post-filtering audio with no "natural audio" mixed back in. I get the reasoning, but still. Did you do testing with anyone to gauge viewer preference? If so, how?
This filter is not perfect, so you'll have a few audio artifacts that you won't want in your final mix. Mixing with the original audio hides these artifacts somewhat. Using this filter also increases your audio-mixing effort (you're mixing in a second vocal track), so it's easiest to mix the two tracks at constant levels and then adjust when you hit an artifact. This is my personal preference for this style of audio mixing, and others who watched (and gave feedback) couldn't tell it was enhanced and had no difficulty understanding the speaker.
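If you want to do that blend outside a DAW, here's a minimal sketch of the constant-level mix (assuming both files share a sample rate and channel layout and are already time-aligned; the filenames and the 80/20 ratio are just placeholders to adjust by ear):

```python
# Blend an AI-enhanced vocal track with the original recording at a fixed ratio.
# Assumes both files have the same sample rate/channels and are time-aligned.
import numpy as np
import soundfile as sf

enhanced, sr = sf.read("enhanced.wav", dtype="float64")
original, sr_orig = sf.read("original.wav", dtype="float64")
assert sr == sr_orig, "sample rates must match before mixing"

# Trim to the shorter length so the arrays line up sample-for-sample.
n = min(len(enhanced), len(original))
mix = 0.8 * enhanced[:n] + 0.2 * original[:n]  # constant levels; adjust to taste

# Guard against clipping, then write the blended track.
mix /= max(1.0, np.max(np.abs(mix)))
sf.write("blended.wav", mix, sr)
```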
To push it to the limit, I recorded exactly the same thing with my phone microphone and my AT875R XLR shotgun mic. I did this because my phone microphone is poor and picks up a lot of echo. Results are as follows:
- If the microphone quality itself is bad, the enhanced audio is still pretty horrible.
- It does clean up echo, but there's some pretty aggressive EQ that doesn't sound nice, and the noise gate is pretty severe
- Compared to my XLR shotgun, the quality of the phone was pretty horrible
What we can conclude is that if you already have a good recording but with some problems, you might be able to use this to remove those problems. However, don't expect a crappy microphone to turn into a good microphone, or a crappy recording to turn into "studio quality".
The bottom line is that there's no substitute for a decent microphone in a decent space. (At the very minimum, small room without echo.)
Look at all the money that's been going into making phones compelling substitutes for professional cameras. And camera tech is still evolving. Consumer audio devices will get the same attention and investment.
We'll eventually have models for audio signals in all sorts of distorted and noisy environments. I'd bet that in ten years a cellphone microphone can duplicate a professional audio setup in 90% of circumstances.
OP should clean both microphones on their phone; usually a sewing needle or a thin toothpick can do the trick, but 99% isopropyl alcohol on a toothbrush might be needed afterwards if the grill inside that protects the microphone is also clogged up.
Both microphones need to be cleaned of any blockages so the hardware echo and noise cancellation on a given phone works well. Otherwise you've got distorted audio getting processed as if it's not distorted...
Important question because a microphone array (which exists on some phones, and things like home voice assistants) can be steered into the equivalent of a shotgun mic's pattern, or even more focused than that. It's just that an algorithm aims it toward the strongest signal, instead of the user aiming a hypercardioid mic manually. Either way, this is what reduces the ratio of reverberant room sound ("echo").
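For anyone curious what "steering" means concretely, here is a toy delay-and-sum sketch (made-up linear-array geometry and integer-sample delays; real phone and voice-assistant firmware uses far more sophisticated adaptive beamformers):

```python
# Toy delay-and-sum beamformer: delay each mic's signal so sound arriving
# from the chosen direction lines up, then average. Off-axis sound stays
# misaligned and partially cancels, which narrows the pickup pattern.
import numpy as np

C = 343.0  # speed of sound, m/s

def delay_and_sum(signals, mic_x, angle_deg, sr):
    """signals: (n_mics, n_samples); mic_x: mic positions along one axis, in metres."""
    angle = np.deg2rad(angle_deg)
    # Arrival-time differences for a plane wave coming from angle_deg.
    delays = np.asarray(mic_x) * np.sin(angle) / C
    delays -= delays.min()                      # make all delays non-negative
    shifts = np.round(delays * sr).astype(int)  # integer-sample approximation
    out = np.zeros(signals.shape[1])
    for sig, s in zip(signals, shifts):
        out += np.roll(sig, -s)                 # advance each channel into alignment
    return out / len(signals)
```

The point is just that aligning and summing the channels reinforces sound from the chosen direction while everything off-axis partially cancels, much like a shotgun mic's pattern but aimed in software.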
> In theory, would a gigantic room that was miles to the nearest wall be even better?
Yes, as effectively it's open space. In practice though, to record in high quality you would rather build an anechoic environment, as small as possible (preferably a booth).
Small spaces bring hard surfaces closer to the mic, which makes flutter echoes and room-mode resonances louder, creates murky-sounding bass pooling, and makes the result overly sensitive to mic position within the space; you can sound oddly different without warning.
You need to absorb the sound of your voice so there is less echo, with baffling material behind the mic, and absorb room tone and echoes in the area the directional microphone is pointed at, generally behind your head. Small spaces have only disadvantages as studios.
To take advantage of a reach-in closet full of clothes, put some pillows on the shelf above the clothes, take the closet doors off, and back into the closet as far as you can. That way the microphone is primarily picking up the baffled sound inside the closet, and you can avoid bass pooling by speaking out into the room, ideally with baffling material (e.g. see http://PillowFortStudios.com/ ) ON the back of an LDC microphone.
Sound takes less time to travel in smaller rooms, so the delay between the original sound wave and the reflected one is small, which makes the two harder for human ears to distinguish.
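Rough numbers, assuming sound at about 343 m/s and a single first reflection off the nearest surface:

```python
# Extra path length for a first reflection off a surface `d` metres away is
# roughly 2*d, so the echo arrives about 2*d/343 seconds after the direct sound.
SPEED_OF_SOUND = 343.0  # m/s at room temperature

for d in (0.5, 1.0, 5.0, 20.0):  # metres to the nearest reflective surface
    delay_ms = 2 * d / SPEED_OF_SOUND * 1000
    print(f"{d:5.1f} m -> reflection arrives ~{delay_ms:5.1f} ms late")

# ~0.5-1 m (small room): a few ms, fused with the direct sound as colouration.
# ~20 m (huge room): over 100 ms, heard as a distinct (but much quieter) echo.
```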
I was suspicious of your claim, but this seems very good. Your first example is way more extreme than Adobe's (a restaurant with background noise as loud as your voice??), but the output is ridiculously clear.
I would definitely say this is "similar quality" to adobe's. Nice work.
This isn’t meant as a criticism of your work, just your pricing model in that it reflects the industry wide creep of subscription based services into more and more narrow niches.
I’ve checked several similar services and it really frustrates me that all of them price by the hour but bill as a monthly plan, when in actual fact the real cost is $/hour of processed data. These plans inevitably end up as a series of X hours per month + some features, plus all the features of the previous plan tier… with many companies using larger hour requirements as a way to force you to pay more per month.
Also, why the hell is it so hard to find the equivalent of this technology for real time, as in streaming my microphone audio live? I would happily pay $100 to $250 for an AudioUnit plugin (or another equivalent audio-pipeline plugin format) for this kind of real-time voice cleanup… but it doesn’t exist. So can you help explain why? Since you’ve built something like this, I’m hoping you have more insight into why real-time processing is harder…
I tested a few recordings of an Indian Swami giving speeches in English back in the '70s. The recordings have a lot of background noise and aren't great; you have to listen very carefully to hear what is being said. I was hoping for good results, but...
Results:
- background noise was reduced
- some previously clear words are turned into garbled non-words
- some parts are replaced by a different Indian voice, I assume AI, so it sounds like multiple people talking
All in all, the results are not anywhere near what the sample shows.
Honestly, it sounds like you're judging on a pretty big outlier example. The sample seems to more be aimed at background noise and even that sample is extremely easy to understand without the enhancement. There are a bunch of tools out there that are probably better aimed at your goals.
In time, these tools will gain control knobs and eventually start to focus on longer tail audio recovery tasks. I have hope for our old audio. Where there's signal, there's a way.
One interesting thing I realized after experimenting with this today is that if you upload an mp3 (as opposed to a wav), your audio won't really sync correctly afterward. For example, I found the effect to work too well at certain points, so I tried to blend it with the original audio, but then it sounded as if the phase was off. Uploading a wav resolved this issue for me.
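That lines up with MP3 encoder/decoder padding typically shifting the decoded audio by a small, fixed number of samples. If you're stuck with an MP3 source, one workaround is to estimate the offset by cross-correlation before blending (a rough sketch, assuming mono files at the same sample rate; the filenames are placeholders):

```python
# Estimate the sample offset between the original and enhanced tracks by
# cross-correlating a short window, then shift the enhanced track to match.
import numpy as np
import soundfile as sf
from scipy.signal import correlate

orig, sr = sf.read("original.wav")
enh, _ = sf.read("enhanced_from_mp3.wav")

win = min(len(orig), len(enh), sr * 10)          # compare the first ~10 seconds
corr = correlate(enh[:win], orig[:win], mode="full")
lag = np.argmax(corr) - (win - 1)                # positive lag: enhanced runs late

aligned = np.roll(enh, -lag)                     # crude shift; zero the wrap-around
if lag > 0:
    aligned[-lag:] = 0
elif lag < 0:
    aligned[:-lag] = 0
sf.write("enhanced_aligned.wav", aligned, sr)
```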
I tried this on a couple of speeches that were recorded in front of an artificial waterfall, and the output is not just bad - it’s not the English language. Nor any language on earth that I’m aware of. Haha
The tool that now comes with the latest updates to FCPX handled them without a problem. (Still some background noise, but you can clearly hear every word.) I think Adobe has a long way to go on this.
After seeing this this morning, I used it in a proper professional mix session for an ad we were finishing today. Sit-down interview shot with an iPhone (don't ask). I output stems of each subject's dialog, uploaded them separately, laid them back in, and mixed a little bit of room tone back into it.
Gobsmack all around. Mine, that it did better in 10 seconds than I think nearly anyone could have done in 10 hours. Theirs, that they thought I performed a literal miracle.
Can it remove the professor's coughing? I have recordings of some interesting lectures, recorded by a professor who had covid or something and coughed every few words.
Many other professors (as well as me, occasionally) also involuntarily (and often without being aware of it) say something like "eeeeeh" when they struggle to recall the right word. It would be great if this could be removed as well.
I don’t know their roadmap, but if anything I’d point you to a tool called Descript, which is great at this. You can get transcriptions and make edits based on the text (e.g. coughs, ums, etc.). Descript.app (I am not affiliated with them, just a fan)
Interesting "automagic" tool for audio post targeted towards hobbyists and creators. Anyone able to compare this with the entry level[1] version of the professional standard dialog cleanup software, RX?
The demo is pretty disappointing in that it hitches whenever you flip the switch. They should have invested a little more in making it seamlessly switch.
I tried uploading my favorite recorded audio file (MP3), from a Japanese program made in the mid-1980s about an ancient burial mound (grave) in Japan. A famous Japanese archaeologist, the late Koichi Mori (a professor at Doshisya University at the time), talks about the Hashihaka mound, but he talks in ... Spanish?
> I tried uploading my favorite recorded audio file (MP3), from a Japanese program made in the mid-1980s about an ancient burial mound (grave) in Japan. A famous Japanese archaeologist, the late Koichi Mori (a professor at Doshisya University at the time), talks about the Hashihaka mound, but he talks in ... Spanish?
I'm not sure I understand what you're saying. Do you mean that the talk was in Japanese but the output of this service somehow screwed it up in a way that it sounds like spanish?
Or are you just mentioning that the thing you uploaded was spanish and not describing the quality of the output?
I wish YouTube had such an option, along with sound normalization and perhaps compression. So many videos, lectures, and chats have bad sound, and it really becomes an issue for people who don't have the best hearing.
Also would be great if used in Zoom recordings of podcasts
I experienced something absolutely bizarre with this, making me want to try and reupload to see if the same thing occurs. I had some footage laying around that was taken on a windy day, with buses and wind and kids screaming, and for the most part it was greatly improved.
However, a few seconds into my recording there is a part where there is someone else's dialogue for a few seconds. I can't make out what they're saying but it definitely sounds like a man, with a Latin-American accent, speaking English for a second.
Could that be a hallucination, or did they somehow mix in audio from another recording? It only lasts for about a second, but it's so strange.
My guess would be that it fixates on the most dominant source available and mutes the other factors. It probably favors human voices over other ambient noise, therefore singling the man out.
It will really get freaky when there's an ambient noise resembling a human voice. I'm thinking of the bear scene from the movie Annihilation.
I tried re-uploading and the exact same thing happened again, which is interesting because it seems it's producing audio fairly deterministically, which is not how I think most AI-produced results work, but I'm not an expert.
A lot of AI-produced results can be deterministic if you want them to be. For Stable Diffusion, just set the seed of the initial noise and it's deterministic. With GPT-3, set the temperature to 0 (always choose the highest-probability word).
It can be a useful technique for learning how slightly different prompts affect things.
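A toy illustration of both knobs (plain NumPy with stand-in logits, not any particular model's API): temperature 0 collapses to an argmax, and a fixed seed makes the sampling path repeatable.

```python
# Two ways to get deterministic output from a sampling-based model:
#   1) temperature -> 0: always pick the highest-probability token (argmax)
#   2) fixed random seed: the sampling itself becomes repeatable
import numpy as np

logits = np.array([2.0, 1.0, 0.5, 0.1])  # stand-in for a model's next-token scores

def sample(logits, temperature, rng):
    if temperature == 0:
        return int(np.argmax(logits))             # greedy: same answer every run
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))  # stochastic, but seedable

print(sample(logits, 0.0, None))                       # deterministic by construction
print(sample(logits, 1.0, np.random.default_rng(42)))  # deterministic because seeded
```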
Whatever happened to Voco, their 'photoshop for voice'?
I'm not a frequent HN poster, so I don't know how to 'properly' cite the presentation, but I remember them showing it off years ago and... nothing came of it.
To find old things like that, use the search box at the bottom of the page: https://hn.algolia.com/?q=adobe+voco. Big splash 6 years ago, then nothing.
It seems Adobe never advanced Voco past the research project stage ([1] via [2]). I'm guessing they had trouble getting it to work reliably on a wide-enough range of real-world audio.
I imagine it got shut down, either by their own executives/ethics department or by outside pressure. It was announced in the same year that the term "fake news" took off. Not the best time to get the world excited about a "photoshop for voice".
I've been using this as one of several tools (noise gate in Audacity, The Levelator) to increase the audio quality in my podcast. Subjectively, I think it's been working, and I love the simplicity of the interface. No tuning, just trust the AI. It works well for standard spoken English, but will do some horrible things to music (it won't detect music and be like "hey don't do anything to this"). So you shouldn't run it on a file that includes both music and voice.
This is really cool, I wish this was shipped as an AU plugin I could use with Audio Hijack and Loopback to process audio for video conferences. I already use a pretty detailed filter-chain that greatly improves audio quality, but this would give more of a "radio voice" quality compared to my current filter chain. I've found that improving audio quality has a marked difference in how people respond to my proposals when I present them, working remotely.
Does anybody know how this works? As somebody with a serious interest in signal processing, machine learning, and audio, I'm genuinely interested. Have Adobe published anything about the technology behind this?
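As far as I can tell Adobe hasn't published details, and it's almost certainly a learned model that re-synthesizes speech rather than classic DSP (which would explain the hallucinated words people report here). For contrast, the old-school non-ML end of "speech enhancement" is mask-based spectral gating; here is a minimal sketch, with an arbitrary 0.5 s noise window and threshold:

```python
# Classic (non-ML) spectral gating: estimate the noise floor from a "silent"
# stretch, then attenuate STFT bins whose magnitude doesn't clear it.
# Modern tools instead use neural nets to predict the mask (or regenerate the
# speech outright), which is why they can invent words.
import numpy as np
import soundfile as sf
from scipy.signal import stft, istft

audio, sr = sf.read("noisy_speech.wav")          # assumed mono
f, t, spec = stft(audio, fs=sr, nperseg=1024)    # hop = 512 samples by default

noise_frames = spec[:, : int(0.5 * sr / 512)]    # first ~0.5 s assumed noise-only
noise_floor = np.mean(np.abs(noise_frames), axis=1, keepdims=True)

mask = np.clip((np.abs(spec) - 1.5 * noise_floor) / (np.abs(spec) + 1e-12), 0.0, 1.0)
_, cleaned = istft(spec * mask, fs=sr, nperseg=1024)
sf.write("denoised.wav", cleaned, sr)
```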
I seem to recall a demo where a trumpet sound was plugged into an aggressive high quality speech denoiser, making it sound like a speaker. Does anyone remember this demo, or have similar links on creative use of this tech?
It seems quite different, actually. While this filter tries to completely change the sound of the whole recording, RTX Voice isn't really improving the audio quality; it focuses only on removing noise. I think that's a much more sound approach, and it doesn't end up actually changing the language like in some of the examples below.
It's not usable for everyone, though. I have a friend with a high-pitched voice, and RTX Voice totally butchers it if she's not careful with how she speaks. And a giggle turns into a garbled mess, and so on.
How long until there's a way to process more than one hour at a time? After trying a sample, I can't wait to run some old audiobooks through this. The results sound like they were recorded twenty years later.
AMD and NVIDIA have already done that. There's also Krisp which you can pay to use anywhere or just use the free Discord integration. I'm sure there are others too.
I use Adobe Audition for podcast editing and they have a pretty nice feature built in for that already - not sure how this differs. I am also not a big fan of edit-by-transcription, since I do like to remove occasional long silences, weird sounds, etc. That being said the mic check looks pretty useful.
Waves Clarity Vx is comparably good and costs 29.95 USD at one of the frequent sales (such as now). All you need is a VST host, which you're probably using already anyway.
[0, original] https://youtu.be/gwkCIdwHRhc
[1, enhanced] https://youtu.be/RPnUqmSyZ6Q