
"Widjosumarajzer" = video summarizer

It's just a hodgepodge of prototype scripts, but one I've actually used on a few occasions already. Most of the work is manual, but it could easily be run as "fire and forget", with some way to make corrections afterwards.

First, I'm using pyannote for speech recognition: it converts audio to text while being able to discern speakers: SPEAKER_01, SPEAKER_02, etc. The diarization provides nice timestamps, with resolution down to parts of words, which I later use in a minimal UI to quickly skip around when text is selected.
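A rough sketch of this step. The pyannote call below is illustrative only (it needs a Hugging Face auth token and a model download; the file name and token are placeholders); the plain-Python helper shows the idea of mapping word timestamps onto speaker turns:

```python
def assign_speaker(turns, t):
    """Return the speaker label whose turn covers time t (in seconds),
    or None if nobody is speaking then. turns: [(start, end, label)]."""
    for start, end, label in turns:
        if start <= t < end:
            return label
    return None


if __name__ == "__main__":
    # Real usage would look roughly like this (hypothetical paths/token):
    # from pyannote.audio import Pipeline
    # pipeline = Pipeline.from_pretrained(
    #     "pyannote/speaker-diarization-3.1", use_auth_token="hf_...")
    # diarization = pipeline("meeting.wav")
    # turns = [(seg.start, seg.end, label)
    #          for seg, _, label in diarization.itertracks(yield_label=True)]
    turns = [(0.0, 4.2, "SPEAKER_01"), (4.2, 9.7, "SPEAKER_02")]
    print(assign_speaker(turns, 5.0))  # SPEAKER_02
```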

Next, I run an LLM prompt to identify speakers; so if SPEAKER_02 said "Hey Greg" to SPEAKER_05, it will infer SPEAKER_05 = Greg. I think it was my first time using Mistral 7B, and I went "wow" out loud the first time it got it right.
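The LLM call itself is assumed here (any chat model would do); this sketch shows the plumbing around it: parsing a reply of the form "SPEAKER_05 = Greg" and applying the mapping to the transcript. The reply format is my assumption, not the exact prompt from the post:

```python
import re


def parse_name_map(llm_reply):
    """Extract {'SPEAKER_05': 'Greg', ...} from lines like 'SPEAKER_05 = Greg'."""
    return dict(re.findall(r"(SPEAKER_\d+)\s*=\s*(\w+)", llm_reply))


def rename_speakers(transcript, name_map):
    """Replace speaker labels in (speaker, text) pairs with real names,
    keeping the original label where no name was identified."""
    return [(name_map.get(spk, spk), text) for spk, text in transcript]


transcript = [("SPEAKER_02", "Hey Greg, did you push the fix?"),
              ("SPEAKER_05", "Yes, this morning.")]
reply = "Based on the dialogue: SPEAKER_05 = Greg"
print(rename_speakers(transcript, parse_name_map(reply)))
```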

After that, I manually fill in the holes in the speaker names and move on to grouping chunks of text for summarization. That doesn't seem interesting at first glance, but removing the filler words, of which there are a ton in any presentation or meeting, is a huge help. I do it chunk by chunk. Here I lean on the best LLM available and often pick the Dolphin finetune of Mixtral.
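The chunking could look something like this greedy grouping (the character budget and the idea of sending each chunk with a "summarize, drop the filler" prompt are my assumptions about the workflow):

```python
def chunk_lines(lines, max_chars=4000):
    """Greedily group transcript lines into chunks that fit an LLM context.

    Each chunk would then be sent to the model with a prompt along the
    lines of "summarize this, dropping filler words".
    """
    chunks, current, size = [], [], 0
    for line in lines:
        if current and size + len(line) > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


print(chunk_lines(["aaa", "bbb", "ccc"], max_chars=6))  # ['aaa\nbbb', 'ccc']
```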

Last, I summarize those summaries and slap the result on the front of the Google Doc.

I also insert some relevant screenshots in between chunks (might go with automatic scene-change detection via ffmpeg in the future).
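For the scene-change idea, ffmpeg's built-in `select='gt(scene,T)'` filter could dump a frame whenever the scene-change score exceeds a threshold. The exact invocation below is my sketch (the post only mentions the idea), and only the argument construction is shown:

```python
def scene_grab_cmd(video, out_pattern, threshold=0.4):
    """Build ffmpeg args that save a frame whenever the scene-change
    score exceeds `threshold` (frames land in out_pattern, e.g. shots/%04d.png)."""
    return ["ffmpeg", "-i", video,
            "-vf", f"select='gt(scene,{threshold})'",
            "-vsync", "vfr", out_pattern]


print(" ".join(scene_grab_cmd("talk.mp4", "shots/%04d.png")))
# Run it with: subprocess.run(scene_grab_cmd(...), check=True)
```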

aaand that's it: a doc that is easily searchable. Previously I had a bunch of 30 to 90 minute meeting recordings, and any attempt at searching required a linear scan of the files. Now, with a lot of additional prompt massaging, I was able to:

- create meeting notes, with especially worthwhile "what did I promise to send later" points

- this is huge: TALK with the transcript. I paste the whole transcript into Mistral 7B with 32k context and simply ask questions and follow-ups. No more watching or skimming an hour-long video; just ask the transcript whether there was another round of lay-offs or whether the parking space rules changed.

- draw a Mermaid sequence diagram of a request flowing across services. It wasn't perfect, but it got me super excited about future possibilities for creating or updating service documentation based on ad-hoc meetings.
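The "talk with the transcript" trick can be sketched as building an OpenAI-style message list with the whole transcript stuffed into the system prompt (the function and message wording are hypothetical, not the author's exact setup):

```python
def transcript_chat(transcript, question, history=None):
    """Build a chat message list for one Q&A turn against a transcript.

    history: previous [{'role': ..., 'content': ...}] turns, so follow-up
    questions keep their context.
    """
    messages = [{"role": "system",
                 "content": "Answer questions using only this meeting "
                            "transcript:\n" + transcript}]
    messages += history or []
    messages.append({"role": "user", "content": question})
    return messages


msgs = transcript_chat("Greg: the parking rules changed last week.",
                       "Did the parking rules change?")
print(msgs[-1]["content"])  # Did the parking rules change?
```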

I guess everybody is actually trying to build the same thing; it seems like a no-brainer given current tools' capabilities.



Very interested in this. I have been contemplating building something similar, but am unaware of any existing services that do this. I haven't played with pyannote; how does it compare to whisper? I also thought it might be useful to OCR the screenshots and use the text to inform the summarization and transcription, especially for things like code snippets and domain-specific terms.


I remember whisper v3 large blowing my mind: it was able to properly transcribe some two-language monstrosity ("przescreenować", the English phrase "to screen a candidate", but conjugated according to standard Polish rules). Once I saw that, I thought "it's finally time: truly good transcription has finally arrived".

So I view whisper as SOTA, with excellent accuracy.

Now, for the type of transcription I need, discerning speakers is much more valuable than word-for-word accuracy: it will all be summarized anyway, and that tends to gloss over some of the errors.

That said, pyannote has also caught me off guard: it correctly annotated a lazily spoken "DP8" in a non-native speaker's accent.

It looks really good


Is pyannote the best diarization library you've found? What's SOTA? I've been using a SaaS product (Gladia) and I'm getting close to my 10-hour mark.


It was the first one I tried, and good enough that I didn't need to look further.



