hamza_q_ · 2025-12-23T20:19:06 1766521146

Use Demucs bruh https://github.com/adefossez/demucs

yunwal · 2025-12-23T20:33:25 1766522005

Hilarious that this is maintained by facebook and yet SAM fails so badly

hamza_q_ · 2025-12-23T20:05:14 1766520314

Yeah I was frustrated by slow and hard to use OSS diarization too; recently released a library to address that, check it out: https://github.com/narcotic-sh/senko

Also https://zanshin.sh, if you'd like speaker diarization when watching YouTube videos

noman-land · 2025-12-23T23:26:38 1766532398

Hey, thanks for this. Been trying it out and it's very fast but seems to hear more speakers than are in the audio. I didn't see a way to tweak speaker similarity settings or merge speakers in some way. Any advice?

hamza_q_ · 2025-12-25T19:30:23 1766691023

Thanks for checking it out!

Yeah, unfortunately, since the diarization is based on acoustic features, it really does need high-fidelity voice recordings to get the best results. However, I just added another knob to the Diarizer class called mer_cos, which controls the speaker merging threshold. The default is 0.875, so perhaps try lowering it to 0.8. That should help.
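To show why lowering a cosine merge threshold reduces the speaker count, here is a minimal sketch. It is not Senko's actual code: `cosine`, `merge_speakers`, and the toy 2-D "centroids" are hypothetical names and data made up for illustration, and the greedy grouping is just one simple way such a threshold could be applied. It assumes non-zero vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def merge_speakers(centroids, threshold=0.875):
    """Greedy merge (illustrative only): each centroid joins the first
    existing group whose representative is at least `threshold` similar,
    otherwise it starts a new group. Returns one label per centroid."""
    reps = []    # representative centroid of each merged speaker
    labels = []
    for c in centroids:
        for i, r in enumerate(reps):
            if cosine(c, r) >= threshold:
                labels.append(i)
                break
        else:
            reps.append(c)
            labels.append(len(reps) - 1)
    return labels

# Toy data: the first two vectors have cosine similarity of about 0.85.
centroids = [[1.0, 0.0], [0.85, 0.527], [0.0, 1.0]]
strict = merge_speakers(centroids, threshold=0.875)  # first two stay separate
loose = merge_speakers(centroids, threshold=0.8)     # first two are merged
print(len(set(strict)), len(set(loose)))
```

With the stricter threshold the two similar voices count as different speakers; with the looser one they collapse into a single speaker.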

I'll also get around to adding an oracle/min/max speakers feature at some point, for cases where you know the exact number of speakers ahead of time, or wanna set upper/lower bounds. Gotten busy with another project, so haven't done it yet. PRs welcome though! haha

noman-land · 2025-12-26T03:01:05 1766718065

Thanks, `mer_cos` definitely gets me closer. I appreciate that. Yeah, I was thinking providing a param for the expected number of speakers would be nice. I'll check out the codebase and see if that's something I can contribute :).

hamza_q_ · 2025-12-26T20:26:32 1766780792

Yeah would love contributions! Here's a brief overview of how I think it can be done:

Senko has two clustering types: (1) spectral for audio < 20 mins in length, and (2) UMAP+HDBSCAN for >= 20 mins. In the clustering code, spectral actually already supports oracle/min/max speakers, but UMAP+HDBSCAN doesn't. However, someone forked Senko and added min/max speakers to that here (for oracle, I guess min = max): https://github.com/DedZago/senko/commit/c33812ae185a5cd420f2...

So I think all that's required is basically just testing this thoroughly to make sure it doesn't introduce any regressions in clustering quality. And then just wiring the oracle/min/max parameters to the Diarizer class, or diarize() func.
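A rough sketch of the wiring described above, under stated assumptions: `choose_clustering` and `resolve_speaker_bounds` are hypothetical helper names, not functions in Senko, and the parameter names are guesses. It only shows the 20-minute dispatch and the "oracle means min = max" idea.

```python
def choose_clustering(duration_seconds):
    """Pick the clustering type by audio length (20-minute cutoff from the thread)."""
    return "spectral" if duration_seconds < 20 * 60 else "umap_hdbscan"

def resolve_speaker_bounds(oracle=None, min_speakers=None, max_speakers=None):
    """Turn oracle/min/max into a (low, high) pair; high=None means unbounded.
    Hypothetical helper: an oracle count is treated as min == max."""
    if oracle is not None:
        if min_speakers is not None or max_speakers is not None:
            raise ValueError("pass either oracle or min/max, not both")
        if oracle < 1:
            raise ValueError("oracle must be at least 1")
        return oracle, oracle
    low = 1 if min_speakers is None else min_speakers
    if low < 1:
        raise ValueError("min_speakers must be at least 1")
    if max_speakers is not None and max_speakers < low:
        raise ValueError("max_speakers must be >= min_speakers")
    return low, max_speakers

print(choose_clustering(19 * 60), resolve_speaker_bounds(oracle=3))
```

Both clustering paths could then receive the same (low, high) pair, which would keep the Diarizer-facing parameters identical regardless of audio length.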

websiteapi · 2025-12-23T21:38:51 1766525931

looks interesting. will check it out.

hamza_q_ · 2025-12-22T20:51:09 1766436669

Thanks for COD: MW2 (2009), Vince. The game of my childhood. Rest in Peace.

hamza_q_ · 2025-11-14T21:34:51 1763156091

Cool use of ONNX! Fluid Inference also have great implementations of Parakeet v2/v3 in CoreML for Apple devices and OpenVINO for Intel:

https://github.com/FluidInference/FluidAudio

https://github.com/FluidInference/eddy-audio

hamza_q_ · 2025-11-04T06:19:06 1762237146

Location: Vancouver, BC, Canada

Remote: Yes

Willing to relocate: Yes

Technologies: diarization, Voice AI, PyTorch, CoreML, Svelte/SvelteKit, Flask, SQLite, Tauri

Résumé/CV: https://hamzaq.com/Hamza_Qayyum_Resume_Public.pdf

Email: mhamzaqayyum [at] icloud [dot] com

---------

Projects:

- Senko: very fast, accurate, speaker diarization (https://senko.sh)

- Zanshin: novel media player that allows you to navigate by speaker (https://zanshin.sh)

hamza_q_ · 2025-10-25T18:26:50 1761416810

Thought about it, but it seems they have some stringent prerequisites they'd like: https://github.com/ghostty-org/ghostty/issues/189

I didn't care for those; just told Claude Code to add in the feature directly. So they probably wouldn't accept the PR if I made one.

hamza_q_ · 2025-09-21T17:17:40 1758475060

Thanks :) Agreed, the limiting factor has been diarization (generating the "who speaks when" data) speed. But the diarization backend of this app that I developed can now process 1 hour of audio in ~8 seconds on an M3 Mac. So that's more or less a solved problem now (at least on Mac), just UI work remains.

hamza_q_ · 2025-09-11T16:49:58 1757609398

We do know; it's just not in the popular consciousness yet. Read a bit of Marshall McLuhan.

hamza_q_ · 2025-09-11T16:48:04 1757609284

Taking bets on how fast Marshall McLuhan re-enters the public consciousness :)

hamza_q_ · 2025-09-05T00:56:13 1757033773

It's remarkable that Marshall McLuhan's ideas haven't entered the public consciousness yet.

RajT88 · 2025-09-05T03:28:27 1757042907

That book is brutally dense reading. It almost needs a translation for normal folks.

It is absolutely no wonder the ideas have not caught on more.

mapontosevenths · 2025-09-05T01:14:29 1757034869

53% of American adults read below the sixth grade level. No idea that requires more than a sixth grade education will ever be mainstream again.

Huxley was right.

SoftTalker · 2025-09-05T01:21:10 1757035270

It’s been a long time, if ever, that people voted for ideas. They vote for party, as they always have, or for a charismatic candidate.

yencabulator · 2025-09-05T16:09:39 1757088579

Blame that one on the US two-party system. Multiple parties means you can have okay-ish feelings about 2-3 of them, and support people based on their ideas.

croon · 2025-09-05T07:44:53 1757058293

While genuinely a sad statistic, should it still be called "sixth grade level" at that point if fewer than half of adults, much less 12-year-olds, actually reach it?

I mean it should because it should be a reasonable level to reach were it not for the dismantling of the educational system, but apparently it's not.

mapontosevenths · 2025-09-05T14:44:13 1757083453

I think this is a reasonable question. However, I would argue that it still should be.

Firstly, I should say that reading scores aren't typically measured by grade levels for this type of study. That's just a colloquialism we use to make it comprehensible to the average person. The PIAAC for example uses a numerical score that translates to "levels of competency". [1]

Still, I think it's a valuable way to express the idea. There exist levels beyond the sixth. Even if most folks don't attain those higher levels anymore, we do need some way to refer to them, and the sixth grade is when a high school bound adult should have attained that level in order to keep up with later coursework.

[1] https://nces.ed.gov/surveys/piaac/skillsmap/