The problem with using OpenAI's Whisper is that it's too slow on CPU-only machines. whisper.cpp is blazing fast by comparison, and I wish people would build better diarization on top of it.
Another advantage of whisper.cpp is that it can use cuBLAS to accelerate models too large for your GPU memory: I can run the medium and large models with cuBLAS on my GTX 1050, but only the small model in pure GPU mode.
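For reference, here's a minimal sketch of driving whisper.cpp's C API directly. It assumes a cuBLAS-enabled build (the flag has been `WHISPER_CUBLAS=1 make` in past releases; check the current README, as build flags and the init function have changed across versions), and the model path and raw-PCM filename are placeholders:

```c
// Minimal transcription sketch against whisper.cpp's C API.
// Build whisper.cpp with cuBLAS support first, then link against libwhisper.
#include <stdio.h>
#include <stdlib.h>
#include "whisper.h"

int main(void) {
    // Load a ggml model; "medium" here is just an example size.
    struct whisper_context * ctx = whisper_init_from_file("models/ggml-medium.bin");
    if (!ctx) { fprintf(stderr, "failed to load model\n"); return 1; }

    // whisper expects 16 kHz mono float32 PCM; audio.f32 is assumed to be
    // raw samples already in that format (decode your wav/mp3 beforehand).
    FILE * f = fopen("audio.f32", "rb");
    if (!f) { fprintf(stderr, "failed to open audio\n"); return 1; }
    fseek(f, 0, SEEK_END);
    long n = ftell(f) / (long) sizeof(float);
    fseek(f, 0, SEEK_SET);
    float * pcm = malloc(n * sizeof(float));
    if (fread(pcm, sizeof(float), n, f) != (size_t) n) {
        fprintf(stderr, "failed to read audio\n"); return 1;
    }
    fclose(f);

    // Default greedy decoding; all the decoding knobs live on this struct.
    struct whisper_full_params params =
        whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

    if (whisper_full(ctx, params, pcm, (int) n) != 0) {
        fprintf(stderr, "transcription failed\n");
        return 1;
    }

    // Print the recognized text, segment by segment.
    for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
        printf("%s\n", whisper_full_get_segment_text(ctx, i));
    }

    free(pcm);
    whisper_free(ctx);
    return 0;
}
```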
Ah, I see, thanks. Hmm, I would imagine it's not hard to make something that works with both (the surface area of the API should be fairly small); odd that projects use the former and not the latter.