futo.org has a FOSS voice input Android app (voiceinput.futo.org) and live captions (https://github.com/abb128/LiveCaptions) for Linux. They specifically developed their own model that does fast real-time transcription.
I've been using this for the past two or three weeks and I have been very impressed with it. I ended up giving them five dollars because it's now my primary keyboard!
Not sure if that helps with your specific use case.