Thomas here, maker of Spectropic and Audiogest. I am indeed focused on building a simple and reliable Whisper + diarization API. I am also working on providing fine-tuned versions of Whisper for non-English languages through the API.
Feel free to reach out to me if anyone is interested in this!
Great-looking API. Do you support, or plan to support, automatic speaker identification based on labeled samples of speakers' voices? It would be great to have a library of known speakers that are automatically matched when transcribing.
Thanks! That is something I might offer in the future and is definitely possible with a library like pyannote. Would be really cool to add for sure.
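For readers wondering what "a library of known speakers" could look like in practice, here is a minimal sketch of the matching step only. It assumes each voice has already been turned into a fixed-length embedding vector by some speaker-embedding model (for example one obtained through pyannote); the function names, the toy three-dimensional vectors, and the 0.7 threshold are all hypothetical and are not part of the Spectropic API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, known_speakers, threshold=0.7):
    """Return (name, score) for the closest known speaker.

    The name is None when no reference voice is similar enough.
    The threshold is an illustrative value, not a tuned one.
    """
    best_name, best_score = None, -1.0
    for name, reference in known_speakers.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy library: real embeddings would have hundreds of dimensions.
known_speakers = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.95],
}

name, score = identify_speaker([0.85, 0.15, 0.05], known_speakers)
print(name, round(score, 3))
```

In a real pipeline the diarization step would produce one embedding per detected speaker, and each would be passed through a lookup like this; anything below the threshold stays labeled as an unknown speaker.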
I am also experimenting with post-processing transcripts with LLMs to infer speaker names from a transcript. It already works pretty well, but it's still a bit expensive. I have this feature available under the 'enhanced' model if you want to check it out: https://docs.spectropic.ai/models/transcribe/enhanced
Currently I only have a Discord channel for support and help, but your suggested approach of offering some included one-on-one consulting makes sense. I am gonna look into how I can implement this.
Haha yes, there was an issue with a rate limit on the GitHub API, should be fixed now! Didn't expect this amount of traffic :) Docs were built with Mintlify and the links to the repos in the docs will work once you have accepted the invite (after making a purchase).
I honestly don't know. But from my perspective, this year I've spent probably in excess of 1000 hours working on my SaaS app. By contrast, it took me maybe 3 days to set up all the required services to connect to Stripe, my analytics provider, etc.
Simply put, saving a few days of setup time does not cross the threshold for a product or service I would pay for. If this were something that would take me a few weeks or a month to put together myself, then it would be a product I'd be willing to pay for.
Take a product like TailwindUI: it's expensive but also very worth it because it cuts the time to build a professional landing page from 2-3 weeks of work down to 2-3 days of work.
This is a huge problem in the tech world, I think. I value my time at $30/hour when weighing purchases and subscriptions: each hour something saves me is worth $30.
This boilerkit is worth $39 to me because it would definitely take me longer than a couple of hours to build myself. Someone else has already put in the 100 to 1000 hours to make it a decent experience.
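The arithmetic behind that rule of thumb can be sketched in a few lines. The $39 price and $30/hour figure come from the comment above; the function name is hypothetical and this is only an illustration of the break-even reasoning.

```python
def break_even_hours(price, hourly_value):
    """Hours a purchase must save before it pays for itself."""
    if hourly_value <= 0:
        raise ValueError("hourly_value must be positive")
    return price / hourly_value


# $39 kit, time valued at $30/hour: pays off after 1.3 hours saved.
hours = break_even_hours(39, 30)
print(f"Break-even after {hours:.1f} hours saved")
```

By the same measure, anything that saves more than about an hour and twenty minutes of setup is worth the $39.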