For revoldiv.com we have profiled many GPUs, and the best one is the 4090. We do a lot of intelligent chunking, detect word boundaries, and run the model in parallel on multiple GPUs, which gets us about 40 to 50 seconds for an hour-long audio. Without all that, expect around 7 minutes for an hour-long audio on a Tesla T4.
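The chunking idea can be sketched roughly like this. This is a minimal illustration, not revoldiv.com's actual pipeline: the silence threshold, minimum silence length, and 30-second chunk size are arbitrary assumed parameters. The point is that cuts land inside silent stretches, so no word is split across chunks and each chunk can go to a different GPU.

```python
# Sketch of silence-based chunking so a long recording can be split at
# word boundaries and the pieces transcribed in parallel on several GPUs.
# NOT revoldiv.com's implementation; all thresholds here are made up.

def find_chunks(samples, sample_rate, max_chunk_s=30.0,
                silence_thresh=0.01, min_silence_s=0.2):
    """Split `samples` into (start, end) index pairs, cutting only
    inside silent stretches so words are never chopped in half."""
    max_len = int(max_chunk_s * sample_rate)
    min_sil = int(min_silence_s * sample_rate)

    # Mark which samples count as "silent" (amplitude below threshold).
    silent = [abs(s) < silence_thresh for s in samples]

    chunks, start = [], 0
    while len(samples) - start > max_len:
        # Scan backwards from the max-length point for a long enough
        # silent run, and cut in the middle of it.
        cut, run = None, 0
        for i in range(start + max_len, start, -1):
            run = run + 1 if silent[i - 1] else 0
            if run >= min_sil:
                cut = i - 1 + min_sil // 2
                break
        if cut is None:           # no silence found: hard cut
            cut = start + max_len
        chunks.append((start, cut))
        start = cut
    chunks.append((start, len(samples)))
    return chunks
```

Each `(start, end)` pair can then be sliced out and sent to a worker; the per-chunk transcripts are concatenated in order at the end.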
on Tesla T4 (30 GB memory, 8 vCPU, Google Cloud)
on tiny and tiny.en
for 10-minute = 30 seconds
on medium
for 10-minute = 1 min 30 sec
for 60-minute = 7 min
on large
for 60-minute = 13 min
on NVIDIA GeForce RTX 4090
on tiny
for 10-minute = 5.5 seconds
for 60-minute = 35 seconds
on base
for 10-minute = 7 seconds
for 60-minute = 50 seconds
on small
for 10-minute = 14 seconds
for 60-minute = 1 min 35 sec
on medium
for 10-minute = 26 seconds
for 60-minute = 3 mins
on large
for 10-minute = 40 seconds
for 60-minute = 3 min 54 sec
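For anyone who wants to reproduce numbers like these, a simple timing wrapper is enough. This is a sketch; `transcribe` stands in for whatever model call you are benchmarking (e.g. openai-whisper's `model.transcribe`), which is not shown here.

```python
import time

def benchmark(transcribe, audio, audio_seconds):
    """Time one transcription call and report the real-time factor
    (seconds of audio processed per wall-clock second)."""
    t0 = time.perf_counter()
    result = transcribe(audio)
    elapsed = time.perf_counter() - t0
    rtf = audio_seconds / elapsed
    print(f"{elapsed:.1f}s for {audio_seconds:.0f}s of audio "
          f"({rtf:.0f}x real time)")
    return result, elapsed, rtf
```

For example, the 4090/base figure above (50 seconds for a 60-minute file) works out to 3600 / 50 = 72x real time.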
That's crazy. 35 seconds for 60 min on base with a 4090. Wow!
Thanks for the info! Also, btw, I mentioned on another thread that I was getting an error. Are you planning on offering this as a paid API?
Can you send me the audio that caused it? You can email me at team AT revoldiv .com. If there turns out to be a lot of interest, yes, we can provide it as an API service. Our service has some niceties like word-level timestamps, paragraph separation, sound detection, etc. For now it is a free service you can use as much as you want.
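A feature like paragraph separation can be derived from word-level timestamps by splitting on long pauses. A toy sketch (the 1-second gap threshold is an arbitrary assumption, not how revoldiv actually does it):

```python
def split_paragraphs(words, gap_s=1.0):
    """Group (text, start, end) word tuples into paragraph strings,
    starting a new paragraph whenever the pause between consecutive
    words exceeds `gap_s` seconds."""
    paragraphs, current = [], []
    prev_end = None
    for text, start, end in words:
        if prev_end is not None and start - prev_end > gap_s:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

With openai-whisper, word tuples like these can be obtained by passing `word_timestamps=True` to `model.transcribe`.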
Has anyone run tiny.en on a T4 GPU (AWS g4dn, iirc)? What's the speedup?