synesthesiam's comments | Hacker News

As the sibling comment mentions, the next version of Piper will no longer use espeak-ng to avoid potential GPL licensing issues.


Yes, it is possible by installing the Wyoming Satellite software on the Mark II. All the pieces are there; it's just missing some easy-to-install firmware. Source: I worked at Mycroft on the Mark II and am now the voice guy for Nabu Casa (Home Assistant).
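If you want to experiment before an official image lands, running the satellite by hand looks something like this (flags from my memory of the wyoming-satellite README, so double-check them there):

  script/run \
    --name 'mark-ii' \
    --uri 'tcp://0.0.0.0:10700' \
    --mic-command 'arecord -r 16000 -c 1 -f S16_LE -t raw' \
    --snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw'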


I just checked out that project, thank you! But I don't see any instructions for getting started, even after searching the issues. Is there a good place to find information? When you say "missing some easy-to-install firmware", it sounds like there is a big piece missing, no?


Look into training a Dreambooth model. Huggingface has a guide for this.
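Roughly, with the diffusers DreamBooth example script it looks like this (model name, paths, and hyperparameters here are only illustrative; the guide has the current flags):

  accelerate launch train_dreambooth.py \
    --pretrained_model_name_or_path='runwayml/stable-diffusion-v1-5' \
    --instance_data_dir='./training_photos' \
    --instance_prompt='a photo of sks person' \
    --output_dir='./dreambooth-model' \
    --resolution=512 \
    --train_batch_size=1 \
    --learning_rate=5e-6 \
    --max_train_steps=400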


You may want to try Piper for this case (RPi 4): https://github.com/rhasspy/piper
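Basic usage is a one-liner, something like this (the exact voice filename depends on which model you download):

  echo 'This is a test.' | \
    ./piper --model en_US-lessac-medium.onnx --output_file test.wav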


Thank you for the kind words, @follower!

I'm the author of Piper; it is a successor to Larynx (originally named Larynx 2). Piper uses the same underlying model as Mimic 3, which I developed before joining Mycroft. However, Piper uses a different library to get word pronunciations, so the voices aren't compatible between the two projects.

It's been an awesome year so far with Nabu Casa, and I'm very fortunate to be able to work on something I love. I hope to contribute to the open source voice space for many years to come :)


Coauthor of the blog post here. You're right: I mentioned it on the live stream we had today but forgot to put it in the blog post. The i5 is from a Lenovo ThinkCentre M72e. They're available refurbished for less than the cost of a Pi 4 these days, so it seemed like a good comparison!


Man, I switched my Home Assistant box from a Raspberry Pi to one of these machines (I wanted the Pi to run one of my 3D printers), and it has made such a beautiful difference in the snappiness of everything Home Assistant does.

I super recommend it if you can afford the extra 10 to 15 watts of power.


The ThinkCentre runs at 65W; a Pi 4 runs at ~8W. There's a bit of an energy crisis (in Europe), so you want to optimise for workloads that can run at lower power levels. Voice alone, in my opinion, does not justify a 65W constant draw.
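Taking those numbers at face value: 65W around the clock is ~570 kWh/year versus ~70 kWh for the Pi, so at, say, €0.40/kWh that's roughly €228 a year against €28.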


No personal offense intended; this is a very common affectation that has just become a bugaboo for me lately, thanks to technical documentation full of these non-technical statements.


Home Assistant and Rhasspy 3 will be able to share voice services thanks to the shared Wyoming protocol. Rhasspy 3 will have more options, including lots of experimental services.
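Wyoming is deliberately simple: newline-delimited JSON events over a socket, with optional binary payloads for audio, so you can poke at a service by hand. If I remember the event name and default port right (treat this as a sketch and check the protocol docs), something like this asks a Piper server to describe itself:

  echo '{"type": "describe"}' | nc localhost 10200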


Mark I or Mark II?


I have a Mark I that hasn't been plugged in for years. IIRC, the SD card is bad.


Mark II.


I used to work for Mycroft, so I'm hoping to eventually create an image that's compatible with Home Assistant pipelines.

For now, though, you may want to check out OVOS: https://openvoiceos.com/


I checked it out but didn't see that it runs on my hardware. I'm worried this gets me closer, not farther, to brick status with my device.


Neon AI and OVOS are taking over development of Mycroft Core and the Mark II: https://www.reddit.com/r/Mycroftai/comments/1212h87/neon_ai_...


I'm working right now on adding voice control to Home Assistant: https://www.home-assistant.io/blog/2022/12/20/year-of-voice/

Our goal is local voice control of your devices in your native language (no Internet required). Hardware is still an open question, but I believe we will succeed thanks to our focus on a limited domain (IoT) and our large open source community.


I'm a Home Assistant user who's been following along with the Year of Voice. I played a little with Almond and some with Rhasspy; what's the best way to get up and running right now? I've been using my Arch desktop with a mic (running through PipeWire) as my primary hardware. I'm not scared of deploying to a Pi or similar, but I don't have one to spare right now.



The streaming I was speaking of is from a URL or network resource. I keep a directory of m3u files, each of which is just a URL inside that you can open in VLC etc. So `ffmpeg -i [url] -c copy [file-name].mp3` does the trick for now. `mpv` can also do this while saving to a file, but this is a nice starting point, as I'm already getting some ideas for how I could use this or roll my own. Thanks!
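For example, pulling the URL out of the m3u and recording in one go (assuming the file is just the URL plus optional # comment lines):

  url=$(grep -v '^#' station.m3u | head -n 1)
  ffmpeg -i "$url" -c copy capture.mp3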


whisper.cpp provides similar functionality via the `livestream.sh` script, which transcribes a remote stream [0]. For example, you can transcribe BBC radio in 10s chunks like this:

  $ make base.en
  $ ./examples/livestream.sh http://a.files.bbci.co.uk/media/live/manifesto/audio/simulcast/hls/nonuk/sbr_low/ak/bbc_world_service.m3u8 10

  [+] Transcribing stream with model 'base.en', step_s 10 (press Ctrl+C to stop):

  Buffering audio. Please wait...

   here at the BBC in London. This is Gordon Brown, a former British Prime Minister, who since 2012 has been UN Special Envoy-
   Lemboy for global education. We were speaking just after he'd issued a rallying cry on the eve of the 2022 Football World Cup. For
   Governments around the world to pressure Afghanistan to let girls go to school. What human rights abuses are what are being discussed as we...
   run up and start and have happened and have seen the World Cup matches begin. And it's important to draw attention to one human rights abuse.
   that everyone and that includes Qatar, the UAE, the Islamic organization of countries, the Gulf Corps...
[0] https://github.com/ggerganov/whisper.cpp/blob/master/example...


Wow! Thanks for sharing. I didn't explore the repo much beyond that link, but this looks very promising; I was going to tomorrow, but I'm checking out the source and building now! The confidence color coding, btw, is chef's kiss. Great job with this.


Is this English only, or does it support other languages? Thank you.


It's a port of Whisper, and it uses the original models in a different format, so it supports all the languages the original does.
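For non-English audio, grab one of the multilingual models (no `.en` suffix) and pass a language flag. Something like this, going from the whisper.cpp README (check it for the current options):

  bash ./models/download-ggml-model.sh base
  ./main -m models/ggml-base.bin -l de -f samples/sample.wav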

