Not OP, but if I need a CPU and microphone for voice commands anyway, it doesn’t sound crazy to throw a whole Pi/relay node into every room of the house that I want to control. A Pi Zero 2 is fifteen bucks and can run Whisper speech-to-text iirc; throw ChatScript on there for local command decoding and call it a day. I think I’d pay 50 to 100 per room for the convenience, and that’s a premium I’m fine with if it means my voice isn’t surveilled by Alexa just to set timers.
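To make the "local command decoding" step concrete, here is a rough sketch of turning a transcript into a command on the node itself. It uses plain regex matching as a stand-in for ChatScript (not ChatScript's actual rule syntax), and the pattern list, intent names, and `decode_command` helper are all made up for illustration. It also assumes the speech-to-text step hands back numbers as digits.

```python
import re

# Hypothetical command patterns for illustration only; a real node would
# likely use ChatScript rules or a proper intent parser instead.
PATTERNS = [
    (re.compile(r"\bset (?:a )?timer for (\d+) (second|minute|hour)s?\b"), "set_timer"),
    (re.compile(r"\bturn (on|off) the (\w+)\b"), "switch"),
]

def decode_command(transcript):
    """Map a speech-to-text transcript to an intent dict, or None if unrecognized."""
    text = transcript.lower().strip()
    for pattern, intent in PATTERNS:
        match = pattern.search(text)
        if match:
            return {"intent": intent, "args": match.groups()}
    return None

if __name__ == "__main__":
    print(decode_command("Set a timer for 10 minutes"))
    print(decode_command("Please turn off the lights."))
```

Anything that doesn't match returns `None`, so the node can just ignore stray speech instead of guessing.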
Not to digress, but why not make it modular too? I.e., the base model is a smart switch: one unit is the “base” unit and the rest talk to it. You could even add further switches and dials (thermostat, dimmer, etc.). Perfect placement, in my opinion.
I suppose I have a bias for mesh over hub-and-spoke. It seems to me that having a full-power CPU on every mic is going to give a better experience, latency- and glitch-wise, than streaming audio feeds around. Of course, the nodes would still talk to each other to pass commands around.
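A back-of-the-envelope sketch of why passing commands beats streaming audio: a decoded command is a few dozen bytes, while raw audio is tens of kilobytes per second. The message fields, the helper names, and the 16 kHz 16-bit mono format are assumptions for illustration, not anything specified in this thread.

```python
import json

SAMPLE_RATE_HZ = 16_000   # assumed mic sample rate
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono PCM

def encode_command(node, intent, args):
    """Serialize a decoded command as a small JSON message (hypothetical format)."""
    return json.dumps({"node": node, "intent": intent, "args": list(args)}).encode("utf-8")

def decode_message(payload):
    """Parse a command message received from another node."""
    return json.loads(payload.decode("utf-8"))

def audio_bytes(seconds):
    """Size of uncompressed audio that would need streaming for the same utterance."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

if __name__ == "__main__":
    msg = encode_command("kitchen", "switch", ["off", "lights"])
    print(len(msg), "bytes as a command vs", audio_bytes(3), "bytes for 3 s of raw audio")
```

The command message also fits in a single network packet, so a dropped or late packet costs a retry rather than a glitchy audio stream.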