hamdingers • today at 2:12 PM • 5 replies

If you're less concerned about privacy, I use Gemini 2.5 Flash for this and it's exceptionally good and fast as a HA assistant while being much cheaper than the electricity that would be needed to keep a 3090 awake.

The thing that kills this for me (and they even mentioned it) is wake word detection. I have both the HA voice preview and FPH Satellite1 devices, plus have experimented with a few other options like a Raspberry Pi with a conference mic.

Somehow nothing is even 50% as good as my Echo devices at picking up the wake word. The assistant itself is far better, but that doesn't matter if it takes 2-3 tries to get it to listen to you. If someone solves this problem with open hardware, I'll immediately buy several.

Replies

_spduchamp • today at 3:05 PM

How about a button?

I'd rather physically press a button on an intercom box than have something churning away, constantly processing sound.

ethagnawl • today at 4:47 PM

What's been surprising in my experience regarding the wake word is that it recognizes me (adult male) saying the wake word ~95% of the time. However, it only registers the rest of my family (women and children) ~30% of the time.

jcims • today at 2:21 PM

I have a feeling beamforming microphone arrays might help here. Something like this could substantially improve the audio being processed: https://www.minidsp.com/products/usb-audio-interface/uma-8-m....
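
For anyone unfamiliar with the idea, here is a minimal sketch of delay-and-sum, the simplest form of beamforming. It is only an illustration, not how any particular product works. The function name `delay_and_sum` and the integer per-microphone `delays` are assumptions made for the example; a real array would estimate the delays from its geometry and the direction of the speaker.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Average the channels after advancing each one by its delay.

    delays[k] is how many samples later the target sound reaches mic k
    than it reaches the earliest mic (non-negative integers, assumed known).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only keep the samples that every shifted channel can supply.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        return []
    n_mics = len(channels)
    # Sound from the steered direction lines up and adds coherently;
    # sound arriving with other delays is misaligned and partly cancels.
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / n_mics
        for t in range(usable)
    ]
```

With the right delays the voice is reinforced; with the wrong ones the same signal can cancel itself, which is why steering matters.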

robotswantdata • today at 6:29 PM

What about your wifi APs sensing which room you are in, with your choice of hilarious dance moves as the trigger?

Funky chicken for Gemini

Penguin dance for OpenAI

Claude?

senkora • today at 3:22 PM

Why not use an easier-to-detect wake “word”, like two claps in quick succession? Or a couple of notes of a melody?
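
A rough sketch of what a double-clap trigger could look like, assuming mono audio as floats in [-1, 1]. Every name and number here (`is_double_clap`, the 10 ms frames, the 0.05 energy threshold, the 0.15 to 0.6 second gap) is a guess for illustration and would need tuning against a real microphone and room.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_onsets(energies: Sequence[float], threshold: float) -> List[int]:
    """Frame indices where energy rises from quiet to above the threshold."""
    onsets = []
    prev_loud = False
    for i, energy in enumerate(energies):
        loud = energy > threshold
        if loud and not prev_loud:
            onsets.append(i)
        prev_loud = loud
    return onsets


def is_double_clap(samples: Sequence[float],
                   sample_rate: int = 16000,
                   frame_ms: int = 10,
                   threshold: float = 0.05,   # assumed; tune per microphone
                   min_gap_s: float = 0.15,   # assumed lower bound between claps
                   max_gap_s: float = 0.6) -> bool:
    """True if two consecutive loud onsets are spaced like a quick double clap."""
    frame_len = sample_rate * frame_ms // 1000
    frame_s = frame_len / sample_rate
    onsets = find_onsets(frame_energies(samples, frame_len), threshold)
    for first, second in zip(onsets, onsets[1:]):
        gap_s = (second - first) * frame_s
        if min_gap_s <= gap_s <= max_gap_s:
            return True
    return False
```

A plain energy threshold like this will also fire on door slams or dropped objects, so it trades the missed activations discussed above for false ones.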
