Do we know if this is better than Nvidia Parakeet V3? That has been my go-to model locally and it's hard to imagine there's something even better.
I’m curious about this too. On my M1 Max MacBook I use the Handy app with Parakeet V3 and get near-instant transcription. Accuracy is slightly below the slower Whisper models, but that drop is immaterial when talking to CLI coding agents, which is where I find the most use for this.
I've been using Parakeet V3 locally, and totally anecdotally, this feels more accurate but slightly slower.
I liked Parakeet v3 a lot until it started to drop whole sentences, willy-nilly.
Parakeet is really good imo too, and it's just 0.6B so it can actually run on edge devices. 4B is massive; I don't see Voxtral running in realtime on an Orin or fitting on a Hailo. An Orin Nano probably can't even load it at BF16.
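Rough math (assuming the Orin Nano's ~8 GB of shared CPU/GPU memory): 4B parameters at BF16 is roughly 7.5 GiB of weights alone, before activations or any KV cache.

```python
# Back-of-the-envelope weight memory for a 4B-parameter model at BF16.
# This counts weights only; activations and KV cache come on top.
params = 4e9           # 4 billion parameters
bytes_per_param = 2    # BF16 = 2 bytes per parameter

weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.1f} GiB of weights")  # ~7.5 GiB, against ~8 GiB total on an Orin Nano
```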
Came here to ask the same question!
I've been using nemotron ASR with my own ported inference, and I'm happy with it:
https://huggingface.co/nvidia/nemotron-speech-streaming-en-0...
https://github.com/m1el/nemotron-asr.cpp
https://huggingface.co/m1el/nemotron-speech-streaming-0.6B-g...
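If you'd rather try it in Python first, here's a minimal sketch using the standard NeMo ASR API. I'm not certain the streaming checkpoint loads this way (that's part of why I ported the inference), and the model id below is just a placeholder:

```python
# Minimal sketch, assuming the checkpoint loads via the standard NeMo ASR API;
# the streaming Nemotron model may instead require the ported inference linked above.
import nemo.collections.asr as nemo_asr

# Hypothetical model id; substitute the actual Hugging Face repo name.
model = nemo_asr.models.ASRModel.from_pretrained("nvidia/<nemotron-speech-checkpoint>")

# Pass a list of audio file paths; returns one transcript per file.
transcripts = model.transcribe(["sample.wav"])
print(transcripts[0])
```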