wow Mistral really cooked
Do you know anything better than Whisper large-v3 through WhisperX for Polish-language, low-quality audio?
This combo has almost unbeatable accuracy and it rejects noises in the background really well. It can even reject people talking in the background.
The only better thing I've seen is the Ursa model from Speechmatics. Not open weights, unfortunately.
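For anyone who wants to try that combo: a minimal sketch of how I'd invoke the WhisperX CLI on a noisy Polish recording. The flag names (`--model`, `--language`, `--compute_type`) are from the `whisperx` command-line tool; the specific defaults below are my own guesses, not the commenter's settings, so tune them to your hardware.

```python
# Sketch: build a whisperx CLI call for low-quality Polish audio.
# Assumes the whisperx package is installed and on PATH.
import shlex


def whisperx_cmd(audio_path: str,
                 language: str = "pl",
                 model: str = "large-v3",
                 compute_type: str = "float16") -> str:
    """Assemble a shell command for transcribing one audio file."""
    args = [
        "whisperx", audio_path,
        "--model", model,              # Whisper large-v3 checkpoint
        "--language", language,        # skip language detection on noisy audio
        "--compute_type", compute_type,  # drop to int8 on CPU / small GPUs
    ]
    return " ".join(shlex.quote(a) for a in args)


print(whisperx_cmd("interview.wav"))
# whisperx interview.wav --model large-v3 --language pl --compute_type float16
```

Pinning `--language pl` matters here: on garbled audio, automatic language detection is one more thing that can fail before transcription even starts.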
Disappointing that this lacks a clear reference implementation, beyond code mixed into not-yet-released vLLM nightly builds. I'm OK with open weights being a form of OSS in the case of models, because frankly I don't believe it's feasible, for large LLMs, to release the training data, all the orchestration code, and so forth. But it can't just be: here are the weights, we partnered with vLLM for inference. Come on. Open weights must mean you put me in a position to easily write an implementation for any hardware.
p.s. even the demo uses a remote server via websocket.
I'm on voxtral-mini-latest and that's why I started seeing 500s today lol
Pseudo-related -- am I the only one uncomfortable using my voice with AI, out of concern that once it's in the training data it's forever reproducible? As a non-public person it seems like a risk vector (albeit a small one).
Can it translate in real time?