Hacker News

yko · last Wednesday at 7:13 PM · 2 replies

That's a mix of Polish and Ukrainian in the transcript. Now, if I try speaking Ukrainian, I'm getting transcript in Russian every time. That's upsetting.


Replies

overfeed · last Wednesday at 8:09 PM

Oh no! The model won't translate to an unsupported language, and incorrectly reverts to one that it was explicitly trained on.

The base model was likely pretrained on data that included Polish and Ukrainian. You shouldn't be surprised to learn that it doesn't perform well on languages it wasn't trained on, or that it drifts toward whichever language had the highest share of training data.

scotty79 · last Wednesday at 8:12 PM

Tell it that you are going to speak Polish now. It helps.