yorwba • yesterday at 10:28 AM • 1 reply

> If you have the correct furigana, you could even detect when the TTS model picked the wrong reading and regenerate.

But how do you know the furigana are correct? Unless you start out with fully human-annotated text, you need some automated procedure to add furigana, which pushes the problem from "TTS AI picked the wrong reading" to "furigana AI picked the wrong reading."

Replies

mariano54 • yesterday at 11:25 AM

Yes, it pushes the problem, but it's a much easier problem, and models like Gemini 2.5 Flash do very well.
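
For illustration, the detect-and-regenerate idea from this thread might look roughly like the sketch below. It assumes you already have an expected kana reading (from furigana, whatever its source), and it takes two hypothetical callables that are not from any real library: `synthesize`, which turns text into audio, and `transcribe_to_kana`, which turns audio back into kana. It only catches a mismatch between the audio and the furigana; as noted above, it cannot tell you whether the furigana themselves are right.

```python
from typing import Callable, Optional


def to_hiragana(text: str) -> str:
    """Map katakana to hiragana so readings compare regardless of script."""
    return "".join(
        # Katakana U+30A1..U+30F6 sit exactly 0x60 above their hiragana forms.
        chr(ord(ch) - 0x60) if "ァ" <= ch <= "ヶ" else ch
        for ch in text
    )


def synthesize_with_reading_check(
    text: str,
    expected_reading: str,
    synthesize: Callable[[str], bytes],          # hypothetical TTS call
    transcribe_to_kana: Callable[[bytes], str],  # hypothetical speech-to-kana call
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Regenerate audio until its transcribed reading matches the furigana.

    Returns the first matching audio, or None if every attempt mismatches.
    """
    target = to_hiragana(expected_reading)
    for _ in range(max_attempts):
        audio = synthesize(text)
        heard = to_hiragana(transcribe_to_kana(audio))
        if heard == target:
            return audio
    return None
```

An exact string match is the simplest possible check. A real pipeline would probably need a looser comparison (long vowels, particles, transcription noise), and returning `None` leaves the caller to decide whether to flag the sentence for a human.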
