Nice! I'm curious if your software can pick up really subtle details - like for instance, pitch accent in Japanese (which is basically NEVER covered in a beginner level course) but is useful to just be aware of as a language learner.
Thanks!
No, it cannot do this yet, as it's using a pipeline of voice to text to voice. I think models are heading in that direction, and voice to voice models are getting better.
For pitch accent, shadowing is a great way to improve. You can pause and repeat the tutors messages for example, or read out the word when doing flashcard reviews (copying the flashcard audio).
Thanks! No, it cannot do this yet, as it's using a pipeline of voice to text to voice. I think models are heading in that direction, and voice to voice models are getting better.
For pitch accent, shadowing is a great way to improve. You can pause and repeat the tutors messages for example, or read out the word when doing flashcard reviews (copying the flashcard audio).