Hey HN, we're Mariano and Anton from ISSEN (https://issen.com), a foreign language voice tutor app that adapts to your interests, goals, and needs.
Demo: https://www.loom.com/share/a78e713d46934857a2dc88aed1bb100d?...
We started this company after struggling to find great tools to practice speaking Japanese and French. Having a tutor can be awesome, but there are downsides: they can be expensive (since you pay by the hour), difficult to schedule, and have a high upfront cost (finding a tutor you like often forces you to cycle through a few that you don’t).
We wanted something that would talk with us (realistically, in full conversations) and actually help us improve. So we built it ourselves. The app relies on a custom voice AI pipeline combining STT (speech-to-text), TTS (text-to-speech), LLMs, long-term memory, interruptions, turn-taking, etc. Getting speech-to-text to work well for learners was one of the hardest parts, especially with accents, multilingual sentences, and noisy environments. We now combine Gemini Flash, Whisper, Scribe, and GPT-4o-transcribe to minimize errors and keep the conversation flowing.
We didn’t want to focus too much on gamification. In our experience, that leads to users performing well in the app, achieving long streaks and so on, without actually becoming fluent in the language they want to learn.
With ISSEN you instantly speak and immerse yourself in the language, which, while not easy, is a much more efficient way to learn.
We combine this with a word bank and SRS (spaced repetition) flashcards for new words learned in the AI voice chats, which allows very rapid improvement in both vocabulary and speaking skills. We also create a custom curriculum for each student based on goals, interests, and preferences, and offer fully customizable settings like speed, turn-taking, formality, etc.
App: https://issen.com (works on web, iOS, Android) Pricing: 20 min free trial, $20–29/month (depending on duration and specific geography)
We’d love your feedback — on the tech, the UX, or what you’d wish from a tool like this. Thanks!
OMG this app f'n rocks. My convo with my Argentine mujer is soooo fluid and smooth.
I've been living in Buenos Aires for over 18 years now, so my pronunciation and accent are quite good. It's just that I never had the proper early foundations of grammar, so I have a bunch of embarrassing holes that need filling -- this app is quite precise when it comes to focusing on those aspects.
Te felicito!
P.S. My only nitpick so far is the UX on iOS: when the Settings modal is opened, there is no clear CTA to close it, because the click state of the settings button is 97% the same color as the non-click state.
Solution : 1 - add a close X button to the top right (standard accessibility)
2 - change the click-state Color of the settings button to a reverse color or accent color.
Want more UX tear-downs? Dm me artur at visualsitemaps.com
I created an account just to provide feedback. I would consider myself the core audience and would definitely be willing to pay for it. I am learning Japanese but, since I am living outside of Japan, struggle to find good opportunities to output on a regular basis. An AI-based language tutor would make my life much easier.
A few comments from the 20min trial I had:
- About the positive aspects: There were definitely parts of the conversation where I was genuinely surprised by the naturalness. Overall, I felt that the demo already helped me practice my output a bit. From this, I definitely believe that AI-based language tutors might become quite widespread soon, as they could become really good language exchange partners. In my case, I still have low speaking skills (~Japanese N4, A2), so just having any chance to practice verb conjugations etc. is already immensely helpful.
- I also like that there is no gamification aspect. I prefer apps where users can decide how they use them. A lot of Japanese learners already use Anki for vocab, etc., so forcing extra vocab practice would definitely make me quit the app.
Some feedback for improvement:
- Due to my still low speaking skills, I often need mid-sentence pauses which are at least 1-2 seconds long. When I pause, the sentence is already half-transcribed, sometimes in another language. I think I had moments where the sentence was then cut off entirely. At the least, it makes me feel pressured to finish the sentence fluently without many pauses.
- I know you want to have multilingual transcriptions, but maybe you could add some language bias when transcribing the speech?
- About the cutting off: if you have some sort of VAD frontend, maybe you could adjust its parameters based on the estimated language skills of the user (beginners have longer pauses).
- I saw that my sentences were rated. But they were rated only in text, even though I said that my mistakes should be corrected right away (The speech-based AI partner just kept on talking about the topics). Maybe I have overlooked something, but is there some overview of my mistakes besides the text-based corrections?
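To make the VAD suggestion above concrete, here is a minimal sketch of level-dependent end-of-turn detection. It assumes some upstream VAD already labels each audio frame as speech or silence; the `Endpointer` class, the level names, and the timeout values are all hypothetical and are not taken from ISSEN's pipeline.

```python
from dataclasses import dataclass

# Hypothetical silence timeouts in milliseconds per learner level.
# The values are illustrative only; real ones would need tuning.
SILENCE_TIMEOUT_MS = {"beginner": 2500, "intermediate": 1500, "advanced": 800}


@dataclass
class Endpointer:
    level: str
    frame_ms: int = 20  # duration of one audio frame

    def end_of_turn_index(self, speech_flags):
        """Return the frame index at which the turn ends, or None.

        speech_flags is an iterable of booleans from an upstream VAD
        (True = speech detected in that frame).
        """
        timeout_ms = SILENCE_TIMEOUT_MS[self.level]
        # Ceiling division: number of consecutive silent frames required.
        frames_needed = -(-timeout_ms // self.frame_ms)
        silent_frames = 0
        heard_speech = False
        for i, is_speech in enumerate(speech_flags):
            if is_speech:
                heard_speech = True
                silent_frames = 0
            elif heard_speech:
                # Only count silence after the learner has started talking.
                silent_frames += 1
                if silent_frames >= frames_needed:
                    return i
        return None


# 0.2 s of speech, a 1.5 s thinking pause, 0.2 s of speech, then 4 s of silence.
flags = [True] * 10 + [False] * 75 + [True] * 10 + [False] * 200
beginner_cut = Endpointer("beginner").end_of_turn_index(flags)
advanced_cut = Endpointer("advanced").end_of_turn_index(flags)
```

With these illustrative numbers, the beginner setting rides through the 1.5 s pause and only ends the turn during the trailing silence, while the advanced setting would cut in during the pause.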
>We didn’t want to focus too much on gamification.
Thank you so much for this. Duolingo is literally unbearable because it's so gamified. I'll try it out later. I've seen a few of these apps, can I seamlessly go between my native language and the language I'm trying to learn? If I am trying to learn Hindi, can I ask a question in English in the middle of a conversation?
I don't think I can trust TTS for language learning. I could be internalizing wrong pronunciation, and I wouldn't know. One time I tried Duolingo for Japanese already knowing a bit. To their credit I assumed it was recorded clips, until it read 'oyogu' as something like 'oyNHYAOgu', like it concatenated two syllable clips that don't go together. If I didn't already know, would I be trying to study and replicate that nonsense? So I don't know if I could trust TTS audio for language study regardless of what kind of tech it is. Sure mistakes can be unlearned over time spent immersing, but at much more effort than just not internalizing them in the first place.
Also, Japanese specifically has this meme where it literally is a pitch-accent language, but many people say it's not and teaching resources ignore it. E.g. 'ima' means either 'now' or 'living room' depending on whether syllable #2 is higher or lower. Clearly this only applies to some languages, but it is another dimension where it's even harder for a learner to know there's a mistake. I have to imagine even other Latin languages probably have reading quirks where this could happen to me.
Alright, having tried this with Japanese, I can say it's frustrating. I'm a near complete beginner, and the tutor kept speaking in Japanese even when I said "sorry I don't understand" repeatedly. Then, when I asked it to start in English and gradually transition to Japanese, it lasted all of one sentence in English before switching back. I can totally see how this would be useful conversation practice if you've progressed that far, but I'd love to have something for even earlier beginners. Also, since many of the models you use are natively multimodal, this could readily integrate visual media for discussion and grounding.
Also, for the transcription it would be great to get pure romaji to start with!
I'm trying to learn Vietnamese, but the lessons are really, really rough and borderline bad advice.
---
AI: Anh mệt is good if bạn are a man speaking about yourself. You can also say, “Em mệt” if you’re a woman.
This isn't correct. If you are of "older brother" age and are male, you say Anh. Em is for if you are a "younger person" (the gender does not matter). Women tend to prefer being called "em" (even if they are older), because women prefer to be identified as younger than their true age... But that doesn't mean you can't call younger men em.
A good tutor would know your age relative to theirs and explain this context.
---
It would say English phrases with a Vietnamese accent.
---
It also would give me really complex Vietnamese phrases that I am not ready for. When I prompt for an explanation or translation, it gets off track from the original thing we were learning.
---
Way more people in Vietnam (and the globe) speak southern Vietnamese, but the tutors seem to be from north Vietnam.
---
The STT also was very forgiving if I pronounced things incorrectly. Or it would confuse English and Vietnamese. I would say "Phai", but it heard "bye".
---
I was ready to pull out my credit card, but I can't trust it to teach me the right information. I pay $160/mo for Vietnamese tutoring ($20 per class). This would be way cheaper and I don't have to schedule my classes.
The ChatGPT mobile app in hands-free voice conversation mode works quite well for language practice with one important call-out: you have to give it a topic at the beginning otherwise it won't be able to drive the conversation forward and will stick to banal pleasantries.
What I usually do is pick a random blurb in the news and paste the entire thing along with the Reuters link at the beginning and inform ChatGPT that we'll be carrying on language practice specifically over that topic of discussion.
I've used this to carry an hour long foreign language practice in Spanish while walking my husky. Just put the phone in my pocket and go. If you're an intermediate/advanced learner, it's a pretty decent solution.
In fact, you can actually instruct ChatGPT that you are going to speak in your native language, but ChatGPT is only allowed to respond in the target language if you just want to focus on practicing listening comprehension.
I'd be interested in hearing how significantly improved Issen is over this.
For me this is great for practice (I tried Russian). However the big missing piece for all these language learning apps is the lack of support for spotting and correcting errors in your pronunciation - as long as you say the word more or less right, the transcription gives you a pass.
I am very excited for the whole STT/TTS to go away and for us to have models that really "hear" exactly what you said.
Sometimes this is about accent but a lot of the time, the AI won't spot areas where you e.g. fudge a case ending or the stress on a word. Yes, you can get some of that pronunciation right by the AI repeating back with the correct stress or clear case, but you never really get the confidence that you would get from an actual human.
Another product suggestion - turn off transcription (at least for the tutor side of the conversation; I'd suggest both). Personally I find it distracting at best for languages I already speak well and a crutch for those I don't.
Finally, I find it really very hard to enjoy having a random conversation that's not very directed ("What interests you most about artificial intelligence?"). I'd suggest that there are ways of making it more goal focused without being explicitly gamified - maybe something like, here's a position and you have to persuade me (AI debate club!), or something that brings out an actual opinion or relates to a concrete experience ("what's your main goal in your job this year").
Overall though this is the first product I've seen in this space that I might actually use, so well done.
Congrats on your app, love it so far! Already sent it to over a dozen family members. Curious about a couple of things:
- I see only two employees on LinkedIn -- how were you able to QA all these different languages with just two people?!
- I tried Urdu and the app did quite well. But curious why you have two female voices and not any male voice?
- I realize Sesame is a much bigger team, but curious what you think they are doing that makes their voices feel so real and seamless. I don't think they do multiple languages, so I think you have a harder problem of course.
Question: ChatGPT voice mode seems to have too much tolerance for mispronunciation. Sometimes it understands you even if you mispronounce something in a phrase, and it's not aware enough to correct you - it even says your pronunciation is correct if asked. It's good at grammar, though.
It makes me think the audio goes through a kind of voice-to-text model before the answer, so nuance is lost; or the model wasn't trained to distinguish between correct and incorrect pronunciations.
Does Issen have this issue too? Pronunciation vices are common when you're learning a new language.
Congratulations on the launch.
I wish you great success.
Focusing on speaking first, and not writing, makes so much sense. As a father, I experienced firsthand how my child learned 3 languages (German, English, Arabic) without reading/writing first.
The hardest part for you will likely be the "curriculum". It's "easy" to make something that works for a couple of weeks. But language learning takes years.
Btw, if you are up for it, I would enjoy chatting with you. -> I co-founded an AI math tutoring company, and focused my PhD on how to influence human language with AI. Hint: Social connections between humans and AI.
What you are trying to solve is something I dreamed about for years.
I’m at a roughly A2 - B1 level at the language I’m learning and I picked up a whole lot of pretty basic grammar errors in the first conversation.
The app also used a bunch of constructions I’m not familiar with even though I specified I’m a beginner.
If I hired a human tutor and had this experience, I would ask for my money back.
This looks great, congrats! As someone that has gone through Assimil courses and done lots of comprehensible input for various languages, language production is typically the weak point that isn't covered well. I've done plenty of lessons on iTalki, but I've been wanting something more structured and this seems like it could cover it. Definitely going to give it a shot!
The feature request I make for all language course makers: please consider Bengali support in the future! It's wild to me that the 7th most spoken language in the world, with a deep culture around literature and poetry [1], gets zero attention from language course makers. I can buy an Assimil course on Breton, spoken by 200k people, and not Bangla, spoken by 284 million.
Nice! I’ve wanted this for years.
Suggestion: you may be able to integrate SRS into the conversation. You could encourage the model to use certain words and, more importantly, you can track the student’s active use of words that are on the review list, basically acting as if it were an SRS step. This could totally eliminate the need for flashcards.
This looks similar to Speak. I took part in their Japanese language beta and enjoyed it. It's an interesting use of AI and I did learn a lot fast. My biggest problem with these is that they feel like magic until they suddenly don't. Weird pronunciation and strange replies. It comes nowhere close to replacing a tutor or speaking partner yet. I am optimistic, as the tech improves fast.
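A rough sketch of what that could look like, assuming a very simple interval-doubling scheduler rather than any particular SRS algorithm. `Card`, `due_words`, and `record_active_use` are hypothetical names, and the whitespace-style tokenization would not handle inflected forms or languages written without spaces (e.g. Japanese), which would need a proper tokenizer or lemmatizer.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    word: str
    interval_days: int = 1
    due: date = date(2025, 1, 1)


def due_words(cards, today):
    """Words the tutor prompt could be nudged to elicit in this session."""
    return [c.word for c in cards if c.due <= today]


def record_active_use(cards, utterance, today):
    """Treat the student's unprompted use of a due word as a passed review."""
    tokens = set(re.findall(r"\w+", utterance.lower()))
    used = []
    for card in cards:
        if card.due <= today and card.word.lower() in tokens:
            card.interval_days *= 2  # naive doubling, not SM-2 or FSRS
            card.due = today + timedelta(days=card.interval_days)
            used.append(card.word)
    return used


cards = [Card("gato"), Card("perro")]
today = date(2025, 1, 2)
used = record_active_use(cards, "Mi gato duerme.", today)
still_due = due_words(cards, today)
```

Here `used` holds the words the student produced on their own, and `still_due` is what the tutor could keep steering the conversation toward.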
The beginner experience (like others mentioned) is not there, at all. It's as if I was dropped into a foreign country, and forced to talk. Sure, I'm going to flail around, and maybe get frustrated, and maybe with some hand gestures get my point across in real life.
But this is not real life - I tried to "flag" to the tutor, with sentences, that I don't understand and that I am in way over my head, and it just chugs along with long sentences, totally unaware.
The service should not advertise itself as having a beginner level at all, in my opinion.
Looks like a really great job, congratulations.
I'll try to give it a chance later today if I can find some room for it.
My main fear is being disappointed that it won't give me feedback on how far off track my pronunciation is, and that it will lack tips on how to make listening to me a nicer experience for a native speaker. That's really the thing I would like to be able to experiment with more than anything else regarding foreign language acquisition. And giving an IPA transcription of expected vs. actual pronunciation, and possibly some links to videos explaining each phoneme that went wrong, would be top-notch.
Regarding engagement, after having tried a bunch of online things, to my mind the best formula is to give insights on cultural and social matters: what are the regions of the country¹ and their specificities, what people love as food, drink, music, dance, and literature, what the historical struggles of the linguistic community have been, and who the people prized in this community are. Well, at least for my profile, it drives more interest than anything else.
¹ languages are not one to one bound to a single specific country of course, but you get the idea.
It would probably be better to pick one or two languages, actually work with native speakers to make sure it's right.
These "we cover every single language" tools get it like 75% right at best.
Can it be used in a car? Is looking at the screen required?
Luis von Ahn spoke in the early 2010s—probably around 2014—at The LAB in Wynwood, Miami. He recounted how his fascination with crowd-sourcing led first to reCAPTCHA and then to his latest venture, Duolingo. He made it clear that his real passion wasn’t language per se, but building a crowd-sourced human translation service as a business model. At that point, Duolingo had roughly 24 employees—and, much to his surprise, only two were focused on the crowd-sourcing engine. He explained how they’d enlisted some of the world’s leading language-education researchers as consultants. Their very first question: “Which part of speech should learners tackle first?” The experts confessed they didn’t know, so the team gathered the data and used A/B testing coupled with statistical analysis to pinpoint the answer.
Today, it’s not only easier than ever to launch a platform to challenge Duolingo, but its core product, its crowd-sourced human translation service, has been disrupted.
This morning, I found myself thinking about how all those decade-old learning platforms (like Coursera, as reflected in its ever-falling stock price) are being disrupted.
Your product looks awesome and I hope you disrupt all the language learning platforms. Thank you for sharing.
(I had ChatGPT fix my grammatical errors and now this comment doesn't sound like me, sorry.)
I appreciate your comment about gamification. I’ve kept a streak alive on other apps for no other reason than keeping a streak alive. Not learning a thing.
I've been thinking and playing slightly with this concept myself. A few thoughts:
1. Using a standard transcription service is pretty tricky because it's going to correct the user's speech. Or make it incorrect! Standard transcription is predicated on the speaker saying things correctly.
2. I've tried sending the audio directly to OpenAI to address this issue. I can't say if it works or not. It's very hard to test or understand a system without a transcript as a source of truth!
3. I'd like to learn a new language as a beginner, and all of these AI systems work poorly for this. It's great to immerse the learner in the language, but if you know NOTHING then it's not that helpful.
4. Language learning needs to be MUCH more multimodal than a standard chat. Especially as a beginner.
5. The AI should be generating translations and explanations alongside its responses. I'd like to be able to inspect everything the AI says (in the language I'm learning) to understand it.
6. Emoji would be another easy way to annotate the text.
7. I think giving the user/AI a subject to talk about would be helpful. Again, a subject that is not language-based would be great, like an image or something.
8. As a very new learner I would like an experience where I respond in my native language and then I'm told how to translate this to the language I'm learning. This should include a pronunciation guide. Then I should repeat the phrase I'm given.
9. I should still be able to ask questions in my native language and probably get a response in my native language. But with some prompting the AI should be able to distinguish these two cases.
10. For low latency it's nice if you produce the spoken text quickly, but you still have the opportunity to get the LLM to produce _more_ material immediately after. This is where things like translations can be produced.
11. You probably don't have timestamps on your TTS, but if you did and could highlight words as they were spoken that would be _great_. Probably worth choosing a TTS provider with that in mind.
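For point 11, the lookup itself is simple once word-level timings exist. This is only a sketch, assuming the TTS provider returns `(word, start_seconds, end_seconds)` tuples sorted by start time; that format and the name `active_word_index` are made up for illustration.

```python
from bisect import bisect_right


def active_word_index(timings, t):
    """Return the index of the word being spoken at playback time t, or None.

    timings: list of (word, start_s, end_s) tuples sorted by start_s.
    """
    starts = [start for _, start, _ in timings]
    i = bisect_right(starts, t) - 1  # last word that started at or before t
    if i >= 0 and t < timings[i][2]:
        return i
    return None  # before the first word, in a gap, or after the last word


# Hypothetical timings for a two-word utterance.
timings = [("Hola", 0.0, 0.4), ("mundo", 0.5, 1.0)]
index_mid_first = active_word_index(timings, 0.2)
index_in_gap = active_word_index(timings, 0.45)
index_second = active_word_index(timings, 0.5)
```

A UI would call this on each playback tick and highlight `timings[i][0]` whenever the result is not `None`.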
It's very cool, I'm enjoying playing with it.
Feedback: The tutor pronounces some obvious things wrong, in ways that contradict the written words. Two examples: 気滅の刃 - it pronounced 刃 wrong despite the furigana being correct. It also kept pronouncing は as "ha" even when used as the topic particle in more complex sentences. Edit: also observed 使い方 pronounced "saifou" - no idea what's going on there. It was in a mixed English-Japanese message.
I think I would pay for this if I wasn't worried about learning mispronunciations or errors.
Oh, more feedback: focus the app on the conversation with the tutor and leave the memorization to Anki - just let us export those words we struggle with to CSV or something so we can import into existing vocab workflows.
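Something like the following would be enough for the export side. It is a minimal sketch using Python's standard `csv` module; the `term`/`meaning` keys and the function name `export_words_csv` are assumptions about how a word bank might be stored, not ISSEN's actual data model. Anki's text import can read comma-separated files with one note per row.

```python
import csv
import io


def export_words_csv(words):
    """Serialize word-bank entries to CSV text, one row per word.

    words: iterable of dicts with hypothetical keys "term" and "meaning".
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for entry in words:
        # The csv module quotes fields containing commas automatically.
        writer.writerow([entry["term"], entry["meaning"]])
    return buf.getvalue()


struggled = [
    {"term": "使い方", "meaning": "how to use"},
    {"term": "食べる", "meaning": "to eat, to consume"},
]
csv_text = export_words_csv(struggled)
```

The resulting text can be written to a `.csv` file with UTF-8 encoding and imported into an existing deck.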
Thank you for sharing this.
I've been learning Arabic, and I noticed that the app uses Arabic script right from the start. This can be quite challenging for beginners who haven't learned how to read it yet. May I suggest adding an Englishized (romanized) version of the Arabic text to help ease the learning curve?
It also seems to not listen to me when I asked to give me shorter sentences. It seems to not care that I'm struggling despite my pleading.
I later switched to Spanish, which was a better experience. This one seems to listen to me better. I can ask the tutor to repeat what they said in English and give me shorter sentences, and thankfully, it does.
Interacting with the tutors does feel like I have to drive the conversation, which is taxing, compared to a human tutor, where I feel assured that I can be guided properly.
Still an interesting app. Would love to try Spanish some more, in the future.
I would exercise a new vocabulary by generating many different phrases based on a limited number of known words, like the Pimsleur method. This teaches words in context, not isolated.
I'm a second-gen Korean-American; my Korean is weak but conversational. I am intrigued by the reasoning model that analyzes my speech and points out various mistakes I'm making. It's a good first attempt at separating the 2 tracks of actual conversation vs mistake-correcting.
I think showing the raw reasoning text is not quite the right UI; maybe highlighting the specific text in red and showing a suggested correction would work better?
It's also a little awkward that the conversation is live; I don't really have any breathing room to read the reasoning traces on what mistakes I made / could have done better. I hung up the first time I tried to figure out how to pause.
Yeah great work with this. Seems like a real opportunity given how hard Duolingo is dropping the ball.
I can't wait to try this! I studied a few languages in school and have lost any semblance of proficiency -- mainly because I never have a real occasion to use anything other than English. I've been waiting for someone to build something like this
Well done! I have built a side project in the same space, given that I wanted to learn Spanish (my wife is Colombian) and also wasn't happy with existing offerings. I have used the OpenAI realtime API to fully focus on audio conversations. You can check it out (for free) here: http://lucas.alldone.app/
Congrats on the launch!
I tried the Japanese track. I'm a total beginner and the first lesson wasn't helpful at all. The AI asked about maybe mixing up Japanese<>English, but it didn't actually follow through. It either spoke fully in Japanese or fully in English. Maybe this is a standard practice for language lessons? I remember going to the first day of French class in a community college, and the teacher only spoke French, which was extremely overwhelming. Perhaps it's the standard way of teaching? Even if it is, I'm not sure if it works when compressed down to the shorter times I see myself opening the app.
I learned Spanish to an advanced level (B2) many years ago with a combo of Duolingo, Anki flashcards, and real tutors. One of my biggest regrets is focusing too much on the grammar and vocabulary and not enough on having conversations with natives. I'm convinced it would have taken me half the time to reach B2 if I had focused more on conversations. I think this app is going to be really effective. Congrats to you guys on the launch!
I really like the idea and I'm a potential customer, but I don't think this is ready yet. I've been learning Chinese for a while and decided to give this a shot and at my level (somewhere between HSK 2 and 3) it's very frustrating:
When I babble (as someone at my level does) and say "eh... a bit of sentence eh... a bit more of sentence", half the time it cuts me off at the first eh... or the second one. This is extremely frustrating; in fact, I didn't even finish the free 20-minute trial because of this.
Another issue is that like all LLMs it's bad at maintaining context of a conversation. I tried speaking about cars with it, as it's a topic I like so I thought it'd be cool and all of a sudden it's asking me what's my favourite ice cream. Don't get me wrong, I'm 100% certain I said something about ice cream but any human would understand I didn't want to say that.
Also, I tried it with Spanish, as I'm a native speaker. The speech recognition is bad. I don't know what sort of processing this does, but it makes a lot of mistakes, whereas ChatGPT very rarely fails to transcribe. I'd say well over 20% of sentences were misunderstood.
The idea is cool, but I wouldn't recommend this to anyone who wants to learn Spanish.
I built a basic version of this for myself with a prompt in ChatGPT in an afternoon. It's great that you've built this yourself, but where's the magic? If it's your prompt, it can probably be extracted in a few minutes by those who know how to do so.
Speaking of translation with LLMs I've been looking for a solution to quickly open a bi-directional translation context without having to prompt ChatGPT or any other LLM every time. iOS lets you set the action button to use the default translation app quickly, but the translation it provides is vastly inferior to LLMs.
Even some basic app that can pre-load the prompt doesn't seem to exist?
I tried the Web Version. Started, then tried to create an account, but it kept looping, informing me that my email address does not exist in your system. Well, the “Create New Account” got kicked off and gets me in a loop of “Do not Exist”. I just went through the whole process again, and I'm back to the beginning.
I’m going to assume this works better on the App.
Thanks for sharing! I tried using it for Thai language coming from English and found that the app understands me well! But I couldn’t understand it at all. It replied to my turns with very long messages (20+ syllables) in pure Thai and spoke with an unnatural rhythm which made it hard to pick out words or phrases. The foreign alphabet made it really difficult too. I tried changing some settings in the bottom left menu and it started speaking English to me too, but I found it unbearably slow. At one point it asked me if I wanted it to speak in pure Thai or a mix and then ignored my answer. Ultimately as a beginner I don’t think Issen will work for me very well as-is. Happy to check back in the future!
Do you store conversations? And what's the general privacy philosophy behind the app?
Cool stuff! Probably one of the less popular languages, but I noticed that the transcription with Russian is often quite poor.
Part of me loves this—no judgement, endless convenience, cheap. But another part mourns, sensing it strips away the grit, the stumbles, the soul of language learning. The kind that only comes from fumbling through conversations with another human.
When I was learning Spanish, I used italki extensively and found having a live Colombian tutor invaluable and very affordable for most Westerners. It would genuinely make me sad if those excellent tutors start losing work to this kind of AI.
Can't wait to try it; my kids need to learn French in school and I've been trying to keep up with them with Duolingo; but something is missing there.
For me a key feature will be a family plan; Duolingo is great in that regard.
Glad you're working on this. Duolingo is garbage and I've been hopeful that AI can help accelerate language learning in a way that is actually effective.
Just used it for French right now. The design is excellent, but the LLM's task orientation needs some work. The tutor needs to follow the curriculum well. This has the same issue that I have in my day job, i.e. keeping the LLM on topic. It's not strict: after asking it to make sure to remind me to reply in French, it very easily forgets to do so. It's not following a structured approach, and even in casual conversation it isn't correcting my mistakes unless I ask.
Also, I noticed your app doesn't work without a network connection, so I'm assuming you're doing all the TTS and STT server-side. Curious how practical that is w/r/t latency? Any plans to do it all on-phone?
(Probably a more fringe request, but I'm asking because I do all my language learning on commuter trains w/o a good connection.)
This is the language app I've always wanted to exist. Will try it out - really hoping it can create custom lessons for specific scenarios that I need to study for.
Portuguese should have the flag of Brazil.
Don’t dim screen on iPhone during conversation.
The tutor should terminate the lesson when its goals are achieved and do a warm handoff.
Overall it’s quite good.
I'm glad someone is building this! I was using this in Thai. I expected it to be awful. But it's actually very good. I only used it for a few minutes but will try to use it more later. It's possibly good enough for me to stop paying my tutor. However, please use a different Text to Speech model because the current Thai one sounds robotic, like the old (current?) Google Translate. This seems like a great product.
This actually looks pretty neat. How have you been able to achieve such broad language support so quickly?
How widely have you tested your supported languages on native-speakers and learners?
This might be the most obvious question regarding this, but how are you planning on competing with the entrenched competition for mindshare, namely Duolingo. This is probably technically superior, but from a user standpoint, it might not be so. Happy to be proven wrong
Thanks for working on this! Language learning really needs a breakthrough.
Now, I tried the web app and chose to learn Greek as a beginner. And while I had a better experience with your app than with ChatGPT or Gemini voice modes, I still got lost 5 minutes in because the AI tutor doesn't seem to have a plan for me, nor does it "see" my struggles. For example, after asking me about a hobby, it gives me a long sentence in Greek about how nice it is to hike in the mountains. Being an absolute noob, I cannot reply to it, nor even repeat it. And I don't even know what is expected of me at that moment. A human tutor here would probably repeat a part of the sentence with a translation and ask me to repeat it, or would explain something. The AI just sits there waiting for me to make a sound, and when I make it, it goes off on a tangential subject of beach vacations. :)
Again, this is still relatively not bad, and I'm going to give it another try.