Hacker News

deanc · last Thursday at 5:14 PM · 3 replies

I built a basic version of this for myself with a ChatGPT prompt in an afternoon. It's great that you've built this yourself, but where's the magic? If it's just your prompt, it can probably be extracted in a few minutes by those who know how to do so.


Replies

M4R5H4LL · last Thursday at 5:38 PM

If it was that straightforward, why not finish it, publish it to the store, and earn income from it? There's a long way between a toy application that demonstrates a concept and something on shelves actually selling. In other words, you can often tackle the concept or the trivial parts of an app quickly, but it's much harder to ship a real product, even if the implementation looks straightforward on the surface.

vimy · last Thursday at 10:10 PM

> We wanted something that would talk with us — realistically, in full conversations — and actually help us improve. So we built it ourselves. The app relies on a custom voice AI pipeline combining STT (speech-to-text), TTS (text-to-speech), LLMs, long term memory, interruptions, turn-taking, etc. Getting speech-to-text to work well for learners was one of the hardest parts — especially with accents, multi-lingual sentences, and noisy environments. We now combine Gemini Flash, Whisper, Scribe, and GPT-4o-transcribe to minimize errors and keep the conversation flowing.

Your prompt can't do this. I know because I've been trying to build something similar, and a prompt alone just isn't enough. You need multiple LLMs and custom code working together to achieve realistic conversations.
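The quoted post says the app combines several STT engines "to minimize errors" but doesn't say how the outputs are fused. One simple approach (an assumption on my part, not their documented method) is consensus by similarity: run the same audio through each engine, then pick the hypothesis that agrees most with the rest, on the theory that a model-specific mishearing will be an outlier. A minimal sketch with hypothetical engine outputs:

```python
from difflib import SequenceMatcher

def fuse_transcripts(candidates: list[str]) -> str:
    """Pick the candidate most similar, on average, to all the others.

    A crude consensus rule: the hypothesis that best agrees with the
    rest of the ensemble is least likely to carry a one-engine error.
    """
    def avg_similarity(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(
            SequenceMatcher(None, candidates[i], o).ratio() for o in others
        ) / len(others)

    best = max(range(len(candidates)), key=avg_similarity)
    return candidates[best]

# Hypothetical outputs from four STT engines for the same utterance:
hyps = [
    "I would like to order a coffee",     # engine A
    "I would like to order a copy",       # engine B mishears "coffee"
    "I would like to order a coffee",     # engine C
    "I'd like to order a coffee please",  # engine D paraphrases
]
print(fuse_transcripts(hyps))  # -> "I would like to order a coffee"
```

A real pipeline would likely weight engines by per-language accuracy or fuse at the word level (e.g. ROVER-style voting) rather than picking a whole hypothesis, but even this toy rule shows why an ensemble beats any single model's prompt.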

haiku2077 · last Thursday at 6:20 PM

[flagged]
