Not to mention the text-only 0.8GB version. Just crazy. You can now have basic real-time conversations on-device that are video- and audio-aware.
Have you seen a 0.8GB model file floating around yet? I couldn't find one earlier.
I'll be honest with you. My main ask for on-device AI is that when I am typing "Going out for a quick j" it corrects to "jog" and not "Jonathan". I don't think it needs that many gigabytes.
0.8GB is for text only. It's more like ~1.1GB if you include the video/audio encoders.