DeepSeek is close to SOTA today, as are Kimi and GLM. Yes, they'll be slow and high-latency on ordinary hardware, but let's be real: nobody reasonable is running Opus or GPT on a 24/7 basis either. Local AI heavily rewards slow, around-the-clock inference over fast responses.