Hacker News

wongarsu, yesterday at 1:53 PM

Yes. I would not consider Kimi a particularly good model relative to its size, and making a SotA model is a lot more expensive. But training costs are explicitly excluded when talking about the cost to serve tokens.