Hacker News

embedding-shape, yesterday at 6:10 PM

Or distilled models, or just slightly smaller models with the same architecture. Lots of options, all of them conveniently fitting inside "optimizing inferencing".