embedding-shape • yesterday at 6:10 PM

Or distilled models, or just slightly smaller models but same architecture. Lots of options, all of them conveniently fitting inside "optimizing inferencing".
