Hacker News

kccqzy 06/26/2025

It seems way worse than other small models, to the point of responding with complete non sequiturs. I think my favorite small model is still the DeepSeek R1 distill of Llama 8B.


Replies

oezi 06/27/2025

The key here is multimodal.