Hacker News

qudat, yesterday at 11:22 PM

I feel like the claims come from wildly different personas and use cases. A 5-year-old Titan with 24 GB of VRAM runs 27B at 30 t/s, and the results are good. I use Sonnet and Opus at my day job, and they are more capable, but I can still get the same results out of Qwen; I just need to be mindful of context (ctx).