I’m running the same model on a 48GB MBP with a q4 quant and it’s pretty decent. You definitely don’t need 128GB. That’s the scale for 70B models at q8 or something.
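For a rough sense of that scale: weights-only memory is approximately parameter count times bits per weight, divided by 8. This is a minimal sketch, assuming idealized 8-bit and 4-bit quants (real quant formats add some per-block overhead) and ignoring KV cache and runtime overhead. `weight_memory_gb` is a hypothetical helper, not part of any library.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits.
    Context (KV cache) and runtime overhead come on top of this figure.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# A 70B model at an idealized q8 versus q4.
q8_70b = weight_memory_gb(70, 8)
q4_70b = weight_memory_gb(70, 4)
print(f"70B @ q8: ~{q8_70b:.0f} GB, 70B @ q4: ~{q4_70b:.0f} GB")
```

By this estimate a 70B model at q8 needs about 70 GB for weights alone, which is why it only fits comfortably on the larger-memory machines, while the same model at q4 is about half that.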
How much does one of those cost in the US? Here in Brazil, your notebook is worth as much as a used Honda Fit, which seems absolutely insane. For comparison, the ThinkPad I'm currently running cost me 1/20 of what this MBP costs here, leaving me with over $8,000 to spend on LLM inference (if I actually spent money on that).
> I’m running the same model on a 48GB MBP with a q4 quant and it’s pretty decent.
Context size?
I've been running it on my 48GB MBP too and it's not particularly great. Super slow and nowhere near the quality of even Claude Sonnet.