Hacker News

PeterStuer, today at 5:24 AM

I use a 4090 and 96GB of RAM to run local models slowly (atm Qwen-code-next at 7 tps) with their full context window. I keep this setup running just for testing and to practice a fallback in case I lose access to Claude and GPT.