Hacker News

matheusmoreira today at 1:20 AM

> at this point I'm about to just invest in fully local inference instead

This is the best way forward long term. We won't have frontier performance, but at least the models will be aligned with us instead of refusing us or sabotaging us.


Replies

giancarlostoro today at 1:53 PM

I think my biggest hangup is that some models don't have big enough context windows. My personal sweet spot with Opus is having at least 400k to 600k tokens, so a local model that can reach that, or go slightly above 600k (maybe 700k) for some buffer, would be perfect.

I've also debated using a frontier model for planning only, and then feeding the plan to smaller offline models.
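
A rough sketch of that planner/worker split, under loud assumptions: `Model`, `split_plan`, `run_pipeline`, and the two fake models are hypothetical names made up for illustration, and the plan format (one step per line) is just one possible convention. No real API or local runtime is called here; each model is a plain function from prompt to text, so you would swap in whatever client you actually use.

```python
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def split_plan(plan: str) -> List[str]:
    """Split a plan into steps, one per non-empty line (assumed format)."""
    return [line.strip() for line in plan.splitlines() if line.strip()]


def run_pipeline(task: str, planner: Model, worker: Model) -> List[str]:
    """Ask the planner once, then hand each step to the smaller worker model."""
    plan = planner(f"Write a step-by-step plan, one step per line, for: {task}")
    results = []
    for step in split_plan(plan):
        # Each worker call sees only the task and a single step,
        # so the prompt stays far below the planner's context size.
        results.append(worker(f"Task: {task}\nStep: {step}"))
    return results


# Stand-in models so the sketch runs without any API key or local runtime.
def fake_planner(prompt: str) -> str:
    return "1. read the code\n\n2. write the patch\n"


def fake_worker(prompt: str) -> str:
    return "done: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for result in run_pipeline("fix bug", fake_planner, fake_worker):
        print(result)
```

The point of the split is that only the planning call needs the big context window; whether small per-step prompts carry enough context for the worker is an open question this sketch does not answer.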