Hacker News

lemax, today at 5:48 AM

I've used RLAIF to build heuristic-based, non-LLM models for various decision systems, and reached around 95% F1 on certain projects. We're at a point where models can be used to tune a lot of things via feedback loops.
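
For anyone wondering what a loop like that could look like, here's a minimal sketch, not the actual setup from those projects. It assumes a one-parameter heuristic (a threshold on a score) and swaps real RL for a plain grid search scored by F1 against judge labels. `stub_ai_judge`, `heuristic`, and `tune_threshold` are hypothetical names, and the judge is a hard-coded rule standing in for an LLM call.

```python
from typing import Callable, List, Tuple


def f1_score(preds: List[bool], labels: List[bool]) -> float:
    """Plain F1 over boolean predictions; returns 0.0 when there are no true positives."""
    tp = sum(p and l for p, l in zip(preds, labels))
    fp = sum(p and not l for p, l in zip(preds, labels))
    fn = sum((not p) and l for p, l in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def heuristic(value: float, threshold: float) -> bool:
    """The non-LLM model being tuned: a single threshold rule (assumption)."""
    return value >= threshold


def stub_ai_judge(value: float) -> bool:
    """Hypothetical stand-in for an LLM judge; a real one would call a model."""
    return value >= 0.6


def tune_threshold(
    samples: List[float],
    judge: Callable[[float], bool],
    candidates: List[float],
) -> Tuple[float, float]:
    """Pick the candidate threshold whose predictions best match the judge's labels."""
    labels = [judge(s) for s in samples]  # AI feedback, collected once
    best_threshold, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [heuristic(s, t) for s in samples]
        score = f1_score(preds, labels)
        if score > best_f1:  # keep the first best candidate on ties
            best_threshold, best_f1 = t, score
    return best_threshold, best_f1


samples = [i / 10 for i in range(11)]
candidates = [i / 10 for i in range(11)]
best_threshold, best_f1 = tune_threshold(samples, stub_ai_judge, candidates)
```

Note the F1 here is measured against the judge's labels, so it only tells you how well the heuristic imitates the judge. You'd still want a human-labeled holdout set to know how good the judge itself is.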