I appreciate your reply, but you are completely glossing over his point about how head-to-head model evals are useless lmao