Has anybody used V4 hard, for the most challenging tasks (agentically, locally)? It's so hard to compare without putting serious time in it. Like spending a year daily with the model.