> In a blind evaluation of nearly 3,000 anonymized comparisons, professors rated AI responses significantly higher than answers written by other professors, with AI winning 75% of head-to-head matchups.
75% win rate seems pretty good!
Paper link: https://law.stanford.edu/wp-content/uploads/2026/06/salinas_...
Yeah, a 75% win rate corresponds to a ~200-point Elo difference, which is quite massive.
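For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. It assumes the standard logistic Elo model with a 400-point scale and treats the 75% figure as the expected score with ties ignored; the paper may handle ties differently. The function names `elo_gap` and `expected_score` are my own.

```python
import math

def elo_gap(win_rate: float) -> float:
    """Rating gap implied by a win rate under the standard logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    # Invert E = 1 / (1 + 10^(-gap/400)) to get gap = 400 * log10(E / (1 - E)).
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

def expected_score(gap: float) -> float:
    """Expected score of the higher-rated side for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

print(round(elo_gap(0.75)))             # 191
print(round(expected_score(200.0), 2))  # 0.76
```

So a 75% win rate works out to about 191 points, and an exact 200-point gap would be closer to a 76% win rate. Close enough to the round figure.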
I do wish they'd used some more objective criteria. Simply being preferable is one of the things LLMs have been trained for since the beginning, hence their sycophantic nature.
I wonder to what degree the AI was just better at communicating. My experience with attorneys is that they are often some of the worst writers.