All 300+ battles are available at https://app.uniclaw.ai/arena/battles. Every battle is shown with its raw conversation history, produced files, the judge's verdict, and final scores.
Thanks! Is the judge an LLM? There are a lot of references to "just like LMArena", but LMArena is human-evaluated?