This is the bit I'm suspicious of:
> They calibrated AI responses to match the length and structure of human answers
I would guess this removes some of the AI's hallucinations and errors.