It seems to depend heavily on what exactly you're transcribing; the performance/quality across models is really uneven. Some models work really well for old cursive but then fail at reading 8-bit segment LCD digital fonts, or vice versa, or any other combination out there.
Basically, to find the answer you really need your own benchmark, run on real examples of what you want to do. The same goes for pretty much anything in ML nowadays, as the public benchmarks can't really be trusted to give you any indication of how well a model would work for you.
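For what it's worth, a personal benchmark like that doesn't have to be fancy. Here's a rough sketch of how I'd set one up, assuming each model can be wrapped as a plain function from image path to text. The names `run_benchmark`, `cer`, and the `(category, image_path, truth)` sample layout are all made up for illustration, and character error rate is just one reasonable metric, not the only one.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sample layout: (category, image_path, ground_truth_text)
Sample = Tuple[str, str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(predicted: str, truth: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not truth:
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, truth) / len(truth)


def run_benchmark(
    models: Dict[str, Callable[[str], str]],
    samples: List[Sample],
) -> Dict[str, Dict[str, float]]:
    """Return the mean CER per model, broken down by sample category."""
    results: Dict[str, Dict[str, float]] = {}
    for name, transcribe in models.items():
        scores: Dict[str, List[float]] = {}
        for category, image_path, truth in samples:
            scores.setdefault(category, []).append(cer(transcribe(image_path), truth))
        results[name] = {c: sum(v) / len(v) for c, v in scores.items()}
    return results
```

You'd fill `models` with thin wrappers around whatever you're comparing and `samples` with your own hand-checked transcriptions, tagged by category (e.g. "cursive", "lcd"). The per-category breakdown is the point: an overall average would hide exactly the kind of unevenness described above.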