> Both Amazon and Google serve Opus at roughly 1/2 the speed of the Chinese models
We were responding to a claim of about 10x, not 0.5x.
x86 and arm64 could perform differently. The Chinese models could be optimized for different hardware, which could produce massive differences.
These providers do not run models on CPUs, so x86 vs. Arm is irrelevant.