In addition to the other discussion: it's important to measure outcomes and not just look at the CPU meter...
At the same load, how did latency look for A vs B?
What were throughput and latency at maximum load for A vs B? And at the max throughput of whichever one topped out lower, what did latency look like for the other option?
For bonus points while testing: is there another observable metric that indicates available capacity, if CPU % free is less useful?
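
If it helps, here is a rough Python sketch of how those comparisons could be scripted once you have results from a stepped load test. Everything in it is an assumption for illustration: the `Sample` fields, the choice of p99 as the latency figure, linear interpolation between load steps, and the numbers themselves, which are made up and not real measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    load_rps: float        # offered load during this test step
    throughput_rps: float  # requests actually completed per second
    p99_ms: float          # 99th-percentile latency at this step


def max_throughput(samples):
    """Highest completed-request rate seen across all load steps."""
    return max(s.throughput_rps for s in samples)


def latency_at(samples, throughput_rps):
    """Linearly interpolate p99 latency at a given throughput.

    Returns None if the throughput is outside the measured range.
    """
    pts = sorted(samples, key=lambda s: s.throughput_rps)
    for lo, hi in zip(pts, pts[1:]):
        if lo.throughput_rps <= throughput_rps <= hi.throughput_rps:
            span = hi.throughput_rps - lo.throughput_rps
            if span == 0:
                return lo.p99_ms
            frac = (throughput_rps - lo.throughput_rps) / span
            return lo.p99_ms + frac * (hi.p99_ms - lo.p99_ms)
    return None


def latency_by_load(a, b):
    """Pair up p99 latency for A and B at every load step both were tested at."""
    b_by_load = {s.load_rps: s.p99_ms for s in b}
    return {s.load_rps: (s.p99_ms, b_by_load[s.load_rps])
            for s in a if s.load_rps in b_by_load}


# Hypothetical results, invented purely to exercise the functions.
A = [Sample(100, 100, 10), Sample(200, 200, 14),
     Sample(300, 290, 40), Sample(400, 300, 250)]
B = [Sample(100, 100, 12), Sample(200, 200, 13), Sample(300, 300, 18),
     Sample(400, 390, 35), Sample(500, 400, 300)]

if __name__ == "__main__":
    # Question 1: same load, latency for A vs B.
    for load, (a_ms, b_ms) in sorted(latency_by_load(A, B).items()):
        print(f"load {load:>4} rps: A p99={a_ms} ms, B p99={b_ms} ms")

    # Question 2: at the lower of the two max throughputs, compare both.
    ceiling = min(max_throughput(A), max_throughput(B))
    print(f"at {ceiling} rps: A p99={latency_at(A, ceiling)} ms, "
          f"B p99={latency_at(B, ceiling)} ms")
```

The point of the second comparison is that an option can look fine on the CPU meter while its latency has already fallen off a cliff near its ceiling, so comparing both options at that same throughput tells you more than either max number alone.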