"Each benchmark was run multiple times, and I’m using the median to get rid of any potential outliers."
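This is not how you should do benchmarks. Don't take the median, and you don't even need to do any "warming up".
Simply run each benchmark long enough and keep only the best (fastest) result of each. Interference from the OS, cold caches, and other processes almost always makes a run slower, not faster, so the minimum is a more reliable estimate of the code's actual cost than the median.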
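
To make the "best of several long runs" idea concrete, here is a minimal sketch in Python using the standard `timeit` module. The helper `best_time` and the sample workload `work` are hypothetical names made up for illustration, and the `repeat` and `number` values are arbitrary assumptions you would tune so that each run lasts long enough.

```python
import timeit


def best_time(func, repeat=7, number=1000):
    """Return the fastest observed per-call time in seconds.

    Hypothetical helper: runs `func` `number` times per measurement,
    takes `repeat` measurements, and keeps only the minimum.
    """
    # timeit.repeat returns a list with the total time of each measurement.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


def work():
    # Placeholder workload, chosen only for illustration.
    return sum(range(1000))


if __name__ == "__main__":
    print(f"best: {best_time(work) * 1e6:.2f} microseconds per call")
```

Early, slower runs (the ones a warm-up phase would normally discard) are simply never the minimum, which is why no separate warm-up step appears here.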