Do you have data for other benchmarks? +7% on HLE isn't nothing, but it'd be more compelling if you could show your method consistently does better across more domains (especially coding, which seems to be the primary use case these days).
As of right now, we do not. I'm working on these other benchmarks, but unfortunately they cost quite a bit of money to run. I'm hoping that money will come from many people using Sup :)