Reminds me of the recent Terminal Bench controversy [1][2][3]
If theres a benchmark, people will cheat, lie and optimize for that benchmark. Honest depends on the compliance enforced on teams. But if, compliance itself is weak, it is going to be taken advantage of. Like growing up india, you would optimize for the exam and not what you learn from it.
[1] https://news.ycombinator.com/item?id=47920787
Exactly! The task gets even trickier when you're benchmarking lots of systems of different kinds: cloud databases, self-hosted ones, embedded engines, CLI tools.