Deleting code and cleaning it up are perhaps more an expression of seniority and personal preference. Maybe there should be the same kind of style transfer for code that you see with graphical generative AI: "rewrite this code path in the style of Donald Knuth".
I imagine there would be value in not just throwing all GitHub commits in as training data, but also rating their quality.