wouldn't the most obvious thing to do be to build a good training dataset and make the dataset widely available?
Only if you believe other people will value that enough to expend the effort necessary to use it. If you believe other people will see it as low value and ignore it then you'd be better off doing the training yourself in order to guarantee it happens.
There's also a secondary benefit that your team doing the work will learn some useful skills while they do it.