I think you're underestimating how important "just a pain in the ass to work with" may be.
An analogy would be Hansard and theyworkforyou.com. The government always made Hansard (record of parliamentary debates) available. But theyworkforyou cleaned the data, and made it searchable with useful APIs so you could find how your MP voted. This work was very important for making parliament accessible; IIRC, the guys behind it were impressive enough that they eventually were brought in to improve gov.uk.
Hansard is a great example, because originally parliament did not allow publication of its proceedings. To get round this: "Several editors used the device of reporting on parliamentary debates under the veil of debates of fictitious societies or bodies. The names under which parliamentary debates were published include Proceedings of the Lower Room of the Robin Hood Society and Debates of the Senate of Magna Lilliputia."
Yeah, this is a common problem with these kinds of issues. Data can be open and available, but if it's not easily usable and parseable by anyone, there will always be a third party willing to do that work for a premium. Data has to be ready for consumption when the state shares it. That's why one of my very strong objections is to splitting a continuous registry into separate datasets by year. It should be one dataset with a single column that tells you which year each row corresponds to.
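To make the "one dataset with a year column" idea concrete, here's a rough sketch of what a consumer currently has to do by hand when a registry is published as one file per year. The column names, the sample rows and the `merge_yearly_csvs` helper are all made up for illustration; a real registry would have its own schema.

```python
import csv
import io


def merge_yearly_csvs(csv_by_year):
    """Combine per-year CSV texts into one list of rows with a 'year' column.

    csv_by_year maps a year (int) to the CSV text published for that year.
    """
    merged = []
    for year, text in sorted(csv_by_year.items()):
        for row in csv.DictReader(io.StringIO(text)):
            # Tag every row with the year it came from.
            merged.append({"year": year, **row})
    return merged


# Hypothetical per-year extracts of a continuous registry.
registry_2021 = "id,name\n1,Alpha\n2,Beta\n"
registry_2022 = "id,name\n3,Gamma\n"

rows = merge_yearly_csvs({2021: registry_2021, 2022: registry_2022})
```

If the publisher shipped the merged form in the first place, nobody downstream would need this step, and filtering by year would be a one-line condition on the `year` column.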
Well said
The problem with "cleaning the data" is that it sometimes strips away so much context that it gives you a misleading impression. Rory Stewart once said it took him 40 hours to fully understand a piece of legislation he was voting on, yet the whips expected him to vote on multiple pieces of legislation every week. Still, most people wave an MP's voting record around as if the MP 100% understood and agreed with everything they voted on, despite that being mathematically impossible. If they'd voted differently, would it have changed the outcome? Was it even a binding motion? Most of the real debate in the UK Parliament happens beforehand anyway, and the government will withdraw any votes it knows it's going to lose before they even get to the chamber, so the real rebellions don't even get recorded on theyworkforyou.