Something is either public record - in which case it should be on a government website for free, and the AI companies should be free to scrape to their hearts' desire...
Or it should be sealed for X years and then public record. Where X might be 1 in cases where you don't want to hurt an ongoing investigation, or 100 if it's someone's private affairs.
Nothing that goes through the courts should be sealed forever.
We should give up with the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
The issue is that the ease of accessing information and the ease of propagating it can transform the effect that information has on the public.
> We should give up with the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
I really don't see why. Adding friction to access may be a way to preserve the public's ability to access information while avoiding the pitfalls of unrestricted information access and processing.
The story is about a tool that allows journalists to get advance warning of court proceedings so they can choose to cover things of public interest.
It's not about any post-case information.
> We should give up with the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
How about rate limited?
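For what it's worth, rate limiting is cheap to sketch. Below is a minimal token bucket in Python, one common way to do it; nothing here comes from the actual court database, and the class name, the `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for illustration. Each client gets a small burst allowance that refills slowly, which a human browsing cases would rarely hit but a bulk scraper would hit almost immediately.

```python
import time


class TokenBucket:
    """Hypothetical per-client limiter: bursts up to `capacity`, refilled at `rate` tokens/second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock              # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True and spend one token if the request may proceed, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice there would be one bucket per API key or IP address, and the numbers would need tuning; the point is only that "open but rate limited" is a well-understood middle ground.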
> Something is either public record - in which case it should be on a government website for free, and the AI companies should be free to scrape to their hearts' desire... Or it should be sealed for X years and then public record.
OR humans should be allowed to access the public record, with fees charged to scrapers.
I don't know what the particular issue is in this case but I've read about what happens with Freedom of Information (FOI) requests in England: apparently most of the requests are from male journalists/writers looking for salacious details of sex crimes against women, and the authorities are constantly using the mental health of family members as an argument for refusing to disclose material. Obviously there are also a few journalists using the FOI system to investigate serious political matters such as human rights and one wouldn't want those serious investigations to be hampered but there is a big problem with (what most people would call) abuse of the system. There _might_ perhaps be a similar issue with this court reporting database.
England has a genuinely independent judiciary. Judges and court staff do not usually attempt to hide from journalists stuff that journalists ought to be investigating. On the other hand, if it's something like an inquest into the death of a well-known person which would only attract the worst kind of journalist they sometimes do quite a good job of scheduling the "public" hearing in such a way that only family members find out about it in time.
A world government could perhaps make lots of legal records public while making it illegal for journalists to use that material for entertainment purposes, but we don't have a world government: if the authorities in one country were to provide easy access to all the details of every rape and murder in that country then so-called "tech" companies in another country would use that data for entertainment purposes. I'm not sure what to do about that, apart, obviously, from establishing a world government (which arguably we need anyway in order to handle pollution and other things that are a "tragedy of the commons" but I don't see it happening any time soon).
One of the problems with open access to these government DBs is that it gives out a lot of information that spammers and scammers use.
E.g. if you create a business then that email address/phone number is going to get phished and spammed to hell and back again. It's all because the government makes that info freely accessible online. You could be a one-man self-employed business and the moment you register you get inundated with spam.
I want information to be free.
I don't think all information should be easily accessible.
Some information should be in libraries, held for the public to access, but have that access recorded.
If a group of people (citizens of a country) have data stored, they ought to be able to access it, but others maybe should pay a fee.
There is data in "public records" that should be very hard to access, such as evidence of a court case involving the abuse of minors that really shouldn't be public, but we also need to ensure that secrets are not kept to protect wrongdoing by those in government or in power.
Spoken like someone who's never spent thousands of dollars and literal years struggling to get online records corrected to reflect an expungement. Fuck anything that makes that process even more difficult, which AI companies certainly will.
Yes. This should be held by the London Archives in theory with the rest of the paper records of that sort.
They have the ability to seal documents until set dates and deal with digital archival and retrieval.
I suspect some of this is that it's a complete shit show and they want to bury it quickly or avoid having to pay up for an expensive vendor migration.
I think the right balance is to air gap a database and allow access to the public by your standard: show up somewhere with a USB drive.
I think it's right to prevent random drive-by scraping by bots/AI/scammers. But it shouldn't inhibit consumers who want to use it to do their civic duties.
The idea that an individual being able to look up any case they want is the same thing as a bot being able to scrape and archive an entire dataset forever is just silly.
One individual could spend their entire life going through one by one recording cases and never get through the whole dataset. A bot farm could sift through it in an hour. They are not the same thing.
> and the AI companies should be free to scrape to their hearts' desire...
Why? They generate massive traffic, why should they get access for free?
> Nothing that goes through the courts should be sealed forever.
What about family law?
This is a good use case for a blockchain. AI companies can run their own nodes so they're not bashing infra that they don't pay for. Concerned citizens can run their own nodes so they know that the government isn't involved in any 1984-type shenanigans. In the sealed-for-X-years case, the government can publish a hash of the blocks that they intend to publish in X years so that when the time comes, people can prove that nobody tampered with the data in the interim.
The government can decide to stop paying for the infra, but the only way to delete something that was once public record should be for all interested parties to also stop their nodes.
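The "publish a hash now, release the data in X years" part doesn't need a full blockchain to illustrate. Here is a minimal Python sketch of a hash commitment using SHA-256 from the standard library. The function names are hypothetical, and the random nonce is my own addition rather than something from the comment above: without it, a short or guessable record could be brute-forced from its published hash before the seal expires.

```python
import hashlib
import hmac
import secrets


def commit(sealed_record: bytes) -> tuple[str, bytes]:
    """Return (digest to publish now, nonce to keep sealed alongside the record)."""
    nonce = secrets.token_bytes(32)  # random value so the digest can't be guessed from the record alone
    digest = hashlib.sha256(nonce + sealed_record).hexdigest()
    return digest, nonce


def verify(released_record: bytes, nonce: bytes, published_digest: str) -> bool:
    """Check that a record released after X years matches the digest published at sealing time."""
    recomputed = hashlib.sha256(nonce + released_record).hexdigest()
    return hmac.compare_digest(recomputed, published_digest)
```

Anyone who saved the published digest can run `verify` once the record and nonce are released; a single changed byte in the record makes the check fail. Where the digest is published (a chain, a newspaper, a third-party archive) is a separate question.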
Open to research yes.
Free to ingest and make someone's crimes a permanent part of AI datasets resulting in forever-convictions? No thanks.
AI firms have shown themselves to be playing fast and loose with copyrighted works. A teenager shouldn't have their permanent AI profile become "shoplifter" because they committed a crime at 15 years old that would otherwise have been expunged after a few years.