> Differential privacy makes this trade-off explicit, and thus impossible to ignore.
I think he has it backwards here.
Techniques like differential privacy hide the fact that a trade-off exists, except for a small cadre of experts who live and breathe this stuff.
I don’t know enough to defend this decision, but it strikes me that if there is a real trade-off, not having access to these techniques will force people other than statisticians to confront the trade-off.
If data about the public is so dangerous that we must disguise the results, then perhaps it’s data we shouldn’t be collecting in the first place.
IMHO, one big reason Data Science as a large org lost clout in tech companies was the tendency to treat data scientists as gatekeepers of data. Outsourcing the responsibility for statistical thinking gave many data scientists a weird power trip: one person gets to decide the trade-offs up front, and no one around them needs to understand those trade-offs properly.
> If data about the public is so dangerous that we must disguise the results, then perhaps it’s data we shouldn’t be collecting in the first place.
By this logic, no one should ever collect your address for any reason. How do we function as a society if we can never give out PII in any context? Anonymization/security is critical and makes a lot of essential functions possible.
How could you receive your mail in a world where we never give out/collect info that is potentially hazardous?
Nope, private data about people is regularly published unintentionally; Netflix history and medical records are some of the notable examples.
People are bad at making the trade-off because they consistently underestimate how much information is leaked. Forcing them to leak only safe amounts of information is the right approach.
Not sharing or collecting the data could in some cases be better, but there is clear value in this data, so the optimal amount to store and make public is not zero.
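For anyone who wants to see what the trade-off being argued about looks like in numbers, here is a minimal sketch, assuming the textbook Laplace mechanism applied to a single counting query (sensitivity 1). Nothing here comes from the article or from any particular agency's implementation; the function names, the true count of 1000, and the epsilon values are all hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    # Assumption: a counting query, where adding or removing one person
    # changes the answer by at most 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=10000, seed=0):
    # Average distance between the released count and the true count.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(private_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Hypothetical epsilon values: smaller epsilon means stronger privacy
    # and a noisier published number.
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(1000, eps):.2f}")
```

The average error comes out at roughly 1/epsilon, so the whole trade-off sits in one parameter. Whether that makes it "explicit" or just moves the decision to whoever picks epsilon is pretty much the disagreement in this thread.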