My brain is a bit slow today:
> You're the only one out of 100 that visits HN
So the HN operator sees someone using this browser, with this timezone. Then I go to some other site. Let's pretend that site's operator and HN's are the same. How will they know that I'm the same guy who went to HN? How do they know there aren't two people who use the browser in the same timezone (and the other one doesn't go to HN)?
I think the point is that it takes very few data points to effectively deanonymize someone. And the less common a data point is, the greater the information gain. "User is male" eliminates ~half of users. "User actively reads Hacker News" eliminates >99%. "User uses this niche browser that only 1000 people have ever been seen using" eliminates 99.999% or more, depending on how large the population is.
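To put rough numbers on "information gain": an attribute shared by a fraction p of users reveals -log2(p) bits about who you are. Below is a minimal sketch of that arithmetic. The fractions are just the ballpark figures from this comment, the 8 billion population is my own round number, and adding the bits together assumes the attributes are independent, which real ones usually aren't.

```python
import math


def surprisal_bits(fraction_matching: float) -> float:
    """Bits of identifying information from an attribute that a given
    fraction of users share (rarer attribute means more bits)."""
    if not 0 < fraction_matching <= 1:
        raise ValueError("fraction_matching must be in (0, 1]")
    return -math.log2(fraction_matching)


# Ballpark fractions taken from the comment above (assumptions, not measurements).
attributes = {
    "male": 0.5,               # eliminates ~half
    "reads HN": 0.01,          # eliminates ~99%
    "niche browser": 0.00001,  # eliminates ~99.999%
}

bits = {name: surprisal_bits(p) for name, p in attributes.items()}

# Summing is only valid if the attributes are independent (assumed here).
total_bits = sum(bits.values())

# Bits needed to single out one person among ~8 billion (assumed population).
bits_needed = math.log2(8_000_000_000)

for name, b in bits.items():
    print(f"{name}: {b:.2f} bits")
print(f"total: {total_bits:.2f} of ~{bits_needed:.2f} bits needed")
```

Three attributes already get you most of the way there, and a timezone or screen resolution adds a few more bits each.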
This is how surveillance operates at scale. You don't need a stable identifier tied to a specific person's identity; you just need a few data points to narrow the field down to even a few thousand people. Then you focus on those people, gathering data points that eliminate candidates until you're left with your target. And thanks to decades of global iteration on surveillance infrastructure, and AI to glue data sets together, it's all automated.
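The narrowing-down step is just repeated multiplication. Here is a rough sketch of how the candidate pool shrinks as each data point is applied. The 100 million starting population and the function name `narrow` are made up for illustration, the fractions are the ballpark ones from this thread, and independence between attributes is again assumed.

```python
def narrow(population: int, fractions: list[float]) -> list[float]:
    """Return the expected candidate-pool size after each attribute is
    applied, starting with the full population."""
    remaining = float(population)
    sizes = [remaining]
    for fraction in fractions:
        # Each attribute keeps only the fraction of candidates that match it.
        remaining *= fraction
        sizes.append(remaining)
    return sizes


# Hypothetical starting pool; fractions are the thread's rough estimates.
sizes = narrow(100_000_000, [0.5, 0.01, 0.00001])

for step, size in enumerate(sizes):
    print(f"after {step} attributes: ~{size:,.0f} candidates")
```

Under these assumptions a handful of candidates are still left at the end, which is the "two people with the same browser and timezone" case from the question. One or two more data points is typically enough to separate them.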