Differential analysis is amazingly powerful. If you're in the US, roughly 30 bits of information is all it takes to single you out. And not all bits are equal: some come with implicit anchors, allowing you to segment and search efficiently.
If you know the state, the median number of bits needed is 23. If you know the city, around 10 bits is all it takes to identify someone as a unique individual.
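To make the arithmetic concrete, here's a minimal sketch of where numbers like these come from: the bits needed to give everyone in a group a distinct label is just the base-2 log of the group size, rounded up. The population figures are rough round numbers I'm assuming for illustration, not census data, and `bits_to_identify` is a made-up helper name.

```python
def bits_to_identify(population: int) -> int:
    """Smallest number of bits that can give each of `population` people a distinct label.

    Equivalent to ceil(log2(population)), computed with integers to avoid float rounding.
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    return (population - 1).bit_length()


# Assumed, rounded population sizes (illustrative only).
examples = {
    "whole US (~335 million)": 335_000_000,
    "mid-sized state (~4.5 million)": 4_500_000,
    "small town (~1,000)": 1_000,
}

for label, population in examples.items():
    print(f"{label}: {bits_to_identify(population)} bits")
```

With those assumed sizes it comes out to 29, 23, and 10 bits, which is the same ballpark as the figures above.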
A drunk raccoon with one eye and a missing paw can sieve out 10 bits of information about a particular person.
You can make probabilistic assumptions and segment the population by whatever fuzzy characteristics you get, like stylometry, likely native language, interests, etc. With a giant database like the ones spies and agencies have, they can do probabilistic ID with extreme accuracy based on a tiny number of leaked bits.
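A rough sketch of how fuzzy characteristics turn into bits: a trait shared by a fraction p of the population is worth -log2(p) bits, and if you assume the traits are independent (they usually aren't, so treat this as an optimistic estimate), the bits add up. The trait names and fractions below are invented for illustration.

```python
import math


def surprisal_bits(fraction: float) -> float:
    """Bits of identifying information in a trait shared by `fraction` of the population."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(fraction)


def remaining_candidates(population: int, fractions: list) -> float:
    """Expected people still matching every trait, ASSUMING the traits are independent."""
    expected = float(population)
    for fraction in fractions:
        expected *= fraction
    return expected


# Hypothetical traits with made-up frequencies.
traits = {
    "writing style cluster": 1 / 50,
    "likely native language": 1 / 20,
    "niche hobby interest": 1 / 200,
}

total_bits = sum(surprisal_bits(f) for f in traits.values())
left = remaining_candidates(4_500_000, list(traits.values()))
print(f"bits gained: {total_bits:.1f}, expected candidates left: {left:.1f}")
```

Three unremarkable-looking traits like these already cut a state-sized population down to a couple dozen candidates under the independence assumption.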
If you snag a giant pile of readily available website data and tag the person of interest based on it, then any time you process new data, you can get a probability that the new data is associated with an already known person. Set a five-nines threshold, or higher, treat the matches above it as legitimate, and you can chip away at all sorts of identity handles. From there, you can start doing contrastive searches, sieving out known quantities and improving the statistical accuracy of those fuzzy parameters.
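One way to picture the thresholding step is a naive Bayes-style update: start from a prior that a new record belongs to a known person, multiply the odds by a likelihood ratio for each matching signal, and accept only if the posterior clears five nines. This is a sketch under assumptions, not a description of what any real system does; the function names, the prior, and the likelihood ratios are all made up, and it assumes the signals are independent.

```python
from typing import Sequence

FIVE_NINES = 0.99999


def posterior_match_probability(prior: float, likelihood_ratios: Sequence[float]) -> float:
    """Posterior probability of a match after applying independent likelihood ratios.

    Each ratio is P(signal | same person) / P(signal | different person).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


def is_confident_match(prior: float, likelihood_ratios: Sequence[float],
                       threshold: float = FIVE_NINES) -> bool:
    """True when the posterior clears the acceptance threshold."""
    return posterior_match_probability(prior, likelihood_ratios) >= threshold


# Hypothetical case: one-in-a-million prior, each signal 1000x likelier for the same person.
prior = 1 / 1_000_000
print(is_confident_match(prior, [1000, 1000]))              # two signals
print(is_confident_match(prior, [1000, 1000, 1000, 1000]))  # four signals
```

In this made-up case two signals only get you to about even odds, while four push the posterior past the threshold. Every accepted match then becomes a known quantity you can sieve out of later searches.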
Deanonymization and the like is borderline trivial. Consumer compute is about 5 generations past the threshold where a global database would be considered particularly difficult or challenging.
Fingerprinting is very easy, but obfuscating a fingerprint is incredibly challenging, given all of the implicit, deliberately leaky data transactions that are imposed on us.