Our favorite search engine, Google, does a lot of analysis of what people search for. They represent such a huge share of the search market that they can actually deduce real-world facts from virtual-world searches. This is pretty impressive. For comparison, if 20 people come to plip.com and read about how to secure their passwords, I can deduce next to nothing about the US population as a whole. Google, on the other hand, can deduce flu trends around the country based on searches on their site. In a more real-time scenario (and a bit more PR’ey), they had their own “Google search Olympics” based on searches during the real Olympics. Average Joe users are OK with this level of privacy about their searches because the data is massively aggregated, so there is no chance of picking Joe from Jane in the results. Further, Google’s intent is to be helpful with their statistical deductions. Finally, you have the expectation that your search metadata might be used in some way, even if that is not explicitly stated.
This whole setup got me thinking: what about other huge companies that could expose similar data that maybe you would not be so OK with? My ISP is Comcast. They’ve been involved in some filtering/privacy issues in the past, but have otherwise kept their hands pretty clean. What if Comcast released their own data explorer (think Google’s, of course) based on net usage across the country, broken up by zip code? Would the posh upstate New York neighborhood be OK if they were rated the #1 porn consumer per capita on weeknights between 11pm and midnight? Yes, of course, this is a fictitious stat, but the point is that likely no one would be OK with this level of privacy, because they don’t want anyone to know.
There is a fine line between a company’s intent and what your expectation of privacy is. Somehow we’re OK with what Google does (because they can do no evil, right?), but we might not be OK with Facebook auto-sharing our data with another company (real!) or with RIMM, which has over 40% of the smartphone market, deciding to guesstimate what percent of its users were having extramarital affairs based on emails going through their servers (fake).
Aggregated data is out there and you should be aware that every move you make across every platform you use, physical or virtual, is being tracked.
Follow-up reading and things that I couldn’t cram in above:
- T.A.C.O. – Targeted Advertising Cookie Opt-Out
- Slight Paranoia – A great blog about online privacy
- E.F.F. – Electronic Frontier Foundation
- AOL Data “anonymized” – old but good