Law in the Internet Society

Reclaiming privacy from search engines.

-- By JacobusVanEssen - 01 Dec 2009

Search engines sell your privacy

Search engines play a big part in the way people use the internet. More often than not a search engine is the intermediary for accessing information on the web. The effectiveness of search engines makes them a very useful utility for most internet users. However, the way that most search engines operate poses a serious threat to the privacy of the person using them. The underlying problem is that the business model of a search engine like Google is in direct conflict with the privacy of its users. Google's ability to facilitate targeted advertising (thereby increasing its revenue) competes directly with the methods by which a user can achieve anonymity and preserve whatever little is left of his online privacy.

I've corrected linguistic imperfections in the paragraph above. You should go through the remainder of the essay with an editor of your own choosing to get the benefit of corrections for the rest as well.

On the substance, you have to work a little harder to prove the point you claim. Google monetizes search by massively improving the efficiency of advertising, but perhaps the revenue produced by doing no more than showing ads based on the search query of the moment, or the search query disambiguated by past queries, is enough to support the infrastructure of search, without getting users' clickstream data, reading their mail, etc. It appears to me, as one reasonably well-informed outside observer, that the effectiveness of search-integrated advertising would allow more than enough monetization to enable search. The rub is that once the scale of operation produced by the business has been achieved, offering to host all the user data in the world in return for the right to infer from it is an even more attractive proposition. That's not quite the same as your statement, and the differences are very significant. So some careful analytical weeding out has to occur, to differentiate between what you think is happening, and what--for example--I think is happening.

Nothing to hide?

The strange thing is that most people have a rather indifferent attitude towards this phenomenon. What could be so bad about someone knowing what you search for on the internet? You've got nothing to hide, right? The weakness in this argument is well illustrated in the essay "I've got nothing to hide" by D. J. Solove, associate professor at George Washington University Law School. One of the problems with this argument is its underlying presumption that privacy is about hiding things. This is a far too narrow view, which does not take into account that privacy is more of a personal and social value, and that the right to privacy recognizes the sovereignty of the individual. Another issue that is overlooked is that the privacy you give up online covers not just the information you disclose, but also the information you deliberately do not disclose. Because of the scale at which data about individuals is collected, the data that you do not disclose can be inferred simply by putting your data in a pool with that of millions of other subjects.

Why is this problematic?

Search engines build detailed profiles of their users, which contain anything from their tastes in consumer goods to their sexual orientation and medical issues. The main purpose of this data collection is to sell it for commercial purposes to parties who conduct targeted advertising. The mere fact that your personal information is sold for commercial purposes is a bad thing. Even more worrying, though, is the thought of these detailed profiles falling into the wrong (or worse) hands. One needs only to look at history to see the devastating consequences that this could have. In my hometown of Amsterdam, Netherlands, a disproportionate number of Jews were deported to the concentration camps precisely because there was a very detailed data collection about the inhabitants of the city at that time.

An attempt at regulation by the E.U.

So admitting that we have a problem is the first step, but what needs to be done about it? In the E.U., where there is in general a deeper tradition of privacy protection than in the US, legislators seem to have recognised the issue and have adopted a directive concerning online data protection. The directive (95/46/EC) is an effort to regulate the data collection by search engines. It sets out a number of policy rules, such as that search engines must delete or irreversibly anonymise personal data once they no longer serve the specified and legitimate purpose. They must at all times justify the retention and longevity of the cookies they deploy. The consent of the user must be sought, and users must be able to access, inspect or correct their personal data. While this directive is a reasonable effort to stop the uncontrolled data-mining practices of search engines, it is inadequate. Requiring search engines to 'irreversibly' anonymise personal data after they no longer serve their 'specified and legitimate' purpose has not proven very effective.

Well, you could say it has been effective in one respect: no search engine business of any significance is domiciled within the jurisdiction of the European Commission.


Several studies have shown that de-identified information can be re-identified very accurately in almost every case. A person's search queries might contain their own name, neighbourhood, etc., making it fairly easy to link them to a specific, identifiable user. Clearly, anonymizing the IP address does not solve the problem at all. The directive also aims to give the users of search engines the right to access, inspect and correct the private data that has been collected. I agree that this type of transparency would contribute to better online privacy. If users are aware of what data is being collected, they can make more conscious decisions about the way they use search engines.

But until Europe conquers North America this isn't going to happen.

What should users do?

In the absence of adequate regulation, users of search engines should look for alternatives to the search engine with the 'data collecting and selling' business model. The technology to do encrypted searches on encrypted data, such as PIR (private information retrieval), is already available. What needs to be achieved is a mainstream, potent alternative to Google and the other commercial search engines. Free (as in freedom) software might be the best way to make such an alternative.
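To make the idea of private information retrieval concrete, here is a minimal sketch of one classic textbook construction: two servers hold identical copies of a database, and the user sends each a random-looking selection of records. Neither server alone can tell which record the user wants, yet the two answers combined reveal it. This is an illustration only, under the loud assumption that the two servers do not collude; it is not how any existing search engine works, and the function names (make_queries, server_answer, retrieve) are hypothetical.

```python
import secrets


def make_queries(n, index):
    """Build two selection vectors that differ only at the wanted index.

    Each vector on its own is uniformly random, so a single server
    learns nothing about which record is being requested.
    """
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1  # flip the bit for the record we actually want
    return q1, q2


def server_answer(database, query):
    """What each server computes: the XOR of all selected records."""
    acc = 0
    for record, selected in zip(database, query):
        if selected:
            acc ^= record
    return acc


def retrieve(database, index):
    """Client side: query both (assumed non-colluding) servers and combine."""
    q1, q2 = make_queries(len(database), index)
    a1 = server_answer(database, q1)  # answer from server 1
    a2 = server_answer(database, q2)  # answer from server 2
    # Every record selected in both queries cancels out under XOR,
    # leaving only the record at the wanted index.
    return a1 ^ a2


if __name__ == "__main__":
    records = [5, 17, 42, 8]  # toy database of small integers
    print(retrieve(records, 2))
```

Note the cost this sketch makes visible: each server has to touch the whole database for every query, which is one reason such schemes have been hard to scale to something the size of the web.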

It's not specifically a software problem. It's a computer science problem: no one knows how to do web search in a decentralized manner. So long as all evident ways of doing search show positive returns to scale, and require centralization, incumbency has enormous advantages.

We need search engines that do not depend on advertising for their revenue.

This is backward. Search is funded by advertising because traditional advertising is so much more inefficient than search-integrated advertising that advertising wants to integrate with search. If you are doing the world's searching, you will get the world's advertising revenue. That will remain true until we find a way to search a data structure like the web in a decentralized fashion, or replace the web with a different data architecture.

What needs to happen, then, is for the public to become more aware of the dreadful situation they are in and to turn away from the products of Google and other companies that pose a threat to privacy. Unfortunately, the data that has already been collected about us is out there, and the use of it will depend on commercial companies like Google, whose very existence depends on misusing it.

If you had demonstrated that last point it would make a strong closer. But as it doesn't have any proof behind it you get little service from it.


