Law in the Internet Society

The Dangers of Data Mining

-- By BradEhrlichman - 17 Nov 2009


Data mining empowers corporations to profile consumers cheaply and effectively. This profiling, in turn, leads to the exploitation and commoditization of customers under the guise of convenience and responsiveness. Given the twin dangers engendered by data mining – exploitation and commoditization – the need for consumer protection is clear. The yawning asymmetry of information between corporations profiting from data mining and consumers unaware of its practice or perniciousness places the burden of a solution on the former. However, it is naive and impractical to expect corporations to abandon the practice out of charity or right-headedness. Further, it would be altogether draconian to expect consumers on their own either to educate themselves on the dangers of data mining and the methods of obstructing it, or to face its unwanted consequences. Thus, the government, in its most basic, Hobbesian role, must both restrain the use of data mining and educate citizens about its practice, while providing them with better mechanisms for easily and unequivocally opting out of the regime.

A Siren's Song

The primary menace of data mining lies in its ability to allow corporations to hit all the right notes in appealing to unwary consumers, thereby making every advertisement a siren’s song. Vague titillation has been replaced by targeted temptation, as advertisement – perhaps solicitation is the more apt term – is now directed at an audience of one. Anniversary fast approaching? Been looking into that dream Caribbean vacation? Well, check out these low, low rates that even you can afford. And, if you need a little money to bridge the gap, well, we can help you with that too.

Of course, the uninformed consumer regards these offers not as dangerous, but as almost providential. He is being offered exactly what he’s looking for, shown just what he needed without ever really knowing he needed it. Like the oleander, however, the beauty is the danger. Consumers unthinkingly welcome the convenience and enticement of the perfectly tailored offers, and the corporations profit thereby. In effect, every consumer is like a wanderer lost in the desert, and data mining allows corporations to know the exact moment when he is so overcome by thirst that he will pay any price for a drink of water.

As a further insult, the corporations posit data mining not only as benign, but as benevolent. For example, Google alerts its users (read: almost everyone with an internet connection) that: “We may combine the information you submit under your account with information from other Google services or third parties in order to provide you with a better experience and to improve the quality of our services.” Graciously, “for certain services, [Google] may give [its users] the opportunity to opt out of combining such information.” Data mining is presented as a service to consumers, a magnanimously provided convenience inuring to the benefit of those who cannot see the snake for the apple.

  • I don't think you quite resolve the tension between data-mining that predicts what you want and data-mining that knows when you're desperate for a drink in the desert. The latter is a form of interaction that enables no repeat business. So are you sure there's a snake and is it about being overcharged?

  • Perhaps the snake metaphor should be traded for one involving the Trojan Horse. As you note, there is a difference between data mining that is convenient, or even informative, and data mining that is predatory and manipulative. The problem I was trying to identify is that the latter is facilitated by the presence of the former. Frankly, I like when Amazon notes the last book I bought was Ham on Rye, and then suggests I next read Hunger. There, the convenience outweighs the intrusion into my privacy and the resulting profiling. However, the danger arises where there are a thousand intrusions, allowing for a more robust and sinister profiling. Also, I don't think the danger inheres in just being overcharged; that's just a fact of capitalism. I think the true problem, the problem you identified in class when discussing data mining and the epidemic of foreclosures and bankruptcies, arises when complex goods or services - not just books - are being foisted upon people at the exact moment they are unable and unwilling to truly understand the risks of clicking yes.

Pictures of You

The second identified danger of data mining is its reduction of individuals to commodities. As Jeffrey Rosen points out in this article, data mining represents not only an extreme invasion of privacy, but also a fracturing of individuality into a random survey of tastes, habits and purchases that are used to ensnare individuals-cum-consumers into a cycle of spending. There are two resulting problems. First, the dignity of individual human beings is degraded by an alchemy converting their tastes, passions and dreams into data used to separate them from their money.

  • Perhaps the "dignity" was transcendental nonsense in the first place, as it took merely the smallest of inducements to cause people to give it up. Real dignity or integrity should have some staying power, don't you think?

  • Dignity is probably too stylized and nonsensical a term. On the other hand, human history is full of people who sold their dignity for a few pieces of silver. While the word dignity might aspire to something more transcendental, in practice it does seem to often be traded for the smallest of inducements.

Once a human being is commoditized, it is easier to sell him carcinogenic products or ply him with debt from which he will never escape.

  • Maybe this connection is clear to you, but it isn't to me, so I'd welcome a sentence that explained itself a little better.

  • Here, the point is related to Stalin's observation that one death is a tragedy, while one million is a statistic. Assuming any sort of compunction on the part of cigarette vendors or predatory creditors, the fact that they are plying their wares not on people as such - that is, on Eben Moglen or Brad Ehrlichman - but on aggregations of statistics must make it personally less horrifying for them to continue their business model.

Second, it is unlikely that any dissemination of purchases, wall posts and browsed websites can accurately and wholly limn the contours of an individual’s personality, status and situation. As this article demonstrates, data mining operates in broad strokes. Charging a bottle of water at Duane Reade may indicate a lack of available money, but it may also be the case that the purchaser finds – as I do – that paper burns a hole in his pocket faster than plastic, and thus prefers not to carry cash. However, data mining does not always provide for such subtle distinctions. Given the decisions made by accumulators and analyzers of mined data identified by the article – denying credit, increasing interest rates – the dangers of such blanket and under-informed analysis are obvious.

  • If bad analysis has a cost, such parties will naturally decide whether to incur it or to refine their information further. Why should we be concerned about where their stopping-place is?

  • I think the concern arises out of the victims of those decisions injured between now and the time when the analysis is either refined or discontinued.


The theory of social contract, roughly, posits that individuals cede freedom to a government in exchange for protection of their “life, health, liberty [and] possessions.” In the modern era, our possessions are no longer just threatened by marauders coming onto our land to steal our livestock or finery. As has been discussed, data mining may be used to deprive citizens of their possessions through an unconscionable pressure to buy, borrow or bargain. In the absence of an eleemosynary abandonment of data mining by corporations, it is the government’s place to restrict its use. Such restriction may be achieved through a federally mandated informed consent opt-in requirement coupled with an open-source statute.

  • Lots of forms of persuasion are used to separate fools from their money. False statements materially relied upon are fraud, which can surely be constitutionally prohibited. But how do you constitutionally prohibit learning facts about people legitimately, making inferences from those facts and trying to convince them of something, regardless of whether it is profitable to you?

  • I would think that, at the very least, the opt-in requirement would allow consumers to consider the price of some of the conveniences they have gained from living in the internet society. While it may be that the corporations are learning this information legitimately, the opt-in requirement would allow people to be more careful in disclosing it, as well as more able to recognize the fruits of that disclosure. I was a toy store cashier once. We had to ask everyone's zip code when they checked out. Almost every customer asked why we were gathering that information, and most didn't consent to giving it. I think that at least letting people know data is being culled will have some effect on behavior.

The informed consent opt-in requirement for data mining would protect consumers while respecting their autonomy, and would mesh with familiar contract principles. Such a requirement, embodied in uniform, concise and explanatory terms of use, would empower individuals to make knowledgeable decisions regarding dissemination of their private information while avoiding excessive nannyism. The requirement would also allow the parties to bargain more fairly. Corporations that sell information without individuals’ consent receive a windfall unknown to those individuals. An opt-in requirement would cause those corporations to ‘pay for’ that windfall. Moreover, an open-source statute mandating the publication of the internal code indicating how private data is shared after collection would further reduce the present asymmetry of information between corporations and individuals and additionally provide for informed bargaining.

  • Opt-in for what? Can you prohibit someone from thinking about available data unless someone else gives permission? What do you assume the constitutional limits on regulation here are?

  • As discussed in my reply above, I would center the regulations on how much information can be gathered without informing the consumer that it is being collected and analyzed. I agree that there is not much you can do to say don't consider this information once it's out there. Obviously, that would lead to absurd results. Would the car salesman not be able to consider the clothes, watch and shoes of two different people coming into his showroom in deciding whom to pitch first and which car to show which customer?


Data mining allows corporations to exploit and commoditize individuals, thereby threatening individuals' autonomy and possessions. Thus, the government should require heightened transparency to allow individuals to protect their identity and property.

I'm curious how you think government is going to be motivated to enact strong privacy legislation. It seems like all of the money and the power is in the hands of data miners. With such enormous commercial incentives to have loose privacy protections, wouldn't industry lobbyists be able to stop any Congressional proposals?

People just don't seem very fired up about privacy protection. Wouldn't something drastic need to happen to change that?

-- GavinSnyder - 29 Nov 2009

Thanks for the comment! I would answer your question in two ways. First, Congress seems to have already taken up the issue in the Gramm-Leach-Bliley Act, which I have yet to link to in my article. Here is a link to the press release: GLB Act. Second, I think Professor Moglen was making the point in class a few weeks ago that something 'drastic' did happen to get people fired up about privacy protection; specifically, he seemed to argue (and my article agrees) that data mining was one of a number of causes of the recent financial crisis. Still, I agree with you that the organized power ostensibly lies with those opposed to privacy protections.

-- BradEhrlichman


I hope you do not mind my commenting (and editing a link in your reply to Gavin). If you would like other comments, you might consider adding a comment box (just add the word %COMMENT% to the very bottom of the page if you do want such a box).

I am commenting because I agree with your analysis that the decline in privacy is significant, and as I have said elsewhere, I agree that opt-in is perhaps the most viable solution. But my bigger concern is, as Gavin suggested, that people just don't seem concerned about privacy loss. Have you run into much literature suggesting otherwise?

-- BrianS - 2 Dec 2009


