Computers, Privacy & the Constitution

Resolving the Paradox: Public Collaboration, Data Sharing, and Privacy

-- By LuisVilla - 27 Feb 2008

Collaboration in the Public Eye

"[W]e all derive immense benefit from reading one another's work..." -- CompPrivConst wiki

The desire for privacy and the desire to share and collaborate have always been in tension: collaboration requires knowledge of one's collaborators and committing ideas and information to permanent media, both of which can compromise privacy. This problem is compounded in the networked age. Exposing all of one's ideas to the world makes it possible for others to discover and build on one's work, and sharing identity makes it possible to form meaningful creative communities with others. Of course, sharing so much also makes it possible for others to collect and reframe that same information. Exploring this tension may suggest useful ways to think about privacy in the networked era.

Free Software is at the edge of this phenomenon. Free Software developers casually and knowingly scatter information about themselves all over the public internet: commits, bug reports, mailing list discussions, and (more recently) blog posts are all available as side effects of the process of collaboration. The wide availability of this information often makes it easier for newcomers to get involved, since they can easily see what has been done and whom to contact about it. It also contributes to a sense of community, since contributions are made by real people rather than faceless automatons. On the flip side, this data can be tied together to form a dynamic picture of every contributor, with or without the contributor's permission. Although all of this data was already publicly available, such aggregation has been controversial.

What then to make of it? The publication of the data has undeniable benefits; it allows for accountability, communication, and community building. (It is telling that every major free software project with a written privacy policy has a section which basically says 'if you're a developer, all bets are off.') Yet there is a strong instinct that a line can be crossed, and that innocuous dilute public data can become problematic concentrated public data.

Examining what triggers this instinctive distrust should be instructive and may help us understand what to emphasize when interpreting the Fourth Amendment. Aggregation is critical: when we publish the data, we assume that it will stay roughly as fragmented and meaningless as it is when we publish it. Vast simplification of the aggregation process (as by ohloh or facebook) violates this assumption. The instinct also stems from fixation: data that intuitively seemed impermanent (or at least published in ways that are under personal control, like a blog) is suddenly both permanent and uncontrollable. And it grows from quality and context: many of the complaints about ohloh stemmed not so much from the aggregation of the data as from the insertion of the data into a 'ranked' comparison, which not only made the social suddenly competitive but also increased the salience of errors in the data.

So what interpretations does this suggest? First, it suggests that the critical act is the analysis and combination with other data sources that changes the nature of the data from innocuous to potentially problematic. Unfortunately, this focus on the transformative act does not map terribly well to the Constitution's 'search and seizure' language. One could analogize the act of data collection to seizure, in that both are prerequisites to investigation of the collected data or seized goods, but the analogy would not focus on the truly significant issue. Orin Kerr, in his 'Searches and Seizures in a Digital World', suggests that this problem can be dealt with by defining 'search' to include any human interaction with the aggregated data. This definition has the benefit of being simple, but it also allows the unfettered creation of searchable databases for use at any later time or for any later purpose. A more aggressive definition might focus on the transformative act and try to define search to include the modification or combination of any personal data, bringing search closer to notions of 'investigation.' While such a rule would be difficult to define cleanly in judicial terms, it would have the benefit of allowing targeted searches of existing databases ('show me what the suspect was using his credit card for on day X') while prohibiting wide dragnets that would require transformation and aggregation of multiple databases ('show me anyone whose data is suspicious').

Second, the analysis suggests that the context and scope in which collected data are used are critical: in the ohloh case, what would often be allowed for academic researchers became problematic when used in potentially misleading public rankings. While looking at the problem from this perspective certainly fits with the EU's rules on data collection, it isn't clear that the search and seizure metaphor can be recast to prevent the US government from putting collected data to new uses in this way. One possible tack would be to focus on the change in control over the data and the resulting possibility of chilling effects, rather than on the traditional view of seizure as deprivation of property. On this tack, the focus on context would be critical: since there is no obvious damage done by mere copying of data, there must be some other damage in order to show courts that a 'seizure' has occurred. The change in context could provide that hook.

Finally, this example reminds us that publishing data without fear of coercive surveillance is an important part of collaboration and creation. This suggests that there is a case to be made that the ability to publish freely is a critical part of personhood and hence can be used to extend the notion of 'person' in the Fourth Amendment. The non-commercial context of data sharing with friends and the broader public may differentiate this from the commercial and private settings where the courts have been reluctant to extend the reach of the Fourth Amendment.



r7 - 23 Jan 2009 - 15:29:48 - IanSullivan
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.