Book Worms: Privacy in Digital Libraries

-- By BrianS - 27 Feb 2010

The Spark: Google Books

In 2002, Google took its first steps towards digitizing every book on the planet by launching the Google Books project. The venture involved partnerships with major libraries through which Google scanned millions of books in the libraries' collections. Aside from making Google more money through advertising shown alongside results, the goal of the Google Books project was to create a searchable database of the full text of the world's books. Google's digital library would include books that were out of print or otherwise lost in time and, through partnerships with rightsholders, the project would also include readable versions of in-print books. Readers would also be able to buy and store books through Google, and even make notations in the digital margins. Google was building a digital Library of Alexandria.

The Smoke: Digital Libraries

The possibilities for digital libraries are incredible; full-text search and nearly limitless preservation are just the tip of the iceberg. Beneath the surface are innovations like the simplification of personal digital libraries and tools that map book contents onto real space (and vice versa). Other tools will follow. Most importantly, however, digital libraries dramatically increase access to information through their online interfaces and, in Google's case, through public terminals in libraries. And Google is not the only digital library. Project Gutenberg, the Universal Digital Library, the World Digital Library, and Europeana are just a few examples of other digital libraries under construction.

As it has elsewhere, however, technological evolution empowers social advancement while simultaneously endangering user privacy. The power to follow your friends' lives is also the power for companies to datamine your personality and preferences. Digital libraries carry the same mixed blessings. Monitoring what you are reading, what terms you search for within books, how long you spend on each page, and what you write in the margins are simple tasks for a digital library. That information might reveal your political inclinations, your sexual orientation, your medical conditions, or a variety of other sensitive facts. Is this power to read over your shoulder concerning?

The Fire: Privacy Protections and You

The First Amendment protects the right to receive information and ideas. See, e.g., Stanley v. Georgia, 394 U.S. 557, 564 (1969) (“It is now well established that the Constitution protects the right to receive information and ideas.”). The First Amendment also protects the right to remain anonymous. See, e.g., McIntyre v. Ohio Elections Commission, 514 U.S. 334, 357 (1995) (“Anonymity is a shield from the tyranny of the majority.”). And the Fourth Amendment protects individuals from unreasonable government searches and seizures.

The Supreme Court, however, has held that individuals do not have an expectation of privacy in records maintained by "third parties" such as a bank, a grocery store, a service provider, or perhaps a digital library. See Smith v. Maryland, 442 U.S. 735 (1979); United States v. Miller, 425 U.S. 435 (1976). Unless regulated by federal or state statute, information is just a subpoena away from the government's hands; the Fourth Amendment is no bar.

Some courts, however, have recognized and enforced the right to read unsurveilled under First Amendment concepts. In Tattered Cover, Inc. v. City of Thornton, 44 P.3d 1044 (Colo. 2002), the Colorado Supreme Court rejected a search warrant directed to a book store for purchase records. Similarly, in In re Grand Jury Subpoena to Kramerbooks & Afterwords Inc., 26 Med. L. Rptr. 1599 (D.D.C. 1998), a federal district court held that the First Amendment imposed limits on a government-sought subpoena to a book store. These cases demonstrate that there can be protections for reading information. However, Tattered Cover grounded its protection in a state constitution, and such protection will vary from state to state. Without government-backed protections for users of digital libraries, and given the immense information such entities can provide, the libraries present a Hobson's Choice: do we sacrifice the power to read unsurveilled in the name of expanded information access? Or is there another way?

The Ashes: Conclusion

The present constitutional structure of privacy protections for digital libraries is insufficient. Courts like the one in Kramerbooks had it right in upholding a First Amendment protection to read unsurveilled. By contrast, the third party doctrine embodied in Smith and Miller fails to recognize that "[w]e are becoming a society of records, and these records are not held by us, but by third parties." Daniel J. Solove, Digital Dossiers and the Dissipation of Fourth Amendment Privacy, 75 S. Cal. L. Rev. 1083, 1089 (2002). Until the Supreme Court adopts a First Amendment right of reader privacy or the third party doctrine of the Fourth Amendment changes, digital library patrons must look to statutory protections (of which there are few, see id. at 1138-51), the library's privacy policy, and state constitutions. A broader umbrella - such as a federal statutory regime or a revised approach to the First and Fourth Amendments - is desirable given the vast information digital libraries can collect and the multi-state scope of such libraries.

"Once the government can demand of a publisher the names of the purchasers of his publications, the free press as we know it disappears. Then the spectre of a government agent will look over the shoulder of everyone who reads.... Fear of criticism goes with every person into the bookstall. The subtle, imponderable pressures of the orthodox lay hold. Some will fear to read what is unpopular, what the powers-that-be dislike.... [F]ear will take the place of freedom in the libraries, book stores, and homes of the land. Through the harassment of hearings, investigations, reports, and subpoenas government will hold a club over speech and over the press." United States v. Rumely, 345 U.S. 41, 75-58 (1953) (Douglas, J., concurring).

The spectre is at hand. Until reader privacy is guarded by sturdier stuff, digital library users are at risk.

Now put down that book and back away slowly, citizen.

Hey Brian,

I enjoyed your article (including the titles and the punchline at the end). I understand your concerns, and I still wonder whether the road that the European Union has chosen for the past fifteen years - i.e. having a data protection regulatory framework which imposes specific obligations on all controllers of third party personal data - bears relatively better results by imposing "privacy-by-design" architectures, data-minimization and purpose limitation principles. It is indeed difficult to ensure the protection of user privacy where the very business model of digital libraries depends on the exploitation and processing of their users' personal data.

-- NikolaosVolanis - 24 Mar 2010


I think that the idea of imposing requirements on controllers (and collectors as well) of personal data has merit. I think it'd have to be done by regulations instead of legislation given the speed at which the game changes (but I read your comment to suggest such a model anyway, so we don't seem to disagree). Ultimately, though, I think I agree with the view that the best safeguard is going to be technological tools adopted by end-users. For example, I'm not sure I believe any regulation could sufficiently control Facebook; until something better/safer comes along (i.e. the wall-server) I think it will remain an unraveller of all things private.

-- BrianS - 04 May 2010


This was a fantastic article. I especially enjoyed it because it was closely related to my first article. It is incredibly interesting to see all the ways in which the "third-party" doctrine inhibits privacy in the age of computer technology. I think the "third-party" doctrine is another antiquated doctrine that needs to be abolished or narrowed in its application. The author I recommended for you, Orin Kerr, has a great examination of the "third-party" doctrine. It's called "The Case for the Third-Party Doctrine." He ultimately concludes that the "third-party" doctrine is still a good thing overall, but despite my disagreement with his conclusion, it has some great analysis of both sides of the argument.

As an aside, do you have any idea if the datamining that is involved with Google Books also occurs with Google Reader and its RSS articles?

-- EdwardBontkowski - 07 May 2010

Oh, by the way here's a direct link for the Orin Kerr article. I forgot to include it.

-- EdwardBontkowski - 07 May 2010

Great, thank you Edward. Unfortunately no, I don't know the extent to which Google collects info re: RSS articles.

-- BrianS - 10 May 2010

In the event you find yourself back this way Edward (or for any other interested folks), I ran into this article on Amazon's collection of some user input information from the Kindle, in case you didn't already see it.

-- BrianS - 12 May 2010


