=The Importance of Privacy Hooks for Advanced Web APIs=

Nick Doty and Deirdre K. Mulligan
UC Berkeley, School of Information
July 12, 2010

Privacy on the Web today depends largely on long, out-of-the-way privacy policies and the mantra of "notice and consent". Many consider the existing system unsatisfying [1], but as web sites get more automated access to more potentially sensitive personal information (locations, hard drive files, webcam pictures), the failures of ignored privacy policies become potentially extreme. Privacy cannot be protected by a technical infrastructure alone, any more than it could be legislated into existence. However, we believe that by providing technical hooks for the expression of policies or requirements, technical standards can meaningfully promote privacy protection on the Web.

Our research on the W3C Geolocation API has shown that users are rarely presented with a clear idea about how their location information will be stored, shared or used when a web site requests it from them [2]. This problem may be exacerbated on mobile devices (a prime use case for location-based services), where modal or display-blocking dialogs may prevent users from exploring the site to infer how their location will be used [3]. Many sites prompt the user for their location immediately upon loading a page; no sites that we have seen have taken advantage of the rich capabilities of HTML and JavaScript to proactively disclose their policies to their users.

Although some web sites explain their usage practices in the associated privacy policy, this can be hard to reach or read, particularly with a mobile device. Work has been done both in the past [4] and more recently [5] to simplify these long and rarely read privacy policies and make them machine-readable, with the promise of browsers that can summarize and prominently display the important points for users.
However, privacy policies are inherently complex documents that must cover various types of information (logging data, user-provided content, behavioral tracking, etc.) from a multitude of possible interactions that users might have with a web property. Why not instead make these informational prompts just-in-time (like CMU's proposed "privacy nudges" [6]) and tied to the specific piece of sensitive information? Privacy hooks in advanced Web APIs would let sites explicitly describe their policy regarding sensitive information or let users explicitly express their own privacy preferences per datum.

We recognize that neither the W3C nor the web browsers can control or enforce web sites' compliance either with the normative requirements of Web standards or with any policy promises they might make when using such a standard. As in law, where we rely heavily on regulated companies to comply, these policy statements are not self-enforcing. Nevertheless, standards bodies must recognize the interaction between technical standards and market, societal and regulatory forces and be aware of the empirical implications their decisions have on standards adoption and user privacy. Though an API privacy hook that required sites to transmit a usage notification would not be self-enforceable in the way that encryption or DRM might be, forcing web sites to make this statement of their policy would encourage competition between sites on their privacy practices and allow legal remedies against sites that mislead users.

Critics might liken privacy hooks to the satirical 'evil bit' RFC [7], but in dealing with complex issues like privacy, a policy bit is not a naive attempted cure-all, but rather a technical means of enabling a richer regulatory structure.
Though a malicious attacker has no incentive to be truthful about the evil bit, established web sites have PR, competitive and legal incentives to respect a user-specified no-retransmission flag, and a privacy hook in the API would provide a level of non-repudiation.

Given that W3C specification documents themselves may never be read by web developers (as opposed to web browser developers), adding required privacy hooks to APIs would act as a forcing function, making web developers and site owners aware of normative privacy requirements in Web standards.

Furthermore, a technical standard developed by the consensus of industry, academic, governmental and consumer advocacy groups provides a useful guidepost for regulators. Providing a robust framework for the expression of privacy policy could preempt additional regulation, which would impose variable and potentially cumbersome requirements on web site developers. In Europe, the Article 29 Working Party has already concluded that acceptance of general terms and conditions does not qualify as consent for collecting or processing location information [8]; in the US, location information from telecommunication providers has special protection in 47 USC 222, and proposed legislation would classify location information as "sensitive" and require separate "express opt-in consent" to collect, process or distribute it [9].

Empirical evidence from the Geolocation API suggests that web sites are unlikely to proactively inform users of their privacy practices, even with sensitive information. As a result, we believe that the W3C can increase user privacy around increasingly sensitive personal data by providing privacy hooks in advanced Web APIs. Though not self-enforcing, expressions of policy transmitted via an API can fulfill a valuable forcing function in making web site developers consider, express and accept statements of privacy policy.
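
To make the idea of a forcing function concrete, the following is a minimal sketch of what a privacy hook on a location request might look like. It is purely illustrative and is not part of the W3C Geolocation API or any other specification: the names `UsageDeclaration`, `UserPreferences`, `LatLng`, `HookResult`, `requestPosition` and `locate`, along with every field on them, are assumptions invented for this example, and the checks stand in for whatever a browser might actually do with a site's declaration.

```typescript
// Hypothetical privacy hook. None of these names exist in the W3C Geolocation API.

/** What the site states it will do with the location (hypothetical fields). */
interface UsageDeclaration {
  purpose: string;                 // human-readable reason that could be shown to the user
  retentionDays: number;           // how long the site says it will keep the data
  sharedWithThirdParties: boolean; // whether the site says it will retransmit the data
}

/** What the user is willing to allow (hypothetical fields). */
interface UserPreferences {
  maxRetentionDays: number;
  allowRetransmission: boolean;    // false models a user-specified no-retransmission flag
}

interface LatLng {
  latitude: number;
  longitude: number;
}

type HookResult =
  | { granted: true; position: LatLng; declaration: UsageDeclaration }
  | { granted: false; reason: string };

function requestPosition(
  declaration: UsageDeclaration,
  prefs: UserPreferences,
  locate: () => LatLng, // stand-in for the device's location source
): HookResult {
  // The forcing function: without a stated purpose, no data is released.
  if (declaration.purpose.trim() === "") {
    return { granted: false, reason: "missing purpose" };
  }
  if (declaration.retentionDays > prefs.maxRetentionDays) {
    return { granted: false, reason: "retention exceeds user preference" };
  }
  if (declaration.sharedWithThirdParties && !prefs.allowRetransmission) {
    return { granted: false, reason: "retransmission not allowed" };
  }
  // The declaration travels with the data, leaving a record of what the site stated.
  return { granted: true, position: locate(), declaration };
}

// Example usage with a fixed location instead of a real device.
const prefs: UserPreferences = { maxRetentionDays: 30, allowRetransmission: false };
const fixedLocation = (): LatLng => ({ latitude: 37.87, longitude: -122.26 });

const ok = requestPosition(
  { purpose: "Show nearby restaurants", retentionDays: 1, sharedWithThirdParties: false },
  prefs,
  fixedLocation,
);
const refused = requestPosition(
  { purpose: "Ad targeting", retentionDays: 365, sharedWithThirdParties: true },
  prefs,
  fixedLocation,
);
console.log(ok.granted, refused.granted);
```

Nothing in this sketch verifies that a site honors what it declares. The point is only that the site cannot obtain the datum without making a statement, and that the statement is returned together with the data so that a record of it exists.
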
While Web standards and privacy hooks cannot alone ensure user privacy on the Web, they can support privacy by enabling both legal enforcement and market competition.

[1] Steve Lohr. "Redrawing the Route to Online Privacy". February 27, 2010. http://www.nytimes.com/2010/02/28/technology/internet/28unbox.html
[2] Nick Doty, Deirdre K. Mulligan and Erik Wilde. "Privacy Issues of the W3C Geolocation API". February 2010. http://escholarship.org/uc/item/0rp834wf
    We've recently begun analyzing a much larger sample of web sites using the Geolocation API (thousands of instances pulled from a corpus of billions of websites); very preliminary investigations show a similar situation to the previous study.
[3] Marcos Cáceres. "Privacy of Geolocation Implementations". http://www.w3.org/2010/api-privacy-ws/papers/privacy-ws-21.pdf
[4] Platform for Privacy Preferences (P3P) Project. http://www.w3.org/P3P/
[5] Aza Raskin and Arun Ranganathan. "Privacy: A Pictographic Approach". http://www.w3.org/2010/api-privacy-ws/papers/privacy-ws-22.txt
[6] CyLab Usable Privacy and Security Laboratory. http://cups.cs.cmu.edu/
[7] http://en.wikipedia.org/wiki/Evil_bit
[8] "Working Party 29 Opinion on the use of location data with a view to providing value-added services". November 2005. http://ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/2005/wp115_en.pdf
[9] "To require notice to and consent of an individual prior to the collection and disclosure of certain personal information relating to that individual. STAFF DISCUSSION DRAFT". May 3, 2010. http://www.boucher.house.gov/images/stories/Privacy_Draft_5-10.pdf