Privacy Rulesets: A User-Empowering Approach to Privacy on the Web

W3C Privacy Workshop
July 13-14, 2010

Alissa Cooper, John Morris, and Erica Newland
Center for Democracy & Technology

1 Introduction

In the conventional sense, the term “privacy policy” usually describes a company’s statement about what it will or will not do with user or customer data that it collects. Privacy policies have become ubiquitous on the web; most web sites and web applications that are run by organizations have them posted on their sites.

But companies are not the only ones that have policies about privacy – most individuals have these policies too, although they have probably never written them down. Web users have their own individual conceptions of whether and how they would like data about them to be collected, used, stored, and shared. These preferences apply to the same kinds of issues that corporate privacy policies address – whether data collected to administer a service can also be used for marketing, how long data is retained after the service is administered, and so forth. But in some cases users’ preferences will differ from those of the companies collecting the information: users may prefer not to have their data shared for marketing or retained beyond their use of the service, whereas companies may prefer the opposite.

Historically, companies that collect user data have ultimately been in control of what happens to that data, which is one of the reasons why most discussions about privacy policies have centered around the documented policies of companies. But CDT believes that users’ notions of their own privacy preferences have an important role to play in shaping and shifting the web privacy paradigm, which in recent years has done little to serve the privacy interests of Internet users. Our motivations for pursuing this approach are discussed in a separate position paper, "Binding Privacy Rules to Data: Empowering Users on the Web."

This position paper proposes a way for users to express their own privacy policies, known hereafter as “privacy rulesets.” It is based in large part on a proposal we made to the Device APIs (DAP) working group on this topic [PRIV-RULESETS], with additional explanations to help those who have not been following DAP to understand the proposal. To help illustrate some of the contexts in which privacy rulesets could be used, the examples in Section 4 mention relevant DAP APIs.

2 Privacy Rulesets

The goal of this proposal is to define (1) a simple set of privacy attributes (or "rules") and (2) the combinations of those rules (or "rulesets") that will be of most use to web users. The rules describe the most common aspects of privacy about which users might want to express their preferences. The rulesets group together multiple rules.

The model of Creative Commons licenses [CC-ABOUT], while not aimed at privacy, serves as an inspiration for the privacy rulesets approach. Creative Commons offers four simple license conditions (Attribution, Share Alike, Non-Commercial, and No Derivative Works) that users can combine to form licenses for creative works. The experience of Creative Commons demonstrates that three constraints are key for any user-driven policy expression scheme: the set of policies should be small, simple, and comprehensible to both users and developers. The experience of the many previous efforts at encapsulating the privacy policies of companies -- not users -- in reduced form also seems to support the notion that a minimal set of simple, comprehensible policies has the greatest chance of success [P3P11][GEOPRIV-ARCH][ID-MGM][LIC-PRIV][FIN-PRIV-NOTICE][MOZ-ICONS][PRIV-ICONS][PRIV-ICONSET][PRIV-LABEL].

The scheme proposed below for defining privacy rulesets covers three elements of privacy that seem to be of high concern to users, can be encapsulated in brief form, and have been addressed by similar previous efforts: sharing, secondary use, and retention of user data. Each element has three or four possible attributes (sharing has four; secondary use and retention have three each). When one or more of these attributes are combined, they produce a privacy ruleset.

The scheme assumes that a ruleset would be conveyed together with one or more specific user data items, such as the user's email address or current location, that get shared with a company or other organization (known as the "data collector") via a web-enabled interaction (referred to as the "current interaction" below). The ruleset would be meant to convey to the data collector what the user's preferences are about the data being conveyed. A given ruleset is meant to govern only the data that gets conveyed with it.
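One way to picture this binding is as a small record that travels with the data items it governs. The sketch below is illustrative only: the class and field names are assumptions rather than part of the proposal, the attribute values are those defined in Section 4, and it assumes that a ruleset states a preference for all three elements, as the examples in Section 5 do.

```python
from dataclasses import dataclass
from enum import Enum


class Sharing(Enum):
    INTERNAL = "internal"
    AFFILIATES = "affiliates"
    UNRELATED_COMPANIES = "unrelated-companies"
    PUBLIC = "public"


class SecondaryUse(Enum):
    CONTEXTUAL = "contextual"
    CUSTOMIZATION = "customization"
    MARKETING_OR_PROFILING = "marketing-or-profiling"


class Retention(Enum):
    NO = "no"
    SHORT = "short"
    LONG = "long"


@dataclass(frozen=True)
class Ruleset:
    """A user's preferences for the data items conveyed alongside it."""
    sharing: frozenset        # one or more Sharing values
    secondary_use: frozenset  # one or more SecondaryUse values
    retention: Retention      # exactly one value


@dataclass(frozen=True)
class ConveyedData:
    """Hypothetical envelope: data items plus the ruleset that governs them."""
    items: dict
    ruleset: Ruleset


# Example: an email address sent with the least permissive preferences.
submission = ConveyedData(
    items={"email": "user@example.com"},
    ruleset=Ruleset(
        sharing=frozenset({Sharing.INTERNAL}),
        secondary_use=frozenset({SecondaryUse.CONTEXTUAL}),
        retention=Retention.NO,
    ),
)
```

Keeping the ruleset inside the same envelope as the data reflects the point that a ruleset governs only the data conveyed with it; a second submission would carry its own ruleset.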

This proposal focuses on the semantics of the rules and rulesets, leaving implementation details for further exploration at the workshop or elsewhere. There are many different potential contexts in which privacy rulesets could be used and conveyed: they could be sent together with information that users submit into web forms or bundled together with data that is automatically sent to web sites and apps on behalf of the user. There are also many potential mechanisms for conveying privacy rulesets, including application header fields, URI parameters, or parameters in web API functions (including those being standardized by the DAP and Geolocation WGs). The rulesets could be expressed as combinations of two-letter codes (as in Creative Commons), in a markup language, or in any of a variety of other ways; they could be managed and transmitted explicitly by user agents or they could reside remotely and get conveyed via URIs. They could have standard UI elements or images associated with them (again as in Creative Commons), or they could be strictly text-based. All of these considerations provide fodder for further discussion.
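To make one of these options concrete, the following sketch encodes a ruleset as URI query parameters and decodes it again. It is only one possible encoding: the parameter names (`sharing`, `secondary-use`, `retention`) and both function names are hypothetical and are not defined by this proposal or by any working group.

```python
from urllib.parse import parse_qs, urlencode


def ruleset_to_query(sharing, secondary_use, retention):
    """Encode a ruleset as a query string (parameter names are hypothetical)."""
    params = [("sharing", value) for value in sorted(sharing)]
    params += [("secondary-use", value) for value in sorted(secondary_use)]
    params.append(("retention", retention))
    return urlencode(params)


def query_to_ruleset(query):
    """Decode a query string produced by ruleset_to_query."""
    parsed = parse_qs(query)
    return {
        "sharing": set(parsed.get("sharing", [])),
        "secondary-use": set(parsed.get("secondary-use", [])),
        "retention": parsed.get("retention", [None])[0],
    }


query = ruleset_to_query({"internal", "affiliates"}, {"contextual"}, "short")
print(query)
# sharing=affiliates&sharing=internal&secondary-use=contextual&retention=short
```

Repeating the `sharing` parameter, rather than packing the values into one string, keeps each attribute independently readable; a header field or API parameter could carry the same three elements in a different syntax.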

3 Scope

For simplicity, the rulesets only apply to identified data -- information that can reasonably be tied to an individual. What data collectors do with other kinds of data that is not linkable to an individual or is held in the aggregate is out of scope.

The ruleset proposal presupposes that users are aware that their data is being conveyed to a web site or application and that they have provided the appropriate permission for this to occur (known as "notice and consent" in privacy terminology). In other words, the rulesets work in tandem with -- but do not replace -- the requirement that sites and applications afford users the opportunity to make informed decisions prior to revealing information. The policies expressed by the rulesets govern what happens to data after users have made such decisions and allowed their data to be conveyed to sites and applications.

4 Privacy Elements

The elements and their attributes are defined below.

4.1 Sharing

The sharing element addresses whether user data will be transmitted outside of the organization that is the data collector. The sharing attributes are as follows:

internal: The data can be shared internally within the data collector's organization and with other organizations that help the data collector provide the service requested in the current interaction.

Example: A user uses a voice search service on her mobile device (perhaps using the Media Capture API). The voice capture gets shared only with the organization that provides the app and its partner company that provides the search results, but not with any other company.

affiliates: The data can be shared with other organizations that the data collector controls or is controlled by.

Example: A user provides a photo (perhaps captured with the Media Capture API) to Flickr and that photo gets shared with Yahoo (Yahoo owns Flickr).

unrelated-companies: The data can be shared outside of the data collector's organization with other organizations that it does not control and is not controlled by.

Example: A user provides her contact details (perhaps obtained through the Contacts API) to an application provider and that application provider shares them with other unaffiliated companies, like direct marketers or credit reporting agencies.

public: The data can be made public.

Example: A user uses a calendar application (perhaps employing the Calendar API) to post an event on a public web site.

It is important to note that none of the sharing attributes are mutually exclusive -- any of them may be combined to form more permissive grants of sharing abilities than any single one of them on its own.
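Because the sharing attributes combine, a data collector could treat the user's grant as a set and test each proposed disclosure against it. The sketch below assumes that each attribute is an independent grant, so that, for example, `affiliates` on its own does not imply `internal`; the text does not say whether any attribute implies another. The function name `may_share` is hypothetical.

```python
SHARING_ATTRIBUTES = {"internal", "affiliates", "unrelated-companies", "public"}


def may_share(granted, recipient_kind):
    """Return True if the granted sharing attributes cover this kind of recipient.

    Assumption: attributes are independent grants; none implies another.
    """
    unknown = set(granted) - SHARING_ATTRIBUTES
    if unknown:
        raise ValueError(f"unknown sharing attributes: {sorted(unknown)}")
    if recipient_kind not in SHARING_ATTRIBUTES:
        raise ValueError(f"unknown recipient kind: {recipient_kind}")
    return recipient_kind in granted


# A user who allows internal and affiliate sharing, but nothing broader.
granted = {"internal", "affiliates"}
print(may_share(granted, "affiliates"))           # True
print(may_share(granted, "unrelated-companies"))  # False
```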

4.2 Secondary Use

In privacy discussions, a distinction is usually made between the "primary" uses of data (uses directly necessary for completing the user's current interaction) and "secondary" uses of data (all other uses). Users may be interested in limiting secondary uses while facilitating primary uses.

It can sometimes be difficult to distinguish between primary uses and secondary uses. What users believe to be primary uses and what application providers believe to be primary uses are not always the same, because not all of the functionality that contributes to providing a particular application or service is evident to users. The attributes below are crafted with the user's conception of secondary use in mind, and therefore attempt to cover all uses of user data that users might want to express a preference about (without making the attributes overly granular).

The secondary use attributes are as follows:

contextual: The data may only be used for the purpose of completing the current interaction. Contextual uses may include securing, troubleshooting or improving the service being provided or providing advertising in the context of the current interaction.

Example: A user sets reminders for upcoming events using a web-based calendar application (perhaps using the Calendar API). The application uses the events data to deliver the reminders and to serve a contextual ad when the user sets a reminder.

customization: The data may be used to customize, personalize, or otherwise tailor the current interaction for the user.

Example: A user records songs that he or she hears using a web application (perhaps employing the Media Capture API). The application identifies and uses the recorded songs to suggest new music that the user may be interested in.

marketing-or-profiling: The data may be used for marketing and/or profiling purposes. Marketing may occur over time and via any channel (web, email, telemarketing, etc.). Profiling involves the creation of a collection of information about an individual and applies to profiles created for any purpose other than customization (e.g., for research, to sell to other organizations, etc.).

Example: A user sets reminders for upcoming events using a web-based calendar application (perhaps using the Calendar API). The application uses the events data to deliver the reminders and to serve ads based on all of the user's reminders.

None of the secondary use attributes are mutually exclusive; several of them can be combined to grant multiple secondary use permissions.

4.3 Retention

Retention addresses users' preferences about how long data collectors keep the data they collect. The fact that most web servers automatically record logs of user activity -- and that many of these logs are never deleted -- can complicate the task of having applications abide by user-defined retention policies. The retention attributes defined below assume that as a general matter, all data collectors may retain user data for a baseline period of 35 days for the purposes of maintenance, security, and troubleshooting. The attributes express user preferences that apply to retention practices that go beyond this baseline period.

The retention attributes are as follows:

no: The data may only be retained for the baseline period.

Example: A user uses a webcam service (perhaps employing the Media Capture API). The video data is not retained after 35 days.

short: The data may be retained beyond the baseline period, but only for a limited time.

Example: A user uses a web application (perhaps invoking the Media Capture API) that provides a voice search service. The voice searches are retained for 90 days to optimize search results.

long: The data may be retained beyond the baseline period for an unspecified or indefinite amount of time.

Example: A user drafts SMS messages using a web application (perhaps invoking the Messaging API). Those draft SMS messages are retained indefinitely until the user deletes them.

The retention attributes are mutually exclusive.
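The retention semantics can be sketched as a single check. The 35-day baseline comes from the text above. The limit for `short` is not defined by this proposal, so the sketch takes it as a parameter chosen by the data collector and assumes it is measured from the time of collection; the function name `may_retain` is hypothetical.

```python
BASELINE_DAYS = 35  # baseline period all data collectors may use


def may_retain(retention, age_days, short_limit_days):
    """Return True if data of the given age may still be kept.

    retention: exactly one of "no", "short", "long".
    short_limit_days: the collector's own limit for "short" (an assumption;
    the proposal only says "a limited time").
    """
    if retention not in ("no", "short", "long"):
        raise ValueError(f"unknown retention attribute: {retention}")
    if age_days <= BASELINE_DAYS:
        return True  # within the baseline, retention is always allowed
    if retention == "no":
        return False
    if retention == "short":
        return age_days <= short_limit_days
    return True  # "long": unspecified or indefinite


print(may_retain("no", 36, short_limit_days=90))     # False
print(may_retain("short", 90, short_limit_days=90))  # True
print(may_retain("short", 91, short_limit_days=90))  # False
```

Taking a single string rather than a set mirrors the fact that, unlike sharing and secondary use, the retention attributes are mutually exclusive.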

5 Privacy Rulesets

The attributes listed above could be combined in many different ways. Not all of the combinations are possible or sensible (e.g., allowing marketing-or-profiling but not retention), and as with Creative Commons licenses, there are likely only a handful that users would want to employ regularly. A list of these potentially common rulesets is proposed below.

Least permissive: sharing=internal secondary use=contextual retention=no: The least permissive ruleset says that the user wants her data shared only internally by the data collector and organizations that help the data collector deliver the service, only used for contextual purposes (which includes contextual advertising), and not retained beyond the baseline period.

Internal customization/personalization: sharing=internal secondary use=customization retention=short: Some users may want to permit their data to be used internally by the data collector to do individualized analytics or provide some personalization based on recent activity, but not for marketing purposes. This ruleset, which allows data to be retained for a limited period and used for customization but not shared, corresponds to that set of preferences.

Profile-based advertising: sharing=internal secondary use=marketing-or-profiling retention=long: If users want to allow the data collector to use their data in profiles that are later used to target ads back to them, this ruleset would allow for that, with sharing still limited to internal use but with marketing, profiling, and long-term retention allowed.

Public: sharing=public secondary use=contextual retention=long: This ruleset lets users express their permission to have their data shared publicly, but not used by the data collector for non-contextual purposes.

Most permissive: sharing=internal sharing=affiliates sharing=unrelated-companies secondary use=contextual secondary use=customization secondary use=marketing-or-profiling retention=long: The most permissive ruleset allows all three kinds of non-public sharing, all three kinds of secondary use, and indefinite retention.
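The rulesets above can be read mechanically. The sketch below parses the `element=attribute` notation used in this section and flags the one combination the text singles out as not sensible (marketing-or-profiling with no retention beyond the baseline). The short names used as dictionary keys and the function names are hypothetical, and the check is deliberately limited to what the text states.

```python
import re

# The five rulesets listed above, keyed by hypothetical short names.
COMMON_RULESETS = {
    "least-permissive": "sharing=internal secondary use=contextual retention=no",
    "internal-customization": "sharing=internal secondary use=customization retention=short",
    "profile-based-advertising": (
        "sharing=internal secondary use=marketing-or-profiling retention=long"
    ),
    "public": "sharing=public secondary use=contextual retention=long",
    "most-permissive": (
        "sharing=internal sharing=affiliates sharing=unrelated-companies "
        "secondary use=contextual secondary use=customization "
        "secondary use=marketing-or-profiling retention=long"
    ),
}

PAIR = re.compile(r"(sharing|secondary use|retention)=([a-z-]+)")


def parse_ruleset(text):
    """Collect the attributes given for each element in the notation above."""
    ruleset = {"sharing": set(), "secondary use": set(), "retention": set()}
    for element, attribute in PAIR.findall(text):
        ruleset[element].add(attribute)
    return ruleset


def problems(ruleset):
    """Return a list of inconsistencies; an empty list means none were found."""
    found = []
    if len(ruleset["retention"]) != 1:
        found.append("retention must have exactly one attribute")
    if ("marketing-or-profiling" in ruleset["secondary use"]
            and ruleset["retention"] == {"no"}):
        found.append("marketing-or-profiling without retention beyond the baseline")
    return found


for name, text in COMMON_RULESETS.items():
    print(name, problems(parse_ruleset(text)))
```

Each of the five listed rulesets passes both checks, while a string such as `sharing=internal secondary use=marketing-or-profiling retention=no` is reported as inconsistent.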

    
    



    
    
Glossary

affiliate: An organization that controls, is controlled by, or is under common control with another organization. This comports with the definition of the term in [AD-INDUSTRY].

data collector: The organization that owns or otherwise controls the web site or application with which the user interaction occurs.

identified data: Information that can reasonably be tied to an individual. See the definition of this term in [P3P11].

privacy ruleset: A combination of privacy rules describing the user's preferences about the sharing, secondary use, and retention attributes of his or her data.

primary use: A use of data that is directly necessary to complete the user's interaction with the web site or application.

profile: A collection of data about an individual.

secondary use: Any use of the user's data other than the primary use(s).

unrelated company: Any organization that is distinct from the data collector and is not an affiliate of the data collector.

References

[AD-INDUSTRY]
American Association of Advertising Agencies, et al. Self-Regulatory Principles for Online Behavioral Advertising. July 2009. URI: http://www.iab.net/media/file/ven-principles-07-01-09.pdf

[CC-ABOUT]
Creative Commons. About Licenses. URI: http://creativecommons.org/about/licenses/

[FIN-PRIV-NOTICE]
Kleimann Communications Group, Inc. Evolution of a Prototype Financial Privacy Notice. 28 February 2006. URI: http://www.ftc.gov/privacy/privacyinitiatives/ftcfinalreport060228.pdf

[GEOPRIV-ARCH]
Barnes, R., Lepinski, M., Cooper, A., Morris, J., Tschofenig, H., Schulzrinne, H. An Architecture for Location and Location Privacy in Internet Applications (Internet Draft). 27 May 2010. URI: http://tools.ietf.org/html/draft-ietf-geopriv-arch-02

[ID-MGM]
Rundle, M. International Data Protection and Digital Identity Management Tools. Internet Governance Forum 2006 - Athens, Privacy Workshop I. 31 October 2006. URI: http://identityproject.lse.ac.uk/mary.pdf

[LIC-PRIV]
Berjon, R. License-based Privacy: Technical Aspects. 19 April 2010. W3C Internal Document. URI: http://dev.w3.org/2009/dap/docs/privacy-license.html

[MOZ-ICONS]
Martin, J., Raskin, A., Gelman, L., Rood, D., Surman, M., Hadfield, G., Greant, Z. Privacy Icons. Mozilla Wiki. 6 March 2010. URI: https://wiki.mozilla.org/Drumbeat/Challenges/Privacy_Icons

[P3P11]
Schunter, M., Wenning, R. The Platform for Privacy Preferences 1.1 (P3P1.1) Specification. 13 November 2006. W3C Note. URI: http://www.w3.org/TR/2006/NOTE-P3P11-20061113

[PRIV-ICONS]
Raskin, A. The 7 Things that Matter Most in Privacy. 31 March 2010. URI: http://www.azarask.in/blog/post/what-should-matter-in-privacy

[PRIV-ICONSET]
Mehldau, M. Iconset for Data-Privacy Declarations v0.1. URI: http://www.netzpolitik.org/wp-upload/data-privacy-icons-v01.pdf

[PRIV-LABEL]
Kelley, P., Bresee, J., Cranor, L., Reeder, R. A 'Nutrition Label' for Privacy. Carnegie Mellon University. 10 November 2009. URI: http://cups.cs.cmu.edu/soups/2009/proceedings/a4-kelley.pdf

[PRIV-RULESETS]
Cooper, A., Morris, J., Newland, E. Privacy Rulesets. 1 June 2010. W3C Editor's Draft. URI: http://dev.w3.org/2009/dap/privacy-rulesets/