But companies are not the only ones that have policies about privacy – most individuals have these policies too, although they have probably never written them down. Web users have their own individual conceptions of whether and how they would like data about them to be collected, used, stored, and shared. These preferences apply to the same kinds of issues that corporate privacy policies address – whether data collected to administer a service can also be used for marketing, how long data is retained after the service is administered, and so forth. But in some cases users’ preferences will differ from those of the companies collecting the information: users may prefer not to have their data shared for marketing or retained beyond their use of the service, whereas companies may prefer the opposite.
Historically, companies that collect user data have ultimately been in control of what happens to that data, which is one of the reasons why most discussions about privacy policies have centered around the documented policies of companies. But CDT believes that users’ notions of their own privacy preferences have an important role to play in shaping and shifting the web privacy paradigm, which in recent years has done little to serve the privacy interests of Internet users. Our motivations for pursuing this approach are discussed in a separate position paper, "Binding Privacy Rules to Data: Empowering Users on the Web."
This position paper proposes a way for users to express their own privacy policies, known hereafter as “privacy rulesets.” It is based in large part on a proposal we made to the Device APIs (DAP) working group on this topic [PRIV-RULESETS], with additional explanations to help those who have not been following DAP to understand the proposal. To help illustrate some of the contexts in which privacy rulesets could be used, the examples in Section 4 make mention of relevant DAP APIs.
The goal of this proposal is to define (1) a simple set of privacy attributes (or "rules") and (2) the combinations of those rules (or "rulesets") that will be of most use to web users. The rules describe the most common aspects of privacy about which users might want to express their preferences. The rulesets group together multiple rules.
The model of Creative Commons licenses [CC-ABOUT], while not aimed at privacy, serves as an inspiration for the privacy rulesets approach. Creative Commons offers four simple license conditions (Attribution, Share Alike, Non-Commercial, and No Derivative Works) that users can combine to form licenses for creative works. The experience of Creative Commons demonstrates that three constraints are key for any user-driven policy expression scheme: the set of policies should be small, simple, and comprehensible to both users and developers. The experience of the many previous efforts at encapsulating the privacy policies of companies -- not users -- in reduced form also seems to support the notion that a minimal set of simple, comprehensible policies has the greatest chance of success [P3P11][GEOPRIV-ARCH][ID-MGM][LIC-PRIV][FIN-PRIV-NOTICE][MOZ-ICONS][PRIV-ICONS][PRIV-ICONSET][PRIV-LABEL].
The scheme proposed below for defining privacy rulesets covers three elements of privacy that seem to be of high concern to users, can be encapsulated in brief form, and have been addressed by similar previous efforts: sharing, secondary use, and retention of user data. Each element has three or four possible attributes. When one or more of these attributes are combined, they produce a privacy ruleset.
The scheme assumes that a ruleset would be conveyed together with one or more specific user data items, such as the user's email address or current location, that get shared with a company or other organization (known as the "data collector") via a web-enabled interaction (referred to as the "current interaction" below). The ruleset would be meant to convey to the data collector what the user's preferences are about the data being conveyed. A given ruleset is meant to govern only the data that gets conveyed with it.
This proposal focuses on the semantics of the rules and rulesets, leaving implementation details for further exploration at the workshop or elsewhere. There are many different potential contexts in which privacy rulesets could be used and conveyed: they could be sent together with information that users submit into web forms or bundled together with data that is automatically sent to web sites and apps on behalf of the user. There are also many potential mechanisms for conveying privacy rulesets, including application header fields, URI parameters, or parameters in web API functions (including those being standardized by the DAP and Geolocation WGs). The rulesets could be expressed as combinations of two-letter codes (as in Creative Commons), in a markup language, or in a variety of other ways; they could be managed and transmitted explicitly by user agents or they could reside remotely and get conveyed via URIs. They could have standard UI elements or images associated with them (again as in Creative Commons), or they could be strictly text-based. All of these considerations provide fodder for further discussion.
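One of the conveyance options mentioned above, URI parameters, could look roughly like the following sketch. The attribute names are those defined later in this paper; the parameter names (`sharing`, `secondary-use`, `retention`), the comma-separated encoding, and both function names are assumptions made purely for illustration and are not part of the proposal.

```python
from urllib.parse import urlencode, parse_qs

# Attribute names come from the proposal. The parameter names and the
# comma-separated encoding are assumptions made for this sketch.
SHARING = {"internal", "affiliates", "unrelated-companies", "public"}
SECONDARY_USE = {"contextual", "customization", "marketing-or-profiling"}
RETENTION = {"no", "short", "long"}


def encode_ruleset(sharing, secondary_use, retention):
    """Serialize a ruleset as a URI query string (hypothetical format)."""
    if not set(sharing) <= SHARING:
        raise ValueError("unknown sharing attribute")
    if not set(secondary_use) <= SECONDARY_USE:
        raise ValueError("unknown secondary use attribute")
    if retention not in RETENTION:
        raise ValueError("unknown retention attribute")
    return urlencode({
        "sharing": ",".join(sorted(sharing)),
        "secondary-use": ",".join(sorted(secondary_use)),
        "retention": retention,
    })


def decode_ruleset(query):
    """Parse a query string produced by encode_ruleset."""
    fields = parse_qs(query, keep_blank_values=True)
    return {
        # filter(None, ...) drops the empty string left by an empty list
        "sharing": set(filter(None, fields["sharing"][0].split(","))),
        "secondary-use": set(filter(None, fields["secondary-use"][0].split(","))),
        "retention": fields["retention"][0],
    }
```

A user agent could append the encoded string to a request that carries, say, the user's location, and the data collector could decode it on receipt. The same attribute sets could just as easily be carried in a header field or an API parameter.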
For simplicity, the rulesets only apply to identified data -- information that can reasonably be tied to an individual. What data collectors do with other kinds of data that is not linkable to an individual or is held in the aggregate is out of scope.
The ruleset proposal presupposes that users are aware that their data is being conveyed to a web site or application and that they have provided the appropriate permission for this to occur (known as "notice and consent" in privacy terminology). In other words, the rulesets work in tandem with -- but do not replace -- the requirement that sites and applications afford users the opportunity to make informed decisions prior to revealing information. The policies expressed by the rulesets govern what happens to data after users have made such decisions and allowed their data to be conveyed to sites and applications.
The elements and their attributes are defined below.
The sharing element addresses whether user data will be transmitted outside of the organization that is the data collector. The sharing attributes are as follows:
internal: The data can be shared internally within the data collector's organization and with other organizations that help the data collector provide the service requested in the current interaction.
affiliates: The data can be shared with other organizations that the data collector controls or is controlled by.
unrelated-companies: The data can be shared outside of the data collector's organization with other organizations that it does not control and is not controlled by.
public: The data can be made public.
It is important to note that none of the sharing attributes are mutually exclusive -- any of them may be combined to form more permissive grants of sharing abilities than any single one of them on its own.
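Because the sharing attributes can be freely combined, one natural way to model them is as a set of flags. The sketch below is only illustrative: the `Sharing` class and the `may_share_with` helper are hypothetical names, and it assumes each attribute is granted independently (for example, granting `public` does not implicitly grant `affiliates`), a point on which the proposal is silent.

```python
from enum import Flag, auto


class Sharing(Flag):
    """Sharing attributes from the proposal; any of them may be combined."""
    INTERNAL = auto()
    AFFILIATES = auto()
    UNRELATED_COMPANIES = auto()
    PUBLIC = auto()


def may_share_with(granted: Sharing, recipient: Sharing) -> bool:
    """Return True if the user's grant covers the given recipient category.

    Assumption: attributes are independent, so only an explicitly granted
    category is covered.
    """
    return bool(granted & recipient)


# A grant that is more permissive than either attribute on its own.
grant = Sharing.INTERNAL | Sharing.AFFILIATES
```

With this grant, `may_share_with(grant, Sharing.AFFILIATES)` is true, while `may_share_with(grant, Sharing.UNRELATED_COMPANIES)` is false.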
In privacy discussions, a distinction is usually made between the "primary" uses of data (uses directly necessary for completing the user's current interaction) and "secondary" uses of data (all other uses). Users may be interested in limiting secondary uses while facilitating primary uses.
It can sometimes be difficult to distinguish between primary uses and secondary uses. What users believe to be primary uses and what application providers believe to be primary uses are not always the same, because not all of the functionality that contributes to providing a particular application or service is evident to users. The attributes below are crafted with the user's conception of secondary use in mind, and therefore attempt to cover all uses of user data that users might want to express a preference about (without making the attributes overly granular).
The secondary use attributes are as follows:
contextual: The data may only be used for the purpose of completing the current interaction. Contextual uses may include securing, troubleshooting or improving the service being provided or providing advertising in the context of the current interaction.
customization: The data may be used to customize, personalize, or otherwise tailor the current interaction for the user.
marketing-or-profiling: The data may be used for marketing and/or profiling purposes. Marketing may occur over time and via any channel (web, email, telemarketing, etc.). Profiling involves the creation of a collection of information about an individual and applies to profiles created for any purpose other than customization (e.g., for research, to sell to other organizations, etc.).
None of the secondary use attributes are mutually exclusive; several of them can be combined to grant multiple secondary use permissions.
Retention addresses users' preferences about how long data collectors keep the data they collect. The fact that most web servers automatically record logs of user activity -- and that many of these logs are never deleted -- can complicate the task of having applications abide by user-defined retention policies. The retention attributes defined below assume that as a general matter, all data collectors may retain user data for a baseline period of 35 days for the purposes of maintenance, security, and troubleshooting. The attributes express user preferences that apply to retention practices that go beyond this baseline period.
The retention attributes are as follows:
no: The data may only be retained for the baseline period.
short: The data may be retained beyond the baseline period, but only for a limited time.
long: The data may be retained beyond the baseline period for an unspecified or indefinite amount of time.
The retention attributes are mutually exclusive.
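Since exactly one retention attribute applies at a time, retention can be modeled as a single enumerated value rather than a set. The following sketch shows how a data collector might turn that value into a deletion deadline. The 35-day baseline comes from the text above; the `short_limit` parameter and the `must_delete_by` function are hypothetical, since the proposal says only that `short` means "a limited time".

```python
from datetime import date, timedelta
from enum import Enum

BASELINE = timedelta(days=35)  # baseline period stated in the proposal


class Retention(Enum):
    """Mutually exclusive: a variable holds exactly one of these."""
    NO = "no"
    SHORT = "short"
    LONG = "long"


def must_delete_by(collected, retention, short_limit):
    """Return the last date the data may be kept, or None for no deadline.

    short_limit is an assumption of this sketch; the proposal does not
    define how long "a limited time" is.
    """
    if retention is Retention.NO:
        return collected + BASELINE
    if retention is Retention.SHORT:
        return collected + BASELINE + short_limit
    return None  # Retention.LONG: unspecified or indefinite
```

For data collected on 1 January 2010 under `Retention.NO`, the function returns 5 February 2010, the end of the baseline period.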
The attributes listed above could be combined in many different ways. Not all combinations are possible or sensible (e.g., allowing marketing-or-profiling but not retention), and, as with Creative Commons licenses, there are likely only a handful that users would want to employ regularly. A list of these potentially common rulesets is proposed below.
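The consistency concern raised above could be checked mechanically before a ruleset is offered to users. The sketch below encodes only the one example given in the text (marketing-or-profiling combined with no retention); the `inconsistencies` function name and the message wording are assumptions, and a real scheme would presumably define further checks.

```python
def inconsistencies(secondary_use, retention):
    """Return a list of problems found in a candidate ruleset.

    secondary_use: set of secondary use attribute names
    retention: one of "no", "short", "long"
    Only the example from the text is checked; other rules are not
    specified by the proposal.
    """
    problems = []
    if "marketing-or-profiling" in secondary_use and retention == "no":
        problems.append(
            "marketing-or-profiling is allowed but retention is 'no'"
        )
    return problems
```

A ruleset of `{"contextual"}` with retention `"no"` yields an empty list, whereas `{"marketing-or-profiling"}` with retention `"no"` is flagged.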