Requirements for a P3P Query Language

22 November 1998

Lorrie Faith Cranor

Position paper to be presented at QL'98 W3C Query Languages Workshop, Dec. 3-4, 1998. This paper liberally borrows material from the 9 October 1998 APPEL working draft as well as a recent W3C note describing P3P. It reflects the opinions of the author as well as the other members of the P3P Preferences Working Group.

The goal of the Platform for Privacy Preferences (P3P) project is to enable users to exercise preferences over web sites' privacy practices. P3P applications will allow users to be informed about web site practices, delegate decisions to their computer agent when they wish, and tailor relationships with specific sites. The P3P1.0 working draft [P3P10] specifies the protocol for exchanging P3P information between web sites and user agents, and a vocabulary for sites to use to describe their privacy practices. However, it places very few requirements on user agent implementations and makes no recommendations about languages for encoding user privacy preferences. The P3P Preferences Working Group has developed a separate working draft that specifies A P3P Preference Exchange Language (APPEL) [APPEL], which can be used to encode and exchange user privacy preferences in the form of rules. APPEL was designed as a special-purpose language for this application; however, more general XML or RDF query languages might also be suitable for this purpose.

This paper is intended to inform the larger query language community about P3P's query language requirements. It presents APPEL as one solution that satisfies P3P's needs. We hope that it will provoke discussion as to whether these needs can be satisfied by a more general query language.

P3P Basics

P3P is designed to help users reach agreements with services (web sites and applications that declare privacy practices and make data requests). As the first step towards reaching an agreement, a service sends a machine-readable proposal in which the organization responsible for the service declares its identity and privacy practices. A proposal applies to a specific realm, identified by a URI or set of URIs. The privacy proposal enumerates the data elements that the service proposes to collect and explains how each will be used, with whom data may be shared, and whether data will be used in an identifiable manner. Proposals can be parsed automatically by user agents -- such as web browsers, browser plug-ins, or proxy servers -- and compared with privacy preferences set by the user. If a proposal matches the user's preferences, the user agent may accept it automatically by returning a fingerprint of the proposal, called the propID. If the proposal and preferences are inconsistent, the agent may prompt the user, reject the proposal, send the service an alternative proposal, or ask the service to send another proposal.

A basic P3P interaction might proceed as follows:

The agent requests a web page from a service.
The service includes an HTTP header in its response that contains one or more P3P proposals or a URI from which one or more P3P proposals may be retrieved. (Alternatively a P3P proposal or a link to a P3P proposal may be embedded in the returned content rather than in an HTTP header.)
The agent evaluates the proposal according to the user's privacy preference ruleset and determines what action to take (e.g., deny, accept, prompt, or send a counter proposal).
If the proposal is consistent with the user's preferences (either because it was accepted by the ruleset or because the user was prompted and agreed to accept the proposal) an agreement is reached and the agent sends the service a propID.
The service sends the contents of the web page.
The agent displays the web page for the user.
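
The agent's side of steps 3 and 4 above can be sketched roughly as follows. This is only an illustration in Python and is not taken from the P3P specification: the names Behavior, handle_proposal, evaluate, and ask_user are hypothetical, and counter proposals are left out.

```python
from enum import Enum
from typing import Callable, Optional


class Behavior(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    PROMPT = "prompt"


def handle_proposal(
    prop_id: str,
    evaluate: Callable[[], Behavior],   # hypothetical: runs the user's ruleset
    ask_user: Callable[[], bool],       # hypothetical: shows a prompt to the user
) -> Optional[str]:
    """Return the propID to send back if an agreement is reached, else None."""
    behavior = evaluate()               # step 3: consult the ruleset
    if behavior is Behavior.ACCEPT:
        return prop_id                  # step 4: accepted by the ruleset
    if behavior is Behavior.PROMPT and ask_user():
        return prop_id                  # step 4: accepted by the user
    return None                         # no agreement is reached
```

Passing the evaluator and the prompt in as callables keeps the sketch independent of any particular rule engine or user interface.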

The proposal may include requests for specific data elements from the user. If these elements are already stored in the user's data repository, and if the request is consistent with the user's ruleset, the agent may send these elements to the service. If these elements are not in the repository, the agent may prompt the user to type in this information.

The following is an example of a P3P proposal:

P3P Proposal Example

English language version:

CoolCatalog makes the following statement for the web pages at http://www.CoolCatalog.com/catalogue/. We collect clickstream data in our HTTP logs. We also collect your first name, age, and gender to customize our catalog pages for the type of clothing you are likely to be interested in and for our own research and product development. We do not use this information in a personally-identifiable way. We do not redistribute any of this information outside of our organization. We do not provide access capabilities to information we may have from you, but we do have retention and opt-out policies that you can read about at our privacy page http://CoolCatalog.com/PrivacyPractice.html. The third party PrivacySeal.org provides assurance that we abide by this agreement.

Abridged P3P syntax version:

<PROP realm="http://CoolCatalog.com/catalogue/" entity="CoolCatalog" propID="94df1293a3e519bb"> <USES> <STATEMENT purpose="1" recipient="0" id="0"> <REF name="Web.Abstract.ClientClickStream"/> </STATEMENT></USES> <USES> <STATEMENT purpose="2,3" recipient="0" id="0" consequence="a site with clothes you'd appreciate."> <WITH><PREFIX name="User."> <REF name="Name.First"/> <REF name="Bdate.Year" OPTIONAL="1"/> <REF name="Gender"/> </PREFIX></WITH> </STATEMENT></USES> <DISCLOSURE discURI="http://CoolCatalog.com/PrivPractice.html" access="3" other="0,1"/> <ASSURANCE org="http://PrivacySeal.org" text="third party" image="http://PrivacySeal.org/Logo.gif"/> </PROP>

For the purpose of understanding this paper it is not necessary to understand the entire P3P proposal syntax. The important things to observe are:

A P3P proposal contains a series of statements, each enclosed in <USES><STATEMENT> beginning and end tags.
Each statement has several attributes and corresponding values. The values of most of these attributes are represented as integers, with meanings defined in the Harmonized Vocabulary chapter of the P3P specification.
Statements may also contain one or more data references, each contained within a <REF/> tag. A data reference refers to a specific data element, a set of data elements, or a data category to which the statement attributes apply. The Harmonized Vocabulary chapter of the P3P specification defines a set of data categories. The Base Data Set chapter of the P3P specification defines base data elements and data sets. There is also an extensibility mechanism that allows for the creation of new data elements and data sets.
Proposals also contain several other elements including a <DISCLOSURE> element and an <ASSURANCE> element. Each of these elements also has attributes.
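
To make the structure described above concrete, the following sketch walks the abridged CoolCatalog proposal with Python's standard XML parser and lists each statement's attributes and data references. The helper names data_refs and summarize are hypothetical, and the sketch assumes that a PREFIX name is simply prepended to the names of the REF elements it encloses.

```python
import xml.etree.ElementTree as ET

PROPOSAL = """
<PROP realm="http://CoolCatalog.com/catalogue/" entity="CoolCatalog"
      propID="94df1293a3e519bb">
  <USES><STATEMENT purpose="1" recipient="0" id="0">
    <REF name="Web.Abstract.ClientClickStream"/>
  </STATEMENT></USES>
  <USES><STATEMENT purpose="2,3" recipient="0" id="0"
        consequence="a site with clothes you'd appreciate.">
    <WITH><PREFIX name="User.">
      <REF name="Name.First"/>
      <REF name="Bdate.Year" OPTIONAL="1"/>
      <REF name="Gender"/>
    </PREFIX></WITH>
  </STATEMENT></USES>
  <DISCLOSURE discURI="http://CoolCatalog.com/PrivPractice.html"
              access="3" other="0,1"/>
  <ASSURANCE org="http://PrivacySeal.org" text="third party"
             image="http://PrivacySeal.org/Logo.gif"/>
</PROP>
"""


def data_refs(element, prefix=""):
    """Collect data reference names, expanding enclosing PREFIX names.

    Assumption: a PREFIX name is prepended verbatim to nested REF names.
    """
    names = []
    for child in element:
        if child.tag == "REF":
            names.append(prefix + child.get("name"))
        elif child.tag == "PREFIX":
            names.extend(data_refs(child, prefix + child.get("name")))
        else:
            names.extend(data_refs(child, prefix))
    return names


def summarize(xml_text):
    """Return one (attributes, data references) pair per STATEMENT."""
    root = ET.fromstring(xml_text)
    return [(dict(st.attrib), data_refs(st))
            for st in root.findall("./USES/STATEMENT")]


statements = summarize(PROPOSAL)
for attrs, refs in statements:
    print(attrs["purpose"], refs)
```

The integer attribute values are kept as strings here; mapping them to their Harmonized Vocabulary meanings is outside the scope of the sketch.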

For a detailed overview of P3P see [P3PNote] or refer to the latest draft of the P3P1.0 specification [P3P10].

Goals

The APPEL working draft [APPEL] specifies a language for describing collections of preferences regarding P3P proposals. Using this language, users can express their preferences in a set of rules (called a ruleset), which can then be used by user agents to make automated or semi-automated decisions regarding the exchange of data with P3P enabled web sites. Much of the underlying logic is based on PICSRules [PICSRules].

APPEL was developed with several goals in mind that are not met by the P3P1.0 specification alone:

Allow users to install rulesets. Sophisticated preferences may be difficult for end-users to specify, even through well-crafted user interfaces. An organization should be able to create a set of recommended preferences for users. Users who trust that organization should be able to install a pre-defined ruleset rather than specifying a new set from scratch. It should be easy to change the active ruleset on a single computer, or to carry a ruleset to a new computer. Once a ruleset is installed, users should be able to customize it to better meet their needs.
Support communication of rulesets to agents, search engines, proxies, or other servers. Servers of various kinds may have the ability to tailor their output to better meet users' preferences, as expressed in a ruleset. For example, a search service might return only links that match a user's ruleset, which may specify criteria based on a variety of factors including quality, privacy, age suitability, or the safety of downloadable code.
Support portability of rulesets between products. The same ruleset should work with any P3P-APPEL enabled product.

Primarily, we envision this language will be used to allow users to import preference rulesets created by other parties and to transport their own ruleset files between multiple user agents. Implementors might also choose to use this language (or some easily-derived variation) to encode user preferences for use by the rule evaluators that serve as the decision-making components of their user agents.

While we do not expect users to be exposed to the expression of the rules themselves, tools that allow them to test their rules may be useful. For instance, a user could make a query, "under what conditions do I give out identifiable information?"

Requirements

The 9 October 1998 APPEL working draft [APPEL] is based on the following requirements:

APPEL rules should allow the expression of preferences over anything that can be expressed in the P3P base schema as well as all other RDF metadata relevant to P3P decision making (e.g., is the communication channel secured). This requires the use of a wide range of integer, string, and set comparison operators, as well as wild card symbols. Note, APPEL rules need not express preferences over PICS labels unless they are translated into an appropriate RDF schema.
APPEL should address situations in which a service does not offer a P3P proposal (i.e., APPEL rules should be able to express preferences over the presence or absence of a piece of metadata). Likewise, APPEL rules should be able to express preferences over the presence or absence of a particular attribute of a P3P proposal.
APPEL rules should be able to prescribe the following set of behaviors: accept, reject, and prompt. In addition, APPEL should include an extensibility mechanism that allows additional behaviors to be specified.
APPEL encoding should be consistent with other P3P work and leverage members' existing work and code base. As much as possible, the encoding should be simple and support the efficient computation of rule matches.

In defining the scope of the APPEL language, the working group generated a large list of possible requirements. The group then narrowed the scope to eliminate those requirements that were deemed less important or easier to implement if handled elsewhere. Thus, the working group limited the scope of APPEL as follows:

APPEL rules need not allow the expression of "sophisticated" rules based on the presence of multiple data elements within a P3P proposal (for example, a rule that would allow a zipcode to be collected unless a full name is also collected).
APPEL need not be capable of expressing rules for ranking multiple proposals. Rather it should express the rules necessary for determining whether a single proposal triggers a behavior. If more than one P3P proposal is available, they should be submitted to the rule evaluator individually. It is up to the calling program to determine what to do if multiple proposals are acceptable, or if a "prompt" behavior is returned while evaluating multiple proposals.
A compact or easy-to-read representation is not essential.
APPEL need not be capable of expressing negotiation strategies.
APPEL rules need not be able to express preferences based on state information (unless such information is encoded in RDF and submitted to an APPEL engine as any other metadata would be submitted).
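
The second limitation above, that proposals are evaluated individually and the calling program decides among the results, might look like the following in a calling program. This is a sketch under stated assumptions: evaluate_each and choose are hypothetical names, proposals are represented by plain strings, and "take the first accepted proposal" is just one policy a caller could adopt.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def evaluate_each(
    proposals: Sequence[str],
    evaluate: Callable[[str], str],   # hypothetical single-proposal evaluator
) -> List[Tuple[str, str]]:
    """Submit proposals to the rule evaluator one at a time."""
    return [(proposal, evaluate(proposal)) for proposal in proposals]


def choose(results: Sequence[Tuple[str, str]]) -> Optional[str]:
    """One possible caller policy: take the first accepted proposal."""
    for proposal, behavior in results:
        if behavior == "accept":
            return proposal
    return None  # nothing was accepted; the caller must decide what to do
```

Ranking acceptable proposals, or handling a "prompt" result among several proposals, would be added to choose rather than to the rule language.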

In order to facilitate prototype implementations of APPEL the working group decided to split up the current draft into a Level 1 specification designed to express only basic privacy preferences and a more detailed Level 2 specification that implements the rest of the requirements outlined above. Specifically, APPEL Level 1 limits the requirements to:

only supporting three standard behaviors, accept, reject and prompt (i.e. no extensibility mechanism for custom behaviors).
only supporting preferences over P3P proposals (i.e. no non-P3P schemas).
only supporting restricted matching capabilities using a limited set of comparison operators and wildcards.

Note that the only known APPEL implementation to date -- IBM P3P Parser (based on a previous working draft) -- includes most features of both APPEL levels.

APPEL Overview

This section gives a general overview of the APPEL language. For a full specification, please refer to the latest APPEL working draft [APPEL].

APPEL is an ordered rules language, similar in design to PICSRules. Like PICSRules [PICSRules], APPEL rules are designed to be processed one at a time until a rule fires. Once a rule fires, no further rule processing is required. While PICSRules restrict pattern matching to the values of simple attribute-value pairs, APPEL allows pattern matching over structured data in the form of XML (or RDF) expressions. In fact the APPEL syntax looks very similar to the P3P proposal syntax. Any part of a P3P proposal that must be present for a rule to fire must appear in the body of that rule.
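
The ordered, first-match processing described above can be illustrated with a short Python sketch. The Rule class, the matches callable, and first_firing_rule are hypothetical stand-ins for illustration; real APPEL rules match XML structure rather than calling a function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Rule:
    behavior: str                     # e.g. "accept", "reject", "prompt"
    matches: Callable[[dict], bool]   # hypothetical stand-in for the rule body


def first_firing_rule(rules: Sequence[Rule], evidence: dict) -> Optional[Rule]:
    """Process rules in order and stop at the first one that fires."""
    for rule in rules:
        if rule.matches(evidence):
            return rule               # no further rules are processed
    return None                       # no rule fired; left to the caller


ruleset = [
    Rule("accept", lambda ev: ev.get("refs") == {"ID.PUID"}),
    Rule("reject", lambda ev: True),  # plays the role of a catch-all rule
]
```

Because evaluation stops at the first rule that fires, rule order is significant: swapping the two rules above would reject every proposal.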

An APPEL rule evaluator is activated by a P3P application. The activating application provides the evaluator with various pieces of "evidence" and a rule set for processing them. Evidence includes the URI of the service and a single P3P proposal from the service if present. Rules are evaluated with respect to the evidence provided. A rule evaluates to true if an expression is satisfied. Basically, a rule is satisfied if any of the available evidence satisfies it.

The scope of the rule is determined by the opening and closing elements of an APPEL RULE element. The evaluator returns the behavior (as specified in its behavior attribute -- accept, prompt, reject, or another behavior specified in an extension) of the rule that fired on the basis of the evidence discussed above. In addition, the rule evaluator may optionally return other information such as an explanation string (suitable for user display).

A rule includes a behavior, an optional persona (if the user agent supports multiple user repositories, this string identifies the data repository that should be used), an optional explanation and a set of expressions. A rule with an empty set of expressions always evaluates to false. A rule containing only the degenerate expression always evaluates to true. Multiple expressions within a rule are implicitly ANDed together; thus, all must hold true for the rule to evaluate to true. Individual expressions are each composed of simple-expressions and data-reference-expressions. Simple-expressions and data-reference-expressions within an expression are implicitly ANDed together as well.
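
The truth conditions in the preceding paragraph can be written down compactly. The following Python sketch models each expression as a function of the evidence; the names Expression, otherwise, and rule_is_true are hypothetical.

```python
from typing import Callable, Sequence

Expression = Callable[[dict], bool]


def otherwise(_evidence: dict) -> bool:
    """The degenerate expression: satisfied by any evidence."""
    return True


def rule_is_true(expressions: Sequence[Expression], evidence: dict) -> bool:
    """Evaluate a rule body under the semantics described above."""
    if not expressions:
        return False  # an empty set of expressions always evaluates to false
    # Multiple expressions are implicitly ANDed together.
    return all(expr(evidence) for expr in expressions)
```

The explicit check for an empty rule is needed because Python's all() returns True for an empty sequence, which is the opposite of the APPEL behavior described here.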

Simple-expressions are used to match generic elements and their attribute values within a proposal, for example '<STATEMENT purpose="3">'. They support only the = operator and may take string or numeric values. A data-reference expression on the other hand is a special kind of expression that is tailored to matching data references within a P3P proposal. Since the data reference elements of a P3P proposal determine which data fields are requested (and, upon acceptance of the proposal are ultimately supplied to the service), some special semantics were put into APPEL in order to prevent unintentional release of personal data.

APPEL supports the concept of quantifiers that govern the strictness of the required match between data reference expressions in the rule and the corresponding elements in the proposal. For example, rules that result in an accept behavior always use the semantics of the ONLY quantifier when matching data-reference-expressions: match if only the data reference elements that are listed in the rule are requested in the proposal. In contrast, rules that result in a non-"accept" behavior (e.g. reject and prompt) will default to using the semantics of the ANY quantifier: match if any of the data reference elements that are given in the rule are requested in the proposal.
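
The difference between the two quantifiers can be expressed with set operations. The sketch below is a simplification: it compares exact data reference names, ignores wildcards, and assumes that ONLY is trivially satisfied when a proposal requests no data at all. The function names are hypothetical.

```python
def matches_only(rule_refs: set, proposal_refs: set) -> bool:
    """ONLY: every data reference in the proposal is listed in the rule."""
    return proposal_refs <= rule_refs


def matches_any(rule_refs: set, proposal_refs: set) -> bool:
    """ANY: at least one data reference in the proposal is listed in the rule."""
    return bool(proposal_refs & rule_refs)


rule_refs = {"ID.PUID", "ClickStream.Client_"}
proposal_refs = {"ID.PUID", "User.Name.First"}

# An accept rule must not fire: the proposal asks for more than the rule lists.
print(matches_only(rule_refs, proposal_refs))
# A reject or prompt rule listing the same elements would fire.
print(matches_any(rule_refs, proposal_refs))
```

The asymmetry is deliberate: a proposal that requests even one unlisted element fails ONLY, so an accept rule cannot silently release data it does not name.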

APPEL supports a single wildcard metacharacter. Simple-expressions and data-reference-expressions can use this wildcard to match ranges of values such as <REF name="User.*"> (any element from the "User" data set). Wildcards can also be used to indicate that a particular attribute must be present, but that it may take any value.
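
A minimal sketch of this wildcard behavior in Python follows. It assumes that "*" matches any run of characters (including none) and that every other character is literal; the function names are hypothetical and the precise matching rules are defined by the APPEL working draft, not by this sketch.

```python
import re


def value_matches(pattern: str, value: str) -> bool:
    """True if value fits pattern, where '*' is the only metacharacter."""
    # Escape the literal pieces and let each '*' stand for any characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def attribute_matches(pattern: str, attributes: dict, name: str) -> bool:
    """True if the attribute is present and its value fits pattern."""
    if name not in attributes:
        return False  # a wildcard value still requires the attribute to exist
    return value_matches(pattern, attributes[name])
```

With these definitions, the pattern "User.*" matches "User.Name.First" but not "Web.Abstract.ClientClickStream", and a pattern of "*" for discURI matches any proposal whose DISCLOSURE element carries that attribute.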

The following is an example of a simple APPEL ruleset. Although the example is a well formed APPEL ruleset, it is used only to demonstrate a small set of example rules and is not necessarily a realistic example. This example ruleset represents the following natural language preferences:

The user does not mind revealing non-identifiable click-stream data and a pairwise user ID (PUID) to sites that collect no other information. However, the user insists that the service provide a human-readable privacy disclosure.
All other requests for data transfer should be denied.

The following listing illustrates one way to encode these preferences into an APPEL ruleset using two APPEL rules. An "accept" rule (i.e., a rule with the string "accept" in its behavior attribute) first checks to see if only non-identifiable clickstream data and/or a PUID is collected, and accepts if disclosure information is available (lines 4-14). Otherwise, a "reject" rule encapsulating the degenerate expression "OTHERWISE" will fire, rejecting proposals that contain requests for data transfer (lines 17-20).

Simple Ruleset in APPEL Level 1

000: <APPEL:APPEL xmlns="http://www.w3.org/APPEL" Order="RDF:Seq">
001: <APPEL:RULESET crtdby="APPEL WG" crtdon="Wed, 12-Aug-1998 09:12:32 GMT">
002:   <RDF:SEQ> <!-- This might be simplified by later versions of RDF -->
003:     <RDF:LI>
004:       <APPEL:RULE behavior="accept"
005:                   description="Service only collects clickstream data">
006:           <P3P:USES>
007:           <P3P:STATEMENT action="r" id="0">
008:               <P3P:REF name="ID.PUID"/>
009:               <P3P:REF name="ClickStream.Client_"/>
010:           </P3P:STATEMENT>
011:           </P3P:USES>
012:           <P3P:DISCLOSURE discURI="*"/>
013:         </P3P:PROP>
014:       </APPEL:RULE>
015:     </RDF:LI>
016:     <RDF:LI>
017:       <APPEL:RULE behavior="reject"
018:                   description="I don't want to be identified!">
019:           <APPEL:OTHERWISE/>
020:       </APPEL:RULE>
021:     </RDF:LI>
022:   </RDF:SEQ>
023: </APPEL:RULESET>
024: </APPEL:APPEL>

Note that the line numbers are not part of the APPEL syntax. Lines 6-11 in the ruleset example contain a statement in the same syntax as it would appear in a P3P proposal. Line 12 contains a P3P disclosure element with a wild card value for the given attribute, indicating that the attribute must be present.

Discussion

The P3P Preferences Working Group has specified a language that meets our stated goals and requirements. There has been some debate as to the degree of difficulty involved in implementing APPEL. However, we generally believe that APPEL Level 1 should not pose any major obstacles for implementors, and IBM's early APPEL implementation suggests that even APPEL Level 2 should be fairly straightforward to implement. On the other hand, since APPEL is a special-purpose language, it requires the implementation of a parser and trust engine that otherwise would not be included in future browsers. If a suitable XML or RDF query language were available and widely implemented, our objectives might be met without the need for additional implementation work. Furthermore, a general-purpose language would likely provide more flexibility in the kinds of rules that could be created and provide better support for integrating P3P rules with other types of rules that users may wish to establish.

While several of the proposed general-purpose languages look promising as APPEL alternatives, it is important to note that P3P is quite different from many of the other applications the designers of these languages seem to be envisioning. A P3P query language requires some of the features of an advanced authorization language in addition to some of the features of a query language. In our application we are more concerned with the existence of data that matches a certain pattern than with retrieving, aggregating, or otherwise operating on that data. Although this makes our task simpler in some ways, it may require some awkward syntax to achieve using a general-purpose language. This might be addressed by the use of macros designed specifically for P3P-related tasks.

Because P3P is intended to help users protect their privacy and retain control over their personal information, the working group tried to design APPEL with syntactic and semantic safeguards against unintentional release of personal data. For example, APPEL rules that accept proposals (and may agree to the release of data) must use the ONLY quantifier, which requires them to fire only if the data referenced in a proposal is restricted to the list of data elements or categories enumerated in the rule. Rules that reject proposals, on the other hand, default to using the ANY quantifier, which requires them to fire if any of the data referenced in the proposal is on the list of data elements or categories enumerated in the rule (although rules may be created that override this default). While a general-purpose language supported by a good user interface or editing tool might provide similar kinds of safeguards, building these features into the language offers a higher level of security than is likely to be achieved using a general-purpose language.

While one of the primary goals of APPEL is to allow people to install pre-defined rulesets, it is also important that people be able to customize their rulesets to better meet their needs. Designing a user interface that allows users to customize an existing APPEL ruleset remains an open problem. Certainly a simple rule editor could allow people to edit individual rules or add or subtract rules from their ruleset. However, it would be desirable to have a user interface that could present users with an overall picture of the conditions under which proposals will be accepted or rejected, allowing them to make conceptual changes that might translate into multiple specific changes to their ruleset.

References

[APPEL]: Marc Langheinrich (editor). A P3P Preference Exchange Language (APPEL) Working Draft. W3C Working Draft 9 October 1998. http://www.w3.org/P3P/Group/Preferences/Drafts/WD-P3P-preferences-19981009.html
[PICSRules]: Martin Presler-Marshall (editor). PICSRules 1.1. W3C Recommendation 29 Dec 1997. http://www.w3.org/TR/REC-PICSRules
[P3P10]: Massimo Marchiori, Joseph Reagle, and Dan Jaye (editors). Platform for Privacy Preferences (P3P1.0) Specification. W3C Working Draft 9 November 1998. http://www.w3.org/TR/WD-P3P/
[P3PNote]: Joseph Reagle and Lorrie Cranor. The Platform for Privacy Preferences. W3C Note 6 November 1998. http://www.w3.org/TR/NOTE-P3P-CACM/

Acknowledgements

Thanks to the other members of the P3P Preferences Working Group: Marc Langheinrich, Massimo Marchiori, Joseph Reagle, Drummond Reed, and Mary Ellen Zurko. The ideas expressed in this paper are the results of many discussions of this working group.