W3C

The Platform for Privacy Preferences 1.1 (P3P1.1) Specification

W3C Working Draft 10 February 2004

This Version:
http://www.w3.org/TR/2004/WD-P3P11-20040210/
Latest Version:
http://www.w3.org/TR/P3P11/
Editor:
Rigo Wenning, W3C / ERCIM (rigo@w3.org)
Authors:
Lorrie Cranor, CMU (P3P 1.0 & P3P 1.1)
Marc Langheinrich, ETH Zurich (P3P 1.0)
Massimo Marchiori, W3C / MIT / University of Venice (P3P 1.0)
Martin Presler-Marshall, IBM (P3P 1.0)
Joseph Reagle, W3C/MIT (P3P 1.0)
Matthias Schunter, IBM (P3P 1.1)

Abstract

This is the specification of the Platform for Privacy Preferences 1.1 (P3P 1.1). This document, along with its normative references, includes all the specification necessary for the implementation of interoperable P3P 1.1 applications. P3P 1.1 is based on the P3P 1.0 Recommendation and adds some features using the P3P 1.0 Extension mechanism. It also contains a new binding mechanism that can be used to bind policies for XML Applications beyond HTTP transactions.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a first Public Working Draft of the P3P 1.1 Specification for review by W3C Members and other interested parties. P3P 1.1 was developed from suggestions made at a Workshop in Dulles/Virginia and a Workshop in Kiel/Germany. The community at large gave feedback on limitations and shortcomings of P3P 1.0. Those suggestions that found sufficient support are now included in this new P3P 1.1 Working Draft. All new features are built using P3P's own Extension mechanism. Those extensions are contained in a new XML Schema in Appendix 5 and carry their own new namespace. All P3P 1.0 elements preserve their old namespace. Additionally, this Working Draft contains all the errata to P3P 1.0. Note that all changes from the P3P 1.0 Recommendation to this Working Draft are marked up with a different background color, even in the outline.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced by the P3P Specification Working Group as part of the Privacy Activity in the W3C Technology & Society Domain.

Patent disclosures relevant to this specification may be found on the P3P1.1 patent disclosure page, in conformance with W3C policy.

Please report errors in this document to www-p3p-dev@w3.org (publicly archived).


Table of Contents

  1. Introduction
    1. The P3P1.1 Specification
      1. Goals and Capabilities of P3P1.1
      2. Example of P3P in Use
      3. P3P Policies
      4. P3P User Agents
      5. Implementing P3P1.1 on Servers
      6. Future Versions of P3P
    2. About this Specification
    3. Identity Definitions in the P3P Specification
      1. Identified Data
      2. Non-Identifiable and Linked Data
      3. Identifiers
    4. Terminology
  2. Referencing Policies
    1. Overview and Purpose of Policy References
    2. Locating Policy Reference Files
      1. Well-Known Location
      2. HTTP Headers
      3. The HTML link Tag
      4. The XHTML link Tag
      5. HTTP ports and other protocols
    3. Policy Reference File Syntax and Semantics
      1. Example Policy Reference File
      2. Policy Reference File Definition
        1. Policy reference file processing
          1. Significance of order
          2. Wildcards in policy reference files
        2. The META and POLICY-REFERENCES elements
        3. Policy reference file lifetimes and the EXPIRY element
          1. Motivation and mechanism
          2. The EXPIRY element
          3. Use of HTTP headers
          4. Error handling for policy reference file lifetimes
        4. The POLICY-REF element
        5. The INCLUDE and EXCLUDE elements
        6. The HINT element
        7. The COOKIE-INCLUDE and COOKIE-EXCLUDE elements
        8. The METHOD element
      3. Applying a Policy to a URI
      4. Forms and Related Mechanisms
    4. Additional Requirements
      1. Non-ambiguity
      2. Multiple Languages
      3. The Safe Zone
      4. Policy and Policy Reference File Processing by User Agents
      5. Security of Policy Transport
      6. Policy Updates
      7. Absence of Policy Reference File
      8. Asynchronous Evaluation
    5. The P3P Generic Attribute for XML Applications
    6. Example Scenarios
  3. Policy Syntax and Semantics
    1. Example policies
      1. English language policies
      2. XML encoding of policies
    2. Policies
      1. The POLICIES element
      2. The POLICY element
      3. The STATEMENT-GROUP-DEF (EXTENSION)
      4. The TEST element
      5. The ENTITY element
      6. The ACCESS element
      7. The DISPUTES element
      8. The REMEDIES element
    3. Statements
      1. The STATEMENT element
      2. The STATEMENT-GROUP element (EXTENSION)
      3. The CONSEQUENCE element
      4. The NON-IDENTIFIABLE element
      5. The PURPOSE element
      6. The RECIPIENT element
      7. The RETENTION element
      8. The DATA-GROUP and DATA elements
    4. Categories and the CATEGORIES element
    5. Extension Mechanism: the EXTENSION element
    6. User Preferences
  4. Compact Policies
    1. Referencing Compact Policies
    2. Compact Policies Vocabulary
      1. Compact ACCESS
      2. Compact DISPUTES
      3. Compact REMEDIES
      4. Compact NON-IDENTIFIABLE
      5. Compact PURPOSE
      6. Compact RECIPIENT
      7. Compact RETENTION
      8. Compact CATEGORIES
      9. Compact TEST
    3. Compact Policy Scope
    4. Compact Policy Lifetime
    5. Transforming a P3P Policy to a Compact Policy
    6. Transforming a Compact Policy to a P3P Policy
    7. Compact Policy Processing by User Agents
  5. Data schemas
    1. Natural Language Support for Data Schemas
    2. Data Structures
    3. The DATA-DEF and DATA-STRUCT elements
      1. Categories in P3P Data Schemas
      2. P3P Data Schema Example
      3. Use of data element names
    4. Persistence of data schemas
    5. Basic Data Structures
      1. Dates
      2. Names
      3. Logins
      4. Certificates
      5. Telephones
      6. Contact Information
        1. Postal
        2. Telecommunication
        3. Online
      7. Access Logs and Internet Addresses
        1. URI
        2. ipaddr
        3. Access Log Information
        4. Other HTTP Protocol Information
    6. The base data schema
      1. User Data
      2. Third Party Data
      3. Business Data
      4. Dynamic Data
    7. Categories and Data Elements/Structures
      1. Fixed-Category Data Elements/Structures
      2. Variable-Category Data Elements/Structures
    8. Using Data Elements
  6. User Agent Guidelines
    1. Completeness of Human-Readable Translations
    2. Plain Language Translations of P3P Vocabulary Elements
    3. Storage of P3P Policies and Translations
    4. Compact Policy Processing
    5. Sanity Checking P3P Policies
  7. Appendices
    Appendix 1: References (Normative)
    Appendix 2: References (Non-normative)
    Appendix 3: The P3P base data schema Definition (Normative)
    Appendix 4: XML Schema Definition (Normative)
    Appendix 5: XML DTD Definition (Non-normative)
    Appendix 5: The XML Schema for P3P 1.1 Extensions and the P3P generic attribute
    Appendix 6: ABNF Notation (Normative)
    Appendix 7: P3P Guiding Principles (Non-normative)
    Appendix 8: Working Group Contributors (Non-normative)
    Changelog

1. Introduction

The Platform for Privacy Preferences Project (P3P) enables Web sites to express their privacy practices in a standard format that can be retrieved automatically and interpreted easily by user agents. P3P user agents will allow users to be informed of site practices (in both machine- and human-readable formats) and to automate decision-making based on these practices when appropriate. Thus users need not read the privacy policies at every site they visit.

Although P3P provides a technical mechanism for ensuring that users can be informed about privacy policies before they release personal information, it does not provide a technical mechanism for making sure sites act according to their policies. Products implementing this specification MAY provide some assistance in that regard, but that is up to specific implementations and outside the scope of this specification. However, P3P is complementary to laws and self-regulatory programs that can provide enforcement mechanisms. In addition, P3P does not include mechanisms for transferring data or for securing personal data in transit or storage. P3P may be built into tools designed to facilitate data transfer. These tools should include appropriate security safeguards.

1.1 The P3P 1.1 Specification

The P3P1.1 specification defines the syntax and semantics of P3P privacy policies, and the mechanisms for associating policies with Web resources. P3P policies consist of statements made using the P3P vocabulary for expressing privacy practices. P3P policies also reference elements of the P3P base data schema -- a standard set of data elements that all P3P user agents should be aware of. The P3P specification includes a mechanism for defining new data elements and data sets, and a simple mechanism that allows for extensions to the P3P vocabulary.

1.1.1 Goals and Capabilities of P3P 1.1

P3P version 1.0 is a protocol designed to inform Web users of the data-collection practices of Web sites. It provides a way for a Web site to encode its data-collection and data-use practices in a machine-readable XML format known as a P3P policy. The P3P specification defines:

The goal of P3P version 1.1 is twofold. First, it allows Web sites to present their data-collection practices in a standardized, machine-readable, easy-to-locate manner. Second, it enables Web users to understand what data will be collected by sites they visit, how that data will be used, and what data/uses they may "opt-out" of or "opt-in" to.

P3P version 1.1 departs from version 1.0 and adds some enhancements and some new constraints:

1.1.2 Example of P3P in Use

As an introduction to P3P, let us consider one common scenario that makes use of P3P. Claudia has decided to check out a store called CatalogExample, located at http://www.catalog.example.com/. Let us assume that CatalogExample has placed P3P policies on all their pages, and that Claudia is using a Web browser with P3P built in.

Claudia types the address for CatalogExample into her Web browser. Her browser is able to automatically fetch the P3P policy for that page. The policy states that the only data the site collects on its home page is the data found in standard HTTP access logs. Now Claudia's Web browser checks this policy against the preferences Claudia has given it. Is this policy acceptable to her, or should she be notified? Let's assume that Claudia has told her browser that this is acceptable. In this case, the homepage is displayed normally, with no pop-up messages appearing. Perhaps her browser displays a small icon somewhere along the edge of its window to tell her that a privacy policy was given by the site, and that it matched her preferences.

Next, Claudia clicks on a link to the site's online catalog. The catalog section of the site has some more complex software behind it. This software uses cookies to implement a "shopping cart" feature. Since more information is being gathered in this section of the Web site, the Web server provides a separate P3P policy to cover this section of the site. Again, let's assume that this policy matches Claudia's preferences, so she gets no pop-up messages. Claudia continues and selects a few items she wishes to purchase. Then she proceeds to the checkout page.

The checkout page of CatalogExample requires some additional information: Claudia's name, address, credit card number, and telephone number. Another P3P policy is available that describes the data that is collected here and states that her data will be used only for completing the current transaction, her order.

Claudia's browser examines this P3P policy. Imagine that Claudia has told her browser that she wants to be warned whenever a site asks for her telephone number. In this case, the browser will pop up a message saying that this Web site is asking for her telephone number, and explaining the contents of the P3P statement. Claudia can then decide if this is acceptable to her. If it is acceptable, she can continue with her order; otherwise she can cancel the transaction.

Alternatively, Claudia could have told her browser that she wanted to be warned only if a site is asking for her telephone number and was going to give it to third parties and/or use it for uses other than completing the current transaction. In that case, she would have received no prompts from her browser at all, and she could proceed with completing her order.

Note that this scenario describes one hypothetical implementation of P3P. Other types of user interfaces are also possible.
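The two preference settings in this scenario can be sketched in code. The following is a minimal illustration and not part of the specification: the Statement class, the data label "telephone", and both function names are hypothetical simplifications, and the strings "current" and "ours" stand in for the purpose and recipient values of the policy vocabulary. The sketch also assumes that the checkout policy names only the site itself as recipient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Simplified stand-in for one P3P statement (hypothetical structure)."""
    data: frozenset        # labels of the data collected
    purposes: frozenset    # why the data is collected
    recipients: frozenset  # who receives the data


def warn_on_any_collection(statements, item="telephone"):
    """First preference: warn whenever the item is collected at all."""
    return any(item in s.data for s in statements)


def warn_on_wider_use(statements, item="telephone"):
    """Second preference: warn only if the item goes to other recipients
    or is used for anything beyond the current transaction."""
    for s in statements:
        if item not in s.data:
            continue
        if s.purposes - {"current"} or s.recipients - {"ours"}:
            return True
    return False


# The checkout policy from the scenario, reduced to one statement.
checkout = [
    Statement(
        data=frozenset({"name", "address", "credit-card", "telephone"}),
        purposes=frozenset({"current"}),
        recipients=frozenset({"ours"}),
    )
]

print(warn_on_any_collection(checkout))  # True: the browser prompts Claudia
print(warn_on_wider_use(checkout))       # False: no prompt is shown
```

Keeping each preference as a separate predicate over the same statements mirrors the scenario, in which the policy stays fixed and only Claudia's settings change.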

1.1.3 P3P Policies

P3P policies use an XML with namespaces (cf. [XML] and [XML-Name]) encoding of the P3P vocabulary to provide contact information for the legal entity making the representation of privacy practices in a policy, enumerate the types of data or data elements collected, and explain how the data will be used. In addition, policies identify the data recipients, and make a variety of other disclosures including information about dispute resolution, and the address of a site's human-readable privacy policy. P3P policies must cover all relevant data elements and practices. However, legal issues regarding law enforcement demands for information are not addressed by this specification. It is possible that a site that otherwise abides by its policy of not redistributing data to others may be required to do so by force of law. P3P declarations are positive, meaning that sites state what they do, rather than what they do not do. The P3P vocabulary is designed to be descriptive of a site's practices rather than simply an indicator of compliance with a particular law or code of conduct. However, user agents may be developed that can test whether a site's practices are compliant with a law or code.

P3P policies represent the practices of the site. Intermediaries such as telecommunication providers, Internet service providers, proxies and others may be privy to the exchange of data between a site and a user, but their practices may not be governed by the site's policies. In addition, note that each P3P policy is applied to specific Web resources (Web pages, images, cookies, etc.) listed in a policy reference file. By placing one or more P3P policies on a Web site, a company or organization does not make any statements about the privacy practices associated with other Web resources not mentioned in their policy reference file, with other online activities that do not involve data collected on Web sites covered by their P3P policy, or with offline activities that do not involve data collected on Web sites covered by their P3P policy.

In cases where the P3P vocabulary is not precise enough to describe a Web site's practices, sites should use the vocabulary terms that most closely match their practices and provide further explanations (as stated in Section 3.2). However, policies MUST NOT make false or misleading statements.

1.1.4 P3P User Agents

P3P 1.1 user agents can be built into Web browsers, browser plug-ins, or proxy servers. They can also be implemented as Java applets or JavaScript; or built into electronic wallets, automatic form-fillers, or other user data management tools. P3P user agents look for references to a P3P policy at a well-known location, in P3P headers in HTTP responses, and in P3P link tags embedded in HTML content. These references indicate the location of a relevant P3P policy. User agents can fetch the policy from the indicated location, parse it, and display symbols, play sounds, or generate user prompts that reflect a site's P3P privacy practices. They can also compare P3P policies with privacy preferences set by the user and take appropriate actions. P3P can perform a sort of "gate keeper" function for data transfer mechanisms such as electronic wallets and automatic form fillers. A P3P user agent integrated into one of these mechanisms would retrieve P3P policies, compare them with user's preferences, and authorize the release of data only if a) the policy is consistent with the user's preferences and b) the requested data transfer is consistent with the policy. If one of these conditions is not met, the user might be informed of the discrepancy and given an opportunity to authorize the data release themselves.
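The three places a user agent looks for policy references can be gathered into one helper. This is a rough sketch rather than a conforming implementation: the function and class names are hypothetical, and the path /w3c/p3p.xml, the policyref header field, and the rel="P3Pv1" link relation are assumed here to be the forms that section 2.2 defines. The order of the returned list simply follows the paragraph above and says nothing about precedence.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed well-known location (see section 2.2.1).
WELL_KNOWN_PATH = "/w3c/p3p.xml"


class _P3PLinkFinder(HTMLParser):
    """Collects href values of <link rel="P3Pv1"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel == "p3pv1" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def candidate_policy_refs(page_uri, headers, html):
    """Return absolute URIs where a policy reference file may be found."""
    candidates = [urljoin(page_uri, WELL_KNOWN_PATH)]

    # P3P response header, e.g. P3P: policyref="/P3P/PolicyReferences.xml"
    match = re.search(r'policyref\s*=\s*"([^"]*)"', headers.get("P3P", ""))
    if match:
        candidates.append(urljoin(page_uri, match.group(1)))

    # P3P link tags embedded in the HTML content.
    finder = _P3PLinkFinder()
    finder.feed(html)
    candidates.extend(urljoin(page_uri, href) for href in finder.hrefs)
    return candidates
```

A caller would pass the URI of the fetched page, its response headers as a dictionary, and the HTML body, then try the returned URIs according to the rules of section 2.2.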

The P3P 1.1 Specification gives implementers a lot of flexibility to determine the design and functionality of P3P user agents. However, the specification does include some requirements and guidelines for user agent implementers. Most of these can be found in section 6 and Appendix 7.

1.1.5 Implementing P3P 1.1 on Servers

Web sites can implement P3P 1.1 on their servers by translating their human-readable privacy policies into P3P syntax and then publishing the resulting files along with a policy reference file that indicates the parts of the site to which the policy applies. Automated tools can assist site operators in performing this translation. P3P 1.1 can be implemented on existing HTTP/1.1-compliant Web servers without requiring additional or upgraded software. Servers may publish their policy reference files at a well-known location, or they may reference their P3P policy reference files in HTML/XHTML content using a link tag. Alternatively, compatible servers may be configured to insert a P3P extension header into all HTTP responses that indicates the location of a site's P3P policy reference file.
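As an illustration of the header-based option, the sketch below wraps a Python WSGI application so that every response carries a P3P header. It is only one possible approach under stated assumptions: the function name add_p3p_header is hypothetical, the policyref field is assumed to follow the header syntax of section 2.2.2, and real deployments would more likely set the header in the Web server's own configuration.

```python
def add_p3p_header(app, policyref_uri):
    """Wrap a WSGI app so each response points at a policy reference file."""
    p3p_header = ("P3P", f'policyref="{policyref_uri}"')

    def wrapped(environ, start_response):
        def start_with_p3p(status, headers, exc_info=None):
            # Append the P3P header to whatever the application set.
            return start_response(status, list(headers) + [p3p_header], exc_info)

        return app(environ, start_with_p3p)

    return wrapped
```

Because the wrapper only appends a header, the application itself needs no changes, which matches the point above that existing servers need no additional or upgraded software.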

Web sites have some flexibility in how they use P3P: they can opt for one P3P policy for their entire site or they can designate different policies for different parts of their sites. A P3P policy MUST cover all data generated or exchanged as part of a site's HTTP interactions with visitors. In addition, some sites may wish to write policies that cover all data an entity collects, regardless of how the data is collected.

1.1.6 Future Versions of P3P

Significant sections were removed from earlier drafts of the P3P 1.0 specification in order to facilitate rapid implementation and deployment of a P3P first step. A future version of the P3P specification might incorporate those features after P3P 1.0 is deployed. Such a specification would likely include improvements based on feedback from implementation and deployment experience as well as four major components that were part of the original P3P vision but not included in P3P 1.0 or 1.1:

The P3P 1.1 Specification contains the most urgent improvements suggested by the P3P Workshop of December 2002 in Dulles/Virginia. Some of the work suggested by this Workshop and by the P3P Workshop in Kiel is deferred to later versions.

1.2 About this Specification

This document, along with its normative references, includes all the specification necessary for the implementation of interoperable P3P applications.

The following key words are used throughout the document and have to be read as interoperability requirements. This specification uses words as defined in RFC2119 [KEY] for defining the significance of each particular requirement. These words are:

MUST or MUST NOT
This word or the adjective "required" means that the item is an absolute requirement of the specification.
SHOULD or SHOULD NOT
This word or the adjective "recommended" means that there may exist valid reasons in particular circumstances to ignore this item, but the full implications should be understood and the case carefully weighed before choosing a different course.
MAY
This word or the adjective "optional" means that this item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because it enhances the product, for example; another vendor may omit the same item.

The P3P specification defines, with the exception of section 2.2.2, section 2.2.3 and section 4, an XML with namespaces syntax (cf. [XML] and [XML-Name]). In the following, for the sake of brevity we will liberally talk about "XML", meaning the more accurate "XML with namespaces".

A BNF-like notation is also used throughout the specification: the [ABNF] notation used in this specification is specified in RFC2234 and summarized in Appendix 6. However, note that in the case of XML syntax, such ABNF syntax is only a grammar representative used to enhance readability (lacking, for example, all the syntactic flexibilities that are implicitly included in XML, e.g. whitespace rules, quoting using either single quote (') or double quote ("), character escaping, comments, case sensitivity, order of attributes, namespace handling), and as such it has no normative value. All the XML syntax defined in this specification MUST conform to the XML Schema for P3P (see Appendix 4), which, together with the other constraints expressed in this specification using natural language, constitutes the normative definition.

The (non-normative) DTD provided in Appendix 5 MAY be used to verify that P3P files are valid. However, there are some valid files that may be rejected if checked against the DTD due to their use of namespaces.

As far as the non-XML syntax defined in this specification is concerned (section 2.2.2 defining P3P's HTTP header, section 2.2.3 defining usage of P3P in HTML, and section 4 defining compact policies), instead, the ABNF notation (together with the other constraints expressed in this specification using natural language) constitutes the normative definition.

1.3 Identity Definitions in the P3P Specification

In privacy regulations, guidelines and papers about privacy a variety of terms are used to describe data that identifies an individual to varying degrees.

The European Union Directive defines an identifiable person as one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity. The Directive also states that in determining whether a person is identifiable account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable.

In Australia, personal information is information about an individual who can be identified, or whose identity could be reasonably ascertained. In Canada, personal information means information about an identifiable individual. In the United States, different sectors have different standards for identifiability of data. Similarly, in many other policy documents, terms such as personally identifiable information (PII) are often either left undefined or the cause of heated debate.

The P3P Specification Working Group has taken the viewpoint that most information referring to an individual is identifiable in some way. As with other important areas of the specification, the goal of the working group was to allow for a wide variety of understandings of identity in order to allow data collectors to best express their policy and users to make choices based on a definition of identity information that is important to them. (More information on the debate and the definitions can be found in Lorrie Cranor's book on P3P.)

1.3.1 Identified Data

The most common term in the specification is identified data, which focuses on whether a service knows the data subject's identity.

Identified data is information in a record or profile that can reasonably be tied to an individual. Admittedly, this is a somewhat subjective standard. For example, a data collector storing Internet Protocol (IP) addresses (which can be created dynamically or could be static and therefore tied to a particular computer used by a single individual) should consider the IP address identified data only when this data is added to the record or profile of a specific individual. In the more common case, where data collectors use IP addressing information in the aggregate or make no attempt to tie the IP address to a specified individual or computer over a long period of time, IP addresses are not considered identified even though it is possible for someone (e.g., law enforcement agents with proper subpoena powers) to identify the individual based on the stored data.

As mentioned above, in the P3P context, any data that can be used reasonably by a data controller or any other person to identify an individual is considered to be identifiable data. The P3P specification uses the term identified to describe a subset of this data that can reasonably be used by a data collector without assistance from other parties to identify an individual.

1.3.2 Non-Identifiable and Linked Data

The working group also felt that data collectors should be able to acknowledge when they make specific attempts to anonymize information.

The term non-identifiable data refers to efforts made specifically to de-identify data. For example, a data collector collecting and storing IP addresses but not using them should NOT call this data non-identifiable even in the common case where they have no plans to identify an actual individual or computer. However, if a Web site collects IP addresses, but actively deletes all but the last four digits of this information in order to determine short-term use while ensuring that a particular individual or computer cannot be consistently identified, then the data collector can and should call this information non-identifiable. Also, non-identifiable can be used in cases where no information is being collected at all. Since most Web servers are designed to keep Web logs for maintenance, this would most likely mean that the data collector has taken specific efforts to ensure the anonymity of users.

Under the above definitions, a lot of information could be identifiable (not specifically made anonymous), but not identified (reasonably able to be tied to an individual or computer).

Similarly, the term linked refers to how information is being used in connection with a cookie. All data in a cookie or linked to a particular user must be disclosed in the cookie's policy. Using the terminology above, if the data collector collects identifiable information about the user it is generally linked data. For example, if the data collector stores a login name in a file associated with a persistent cookie and the login name is linked to personal data, the cookie is clearly linked.

In a less clear-cut example, if the data collector ties the cookie to a specific order id in a flat file and that order id is tied to personal information in a related file, the cookie would be linked to all of the relational data unless specific precautions have been taken to ensure that a data operator with access to the relational data cannot access the flat cookie data and vice versa.

In other words, a data collector that uses cookies must:

1.3.3 Identifiers

The Working Group decided against an identified or identifiable label for particular types of data. However, user agent implementers have the option of assigning these or other labels themselves and building user interfaces that allow users to make decisions about web sites on the basis of how they collect and use certain types of data.

The Working Group felt that different user agent implementations could be created to focus on different concerns around data type. Therefore, the working group enabled the creation of a robust data schema including broad categories of information that may be considered sensitive by certain user groups. The Working Group hopes that a diverse set of user agents will be created to allow users the ability to make identity decisions based on specific collections and types of collection if they desire to do so. For example, a user agent could allow users to opt to be prompted when a medical or financial identifier is being collected, independent of how that information is being used.

1.4 Terminology

Character
Strings consist of a sequence of zero or more characters, where a character is defined as in the XML Recommendation [XML]. A single character in P3P thus corresponds to a single Unicode abstract character with a single corresponding Unicode scalar value (see [UNICODE]).
Data Element
An individual data entity, such as last name or telephone number. For interoperability, P3P 1.1 specifies a base set of data elements.
Data Category
A significant attribute of a data element or data set that may be used by a trust engine to determine what type of element is under discussion, such as physical contact information. P3P 1.1 specifies a set of data categories.
Data Set
A known grouping of data elements, such as "user.home-info.postal". The P3P 1.1 base data schema specifies a number of data sets.
Data Schema
A collection of data elements and sets defined using the P3P 1.1 DATASCHEMA element. P3P 1.1 defines a standard data schema called the P3P base data schema.
Data Structure
A hierarchical description of a set of data elements. A data set can be described according to its data structure. P3P 1.1 defines a set of basic data structures that are used to describe the data sets in the P3P base data schema.
Equable Practice
A practice that is very similar to another in that the purpose and recipients are the same or more constrained than the original, and the other disclosures are not substantially different. For example, two sites with otherwise similar practices that follow different -- but similar -- sets of industry guidelines.
Identified Data
Identified data is information in a record or profile that can reasonably be tied to an individual, as defined in Section 1.3.
Policy
A collection of one or more privacy statements together with information asserting the identity, URI, assurances, and dispute resolution procedures of the service covered by the policy.
Practice
The set of disclosures regarding data usage, including purpose, recipients, and other disclosures.
Preference
A rule, or set of rules, that determines what action(s) a user agent will take. A preference might be expressed as a formally defined computable statement (e.g., the [APPEL] preference exchange language).
Purpose
The reason(s) for data collection and use.
Repository
A mechanism for storing user information under the control of the user agent.
Resource
A network data object or service that can be identified by a URI. Resources may be available in multiple representations (e.g. multiple languages, data formats, size, and resolutions) or vary in other ways.
Safe Zone
Part of a Web site where the service provider performs only minimal data collection, and any data that is collected is used only in ways that would not reasonably identify an individual.
Service
A program that issues policies and (possibly) data requests. By this definition, a service may be a server (site), a local application, a piece of locally active code, such as an ActiveX control or Java applet, or even another user agent. Typically, however, a service is a Web site. In this specification the terms "service" and "Web site" are often used interchangeably.
Service Provider (Data Controller, Legal Entity)
The person or legal entity which offers information, products or services from a Web site, collects information, and is responsible for the representations made in a practice statement.
Statement
A P3P statement is a set of privacy practice disclosures relevant to a collection of data elements.
URI
A Uniform Resource Identifier used to locate Web resources. For definitive information on URI syntax and semantics, see [URI]. URIs that appear within XML or HTML have to be treated as specified in [CHARMODEL], section Character Encoding in URI References. This does not apply to URIs appearing in HTTP header fields; the URIs there should always be fully escaped.
User
An individual (or group of individuals acting as a single entity) on whose behalf a service is accessed and for which personal data exists. P3P policies describe the collection and use of personal data about this individual or group.
User Agent
A program whose purpose is to mediate interactions with services on behalf of the user under the user's preferences. A user may have more than one user agent, and agents need not reside on the user's desktop, but any agent must be controlled by and act on behalf of only the user. The trust relationship between a user and his or her agent may be governed by constraints outside of P3P. For instance, an agent may be trusted as a part of the user's operating system or Web client, or as a part of the terms and conditions of an ISP or privacy proxy.

2. Referencing Policies

2.1 Overview and Purpose of Policy References

Locating a P3P policy is one of the first steps in the operation of the P3P protocol. Services use policy references to state what policy applies to a specific URI or set of URIs. User agents use policy references to locate the privacy policy that applies to a Web resource, so that they can process that policy for the benefit of their user.

Policy references are used extensively as a performance optimization. P3P policies are typically several kilobytes of data, while a URI that references a privacy policy is typically less than 100 bytes. In addition to the bandwidth savings, policy references also reduce the need for computation: policies can be uniquely associated with URIs, so that a user agent need only parse and process a policy once rather than process it with every document to which the policy applies. Furthermore, by placing the information about relevant policies in a centralized location, Web site administration is simplified.

A policy reference file is used to associate P3P policies with certain regions of URI-space. The policy reference file is an XML file that uses namespaces (see [XML] and [XML-Name]) and can specify the policy for a single Web document, for portions of a Web site, or for an entire site. The policy reference file may refer to one or more P3P policies; this allows a single reference file to cover an entire site, even if different P3P policies apply to different portions of the site. The policy reference file is used to make any or all of the following statements:

All of these statements are made in the body of the policy reference file.

2.2 Locating Policy Reference Files

This section describes the mechanisms used to indicate the location of a policy reference file. Detailed syntax is also given for the supported mechanisms.

The location of the policy reference file can be indicated using one of four mechanisms. The policy reference file

  1. may be located in a predefined "well-known" location, or
  2. a document may indicate a policy reference file through an HTML link tag, or
  3. a document may indicate a policy reference file through an XHTML link tag, or
  4. through an HTTP header.

Note that if user agents support retrieving HTML (resp. XHTML) content over HTTP, they MUST handle mechanisms 1, 2 and 4 (resp. 1, 3 and 4) listed above interchangeably. See also the requirements for non-ambiguity.

Policies are applied at the level of resources. A "page" from the user's perspective may be composed of multiple HTTP resources; each may have its own P3P policy associated with it. As a practical note, however, placing many different P3P policies on different resources on a single page may make rendering the page and informing the user of the relevant policies difficult for user agents. Additionally, services are encouraged to craft their policy reference files such that a single policy reference file covers any given "page"; this will speed up the user's browsing experience.

For a user agent to process the policy that applies to a given resource, it must locate the policy reference file for that resource, fetch the policy reference file, parse the policy reference file, fetch any required P3P policies, and then parse the P3P policy or policies.

This document does not specify how P3P policies may be associated with Web resources retrieved by means other than HTTP. However, it does not preclude future development of mechanisms for associating P3P policies with resources retrieved using other protocols. Furthermore, additional methods of associating P3P policies with HTTP resources may be developed in the future.

2.2.1 Well-Known Location

Web sites using P3P MAY (and are strongly encouraged to) place a policy reference file in a "well-known" location. To do this, a policy reference file would be made available on the site at the path /w3c/p3p.xml.

Note that sites are not required to use this mechanism; however, by using this mechanism, sites can ensure that their P3P policy will be accessible to user agents before any other resources are requested from the site. This will reduce the need for user agents to access the site using safe zone practices. Additionally, if a site chooses to use this mechanism, the policy reference file located in the well-known location is not required to cover the entire site. For example, sites where not all of the content is under the control of a single organization MAY choose not to use this mechanism, or MAY choose to post a policy reference file which covers only a limited portion of the site.

Use of the well-known location for a policy reference file does not preclude use of other mechanisms for specifying a policy reference file. Portions of the site MAY use any of the other supported mechanisms to specify a policy reference file, so long as the non-ambiguity requirements are met.

For example, imagine a shopping-mall Web site run by the MallExample company. On their Web site (mall.example.com), companies offering goods or services at the mall would get a company-specific subtree of the site, perhaps in the path /companies/company-name. The MallExample company may choose to put a policy reference file in the well-known location which covers all of their site except the /companies subtree. Then if the ShoeStoreExample company has some content in /companies/shoestoreexample, they could use one of the other mechanisms to indicate the location of a policy reference file covering their portion of the mall.example.com site.

One case where using the well-known location for policy reference files is expected to be particularly useful is in the case of a site which has divided its content across several hosts. For example, consider a site which uses a different logical host for all of its Web-based applications than for its static HTML content. The other mechanisms allowed for specifying the location of a policy reference file require that some URI on the host being accessed must be fetched to locate the policy reference file. However, the well-known location mechanism has no such requirement. Consider the example of an HTML form located on www.example.com. Imagine that the action URI on that form points to server cgi.example.com. The policy reference file that covers the form is unable to make any statements about the action URI that processes the form. However, the site administrator publishes a policy reference file at http://cgi.example.com/w3c/p3p.xml that covers the action URI, thus enabling a user agent to easily locate the P3P policy that applies to the action URI before submitting the form contents.
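
As a minimal sketch of the well-known location mechanism, the following Python function derives the policy reference file URI from any resource URI on a host. The function name `well_known_prf_uri` is hypothetical, and accepting the `https` scheme alongside `http` is an assumption made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/w3c/p3p.xml"  # path given in this section


def well_known_prf_uri(resource_uri: str) -> str:
    """Return the well-known policy reference file URI for the host
    (and port) that serves resource_uri."""
    parts = urlsplit(resource_uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute HTTP(S) URI: {resource_uri!r}")
    # Keep scheme and authority (host and port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))
```

For the form example above, a user agent could call `well_known_prf_uri` on the action URI at cgi.example.com before submitting the form, without first fetching any other resource from that host.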

2.2.2 HTTP Headers

Any document retrieved by HTTP MAY point to a policy reference file through the use of a new response header, the P3P header ([P3P-HEADER]). If a site is using P3P headers, it SHOULD include this on responses for all appropriate request methods, including HEAD and OPTIONS requests.

The P3P header gives one or more comma-separated directives. The syntax follows:

[1]  p3p-header       = `P3P: ` p3p-header-field *(`,` p3p-header-field)
[2]  p3p-header-field = policy-ref-field | compact-policy-field | extension-field
[3]  policy-ref-field = `policyref="` URI-reference `"`
[4]  extension-field  = token [`=` (token | quoted-string) ]
Here, URI-reference is defined as per RFC 2396 [URI], token and quoted-string are defined by [HTTP1.1].

In keeping with the rules for other HTTP headers, the name of the P3P header may be written with any casing. The contents should be specified using the casing precisely as specified in this document.

The policyref directive gives a URI which specifies the location of a policy reference file which may reference the P3P policy covering the document that pointed to the reference file, and possibly others as well. When the policyref attribute is a relative URI, that URI is interpreted relative to the request URI. Note that fetching the URI given in the policyref directive MAY result in a 300-class HTTP return code (redirection); user agents MUST interpret those redirects with normal HTTP semantics. Services should note, of course, that use of redirects will increase the time required for user agents to find and interpret their policies. The policyref URI MUST NOT be used for any other purpose beyond locating and referencing P3P policies.

The compact-policy-field is used to specify "compact policies". This is described in Section 4.

User agents which find unrecognized directives (in the extension-fields) MUST ignore the unrecognized directives. This is to allow easier deployment of future versions of P3P.

Example 2.1:

1. Client makes a GET request.

GET /index.html HTTP/1.1
Host: catalog.example.com
Accept: */*
Accept-Language: de, en
User-Agent: WonderBrowser/5.2 (RT-11)

2. Server returns content and the P3P header pointing to the policy of the resource.

HTTP/1.1 200 OK
P3P: policyref="http://catalog.example.com/P3P/PolicyReferences.xml"
Content-Type: text/html
Content-Length: 7413
Server: CC-Galaxy/1.3.18
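
The header handling described in this section can be sketched in Python as follows. This is an illustrative parser, not a normative one: the names `parse_p3p_header` and `policy_ref_uri` are hypothetical, and the regular expression is a simplification that does not handle backslash-escaped quotes inside a quoted-string.

```python
import re
from urllib.parse import urljoin

# One directive: a token, optionally followed by "=" and either a
# quoted-string (simplified: no escaped quotes) or a bare token.
_DIRECTIVE = re.compile(
    r'\s*([^\s=,"]+)\s*(?:=\s*(?:"([^"]*)"|([^\s,"]+)))?\s*(?:,|$)'
)


def parse_p3p_header(value: str) -> dict:
    """Split the value of a P3P header into a {name: value} mapping."""
    directives = {}
    pos = 0
    while pos < len(value):
        match = _DIRECTIVE.match(value, pos)
        if match is None:
            break  # malformed remainder: stop parsing
        name, quoted, bare = match.groups()
        directives[name] = quoted if quoted is not None else bare
        pos = match.end()
    return directives


def policy_ref_uri(request_uri: str, header_value: str):
    """Return the absolute policy reference file URI, or None."""
    # Directive names are compared with exact casing; unrecognized
    # directives are simply never consulted, i.e. they are ignored.
    ref = parse_p3p_header(header_value).get("policyref")
    if ref is None:
        return None
    # A relative policyref is interpreted relative to the request URI.
    return urljoin(request_uri, ref)
```

Applied to Example 2.1, `policy_ref_uri("http://catalog.example.com/index.html", ...)` with the header value shown in step 2 yields the absolute URI of PolicyReferences.xml.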

2.2.3 The HTML link Tag

Servers MAY serve HTML content with embedded link tags (cf. [HTML]) that indicate the location of the relevant P3P policy reference file. This use of P3P does not require any change in the server behavior.

The link tag encodes the policy reference information that could be expressed using the P3P header. The link tag takes the following form (here, we give just one possible ABNF form for the link tag, and assume that the [HTML] syntax rules apply when such a tag is used in an HTML file):

[5]  p3p-link-tag = `<link rel="P3Pv1" href="` URI `">`
Here, URI is defined as per RFC 2396 [URI].

When the href attribute is a relative URI, that URI is interpreted relative to the request URI.

To illustrate the use of the link tag, consider the policy reference expressed in Example 2.1 using HTTP headers. That example can be equivalently expressed using the link tag with the following piece of HTML:

<link rel="P3Pv1"
    href="http://catalog.example.com/P3P/PolicyReferences.xml">

Finally, note that since the p3p-link-tag is embedded in an HTML document, its character encoding will be the same as that of the HTML document. In contrast to P3P policy and policy reference documents (see section 2.3 and section 3 below), the p3p-link-tag need not be encoded using [UTF-8]. Note also that the link tag is not case sensitive.
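
A user agent might discover such link tags along the following lines. This is a sketch using Python's standard `html.parser`; the names `P3PLinkFinder` and `find_policy_refs` are hypothetical, and comparing the `rel` value case-insensitively is an assumption based on the note that the link tag is not case sensitive.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class P3PLinkFinder(HTMLParser):
    """Collect the href values of <link rel="P3Pv1"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel == "p3pv1" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_policy_refs(html_text: str, request_uri: str) -> list:
    """Return absolute policy reference file URIs named in html_text."""
    finder = P3PLinkFinder()
    finder.feed(html_text)
    # A relative href is interpreted relative to the request URI.
    return [urljoin(request_uri, href) for href in finder.hrefs]
```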

2.2.4 The XHTML link tag

In correspondence with the HTML link tag, P3P also supports XHTML (cf. [XHTML-MOD]). Servers MAY serve XHTML content that, using the XHTML Link Module (cf. Section 5.19 of [XHTML-MOD]), indicates the location of the relevant P3P policy reference file with an embedded XHTML link tag. As in the HTML case, an XHTML link tag can be used to encode the policy reference information that could be expressed using the P3P header, by:

2.2.5 HTTP ports and other protocols

The mechanisms described here MAY be used for HTTP transactions over any underlying protocol. This includes plain-text HTTP over TCP/IP connections or encrypted HTTP over SSL connections, as well as HTTP over any other communications protocol designers wish to implement.

URIs MAY contain network port numbers, as specified in RFC 2396 [URI]. For the purposes of P3P, different ports on a single host MUST be considered to be separate "sites". Thus, for example, the policy reference file at the well-known location for www.example.com on port 80 (http://www.example.com/w3c/p3p.xml) would not give any information about the policies which apply to www.example.com when accessed over SSL (as the SSL communication would take place on a different port, 443 by default).
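
The rule that different ports are separate "sites" can be sketched as a cache key that a user agent might use when storing policy reference files. The name `site_key` and the default-port table are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Default ports assumed for the two schemes discussed in this section.
DEFAULT_PORTS = {"http": 80, "https": 443}


def site_key(uri: str) -> tuple:
    """Return (host, port) identifying the P3P "site" of uri."""
    parts = urlsplit(uri)
    port = parts.port
    if port is None:
        port = DEFAULT_PORTS.get(parts.scheme.lower())
    # urlsplit() already reports the hostname in lower case.
    return (parts.hostname, port)
```

With this key, `http://www.example.com/w3c/p3p.xml` and `https://www.example.com/` fall under different sites, so the policy reference file fetched for one is not applied to the other.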

This document does not specify how P3P policies may be associated with resources retrieved by means other than HTTP. However, it does not preclude future development of mechanisms for associating P3P policies with resources retrieved over other protocols. Furthermore, additional methods of associating P3P policies with resources retrieved using HTTP may be developed in the future.

2.3 Policy Reference File Syntax and Semantics

This section explains the contents of policy reference files in detail.

2.3.1 Example Policy Reference File

Consider the case of a Web site wishing to make the following statements:

  1. P3P policy /P3P/Policies.xml#first applies to the entire site, except resources whose paths begin with /catalog, /cgi-bin, or /servlet.
  2. P3P policy /P3P/Policies.xml#second applies to all resources whose paths begin with /catalog.
  3. P3P policy /P3P/Policies.xml#third applies to all resources whose paths begin with /cgi-bin or /servlet, except for /servlet/unknown.
  4. No statement is made about what P3P policy applies to /servlet/unknown.
  5. These statements are valid for 2 days.

These statements can be represented by the following XML:

Example 2.2:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
  <EXPIRY max-age="172800"/>

    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/catalog/*</EXCLUDE>
      <EXCLUDE>/cgi-bin/*</EXCLUDE>
      <EXCLUDE>/servlet/*</EXCLUDE>
    </POLICY-REF>

    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/catalog/*</INCLUDE>
    </POLICY-REF>

    <POLICY-REF about="/P3P/Policies.xml#third">
      <INCLUDE>/cgi-bin/*</INCLUDE>
      <INCLUDE>/servlet/*</INCLUDE>
      <EXCLUDE>/servlet/unknown</EXCLUDE>
    </POLICY-REF>

 </POLICY-REFERENCES>
</META>

Note this example also includes via EXPIRY a relative expiry time in the document (cf. Section 2.3.2.3.2).

2.3.2 Policy Reference File Definition

This section defines the syntax and semantics of P3P policy reference files. All policy reference files MUST be encoded using [UTF-8]. P3P servers MUST encode their policy reference files using this syntax.

2.3.2.1 Policy reference file processing

2.3.2.1.1 Significance of order

A policy reference file has the META element as root. It may contain multiple POLICY-REF elements. If it does contain more than one element, they MUST be processed by user agents in the order given in the file. When a user agent is attempting to determine what policy applies to a given URI, it MUST use the first POLICY-REF element in the policy reference file which applies to that URI.

Note that each POLICY-REF may contain multiple INCLUDE, EXCLUDE, METHOD, COOKIE-INCLUDE, and COOKIE-EXCLUDE elements and that all of these elements within a given POLICY-REF MUST be considered together to determine whether the POLICY-REF applies to a given URI. Thus, it is not sufficient to find an INCLUDE element that matches a given URI, as EXCLUDE or METHOD elements may serve as modifiers that cause the POLICY-REF not to match.

2.3.2.1.2 Wildcards in policy reference files

Policy reference files make statements about what policy applies to a given URI. Policy reference files support a simple wildcard character to allow making statements about regions of URI-space. The character asterisk ('*') is used to represent a sequence of 0 or more of any character. No other special characters (such as those found in regular expressions) are supported.

Note that since the asterisk is also a legal character in URIs ([URI]), some special conventions have to be followed when encoding such "extended URIs" in a policy reference file:

URI escaping and unescaping is very much dependent on the actual scheme used, and might even differ between individual components within a single scheme, so no simple rule for which characters need to be escaped can be given here. Please refer directly to [URI] for details on the standard escaping process. Note that P3P user agents MAY ignore any URI pattern that does not conform to [URI].

The wildcard character MAY be used in the INCLUDE and EXCLUDE elements, in the COOKIE-INCLUDE and COOKIE-EXCLUDE elements, and in the HINT element.
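
The ordering rule of section 2.3.2.1.1 and the wildcard rule of this section can be sketched together in Python. This is a simplified illustration under stated assumptions: the names `wildcard_match` and `applicable_policy` are hypothetical, only INCLUDE and EXCLUDE are evaluated (METHOD and the cookie elements are not), and no URI escaping or unescaping is performed before comparison.

```python
import re
import xml.etree.ElementTree as ET

P3P_NS = "{http://www.w3.org/2002/01/P3Pv1}"


def wildcard_match(pattern: str, path: str) -> bool:
    """'*' matches zero or more of any character; all else is literal."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, path, flags=re.DOTALL) is not None


def applicable_policy(prf_xml: str, path: str):
    """Return the about value of the first POLICY-REF that applies
    to path, or None if no POLICY-REF applies."""
    root = ET.fromstring(prf_xml)
    refs = root.find(P3P_NS + "POLICY-REFERENCES")
    if refs is None:
        return None
    # findall() yields elements in document order, so the first
    # matching POLICY-REF wins.
    for policy_ref in refs.findall(P3P_NS + "POLICY-REF"):
        includes = [(e.text or "").strip()
                    for e in policy_ref.findall(P3P_NS + "INCLUDE")]
        excludes = [(e.text or "").strip()
                    for e in policy_ref.findall(P3P_NS + "EXCLUDE")]
        included = any(wildcard_match(p, path) for p in includes)
        excluded = any(wildcard_match(p, path) for p in excludes)
        if included and not excluded:
            return policy_ref.get("about")
    return None
```

Run against the file in Example 2.2, this sketch selects the third policy for /cgi-bin/order and returns no policy for /servlet/unknown, matching statements 3 and 4 of that example.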

2.3.2.2 The META and POLICY-REFERENCES elements

<META>
The META element contains a complete policy reference file. Optionally, one POLICIES element can follow. META can also contain one or more EXTENSION elements (cf. section 3.5), as well as an xml:lang attribute (see section 2.4.2), to indicate the language in which its content is expressed.
<POLICY-REFERENCES>
This element MAY contain one or more POLICY-REF (policy reference) elements. It MAY also contain one EXPIRY element (indicating their expiration time), one or more HINT elements, and one or more EXTENSION elements (cf. section 3.5).
[6]  prf        = `<META xmlns="http://www.w3.org/2002/01/P3Pv1"` [xml-lang] `>`
                  *extension
                  policyrefs
                  [policies]
                  *extension
                  "</META>"
[7]  policyrefs = "<POLICY-REFERENCES>"
                  [expiry]
                  *policyref
                  *hint
                  *extension
                  "</POLICY-REFERENCES>"
Here PCDATA is defined in [XML].

2.3.2.3 Policy reference file lifetimes and the EXPIRY element

2.3.2.3.1 Motivation and mechanism

It is desirable for servers to inform user agents about how long they can use the claims made in a policy reference file. Enabling clients to cache the contents of a policy reference file reduces the time required to process the privacy policy associated with a Web resource. This also reduces load on the network. In addition, clients that don't have a valid policy reference file for a URI will need to use "safe zone" practices for their requests. If clients have policy reference files that they know are still valid, then they can make more informed decisions on how to proceed.

In order to achieve these benefits, policy reference files SHOULD contain an EXPIRY element, which indicates the lifetime of the policy reference file. If the policy reference file does not contain an EXPIRY element, then it defaults to a 24-hour lifetime.

The lifetime of a policy reference file tells user agents how long they can rely on the claims made in the policy reference file. By setting the lifetime of a policy reference file, the publishing site agrees that the policies mentioned in the policy reference file are appropriate for the lifetime of the policy reference file. For example, if a policy reference file has a lifetime of 3 days, then a user agent need not reload that file for 3 days, and can assume that the references made in that policy reference file are good for 3 days. All of the policy references made in a single policy reference file will receive the same lifetime. The only way to specify different lifetimes for different policy references is to use separate policy reference files.

The same mechanism used to indicate the lifetime of a policy reference file is also used to indicate the lifetime of a P3P policy. Thus P3P POLICIES elements SHOULD have an EXPIRY element associated with them as well. This lifetime applies to all P3P policies contained within that POLICIES element. If there is no EXPIRY element associated with a P3P policy, then it defaults to a 24-hour lifetime.

When picking a lifetime for policies and policy reference files, sites need to pick a lifetime which balances two competing concerns. One concern is that the lifetime ought to be long enough to allow user agents to receive significant benefits from caching. The other concern is that the site would like to be able to change their policy for new data collection without waiting for an extremely long lifetime to expire. It is expected that lifetimes in the range of 1-7 days would be a reasonable balance between these two competing desires. Sites also need to remember the policy update requirements when updating their policies.

When a policy reference file has expired, the information in the policy reference file MUST NOT be used by a user agent until that user agent has successfully revalidated the policy reference file, or has fetched a new copy of the policy reference file.

Note that while user agents are not obligated to revalidate policy reference files or policy files that have not expired, they MAY choose to revalidate those files before their expiry period has passed in order to reduce the need for using "safe zone" practices. A valid P3P user agent implementation does not need to contain a cache for policies and policy reference files, though the implementation will have better performance if it does.

2.3.2.3.2 The EXPIRY element

The EXPIRY element can be used in a policy reference file and/or in a POLICIES element to state how long the policy reference file (or policies) remains valid. The expiry is given as either an absolute expiry time, or a relative expiry time. An absolute expiry time is a time, given in GMT, until which the policy reference file (or policies) is valid. A relative expiry time gives a number of seconds for which the policy reference file (or policies) is valid. This expiry time is relative to the time the policy reference file (or policies) was requested or last revalidated by the client. This computation MUST be done using the time of the original request or revalidation, and the current time, with both times generated from the client's clock. Revalidation is defined in section 13.3 of [HTTP1.1].

The minimum amount of time for any relative expiry time is 24 hours, or 86400 seconds. Any relative expiration time shorter than 86400 seconds MUST be treated as being equal to 86400 seconds in a client implementation. If a client encounters an absolute expiration time that is in the past, it MUST act as if NO policy reference file (or policy) is available. See section 2.4.7 "Absence of Policy Reference File" for the required procedure in such cases.

[8]   expiry  = "<EXPIRY" (absdate|reldate) "/>"
[9]   absdate = `date="` HTTP-date `"`
[10]  reldate = `max-age="` delta-seconds `"`
Here, HTTP-date is defined in section 3.3.1 of [HTTP1.1], and delta-seconds is defined in section 3.3.2 of [HTTP1.1].

2.3.2.3.3 Requesting Policies and Policy Reference Files

In a real-world network, there may be caches which will cache the contents of policies and policy reference files. This is good for increasing the overall network performance, but may have deleterious effects on the operation of P3P if not used correctly. There are two specific concerns:

  1. When a user agent receives a policy reference file (or policy), if it was served from a caching proxy (see e.g. [CACHING]) the user agent needs to know how long the policy reference file or policy resided in the caching proxy. This time MUST be subtracted from the lifetime of the policy or policy reference file which uses relative expiry.
  2. When a user agent needs to revalidate a policy reference file (or policy), it needs to make sure that the revalidation fetches a current version of the policy reference file (or policy). For example, consider the case where a user agent holds a policy reference file with a 1 day relative expiry. If the user agent refetches it from a caching proxy, and the file has been residing in the caching proxy for 3 days, then the resulting file is useless.

HTTP 1.1 [HTTP1.1] contains powerful cache-control mechanisms to allow clients to place requirements on the operations of network caches; these mechanisms can resolve the problems mentioned above. The specific method will be discussed below.

HTTP 1.0, however, does not provide those more sophisticated cache control mechanisms. An HTTP 1.0 caching proxy will, in all likelihood, compute a cache lifetime for the policy reference file (or policies) based on the file's last-modified date; the resulting cache lifetime could be significantly longer than the lifetime specified by the EXPIRY element. The caching proxy could then serve the policy reference file (or policies) to clients beyond the lifetime in the EXPIRY; the result would be that user agents would receive a useless policy reference file (or policies).

The second problem with an HTTP 1.0 caching proxy is that a user agent has no way to know how long the reference file may have been stored by the caching proxy. If the policy reference file (or policies) relies on relative expiry, it would then be impossible for the user agent to determine if the reference file's lifetime has already expired, or when it will expire.

Thus, if a user agent is requesting a policy reference file or a policy, and does not know for certain that there are no HTTP 1.0 caches in the path to the origin server, then the request MUST force an end-to-end revalidation. This can be done with the Pragma: no-cache HTTP request-header. Note that neither HTTP nor P3P defines a way to determine if there is an HTTP 1.0-compliant cache in any given network path, so unless the user agent has this information derived from an outside source, it MUST force the end-to-end revalidation.

If the user agent has some way to know that all caches in the network path to the origin server are compliant with HTTP 1.1 (or that there are no caches in the network path to the origin server), then the client MAY do the following instead of forcing an end-to-end revalidation:

  1. Use cache-control request-headers to ensure that the received response is not older than its lifetime. This is done with the max-age cache-control setting, with a maximum age significantly less than the lifetime of the policy reference file (or policies). For example, a user agent could send Cache-Control: max-age=43200, thus ensuring that the response is no more than 12 hours old.
  2. Subtract the age of the response from the lifetime of the policy reference file (or policies), if it uses a relative expiry time. The age of the response is given by the Age: HTTP response-header.
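
The two request strategies above can be sketched as follows. The function names are hypothetical, and asking for a response no older than half the lifetime is an arbitrary choice made for illustration; the text only requires a maximum age significantly less than the lifetime.

```python
def request_headers(all_caches_http11: bool, lifetime: int) -> dict:
    """Choose request headers for fetching a policy or policy
    reference file whose lifetime is given in seconds."""
    if not all_caches_http11:
        # An HTTP 1.0 cache may be on the path: force an end-to-end
        # revalidation.
        return {"Pragma": "no-cache"}
    # Otherwise accept a cached response, but only a sufficiently
    # young one (assumption: at most half the lifetime).
    return {"Cache-Control": f"max-age={lifetime // 2}"}


def remaining_lifetime(lifetime: int, age_header) -> int:
    """Subtract the Age response-header value (seconds) from a
    relative lifetime; a missing or non-numeric Age counts as 0."""
    age = 0
    if age_header is not None and age_header.strip().isdigit():
        age = int(age_header)
    return max(lifetime - age, 0)
```

For a file with the default 86400-second lifetime, the HTTP 1.1 branch produces the same `Cache-Control: max-age=43200` header as in the example in item 1.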

Note that it is impossible for a client to accurately predict the amount of latency that may affect an HTTP request. Thus, if the policy reference file covering a request is going to expire soon, clients MAY wish to consider warning their users and/or revalidating the policy reference file before continuing with the request.

2.3.2.3.4 Error handling for policy reference file and policy lifetimes

The following situations have their semantics specifically defined:

  1. An absolute expiry date in the past renders the policy reference file (or policies) useless, as does an invalid or malformed expiry date, whether relative or absolute. In this case, user agents MUST act as if NO policy reference file (or policies) is available. See section 2.4.7 "Absence of Policy Reference File" for the required procedure in such cases.
  2. A relative expiration time shorter than 86400 seconds (1 day) is considered to be equal to 86400 seconds.
  3. When a policy reference file contains more than one EXPIRY element, the first one takes precedence for determining the lifetime of the policy reference file.
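
The lifetime rules of sections 2.3.2.3.1 through 2.3.2.3.4 can be sketched in Python as below. The names `expires_at` and `is_usable` are hypothetical. The caller is assumed to pass the attributes of the first EXPIRY element as a dict (or None when there is no EXPIRY element), together with client-clock timestamps that are timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MIN_LIFETIME = 86400      # relative lifetimes below 1 day are raised to 1 day
DEFAULT_LIFETIME = 86400  # lifetime when no EXPIRY element is present


def expires_at(expiry, fetched_at: datetime):
    """Return the instant at which the file expires, or None if the
    expiry is malformed (the file must then be treated as absent).

    fetched_at is the client-clock time of the original request or
    of the last revalidation."""
    if expiry is None:
        return fetched_at + timedelta(seconds=DEFAULT_LIFETIME)
    if "max-age" in expiry:
        value = expiry["max-age"]
        if not (value.isascii() and value.isdigit()):
            return None  # malformed relative expiry
        return fetched_at + timedelta(seconds=max(int(value), MIN_LIFETIME))
    if "date" in expiry:
        try:
            when = parsedate_to_datetime(expiry["date"])
        except (TypeError, ValueError):
            return None  # malformed absolute expiry
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)  # times are in GMT
        return when
    return None  # EXPIRY element with neither attribute


def is_usable(expiry, fetched_at: datetime, now: datetime) -> bool:
    """True while the policy reference file (or policies) may be used."""
    deadline = expires_at(expiry, fetched_at)
    return deadline is not None and now < deadline
```

An absolute date in the past makes `is_usable` return False immediately, which corresponds to acting as if no policy reference file is available.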

2.3.2.4 The POLICY-REF element

A policy reference file may refer to multiple P3P policies, specifying information about each. The POLICY-REF element describes attributes of a single P3P policy. Elements within the POLICY-REF element give the location of the policy and specify the areas of URI-space (and cookies) that each policy covers.

POLICY-REF
contains information about a single P3P policy.
about (mandatory attribute)
URI reference ([URI]), where the fragment identifier part denotes the name of the policy (given in its name attribute), and the URI part denotes the URI where the policy resides (a policy file, or a policy reference file, see Section 3.2). If this is a relative URI reference, it is interpreted relative to the URI of the policy reference file it resides in.
[11]  policy-ref = `<POLICY-REF about="` URI-reference `">`
                   *include
                   *exclude
                   *cookie-include
                   *cookie-exclude
                   *method-element
                   *extension
                   `</POLICY-REF>`
Here, URI-reference is defined as per RFC 2396 [URI].
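
Resolving the about attribute can be sketched with Python's standard URI helpers. The function name `resolve_about` is hypothetical; the example URIs are taken from Examples 2.1 and 2.2.

```python
from urllib.parse import urldefrag, urljoin


def resolve_about(prf_uri: str, about: str) -> tuple:
    """Return (policy_file_uri, policy_name) for an about value found
    in the policy reference file located at prf_uri."""
    # A relative reference is resolved against the URI of the policy
    # reference file that contains it.
    absolute = urljoin(prf_uri, about)
    # The fragment identifier names the policy within the file.
    policy_file_uri, policy_name = urldefrag(absolute)
    return policy_file_uri, policy_name
```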

2.3.2.5 The INCLUDE and EXCLUDE elements

Each INCLUDE or EXCLUDE element specifies one local URI or set of local URIs. A set of URIs is specified if the wildcard character '*' is used in the URI-pattern. These elements are used to specify the portion of the Web site that is covered by the policy referenced by the enclosing POLICY-REF element.

When INCLUDE (and optionally, EXCLUDE) elements are present in a POLICY-REF element, it means that the policy specified in the about attribute of the POLICY-REF element applies to all the URIs at the requested host corresponding to the local-URI(s) matched by any of the INCLUDEs, but not matched by an EXCLUDE element.

A policy referenced in a policy reference file can be applied only to URIs on the DNS (Domain Name System) host that references it. Thus, for example, a policy reference file at the well-known location of host www.example.com can apply policies only to resources on www.example.com. However, if foo.example.com includes a P3P HTTP header in its responses that references a policy reference file on bar.example.com, that policy reference file would be applied to resources on foo.example.com (not bar.example.com or www.example.com). The same policy reference file might be referenced in P3P HTTP headers sent by multiple hosts, in which case it may be applied to each host that references it. The INCLUDE and EXCLUDE elements MUST specify URI patterns relative to the root of the DNS host to which they are applied. This requirement does NOT apply to the location of the P3P policy file (the about attribute on the POLICY-REF element).

If a METHOD element (section 2.3.2.8) specifies one or more methods for an enclosing policy reference, it follows that all methods not mentioned are consequently not covered by this policy. In the case that this is the only policy reference for a given URI prefix, user agents MUST assume that NO policy is in effect for all methods NOT mentioned in the policy reference file. It is legal but pointless to supply a METHOD element without any INCLUDE or COOKIE-INCLUDE elements.

It is legal, but pointless, to supply an EXCLUDE element without any INCLUDE elements; in that case, the EXCLUDE element MUST be ignored by user agents.

Note that the set of URIs specified with INCLUDE and EXCLUDE does not include cookies that might be set or replayed when requesting one of such URIs: in order to associate policies with cookies, the COOKIE-INCLUDE and COOKIE-EXCLUDE elements are needed.

[12]  include = "<INCLUDE>" relativeURI "</INCLUDE>"
[13]  exclude = "<EXCLUDE>" relativeURI "</EXCLUDE>"
Here, relativeURI is defined as per RFC 2396 [URI], with the addition that the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.

2.3.2.6 The HINT element

Policy reference hints are a performance optimization that can be used under certain conditions. A site may declare a policy reference for itself using the well-known location, the P3P response header, or the HTML/XHTML link tag. It MAY further provide a hint to additional policy references, such as those declared by other sites.

For example, an HTML page might hint at policy references for its hyperlinks, embedded content, and action URIs. User agents MAY use the hint mechanism to discover policy reference files before requesting the affected URIs when the policy references are not available from the well-known location.

User agents which use hints to retrieve policies MUST NOT apply them to any site other than the one which contains the hinted policy reference file.

Any policy reference file MAY contain zero or more policy reference hints. Each hint is contained in a HINT element with two attributes, scope and path.

The scope attribute is used to specify a URI scheme and authority to which the hinted policy reference can be applied. If the authority component (cf. [URI]) is a server component (e.g., a hostname or IP address) the host part of the authority MAY begin with a wildcard, as defined in Section 2.3.2.1.2. The scope attribute MUST NOT contain a wildcard in any other position, MUST be encoded according to the conventions in Section 2.3.2.1.2, and MUST NOT contain a path, query or fragment URI component. Additionally, if the authority is a server, it SHOULD NOT contain a userinfo part.

For example, legal values for scope include:

The following are illegal values for the scope attribute:

The path attribute is used to locate the policy reference file on the hinted site. It is a relative URI whose base is the URI scheme and authority matched in the scope attribute. The path attribute MUST NOT be an absolute URI, so that the policy reference file is always retrieved from the same site that it is applied to.

Example 2.3:

<HINT scope="http://www.example.org" path="/mypolicy/p3.xml" />
<HINT scope="http://www.example.net:81" path="/w3c/prf.xml" />
<HINT scope="http://*.shop.example.com" path="/w3c/prf.xml" />
[14] hint = `<HINT scope="` scheme ( `://` | `:/` ) authority `" path="` relativeURI `"/>`
Here, scheme, authority and relativeURI are defined as per RFC 2396 [URI].
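
As an illustration of how a user agent might turn a hint into the location of a policy reference file, consider the following Python sketch. It is not normative. The function name hinted_prf_url is hypothetical, scheme and authority are compared literally (so default ports are not normalized), and rejecting a path that begins with "//" is a conservative assumption made here so that the file is always fetched from the site it applies to.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

def hinted_prf_url(target_uri: str, scope: str, path: str) -> Optional[str]:
    """Return the hinted policy reference file URL for target_uri, or None."""
    target = urlsplit(target_uri)
    scope_parts = urlsplit(scope)
    if scope_parts.path or scope_parts.query or scope_parts.fragment:
        return None  # scope must not contain a path, query or fragment
    if urlsplit(path).scheme or path.startswith("//"):
        return None  # path must stay on the hinted site (assumption for "//")
    if target.scheme.lower() != scope_parts.scheme.lower():
        return None
    pattern = scope_parts.netloc.lower()
    authority = target.netloc.lower()
    if pattern.startswith("*"):
        # A wildcard is allowed only at the start of the host part.
        if "*" in pattern[1:] or not authority.endswith(pattern[1:]):
            return None
    elif pattern != authority:
        return None
    # The base of the relative path is the scheme and authority that matched.
    return urljoin(f"{target.scheme}://{target.netloc}/", path)
```

With the third hint of Example 2.3, a request for http://a.shop.example.com/cart would resolve to http://a.shop.example.com/w3c/prf.xml, while a request to any host outside *.shop.example.com would yield no hinted file.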

2.3.2.7 The COOKIE-INCLUDE and COOKIE-EXCLUDE elements

The COOKIE-INCLUDE and COOKIE-EXCLUDE elements are used to associate policies to cookies (cf. [COOKIES] and [STATE]).

A cookie policy MUST cover any data (within the scope of P3P) that is stored in that cookie or linked via that cookie. It MUST also reference all purposes associated with data stored in that cookie or enabled by that cookie. Any data or purpose stored or linked via a cookie MUST also be put in the cookie policy. Furthermore, if that linked data is collected by HTTP, then the policy that covers the corresponding request (GET, POST, or any other method) must cover that data collection. For example, when CatalogExample asks customers to fill out a form with their name, billing, and shipping information, the P3P policy that covers the form submittal will disclose that CatalogExample collects this data and explain how it is used. If CatalogExample sets a cookie so that it can recognize its customers and observe their behavior on its Web site, it would have a separate policy for this cookie. However, if this cookie is also linked to the user's name, billing, and shipping information -- perhaps so CatalogExample can generate custom catalog pages based on where the customer lives -- then that data must also be disclosed in the cookie policy.

For the purpose of this specification, state management mechanisms use either SET-COOKIE or SET-COOKIE2 headers, and cookie-namespace is defined as the value of the NAME, VALUE, Domain and Path attributes, specified in [COOKIES] and [STATE].

Each COOKIE-INCLUDE or COOKIE-EXCLUDE element can be used to match (similarly to INCLUDE and EXCLUDE) the NAME, VALUE, Domain and Path components of a cookie, expressing the cookies which are covered by the policy specified by the about attribute when the cookies are set from the resources on the Web site where the policy reference file resides:

COOKIE-INCLUDE (resp. COOKIE-EXCLUDE)
    include (resp. exclude) cookies that match the name, value, domain and path attributes
name
    match the NAME portion of the cookie
value
    match the VALUE portion of the cookie
domain
    match the Domain portion of the cookie
path
    match the Path portion of the cookie

If the value of the domain attribute is set to the dot character ("."), the domain will match only cookies that omit the domain attribute (and thus have a domain equivalent to the request host, as per RFC 2965 [STATE]).

Cookies that omit the path attribute have the default path of the request URI that generated the set-cookie response as per RFC 2965 [STATE]. The path attribute of a COOKIE-INCLUDE should be matched against this default value if a cookie omits the path attribute.

All four attributes are optional. If an attribute is absent, the COOKIE-INCLUDE (resp. COOKIE-EXCLUDE) will match cookies that have that attribute set to any value.

When COOKIE-INCLUDE (and optionally, COOKIE-EXCLUDE) elements are present in a POLICY-REF element, the policy specified in the about attribute of the POLICY-REF element applies to every cookie that is matched by any COOKIE-INCLUDE element and not matched by any COOKIE-EXCLUDE element.

User agents MUST interpret COOKIE-INCLUDE and COOKIE-EXCLUDE elements in a policy reference file to determine the policy that applies to cookies set by or replayed to the host to which the policy reference file applies. While the domain attribute of a COOKIE-INCLUDE may match more broadly (for example, if the domain attribute is omitted it defaults to matching any domain value), user agents MUST limit their application of the policy to domains that could be legally used in a cookie set by the host to which the policy reference file applies. For example, if abc.xyz.example.com declares a policyref with <COOKIE-INCLUDE domain="*.xyz.*ple.com"/>, this would be matched to cookies with domains such as .abc.xyz.example.com and .xyz.example.com, but not .example.com or .xyz.sample.com.

A P3P policy can be associated with a cookie by the host that set that cookie as well as by any or all of the hosts to which it might be replayed. A user agent MAY fetch a cookie policy at the time a cookie is set and apply it later when the cookie is replayed, perhaps to other hosts in the domain. A user agent MAY request a policy reference file from a host before replaying a cookie to that host, and if the policy reference file contains an appropriate COOKIE-INCLUDE, a policy will be applied to that cookie even if the cookie was not set by that host. Any host to which the cookie may be replayed MUST be able to honor all the policies associated with the cookie, regardless of whether that host declares a policy for that cookie. Thus sites that set cookies that may be replayed to multiple hosts within a domain need to coordinate to make sure all the hosts can follow the declared policy. In addition, sites should be cautious with their use of wildcards to make sure that they do not inadvertently apply a policy to cookies to which it should not be applied (including previously set cookies that are still in use and cookies set by other hosts in the domain).

The policy that applies to a cookie applies until the policy expires, even if the associated policy reference file expires prior to policy expiry (but after the cookie was set). If the policy associated with a cookie has expired, then the user agent SHOULD reevaluate the cookie policy before sending the cookie. In addition, user agents MUST use only non-expired policies and policy reference files when evaluating new set-cookie events.
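
The expiry rules in the preceding paragraph can be summarized by two small checks. The Python sketch below is illustrative only; the function names are hypothetical, and treating the exact expiry instant as already expired is an assumption made here.

```python
from datetime import datetime

def usable_for_new_cookie(now: datetime, policy_expiry: datetime,
                          prf_expiry: datetime) -> bool:
    """New set-cookie events need a non-expired policy AND policy reference file."""
    return now < policy_expiry and now < prf_expiry

def should_reevaluate_before_replay(now: datetime, policy_expiry: datetime) -> bool:
    """On replay only the policy's own expiry matters, not that of the reference file."""
    return now >= policy_expiry
```

Note the asymmetry: a cookie set under a valid policy keeps that policy after the policy reference file expires, but the same expired file cannot be used to evaluate a newly set cookie.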

Example 2.4 states that /P3P/Policies.xml#first applies to all cookies.

Example 2.4:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
       <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

Example 2.5 states that /P3P/Policies.xml#first applies to all cookies except those with a cookie name of "obnoxious-cookie", a domain value of ".example.com", and a path value of "/", and that /P3P/Policies.xml#second applies to all cookies with a cookie name of "obnoxious-cookie", a domain value of ".example.com", and a path value of "/".

Example 2.5:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
       <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
       <COOKIE-EXCLUDE name="obnoxious-cookie" value="*" domain=".example.com" path="/"/>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
       <COOKIE-INCLUDE name="obnoxious-cookie" value="*" domain=".example.com" path="/"/>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>
[15] cookie-include =
"<COOKIE-INCLUDE"
   [` name="` token `"`]   ; matches the cookie's NAME
   [` value="` token `"`]  ; matches the cookie's VALUE
   [` domain="` token `"`] ; matches the cookie's Domain
   [` path="` token `"`]   ; matches the cookie's Path
"/>"
[16] cookie-exclude =
"<COOKIE-EXCLUDE"
   [` name="` token `"`]   ; matches the cookie's NAME
   [` value="` token `"`]  ; matches the cookie's VALUE
   [` domain="` token `"`] ; matches the cookie's Domain
   [` path="` token `"`]   ; matches the cookie's Path
"/>"
Here, token, NAME, VALUE, Domain and Path are defined as per RFC 2965 [STATE], with the addition that the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.

Note that [STATE] states default values for the domain and path attributes of cookies: these should be used in the comparison if those attributes are not found in a specific cookie. Also, conforming to [STATE], if an explicitly specified Domain value does not start with a full stop ("."), the user agent MUST prepend a full stop for it; and, note that every Path begins with the "/" character.
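
The cookie matching rules of this section can be sketched as follows. This Python fragment is an illustration under stated assumptions, not a normative algorithm: '*' is assumed to match any run of characters, rules are modelled as dictionaries holding only the attributes that are present, and the names Cookie, rule_matches and policy_applies are hypothetical. The additional restriction to domains that the host could legally use in a cookie is deliberately omitted.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Cookie:
    name: str
    value: str
    domain: Optional[str]  # None when the cookie omits the Domain attribute
    path: Optional[str]    # None when the cookie omits the Path attribute

def wildcard_match(pattern: str, text: str) -> bool:
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None

def rule_matches(rule: Dict[str, str], cookie: Cookie,
                 request_host: str, default_path: str) -> bool:
    # Apply the [STATE] defaults and prepend the leading full stop.
    if cookie.domain is None:
        domain = request_host
    elif cookie.domain.startswith("."):
        domain = cookie.domain
    else:
        domain = "." + cookie.domain
    path = cookie.path if cookie.path is not None else default_path
    actual = {"name": cookie.name, "value": cookie.value,
              "domain": domain, "path": path}
    for attr, pattern in rule.items():  # absent attributes match anything
        if attr == "domain" and pattern == ".":
            if cookie.domain is not None:
                return False  # "." matches only cookies that omit Domain
            continue
        if not wildcard_match(pattern, actual[attr]):
            return False
    return True

def policy_applies(cookie: Cookie, includes: List[Dict[str, str]],
                   excludes: List[Dict[str, str]],
                   request_host: str, default_path: str) -> bool:
    def hit(rule: Dict[str, str]) -> bool:
        return rule_matches(rule, cookie, request_host, default_path)
    return any(hit(r) for r in includes) and not any(hit(r) for r in excludes)
```

Applied to Example 2.5, a cookie named "obnoxious-cookie" with Domain example.com and Path / is excluded from the first policy and covered by the second, while any other cookie falls under the first.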

2.3.2.8 The METHOD element

By default, a policy reference applies to the stated URIs regardless of the method used to access the resource. However, a Web site may wish to define different P3P policies depending on the method to be applied to a resource. For example, a site may wish to collect more data from users when they are performing PUT or DELETE methods than when performing GET methods.

The METHOD element in a policy reference file is used to state that the enclosing policy reference only applies when the specified methods are used to access the referenced resources. The METHOD element may be repeated to indicate multiple applicable methods. If the METHOD element is not present in a POLICY-REF element, then that POLICY-REF element covers the resources indicated regardless of the method used to access them.

So, to state that /P3P/Policies.xml#first applies to all resources whose paths begin with /docs/ for GET and HEAD methods, while /P3P/Policies.xml#second applies for PUT and DELETE methods, the following policy reference would be written:

Example 2.6:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/docs/*</INCLUDE>
      <METHOD>GET</METHOD>
      <METHOD>HEAD</METHOD>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/docs/*</INCLUDE>
      <METHOD>PUT</METHOD>
      <METHOD>DELETE</METHOD>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

Note that HTTP requires the same behavior for GET and HEAD requests, thus it is inappropriate to specify different P3P policies for these methods. The syntax for the METHOD element is:

[17] method-element = `<METHOD>` Method `</METHOD>`
Here, Method is defined in section 5.1.1 of [HTTP1.1].

Finally, note that the METHOD element is designed to be used in conjunction with INCLUDE or COOKIE-INCLUDE elements. A METHOD element by itself will never apply a POLICY-REF to a URI.
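
The effect of the METHOD element on Example 2.6 can be sketched in Python. The sketch is illustrative only: the dictionary layout and the function name policy_for are hypothetical, '*' is assumed to match any run of characters, and the first matching POLICY-REF in document order is assumed to win.

```python
import re
from typing import Optional

# Example 2.6 expressed as plain data (hypothetical in-memory form).
POLICY_REFS = [
    {"about": "/P3P/Policies.xml#first", "include": ["/docs/*"],
     "methods": ["GET", "HEAD"]},
    {"about": "/P3P/Policies.xml#second", "include": ["/docs/*"],
     "methods": ["PUT", "DELETE"]},
]

def wildcard_match(pattern: str, text: str) -> bool:
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None

def policy_for(local_uri: str, method: str, policy_refs=POLICY_REFS) -> Optional[str]:
    """Return the about value of the first POLICY-REF covering the URI and method."""
    for ref in policy_refs:
        if not any(wildcard_match(p, local_uri) for p in ref.get("include", [])):
            continue  # a METHOD element alone never applies a POLICY-REF
        methods = ref.get("methods")
        if not methods or method in methods:  # no METHOD means every method
            return ref["about"]
    return None  # no policy in effect for this URI and method
```

A POST request to /docs/form yields None here, which reflects the rule that methods not mentioned are not covered by the policy.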

2.3.3 Applying a Policy to a URI

A policy reference file specifies the policy which applies to a given URI. In other words, the indicated policy describes all effects of dereferencing the given URI (in some cases, with the appropriately specified METHOD).

There is a general rule which describes what it means for a P3P policy to cover a URI: the referenced policy MUST cover actions that the user's client software is expected to perform as a result of requesting that URI. Obviously, the policy must describe all data collection performed by the site as a result of processing the request for the URI. Thus, if a given URI is covered for GET requests, then the policy given by the policy reference file MUST describe all data collection performed by the site when that URI is dereferenced. Likewise, if a URI is covered for POST requests, then any data collection that occurs as a result of POSTing a form or other content to that URI MUST be described by the policy.

The concept of "actions that the client software is expected to perform" includes the setting of client-side cookies or other state-management mechanisms invoked by the response. If executable code is returned when a URI is requested, then the P3P policy covering that URI MUST cover certain actions which will occur when that code is executed. The covered actions are any actions which could take place without the user explicitly invoking them. If explicit user action causes data to be collected, then the P3P policy covering the URI for that action would disclose that data collection.

Some specific examples:

  1. Fetching a URI returns an HTML page which contains a form, and the form contents are sent to a second URI when the user clicks a "Submit" button. The P3P policy covering the second URI MUST disclose all data collected by the form. The P3P policy covering the first URI (the URI the form was loaded from) MAY, but is not required to, disclose any of the data that will be collected on the form.
  2. An HTML page includes JavaScript code which tracks how long the page is displayed and whether the user moved the mouse over a certain object on the page; when the page is unloaded, the JavaScript code sends that information to the server where the HTML page originated. The activity of the JavaScript code MUST be covered by the P3P policy of the HTML page. The reasoning is that this activity takes place without the user's knowledge or consent, and it occurs automatically as a result of loading the page.
  3. A resource returns an executable for an electronic mail program. In order to use the email program, the user must run an installation program, start the email program, and use its facilities. The P3P policy covering the URI from which the email program was downloaded is not required to make a statement about the data which could be collected by using the email program. Installing and running the email program is clearly outside the Web browsing experience, so it is not covered by this specification. A separate protocol could be designed to allow downloaded applications to present a P3P policy, but this is outside the scope of this specification.
  4. An HTML page containing a form includes a reference to an executable which provides a custom client-side control. The data in the control is submitted to a site when the form is submitted. In this case, the policies covering the URI for the HTML page and the URI for the custom control are not required to make a statement about the data the custom control represents. However, the policy covering the URI to which the form contents are posted MUST cover the data from the custom control, just as it would cover any other data collected by processing the form. This behavior is similar to the way HTML forms are handled when they use only standard HTML controls: the control itself collects no data, and the data is collected when the form is posted. Note that this example assumes that the form is only posted when the user actively presses a "submit" or similar button. If the form were posted automatically (for example, by some JavaScript code in the page), then this example would be similar to example #2, and the data collected by the form MUST be described in the P3P policy which covers the HTML form.
  5. Requests to a URI are redirected to a third party. If the first party embeds previously collected personal data in the query string or other part of the redirect URI, the privacy policy for the first party's URI MUST describe the types of data transmitted and include the third party as a recipient.

2.3.4 Forms and Related Mechanisms

Forms deserve special consideration, as they often link to CGI scripts or other server-side applications in their action URIs (the action URI is the URI given in the action attribute of the HTML <FORM> element, as defined in section 17.3 of [HTML]). It is often the case that those action URIs are covered by a different policy than the form itself.

If a user agent is unable to find a matching include-rule for a given action URI in the policy reference file that was referenced from the page, it SHOULD assume that no policy is in effect. Under these circumstances, user agents SHOULD check the well-known location on the host of the action URI to attempt to find a policy reference file that covers the action URI. If this does not provide a P3P policy to cover the action URI, then a user agent MAY try to retrieve the policy reference file by using the HINT mechanism on the action URI, and/or by issuing a HEAD request to the action URI before actually submitting any data in order to find the policy in effect. Services SHOULD ensure that server-side applications can properly respond to such HEAD requests and return the corresponding policy reference link in the headers. In case the underlying application does not understand the HEAD request and no policy has been predeclared for the action URI in question, user agents MUST assume that no policy is in effect and SHOULD inform the user about this or take the corresponding actions according to the user's preferences.
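
The order of discovery steps described above can be sketched as a simple fallback chain. The Python fragment below is illustrative only: each step is represented by a hypothetical callable that takes the action URI and returns the URI of a covering policy or None, and the HINT and HEAD steps are optional for user agents.

```python
from typing import Callable, Optional, Sequence

# A lookup takes an action URI and returns a covering policy URI, or None.
Lookup = Callable[[str], Optional[str]]

def find_policy(action_uri: str, lookups: Sequence[Lookup]) -> Optional[str]:
    """Try each discovery step in order; None means no policy is in effect."""
    for lookup in lookups:
        policy = lookup(action_uri)
        if policy is not None:
            return policy
    return None
```

A user agent following this section would pass the steps in the order: policy reference file referenced from the page, well-known location on the host of the action URI, HINT mechanism, and finally a HEAD request to the action URI. A result of None corresponds to informing the user or acting on the user's preferences.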

Note that services might want to make use of the <METHOD> element in order to declare policies for server-side applications that only cover a subset of supported methods, e.g., POST or GET. Under such circumstances, it is acceptable that the application in question only supports the methods given in the policy reference file (e.g., PUT requests need not be supported). User agents SHOULD NOT attempt to issue a HEAD request to an action URI if the relevant methods specified in the form's method attribute have been properly predeclared in the page's policy reference file.

In some cases, different data is collected at the same action URI depending on some selection in the form. For example, a search service might offer to both search for people (by name and/or email) and (arbitrary) images. Using a set of radio buttons on the form, a single server-side application located at one and the same action URI handles both cases and collects the required information necessary for the search. If a service wants to predeclare the data collection practices of the server-side application it MAY declare all of the data collection practices in a single policy file (using an <INCLUDE> declaration matching the action URI). In this case, user agents MUST assume that all data elements are collected under every circumstance. This solution offers the convenience of a single policy but might not properly reflect the fact that only parts of the listed data elements are collected at a time. Services SHOULD make sure that a simple HEAD request to the action URI (i.e., without any arguments, especially without the value of the selected radio button) will return a policy that covers all cases.

Note that if a form is handled through use of the GET method, then the action URI reflects the choice of form elements selected by the user. In some cases, it will be possible to make use of the wildcard syntax allowed in policy reference files to specify different policies for different uses of the same form action-handler URI. Therefore, user agents MUST include the query-string portion of URIs when making comparisons with INCLUDE and EXCLUDE elements in policy reference files.
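
To illustrate why the query string matters, the following Python sketch selects a policy for a GET-submitted search form. It is not normative: the INCLUDE patterns, the policy names #people and #images, and the function names are hypothetical, and '*' is assumed to match any run of characters.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical INCLUDE patterns for one form handler with two uses.
RULES = [
    ("/P3P/Policies.xml#people", "/search?type=people*"),
    ("/P3P/Policies.xml#images", "/search?type=images*"),
]

def wildcard_match(pattern: str, text: str) -> bool:
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None

def local_uri(uri: str) -> str:
    """Path plus query string, relative to the root of the host."""
    parts = urlsplit(uri)
    return parts.path + ("?" + parts.query if parts.query else "")

def policy_for(uri: str) -> Optional[str]:
    target = local_uri(uri)  # the query string is part of the comparison
    for about, pattern in RULES:
        if wildcard_match(pattern, target):
            return about
    return None
```

If the comparison used only the path /search, neither pattern would match and the two uses of the form could not be told apart.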

2.4 Additional Requirements

2.4.1 Non-ambiguity

User agents need to be able to determine unambiguously what policy applies to a given URI. Therefore, sites SHOULD avoid declaring more than one non-expired policy for a given URI. In some rare cases, sites MAY declare more than one non-expired policy for a given URI, for example, during a transition period when the site is changing its policy. In those cases, the site will probably not be able to determine reliably which policy any given user has seen, and thus it MUST honor all policies (this is also the case for compact policies, cf. Section 4.1 and Section 4.6). Sites MUST be cautious in their practices when they declare multiple policies for a given URI, and ensure that they can actually honor all policies simultaneously.

If a policy reference file at the well-known location declares a non-expired policy for a given URI, this policy applies, regardless of any conflicting policy reference files referenced through HTTP headers or HTML/XHTML link tags.

If an HTTP response header includes references to more than one policy reference file, P3P user agents MUST ignore all references after the first one.

If an HTML (resp. XHTML) file includes HTML (resp. XHTML) link tag references to more than one policy reference file, P3P user agents MUST ignore all references after the first one.

If a user agent discovers more than one non-expired P3P policy for a given URI (for example because a page has both a P3P header and a link tag that reference different policy reference files, or because P3P headers for two pages on the site reference different policy reference files that declare different policies for the same URI), the user agent MAY assume any (or all) of these policies apply as the site MUST honor all of them.

2.4.2 Multiple Languages

Multiple language versions (translations) of the same policy can be offered by the server using the HTTP "Content-Language" header to properly indicate that a particular language has been used for the policy. This is useful so that human-readable fields such as entity and consequence can be presented in multiple languages. The same mechanism can also be used to offer multiple language versions for data schemas. Servers SHOULD return a localized policy in response to an HTTP request with an HTTP "Accept-Language" header when a policy matching the given language preferences is available.

Whenever Content-Language is used to distinguish policies at the same URI that are offered in multiple languages, the policies MUST have the same meaning in each language. Two policies (or two data schemas) are taken to be identical if

Due to the use of the Accept-Language mechanism, implementers should take note that user agents may see different language versions of a policy or policy reference file despite sending the same Accept-Language request header if a new language version of a policy or data schema has been added.

Finally, language declarations can be also included directly within P3P XML files: the POLICY, POLICIES, META, and DATASCHEMA elements MAY take an xml:lang attribute to indicate the language of any human-readable fields they contain (xml:lang is normatively defined in section 2.12 of [XML]).

[18]
xml-lang
=
` xml:lang="` language `"`
Here, language is a language identifier as defined in [LANG].

2.4.3 The Safe Zone

P3P defines a special set of "safe zone" practices, which SHOULD be used by all P3P-enabled user agents and services for the communications which take place as part of fetching a P3P policy or policy reference file. In particular, requests to the well-known location for policy reference files SHOULD be covered by these "safe zone" practices. Communications covered by the safe zone practices SHOULD have only minimal data collection, and any data that is collected is used only in non-identifiable ways.

To support this safe zone, P3P user agents SHOULD suppress the transmission of data unnecessary for the purpose of finding a site's policy until the policy has been fetched. Therefore safe-zone practices for user agents include the following requirements:

Safe-zone practices for servers include the following requirements:

Note that the safe zone requirements do not say that sites cannot keep identifiable information -- only that they SHOULD NOT use in an identifiable way any information collected while serving a policy file or policy reference file. Tracking down the source of a denial of service attack, for example, would be a legitimate reason to use this information.

2.4.4 Policy and Policy Reference File Processing by User Agents

P3P user agents MUST only render or act upon P3P policies and policy reference files that are well-formed XML.

P3P user agents SHOULD only render or act upon P3P policies and policy reference files that conform to the XML schema given in Appendix 4, and user agents SHOULD NOT rely upon any part of a policy or policy reference file that does not conform to this XML schema.

User agents MUST NOT locally modify a P3P policy or policy reference file in order to make it conform to the XML schema.
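
The well-formedness requirement can be checked before any further processing. The Python sketch below uses the standard library parser and is illustrative only: the function name parse_if_well_formed is hypothetical, and the sketch performs no validation against the XML schema, which would need a separate step.

```python
import xml.etree.ElementTree as ET
from typing import Optional

def parse_if_well_formed(document: str) -> Optional[ET.Element]:
    """Return the root element, or None if the document is not well-formed XML."""
    try:
        return ET.fromstring(document)
    except ET.ParseError:
        return None  # do not act upon, and do not try to repair, the file
```

Returning None rather than attempting a repair reflects the rule that user agents must not locally modify a policy or policy reference file.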

2.4.5 Security of Policy Transport

P3P policies and references to P3P policies SHOULD NOT contain any sensitive information. This means that there are no additional security requirements for transporting a reference to a P3P policy beyond the requirements of the document it is associated with; so, if an HTML document would normally be served over a non-encrypted session, then P3P neither requires nor recommends that the document be served over an encrypted session when a reference to a P3P policy is included with that document.

2.4.6 Policy Updates

Note that when a Web site changes its P3P policy, the old policy applies to data collected when it was in effect. It is the responsibility of the site to keep records of past P3P policies and policy reference files along with the dates when they were in effect, and to apply these policies appropriately.

If a site wishes to apply a new P3P policy to previously collected data, it MUST provide appropriate notice and opportunities for users to accept the new policy that are consistent with applicable laws, industry guidelines, or other privacy-related agreements the site has made.

2.4.7 Absence of Policy Reference File

If no policy reference file is available for a given site, user agents MUST assume (an empty) policy reference file exists at the well-known location with a 24 hour expiry, and therefore if the user returns to the site after 24 hours, the user agent MUST attempt to fetch a policy reference file from the well-known location again. User agents MAY check the well-known location more frequently, or upon a certain event such as the user clicking a browser refresh button. Sites MAY place a policy reference file at the well-