Rigo Wenning <rigo at w3.org>
Copyright 2003 W3C (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use rules apply.
This document specifies how relationships between different hosts can be expressed by extending the syntax and semantics of the Policy Reference File.
This is an editors' draft with no standing. We propose that this document be folded into the P3P 1.1 specification at the places indicated by the section numbers here.
As part of the P3P 1.1 effort, this document describes an extension to P3P 1.0 to allow user agents to recognize when hosts in different domains are owned by the same entity or entities acting as agents for one another. These modifications would allow user agents to more intelligently apply privacy preferences, addressing implementation issues that have plagued many P3P deployments.1
Consider the web sites example.com and forinstance.com, which are owned by the same company and share some web site content, hosted on example.com. Some of that content includes “internal” banner ads that are generated by a servlet on example.com and promote certain features on example.com. Both sites are owned by the same company and wish to deploy a single P3P policy that describes their single corporate policy on data collection and usage.
A simple policy reference file is deployed in the well-known location (/w3c/p3p.xml) on example.com:
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <POLICY-REF about="/p3p/policy.xml#corporate">
      <INCLUDE>/*</INCLUDE>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>
forinstance.com is configured to return the HTTP header
P3P: policyref="http://www.example.com/w3c/p3p.xml"
When a web browser visits a page on forinstance.com, the user agent applies the policy at the URI http://www.example.com/p3p/policy.xml#corporate. If an image on that page is served by example.com, the same policy would be applied to that image request, after looking up the policy reference file at the well-known location.
Since both the page request and the image request are covered by the same privacy policy, the user agent should be able to understand that any data collected (via cookies or otherwise) is being collected and used by the same entity, and therefore may decide that no extra privacy restrictions should be applied for the image request to the different domain.
If the hosts were flipped, and example.com was serving the page and forinstance.com the image, the same logic should hold with the exact same P3P deployment. This possibility brings up an interesting point. Third-party sites should not be able to “spoof” being covered by the first party privacy policy if they are not. If, in our example, forinstance.com was not part of the same company, and had different data collection and usage policies, this mechanism could allow it to claim to be covered by the example.com policy when in fact it is not. Without an extension to P3P 1.0, the user agent would not be capable of verifying the relationship and the burden would fall on example.com to be sure that other sites referencing its privacy policies were doing so legitimately.
The same principle would apply if forinstance.com, instead of being owned by the same entity as example.com, were an agent of example.com. Agent is defined by the P3P 1.0 specification as “a third party that processes data only on behalf of the service provider for the completion of the stated purposes.”2 Note that the <ours> element of P3P does not distinguish between “ourselves” and “entities acting as our agents.”
The KNOWN-HOST element allows sites to declare hosts that are allowed to refer to a policy or policies in a policy reference file. A user agent may use this extension element to determine that an "ours" relationship exists between two sites.
The name attribute is a host name qualifier that can be either a full individual host/domain name (e.g. www.example.com) or a wildcard qualifier describing a set of hosts/domains.
known-hosts-extension = `<EXTENSION optional="yes">`
                        *[known-host]
                        `</EXTENSION>`
known-host            = `<KNOWN-HOST`
                        [`name="` authority `"`]
                        `/>`
Here, authority is defined as per RFC 2396 [URI], with the addition that the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.
User agents should consider two hosts (A and B) to have an "ours" relationship if A refers to a policy reference file on B, and that policy reference file contains a matching KNOWN-HOST entry for A for the applicable policy.
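The matching rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a normative algorithm: the function names `host_matches` and `has_ours_relationship` are hypothetical, the '*' character is assumed to match any run of characters (including none), and host names are assumed to compare case-insensitively.

```python
import re
from typing import Iterable


def host_matches(pattern: str, host: str) -> bool:
    """Return True if `host` matches a KNOWN-HOST name pattern.

    Assumption: '*' matches any run of characters, including none,
    and the comparison is case-insensitive.
    """
    # Escape every literal piece, then join the pieces with '.*'
    # so that only '*' acts as a wildcard.
    pieces = [re.escape(piece) for piece in pattern.lower().split("*")]
    return re.fullmatch(".*".join(pieces), host.lower()) is not None


def has_ours_relationship(referring_host: str, known_hosts: Iterable[str]) -> bool:
    """Check whether the host that referred to the policy reference file
    (host A) is listed in the KNOWN-HOST entries that apply to the policy."""
    return any(host_matches(pattern, referring_host) for pattern in known_hosts)
```

Under this literal reading, the pattern `*.forinstance.com` matches `www.forinstance.com` but not the bare host `forinstance.com`, so a site that wants to cover both would presumably declare both names.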
In this example, forinstance.com and example.com are owned by the same company, and forinstance.com has referenced example.com's policy reference file. The example.com file would therefore have a KNOWN-HOST declaration for hosts in the forinstance.com domain. All such hosts would be allowed to reference the single policy declared in this file, which applies to all URIs and all cookies.
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <POLICY-REF about="/p3p/policy.xml#corporate">
      <INCLUDE>/*</INCLUDE>
      <COOKIE-INCLUDE name="*" value="*"/>
      <EXTENSION>
        <KNOWN-HOST name="*.forinstance.com" />
      </EXTENSION>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>
Any number of KNOWN-HOST elements can be declared inside a POLICY-REF element or inside the POLICY-REFERENCES element. Known host declarations at the POLICY-REFERENCES level are considered to apply to all policies in the file, excluding those that have specific declarations at the POLICY-REF level.
This example.com policy reference file shows several different policies and two KNOWN-HOST declarations. The first declaration is for hosts in the forinstance.com domain and applies to the first two policy references ("#corporate" and "#surveys"), which do not have specific known host declarations. The third policy ("#dataProcessingAgent") covers a particular cookie set by an agent site that example.com has contracted with to provide data collection and processing services. This policy reference has its own known host declaration, for hosts in the myagent.com domain. Since the forinstance.com domain is not redeclared as a known host for this policy, user agents cannot verify a relationship between example.com and forinstance.com hosts as it applies to this policy.
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <EXTENSION>
      <KNOWN-HOST name="*.forinstance.com" />
    </EXTENSION>
    <POLICY-REF about="/p3p/policy.xml#corporate">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/surveys/*</EXCLUDE>
      <COOKIE-INCLUDE name="*" value="*"/>
      <COOKIE-EXCLUDE name="Survey*" value="*"/>
      <COOKIE-EXCLUDE name="AgentCookie" value="*" domain=".myagent.com" path="/"/>
    </POLICY-REF>
    <POLICY-REF about="/p3p/policy.xml#surveys">
      <INCLUDE>/surveys/*</INCLUDE>
      <COOKIE-INCLUDE name="Survey*" value="*"/>
    </POLICY-REF>
    <POLICY-REF about="/p3p/policy.xml#dataProcessingAgent">
      <COOKIE-INCLUDE name="AgentCookie" value="*" domain=".myagent.com" path="/"/>
      <EXTENSION>
        <KNOWN-HOST name="*.myagent.com" />
      </EXTENSION>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>
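The scoping rule (file-level declarations apply to every policy except those that carry their own declarations) could be resolved roughly as follows. This is a sketch under assumptions, not part of the proposal: the helper names are hypothetical, and KNOWN-HOST is assumed to live in the P3P 1.0 namespace because the examples in this document declare it without a prefix.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Assumption: KNOWN-HOST inherits the default P3P 1.0 namespace,
# as in the unprefixed examples of this document.
P3P_NS = "{http://www.w3.org/2002/01/P3Pv1}"


def _declared_hosts(element: ET.Element) -> List[str]:
    """Collect KNOWN-HOST names from EXTENSION children directly under `element`."""
    hosts = []
    for extension in element.findall(P3P_NS + "EXTENSION"):
        for known_host in extension.findall(P3P_NS + "KNOWN-HOST"):
            name = known_host.get("name")
            if name:
                hosts.append(name)
    return hosts


def effective_known_hosts(prf_xml: str) -> Dict[str, List[str]]:
    """Map each POLICY-REF 'about' value to the known hosts that apply to it."""
    root = ET.fromstring(prf_xml)
    references = root.find(P3P_NS + "POLICY-REFERENCES")
    if references is None:
        return {}
    file_level = _declared_hosts(references)
    result = {}
    for policy_ref in references.findall(P3P_NS + "POLICY-REF"):
        own = _declared_hosts(policy_ref)
        # Declarations on the POLICY-REF replace the file-level ones
        # rather than adding to them.
        result[policy_ref.get("about")] = own if own else file_level
    return result
```

Applied to a file like the one above, "#corporate" and "#surveys" would resolve to the forinstance.com pattern, while "#dataProcessingAgent" would resolve only to the myagent.com pattern.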
Browsers may cache the policy reference file based on an EXPIRY element in the policy reference file. The expiration information associated with that element should also be considered to apply to the known-hosts declarations; i.e. known host information may be cached along with the policy reference information.
Since the proposed mechanism relies on modifications to the policy reference file, user agents that rely solely on compact policies cannot verify these domain relationships.
The KNOWN-HOST extension relies on the use of the "P3P: policyref" HTTP header for one site to refer to a policy reference file on another site. Since policy reference files cannot include full URIs in the POLICY-REF INCLUDE elements, sites that rely on placing their policy reference file in the well-known location have no way of referencing policies hosted on other sites.
If a user agent allows a cookie to be set based on a relationship established by known host declarations, it should verify at cookie playback time that the relationship still exists, and should not send the cookie if it does not. This verification requires re-fetching the policy reference file and re-evaluating its known host declarations only if the cached policy reference file has expired.
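One way a user agent might combine caching with the playback check is sketched below. Everything here is illustrative and assumed rather than specified: the class and function names are hypothetical, `fetch` stands in for whatever code retrieves and parses a policy reference file, `expires_at` is assumed to be derived from the file's EXPIRY element, and `matches` is any host-pattern predicate.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CachedReferenceFile:
    known_hosts: List[str]  # KNOWN-HOST patterns that apply to the policy
    expires_at: float       # absolute expiry time in seconds (assumed to come from EXPIRY)


class KnownHostCache:
    """Caches known host declarations and re-fetches only after expiry."""

    def __init__(self,
                 fetch: Callable[[str], CachedReferenceFile],
                 now: Callable[[], float] = time.time) -> None:
        self._fetch = fetch
        self._now = now
        self._entries: Dict[str, CachedReferenceFile] = {}

    def known_hosts(self, prf_url: str) -> List[str]:
        entry = self._entries.get(prf_url)
        if entry is None or entry.expires_at <= self._now():
            # Nothing cached, or the cached file has expired: fetch again.
            entry = self._fetch(prf_url)
            self._entries[prf_url] = entry
        return entry.known_hosts


def may_replay_cookie(cache: KnownHostCache,
                      prf_url: str,
                      related_host: str,
                      matches: Callable[[str, str], bool]) -> bool:
    """Return True only if the known host relationship still holds at playback time."""
    return any(matches(pattern, related_host)
               for pattern in cache.known_hosts(prf_url))
```

Injecting the clock and the fetch function keeps the sketch testable without network access; a real user agent would plug in its own HTTP and parsing code.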
1 “Agents and P3P,” position paper presented to the W3C Workshop on the Future of P3P, http://www.w3.org/2002/p3p-ws/pp/coremetrics.pdf
2 P3P 1.0, section 3.3.5: The <RECIPIENT> element