My comments on the current last call draft from louis.theran@nokia.com on 2000-03-31 (www-p3p-public-comments@w3.org from March 2000)

From: <louis.theran@nokia.com>
Date: Fri, 31 Mar 2000 16:46:25 -0600
To: www-p3p-public-comments@w3.org
Cc: hubbard@w3.org
Message-ID: <B9CFA6CE8FFDD211A1FB0008C7894E4601078805@bseis01nok>
[ These comments address http://www.w3.org/TR/2000/WD-P3P-20000211 ]
 

1. General comments

The last call draft of the P3P specification is an ambitious undertaking.
It describes a fairly sophisticated model for policy specification, a series
of HTTP protocol extensions and a policy transmission and definition syntax.
Unfortunately, there are a number of problems with both the mechanisms
described and the specification itself.
 
Protocol extensions in particular are very difficult to deploy and really
need to be the result of extensive profiling.  There is a large amount of
text devoted to the use of the non-standard HTTP-EXT mechanism and policy
reference caching without any concrete statements of the potential effects.
I'm concerned that this kind of a priori postulation of performance impact
is both out of scope for P3P and ill-advised from an engineering standpoint.
 
There are also a number of serious editorial issues, including normative
references to transient artifacts and incorrect or unnecessary use of ABNF.
Also, a number of sections are written in an overly informal and vague
style.  This presents a major obstacle to outsiders (including myself) who
want to evaluate P3P.
 
I'm including a number of specific comments, questions and suggestions
below.

2. Specific comments

Example of P3P in use:
 
The example session between `Sheila' and thecoolcatalog.com should be
rewritten as a functional description of servers and clients conducting a
transaction involving a policy.  A state diagram would also be useful, and
thecoolcatalog.invalid would be a more correct domain for the example
server.
 
Use of ABNF:
 
The liberal use of ABNF in the P3P specification isn't really necessary and
only adds complexity.  XML in particular has a well defined syntax that
allows for higher level descriptions (i.e., DTDs or schemas) to be used.
This is much less error prone and easier to read.  The description of the
use of ABNF in section 1.2 is particularly confusing.  Are the HTTP header
rules intended to be processed as XML?  Is the intent that ABNF's ordering
rules be ignored all the time, or only in rules that look like XML
attributes?  Since there are already better notations for describing XML
structures, why not just use one?
 
Even where ABNF is required, there are a number of minor errors.  For
example, the first use of ABNF uses an undefined nonterminal `prefix'.
RFC 2774 calls this `header-prefix', which is defined as `2*DIGIT', that
is, two or more digits.  This means that the English description of
HTTP-EXT prefixes as two digit numbers is misleading.
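
To make the `2*DIGIT' point concrete, here is a minimal sketch in Python.
It assumes only the usual reading of the repetition operator (at least two
occurrences, no upper bound); the function name is mine and is not taken
from RFC 2774 or from the P3P draft.

```python
import re

# "2*DIGIT": at least two DIGITs, with no upper bound.
# [0-9] is used instead of \d so that only ASCII digits match.
HEADER_PREFIX = re.compile(r"[0-9]{2,}")


def is_header_prefix(value: str) -> bool:
    """Return True if value matches header-prefix = 2*DIGIT."""
    return HEADER_PREFIX.fullmatch(value) is not None


for candidate in ["7", "07", "42", "123"]:
    print(candidate, is_header_prefix(candidate))
```

A three digit prefix such as 123 is accepted by the grammar, although the
English text describes prefixes as two digit numbers.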
 
Finally, it is noteworthy that HTTP-EXT uses the RFC 822 syntax, not the RFC
2234 syntax.  P3P doesn't really use the 2234 syntax either, seeing as '|'
is used as the alternation operator instead of '/'.
 
Use of HTTP-EXT:
 
Use of an internet draft as a normative reference is in direct violation of
RFC 2026 (section 2.2).  While the W3C isn't necessarily bound by the IETF
process documents, I would call attention to the following excerpt from RFC
2026:

   An Internet-Draft that is published as an RFC, or that has remained
   unchanged in the Internet-Drafts directory for more than six months
   without being recommended by the IESG for publication as an RFC, is
   simply removed from the Internet-Drafts directory.

Are W3C Recommendations really allowed to use transient artifacts as
normative references?  As of this writing, HTTP-EXT is available as an
experimental RFC (http://www.isi.edu/in-notes/rfc2774.txt), so at the very
least the reference should be updated.  Note that RFCs with the
Experimental designation are not considered to be standards.
 
This specific use of HTTP-EXT:
 
Since ',' is a legitimate URI character, it can't be used as a list
separator unless there is a requirement that all other occurrences be
escaped.  I'm also concerned about the use of the (as far as I can tell)
undefined nonterminal `local-URI'.
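
The separator problem can be sketched as follows.  The URIs and the header
value are hypothetical and are not taken from the draft; the escaping
helper is just one possible way to state the requirement mentioned above.

```python
# Two hypothetical policy reference URIs; the first has a comma in its path.
uris = [
    "http://example.invalid/policies/a,b.xml",
    "http://example.invalid/p3p/policy.xml",
]

# Joining with "," and splitting naively yields three items, not two.
header_value = ",".join(uris)
naive = header_value.split(",")
print(len(uris), len(naive))


def escape_commas(uri: str) -> str:
    """Percent-encode every comma so that "," only appears as a separator."""
    return uri.replace(",", "%2C")


# With the escaping requirement in place the list survives a round trip.
escaped_value = ",".join(escape_commas(u) for u in uris)
recovered = escaped_value.split(",")
print(recovered)
```

Without such a requirement, a receiver cannot tell a separator from a comma
that is part of a URI.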
 
The <meta> tag syntax:
 
As described, the <meta> tag syntax is in violation of HTML's content model.
Why not just use the kludge that is already in HTML, namely
<meta http-equiv="..." ...>?  Also, the names of the tags should be changed
to lower case, so that they conform to XHTML 1.0.
 
Indirect references:
 
This section is far too long.  Service replication and localization are
system administration issues and should be left to site administrators.  A
simple paragraph requiring that a P3P user agent resolve all 3xx status
codes is adequate.
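
As a rough sketch of what resolving 3xx responses amounts to on the user
agent side, consider the following.  The fetch callable, the hop limit and
the URLs are all assumptions made for illustration; nothing here is defined
by the draft.

```python
from urllib.parse import urljoin


def resolve(url, fetch, max_hops=5):
    """Follow 3xx responses until a non-redirect response is reached.

    fetch(url) is a hypothetical callable returning (status, headers).
    """
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise RuntimeError("redirect loop at " + url)
        seen.add(url)
        status, headers = fetch(url)
        if not (300 <= status < 400 and "Location" in headers):
            return url, status
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, headers["Location"])
    raise RuntimeError("too many redirects")


# Canned responses standing in for a real server.
responses = {
    "http://example.invalid/policy": (302, {"Location": "/en/policy.xml"}),
    "http://example.invalid/en/policy.xml": (200, {}),
}
final_url, final_status = resolve(
    "http://example.invalid/policy", responses.__getitem__
)
print(final_url, final_status)
```

How the site maps the first URL to the second is invisible to the client,
which is why it can be left to the administrators.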
 
Reference caching:
 
The justification for reference caching isn't explained very well.  If I
understand section 2.4.1.2 correctly, there is assumed to be an external
service that handles policy negotiation on behalf of multiple users.  Given
the current web infrastructure, such a service would either be implemented
as an intermediate proxy or as an extension to the user agent.  In the
former case, the service would have to fetch the content anyway; the latter
doesn't make a lot of sense to me, since an extended user agent would most
likely just implement P3P.
 
Unless there is a specific, common configuration that benefits from
reference caching, it should simply be removed.  If reference caching is to
remain, there should at least be some mention of the fact that a policy
reference may live longer than the associated document.
 
The safe zone:
 
Taken literally, the safe zone would be an ideal location for a malicious
client to launch an anonymous denial of service attack against a system.  A
simpler solution to the problem the safe zone attempts to address would be
to require that user agents use a strictly limited subset of HTTP headers
(Host is the only really essential header for most requests) when requesting
a policy document.
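
A minimal sketch of such a restricted request, assuming an HTTP/1.1 GET
that carries nothing but Host.  The function name and the policy URL are
hypothetical.

```python
from urllib.parse import urlsplit


def minimal_policy_request(policy_url: str) -> bytes:
    """Build a GET request for a policy document that sends only Host."""
    parts = urlsplit(policy_url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",
        "",  # blank line terminating the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = minimal_policy_request("http://example.invalid/p3p/policy.xml")
print(request.decode("ascii"))
```

No cookies, referrer or user agent string leave the client, so the server
has nothing beyond the connection itself to log about the user.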
 
Also, it is unclear why HEAD is recommended.  The purpose of making a
request to a site is to retrieve data; since the location of the policy
document will be returned with the initial response, using HEAD only adds a
round trip and the programs that generate dynamic content often don't
implement HEAD.
 
Syntactic issues:
 
The XML syntax is not very consistent.  <PURPOSE>, <RECIPIENT> and
<RETENTION> use empty tags extensively while <DATA> uses attributes.  I
consider the former option to be superior, as illustrated by the example in
section 3.4.1.  Also, the reuse of <DATA> in the data schema specification
is extremely confusing.
 
Also, the skeleton data schema matches neither the ABNF description nor the
complete example.
 
Extensibility issues:
 
P3P data schemas have less expressive power and are harder to extend than
either RDF or XML Schemas.  Is there any justification for not using one of
those?
 
Data types:
 
The descriptions of the primitive and base data types should proceed in
either a top-down or bottom-up direction.  The current ordering (primitive,
base data schema, basic types) is confusing.
 
 

 
-- 
Louis Theran
Nokia Research Center/Boston
Received on Friday, 31 March 2000 17:48:50 UTC