Comments on arch doc draft from Tim Bray on 2002-06-27 (www-tag@w3.org from June 2002)

From: Tim Bray <tbray@textuality.com>
Date: Thu, 27 Jun 2002 14:36:15 -0700
To: www-tag@w3.org
Message-ID: <3D1B854F.7050301@textuality.com>
Numbered to see if this helps improve productivity of debate.

TB1. Document title

I suggest "Architectural Principles for the World Wide Web".

TB2. Document's intended usage

I suggest that the primary usage of this document will be in reference 
mode, so that arguments can take the form of "As it says in section 
2.3.1 of the Web Arch, you have to use GET".   I think the document 
should be designed to support this usage even if it is thereby made less 
useful as a tutorial or rhetorical resource.

TB3. Minimalism

I think this document should contain *nothing* but succinct statements 
of architectural principles supported wherever possible by examples, and 
the minimum necessary text to make sure each principle is correctly 
understood.  I think that the bulk of text to motivate, evangelize, and 
provide context for the principles should go elsewhere (for example, 
maybe our findings, see below).

TB4. Prune Abstract

Remove all but the first sentence of the abstract, and reword it 
slightly to read "This document is designed to serve as a reference to 
the architectural principles which underlie the World Wide Web."

TB5. Lose '@@' section in 2nd para of Status of this Document

TB6. Define terms

I think there is scope for defining terms for use in this document and 
for external reference.  Obvious examples are "agent", "client", and 
"server".  The xmlspec vocabulary provides all the machinery necessary 
to do this.

TB7. Clients/Servers/Agents

The introduction talks about clients, servers, and agents.  It would be 
simpler if we could write all our architectural principles purely in 
terms of "agents" and have them apply to any software using the Web.  So 
I recommend "agent" as a defined term.  It may be the case that there 
are architectural principles that apply only to "servers" or "clients" 
in which case we'll need to define those terms.  Right here in the 
introduction seems like a good place to do it.  If we do manage to get 
away with just using "agent" then a sentence here explaining that we're 
doing this and thus "agent" comprises both "client" and "server" would 
be useful.

TB8. Results of having an architecture

Current doc says "that result in the large-scale effect of a shared 
information space."  I recommend "that result in the system working 
correctly and exhibiting good performance characteristics."

TB9. Is "Ruby" really a format?

Is it a format in the same way the others are?  I think its inclusion 
doesn't help clarity here.

TB10. Is "DOM" really a format?

This was raised before and Chris explained why it's sensible to include 
DOM in this list but it's raised questions, most recently from Stuart, 
so I say lose it.

TB11. Lose last paragraph of introduction

The one beginning "The rules are kept to a minimum..."

TB12. Lose first sentence of chapter 1

The one beginning "Identification is..."

TB13. Introduce "resource" where?

I think "resource" should be a defined term in this document, where is 
it introduced?  Start of chapter 1 is as good a place as any.

TB14. Number principles?

Obviously it's a good idea to address principles, and people will want 
to use them for reference.  Does numbering them introduce fragility?  In 
XML 1, well-formedness and validity constraints are given short names, 
not numbers, which are subsequently used as hyperlink anchors.  So for 
example principle 1 could have the ID "Use URIs".

TB15. Principle #1

Could this be stronger?  Along the lines of "Anything which has a URI is 
by definition a Resource and thus part of the Web.  Anything which 
doesn't is not.  Everything important, including units of information 
and service as well as widely-shared abstract notions, SHOULD be a 
resource."  I'd love to find a way to use MUST in here.

TB16. URI & reference

In point of fact since a URI reference optionally is relative and 
optionally has a fragment identifier, there are naturally four classes 
of things here.  I think we should name them, define them, and make it 
handy for other people to use them.   Strawman language:

"A URI reference contains a URI, which may be relative or absolute, and 
optionally has a trailing #-delimited fragment identifier.  Thus there 
are four classes of identifier, all of which are URI References:

1. URI - not relative, no fragment.  This is what is sent from an agent 
to another in the dereferencing process.
2. Fragment-free URI Reference - relative allowed, no fragment.  As an 
example, XML 1.0 requires SYSTEM identifiers to be of this class.
3. Absolute URI Reference - relative disallowed, fragment allowed.  In 
practice, almost all XML namespace names are of this class.
4. Unrestricted URI Reference

W3C Recommendations MUST be clear as to which class of identifiers they 
support."
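
As a rough illustration of the four classes in the strawman above, here is a minimal Python sketch.  The function name and the returned labels are hypothetical, chosen only for this example.  It assumes that "relative" means "has no scheme" and that any "#" in the string introduces a fragment, and it reports the narrowest class a reference falls into.

```python
from urllib.parse import urlsplit


def classify_uri_reference(ref: str) -> str:
    """Return the narrowest of the four strawman classes for `ref`.

    Assumptions (not from any spec text): a reference is "relative" when it
    has no scheme, and it "has a fragment" when it contains a '#'.
    """
    is_relative = urlsplit(ref).scheme == ""
    has_fragment = "#" in ref

    if not is_relative and not has_fragment:
        return "URI"
    if is_relative and not has_fragment:
        return "Fragment-free URI Reference"
    if not is_relative and has_fragment:
        return "Absolute URI Reference"
    return "Unrestricted URI Reference"


if __name__ == "__main__":
    for ref in ("http://example.org/doc", "../doc.xml",
                "http://example.org/doc#sec1", "doc.xml#sec1"):
        print(ref, "->", classify_uri_reference(ref))
```

Note that the classes nest: anything in class 1 is also acceptable wherever class 2, 3, or 4 is allowed, which is why the sketch reports only the most restrictive match.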

TB17. URI Schemes, lead-in

The section needs an introduction explaining that the URI scheme 
is expressed using a :-delimited prefix.

TB18. Defined term for "http:" URIs

We use these a lot and talk about them a lot, so let's bless some 
generic term.  Why not just HTTP URI, or possibly HURI or hURI?

TB19. URI Schemes ordered list

This needs serious revision, and I'm not at all sure that an ordered 
list is the right way to go.  It conflates specific remarks about hURIs 
with discussions of temporal relations and social governance.  Perhaps 
1.2 could have some brief subsections?

TB20. Fielding vs. existing language on HTTP URIs

I like Roy's version a lot better, and it sounds more or less exactly 
like what I've heard TimBL intone any number of times.  I suggest 
replacing existing language with Roy's.

TB21. Equivalence classes

The set-theoretic term "equivalence class" is used in a grammatically 
jarring way here.  I think the same effect could be achieved with less 
strain by saying that a URI scheme defines comparison and equivalence 
rules for URIs in that scheme.
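
To make "a URI scheme defines comparison and equivalence rules" concrete, here is a minimal Python sketch.  The names SCHEME_RULES and equivalent are hypothetical, and the http rule shown (case-insensitive host, default port 80, empty path treated as "/") is a simplified assumption for illustration, not a complete statement of the HTTP comparison rules.  Schemes with no registered rule fall back to exact string comparison.

```python
from urllib.parse import urlsplit


def _normalize_http(uri: str) -> tuple:
    """Simplified, assumed http rule: host is case-insensitive, port
    defaults to 80, and an empty path is the same as "/"."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    port = parts.port or 80
    path = parts.path or "/"
    return ("http", host, port, path, parts.query)


# Hypothetical registry: each scheme supplies its own normalization rule.
SCHEME_RULES = {"http": _normalize_http}


def equivalent(a: str, b: str) -> bool:
    """Compare two URIs using the rule defined by their shared scheme."""
    scheme_a = urlsplit(a).scheme.lower()
    scheme_b = urlsplit(b).scheme.lower()
    if scheme_a != scheme_b:
        return False
    rule = SCHEME_RULES.get(scheme_a)
    if rule is None:
        # No scheme-specific rule known: only identical strings match.
        return a == b
    return rule(a) == rule(b)


if __name__ == "__main__":
    print(equivalent("http://example.org/", "HTTP://Example.ORG:80"))
    print(equivalent("http://example.org/a", "http://example.org/A"))
```

The point of the design is that the comparison logic lives with the scheme, so generic software never has to guess which differences are significant.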

TB22. Cool URLs

I think the title is inappropriate in a reference work.  The 
architectural principle can be stated succinctly in one sentence - 
supporting it with a reference to TimBL's "Cool URLs" paper is entirely 
appropriate.  Also it says "Persistence is usually a matter of policy" - 
I disagree, it's *always* a matter of policy.

TB23. URNs

A subtext of the current section 1.2.1 is that you don't really need 
URNs (as I read it anyhow).  At the moment the only technical 
characteristic that meaningfully distinguishes URNs from other URIs is 
that you typically can't dereference URNs.  Do we want to say anything 
about URNs?

I note that there's a later section which discusses this and (see issue 
below) should be unified with this section.

BTW, the example of how W3C uses policy to stabilize URI-space design is 
good.

TB24. The economics of names

Recommend discarding entire section except for last paragraph. 
Recommend inserting a principle saying "You SHOULD avoid centralized 
registries, but you MAY rely on the continued existence and utility of 
the DNS."  Recommend renaming the section "Centralized Registries".

TB25. Dereferencing

First para in sect 1.3 says dereference mechanism "should be" defined by 
each scheme... isn't that "must", for every scheme in which dereferencing 
is a goal?

TB26. Principle 2

"Agents SHOULD be able to..." What does this mean?  Are we really saying 
"Use hURIs rather than some other URI scheme because dereferencing is 
defined and widely available?"

TB27. Role of findings

I think the practice in the current draft of pointing to findings is 
fine, but I think that the pointers should be explanatory rather than 
normative.  I.e. the architectural principles asserted in findings 
should be copied into this document.  Thus the finding goes on living to 
provide the tutorial, explanatory, and contextual material we don't want 
cluttering up the Arch doc.

Thus also, findings should take care to have a short punchy bottom-line 
assertion of the architectural principle in question, suitable for 
copying into the arch doc.  Norm's recent formatting-properties finding 
is an excellent example.

TB28. Normative/Non-normative refs

I think that normative references should be at a minimum, and should 
ideally be limited to non-W3C material (RFCs, Unicode, etc).  The 
reference function of this document should be complete and 
self-contained to the extent possible.

TB29. URIs, URLs, etc.

This should go into the "URI scheme" section above, right?

TB30. Protocols

Please copy in Orchard's text with what copy-editing is appropriate. 
We've had it for some time, we should use it.

TB31. Protocols, first para

I think the "teems with activity" prose could go without loss of 
content, in fact the whole paragraph.  Let's write down what the 
architectural principles are and then figure out the minimum 
introductory prose to make them comprehensible.

TB32. Remove Chapter 4

Until we have existence proof that there's something to go in it.  I'm 
beginning to think that it could be unified with the "Formats" chapter 
and the document would work better.  After all, the introduction says 
the Web arch comprises addressing, protocols, and formats.  Why not have 
one chapter for each?  Seems pleasingly symmetrical.
Received on Thursday, 27 June 2002 17:36:18 UTC