W3C | TAG | Previous: 7 Apr teleconference | Next: 28 April 2003 teleconf

Minutes of 14 Apr 2003 TAG teleconference


1. Administrative

  1. Roll call: SW (Chair), TBL, PC, DO, TB, CL, NW, RF, IJ (Scribe), Martin Duerst (for IRI discussion). Regrets: DC
  2. Accepted 7 Apr teleconference minutes
  3. Accepted this agenda
  4. Next meeting: 28 April. Regrets IJ.

1.1 Meeting planning

1.2 W3C Track Presentation


PC suggestion:
PC: I spoke with Janet about this; she thought it was reasonable to assume that there would be a number of people in this audience who would not be familiar with the TAG. Intro talk touching on some technical details appropriate.
SW: Do we have the material?
PC: I think so.
SW: One presenter or more?
PC: One or two; I volunteer to be one.
Who will attend: PC, CL, IJ, TBL, DC (don't know).
Who will not: SW, TB, RF, DO, NW
PC: I'll start the presentation and we can decide later who will give it.
Action PC: Prepare W3C track presentation.
SW: We should discuss 28 April.
PC: Slides may not be available until slightly after...

1.3 TAG report at AC meeting

  1. Completed action IJ 2003/04/07: Report to mtg organizer TBL constraint on slot for TAG report, then report back to TAG on revised slot (11:30-12:30).
  2. Suggestions for TAG presentation at May AC Meeting from DO and PC


SW: I heard a proposal to state our plans for last call of arch doc. Not sure we've closed that....
DO: Two questions to ask of AC:
  1. Faster arch doc or more complete?
  2. Shape of arch doc
PC: Idea is that we wanted to be sure that we had some questions for the AC. Need to decide who will give presentation.
SW: Slot is 11:30-12:30
Resolved: DO and CL will present TAG report at AC meeting.

1.4 Completed action items

  1. Action IJ 2003/04/07: Send out summary of TAG activity.

2. Technical

  1. URIEquivalence-15
  2. IRIEverywhere-27
  3. xmlIDSemantics-32
  4. abstractComponentRefs-37
  5. namespaceDocument-8


IJ: I am revising get7 finding; hope to have draft for TAG to review in the next day or so.

2.1 URIEquivalence-15

SW: RF has incorporated into RFC2396bis. My proposal is to let that text evolve through IETF process.
TB: SW posted some notes to uri@w3.org in this spirit.
RF: Seems reasonable to proceed thusly.
Resolved: Move URIEquivalence-15 to "Pending"

2.2 IRIEverywhere-27

Martin Duerst joins the call

SW: Where are we on this?
Tim, which document?
TBL: Rules in TB's uri-comp-4 ("6. Good Practice When Generating URIs"). Is the assumption for IRIs that someone will do the mapping from 8 to 7 bit?
CL: Yes.
TBL: If so, there will be an increasing gulf between namespace approach and others.
TBL: Are we prepared to live with this anomaly?
CL: the spec says that if you are using 7-bit transport, you do the mapping, and you do it as late as possible.
TBL: But all the time, because you can do the mapping, they will identify the same thing. At any time you can do the mapping, so anyone who does the mapping is valid.
"generating" vs "processing" ?
MD: TB's text is good practice for those generating URIs.
Chris, you wanted to say that this is the compare vs dereference thing all over again
TB: I'm kind of struggling here... Can we agree on the issues here:
  1. What help can we give the community (of spec writers) on the usage of IRIs? (especially since docs are in motion)
  2. There are some problems with IRIs themselves; TAG can provide useful input to IRI spec process.
TBL: I think there is a concisely statable problem: Our finding draws attention to the way of canonicalizing URIs, thereby indicating that canonicalization is a reasonable thing to do. We say "beware; don't expect people to do it".
I propose that we formally say: "The TAG feels that IRIs are a good thing, and people writing specs with places for identifiers should bear this in mind, but the IRI spec is still in motion and that's a fact of life, so we're not going to write interim text for anyone, sorry."
TBL: But is the TAG licensing canonicalization?
CL: I have moved away from my position due to practical considerations; I think that if we ask people to canonicalize, they won't do it.
TBL: I heard DC saying we should eliminate hex-encoding and just move to UTF-8 strings.
CL: This breaks implementations of XSLT and namespace processors.
TBL: But they don't notice it today since they don't use IRIs today.
CL: That turns out not to be true. One case is uppercase/lowercase hex encoding. You have to tell them that uppercase and lowercase hex have to compare the same in all cases, and currently they don't. The canonicalization algo will thus bite them even if they are not doing IRIs.
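CL's uppercase/lowercase case can be illustrated with a small sketch. It is not taken from any spec or finding; the function name `normalize_escape_case` and the example.org URIs are assumptions made only for illustration.

```python
import re

# Matches one %-escape: a percent sign followed by two hex digits.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def normalize_escape_case(uri: str) -> str:
    """Uppercase the hex digits of every %-escape; leave everything else untouched."""
    return _ESCAPE.sub(lambda m: m.group(0).upper(), uri)

# Hypothetical identifiers that differ only in the case of the hex digits.
a = "http://example.org/%7euser"
b = "http://example.org/%7Euser"

# Plain string comparison treats the two spellings as different.
plain_equal = (a == b)

# Comparison after case-normalizing the escapes treats them as the same.
normalized_equal = (normalize_escape_case(a) == normalize_escape_case(b))
```

A processor that compares strings as-is sees two different identifiers here, which is why the rule would affect software even when no IRIs are involved.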
TBL: With URIs we don't expect people to mix case when using URIs; with IRIs we do expect people to use Kanji characters.
I think there is an unresolved issue inherent in Chris' response: the fact that RFC2396-Sensitive Comparison is licensed by the TAG finding, so we should really move everyone up to using the canonicalization algorithm.
  1. I think we have consensus on the following: "In the future, good to use IRIs (no hex-encoding)"
  2. The IRI spec is not done (That's part of life).
  3. The TAG shouldn't write an interim spec.
  4. The way Schema 1.0 handled this was reasonable.
The way XML 1.0, not XML Schema, handled it.
TBL: I agree with good practice points (canonicalize URIs). But "IRIs are a good thing" is not enough. I think that saying "IRIs are separate from URIs" would break a lot of Web software. My assumption is that we've been licensing the canonicalization.
I propose we close this issue by issuing a one-word statement: "Yes"
TBL: IRIs are also impacted by the URI equivalence comparison issue since UTF-8 is involved. In the case of IRIs, there's a serious need to use both 7 and 8 bit.
Current IRI spec says to use upper case if you have to use escaping.
TBL: The equivalence issue shines a brighter light on the IRI issues.
tim-mit, you wanted to ask whether we are really clear on the last part: is canonicalization allowed in *processing* URIs?
TBL: I want to use TB's document for IRIs as well. If we adopt this document for URIs, seems like we should do so for IRIs as well.
MJDuerst, you wanted to ask tim what in particular these 7-bit fields are
TB: I think we should close 27 with a one-word statement: Yes.
TB: Should we take up another IRI issue, or address that in the I18N forum?
CL: I think that "Yes" is insufficient as an answer to 27; they asked "how"
RF: Until IRI is not a moving target, it should not be recommended by any spec. Recommend CDATA if you want; don't point to specs in development.
tim-mit, you wanted to conclude: When everyone uses URIs, Simple String Comparison works pretty well, but our answers to issues 15 & 27 are incompatible.
TBL: While I agree with RF formally, people are looking for a little bit of vision to the other side of the desert. I think we cannot treat these two issues independently; the answers to issues 15 and 27 are incompatible. In issue 15 we solve the problems by suggesting canonicalization (into a more standard 7-bit notation). In 27, we suggest moving away from a standard 7-bit notation to either (1) a complete 8-bit world or (2) an 8-bit world with a direct mapping to a 7-bit world, but you are not persuaded to move in one direction or the other.
15 = transmitter makes right. 27 = receiver makes right.
MD: TBL said a lot of 7-bit fields might suffer from moving to 8 bit. Which ones? For namespaces and for RDF I don't see any problem (since both occur in 8-bit specs).
TBL: HTTP, SMTP will break.
MD: That's retrieval.
TBL: HTTP headers send around URIs (related to email, news, etc.) quoting the URI spec.
MD: We are explicitly not proposing that HTTP suddenly use 8 bit.
TBL: What happens when someone types in an 8 bit IRI and it has to be sent in 7 bit HTTP?
MD: IRI spec licenses canonicalization.
TBL: Voila.
MD: For retrieval.
TBL: We may not be retrieving! We may just be talking about it in an email message.
MD: My general view is that SMTP is broken in that respect anyway... You can send xml in email (base 64 encoding)
CL: You can do that or use numeric char references in your URI.
XML allows you to use NCRs; what's the issue here?
SW: How do we help move IRI spec along? In TB finding, he talks about codepoint-by-codepoint comparison; this also seems relevant to the 8-bit/7-bit discussion. [It's fine to do codepoint-by-codepoint comparison]
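As a rough illustration of why codepoint-by-codepoint comparison bears on the 8-bit/7-bit discussion, the sketch below compares an identifier in its 8-bit form with a %-escaped 7-bit spelling of the same characters. The helper name `codepoints_equal` and the example.org identifiers are assumptions for illustration only.

```python
def codepoints_equal(a: str, b: str) -> bool:
    """Compare two identifiers codepoint by codepoint, with no normalization."""
    return [ord(ch) for ch in a] == [ord(ch) for ch in b]

# Hypothetical identifier in 8-bit form (contains U+00E9).
iri_form = "http://example.org/caf\u00e9"
# The same identifier with the UTF-8 octets of U+00E9 %-escaped (7-bit form).
uri_form = "http://example.org/caf%C3%A9"

# An identifier always equals itself under codepoint comparison.
same_string = codepoints_equal(iri_form, iri_form)

# The 8-bit and 7-bit spellings are different codepoint sequences,
# so this comparison alone does not relate them.
mixed_forms = codepoints_equal(iri_form, uri_form)
```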
MD: The IRI spec doesn't define canonicalization for the moment.
TBL: But it defines a mapping from IRIs to URIs. That's a form of canonicalization....
CL: We are saying that it's not; we are saying it's a conversion. We are not implying that the URI is in a canonical form.
TBL: But it is a canonicalization algo in the sense that it maps a string to a subset of IRIs (namely URIs).
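The mapping under discussion can be sketched roughly as follows. This is a simplified illustration, not the algorithm of the IRI draft: it only %-escapes the UTF-8 octets of non-ASCII characters and ignores anything the draft may say about host names, Unicode normalization, or other characters. The function name `iri_to_uri` and the example identifier are assumptions.

```python
def iri_to_uri(iri: str) -> str:
    """Simplified IRI-to-URI conversion: %-escape the UTF-8 octets of non-ASCII characters."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters, including existing %-escapes, pass through unchanged.
            out.append(ch)
        else:
            # Encode the character as UTF-8 and escape each octet with uppercase hex.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)

# Hypothetical identifier containing two Kanji characters (U+65E5, U+672C).
iri = "http://example.org/\u65e5\u672c"
uri = iri_to_uri(iri)

# Applying the mapping to its own output changes nothing: the result is
# already in the 7-bit subset, which is the sense of TBL's remark.
idempotent = (iri_to_uri(uri) == uri)
```

Whether such a mapping counts as a conversion (CL) or a canonicalization (TBL) is the point in dispute; the sketch only shows that the output always lands in the 7-bit subset.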
MD: IRI spec says explicitly "You don't do this unless you have to."
I'll be in Japan next week, and traveling the week after, sorry.
TBL: I think that, for this to hold together, we'll need a finding (iri-comp-4...) which extends the URI Equivalence text to talk about IRIs. We need to explicitly answer the question: "If I have an IRI, can I turn it into a 7-bit form if I'm an intermediary?" I think that we'll have to modify our response to 15 when we look at 27.
SW: Should this be going on in TAG or in I18N + URI Activities?
tim-mit, you wanted to say that there is software, APIs, net protocols, and so on which take URIs as a 7-bit value. You can't, and the IRI draft doesn't suggest you just squeeze 8 bits into the 7-bit space.
MJDuerst, you wanted to say that if you want IRI to move forward, this discussion better get somewhere soon.
MD: If this discussion continues the way it has over the past few weeks, the IRI spec will not move forward. I can, of course, just pick a way forward...but I don't want to do that.
RF: This discussion does not have an impact on the content of the IRI spec.
MD: My understanding is that what TBL wants to do is not independent of the IRI spec.
RF: If you were to publish the IRI spec tomorrow, then the issue we are talking about is answered. The only thing that's missing today is a reference for the IRI spec.
TBL: I think we are being asked to look at the bigger picture (usage of IRIs).
CL: Yes, I agree with TBL.
TB: I think that we can tell the community that in specs, in slots where ids can go, do not constrain to just what's blessed by RFC2396. It's ok to have non-ASCII chars in formats and protocols.
Is a processor allowed to canonicalize an IRI using the IRI spec?
TB: Not sure what else to say. Perhaps (1) keep an eye on IRI work; we support it (2) we recommend canonicalization and it applies to URIs and IRIs. Most importantly, canonicalization on generation is best.
TBL: So you should display a Kanji char and immediately convert to 7 bit?
MD: The question is whether TB's good practice for URIs applies literally to IRIs or "appropriately" to IRIs.
Martin would like to NOT canonicalize early (on generation) but to have such a well-defined equivalence that one can delay the canonicalization to any time. This is the opposite of the style of uri-comp-4.
hence the connection between 15 and 27
MD: There are thus two ways to do this, and possibly other ways in between. You can apply section 6 of uri-comp-4 in a literal sense ("Only perform %-escaping where required by RFC2396.") or in an appropriate sense ("Only perform %-escaping where required by IRI spec.").
TB, MD: I'd prefer to do the latter.
What do we think about the way Namespace 1.1 handles this? http://www.w3.org/TR/xml-names11/#IRIs
TBL: We can't get away with string compare for IRIs. We want people to be able to exchange 7-bit and 8-bit forms.
CL: If you want to make something 7-bit, don't use hex conversion; use numeric char references, since we are talking about XML Namespaces here.
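CL's alternative can be sketched as follows: in an XML document, non-ASCII characters are written as numeric character references, and an XML parser restores the original characters. The helper name `to_ncr`, the element and attribute names, and the example identifier are assumptions for illustration; Python's standard `xml.etree.ElementTree` stands in for any conforming XML parser.

```python
import xml.etree.ElementTree as ET

def to_ncr(text: str) -> str:
    """Write every non-ASCII character as a hexadecimal numeric character reference."""
    return "".join(
        ch if ord(ch) < 0x80 else "&#x{:X};".format(ord(ch))
        for ch in text
    )

# Hypothetical identifier containing two Kanji characters (U+65E5, U+672C).
identifier = "http://example.org/\u65e5\u672c"

# 7-bit spelling suitable for an ASCII-only XML document.
seven_bit = to_ncr(identifier)

# Parsing an attribute that carries the 7-bit spelling yields the
# original 8-bit identifier, with no hex (%-) escaping involved.
doc = '<e ref="{}"/>'.format(seven_bit)
round_trip = ET.fromstring(doc).get("ref")
```

This only works where an XML parser sits between the 7-bit text and the consumer, which is why TBL goes on to raise other contexts such as email headers.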
TBL: I'm also imagining other contexts (e.g., email headers)
SW: How to move forward?
TB: MD's comments after last week's meeting were helpful (even if I disagreed with some of them).
MD: I think I'll take Roy's advice and just do something...
TB: I think we can say: (1) virtue of early canonicalization; (2) IRIs good, keep an eye on them; (3) support 8-bit identifiers, even if IRI spec not done. Canonicalization: don't use "/../"; avoid other crap, independent of Kanji or %-escaping.
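The "/../" point can be illustrated with a sketch of dot-segment removal on a path. It follows the step-by-step procedure that the RFC2396bis work later published as RFC 3986, section 5.2.4; the function name `remove_dot_segments` is taken from that section, and the sample paths are illustrative only. A generator that applies this before publishing an identifier never emits "/../" in the first place.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a path, one prefix at a time."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../" becomes "/"
            if output:
                output.pop()           # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)

cleaned = remove_dot_segments("/a/b/c/./../../g")
```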
Action TB: Draft a proposed step forward on IRI 27.
CL: My action may be moot depending on TB's results: "Action CL 2003/04/07: Revised position statement on use of IRIs. CL says to expect this by 21 April."

2.3 xmlIDSemantics-32

* See Chris Lilley draft finding
good bye everybody. thanks for having me here.
Proposed: Send "How should the problem of identifying ID semantics in XML languages be addressed in the absence of a DTD?" to www-tag for their consideration?
CL: This is mostly discussion; no conclusion. Almost all of the material is on the email thread.
Resolved: Make "Chris Lilley draft finding" public.
Abstain: TBL
Needs more discussion. No conclusion yet. This document is a summary made available by the TAG; it is hoped that it lists all the solutions that have been proposed, and their advantages and disadvantages, which should help any group chartered to deal with this problem. The TAG is unhappy with the current situation and would like to see further discussion and convergence on a solution.
Action SW, NW: Read Client handling of MIME headers. If +2, then IJ will send to www-tag.
Resolved: That approach for deciding to make public finding is fine.

2.4 abstractComponentRefs-37

DO: 11 different options in latest summary. I didn't get much comment on tag@w3.org; got more feedback on www-tag.
PC: Please either post to www-tag or give us 7 days.
DO: Lurking here is an issue about URIs/URNs....

2.5 namespaceDocument-8

PC: In TB's 16 theses, he includes "don't use URNs"; I'm not sure that the TAG has taken a position on that point.
SW: I think we said it was important that namespace doc be both human- and machine-readable.
TB: Jonathan is working on the RDDL part.
PC: There's a US Govt namespace recommendation that proposes to use URNs instead of URLs. {Draft policy}

2.7 Architecture document

See also: findings.

  1. 26 Mar 2003 Working Draft of Arch Doc:
    1. Action DC 2003/02/06: Attempt a redrafting of 1st para under 2.2.4 of 6 Feb 2003 draft
    2. Action DC 2003/01/27: write two pages on correct and incorrect application of REST to an actual web page design
    3. Action DO 2003/01/27: Please send writings regarding Web services to tag@w3.org. DO grants DC license to cut and paste and put into DC writing.
    4. Action DC 2003/03/17: Write some text for interactions chapter of arch doc related to message passing, a dual of shared state.

2.8 Issues that have associated action items

3. Other actions

Ian Jacobs for Stuart Williams and TimBL
Last modified: $Date: 2003/05/07 12:02:03 $