Minutes of 31 Mar 2003 TAG teleconference
1. Administrative
  - Roll call: All present. SW (Chair), TBL, TB, DC, DO, PC, RF, CL, NW, IJ
    (Scribe)
- Accepted 24 Feb telecon minutes
- Accepted this agenda
- Next meeting: 7 April. Regrets: SW. NW to Chair.
    Action IJ: Arrange for call to NW 
1.1 Meeting planning
1.2 Mailing list management
  - Completed action IJ: Announce creation of public-tag-announce to chairs
    and tech plenary participants (and to AC).
2. Technical
  - Architecture document
- IRIEverywhere-27 and URIEquivalence-15
See also: findings.
  - 26 Mar 2003
    Working Draft of Arch Doc: 
    
      - Action DC 2003/02/06: Attempt a redrafting of 1st para under 2.2.4
        of 6 Feb 2003 draft.
        DC: No progress. 
- Action DC 2003/01/27: write two pages on correct and incorrect
        application of REST to an actual web page design
        DC: Progress two weeks ago (found co-author). 
- Action DO 2003/01/27: Please send writings regarding Web services
        to tag@w3.org. DO grants DC license to cut and paste and put into DC
        writing.
        DO: I am in meetings now on this topic. No indication of when this
        will be done. 
- Action CL 2003/01/27: Draft language for arch doc that takes
        language from internet media type registration, propose for arch doc,
        include sentiment of TB's second sentence from CP10.
        CL: This is about charset headers, and not generating them when
        they might be wrong.  I hope to have something by next week. 
- Action DC 2003/03/17: Write some text for interactions chapter of
        arch doc related to message passing, a dual of shared state.
        DC: I think I have some progress on latest one. This is my scribbling
        on issue 8 and nearby. This is scribbling on message passing as a
        dual of shared state.
 
  - [Chris]
- So I don't have to chase it up again:
- CP10. Agents which receive a resource representation accompanied by
      an Internet Media Type MUST interpret the representation according to
      the semantics of that Media Type and other header information. Servers
      which generate representations MUST not generate Media Types and other
      header information (for example charsets) unless there is certainty
      that the headers are correct.
- [Chris]
- so it's basically, *disagree* with the media type registration ...
Discussion of how to move forward on Architecture Document
  - [Ian]
- TB: Usual progress not happening here (incorporation of draft text
      after issue resolution). Although lots of people are queued up to write
      pieces, that's not happening. IJ could write an end-to-end draft and we
      could work from that.
- DC: That one doesn't appeal to me.  I'd be happy to say TBL "go".
- DO: But one reason we are gathered around this table is lack of TBL
      time.
- DC: I observe that TBL does the writing and that people don't react
      to it.
- CL: I agree with TB on how docs progress: someone just writes stuff.
      Someone puts a stake in the ground and then you have something to push
      against.
- DC: That happens when the editor is really the author.
- TBL: The way the Process Doc works is that IJ puts it in his head.
      Not the same with the arch doc (for IJ). What are the reasons for the
      TAG document?
- IJ: Time is no longer the primary issue. Primary issue is that I am
      not getting the sense of agreement that would allow me to run with
      small pieces and create text. One way to advance - small groups of
      editors to come up with more concrete TOCs for various sections.
- CL: I think real text is important; people need something to complain
      about.
- DC: To me the bottleneck is not writing up the decisions; it's making
      them.
- [Chris]
- would quite like the paired-off 'buddy' system
- [Roy]
- Section 5 should be just "Interaction" -- including UI to server
      actions
- [Zakim]
- DanC, you wanted to suggest that the arch doc actually does reflect
      the consensus of the group; there just isn't that much of it
- [Ian]
- IJ: I disagree slightly on section on protocols; we've not had many
      discussions. Hard to say we don't agree.
- TB: On issues where we can hammer out consensus, writing them down
      will produce a meaty and useful arch doc. I think there are problems,
      but we have enough areas of consensus to publish something.  E.g., look
      at our findings. On "5. Machine-to-machine interaction" I think we
      have consensus on device-independence, lessons of REST, but that
      material is not really there yet.
- TBL: Where is Web Services supposed to be?
- IJ: I think section 5.
- TBL: There's a fundamental difference between Web Services-like
      things and ordinary Web things.
- [DanC]
- hm... perhaps, TBray; but it's not clear to me where Ian should go to
      get the material
- [Ian]
- DO: I got the impression that TBL thought about Web Services
      interactions as different from Web interactions...
- TBL: Some Web Service interactions are very Web-like. But the
      architecture of Web Services is fundamentally messages that are part of
      larger protocols.  They are not part of transferring the state of
      global information space.  Some Web Services are retrieval operations,
      and some are not.
- DO: Roy asked a question a few months ago - are we writing the arch
      of the current or future Web?
- DO: I thought the answer at the time was more about "what has been".
      I'd love it if we worked on a more encompassing arch doc, but I wasn't
      aware that that was our mandate.
- TBL: I agree that the arch doc should encompass what we are doing
      within W3C.
- TB: The list under 4.5 has lots of useful consensus material that
      needs writing up. 
- RF: It would be easier for me if we talked about the future Web
      within this arch doc. It's easier for me to lay out the sections in my
      mind. E.g., I consider everything that TBL described to be under
      "Interactions"
- SW: How do we relate to WSA WG?
- CL to TBL: How is what you are saying about Web Services different
      from Semantic Web interactions?
- TBL: Whether you send in a msg or a Web service, the Sem Web is for
      talking about real things in a logical way. You can send messages or
      create documents. The distinctions are orthogonal.
- [Chris]
- not finished yet btw
- [Ian]
- CL: So, e.g., the Sem Web is that "this nut is compatible with this
      bolt if they have the same thread size; as defined over here at this
      URI"
- [timbl]
- Chris how is SW different from WS?
- [Ian]
- CL: Whereas Web Services is "I want to buy 10k nuts if they fit on
      this bolt over here."
- TBL: When you say "I want to buy them", that's part of the Web
      Services architecture.
- CL: But that's the real world -- it costs you money.  I'm not seeing
      a real difference here; this is still about machine-to-machine
      communication.
- [Chris]
- so, sw and ws seem like very closely related things, in summary
- [Ian]
- DO: I would like to talk about diff between Sem Web and Web Services
      in the arch doc. My personal opinion is that the focus of Web Services
      is "how do you send messages back and forth; and how do you describe
      the messages in the exchange."  I see the Sem Web as how you make
      assertions about things and then make queries (in a reduced world) with
      constraints.  Yes you can put queries in messages...these two worlds
      intersect.
- [Zakim]
- TBray, you wanted to note that the only real industrial SW tech
      applications I've seen have been in WS apps
- [Ian]
- TB: To me it's obvious that the overlap is very substantial. I've
      seen people doing Web Services deployment realize that they need
      vocabularies.  I think that they are inextricably in bed with one
      another.
- TBL: I think that you find that when you use one you often use the
      other. But conceptually, they are very different parts of the
      architecture. You often use TCP and HTTP together. But they provide
      different services. Sem Web moves information space from bits to
      statements about real-world things. Combining Web Services and Sem Web
      is powerful. But there will be apps that use one without the other.  We
      are not writing an article for Business Week; we are writing up
      architecture principles. So, e.g., model real-world things in your app,
      not documents.
- [Zakim]
- timbl, you wanted to say that the overlap in application area is very
      large, but the overlap in concept (which befits the arch doc) is
    smaller
- [Ian]
- TBL: I can demonstrate cases where you use one without the other.
- [TBray]
- I don't think Web Services can really get to first base without
      shared semantics a la SW. And all the SW golden-future stories I read
      have observable effects that sound a lot like WS.
- [Ian]
- DO: I agree we are writing an arch doc. The WSA WG is also writing an
      architecture document. In the 14 Nov 2002 draft
       of the WSA Arch Doc (3 Basic and Extended
      Architecture),  I proposed some text about Web Services using
      language of Roy's thesis.
- [timbl]
- Inextricably intermingled? test cases: Library catalog = sem web
      without WS; document validation sWS = WS without SW maybe? Ordering
      machine parts could be a SWS case.
- [Ian]
- DO: Maybe we can document differences and similarities.
- [DanC]
- I think the library folks would disagree with the "without WS",
      timbl. They're webizing Z39.50 using SOAP all over the place.
- [Ian]
- DO: We need to decide whether we're documenting one architecture or
    N.
- DC: It is?
- DO: I think the understanding we've had so far is that we are writing
      the Web architecture w.r.t. REST, excluding things like Web Services
      and Semantic Web.
- [DanC]
- (I consider our issues list as the authoritative list of decisions we
      owe)
- [timbl]
- SW without WS - foaf?
- [DanC]
- (and perhaps a decision that says 'this document is consistent with
      our decisions on all these issues')
- [Ian]
- TB: I think the arch doc will exhibit progress when one or more
      people take ownership of one or more sections and write them.  I don't
      think we can divorce content from TOC. I am convinced we have consensus
      around enough points that we can produce a useful document.  I'm not
      arguing that the current TOC is wrong; I'm arguing that we need text
      from start to finish, and we need to roll up sleeves and write it.
  - [Ian]
- CL: Text of this IJ/MD/CL discussion
      needs to change ($Date: 2003/04/01 16:32:57 $)
- [DanC]
- to me, what chris just said is the heart of the issue.
- [Ian]
- CL: No longer true: " 1. RFC2396 should be modified so that hex
      digits (HEXDIG) are case-insensitive. "
- DC: Please include the Kanji example in this document.
- CL: TB's how to compare distinguishes case where you dereference and
      where you do not. I am proposing that the cases where you don't
      dereference are treated by simple Unicode string matching. However the
      next level up works on all schemes and you are dereferencing;
      upper and lower case hex should be the same; same with Kanji example;
      I think that this should be scheme-independent.
- RF: Why do IRIs affect RFC2396 bis?
- CL: Would be weird if Kanji lowercase hex isn't the same as Kanji
      uppercase hex. Does not depend on URI scheme.
- RF: Need to convert from IRIs to URIs for comparison, though.
- CL: Aha! No need to do this comparison any more if you are not
      dereferencing the IRIs.
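
CL's point about upper and lower case hex can be illustrated with a small normalization step. The sketch below is only an illustration and is not taken from any of the specs under discussion: the function name `normalize_escape_case` and the example.org identifiers are assumptions introduced here. It uppercases the hex digits of each %xx escape, so two spellings that differ only in escape case compare equal after normalization, while a plain string comparison still treats them as different.

```python
import re

# Matches one percent escape such as %e6 or %E6.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_escape_case(uri: str) -> str:
    """Uppercase the hex digits of every %xx escape; leave the rest alone."""
    return _ESCAPE.sub(lambda m: m.group(0).upper(), uri)


# Hypothetical identifiers: the same escaped Kanji, differing only in hex case.
lower = "http://example.org/%e6%97%a5%e6%9c%ac"
upper = "http://example.org/%E6%97%A5%E6%9C%AC"

plain_match = lower == upper
normalized_match = normalize_escape_case(lower) == normalize_escape_case(upper)
```

Here `plain_match` is false and `normalized_match` is true, which is the gap between simple string matching and the "next level up" that CL describes.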
- [timbl]
- (RF said they compared the same because you convert to URI before
      comparison)
- [Ian]
- RF: If you are going to use an IRI as a namespace id, you're going to
      want to convert to URI first, otherwise, you'll have differences in the
      use of that identifier. You create a lot of unnecessary duplicates if
      you don't convert first.
- CL: Encoding has nothing to do with this.  These are Unicode
      character codes. Assume it's in IRI form. Why would you convert to
    URI?
- [DanC]
- "most common is URIs"... not in XML implementations
- [Ian]
- RF: Most common use of identifiers is as URIs; you'll have IRIs
      sometimes and URIs sometimes. If you don't convert to URIs first, you
      won't have uniformity.
- CL: When you are not dereferencing, case matters. The available
      evidence from people who do namespace stuff is that they do string
      comparison. It's pushing too much water uphill for namespace case.
- RF: So you are relying on "If you want uniformity, use things
      uniformly."
- CL: I can't get my preferred position, so my new position is closer
      to what the world will accept.
- RF: There are situations where IRIs are not as useful due to lack of
      deployment.
- [timbl]
- w+ to say that we were asked to fix a problem; if we documented the
      current situation, we would document a mess. We could suggest an
      alternative. Eg. We could recommend that string comparison be limited
      to URIs only, and people not use IRIs until they can do IRI-URI
      mapping.
- [Ian]
- SW: Larry, I think, is advising us to be patient and wait until
      there's an RFC.
- [Chris]
- http://www.w3.org/International/iri-edit/draft-duerst-iri.html
- [Zakim]
- DanC, you wanted to say why it maybe should depend on the scheme and
      to observe that the genie is out of the bottle here
- [Ian]
- DC: There's text in XML namespace spec that says you can't have two
      attribs with the same name after they are namespace qualified.
- [Chris]
- good test case there
- [Ian]
- DC: You can tell (for Kanji case) whether your namespace
      implementation was happy. When I coded this up, I converted IRI to URI
      before comparison. It worked; that's a coherent position. I looked up
      in the world and I saw that there's an enormous tide of people doing it
      the other way: All XPath, XML Query, XML Schema implementations.
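
DC's test case (two attributes that must not share a name once namespace-qualified) can be tried against one real parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace names, the element, and the helper names `parse_with_namespaces` and `collides` are assumptions made up for illustration. It shows only how this one implementation behaves and says nothing about what a spec should require.

```python
import xml.etree.ElementTree as ET

KANJI = "http://example.org/\u65e5\u672c"          # hypothetical namespace name
HEXED = "http://example.org/%E6%97%A5%E6%9C%AC"    # its hexified spelling


def parse_with_namespaces(ns_a: str, ns_b: str) -> ET.Element:
    """Parse one element carrying a:id and b:id with the given bindings."""
    doc = f'<e xmlns:a="{ns_a}" xmlns:b="{ns_b}" a:id="1" b:id="2"/>'
    return ET.fromstring(doc)


def collides(ns_a: str, ns_b: str) -> bool:
    """True if the parser rejects the element as having a duplicate attribute."""
    try:
        parse_with_namespaces(ns_a, ns_b)
    except ET.ParseError:
        return True
    return False


same_spelling_collides = collides(KANJI, KANJI)
mixed_spelling_collides = collides(KANJI, HEXED)
```

With this parser, identical spellings collide and the Kanji and hexified spellings do not, which matches the character-by-character comparison DC reports seeing in deployed implementations.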
- [Chris]
- I agree it's a coherent position. it was my first choice. But it's not
      the majority position
- [Ian]
- DC: HTML implementations use heuristics to maximize @@missed@@
- [Chris]
- everyone does unicode character by character comparison
- [Ian]
- DC: The way you get shared understanding of a name is to repeat
      verbatim in lots of contexts. The more mangling we have, the higher the
      risk of confusion.
- [Chris]
- mixing up the hexified and non-hexified forms is a problem
- they do not collide
- the kanji form is the correct canonical form, written correctly
- [Ian]
- DC: I think that the right answer is: (1) In the specific case of
      namespace names - no they do not collide unless they match unicode-char
      by unicode-char (2) this is rationalized in the real world since a
      person wanted to write their name the way that person usually does.
- [Zakim]
- timbl, you wanted to agree with Roy and re explain it if chris hasn't
      got it and to say that while the URI spec licences the conversion
      between 8bit and hexified forms, then they
- ... will be equivalent. and to say that we were asked to fix a
      problem; if we documented the current situation, we would document a
      mess. We could suggest an alternative. Eg. We
- ... could recommend that string comparison be limited to URIs only,
      and people not use IRIs until they can do IRI-URI mapping.
- [Ian]
- TBL: If we documented the current situation, we would document a
    mess.
- [timbl]
- possible solutions -- insist on hexifying for comparison, or forbid
      hexifying by clients. (iri proposal suggest the former; danc proposal
      leads to the latter)
- [Chris]
- we *could* insist on hexifying always but this would not work
- [Ian]
- TBL: String comparison is fine BUT, you're not allowed to use IRIs
      until you've got your IRI-to-URI conversion working.
      
        - Plan A: URIs rule, and IRIs are one way of talking about
          URIs. I would argue that this is not pushing too much water
          uphill.
        - Plan B: Hexifying does not work. %xx are just characters. Client
          can't change. When you send across medium, these are Unicode
          strings - each protocol has to say how to handle them.
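
The two plans, with the labels as TBL gives them in the list just above, can be written as two equivalence predicates. This is a minimal sketch under stated assumptions: the names `hexify`, `equal_plan_a` and `equal_plan_b` and the example.org identifiers are invented here, and `hexify` only percent-encodes non-ASCII characters as UTF-8 octets, which is a simplification of any real IRI-to-URI mapping.

```python
def hexify(iri: str) -> str:
    """Percent-encode each non-ASCII character as its UTF-8 octets."""
    return "".join(
        ch if ord(ch) < 128
        else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        for ch in iri
    )


def equal_plan_a(x: str, y: str) -> bool:
    """Plan A: URIs rule, so map both identifiers to URI form, then compare."""
    return hexify(x) == hexify(y)


def equal_plan_b(x: str, y: str) -> bool:
    """Plan B: %xx are just characters, so compare the strings as written."""
    return x == y


# Hypothetical identifiers: one name in Kanji form and in hexified form.
KANJI = "http://example.org/\u65e5\u672c"
HEXED = "http://example.org/%E6%97%A5%E6%9C%AC"

plan_a_result = equal_plan_a(KANJI, HEXED)
plan_b_result = equal_plan_b(KANJI, HEXED)
```

Under these assumptions the Kanji and hexified spellings are equivalent under Plan A and distinct under Plan B.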
 
- [Chris]
- hexifying is a last-ditch, late conversion to get an IRI through a non
      8 bit clean protocol
- which is basically what I understand to be the IRI position
- [Ian]
- TBL: Plan B says that namespace can do char-by-char Unicode and
      everyone else has to deal with it.
- RF: Does this mean IRI comparison should convert URI to IRI
    first?
- CL: No. All URIs are already IRIs.
- DC: There is a plan C in RF's question - the two spaces are
      isomorphic. You peek inside hex encoding.
- CL: It's better to say - once you've done hexing you've got a
      different thing. It's still an IRI, but doesn't use other chars than
      ASCII
- [DanC]
- yes, plan C conflicts with the "if you mean the same thing, say it
      the same way" principle
- [Ian]
- RF: An IRI that uses chars outside ASCII char set will be less
      "interoperable" than one that contains only ASCII chars.
- [timbl]
- The isomorphism creates some equivalences which break plan C
- [Ian]
- DC: This is like a health warning: don't use "l" and "1" to avoid
      confusion.
- RF: Problems include email, RFCs munging text, ...  I anticipate
      times when IRI will be used as namespace and people will want to use it
      in its URI state.
- CL: For the data entry problem, you can already type things in in
      ASCII. That aspect is already dealt with for XML.
- [Zakim]
- TBray, you wanted to agree with TimBL
- [Ian]
- TB: I think that DC's experience with pushing water uphill is
      correct. If you are staying in the URI universe, then the approach
      taken so far (strcmp) is important - people should use the same name
      for the same thing. In a URI world, that's ok. (and bots can be more
      aggressive) For IRIs, I think we have to push some water uphill.
- [Chris]
- because bots do dereferencing type things
- timbray asks if 'all uris are iris' are cast in stone
- believe so, yes
- [Ian]
- TB on whether a URI is already an IRI: I think that the rules
      about hexification are non-deterministic.
- [Chris]
- hexification always when needed, never when not needed
- [Ian]
- TB: If we could require hexification always when needed and
      never when not needed, then we'd have more leverage to attack the IRI
      problem.
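
One way to read "never when not needed" is as a check for escapes that encode a character which could have been written literally. The sketch below assumes that "not needed" means "escapes an RFC 2396 unreserved character" (letters, digits, and the marks `-_.!~*'()`); the minutes do not define the term, so that reading, the function name `unnecessary_escapes`, and the sample URI are all assumptions.

```python
import re
import string

# RFC 2396 unreserved characters: alphanumerics plus the "mark" set.
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")

# Captures the two hex digits of one percent escape.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def unnecessary_escapes(uri: str) -> list:
    """Return the %xx escapes in `uri` that encode an unreserved character."""
    return [
        match.group(0)
        for match in _ESCAPE.finditer(uri)
        if chr(int(match.group(1), 16)) in UNRESERVED
    ]


# Hypothetical URI: %7E is "~" and %41 is "A" (both unreserved); %20 is a space.
found = unnecessary_escapes("http://example.org/%7Euser/%41%20b")
```

A rule like the one TB describes would reject identifiers for which this list is non-empty, leaving one spelling per name for string comparison to work on.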
- [Zakim]
- timbl, you wanted to propose that there is lots more water to be
      pushed uphill in plan A
- [Ian]
- TBL: There is lots more water to be pushed uphill in plan A.  I don't
      see Plan C working.
- [Chris]
- 2.3 IRI Equivalence and Normalization
- [TBray]
- I propose we say a URI is *not* an IRI if it's got unnecessary
      hexification
- [Chris]
- "It follows from the above that IRIs SHOULD NOT be modified when
      being transported. "
- "In some scenarios a definite answer to the question of IRI
      equivalence is needed that is independent of the scheme used and always
      can be calculated quickly and without accessing a network. An example
      of such a case might be XML Namespaces ([XMLNamespace]). "
- [Ian]
- TBL: I feel that when you compare pushing water uphill to number of
      places where URIs are communicated over 7 bits....that world is much
      more diverse. That is MUCH MORE water to push uphill. You'd have to
      divide the Web in two.
- [Chris]
- disagree that it splits the web in two
- *not* doing IRI would certainly split the Web into n parts,
    though
- [Ian]
- TBL: If we try to move to the world where URIs are replaced by IRIs,
      I think it's easier to get namespace-aware software to change (convert
      then compare). That's a more numerable set of applications.
- [Zakim]
- DanC, you wanted to discuss world-wide charset and to agree that the
      rules around hexifying are frightening; cf XML query spec and to
      observe that we are in web phase 2
- [Ian]
- DC: Regarding interoperability of IRIs/URIs, my experience is that if
      you want to use a URI that will really work everywhere, you should use
      [2-9] and that's it. I agree that the hex rules are frightening; the
      XML Query folks ran across problems recently. To me, hexifying is
      something the URI owner gets to do and nobody else gets to look
    into.
- [Chris]
- 'no peeking inside' - IRI spec agrees about that
- 3.2 Converting URIs to IRIs
- In some situations, it may be desirable to try to convert a URI into
      an equivalent IRI. This section gives a procedure to do such a
      conversion. The conversion described in this section will always result
      in an IRI which maps back to the URI that was used as an input for the
      conversion (except for potential case differences in escape sequences).
      However, the IRI resulting from this conversion may not be exactly the
      same as the original IRI (if there ever was one).
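
The round-trip property in the quoted passage can be sketched as follows. This is a simplified illustration, not the draft's actual procedure: it unescapes only runs of %xx that decode as UTF-8, keeps every ASCII character escaped, and leaves a run untouched if any part of it is not valid UTF-8. The names `uri_to_iri` and `iri_to_uri` and the sample URI are assumptions.

```python
import re

# One or more consecutive percent escapes, e.g. %E6%97%A5.
_ESCAPED_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def iri_to_uri(iri: str) -> str:
    """Percent-encode each non-ASCII character as its UTF-8 octets."""
    return "".join(
        ch if ord(ch) < 128
        else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        for ch in iri
    )


def uri_to_iri(uri: str) -> str:
    """Turn escaped UTF-8 sequences back into non-ASCII characters."""
    def decode_run(match):
        octets = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            text = octets.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8: keep the escapes as written
        # Only non-ASCII characters become literal; ASCII stays escaped.
        return "".join(ch if ord(ch) >= 128 else f"%{ord(ch):02X}" for ch in text)

    return _ESCAPED_RUN.sub(decode_run, uri)


# Hypothetical URI with mixed-case escapes and an escaped "/" (%2f).
original = "http://example.org/%e6%97%a5%E6%9C%AC%2fx"
as_iri = uri_to_iri(original)
round_trip = iri_to_uri(as_iri)
```

In this sketch `round_trip` differs from `original` only in the case of the hex digits, which is the exception the quoted text allows for.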
- [Ian]
- DC: As to TBL's comment on water in 7-bit URI world: Maybe. But water
      not all flowing in the same direction. I remain convinced that using
      strings of Unicode chars for resource names is the correct option.
- [DanC]
- yes, or no, paulc: do we discuss issue 8 today?
- [Chris]
- would prefer to try and get this issue with some agreement so i can
      edit this document
- [Ian]
- [TAG agrees to address issue 8 next week]
- [TBray]
- I think we have agreement that
      
        1. IRIs are a good thing, and
        2. URIs are not going away so the URI->IRI conversion is a
          necessity,
        3. we need real clarity on URI<->IRI interconversion rules
          & when to do
 
- [Chris]
- yes, clearly on 1)
- [Ian]
- NW: I agree with RF's comment - I think it will remain the case for
      some time that URIs work where IRIs don't.
- [Chris]
- for months now
- [Zakim]
- Norm, you wanted to say that there is simply existing software that
      insists on ASCII
- [DanC]
- yeah, well, solving world hunger is a good thing. but I'm not going
      to 2nd a proposal to solve it, setting expectations that we're not
      going to meet.
- [Ian]
- TBL: I suggest that CL write down Plans A and B.
- [Chris]
- document both options and discuss them
- [DanC]
- roy, you're wrong. IRI does exist. It's the norm in practice.
- ;-)
- [Ian]
- CL: Yes, I agree that it's a good idea to write down both
    options.
- RF: If the issue is that IRIs should be everywhere, then we need to
      be clear on what IRIs are.  IRI docs are still undergoing big changes.
      I am happy to reconsider this issue when IRI and URI specs
    stabilize.
- [Zakim]
- Roy, you wanted to say that I agree with Larry's comment: the only
      finding is that IRI does not yet exist
- [Chris]
- the implementations have consensus and have done for a long time
- [Ian]
- DC: I observe more consensus in the implementations than in the spec.
      The fact that the users are happy to invest in this technology early
      suggests to me that we owe the community a spec for what they're
    doing.
- [Chris]
- because the rest of the world wants to use its own characters
- [Ian]
- DC: I'm talking about XML* specs, not just IRI specs.
- [Roy]
- Solution: the attribute is CDATA, not an IRI or URI
- [Chris]
- TAG should request that draft-duerst04 moves to an RFC asap
- [Ian]
- DC: The status quo is that people cut/paste IRI text directly into
      their specs.
- [Zakim]
- timbl, you wanted to say we need to define equivalence more widely
      than one WG
- [Ian]
- TBL: We've been asked to say when URIs/IRIs are equivalent. I think
      it would be a shame to leave it as "Pick your favorite of 7 levels"
- [Chris]
- it has been pointed out before that equivalence does indeed depend -
      it depends on the equivalence operator that you are using
- and there are only two levels, in fact
- [Ian]
- TBL: Plan A says IRI_1 and IRI_2 equiv when same string of Unicode
      chars. Plan B says IRI_1 and IRI_2 are equiv when hexified and string
      compared.
- [Chris]
- well, not strcmp but wide-strcmp
- [Ian]
- DC: The only guy who gets to hexify is the guy who makes up the URI
      in the first place.
- [Discussion of Internet Explorer behavior - been doing Plan
      B]
- [DanC]
- (except that I didn't really mean that. yes, I did say that.)
- [Chris]
- doing plan b when they dereference
- doing plan a when they compare without dereference
- [Ian]
- RF: Like LM, I think we need to wait for the next IRI draft.
- [Chris]
- roy says that international domain names now ratified so martin will
      update the spec to reflect this
- [TBray]
- sorry bye
- [Ian]
- DC: So Martin writes some text. But we need to get agreement with
      lots of people (including W3C groups and outside W3C).  So TAG should
      actively coordinate among these groups. E.g., provide tests.
- SW: Another thing in LM's message - creation of an IRI mailing list. 
      Is the IETF the venue for this meeting?
- DC: An IETF WG could be.
- RF: Discussion of draft is taking place on a W3C mailing list for the
      purpose of advancing spec within the IETF.
- [Stuart]
- http://www.w3.org/International/iri-edit/
- [Chris]
- The mailing list for public discussion of IRIs is public-iri@w3.org,
      with a public archive. To subscribe, send mail to
      public-iri-request@w3.org with subscribe as the subject.
- [Ian]
- RF: It would be ideal to endorse this as the home and get people to
      talk there.
- [Some discussion that CL and DC would be uncomfortable with
      providing text to groups with a guarantee that it would pass muster at
      doc transition.]
- Action CL: Revise the IRI position draft for next week.
2.3 Issues that have associated action items
3. Other actions
  - Action IJ 2003/02/06: Modify issues list to show that actions/pending
    are orthogonal to decisions. IJ is working with PLH on this. Revisit this
    end of April.
  Ian Jacobs for Stuart Williams and TimBL
  Last modified: $Date: 2003/04/01 16:32:57 $