Proposal regarding TAG issue IRIEverywhere-27
Status of this document
This document was prepared in response to an action item assigned at the
18 November 2002 TAG ftf meeting regarding issue IRIEverywhere-27.
The TAG has not reviewed this document. This document represents some of the
TAG's 18 November 2002 discussion, but does not represent the consensus of
the TAG or of any other parties. Please send comments on this document to the
public mailing list www-tag (archive).
Many thanks to Martin Dürst for discussing these topics.
Introduction
IRIs are defined in an Internet Draft entitled "Internationalized Resource
Identifiers"; see the IRI
home page for the latest version.
This document proposes:
- An answer to the question "To what extent should IRIs be
used on the Web today?"
- Changes to various specifications.
1. To what extent should IRIs be
used on the Web today?
IRIs are based on several years of experience with internationalized
identifiers. Some W3C Recommendations include language copied from IRI
drafts; other groups are likely to do similarly, possibly introducing
interoperability problems. The TAG therefore supports the work on an IRI
specification (per 18 Nov 2002 teleconf).
The IRI specification is still a draft, however. Therefore, any party
wishing to use IRIs -- software developers, content authors, or specification
editors -- should use them with caution.
- Software developers should begin preparations for manipulating
IRIs.
- Specification writers SHOULD NOT include normative references to the
IRI specification, but MAY quote portions of the specification. If they
do, they SHOULD include a warning that the IRI specification is likely to
change, and the referring specification may be reissued to account for
those changes.
- Authors SHOULD NOT rely on interoperable support of IRIs.
2. Changes to various specifications in light of this model
The TAG considers the IRI space and URI space to be different (per 18 Nov
2002 teleconf): IRIs are a way to talk about URIs (much as relative URIs map to
absolute ones). In this light, the TAG recommends that IRIs not be compared
for equivalence directly, but that they be converted to URIs (possibly after
normalization) and the URIs compared. The TAG is working on a separate
finding entitled "How
to Compare Uniform Resource Identifiers."
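As an illustration of this convert-then-compare model, the following Python sketch maps each IRI to a URI and compares the resulting URIs. It assumes that the conversion consists of Unicode normalization (NFC) followed by hex-escaping the UTF-8 bytes of each non-ASCII character. The IRI draft may define the mapping differently (for example, for host names), and the function names `iri_to_uri` and `iris_equivalent` are hypothetical, chosen only for this example.

```python
import unicodedata


def iri_to_uri(iri: str) -> str:
    """Map an IRI to a URI (sketch; assumes NFC + UTF-8 hex-escaping)."""
    # Assumption: normalize to NFC before conversion.
    normalized = unicodedata.normalize("NFC", iri)
    parts = []
    for ch in normalized:
        if ord(ch) < 128:
            # ASCII characters are copied through unchanged.
            parts.append(ch)
        else:
            # Non-ASCII characters become %HH triplets, one per UTF-8 byte.
            parts.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(parts)


def iris_equivalent(a: str, b: str) -> bool:
    """Compare two IRIs by converting both to URIs and comparing the URIs."""
    return iri_to_uri(a) == iri_to_uri(b)
```

Under these assumptions, a precomposed and a decomposed spelling of the same path segment compare as equivalent, because both convert to the same URI. The final string comparison stands in for whatever rules the URI comparison finding eventually specifies.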
In light of this (and other concerns regarding hex-escaping), the authors
recommend some changes to the following specifications.
2.1 IRI specification
- Section 2.3 ("IRI Equivalence and Normalization") should:
- Talk about normalization
- Explain that IRI comparison is done by normalizing and converting
to URIs (per section 3.1), then comparing URIs per "How to Compare
Uniform Resource Identifiers."
- Question: Should the TAG ask the IRI spec editors to convey a model
whereby IRIs are a means of "talking about a URI"? Currently, the IRI
specification conveys the message that IRIs should replace
URIs. Martin observes that if you say that
~/%7e/%7E are equivalent, then you should say that IRIs are a way
of writing URIs. Otherwise, IRIs and URIs are different.
2.2 RFC2396
- RFC2396 should be modified so that hex digits (HEXDIG) are
case-insensitive. We note that, at the time of writing, the editor's
draft of RFC2396 neither mentions this case-insensitivity nor defines
HEXDIG.
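To make the requested case-insensitivity concrete, here is a minimal Python sketch that folds the hex digits of every escape triplet to one case before comparison, so that %7e and %7E become the same string. The choice of upper case and the function name `uppercase_triplets` are assumptions made for this example, not text from RFC2396.

```python
import re

# A "%" followed by exactly two hex digits, in either case.
_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def uppercase_triplets(uri: str) -> str:
    """Fold the hex digits of each %HH triplet to upper case."""
    # Only the triplets change; all other characters keep their case.
    return _TRIPLET.sub(lambda m: m.group(0).upper(), uri)
```

After this folding step, two URIs that differ only in the case of their hex digits compare equal with a plain string comparison.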
2.3 How to Compare URIs
- Include a reference to the IRI specification, where IRI comparison is
discussed.
- Per TAG discussion at the 20 Jan 2003
teleconference, ask that "lower case" in section 3.1 be changed to
"upper case," even if the desired change to RFC2396 is made.
- In section 3.1, step 2, part three, change "character sequence" to
"triplet sequence."
- For ASCII characters, treat forms such as ~/%7e/%7E as
equivalent. If "~" is treated differently, then some IRIs may be
considered equivalent while the corresponding URIs would be
considered different. This breaks the model of seeing IRIs as
a way of writing URIs.
Then ask that How to Compare URIs be reflected in RFC2396.
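One way to obtain the ~/%7e/%7E equivalence is sketched below in Python: triplets that encode an unreserved character are replaced by the character itself, and all remaining triplets have their hex digits folded to upper case. The unreserved set is taken from RFC2396 section 2.3 (alphanumerics plus the "mark" characters). The function names are hypothetical, and this is not the procedure defined in "How to Compare URIs."

```python
import re

# Unreserved characters per RFC2396 section 2.3: alphanumerics plus marks.
UNRESERVED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "-_.!~*'()"
)

_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_triplets(uri: str) -> str:
    """Decode triplets for unreserved characters; upper-case the rest."""
    def fix(match):
        ch = chr(int(match.group(1), 16))
        if ch in UNRESERVED:
            # For example, %7e and %7E both become "~".
            return ch
        # Reserved or non-ASCII bytes stay escaped, with upper-case hex.
        return "%" + match.group(1).upper()
    return _TRIPLET.sub(fix, uri)


def uris_equivalent(a: str, b: str) -> bool:
    """Compare two URIs after triplet normalization."""
    return normalize_triplets(a) == normalize_triplets(b)
```

With this sketch, `http://example.org/~user`, `http://example.org/%7euser`, and `http://example.org/%7Euser` are all treated as equivalent, while an escaped "/" (%2f) remains distinct from a literal "/" because "/" is a reserved character.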
2.4 XML Namespaces 1.1
- Have the authors of XML Namespaces 1.1 specify (in section
2.3) that ~/%7e/%7E are considered equivalent.
Chris Lilley, Ian Jacobs
Last modified: $Date: 2003/01/27 22:58:44 $ by $Author: ijacobs $