This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 4632 - Use of IRIs
Summary: Use of IRIs
Status: RESOLVED FIXED
Alias: None
Product: SML
Classification: Unclassified
Component: Core+Interchange Format
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Second draft
Assignee: Kumar Pandit
QA Contact: SML Working Group discussion list
URL: http://dev.w3.org/cvsweb/~checkout~/2...
Whiteboard:
Keywords:
Depends on:
Blocks: 4635
Reported: 2007-06-12 18:10 UTC by Philippe Le Hegaret
Modified: 2007-09-21 04:54 UTC

Description Philippe Le Hegaret 2007-06-12 18:10:07 UTC
The current specification only talks about using URIs and doesn't say anything about IRIs.
The WSDL 2.0 specification went through that debate already, so I suggest following their path:
[[
 This specification uses the XML Schema type xs:anyURI (see [XML Schema: Datatypes]). It is defined so that xs:anyURI values are essentially IRIs (see [IETF RFC 3987]). The conversion from xs:anyURI values to an actual URI is via an escaping procedure defined by (see [XLink 1.0]), which is identical in most respects to IRI Section 3.1 (see [IETF RFC 3987]).

For interoperability, WSDL authors are advised to avoid the US-ASCII characters: "<", ">", '"', space, "{", "}", "|", "\", "^", and "`", which are allowed by the xs:anyURI type, but disallowed in IRIs.
]]
http://www.w3.org/TR/2007/PR-wsdl20-20070523/#xmlSchemaAnyURI
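As a rough illustration of the interoperability advice quoted above, the following Python sketch flags the US-ASCII characters that the WSDL 2.0 text lists as allowed by xs:anyURI but disallowed in IRIs. The function name and constant are hypothetical, chosen only for this example; the check covers nothing beyond the ten characters named in the quotation.

```python
# Characters the quoted WSDL 2.0 text advises authors to avoid:
# "<", ">", '"', space, "{", "}", "|", "\", "^", and "`".
DISCOURAGED = set('<>" {}|\\^`')


def discouraged_chars(value):
    """Return the discouraged characters found in value, in order of first appearance."""
    seen = []
    for ch in value:
        if ch in DISCOURAGED and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    print(discouraged_chars("http://example.com/a b|c"))
    print(discouraged_chars("http://example.com/ok"))
```

A value for which the function returns an empty list avoids the characters named in the advice; that alone does not make it a valid IRI.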

I would also propose changing the name of the element from sml:uri to sml:address or something equivalent.
Comment 1 Philippe Le Hegaret 2007-06-12 21:05:13 UTC
The change of the element name sml:uri is now tracked by http://www.w3.org/Bugs/Public/show_bug.cgi?id=4635
Comment 2 Pratul Dublish 2007-09-07 02:51:24 UTC
Here's a suggestion to resolve this bug.

Given the decision at the Toronto F2F to require XML Schema 1.0 for SML 1.1 (i.e., XML Schema 1.0 must be supported, but implementations are not prevented from using XML Schema 1.1), it may be better to continue using URI as the interoperable reference scheme in SML IF, since it aligns with the definition of xs:anyURI in XML Schema 1.0:

3.2.17 anyURI

[Definition:]   anyURI represents a Uniform Resource Identifier Reference (URI). An anyURI value can be absolute or relative, and may have an optional fragment identifier (i.e., it may be a URI Reference). This type should be used to specify the intention that the value fulfills the role of a URI as defined by [RFC 2396]<http://www.w3.org/TR/xmlschema-2/#RFC2396>, as amended by [RFC 2732]<http://www.w3.org/TR/xmlschema-2/#RFC2732>.

The mapping from anyURI values to URIs is as defined by the URI reference escaping procedure defined in Section 5.4 Locator Attribute<http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators> of [XML Linking Language]<http://www.w3.org/TR/xmlschema-2/#XLink> (see also Section 8 Character Encoding in URI References<http://www.w3.org/TR/2001/WD-charmod-20010126/#sec-URIs> of [Character Model]<http://www.w3.org/TR/xmlschema-2/#CharMod>). This means that a wide range of internationalized resource identifiers can be specified when an anyURI is called for, and still be understood as URIs per [RFC 2396]<http://www.w3.org/TR/xmlschema-2/#RFC2396>, as amended by [RFC 2732]<http://www.w3.org/TR/xmlschema-2/#RFC2732>, where appropriate to identify resources.


The above approach allows SML and SML IF implementations to leverage the xs:anyURI support built into XML Schema 1.0 processors. Note that the URI reference escaping mechanism allows internationalized resource identifiers to be specified as anyURI values, although the specification may not be as elegant as that for IRIs.

If the WG still wants to pursue IRIs, then we'll need to define a new datatype - say smlif:anyIRI - to capture the definition of xs:anyURI in XML Schema 1.1.

3.3.18 anyURI

[Definition:]   anyURI represents an Internationalized Resource Identifier Reference (IRI).  An anyURI value can be absolute or relative, and may have an optional fragment identifier (i.e., it may be an IRI Reference).  This type should be used when the value fulfills the role of an IRI, as defined in [RFC 3987]<http://www.w3.org/TR/xmlschema11-2/#RFC3987> or its successor(s) in the IETF Standards Track.
Note: IRIs may be used to locate resources or simply to identify them. In the case where they are used to locate resources using a URI, applications should use the mapping from anyURI values to URIs given by the URI reference escaping procedure defined in Section 3.1 Mapping of IRIs to URIs<http://www.ietf.org/rfc/rfc3987.txt> of [RFC 3987]<http://www.w3.org/TR/xmlschema11-2/#RFC3987> or its successor(s) in the IETF Standards Track.  This means that a wide range of internationalized resource identifiers can be specified when an anyURI is called for, and still be understood as URIs per [RFC 3986]<http://www.w3.org/TR/xmlschema11-2/#RFC3986> and its successor(s).


Most SML validators will need to add support for the above since the above definition is not supported by XML Schema 1.0 processors.
Comment 3 C. M. Sperberg-McQueen 2007-09-13 19:09:54 UTC
Following on from the proposal in comment #2, see also the proposal from
Kumar Pandit at http://lists.w3.org/Archives/Public/public-sml/2007Sep/0082.html

which reads in part:

Proposal:
I propose that we should not refer to any specific URI/IRI RFC in the SML and SML-IF specs. Instead, we should let the definition of sml:uri be governed by the definition of xs:anyURI type defined in the schema version we align to.

Reasons:

1.    sml:uri is of type xs:anyURI as currently defined in the SML schema. This type supports encoding international characters using a well defined escaping scheme (http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#anyURI). SML instance documents can of course have internationalized content. Thus internationalization is possible without any change to the spec as it is defined today.

2.    We have decided to align SML 1.1 spec with XML schema 1.0 which is aligned with URIs (RFC2396/2732).

3.    Most XML schema 1.0 processor implementations do not support IRIs. This means that SML implementations that use an off-the-shelf schema 1.0 processor will need to provide additional implementation to support IRIs. Implementing RFC 3987 (46 pages long) is a non-trivial amount of work. This is an unnecessary implementation burden given that internationalization is already possible as described in #1.

4.    If we decide to align SML spec with XML schema 1.1 in a future release, we will automatically get IRI functionality as schema 1.1 is aligned with the IRI RFCs.

5.    The SML spec does not put any restrictions on additional reference schemes that can be defined. If an application does not wish to use the internationalization support already provided by sml:uri (xs:anyURI) for reasons specific to its domain, it is free to define additional scheme(s) to meet its needs without violating the SML 1.1 specification.

[end quotation]

As a supporter of IRIs, I am happy with this.  But I want to clarify
one important technical issue.

It needs to be noted that while XSDL 1.0 points to the URI spec, not
the IRI spec, it's nevertheless true that the value space of xs:anyURI
includes (as far as I know) all legal IRIs.  So any library that
"supports" xs:anyURI, for the relevant meaning of "support", will also
"support" IRIs.  (The anyURI type was designed to support IRIs; the
XSDL 1.0 spec points to URIs not IRIs only for historical reasons.)

The only difference I know of (there MAY be others, but I rather doubt
it) between xs:anyURI and IRI is that after XSDL 1.0 was completed, a
later revision of the IRI spec forbad blanks in IRIs.  So
"http://example.com/This is an example" is legal (though very odd) as
an anyURI, but not as an IRI, where the nearest equivalent is
"http://example.com/This%20is%20an%20example".
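To make the blank-versus-%20 example concrete, here is a minimal Python sketch of the kind of escaping step that XML Schema 1.0 delegates to XLink 1.0 (Section 5.4), as I read that procedure: non-ASCII characters and the ASCII characters excluded by RFC 2396 (other than "#", "%", "[" and "]") are encoded as UTF-8 and each byte is written as %HH. The function name is hypothetical, and this is an approximation for illustration, not a conformance-tested implementation.

```python
# ASCII characters escaped in this sketch: the RFC 2396 "excluded" set
# minus "#", "%", "[" and "]". Control characters are handled by range below.
_ESCAPED_ASCII = set('<>" {}|\\^`')


def escape_any_uri(value):
    """Percent-escape disallowed characters in an xs:anyURI value (illustrative sketch)."""
    out = []
    for ch in value:
        code = ord(ch)
        if code < 0x20 or code >= 0x7F or ch in _ESCAPED_ASCII:
            # Encode the character as UTF-8 and escape every resulting byte.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            # Everything else, including existing "%" escapes, passes through.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_any_uri("http://example.com/This is an example"))
```

Because "%" is left untouched, a value that is already escaped passes through unchanged, so applying the function twice gives the same result as applying it once.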

That means I'm happy with Kumar's suggestion, but I am concerned lest
he may support it out of the view that supporting xs:anyURI involves
less than supporting IRIs.  If he still supports his proposal after
this clarification, then good.
Comment 4 Pratul Dublish 2007-09-20 20:28:50 UTC
Pls fix in 2nd draft as per consensus in 9/20 call
Comment 5 Kumar Pandit 2007-09-21 04:54:06 UTC
Replaced references to "URIs (rfc 3986)" by references to the "URI Scheme" section.

Updated the "URI Scheme" section to include the following sentence:

The URI Scheme is based on the anyURI type defined in the XML schema specification [XML Schema Datatypes].