This document is also available in these non-normative formats: PostScript version, PDF version.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
The aim of this document is to outline a syntax for expressing URIs in a generic, abbreviated syntax. While it has been produced in conjunction with the XHTML 2 Working Group, it is not specifically targeted at use by XHTML Family Markup Languages. Note that the target audience for this document is Language designers, not the users of those Languages.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is a Last Call Working Draft of the CURIE Syntax specification. Originally this document was based upon work done in the definition of [XHTML2], and work done by the RDF-in-HTML Task Force, a joint task force of the Semantic Web Best Practices and Deployment Working Group and XHTML 2 Working Group. It is being released in a separate, stand-alone specification in order to speed its adoption and facilitate its use in various specifications. We believe the syntax rules defined in this document to be stable, and invite feedback on this draft through 20 April 2008. Comments should be addressed to www-html-editor@w3.org. All comments sent to that address are available in a public archive.
This document has been produced by the W3C XHTML 2 Working Group as part of the HTML Activity. The goals of the XHTML 2 Working Group are discussed in the XHTML 2 Working Group charter.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
Please report errors in this specification to www-html-editor@w3.org (archive). It is inappropriate to send discussion email to this address. Public discussion may take place on www-html@w3.org (archive).
This section is informative.
More and more languages need a mechanism to permit the use of extensible value collections. These are primarily found in XML attribute values, but also found in other, similar spaces in non-XML languages (e.g., [SPARQL]). Typically such extension mechanisms utilize the concept of scoping, where values are created within a unique scope, and that value space is managed by whoever defines it. Using such a mechanism allows independent organizations to define values without the risk of collision.
At the same time, language designers are trying to ensure that their languages mesh smoothly into the semantic web. Since the basis of the semantic web is the notion that meaning can be derived through the relationship among resources, these extension mechanisms need a ready way of mapping their scoped values to resources (via URIs).
In many cases, language designers are attempting to use QNames for this extension mechanism [XML-SCHEMA-QNAME]. QNames do permit independent management of the value space, and can map the values to a resource. Unfortunately, QNames are unsuitable in most cases because 1) they are NOT intended for use in attribute values, and 2) the syntax of QNames is overly restrictive and does not allow all possible URIs to be expressed.
A specific example of the problem this causes comes from attempting to define the value space for books. In a QName, the part after the colon must be a valid element name, making an example such as the following invalid:
isbn:0321154991
This is not a valid QName simply because '0321154991' is not a valid element name. Yet, in the example given, we don't really want to define a valid element name anyway. The whole reason for using a QName was to reference an item in a private value space - that of ISBNs. Moreover, in this example, we want that value to map to a URI that will reveal the meaning of that ISBN. As you can see, the definition of QNames and this (relatively common) use case are in conflict with one another.
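The restriction can be checked mechanically. The following Python sketch is illustrative only: it uses a deliberately simplified, ASCII-only approximation of the NCName production (the real production admits many more Unicode characters), and the names `NCNAME_ASCII`, `is_ncname` and `is_qname` are hypothetical helpers, not part of this specification.

```python
import re

# ASCII-only approximation of NCName: a letter or underscore, followed by
# letters, digits, '.', '-' or '_'. The real production is much wider.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(text: str) -> bool:
    """Return True if text matches the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(text) is not None


def is_qname(text: str) -> bool:
    """Return True if text is 'prefix:local' or 'local' with NCName parts."""
    parts = text.split(":")
    return len(parts) in (1, 2) and all(is_ncname(p) for p in parts)


print(is_qname("isbn:0321154991"))  # False: the local part starts with a digit
print(is_qname("dc:creator"))       # True
```

Under this approximation the ISBN example is rejected solely because of its local part; the prefix `isbn` is acceptable on its own.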
This specification addresses the problem by creating a new data type whose purpose is specifically to allow for the definition of scoped values that map to URIs in exactly this way. This type is called a "CURIE" or a "Compact URI", and values that are syntactically valid QNames are a subset of this.
Note that this specification is targeted at markup language designers, not document authors. Any language designer considering the use of QNames in attribute values should consider instead using CURIEs, since CURIEs are designed for this purpose, while QNames are not.
Although they are not currently called CURIEs, the technique described here is in widespread usage. However, taken literally, QNames would not support many of the examples that we would find 'in the wild' — the fact that they do is mainly because systems and authors take a very lax approach to QNames.
In other words, the principle used in QNames — that of combining a namespace name with a local part to generate a URI — is widely used, but little checking is done on the local part to ensure that the string is a valid element name. However, this does mean that CURIEs can be easily used in a number of places, since there is already a large amount of 'mind-share'.
This section is normative.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
A conforming CURIE processor must support all of the features required in this specification.
This section is normative.
A CURIE is by definition a syntactic superset of a QName. It is comprised of two components, a prefix and a reference. The prefix is separated from the reference by a colon (:). It is possible to omit both the prefix and the colon, or to omit just the prefix and leave the colon.
To disambiguate a CURIE when it appears in a context where a normal [URI] may also be used, the entire CURIE is permitted to be enclosed in brackets ([, ]).
    safe_curie := '[' curie ']'
    curie      := [ [ prefix ] ':' ] reference
    prefix     := NCName
    reference  := irelative-ref (as defined in IRI)
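As an informal illustration of the productions above, the following Python sketch splits a string into its optional prefix and its reference, and records whether the safe_curie bracket form was used. It is a sketch under stated assumptions: NCName is approximated by an ASCII-only pattern, the reference is not validated against irelative-ref, and the names `Curie` and `parse_curie` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Simplified, ASCII-only stand-in for the NCName production (an assumption).
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


class Curie(NamedTuple):
    prefix: Optional[str]  # None when the prefix is omitted
    has_colon: bool        # True when a prefix/reference colon was found
    reference: str         # not validated against irelative-ref here
    safe: bool             # True when written as '[' curie ']'


def parse_curie(text: str) -> Curie:
    """Split text according to the safe_curie / curie productions."""
    safe = text.startswith("[") and text.endswith("]")
    body = text[1:-1] if safe else text
    head, colon, tail = body.partition(":")
    if colon and head == "":
        # Prefix omitted, colon kept: ':reference'
        return Curie(None, True, tail, safe)
    if colon and NCNAME_ASCII.fullmatch(head):
        # 'prefix:reference'
        return Curie(head, True, tail, safe)
    # No usable prefix: the whole body is treated as the reference.
    return Curie(None, False, body, safe)


print(parse_curie("isbn:0321154991"))
print(parse_curie("[ex:home.html]"))
print(parse_curie(":next"))
```

Treating a string whose leading part is not an NCName as a bare reference is a choice made for this sketch; a host language may prefer to report an error instead.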
When CURIEs are used in an XML-based host language, prefix values MUST be able to be defined using the 'xmlns:' syntax specified in [XMLNAMES]. Such host languages MAY also provide additional prefix mapping definition mechanisms.
When CURIEs are used in a non-XML host language, the host language MUST provide a mechanism for defining the mapping from the prefix to an IRI.
A host language MAY declare a default prefix value, or MAY provide a mechanism for defining a default prefix value. In such a host language, when the prefix is omitted from a CURIE, the default prefix value MUST be used. Conversely, if such a language does not define a default prefix value mechanism, CURIEs MUST NOT have their prefix omitted.
The concatenation of the prefix value associated with a CURIE and its reference MUST be an IRI [IRI]. Note that while the set of IRIs represents the lexical space of a CURIE, the value space is the set of URIs (IRIs after canonicalization - see [IRI]).
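The default-prefix rules and the concatenation step can be sketched as follows. This is a minimal illustration, not a conforming processor: it assumes the host language hands over its prefix mappings as a dictionary and its default prefix value (if any) as an IRI string, it does not check that the result is a valid IRI, and it performs no canonicalization. The function name `expand_curie` and the `http://www.example.com/vocab#` default are hypothetical.

```python
from typing import Mapping, Optional


def expand_curie(curie: str,
                 prefixes: Mapping[str, str],
                 default: Optional[str] = None) -> str:
    """Concatenate the prefix value and the reference of a CURIE.

    `default` is the default prefix value, or None when the host
    language defines no such mechanism (an assumption of this sketch).
    """
    head, colon, tail = curie.partition(":")
    if colon and head:
        # Explicit prefix: it must have a mapping.
        if head not in prefixes:
            raise ValueError(f"undefined prefix: {head!r}")
        return prefixes[head] + tail
    # Prefix omitted, with or without the colon.
    if default is None:
        raise ValueError("prefix omitted but no default prefix value is defined")
    reference = tail if colon else curie
    return default + reference


mappings = {"dc": "http://purl.org/dc/elements/1.1/"}
print(expand_curie("dc:creator", mappings))
print(expand_curie(":next", mappings, default="http://www.example.com/vocab#"))
```

Raising an error for an omitted prefix when no default exists mirrors the MUST NOT in the text above; how a real host language reports that condition is left to that language.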
The CURIE prefix '_' is reserved. For this reason, the prefix '_' SHOULD be avoided by authors.
Host languages MAY define additional constraints on these syntax rules when CURIEs are used in the context of those host languages. Host languages MUST NOT relax the constraints defined in this specification.
The safe_curie production is for use in attribute values where it would be otherwise impossible to disambiguate between a CURIE and a URI.
Language designers SHOULD only use CURIEs (or safe_curies) as the datatype of new attributes in their markup language, since using them in values where historically an attribute has taken a URI as its datatype could break backward compatibility.
This section is informative.
Each host language that incorporates CURIEs supplies a mechanism for defining prefix mappings. In the case of XML-based host languages, one such mechanism is required to be xmlns. This section illustrates some possible alternative mapping mechanisms available in various existing languages.
The [SPARQL] language provides a PREFIX keyword for defining the prefix used in their CURIE-like identifiers.
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?x ?name
    WHERE { ?x foaf:name ?name }
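To show how such declarations become a prefix mapping, the following Python sketch collects PREFIX declarations from a query string with a regular expression. It is not a SPARQL parser (it would, for instance, also match the pattern inside a string literal or comment), the prefix pattern is an ASCII-only simplification, and the names `PREFIX_DECL` and `sparql_prefixes` are hypothetical.

```python
import re

# Matches 'PREFIX name: <iri>'; the name may be empty. ASCII-only sketch.
PREFIX_DECL = re.compile(
    r"PREFIX\s+([A-Za-z_][A-Za-z0-9._-]*)?:\s*<([^>]*)>",
    re.IGNORECASE,
)


def sparql_prefixes(query: str) -> dict:
    """Return {prefix: iri} for each PREFIX declaration found in query."""
    return {m.group(1) or "": m.group(2) for m in PREFIX_DECL.finditer(query)}


query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?x ?name
WHERE { ?x foaf:name ?name }
"""
prefixes = sparql_prefixes(query)
print(prefixes)                    # {'foaf': 'http://xmlns.com/foaf/0.1/'}
print(prefixes["foaf"] + "name")   # http://xmlns.com/foaf/0.1/name
```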
HTML 4.01 does not currently employ CURIEs. An extension to HTML 4.01 to support RDFa, however, has been discussed. Such an extension would need to define a prefix mapping mechanism in order to support the use of CURIEs in the RDFa attributes. For example:
    <html>
      <head>
        <title>An HTML document using RDFa</title>
        <meta scheme="prefix" name="myPrefix" content="http://www.example.com/myPrefix/">
      </head>
      <body>
        <p about="http://www.example.com/something" rel="myPrefix:reference">
          some content
        </p>
      </body>
    </html>
XHTML 2 incorporates RDFa. Since XHTML 2 is an XML-based markup language, documents annotated with RDFa use the xmlns mechanism to define prefixes.
    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
      <head profile="http://www.w3.org/1999/xhtml/vocab">
        <title>An XHTML document using RDFa</title>
      </head>
      <body>
        <p about="http://www.example.com/something" rel="cite">
          some content was written by
          <span property="dc:creator">some author</span>
        </p>
      </body>
    </html>
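A processor for an XML-based host language needs the xmlns declarations as a prefix mapping. The following Python sketch gathers them with the standard library's `xml.etree.ElementTree.iterparse` and its `start-ns` event. It is a simplification under stated assumptions: it ignores element scoping (the first declaration of a prefix wins), it skips the default XML namespace because whether that serves as a CURIE default prefix value is a host language decision, and the name `xmlns_prefixes` is hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def xmlns_prefixes(document: str) -> dict:
    """Collect {prefix: namespace IRI} from the xmlns:* declarations."""
    prefixes = {}
    source = io.BytesIO(document.encode("utf-8"))
    # 'start-ns' events report each declaration as a (prefix, uri) pair.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if prefix:  # skip the default namespace (empty prefix)
            prefixes.setdefault(prefix, uri)
    return prefixes


doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<body><span property="dc:creator">some author</span></body>'
    '</html>'
)
mapping = xmlns_prefixes(doc)
print(mapping["dc"] + "creator")  # http://purl.org/dc/elements/1.1/creator
```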
This section is informative.
CURIEs can be used in exactly the same syntactic way QNames have been used in attribute values, with the modification that the format of the strings after the colon is looser. In all cases a parsed CURIE will produce an IRI. However, the process of parsing involves substituting the value represented by the prefix for the prefix itself, and then simply appending the part after the colon (the reference).
Note that if CURIEs are to be used in the context of scripting, accessing a CURIE via standard mechanisms such as the XML DOM will return the raw CURIE, not its lexical value. In order to develop portable applications that evaluate CURIEs, a script author must transform CURIEs into their lexical value before evaluating them (e.g., dereferencing the resulting URI or comparing two CURIEs).
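The need to transform before comparing can be seen in a small Python sketch. The second prefix `d` and the helper name `expand` are hypothetical, introduced only to show two different raw strings that denote the same IRI; the sketch assumes every CURIE carries an explicit, defined prefix.

```python
def expand(curie: str, prefixes: dict) -> str:
    """Replace the prefix with its value and append the reference."""
    prefix, _colon, reference = curie.partition(":")
    return prefixes[prefix] + reference


# Two prefixes that happen to be mapped to the same IRI (hypothetical setup).
prefixes = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "d": "http://purl.org/dc/elements/1.1/",
}

raw_a, raw_b = "dc:creator", "d:creator"
print(raw_a == raw_b)                                      # False
print(expand(raw_a, prefixes) == expand(raw_b, prefixes))  # True
```

Comparing the raw attribute strings reports a difference that disappears once both values are expanded.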
All of the following are valid CURIEs — even though they are not valid QNames — and they take advantage of the fact that the part after the colon no longer needs to conform to the rules for element names:
    home:#start
    joseki:
    google:xforms+or+'xml+forms'
In some cases language designers will want to use both URIs and CURIEs as the value of an attribute. For example, in XHTML+RDFa [XHTMLRDFa] the about attribute allows a URI to be specified that some metadata is "about", but it is also useful to abbreviate this URI, using the compact syntax. However, the problem is that it is not possible for the language parser to be completely sure whether it has located a CURIE or a URI. For example, a resource could be specified as follows:
<p rel="foaf:homePage" about="http://www.example.org/home.html">home</p>
There is no way to be sure that this is a normal URI, or a CURIE. Therefore the syntax for carrying a CURIE when there is any possibility of ambiguity is to enclose the CURIE in square brackets, as in the following example:
    <html xmlns:ex="http://www.example.org/">
      <head>...</head>
      <body>
        <p rel="foaf:homePage" about="[ex:home.html]">home</p>
      </body>
    </html>
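A processor handling an attribute that accepts either form might distinguish the two cases along the following lines. This Python sketch is an assumption about one reasonable approach, not a prescribed algorithm: a bracketed value is treated as a safe CURIE and expanded, and anything else is returned unchanged as a URI. The name `resolve_uri_or_safe_curie` is hypothetical, and undefined prefixes are reported as a `ValueError` purely for illustration.

```python
def resolve_uri_or_safe_curie(value: str, prefixes: dict) -> str:
    """Expand '[prefix:reference]'; pass any other value through as a URI."""
    if value.startswith("[") and value.endswith("]"):
        prefix, _colon, reference = value[1:-1].partition(":")
        if prefix not in prefixes:
            raise ValueError(f"undefined prefix: {prefix!r}")
        return prefixes[prefix] + reference
    return value


prefixes = {"ex": "http://www.example.org/"}
print(resolve_uri_or_safe_curie("[ex:home.html]", prefixes))
print(resolve_uri_or_safe_curie("http://www.example.org/home.html", prefixes))
```

Both calls yield `http://www.example.org/home.html`, so the bracketed and unbracketed forms of the about attribute identify the same resource.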
This section is normative.
In order to facilitate the use of CURIEs in markup languages, this specification defines some additional datatypes in the XHTML datatype space (http://www.w3.org/1999/xhtml/datatypes/). Markup languages that use XHTML Modularization can find these definitions in the Modularization support file "datatypes" for their schema grammar.
This section is normative.
This section is informative.