<?xml version='1.0'?>
<?xml-stylesheet href="rfc2629.css" type="text/css"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<!--<?rfc symrefs='yes'?>-->
<!--<?rfc sortrefs='yes'?>-->
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<!--<?rfc strict='yes'?>-->
<!--<?rfc subcompact='no'?>-->

<!-- updates="2244,3028" -->
<rfc ipr="full3667" docName="draft-newman-i18n-comparator-04.txt" category="std" xml:lang="en">
<front>
<title abbrev="Collation Registry">Internet Application Protocol
Collation Registry (DRAFT!!!)</title>
<author initials="C." surname="Newman" fullname="Chris Newman">
<organization>Sun Microsystems</organization>
<address>
<postal>
<street>1050 Lakes Drive</street>
<city>West Covina</city>
<region>CA</region>
<code>91790</code>
<country>US</country>
</postal>
<email>chris.newman@sun.com</email>
</address>
</author>

<author initials="M.J." surname="Duerst" fullname='Martin Duerst (Note: Please write "Duerst" with u-umlaut wherever possible, for example as "D&amp;#252;rst" in XML and HTML.)'>
  <organization abbrev="W3C/Keio University">W3C/Keio University</organization>
  <address>
  <postal>
  <street>5322 Endo</street>
  <city>Fujisawa</city>
  <region>Kanagawa</region>
  <code>252-8520</code>
  <country>Japan</country>
  </postal>
  <phone>+81 466 49 1170</phone>
  <facsimile>+81 466 49 1171</facsimile>
  <email>duerst@w3.org</email>
  <uri>http://www.w3.org/People/D%C3%BCrst/</uri>
  </address>
</author>

<date month="October" year="2004"/>
<area>Applications</area>
<keyword>Collation</keyword>
<keyword>Sorting</keyword>
<abstract><t>
Many Internet application protocols include string-based lookup,
searching, or sorting operations.  However, the problem space for
searching and sorting international strings is large, not fully
explored, and is outside the area of expertise for the Internet
Engineering Task Force (IETF).  Rather than attempt to solve such a
large problem, this specification creates an abstraction framework so
that application protocols can precisely identify a comparison function
and the repertoire of comparison functions can be extended in the
future.
</t></abstract>
</front>
<middle>

<section title="Introduction">
<t>The <xref target="RFC2244">ACAP</xref> specification introduced the
concept of a comparator (which we call collation in this document), but
failed to create an IANA registry.  With the introduction of <xref target="RFC3454">stringprep</xref> and the <xref target="unicode-tr10v9">Unicode Collation Algorithm</xref>, it is now
time to create that registry and populate it with some initial values
appropriate for an international community.  This specification replaces
and generalizes the definition of a comparator in ACAP and creates a
collation registry.</t>

<section title="Structure of this Document"><t>@@@@ to be completed</t></section>
<section title="Conventions Used in this Document">
<t>
The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" in
this document are to be interpreted as defined in
<xref target="RFC2119">"Key words for use
in RFCs to Indicate Requirement Levels"</xref>.</t>

<t>The attribute syntax specifications use the <xref target="RFC2234">Augmented Backus-Naur Form (ABNF)</xref> notation
including the core rules defined in Appendix A of that document.  This specification also inherits ABNF
rules from <xref target="RFC3066">Language Tags</xref>.</t>
<t>The term 'protocol' is used in this memo in a very generic sense, and includes things such as query languages.</t></section>
</section>

<section title="Collation Definition and Purpose">

<section title="Definition"><t>A collation is a named function which takes two arbitrary-length
character strings (with the exception of the <xref target="octet">i;octet</xref> collation) as input and can be used to
perform one or more of three basic comparison operations: equality test,
substring match, and ordering test.</t>

</section><section title="Purpose"><t>Collations provide a multi-protocol abstraction layer for comparison
functions so the details of a particular comparison operation can be
specified by someone with appropriate expertise independent of the
application protocol that consumes that collation.  This is similar to
the way a <xref target="RFC2978">charset</xref> separates the details of
octet to character mapping from a protocol specification such as <xref target="RFC2045">MIME</xref> or the way <xref target="RFC2222">SASL</xref> separates the details of an authentication
mechanism from a protocol specification such as <xref target="RFC2244">ACAP</xref>.</t>

<?rfc needLines="15"?>
<figure><preamble>The following diagram illustrates the value of this abstraction layer:</preamble>
<artwork type="diagram">
+-------------------+                         +-----------------+
| IMAP i18n SEARCH  |--+                      | Basic           |
+-------------------+  |                   +--| Collation Spec  |
                       |                   |  +-----------------+
+-------------------+  |  +-------------+  |  +-----------------+
| ACAP i18n SEARCH  |--+--| Collation   |--+--| A stringprep    |
+-------------------+  |  | Registry    |  |  | Collation Spec  |
                       |  +-------------+  |  +-----------------+
+-------------------+  |                   |  +-----------------+
| ...other protocol |--+                   |  | locale-specific |
+-------------------+                      +--| Collation Spec  |
                                              +-----------------+
</artwork>
<postamble>Thus IMAP, ACAP and future application protocols with international search capability simply specify how to interface to the collation registry instead of each protocol specification having to specify all the collations it supports.
</postamble></figure></section>

<section title="Sort Keys"><t>One component of a collation is a canonicalization function which
can be pre-applied to single strings and may enhance the performance of
subsequent comparison operations.  Normally, this is an implementation
detail of collations, but at times it may be useful for an
application protocol to expose collation canonicalization over the
protocol.  Collation canonicalization can range from an identity mapping
(e.g., the i;octet collation <xref target="octet"></xref>) to a mapping
which makes the string unreadable to a human (e.g., the basic collation).</t>
</section></section>

<section title="Collation Name Syntax" anchor="syntax">
<section title="Basic Syntax"><t>The collation name itself is a single US-ASCII string
beginning with a letter and made up of letters, digits, or one of the
following 4 symbols: "-", ";", "=" or ".".  The name MUST NOT be longer than
254 characters.</t><figure>
<artwork type="abnf">
  collation-char  =  ALPHA / DIGIT / "-" / ";" / "=" / "."

  collation-name  =  ALPHA *253collation-char
</artwork></figure>
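<t>As an illustration only, the collation-name production above can be checked with a regular expression.  The following Python sketch is not part of this specification; the names COLLATION_NAME and is_collation_name are hypothetical.</t>
<figure><artwork type="python"><![CDATA[
```python
import re

# collation-char = ALPHA / DIGIT / "-" / ";" / "=" / "."
# collation-name = ALPHA *253collation-char   (at most 254 characters)
COLLATION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9;=.\-]{0,253}")


def is_collation_name(name: str) -> bool:
    """Return True if name matches the collation-name production."""
    return COLLATION_NAME.fullmatch(name) is not None
```
]]></artwork></figure>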

</section><section title="Wildcards"><t>The string a client uses to select a collation MAY contain a
wildcard ("*") character which matches zero or more collation-chars.
Wildcard characters MUST NOT be adjacent.  Clients which support
disconnected operation SHOULD NOT use wildcards to select a collation,
but clients which provide collation operations only when connected to
the server MAY use wildcards.  If the wildcard string matches multiple
collations, the server SHOULD select the collation with the broadest
scope (preferably international scope), the most recent table versions
and the greatest number of supported operations.  A single wildcard
character ("*") refers to the application protocol collation behavior
that would occur if no explicit negotiation were used.</t>
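<t>The wildcard matching described above could be sketched in Python as follows.  This is a non-normative illustration: the function names are hypothetical, the server-side preference rules (scope, table versions, supported operations) are not modelled, and a lone "*" is treated simply as matching every name rather than as the protocol default.</t>
<figure><artwork type="python"><![CDATA[
```python
import re

# Character class corresponding to collation-char.
_COLLATION_CHAR = r"[A-Za-z0-9;=.\-]"


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    if "**" in pattern:
        raise ValueError("wildcard characters MUST NOT be adjacent")
    # Each "*" matches zero or more collation-chars; the literal
    # pieces between wildcards are matched exactly.
    literals = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile((_COLLATION_CHAR + "*").join(literals))


def matching_collations(pattern, available):
    """Return the names in 'available' that the pattern matches."""
    regex = wildcard_to_regex(pattern)
    return [name for name in available if regex.fullmatch(name)]
```
]]></artwork></figure>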
<figure>
<artwork type="abnf">
  collation-wild  =  ("*" / (ALPHA ["*"])) *(collation-char ["*"])
                      ; MUST NOT exceed 255 characters total
</artwork></figure></section>

<section title="Ordering Direction"><t>When used as a protocol element for ordering, the
collation name MAY be prefixed by either "+" or "-" to explicitly
specify an ordering direction.  The "+" prefix has no
effect on the ordering function, while "-" negates the result of the
ordering function.  In general, collation-order is used when a
client requests a collation, and collation-sel is used when the
server informs the client of the selected collation.</t>
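<t>A minimal, non-normative Python sketch of the direction prefix follows.  The helper names are hypothetical, and octet_order is only a stand-in ordering function over byte strings used to exercise the example; it is not offered as the definition of any registered collation.</t>
<figure><artwork type="python"><![CDATA[
```python
def split_direction(token):
    """Split an optional "+" or "-" prefix from a collation token."""
    if token[:1] in ("+", "-"):
        return token[0], token[1:]
    return "+", token  # no prefix behaves the same as "+"


def directed_order(token, order_fn, first, second):
    """Apply order_fn and negate its result when the prefix is "-"."""
    direction, _name = split_direction(token)
    result = order_fn(first, second)  # expected to be -1, 0 or +1
    return -result if direction == "-" else result


def octet_order(first, second):
    """Stand-in ordering function: compare two byte strings."""
    return (first > second) - (first < second)
```
]]></artwork></figure>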
<figure>
<artwork type="abnf">
  collation-sel   =  ["+" / "-"] collation-name
  
  collation-order =  ["+" / "-"] collation-wild
</artwork></figure></section>

<section title="URIs"><t>Some protocols are designed to use URIs to refer
to collations rather than simple tokens.  A special section of the
IANA web page is reserved for such usage.  The "collation-uri" form is
used to refer to a specific IANA registry entry for a specific named
collation (the collation registration may not actually be present if it is
experimental).  The "collation-auri" form is an abstract name for an ordering,
a comparator pattern or a vendor private comparator.</t><figure>
<artwork type="abnf">
  collation-uri   =  "http://www.iana.org/assignments/collation/"
                     collation-name ".xml"
  
  collation-auri  =  ( "http://www.iana.org/assignments/collation/"
                     collation-order [".xml"]) / other-uri
  
  other-uri       =  absoluteURI
                  ;  excluding the IANA collation namespace.
</artwork></figure></section>

<section title="Naming Guidelines"><t>While this specification makes no absolute requirements on the structure of collation names, naming consistency is important, so the following initial guidelines are provided.</t>

<t>Collation names with an international audience typically begin with
"i;".  Collation names intended for a particular language or locale
typically begin with a <xref target="RFC3066">language tag</xref>
followed by a ";".  After the first ";" is normally the name of the
general collation algorithm followed by a series of algorithm
modifications separated by the ";" delimiter.  Parameterized
modifications will use "=" to delimit the parameter from the value.  The
version numbers of any lookup tables used by the algorithm SHOULD be
present as parameterized modifications.</t>
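<t>Because these are guidelines rather than requirements, software should not rely on every name following them.  With that caveat, a best-effort Python sketch that splits a name along the guideline structure might look like this; the function name and the dictionary keys are hypothetical.</t>
<figure><artwork type="python"><![CDATA[
```python
def parse_collation_name(name):
    """Split a collation name along the ";" and "=" delimiters.

    Returns a dict with the leading tag ("i" or a language tag), the
    general algorithm name, unparameterized modifications, and
    parameterized modifications.
    """
    prefix, _, rest = name.partition(";")
    parts = rest.split(";") if rest else []
    algorithm = parts[0] if parts else ""
    flags = []
    params = {}
    for modification in parts[1:]:
        key, separator, value = modification.partition("=")
        if separator:
            params[key] = value  # e.g. "uv=3.2"
        else:
            flags.append(key)
    return {"prefix": prefix, "algorithm": algorithm,
            "flags": flags, "params": params}
```
]]></artwork></figure>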

<t>Collation names of the form *;vnd-domain.com;* are reserved for vendor-specific collations created by the owner of the domain name following the "vnd-" prefix.  Registration of such collations (or the name space as a whole) with intended use of "Vendor" is encouraged when a public specification or open-source implementation is available, but is not required.</t></section>
</section>

<section title="Collation Specification Requirements" anchor="compreq">
<section title="Operations Supported"><t>A collation specification MUST state which of the three basic functions are supported (equality, substring, ordering) and how to perform each of the supported functions on any two input character strings including empty strings (with the exception of the <xref target="octet">i;octet</xref> collation).  Collations must be deterministic, i.e., given a collation with a specific name, and any two fixed input strings, the result MUST be the same for the same operation.  Collations MUST be transitive.</t>

<section title="Equality"><t>The equality function always returns "match" or "no-match" when supplied valid input and MAY return "error" if the input strings are not valid character strings or violate other collation constraints.</t></section></section><section title="Substring"><t>The substring matching function determines if the first string is a substring of the second string.  A collation which supports substring matching will automatically support the two special cases of substring matching: prefix and suffix matching if those special cases are supported by the application protocol.  It returns "match" or "no-match" when supplied valid input and returns "error" when supplied invalid input.</t><t>Application protocols MAY return position information for substring matches.  If this is done, the position information MUST include both the starting offset and the ending offset in the string.  This is important because more sophisticated collations can match strings of unequal length (for example, a pre-composed accented character will match a decomposed accented character).</t></section><section title="Ordering"><t>The ordering function determines how two character strings are ordered.  It returns "-1" if the first string is listed before the second string according to the collation, "+1" if the second string is listed before the first string, and "0" if the two strings are equal.  If the order of the two strings is reversed, the result of the ordering function of the collation MUST be reversed, i.e. results which would be "+1" are instead "-1" and results which would be "-1" are instead "+1", while results which would be "0" stay "0". In general, collations SHOULD NOT return "0" unless the two character sequences are identical.</t><t>Since ordering is normally used to sort a list of items, "error" is not a useful return value from the ordering function.  
Strings with errors that prevent the sorting algorithm from functioning correctly should sort to the end of the list.  Thus if the first string is invalid while the second string is valid, the result will be "+1".  If the second string is invalid while the first string is valid, the result will be "-1".  If  both strings are invalid, the result SHOULD match the result from the "i;octet" collation.</t><t>When the collation is used with a "+" prefix, the behavior is the same as when used with no prefix.  When the collation is used with a "-" prefix, the result of the ordering function of the collation MUST be reversed.</t></section><section title="Internal Canonicalization Algorithm"><t>A collation specification MUST describe the internal
canonicalization algorithm.  This algorithm can be applied to individual
strings and the result strings can be stored to potentially optimize
future comparison operations.  A collation MAY specify that the canonicalization algorithm is the identity function.  The output of the canonicalization algorithm MAY have no meaning to a human.</t>
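<t>To illustrate how stored canonical forms can speed up later comparisons, the following non-normative Python sketch uses a purely hypothetical canonicalization (ASCII lowercase letters folded to uppercase, then UTF-8 encoding).  It does not describe any registered collation, and all names are assumptions made for the example.</t>
<figure><artwork type="python"><![CDATA[
```python
def canonicalize(text):
    """Hypothetical canonicalization: fold a-z to A-Z, encode as UTF-8."""
    folded = "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text
    )
    return folded.encode("utf-8")


def order(first, second):
    """Ordering function defined by comparing canonical forms."""
    key_first, key_second = canonicalize(first), canonicalize(second)
    return (key_first > key_second) - (key_first < key_second)


def sort_strings(strings):
    """Sort using canonical forms computed once per string."""
    return sorted(strings, key=canonicalize)
```
]]></artwork></figure>
<t>In this sketch the canonical form is computed once per string and reused for every comparison made during the sort, which is the optimization that stored canonical forms are meant to permit.</t>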

</section><section title="Use of Lookup Tables"><t>Collations which use more than one customizable lookup table in a
documented format MUST assign numbers to the tables they use.  This
permits an application protocol command to access the tables used by a
server collation.</t></section>

<section title="Treatment of NULL Strings"><t>Unless otherwise specified by the collation or application protocol, a NULL
string (as opposed to an empty string) is equal only to another NULL string, a NULL string is not a substring of any other string, and a NULL string sorts to
a position after all non-NULL strings, but before strings which generate errors.</t></section>

<section title="Multi-Value Attributes"><t>Some application protocols will permit the use of multi-value attributes with a collation.  This paragraph describes the rules that apply unless otherwise specified by the collation or application protocol.  In the case of the equality and substring operation, the operations are applied over each pair of single values from the two inputs.  If any combination produces an error, the result is an error.  Otherwise, if any combination produces a "match", the result is a match.  Otherwise the result is "no-match".  For the ordering function, the smallest ordinal character string from the first set of values is compared to the smallest ordinal character string from the second set of values.</t>
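<t>The default multi-value rules above can be sketched in Python as follows.  This is a non-normative illustration with hypothetical names.  It assumes that "smallest" means smallest according to the collation's own ordering function, and it does not define behavior for an empty set of values.</t>
<figure><artwork type="python"><![CDATA[
```python
import functools

MATCH, NO_MATCH, ERROR = "match", "no-match", "error"


def multi_value_test(test_fn, first_values, second_values):
    """Combine an equality or substring test over all value pairs."""
    results = [test_fn(a, b) for a in first_values for b in second_values]
    if ERROR in results:
        return ERROR      # any error makes the whole result an error
    if MATCH in results:
        return MATCH      # otherwise any match is a match
    return NO_MATCH


def multi_value_order(order_fn, first_values, second_values):
    """Order two value sets by comparing their smallest members."""
    as_key = functools.cmp_to_key(order_fn)
    smallest_first = min(first_values, key=as_key)
    smallest_second = min(second_values, key=as_key)
    return order_fn(smallest_first, smallest_second)
```
]]></artwork></figure>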

</section>


</section>

<section title="Application Protocol Requirements" anchor="appreq">
<t>This section describes the requirements and issues that an  application protocol which offers searching, substring matching and/or sorting and permits the use of characters outside the US-ASCII charset needs to consider.</t>

<section title="Character Encoding"><t>The protocol specification has to make sure that it is clear on which characters (rather than just octets) the collations are used. This can be done by specifying the protocol itself in terms of characters (e.g. in the case of a query language), by specifying a single character encoding for the protocol (e.g. UTF-8 <xref target="RFC3629"></xref>), or by carefully describing the relevant issues of character encoding labeling and conversion. 
In the latter case, details to consider include how to handle unknown charsets, any charsets
which are mandatory-to-implement, any issues with byte-order that might
apply, and any transfer encodings which need to be supported.</t></section><section title="Operations"><t>The protocol must specify which of the operations defined in this specification (equality matching, substring matching and
ordering) can be invoked in the protocol, and how they are invoked. There may be more than one way to invoke an operation.</t><t>The protocol MUST provide a mechanism for the client to select the
collation to use with equality matching, substring matching and
ordering.</t><t>If the protocol provides positional information for the results of a
substring match, that positional information MUST fully specify the
substring in the result that matches independent of the length of the
search string.  For example, returning both the starting and ending
offset of the match would suffice, as would the starting offset and a
length.  Returning just the starting offset is not acceptable.  This rule is necessary because advanced collations can treat strings of different lengths as equal (for example, pre-composed and decomposed accented characters).</t></section>

<section title="Wildcards"><t>The protocol MUST specify whether it allows the use of wildcards in collation identifiers or not. If the protocol allows wildcards, then:<list><t>The protocol MUST specify how comparisons behave in the absence of explicit collation negotiation or when a collation of "*" is requested.  The protocol MAY specify that the default collation used in such
circumstances is sensitive to server configuration.</t><t>The protocol SHOULD provide a way to list available collations
matching a given wildcard pattern or patterns.</t></list></t>

</section>





<section title="Canonicalization Function"><t>If the protocol provides a canonicalization function for strings, then
use of collations MAY be appropriate for that function. [Need to describe how that would be done.]</t></section>

<section title="Disconnected Clients"><t>If the protocol supports disconnected clients, then a mechanism for
the client to precisely replicate the server's collation algorithm is
likely desirable.  Thus the protocol MAY wish to provide a command to
fetch lookup tables used by charset conversions and collations.</t></section>

<section title="Error Codes"><t>The protocol specification should consider assigning protocol error
codes for the following circumstances:

<list style="symbols">
<t>The client requests the use of a collation by name or pattern, but no implemented collation matches that pattern.</t>
<t>The client attempts to use a collation for a function that is not supported by that collation.  For example, attempting to use the "i;ascii-numeric" collation for a substring matching function.</t>
<t>The client uses an equality or substring matching collation and the result is an error.  It may be appropriate to distinguish between the two input strings, particularly when one is supplied by the client and one is stored by the server.  It might also be appropriate to distinguish the specific case of an invalid UTF-8 string.</t>
</list></t></section>

<section title="Octet Collation"><t>If the protocol permits the use of the <xref target="octet">i;octet</xref> collation, it has to say so. The octet collation SHOULD NOT be used unless the protocol uses UTF-8 as its single character encoding.</t><t>If the protocol permits the use of collations with data structures
beyond those described in this specification ([is the following a list of  described data structures, or of undescribed data structures???] octet strings, NULL
string, array of octet strings), the protocol MUST describe the default
behavior for a collation with that data structure.</t></section>
</section>



<section title="Use by ACAP and Sieve">
<t>Both <xref target="RFC2244">ACAP</xref> and <xref target="RFC3028">Sieve</xref> are standards track specifications which used collations prior to the creation of this specification and registry.  Those standards do not meet all the application protocol requirements described in <xref target="appreq"/>.  For backwards compatibility, those protocols use the "i;ascii-casemap" instead of "en;ascii-casemap". [have to check whether the following is true:] These protocols allow the use of the <xref target="octet">i;octet</xref> collation working directly on UTF-8 data as used in these protocols.</t>
</section>

<section title="Collation Registration" anchor="registry">

<section title="Collation Registration Procedure">
<t>IANA will create a mailing list collation@iana.org which can be used for public discussion of collation proposals prior to registration.  Use of the mailing list is encouraged but not required.  The actual registration procedure will not begin until the completed registration template is sent to iana@iana.org.  The IESG will appoint a designated expert who will monitor the collation@iana.org mailing list and review registrations forwarded from IANA.  The designated expert is expected to tell IANA and the submitter of the registration within two weeks whether the registration is approved, approved with minor changes, or rejected with cause.  When a registration is rejected with cause, it can be re-submitted if the concerns listed in the cause are addressed.  Decisions made by the designated expert can be appealed to the IESG and subsequently follow the normal appeals procedure for IESG decisions.</t>

<t>Collation registrations in a standards track, BCP or IESG-approved experimental RFC are owned by the IETF, and changes to the registration follow normal procedures for updating such documents.  Collation registrations in other RFCs are owned by the RFC author(s).  Other collation registrations are owned by the individual(s) listed in the contact field of the registration and IANA will preserve this information.  Changes to a registration MUST be approved by the owner.  In the event the owner cannot be contacted for a period of one month and a change is deemed necessary, the IESG MAY re-assign ownership to an appropriate party.</t>

</section>

<section title="Collation Registration Format">
<t>Registration of a collation is done by sending a well-formed XML document that validates with <xref target="DTD">collationreg.dtd</xref>.</t><section title="Registration Template"><figure><preamble>Here is a template for the registration:</preamble>
<artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="YYYY" scope="i18n" intendedUse="common">
  <name>collation name</name>
  <title>technical title for collation</title>
  <functions>equality order substring</functions>
  <specification>specification reference</specification>
  <owner>email address of owner or IETF</owner>
  <submitter>email address of submitter</submitter>
  <version>1</version>
  <UnicodeVersion>3.2</UnicodeVersion>
  <UCAVersion>3.1.1</UCAVersion>
</collation>
]]></artwork></figure></section><section title="The collation Element"><t>The root of the  registration document MUST be a &lt;collation&gt; element. The  collation element contains the other elements in the
registration, which are described in the following subsections, in the order given here.</t>
<t>The &lt;collation&gt; element MAY include an "rfc=" attribute if the specification is in an RFC. The "rfc=" attribute gives only the number of the RFC, without any prefix, such as "RFC", or suffix, such as ".txt".</t>
<t>The &lt;collation&gt; element MUST include a "scope=" attribute, which MUST have one of the values "i18n", "local" or "other".</t>
<t>The &lt;collation&gt; element MUST include an "intendedUse=" attribute, which MUST have one of the values "common", "limited", "vendor", or "deprecated". Collation specifications intended for "common" use are expected to reference standards from standards bodies with significant experience dealing with the details of international character sets.</t>
<t>Be aware that future revisions of this specification may add additional function types, as well as additional XML attributes and values.  Any system which automatically parses these XML documents MUST take this into account
to preserve future compatibility. A DTD for the current definition of the collation registration template is given in <xref target="DTD"></xref>.</t></section>

<section title="The name Element"><t>The &lt;name&gt; element gives the precise
name of the
collation. The &lt;name&gt; element is mandatory.</t></section>
<section title="The title Element"><t>The &lt;title&gt; element gives the title of the
collation. The &lt;title&gt; element is mandatory.</t></section>
<section title="The functions Element"><t>The &lt;functions&gt; element lists which of the three
functions the collation provides. The &lt;functions&gt; element is mandatory.</t></section>
<section title="The specification Element"><t>The &lt;specification&gt; element
describes where to find the specification. The &lt;specification&gt; element is mandatory. It MAY have a "uri" attribute. [check that the following is really true; it reflects what is currently in the DTD; also, say what it means/in what cases it should be used] There may be more than one &lt;specification&gt; element.</t></section>
<section title="The submitter Element"><t>The &lt;submitter&gt; element provides an RFC 2822 email address for the person
who submitted the registration.  It is optional if the &lt;owner&gt; element
contains an email address. [check that the following is really true; it reflects what is currently in the DTD; also, say what it means/in what cases it should be used] There may be more than one &lt;submitter&gt; element.</t></section>
<section title="The owner Element"><t>The &lt;owner&gt; element contains either
the four letters "IETF" or an email address of the owner of the
registration. The &lt;owner&gt; element is mandatory. [check that the following is really true; it reflects what is currently in the DTD; also, say what it means/in what cases it should be used] There may be more than one &lt;owner&gt; element.</t></section>
<section title="The version Element"><t>The &lt;version&gt; element is included when the
registration is likely to be revised or has been revised in such a way
that the results change for certain input strings. The &lt;version&gt; element is optional.</t></section>
<section title="The UnicodeVersion Element"><t>The &lt;UnicodeVersion&gt; element indicates the version number of the UnicodeData
file on which the collation is based. The &lt;UnicodeVersion&gt; element is optional.</t></section>
<section title="The UCAVersion Element"><t>The &lt;UCAVersion&gt; element
specifies the version of the Unicode Collation Algorithm on which the
collation is based. The &lt;UCAVersion&gt; element is optional.</t></section>
<section title="The UCAMatchLevel Element"><t>The &lt;UCAMatchLevel&gt; element specifies the
number of Unicode Collation Algorithm sort key levels used for the
equality and substring operations. The &lt;UCAMatchLevel&gt; element is optional.</t></section>




</section>

<section title="DTD for Collation Registration" anchor="DTD">
<?rfc needLines='30'?><figure><artwork><![CDATA[
<!--
  DTD for Collation Registration Document

  Data types:

  entity      description
  ======      ===========
  NUMBER      [0-9]+
  URI         As defined in RFC YYYY
  CTEXT       printable ASCII text (no line-terminators)
  TEXT        character data
  -->
<!ENTITY % NUMBER        "CDATA">
<!ENTITY % URI           "CDATA">
<!ENTITY % CTEXT         "#PCDATA">
<!ENTITY % TEXT          "#PCDATA">
<!ELEMENT collation      (name,title,functions,specification+,owner+,
                          submitter*,version?,UnicodeVersion?,
                          UCAVersion?,UCAMatchLevel?)>
<!ATTLIST collation
          rfc            %NUMBER;                           "0"
          scope          (i18n|local|other)                 #IMPLIED
          intendedUse    (common|limited|vendor|deprecated) #IMPLIED>
<!ELEMENT name           (%CTEXT;)>
<!ELEMENT title          (%CTEXT;)>
<!ELEMENT functions      (%CTEXT;)>
<!ELEMENT specification  (%TEXT;)>
<!ATTLIST specification
          uri            %URI;                              "">
<!ELEMENT owner          (%CTEXT;)>
<!ELEMENT submitter      (%CTEXT;)>
<!ELEMENT version        (%CTEXT;)>
<!ELEMENT UnicodeVersion (%CTEXT;)>
<!ELEMENT UCAVersion     (%CTEXT;)>
<!ELEMENT UCAMatchLevel  (%CTEXT;)>
]]></artwork></figure>
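<t>Full validation against this DTD requires a validating XML parser.  As a lighter, non-normative illustration, the following Python sketch uses only the standard library to check well-formedness, the order of child elements, and the two attributes that the prose requires (the DTD itself declares them as optional).  The names check_registration and CONTENT_MODEL are hypothetical.</t>
<figure><artwork type="python"><![CDATA[
```python
import re
import xml.etree.ElementTree as ET

# Child-element order from the DTD, written over a comma-terminated
# list of child tag names.
CONTENT_MODEL = re.compile(
    r"name,title,functions,(specification,)+(owner,)+(submitter,)*"
    r"(version,)?(UnicodeVersion,)?(UCAVersion,)?(UCAMatchLevel,)?"
)
SCOPES = {"i18n", "local", "other"}
INTENDED_USES = {"common", "limited", "vendor", "deprecated"}


def check_registration(xml_text):
    """Return a list of problems found; an empty list means none."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    if root.tag != "collation":
        return ["root element is not <collation>"]
    problems = []
    sequence = "".join(child.tag + "," for child in root)
    if not CONTENT_MODEL.fullmatch(sequence):
        problems.append("child elements do not follow the DTD order")
    if root.get("scope") not in SCOPES:
        problems.append("missing or invalid scope attribute")
    if root.get("intendedUse") not in INTENDED_USES:
        problems.append("missing or invalid intendedUse attribute")
    return problems
```
]]></artwork></figure>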
</section><section title="Structure of Collation Registry">
<t>Once the registration is approved, IANA will store each XML
registration document at a URL of the form
http://www.iana.org/assignments/collation/collation-name.xml where
collation-name is the contents of the name element in the registration.
Both the submitter and the designated expert are responsible for
verifying that the XML is well-formed and complies with the DTD.  In the
future, it is hoped IANA will take over XML verification responsibility
from the designated expert.</t>
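<t>The mapping between a collation name and its registration URL is mechanical.  A non-normative Python sketch, with hypothetical function names, is:</t>
<figure><artwork type="python"><![CDATA[
```python
REGISTRY_BASE = "http://www.iana.org/assignments/collation/"


def registration_url(collation_name):
    """Build the registry URL for a registered collation name."""
    return REGISTRY_BASE + collation_name + ".xml"


def name_from_url(url):
    """Recover the collation name from a registry URL, or None."""
    if url.startswith(REGISTRY_BASE) and url.endswith(".xml"):
        return url[len(REGISTRY_BASE):-len(".xml")]
    return None
```
]]></artwork></figure>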

<t>IANA will also maintain a text summary of the registry under the name
http://www.iana.org/assignments/collation/summary.txt.  This
summary is divided into four sections.  The first section is for
collations intended for common use.  This section is intended for
collation registrations published in IESG approved RFCs or for locally
scoped collations from the primary standards body for that locale.  The
designated expert is encouraged to reject collation registrations with
an intended use of "common" if the expert believes it should be
"limited", as it is desirable to keep the number of "common"
registrations small and high quality.  The second section is reserved
for limited use collations.  The third section is reserved for
registered vendor specific collations.  The final section is reserved
for deprecated collations.</t>
</section>
<section title="Example Initial Registry Summary">
<figure><preamble>The following is an example of how IANA might structure the
initial registry summary.txt file:</preamble>
<artwork type="table">
  Collation                              Functions Scope Reference
  ---------                              --------- ----- ---------
Common Use Collations:
  i;nameprep;v=1;uv=3.2                  e, o, s   i18n  [RFC XXXX]
  i;basic;uca=3.1.1;uv=3.2               e, o, s   i18n  [RFC XXXX]
  i;basic;uca=3.1.1;uv=3.2;match=accent  e, o, s   i18n  [RFC XXXX]
  i;basic;uca=3.1.1;uv=3.2;match=case    e, o, s   i18n  [RFC XXXX]
  en;ascii-casemap                       e, o, s   Local [RFC XXXX]

Limited Use Collations:
  i;octet                                e, o, s   Other [RFC XXXX]
  i;ascii-numeric                        e, o      Other [RFC XXXX]

Vendor Collations:

Deprecated Collations:
  i;ascii-casemap                        e, o, s   Local [RFC XXXX]


References
----------
[RFC XXXX]  Newman, C., "Internet Application Protocol Collation
            Registry", RFC XXXX, Sun Microsystems, October 2003.
</artwork></figure>
</section>

</section>
<section title="Guidelines for Expert Reviewer">
<t>The expert reviewer appointed by the IESG has fairly broad latitude for this registry.  While a number of collations are expected (particularly customizations of the basic collation for localized use), an explosion of collations (particularly common use collations) is not desirable for widespread interoperability.  However, it is important for the expert reviewer to provide cause when rejecting a registration, and when possible to describe corrective action to permit the registration to proceed.  The following list includes some example reasons to reject a registration with cause:
<list style="symbols">
<t>The registration is not a well-formed XML document that follows the DTD.</t>
<t>The registration has intended use of "common", but there is no evidence the collation will be widely deployed, so it should be listed as "limited".</t>
<t>The registration has intended use of "common", but is redundant with the functionality of a previously registered "common" collation.</t>
<t>The collation name fails to precisely identify the version numbers of relevant tables to use.</t>
<t>The registration fails to meet one of the "MUST" requirements in <xref target="compreq"/>.</t>
<t>The collation name fails to meet the syntax in <xref target="syntax"/>.</t>
<t>The collation specification referenced in the registration is vague or has optional features without a clear behavior specified.</t>
<t>The referenced specification does not adequately address security considerations specific to that collation.</t>
</list></t>
</section><section title="Initial Collations" anchor="initial">
<t>This section describes an initial set of collations for the collation registry.</t>



<?rfc needLines="3"?>
<section title="ASCII Numeric Collation" anchor="numeric">
<section title="ASCII Numeric Collation Description"><t>The "i;ascii-numeric" collation is a simple collation intended for use with arbitrarily sized decimal numbers stored as octet strings of US-ASCII digits (0x30 to 0x39).  It supports equality and ordering, but does not support the substring function.  The algorithm is as follows:
<list style="numbers">
<t>If neither string begins with a digit, return "error" if matching, or the result of the "i;octet" collation for ordering.</t>
<t>If the first string begins with a digit and the second string does not, return "error" if matching and "-1" for ordering.</t>
<t>If the second string begins with a digit and the first string does not,
return "error" if matching and "+1" for ordering.</t>
<t>Let "n" be the number of digits at the beginning of the first string, and "m" be the number of digits at the beginning of the second string.</t>
<t>If n is equal to m, return the result of the "i;octet" collation.</t>
<t>If n is greater than m, prepend a string of "n - m" zeros to the second string and return the result of the "i;octet" collation.</t>
<t>If m is greater than n, prepend a string of "m - n" zeros to the first string and return the result of the "i;octet" collation.</t>
</list></t>
<t>The associated canonicalization algorithm is to truncate the input string at the first non-digit character.</t>
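<t>The steps above can be sketched in Python as follows.  This is a minimal, non-normative illustration: the function names are hypothetical, strings are modelled as Python bytes objects, and the sketch follows the steps literally by applying "i;octet" to the whole (padded) strings rather than only to the leading digits.</t>
<figure><artwork type="python"><![CDATA[
```python
def _octet_order(a: bytes, b: bytes) -> int:
    """Stand-in for the "i;octet" ordering function: -1, 0 or +1."""
    if a == b:
        return 0
    return -1 if a < b else 1


def _leading_digits(s: bytes) -> int:
    """Count the US-ASCII digits (0x30 to 0x39) at the start of s."""
    n = 0
    while n < len(s) and 0x30 <= s[n] <= 0x39:
        n += 1
    return n


def numeric_order(a: bytes, b: bytes) -> int:
    """Ordering function of the numeric collation (hypothetical name)."""
    n, m = _leading_digits(a), _leading_digits(b)
    if n == 0 and m == 0:
        return _octet_order(a, b)   # step 1: neither begins with a digit
    if m == 0:
        return -1                   # step 2: only the first is numeric
    if n == 0:
        return 1                    # step 3: only the second is numeric
    # Steps 4 to 7: left-pad the string with fewer leading digits.
    if n > m:
        b = b"0" * (n - m) + b
    elif m > n:
        a = b"0" * (m - n) + a
    return _octet_order(a, b)


def numeric_equal(a: bytes, b: bytes) -> str:
    """Equality function: "match", "no-match" or "error"."""
    if _leading_digits(a) == 0 or _leading_digits(b) == 0:
        return "error"              # steps 1 to 3 when matching
    return "match" if numeric_order(a, b) == 0 else "no-match"


def numeric_canonicalize(s: bytes) -> bytes:
    """Truncate the input at the first non-digit octet."""
    return s[:_leading_digits(s)]
```
]]></artwork></figure>
<t>For example, numeric_order(b"9", b"10") pads the first string to "09" and therefore orders it before "10", whereas a plain octet comparison would order "9" after "10".</t>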
</section><section title="ASCII Numeric Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="other" intendedUse="limited">
  <name>i;ascii-numeric</name>
  <title>ASCII Numeric</title>
  <functions>equality order</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
</collation>
]]></artwork></figure>
</section></section>

<section title="ASCII Casemap Collation" anchor="casemap">
<section title="ASCII Casemap Collation Description"><t>The "en;ascii-casemap" collation is a simple collation intended for use with English language text in pure US-ASCII.  It provides equality, substring and ordering functions.  The algorithm first applies a canonicalization algorithm to both input strings which subtracts 32 (0x20) from all octet values between 97 (0x61) and 122 (0x7A) inclusive.  The result of the collation is then the same as the result of the "i;octet" collation for the canonicalized strings.  Care should be taken when using OS-supplied functions to implement
this collation: the collation itself is not locale sensitive, but functions such as strcasecmp and toupper can be.</t>

<t>For historical reasons, in the context of ACAP and Sieve, the name "i;ascii-casemap" is a synonym for this collation.</t>
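<t>The canonicalization and comparison described above can be sketched in Python as follows.  This is a non-normative illustration with hypothetical function names; it maps octets arithmetically instead of calling a case conversion routine, so the result cannot depend on the process locale.</t>
<figure><artwork type="python"><![CDATA[
```python
def casemap_canonicalize(s: bytes) -> bytes:
    """Subtract 0x20 from every octet in the range 0x61 to 0x7A."""
    return bytes(o - 0x20 if 0x61 <= o <= 0x7A else o for o in s)


def casemap_order(a: bytes, b: bytes) -> int:
    """Octet ordering (-1, 0 or +1) of the canonicalized strings."""
    ca, cb = casemap_canonicalize(a), casemap_canonicalize(b)
    if ca == cb:
        return 0
    return -1 if ca < cb else 1


def casemap_equal(a: bytes, b: bytes) -> str:
    """Equality function: "match" or "no-match"."""
    return "match" if casemap_order(a, b) == 0 else "no-match"


def casemap_substring(needle: bytes, haystack: bytes) -> str:
    """Substring function: is the first string inside the second?"""
    found = casemap_canonicalize(needle) in casemap_canonicalize(haystack)
    return "match" if found else "no-match"
```
]]></artwork></figure>
<t>Octets outside the range 0x61 to 0x7A, including all octets above 0x7F, pass through unchanged, so non-ASCII data is compared octet by octet.</t>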
</section>




<section title="Legacy English Casemap Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="local" intendedUse="deprecated">
  <name>i;ascii-casemap</name>
  <title>Legacy English Casemap</title>
  <functions>equality order substring</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
</collation>
]]></artwork></figure>
</section>

<section title="English Casemap Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="local" intendedUse="common">
  <name>en;ascii-casemap</name>
  <title>English Casemap</title>
  <functions>equality order substring</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
</collation>
]]></artwork></figure>
</section></section>

<?rfc needLines="3"?>
<section title="Nameprep Collation" anchor="nameprep">
<section title="Nameprep Collation Description"><t>The "i;nameprep;v=1;uv=3.2" collation is an implementation of the
<xref target="RFC3491">nameprep</xref> specification based on normalization tables from
Unicode version 3.2.  This collation applies the nameprep canonicalization function to
both input strings and then returns the result of the "i;octet" collation on the
canonicalized strings.  While this collation offers all three functions, the ordering
function it provides is inadequate for use by the majority of the world.</t>
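<t>As a non-normative illustration, the following Python sketch builds the equality and ordering functions on the nameprep function in the standard library module encodings.idna, which is based on the Unicode 3.2 tables.  Several points are assumptions of this sketch rather than statements of this specification: the function names are hypothetical, the canonicalized string is encoded as UTF-8 before the octet comparison, a string that nameprep rejects is reported as "error" by the equality function, and the library routine treats unassigned code points as allowed.</t>
<figure><artwork type="python"><![CDATA[
```python
from encodings.idna import nameprep


def nameprep_key(s: str) -> bytes:
    """Canonicalize with nameprep, then encode for octet comparison.

    Raises UnicodeError if the string contains characters that
    nameprep prohibits.  UTF-8 is an assumption of this sketch.
    """
    return nameprep(s).encode("utf-8")


def nameprep_equal(a: str, b: str) -> str:
    """Equality function: "match", "no-match" or "error"."""
    try:
        ka, kb = nameprep_key(a), nameprep_key(b)
    except UnicodeError:
        # Assumption: a failed canonicalization maps to "error".
        return "error"
    return "match" if ka == kb else "no-match"


def nameprep_order(a: str, b: str) -> int:
    """Ordering function: octet order of the canonicalized strings."""
    ka, kb = nameprep_key(a), nameprep_key(b)
    if ka == kb:
        return 0
    return -1 if ka < kb else 1
```
]]></artwork></figure>
<t>Because the ordering is a plain octet comparison of the canonicalized strings, it follows code point order rather than any language's expectations, which illustrates why the ordering function is described as inadequate for most users.</t>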

<t>Version number 1 is applied to nameprep as specified in RFC 3491. If the nameprep
specification is revised without any changes that would produce different results when
given the same pair of input octet strings, then the version number will remain
unchanged.</t>

<texttable><preamble>The table numbers for tables used by nameprep are as follows:</preamble>
<ttcol align="right">Table Number</ttcol><ttcol>Table Name</ttcol>
<c> 1</c><c>UnicodeData-3.2.0.txt</c>
<c> 2</c><c>Table B.1</c>
<c> 3</c><c>Table B.2</c>
<c> 4</c><c>Table C.1.2</c>
<c> 5</c><c>Table C.2.2</c>
<c> 6</c><c>Table C.3</c>
<c> 7</c><c>Table C.4</c>
<c> 8</c><c>Table C.5</c>
<c> 9</c><c>Table C.6</c>
<c>10</c><c>Table C.7</c>
<c>11</c><c>Table C.8</c>
<c>12</c><c>Table C.9</c>
</texttable></section>
<section title="Nameprep Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="i18n" intendedUse="common">
  <name>i;nameprep;v=1;uv=3.2</name>
  <title>Nameprep</title>
  <functions>equality order substring</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
  <version>1</version>
  <UnicodeVersion>3.2</UnicodeVersion>
</collation>
]]></artwork></figure>
</section></section>


<section title="Basic Collation" anchor="basic">
<section title="Basic Collation Description"><t>The basic collation is intended to provide tolerable results for a number of languages for all three functions (equality, substring and ordering) so it is suitable as a mandatory-to-implement collation for protocols which include ordering support.  The ordering function of the basic collation is the <xref target="unicode-tr10v9">Unicode Collation Algorithm</xref> version 9 (UCAv9).</t>

<t>The equality and substring functions are created as described in UCAv9 section 8.  While that section is informative to UCAv9, it is normative to this collation specification.</t>

<t>This collation is based on Unicode version 3.2 and uses the following tables:

<list style="numbers">

<t>For the normalization step,
&lt;http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.txt&gt;
is used.  Column 5 is used to determine the canonical decomposition, while column 3
contains the canonical combining classes necessary to attain canonical order.</t>

<t>The table of characters which require a logical order exception is a subset of the
table in &lt;http://www.unicode.org/Public/3.2-Update/PropList-3.2.0.txt&gt; and is
included here:
<figure><artwork type="utable">
0E40..0E44    ; Logical_Order_Exception
# Lo   [5] THAI CHARACTER SARA E..THAI CHARACTER SARA AI MAIMALAI
0EC0..0EC4    ; Logical_Order_Exception
# Lo   [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI

# Total code points: 10
</artwork></figure></t>

<t>The table used to translate normalized code points to a sort key is
&lt;http://www.unicode.org/reports/tr10/allkeys-3.1.1.txt&gt;.</t>

</list></t>

<t>UCAv9 includes a number of configurable parameters and steps labelled as potentially optional.  The following list summarizes the defaults used by this collation:

<list style="symbols">

<t>The logical order exception step is mandatory by default to support the largest number of languages.</t>

<t>Steps 2.1.1 to 2.1.3 are mandatory as the repertoire of the basic collation is intended to be large.</t>

<t>The second level in the sort key is evaluated forwards by default.</t>

<t>The variable weighting uses the "non-ignorable" option by default.</t>

<t>The semi-stable option is not used by default.</t>

<t>Support for exactly three levels of collation is the default behavior.</t>

<t>No preprocessing step is used by the basic collation prior to applying the UCAv9 algorithm.  Note that an application protocol specification MAY require pre-processing prior to the use of any collations.</t>

<t>The equality and substring algorithms exclude differences at levels 2 and 3 by default (thus they are case-insensitive and ignore accentual distinctions).</t>

<t>The equality and substring algorithms use the "Whole Characters Only" feature described in UCAv9 section 8 by default.</t>

</list></t>

<t>The exact collation name with these defaults is "i;basic;uca=3.1.1;uv=3.2".  When a specification states that the basic collation is mandatory-to-implement, only this specific name is mandatory-to-implement.</t>

<figure><preamble>In order to allow modification of the optional behaviors, the following ABNF is used for variations of the basic collation:</preamble>
<artwork type="abnf">
  basic-collation  =  ("i" / Language-Tag) ";basic;uca=3.1.1;uv=3.2"
                      [";match=accent" / ";match=case"]
                      [";tailor=" 1*collation-char ]
</artwork></figure>

<t>If multiple modifiers appear, they MUST appear in the order described above.  The modifiers have the following meanings:

<list style="hanging" hangIndent="15">

<t hangText="match=accent">Both the first and second levels of the sort keys are considered relevant to the equality and substring operations (rather than the default of first level only).  This makes the matching functions sensitive to accentual distinctions.</t>

<t hangText="match=case">The first three levels of sort keys are considered relevant to the equality and substring operations.  This makes the matching functions sensitive to both case and accentual distinctions.</t>

</list></t>
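<t>The naming rule and the modifiers above can be sketched as a small Python parser.  This is a non-normative illustration under stated assumptions: the function and variable names are hypothetical, Language-Tag is approximated by the RFC 3066 pattern, "collation-char" is approximated as letters, digits, ".", "=" and "-" (the authoritative rule is the one defined in this specification's syntax section), and literals are matched case-sensitively, which is stricter than ABNF requires.</t>
<figure><artwork type="python"><![CDATA[
```python
import re

# RFC 3066 shape: 1*8ALPHA *( "-" 1*8(ALPHA / DIGIT) ).  The bare
# "i" alternative of the ABNF is already covered by this pattern.
LANGUAGE_TAG = r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*"

# Assumption of this sketch, not the normative collation-char rule.
COLLATION_CHAR = r"[A-Za-z0-9.=-]"

BASIC_NAME = re.compile(
    "".join([
        r"(?P<lang>", LANGUAGE_TAG, r")",
        r";basic;uca=3\.1\.1;uv=3\.2",
        r"(?:;match=(?P<match>accent|case))?",
        r"(?:;tailor=(?P<tailor>", COLLATION_CHAR, r"+))?",
    ])
)

# Number of sort key levels that matter for equality and substring.
MATCH_LEVEL = {None: 1, "accent": 2, "case": 3}


def parse_basic_name(name: str):
    """Return the parts of a basic collation name, or None."""
    m = BASIC_NAME.fullmatch(name)
    if m is None:
        return None
    return {
        "language": m.group("lang"),
        "match_level": MATCH_LEVEL[m.group("match")],
        "tailor": m.group("tailor"),
    }
```
]]></artwork></figure>
<t>Because the optional parts are matched in a fixed sequence, a name that places ";tailor=" before ";match=" is rejected, which reflects the ordering requirement on modifiers.</t>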

<t>The default weighting option is "non-ignorable".  The "semi-stable"
sort key option is not used by default.</t>

<t>The canonicalization algorithm associated with this collation is the
output of step 3 of the UCAv9 algorithm (described in section 4.3 of the
UCA specification).  This canonicalization is not suitable for human
consumption.</t>

<t>Finally, the UCAv9 algorithm permits the "allkeys" table to be
tailored to a language.  People who make quality tailorings are
encouraged to register those tailorings using the collation registry. 
Tailoring names beginning with "x" are reserved for experimental use,
are treated as "Limited use" and MUST NOT match wildcards if any
registered collation is available that does match.</t>
</section><section title="Basic Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="i18n" intendedUse="common">
  <name>i;basic;uca=3.1.1;uv=3.2</name>
  <title>Basic</title>
  <functions>equality order substring</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
  <UnicodeVersion>3.2</UnicodeVersion>
  <UCAVersion>3.1.1</UCAVersion>
  <UCAMatchLevel>1</UCAMatchLevel>
</collation>
]]></artwork></figure>
</section>

<section title="Basic Accent Sensitive Match Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="XXXX" scope="i18n" intendedUse="common">
  <name>i;basic;uca=3.1.1;uv=3.2;match=accent</name>
  <title>Basic Accent Sensitive Match</title>
  <functions>equality order substring</functions>
  <specification>RFC XXXX</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
  <UnicodeVersion>3.2</UnicodeVersion>
  <UCAVersion>3.1.1</UCAVersion>
  <UCAMatchLevel>2</UCAMatchLevel>
</collation>
]]></artwork></figure>
</section>

<section title="Basic Case Sensitive Match Collation Registration">
<figure><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="XXXX" scope="i18n" intendedUse="common">
  <name>i;basic;uca=3.1.1;uv=3.2;match=case</name>
  <title>Basic Case Sensitive Match</title>
  <functions>equality order substring</functions>
  <specification>RFC XXXX</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
  <UnicodeVersion>3.2</UnicodeVersion>
  <UCAVersion>3.1.1</UCAVersion>
  <UCAMatchLevel>3</UCAMatchLevel>
</collation>
]]></artwork></figure>
</section></section>
<section title="Octet Collation" anchor="octet">
<section title="Octet Collation Description">
<t>The "i;octet" collation is a simple and fast collation intended for use on
binary octet strings rather than on character data.
It is the only such collation; it is not possible to register additional collations with this property. Protocols that want to make this collation available have to do so by explicitly allowing it. If not explicitly allowed, it MUST NOT be used. It never returns an "error" result.  It provides equality, substring and ordering
functions.</t>
<t>  The ordering algorithm is as follows:
<list style="numbers">
<t>If both strings are the empty string, return the result "0".</t>
<t>If the first string is empty and the second is not,
   return the result "-1".</t>
<t>If the second string is empty and the first is not,
   return the result "+1".</t>
<t>If both strings begin with the same octet value, remove the first
   octet from both strings and repeat this algorithm from step 1.</t>
<t>If the unsigned value (0 to 255) of the first octet of the first string is less than the unsigned value of the first octet of the second string, then return "-1".</t>
<t>If this step is reached, return "+1".</t>
</list></t>
<t>This algorithm is roughly equivalent to the C library function memcmp with appropriate length checks added.</t>
<t>The equality function returns "match" if the ordering algorithm would return "0".  Otherwise the equality function returns "no-match".</t>
<t>The substring function returns "match" if the first string is the empty string, or if there exists a substring of the second string of length equal to the length of the first string which would result in a "match" result from the equality function.  Otherwise the substring function returns "no-match".</t>
<t>The associated canonicalization algorithm is the identity function.</t>
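<t>The three functions described above can be sketched in Python as follows.  This is a non-normative illustration with hypothetical function names; it relies on the fact that Python compares bytes objects lexicographically by unsigned octet value, with a proper prefix ordering first, which gives the same result as the numbered steps.</t>
<figure><artwork type="python"><![CDATA[
```python
def octet_order(a: bytes, b: bytes) -> int:
    """Ordering function: -1, 0 or +1 by unsigned octet value."""
    if a == b:
        return 0
    return -1 if a < b else 1


def octet_equal(a: bytes, b: bytes) -> str:
    """Equality function: "match" exactly when the ordering is 0."""
    return "match" if octet_order(a, b) == 0 else "no-match"


def octet_substring(first: bytes, second: bytes) -> str:
    """Substring function: does the first string occur in the second?

    The "in" operator on bytes tests for a contiguous run of octets
    and is true for an empty first string.
    """
    return "match" if first in second else "no-match"


def octet_canonicalize(s: bytes) -> bytes:
    """The canonicalization algorithm is the identity function."""
    return s
```
]]></artwork></figure>
<t>None of these functions can fail on any input, which is consistent with the statement that this collation never returns an "error" result.</t>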
</section><section title="Octet Collation Registration">
<figure><preamble>This collation is defined with intendedUse="limited" because it can only be used by protocols that explicitly allow it.</preamble><artwork type="xml"><![CDATA[
<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="]]>XXXX<![CDATA[" scope="i18n" intendedUse="limited">
  <name>i;octet</name>
  <title>Octet</title>
  <functions>equality order substring</functions>
  <specification>RFC ]]>XXXX<![CDATA[</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
</collation>
]]></artwork></figure>
</section></section></section>






<section title="IANA Considerations"><t><xref target="registry"></xref> defines how to register collations with IANA. This section should be carefully studied, and commented upon if necessary, by IANA before approval of this document for publication as an RFC. <xref target="initial"></xref> defines a list of predefined collations, which should be registered when this document is approved and published as an RFC.</t></section><section title="Security Considerations">
<t>Collations will normally be used with UTF-8 strings.  Thus the security considerations for <xref target="RFC3629">UTF-8</xref> and <xref target="RFC3454">stringprep</xref> also apply and are normative to this specification.</t>
</section>

<section title="Open Issues"><t>See http://www.w3.org/2004/08/ietf-collation.</t>

</section>

<section title="Change Log"><section title="Changes From -02">
<t><list style="numbers">
   <t>Changed from data being octet sequences (in UTF-8) to data being character sequences (with octet collation as an exception).</t><t>Made XML format description much more structured.</t>
	<t>Changed &lt;submittor&gt; to &lt;submitter&gt;,
	because this spelling is much more common.</t>
	<t>Defined 'protocol' to include query languages.</t>
	<t>Reorganized document, in particular IANA considerations section
	(which is now just a list of pointers).</t>
	<t>Added subsections, and a 'Structure of this Document' section.</t>
	<t>Updated references.</t>
	<t>Created a 'Change Log' chapter, with sections for each draft.</t>
	<t>Reduced 'Open issues' section; open issues are now maintained at
	http://www.w3.org/2004/08/ietf-collation.</t>
</list>
</t>
</section>

<section title="Changes From -01">
<t>Add IANA comment to open issues. Otherwise this is just a re-publish
   to keep the document alive.</t>
</section>

<section title="Changes From -00">
<t><list style="numbers">
<t>Replaced the term comparator with collation.  While comparator is
somewhat more precise because these abstract functions are used for
matching as well as ordering, collation is the term used by other parts
of the industry.  Thus I have changed the name to collation for
consistency.</t>
<t>Remove all modifiers to the basic collation except for the
customization and the match rules.  The other behavior modifications
can be specified in a customization of the collation.</t>
<t>Use ";" instead of "-" as delimiter between parameters to make names
more URL-ish.</t>
<t>Add URL form for comparator reference.</t>
<t>Switched registration template to use XML document.</t>
<t>Added a number of useful registration template elements related to the Unicode Collation Algorithm.</t>
<t>Switched language from "custom" to "tailor" to match UCA language for tailoring of the collation algorithm.</t>
</list></t>
</section></section>

</middle>
<back>

<references title="Normative References">
<reference anchor="RFC2119">
<front>
<title abbrev="RFC Key Words">Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials="S." surname="Bradner" fullname="Scott Bradner"><organization/></author>
<date month="March" year="1997"/>
<area>General</area>
<keyword>keyword</keyword>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="2119"/>
</reference>

<reference anchor="RFC2234">
<front>
<title abbrev="ABNF for Syntax Specifications">Augmented BNF for Syntax Specifications: ABNF</title>
<author initials="D.H." surname="Crocker" fullname="David H. Crocker"><organization/></author>
<author initials="P." surname="Overell" fullname="Paul Overell"><organization/></author>
<date month="November" year="1997"/></front>
<seriesInfo name="RFC" value="2234"/>
</reference>

<reference anchor="RFC3629">
<front>
<title abbrev="UTF-8">UTF-8, a transformation format of ISO 10646</title>
<author initials="F." surname="Yergeau" fullname="Francois Yergeau"><organization/></author>
<date month="November" year="2003"/>
</front>
<seriesInfo name="STD" value="63"/>
<seriesInfo name="RFC" value="3629"/>
</reference>

<reference anchor="RFCYYYY">
<front>
<title abbrev="URI Generic Syntax">Uniform Resource Identifier (URI): Generic Syntax</title>
<author initials="T." surname="Berners-Lee" fullname="Tim Berners-Lee"><organization/></author>
<author initials="R.T." surname="Fielding" fullname="Roy T. Fielding"><organization/></author>
<author initials="L." surname="Masinter" fullname="Larry Masinter"><organization/></author>
<date month="April" year="2004"/>
<note title="Note to the RFC Editor">
  <t>Please update this reference with the RFC resulting from
draft-fielding-uri-rfc2396bis-xx.txt, and remove this Note</t></note>
</front>
<seriesInfo name="Internet-Draft" value="draft-fielding-uri-rfc2396bis-07.txt"/>
</reference>


<reference anchor="RFC3066">
<front>
<title>Tags for the Identification of Languages</title>
<author initials="H." surname="Alvestrand" fullname="H. Alvestrand">
<organization/></author>
<date year="2001" month="January"/></front>
<seriesInfo name="BCP" value="47"/>
<seriesInfo name="RFC" value="3066"/>
</reference>

<reference anchor="RFC3454">
<front>
<title>Preparation of Internationalized Strings ("stringprep")</title>
<author initials="P." surname="Hoffman" fullname="Paul Hoffman"><organization/></author>
<author initials="M." surname="Blanchet" fullname="Marc Blanchet"><organization/></author>
<date year="2002" month="December"/>
</front>
<seriesInfo name="RFC" value="3454"/>
</reference>

<reference anchor="RFC3491">
<front>
<title>Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)</title>
<author initials="P." surname="Hoffman" fullname="Paul Hoffman"><organization/></author>
<author initials="M." surname="Blanchet" fullname="Marc Blanchet"><organization/></author>
<date year="2003" month="March"/>
</front>
<seriesInfo name="RFC" value="3491"/>
</reference>

<reference anchor="unicode-tr10v9" target="http://www.unicode.org/reports/tr10/tr10-9.html">
 <front>
   <title>Unicode Collation Algorithm version 9</title>
   <author initials="M." surname="Davis" fullname="Mark Davis">
     <organization>International Business Machines</organization>
     <address>
       <email>mark.davis@us.ibm.com</email>
     </address>
   </author>
   <author initials="K." surname="Whistler" fullname="Ken Whistler">
     <organization>Unicode Consortium</organization>
     <address>
       <email>ken@unicode.org</email>
     </address>
   </author>
   <date month="July" year="2002"/>
 </front>
</reference>
</references>

<references title="Informative References">

<reference anchor="RFC2045">
<front>
<title abbrev="Internet Message Bodies">Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</title>
<author initials="N." surname="Freed" fullname="Ned Freed">
<organization>Innosoft International, Inc.</organization></author>
<author initials="N.S." surname="Borenstein" fullname="Nathaniel S. Borenstein">
<organization>First Virtual Holdings</organization>
</author>
<date year="1996" month="November"/></front>
<seriesInfo name="RFC" value="2045"/>
</reference>

<reference anchor="RFC2222">
<front>
<title abbrev="SASL">Simple Authentication and Security Layer (SASL)</title>
<author initials="J.G." surname="Myers" fullname="John G. Myers">
<organization>Netscape Communications</organization>
</author>
<date year="1997" month="October"/>
<area>Security</area>
<keyword>authentication</keyword>
<keyword>security</keyword></front>
<seriesInfo name="RFC" value="2222"/>
</reference>

<reference anchor="RFC2244">
<front>
<title abbrev="ACAP">ACAP -- Application Configuration Access Protocol</title>
<author initials="C." surname="Newman" fullname="Chris Newman">
<organization>Innosoft International, Inc.</organization>
</author>
<author initials="J.G." surname="Myers" fullname="John Gardiner Myers">
<organization>Netscape Communications</organization>
</author>
<date year="1997" month="November"/>
</front>
<seriesInfo name="RFC" value="2244"/>
</reference>

<reference anchor="RFC2434">
<front>
<title abbrev="Guidelines for IANA Considerations">Guidelines for Writing an IANA Considerations Section in RFCs</title>
<author initials="T." surname="Narten" fullname="Thomas Narten">
<organization>IBM Corporation</organization>
</author>
<author initials="H.T." surname="Alvestrand" fullname="Harald Tveit Alvestrand">
<organization>Maxware</organization>
</author>
<date year="1998" month="October"/>
</front>
<seriesInfo name="BCP" value="26"/>
<seriesInfo name="RFC" value="2434"/>
</reference>

<reference anchor="RFC2822">
<front>
<title>Internet Message Format</title>
<author initials="P." surname="Resnick" fullname="P. Resnick">
<organization/></author>
<date year="2001" month="April"/></front>
<seriesInfo name="RFC" value="2822"/>
<format type="TXT" octets="110695" target="ftp://ftp.isi.edu/in-notes/rfc2822.txt"/>
</reference>

<reference anchor="RFC2978">
<front>
<title>IANA Charset Registration Procedures</title>
<author initials="N." surname="Freed" fullname="N. Freed">
<organization/></author>
<author initials="J." surname="Postel" fullname="J. Postel">
<organization/></author>
<date year="2000" month="October"/></front>
<seriesInfo name="BCP" value="19"/>
<seriesInfo name="RFC" value="2978"/>
</reference>

<reference anchor="RFC3028">
<front>
<title>Sieve: A Mail Filtering Language</title>
<author initials="T." surname="Showalter" fullname="T. Showalter">
<organization/></author>
<date year="2001" month="January"/></front>
<seriesInfo name="RFC" value="3028"/>
</reference>

</references>

</back>
</rfc>
