﻿<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE spec PUBLIC "-//W3C//DTD Specification V2.2//EN"
       "http://www.w3.org/2002/xmlspec/dtd/2.2/xmlspec.dtd"  [

<!--for doing work locally without a network connection -->
<!-- <!DOCTYPE spec PUBLIC "-//W3C//DTD Specification V2.2//EN"
       "xmlspec.dtd"  [ -->

  <!ENTITY exins "http://www.w3.org/2007/07/exi">
  <!ENTITY times "&#215;">
  <!ENTITY ne "&#8800;">
  <!ENTITY le "&#8804;">
  <!ENTITY oplus "&#8853;">
  <!ENTITY hellip "&#8230;">
  <!ENTITY vellip "&#8942;">
  <!ENTITY lceil "&#8968;">
  <!ENTITY rceil "&#8969;">
  <!ENTITY sqcup "&#x2294;">
  <!-- cup looked so close to alphabet "U", use sqcup instead -->
  <!-- !ENTITY cup "&#8746;" -->
  <!ENTITY nbsp "&#160;">
  <!ENTITY mdash "&#8212;">
]>
<!--

/*
 * Copyright (c) 2007 World Wide Web Consortium,
 *
 * (Massachusetts Institute of Technology, European Research Consortium for
 * Informatics and Mathematics, Keio University). All Rights Reserved. This
 * work is distributed under the W3C(r) Document License [1] in the hope that
 * it will be useful, but WITHOUT ANY WARRANTY; without even the implied
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * [1] http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231
 */

    -->
<!-- $Id: exi.xml,v 1.1 2008/09/16 11:51:04 cbournez Exp $ -->
<?xml-stylesheet type="text/xsl" href="exi.xsl"?>
<spec w3c-doctype="wd">
<header>
<title>Efficient XML Interchange (EXI) Format</title>
<version>1.0</version>
<w3c-designation>WD-exi-20080919</w3c-designation>
<!-- w3c-doctype>W3C Editors' Draft</w3c-doctype -->
<w3c-doctype>W3C Working Draft</w3c-doctype>
<pubdate>
<day>19</day>
<month>September</month>
<year>2008</year></pubdate>
<notice>
This is the Last Call working draft.
</notice>
<publoc>
<loc href="http://www.w3.org/TR/2008/WD-exi-20080919/">http://www.w3.org/TR/2008/WD-exi-20080919/</loc>
</publoc>
<altlocs>
<loc role="xml" href="exi.xml">XML</loc></altlocs>
<prevlocs>
<loc href="http://www.w3.org/TR/2008/WD-exi-20080728/">http://www.w3.org/TR/2008/WD-exi-20080728/</loc>
</prevlocs>
<latestloc>
<loc href="http://www.w3.org/TR/exi/">http://www.w3.org/TR/exi/</loc></latestloc>
<authlist>
<author>
<name>John Schneider</name>
<affiliation>AgileDelta, Inc.</affiliation>
<!-- email></email --></author>
<author>
<name>Takuki Kamiya</name>
<affiliation>Fujitsu Laboratories of America, Inc.</affiliation>
<!-- email></email --></author></authlist>
<abstract>
<p>This document is the specification of the Efficient XML Interchange (EXI)
format. EXI is a very compact representation for the Extensible Markup
Language (XML) Information Set that is intended to simultaneously optimize
performance and the utilization of computational resources. The EXI
format uses a hybrid approach drawn from the information and formal language
theories, plus practical techniques verified by measurements,
for entropy encoding XML information. Using a relatively simple algorithm,
which is amenable to fast and compact implementation, and a small set of
data types, it reliably produces efficient encodings of XML event streams.
The event production system and format definition of EXI are presented.</p>
</abstract>
<status id="Status">
<p>
<emph>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A
list of current W3C publications and the latest revision of this
technical report can be found in the <loc
href="http://www.w3.org/TR/">W3C technical reports index</loc> at
http://www.w3.org/TR/.</emph></p>
<p>
This is the Last Call Public Working Draft of the Efficient XML Interchange (EXI) Format 1.0. It is made available for review by W3C members and other interested parties. It has been produced by the <loc href="http://www.w3.org/XML/EXI/">Efficient XML Interchange (EXI) Working Group</loc>, which is part of the <loc href="http://www.w3.org/XML/Activity">Extensible Markup Language (XML) Activity</loc>. A summary <xspecref href='#changes'>list of changes</xspecref> made to this document since the last publication is available.
</p>
<p>
The Working Group intends to advance this specification to W3C Recommendation status. In addition, the group has produced two draft notes, publication of which is part of the criteria for this specification to enter Last Call status. These notes respectively analyze the impact of the new format on existing XML technologies <bibref ref="exiimpacts"/> and evaluate the performance gains of the format against the criteria defined by the XBC Working Group <bibref ref="exieval"/>.
</p>
<p>
The features and algorithms described in this document are considered stable at the time of this writing.

However, the mechanism described in section <specref ref="encodingOptimizedForMisses"/> may be subject to change. This mechanism caps the amount of memory used for value partitions in string tables.

It should be considered a feature at risk and may later be altered or replaced if (and only if) 
the Working Group identifies another mechanism that provides even better efficiency.
</p>
<p>
Any feedback on this specification is welcome. Please send comments about this
document to <loc href="mailto:public-exi-comments@w3.org">public-exi-comments@w3.org</loc>
(<loc href="http://lists.w3.org/Archives/Public/public-exi-comments/">public archive</loc>).
When preparing comments to send in, please provide a separate email message for each distinct issue to the extent possible. The Last Call review period for this document extends until 07 November 2008.
</p>
<p>Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p>
<p> This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 W3C Patent Policy</a>. W3C maintains a <a rel="disclosure" href="http://www.w3.org/2004/01/pp-impl/38502/status#specs">public list of any patent disclosures</a> made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>. </p>
</status>
<langusage>
<language id="en-US">English</language></langusage>
<revisiondesc>
<p>Last Modified: $Date: 2008/09/16 11:51:04 $</p></revisiondesc></header>
<body>
<div1 id="introduction">
<head>Introduction</head>
<p>The Efficient XML Interchange (EXI) format is a very compact, high
performance XML representation that was designed to work well for a
broad range of applications.  It simultaneously improves performance
and significantly reduces bandwidth requirements without compromising
efficient use of other resources such as battery life, code size,
processing power, and memory.
</p>
<p>EXI uses a grammar-driven approach that achieves very efficient
encodings using a straightforward encoding algorithm and a small set
of data types. Consequently, <termref def="key-exiprocessor">EXI processors</termref> are relatively simple and
can be implemented on devices with limited capacity. </p> <p>EXI is schema
&quot;informed&quot;, meaning that it can utilize available schema
information to improve compactness and performance, but does not
depend on accurate, complete or current schemas to work. It supports
arbitrary schema extensions and deviations and also works very
effectively with partial schemas or in the absence of any schema.  The
format itself also does not depend on any particular schema language,
or format, for schema information. </p>
<p><termdef id="key-exiprocessor" term="EXI processor">A program module
called an <term>EXI processor</term>, whether implemented in software or
hardware, is used by application programs to encode their structured data
into <termref def="key-existream">EXI streams</termref> and/or to decode
<termref def="key-existream">EXI streams</termref> to make the structured
data accessible to them.</termdef>

These two roles of an EXI processor are called <termdef id="key-exiencoder"><term>EXI stream encoder</term></termdef> and <termdef id="key-exidecoder"><term>EXI stream decoder</term></termdef>, respectively.

This document not only specifies the
EXI format, but also defines errors that EXI processors are required to
detect and act upon.</p>

<!-- div2 id="processorRoles">
<head>EXI Processor Roles</head>
<p>
There are two distinct roles, EXI stream encoder and EXI stream decoder, that EXI Processors can serve. The role of an EXI stream encoder is to provide the capability for applications to encode their structured data into EXI streams, whereas the EXI stream decoder is the role that serves to decode EXI streams to make the structured data accessible to applications. EXI Processors MUST provide the function of at least one of these two roles. The roles are not mutually exclusive. 
While those EXI Processors that serve only either one of the roles can be still useful, other EXI Processors MAY choose to provide both roles at once to enable applications with the ability to generate and consume EXI Streams.
</p>
</div2 -->

<p>The primary goal of this document is to define the EXI format completely and unambiguously so that implementations can interoperate. As such, the document describes the design and features of the format in a systematic manner, often declaratively, with relatively few prose annotations and examples. Readers who prefer a step-by-step introduction to the EXI format design and features are encouraged to start with the non-normative <bibref ref="exiprimer"/>.
</p>
<div2 id="history">
<head>History and Design</head>
<p>EXI is the result of extensive work carried out by the W3C's XML
Binary Characterization (XBC) and Efficient XML Interchange (EXI)
Working Groups. XBC was chartered to investigate the costs and
benefits of an alternative form of XML, and formulate a way to objectively
evaluate the potential of a substitute format for XML.  Based on XBC's
recommendations, EXI was chartered, first to measure, evaluate, and
compare the performance of various XML technologies (using metrics
developed by XBC <bibref ref="xbcmeas"/>), and then, if it appeared
suitable, to formulate a recommendation for a W3C format
specification. The measurement results and analyses are presented
elsewhere <bibref ref="eximeas"/>. The format described in this
document is the specification so recommended. 
</p>
<p>The functional requirements of the EXI format are those that were
prepared by the XBC WG in their analysis of the desirable properties
of a high performance encoding for XML <bibref ref="xbcproperties"/>.
Those properties were derived from a very broad set of use cases also
identified by the XBC working group <bibref ref="xbcusecases"/>.
</p>
<p>The design of the format presented here is largely based on the
results of the measurements carried out by the group to evaluate the
performance characteristics (mainly processing efficiency and
compactness) of various existing formats. The EXI format is based on
Efficient XML <bibref ref="efx"/>, including, for example, the basic heuristic grammar approach,
compression algorithm, and resulting entropy encoding.
</p>
<p>EXI is compatible with XML at the XML Information Set <bibref
ref="XMLInfoset"/> level, rather than at the XML syntax level. This
permits it to encapsulate an efficient alternative syntax and grammar
for XML, while facilitating at least the potential for minimizing the
impact on XML application interoperability.
</p>    
</div2>
<div2 id="conventions">
<head>Notational Conventions and Terminology</head>
<p>The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear
EMPHASIZED in this document, are to be interpreted as described in RFC
2119 <bibref ref="RFC2119"/>. Other terminology used to describe the EXI
format is defined in the body of this specification.
</p>
<p>The terms <term>event</term> and <term>stream</term> are used throughout this document to denote <term><termref def="key-exievent">EXI event</termref></term> and <term><termref def="key-existream">EXI stream</termref></term>, respectively, unless the words are qualified to mean otherwise.</p>
<p>This document specifies an abstract grammar for EXI. In grammar notation, all terminal
symbols are represented in plain text and all non-terminal symbols are
represented in <emph>italics</emph>. Grammar productions are
represented as follows: </p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>LeftHandSide</emph> :&nbsp;&nbsp;
Event&nbsp;&nbsp;<emph>NonTerminal</emph></td></tr></tbody></table>
<p>A set of one or more grammar productions that share the same
left-hand-side non-terminal symbol is often presented together, along
with <termref def="key-eventcode">event codes</termref> that uniquely
identify events among the collocated productions, as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>LeftHandSide</emph> :</td></tr>

<tr>
<td></td>
<td width="5%"></td>
<td width="75%">
Event <sub>1</sub>&nbsp;&nbsp;<emph>NonTerminal 
<sub>1</sub></emph></td>
<td>EventCode<sub>1</sub></td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;<emph>NonTerminal 
<sub>2</sub></emph></td>
<td>EventCode<sub>2</sub></td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>3</sub>&nbsp;&nbsp;<emph>NonTerminal 
<sub>3</sub></emph></td>
<td>EventCode<sub>3</sub></td></tr>
<tr>
<td></td>
<td></td>
<td>...</td>
<td></td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>n</sub>&nbsp;&nbsp;<emph>NonTerminal 
<sub>n</sub></emph></td>
<td>EventCode<sub>n</sub></td></tr>
</tbody></table>
<p>Section <specref ref="grammarNotation"/> introduces additional notations for describing productions and event codes in grammars. Those additional notations facilitate concise representation of the EXI grammar system.
</p>
<p>
<termdef id="key-qname">
In this document, the term <term>qname</term> is used to denote a 
<xspecref spec="XS2" ref="QName">QName</xspecref>.
</termdef>
When used to qualify terminal symbols in grammars (see <specref ref="eventTypes"/> for notation), to identify built-in element grammars (see <specref ref="builtinElemGrammars"/>) and global type grammars (see <specref ref="typeGrammars"/>), or to distinguish value channels in EXI compression (see <specref ref="ValueChannels"/>), such uses of qname represent QName values, which are tuples of { uri, local-name }. Otherwise, a qname represents a QName value affixed with a prefix part to make a triplet of { prefix, uri, local-name }, where the absence of prefix is indicated by "" (an empty string). Two qnames are considered equal when they have the same uri and the same local-name, regardless of their prefix values.
</p>
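<p>The equality rule for qnames can be illustrated with a minimal Python sketch. The class name <code>QName</code> and its field names are hypothetical and are not part of the EXI format; the sketch only assumes that a qname is modeled as a { prefix, uri, local-name } triplet in which the prefix does not take part in comparison.</p>
<eg xml:space="preserve"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QName:
    """Hypothetical model of a qname as { prefix, uri, local-name }."""
    uri: str
    local_name: str
    # The prefix is carried along but excluded from equality and hashing.
    # An empty string indicates the absence of a prefix.
    prefix: str = field(default="", compare=False)


a = QName("http://www.w3.org/2001/XMLSchema", "string", prefix="xsd")
b = QName("http://www.w3.org/2001/XMLSchema", "string", prefix="xs")
c = QName("http://www.w3.org/2001/XMLSchema", "integer", prefix="xsd")

print(a == b)  # same uri and local-name, different prefix
print(a == c)  # same uri, different local-name
```
]]></eg>
<p>Excluding the prefix from hashing as well as from equality keeps the two consistent, so qnames that differ only in prefix can share a single key in a lookup table.</p>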
<p>Terminal symbols that are qualified with a qname permit the use of a wildcard symbol (*) in place of or as part of a qname. The forms of terminal symbols involving qname wildcards used in grammars and their definitions are described in the table below.
</p>
<table width="80%" border="1">
<colgroup align="left" width="25%"></colgroup>
<colgroup/>
<thead>
<tr>
<th align="center">Wildcard</th>
<th>Definition</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;SE (*)</td>
<td>The terminal symbol that matches a start element (SE) event with any qname.</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;SE (<emph>uri</emph>&nbsp;:&nbsp;*)</td>
<td>The terminal symbol that matches a start element (SE) event with any local-name in namespace <emph>uri</emph>.</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;AT (*)</td>
<td>The terminal symbol that matches an attribute (AT) event with any qname.</td>
</tr>
</tbody>
</table>
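<p>As a rough illustration of the wildcard forms in the table above, the following Python sketch tests whether a terminal symbol matches an event. The names <code>Terminal</code> and <code>matches</code>, the use of <code>None</code> to stand for the wildcard symbol, and the example namespace are assumptions made for this sketch only.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import NamedTuple, Optional


class Terminal(NamedTuple):
    """Hypothetical terminal symbol: an event kind plus a qname pattern.

    A value of None for uri or local_name stands for the wildcard (*)
    in that position.
    """
    kind: str                   # "SE" or "AT"
    uri: Optional[str]
    local_name: Optional[str]


def matches(terminal: Terminal, kind: str, uri: str, local_name: str) -> bool:
    """Return True if the terminal symbol matches an event with this qname."""
    if terminal.kind != kind:
        return False
    if terminal.uri is not None and terminal.uri != uri:
        return False
    if terminal.local_name is not None and terminal.local_name != local_name:
        return False
    return True


SE_ANY = Terminal("SE", None, None)                         # SE (*)
SE_URI_ANY = Terminal("SE", "http://example.org/ns", None)  # SE (uri : *)
AT_ANY = Terminal("AT", None, None)                         # AT (*)

print(matches(SE_URI_ANY, "SE", "http://example.org/ns", "item"))
print(matches(SE_URI_ANY, "SE", "http://example.org/other", "item"))
```
]]></eg>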
<p>Several prefixes are used throughout this document to designate certain namespaces. The bindings shown below are assumed; however, any prefix can be used in practice if it is properly bound to the namespace.</p>
<table width="80%" border="1">
<colgroup align="left" width="25%"></colgroup>
<colgroup/>
<thead>
<tr>
<th align="center">Prefix</th>
<th>Namespace Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;exi</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&exins;</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;xml</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
http://www.w3.org/XML/1998/namespace</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;xsd</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
http://www.w3.org/2001/XMLSchema</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;xsi</td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
http://www.w3.org/2001/XMLSchema-instance</td>
</tr>
</tbody>
</table>
<p>In describing the layout of an EXI format construct, a pair of square brackets [ ] is used to surround the name of a field to denote that the field is optional in the structure of the part or component that contains it.
</p>
<p>In arithmetic expressions, the notation &lceil;<emph>x</emph>&rceil; where <emph>x</emph> represents a real number denotes the ceiling of <emph>x</emph>, that is, the smallest integer greater than or equal to <emph>x</emph>.
</p>
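<p>As a small illustration of the ceiling notation, the following Python sketch evaluates &lceil;<emph>x</emph>&rceil; for a few values. The expression &lceil;log<sub>2</sub>&nbsp;<emph>n</emph>&rceil; is used here only as an assumed example of where a ceiling can arise, namely the number of bits needed to distinguish <emph>n</emph> alternatives; the helper name <code>bits_for</code> is hypothetical.</p>
<eg xml:space="preserve"><![CDATA[
```python
import math


def bits_for(n: int) -> int:
    """Smallest integer k with 2**k >= n, i.e. the ceiling of log2(n), n >= 1.

    Integer arithmetic is used to avoid floating-point rounding.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()


# Ceiling of a few real numbers.
print(math.ceil(2.0), math.ceil(2.1), math.ceil(-0.5))

# Ceiling of log2(n) for a few small n.
print([bits_for(n) for n in (1, 2, 3, 4, 5, 8, 9)])
```
]]></eg>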
</div2></div1>
<div1 id="principles">
<head>Design Principles</head>
<p>The following design principles were used to guide the development of EXI and encourage consistent design decisions. They are listed here to provide insight into the EXI design rationale and to anchor discussions on desirable EXI traits.</p>
<glist>
<gitem>
<label>General:</label>
<def>
<p>One of the primary objectives of EXI is to maximize the number of systems, devices and applications that can communicate using XML data. Specialized approaches optimized for specific use cases should be avoided.</p></def></gitem>
<gitem>
<label>Minimal:</label>
<def>
<p>To reach the broadest set of small, mobile and embedded applications, simple, elegant approaches are preferred to large, analytical or complex ones. </p></def></gitem>
<gitem>
<label>Efficient:</label>
<def>
<p>EXI must be competitive with hand-optimized binary formats so it can be used by applications that require this level of efficiency. </p></def></gitem>
<gitem>
<label>Flexible:</label>
<def>
<p>EXI must deal flexibly and efficiently with documents that contain arbitrary schema extensions or deviate from their schema. Documents that contain schema deviations should not cause encoding to fail. </p></def></gitem>
<gitem>
<label>Interoperable:</label>
<def>
<p>EXI must integrate well with existing XML technologies, minimizing the changes required to those technologies. It must be compatible with the XML Information Set <bibref ref="XMLInfoset"/>, without significant subsetting or supersetting, in order to maintain interoperability with existing and prospective XML specifications.</p></def></gitem></glist></div1>
<div1 id="concepts">
<head>Basic Concepts</head>
<p>EXI achieves broad generality, flexibility, and performance by unifying concepts from formal language theory and information theory into a single, relatively simple algorithm. The algorithm uses a grammar to determine what is likely to occur at any given point in an XML document and encodes the most likely alternatives in fewer bits. The fully generalized algorithm works for any language that can be described by a grammar (e.g., XML, Java, HTTP); however, EXI is optimized specifically for XML languages. </p>
<p>The built-in EXI grammar accepts any XML document or fragment and may be augmented with productions derived from XML Schemas <bibref ref="schema1"/><bibref ref="schema2"/>, RELAX NG schemas <bibref ref="relaxng"/>, DTDs <bibref ref="XML10"/> or other sources of information about what is likely to occur in a set of XML documents. The EXI encoder uses the grammar to map a stream of XML information items onto a smaller, lower entropy, stream of events. </p>
<p>The encoder then represents the stream of events using a set of simple variable length codes called <termref def="key-eventcode">event codes</termref>. <termref def="key-eventcode">Event codes</termref> are similar to Huffman codes <bibref ref="huffman"/>, but are much simpler to compute and maintain. They are encoded directly as a sequence of values, or if additional compression is desired, they are passed to the <termref def="compression">EXI compression</termref> algorithm, which replaces frequently occurring event patterns to further reduce size. </p>
<p>When schemas are used, EXI also supports a user-customizable set of typed encodings for efficiently encoding typed values. </p></div1>

<div1 id="streams">
<head>EXI Streams</head>
<p><termdef id="key-existream" term="EXI Stream">An <term>EXI stream</term> is an 
<termref def="key-exiheader">EXI header</termref>
followed by an EXI body.</termdef> <termdef id="key-exibody" term="EXI Body">It is the <term>EXI body</term> that carries the content of the document, while the EXI header, among its other roles, communicates the options that were used for encoding the EXI body.</termdef> Section
<specref ref="header"/> describes the <termref def="key-exiheader">EXI header</termref>. Values in an EXI stream are packed into bytes most significant bit first.</p>
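<p>The packing rule stated above (most significant bit first) can be sketched in Python as follows. The class <code>BitWriter</code> and its methods are hypothetical helpers, not part of the EXI format, and filling unused trailing bits of the last byte with zeros is an assumption of this sketch.</p>
<eg xml:space="preserve"><![CDATA[
```python
class BitWriter:
    """Hypothetical helper that packs n-bit values into bytes, MSB first."""

    def __init__(self) -> None:
        self._bytes = bytearray()
        self._bit_count = 0  # total number of bits written so far

    def write(self, value: int, width: int) -> None:
        """Append the low `width` bits of `value`, most significant bit first."""
        for shift in range(width - 1, -1, -1):
            bit = (value >> shift) & 1
            if self._bit_count % 8 == 0:
                self._bytes.append(0)  # start a new byte, initially all zeros
            # The first bit of each byte lands in the 0x80 position.
            self._bytes[-1] |= bit << (7 - self._bit_count % 8)
            self._bit_count += 1

    def getvalue(self) -> bytes:
        return bytes(self._bytes)


w = BitWriter()
w.write(0b10, 2)     # a 2-bit value
w.write(0b1, 1)      # a 1-bit value
w.write(0b00101, 5)  # a 5-bit value
print(w.getvalue().hex())
```
]]></eg>
<p>The three values occupy the bit sequence 10, 1, 00101 within a single byte, so values need not start on byte boundaries.</p>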
<!-- This is still permitted, but is not normative and should go in best practices -->
<!--
<p>Applications that use EXI streams embedded in a container data format that discerns it is an EXI stream, dictates the EXI format version and the EXI Options used for its encoding, may omit the EXI header. Although an EXI Body is not a valid EXI stream, EXI processors MAY provide a capability to process an EXI body independent of an EXI stream.
</p>
-->
<p><termdef id="key-exievent" term="EXI Event">The building block of an EXI body is an <term>EXI event</term>.</termdef> An EXI body consists of a sequence of EXI events representing an <termref def="key-exidocument">EXI document</termref> or an <termref def="key-exifragment">EXI fragment</termref>.
</p>
<p>The EXI events permitted at any given position in an EXI stream are determined by the EXI grammar. 
As is the case with XML, 
events occur as nested pairs of matching start element and end element events, where no two pairs intersect unless one is fully contained in the other.

The EXI grammar incorporates knowledge of the XML grammar and may be augmented and refined using schema information and fidelity options. The EXI grammar is formally specified in section <specref ref="grammars"/>.</p>
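<p>The nesting property of start element and end element events can be illustrated with a short Python sketch that pairs each EE event with the most recent unmatched SE event. The function name <code>pair_elements</code> and the representation of events as (kind, content) tuples are assumptions made for illustration only.</p>
<eg xml:space="preserve"><![CDATA[
```python
def pair_elements(events):
    """Pair each EE with the most recent unmatched SE.

    `events` is a sequence of (kind, content) tuples. Returns a list of
    (se_index, ee_index) pairs in the order the elements are closed, and
    raises ValueError if the events are not properly nested.
    """
    open_stack = []  # indexes of SE events that are still open
    pairs = []
    for index, (kind, _content) in enumerate(events):
        if kind == "SE":
            open_stack.append(index)
        elif kind == "EE":
            if not open_stack:
                raise ValueError(f"EE at index {index} has no matching SE")
            pairs.append((open_stack.pop(), index))
    if open_stack:
        raise ValueError(f"SE at index {open_stack[-1]} is never closed")
    return pairs


events = [
    ("SD", None),
    ("SE", "a"),
    ("SE", "b"), ("EE", None),
    ("SE", "c"), ("EE", None),
    ("EE", None),
    ("ED", None),
]
print(pair_elements(events))
```
]]></eg>
<p>Because an EE event always closes the innermost open element, a single stack is sufficient and no pair can partially overlap another.</p>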
<p>
Depending on the top-level grammar used for processing the body, the EXI grammars permit either only a single root element or multiple root elements in an EXI body.
<termdef id="key-exidocument" term="EXI Document">
<term>EXI documents</term> are EXI bodies encoded using either Built-in Document Grammar (See <specref ref="builtinDocGrammars"/>) or Schema-informed Document Grammar (See <specref ref="informedDocGrammars"/>), and are inherently restricted to each contain only a single root element as per the grammars.
</termdef>
<termdef id="key-exifragment" term="EXI Fragment">
<term>EXI fragments</term> are EXI bodies encoded using either Built-in Fragment Grammar (See <specref ref="builtinFragGrammars"/>) or Schema-informed Fragment Grammar (See <specref ref="informedFragGrammars"/>), and are permitted to each contain multiple root elements.
</termdef>
</p>
<p>
<termdef id="key-schemainformed-existream" term="Schema-informed EXI Stream">When schema information is available to describe the contents of an EXI body, such an EXI stream is a <term>schema-informed EXI stream</term>, and either Schema-informed Document Grammar (See <specref ref="informedDocGrammars"/>) or Schema-informed Fragment Grammar (See <specref ref="informedFragGrammars"/>) is used to process the EXI body.</termdef> 
<termdef id="key-schemaless-existream" term="Schema-less EXI Stream">Otherwise, an EXI stream is a <term>schema-less EXI stream</term>, and either Built-in Document Grammar (See <specref ref="builtinDocGrammars"/>) or Built-in Fragment Grammar (See <specref ref="builtinFragGrammars"/>) is used to process the EXI body.</termdef>
</p>
<p>The following table summarizes the EXI events and associated content that occur in an EXI stream.  The content items appear in an EXI stream in the order they are shown in the table.  In addition, the table includes the grammar notation used to represent each event in this specification. Each event in an EXI stream participates in a mapping system that relates events to XML Information Items so that an EXI document or an EXI fragment as a whole serves to represent an XML Information Set. The table shows XML Information Items relevant to each EXI event type. Appendix <specref ref="InfosetMapping"/> describes the mapping system in detail.</p>
<table id="eventTypes" border="1">
<caption>EXI event types</caption>
<thead>
<tr>
<th>EXI Event Type</th>
<th>Content</th>
<th>Grammar Notation</th>
<th>Information Item</th>
</tr>
</thead>
<tbody>
<tr>
<td>Start Document</td>
<td>&nbsp;</td>
<td>SD</td>
<td rowspan="2"><specref ref="DocumentInformationItem"/></td></tr>
<tr>
<td>End Document</td>
<td>&nbsp;</td>
<td>ED</td></tr>
<tr>
<td rowspan="3">Start Element</td>
<td rowspan="3"><emph>qname</emph>
</td>
<td>SE ( 
<emph>qname</emph> )</td>
<td rowspan="4"><specref ref="ElementInformationItem"/></td></tr>
<tr>
<td>SE ( 
<emph>*</emph> )</td></tr>
<tr>
<td>
SE ( <emph>uri&nbsp;:&nbsp;*</emph> )

</td></tr>
<tr>
<td>End Element</td>
<td>&nbsp;</td>
<td>EE</td></tr>
<tr>
<td rowspan="2">Attribute</td>
<td rowspan="2"><emph>qname, value</emph></td>
<td>AT ( 
<emph>qname</emph> )</td>
<td rowspan="2"><specref ref="AttributeInformationItem"/></td></tr>
<tr>
<td>AT ( 
<emph>*</emph> )</td></tr>
<tr>
<td>Characters</td>
<td><emph>value</emph></td>
<td>CH</td>
<td><specref ref="CharacterInformationItem"/></td></tr>
<tr>
<td>Namespace Declaration</td>
<td>
<emph>
uri
</emph>, <emph>
prefix
</emph>, 
<emph>local-element-ns</emph>
</td>
<td>NS</td>
<td><specref ref="NamespaceInformationItem"/></td></tr>
<tr>
<td>Comment</td>
<td>
<emph>text</emph></td>
<td>CM</td>
<td><specref ref="CommentInformationItem"/></td></tr>
<tr>
<td>Processing Instruction</td>
<td>
<emph>name, text</emph></td>
<td>PI</td>
<td><specref ref="ProcessingInstructionInformationItem"/></td></tr>
<tr>
<td>DOCTYPE</td>
<td>
<emph>name, public, system, text</emph></td>
<td>DT</td>
<td><specref ref="DocumentTypeDeclaractionInformationItem"/></td></tr>
<tr>
<td>Entity Reference</td>
<td>
<emph>name</emph></td>
<td>ER</td>
<td><specref ref="UnexpandedEntityInformationItem"/></td></tr>
<tr>
<td>Self Contained</td>
<td>&nbsp;</td>
<td>SC</td>
<td>&nbsp;</td></tr>
</tbody></table>
<p>Section 
<specref ref="encodingEvents"/> describes the algorithm used to encode events in the EXI stream. 
As indicated in the table above, there are some event types that carry content with their event instances while other event types function as markers without content. 
</p>
<p>SE events may be followed by a series of NS events. Each NS event either associates a prefix with a URI, assigns a default namespace, or, in the case of a namespace declaration with an empty URI, rescinds one such association in effect at the point of its occurrence. The association or disassociation caused by an NS event stays in effect until the corresponding EE event occurs.
</p>
<p>As in XML, the namespace of a particular element may be specified by a namespace declaration preceding the element or a local namespace declaration following the element name. When the namespace is specified by a local namespace declaration, the <emph>local-element-ns</emph> flag of the associated NS event is set to true and the prefix of the element is set to the prefix of that NS event. When the namespace is specified by a previous namespace declaration, the <emph>local-element-ns</emph> flag of all local NS events is false and the prefix of the element is set according to the prefix component of the element <emph>qname</emph>. The series of NS events associated with a particular element may include at most one NS event with its
<emph>local-element-ns</emph> flag
 set to true. The <emph>uri</emph> of an NS event with its
<emph>local-element-ns</emph> flag
 set to true MUST match the <emph>uri</emph> of the associated SE event.
</p>
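<p>The rules above for the <emph>local-element-ns</emph> flag can be sketched in Python as follows. The names <code>NS</code> and <code>element_prefix</code>, and the choice to signal violations with <code>ValueError</code>, are assumptions of this sketch and do not reflect how an EXI processor is required to report errors.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import List, NamedTuple


class NS(NamedTuple):
    """Hypothetical model of the content of an NS event."""
    uri: str
    prefix: str
    local_element_ns: bool


def element_prefix(se_uri: str, se_prefix: str, ns_events: List[NS]) -> str:
    """Determine the prefix of an element from its SE event and NS events.

    se_uri and se_prefix come from the qname of the SE event; ns_events
    is the series of NS events that follows that SE event.
    """
    local = [ns for ns in ns_events if ns.local_element_ns]
    if len(local) > 1:
        raise ValueError("at most one NS event may set local-element-ns")
    if local:
        if local[0].uri != se_uri:
            raise ValueError("uri of the flagged NS event must match the SE uri")
        # Namespace given by a local declaration: take that NS event's prefix.
        return local[0].prefix
    # Namespace given by a previous declaration: use the qname's own prefix.
    return se_prefix


ns_events = [NS("urn:a", "q", True), NS("urn:b", "r", False)]
print(element_prefix("urn:a", "p", ns_events))
```
]]></eg>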
<p>An SE event may be followed by an SC event, indicating that the element is self-contained and can be read independently from the rest of the EXI body. Applications may use self-contained elements to index portions of the EXI body for random access.
</p>
<p>Each item in the event content has a data type associated with it, as shown in the following table. The content of each event, if any, is encoded as a sequence of items in order, starting with the first item, with each item encoded according to its data type.</p>
<table border="1" width="95%" id='table2'>
<caption>Data types of event content items</caption>
<colgroup width="20%"/>
<colgroup width="30%"/>
<colgroup width="50%"/>
<thead>
<tr>
<th>Content item</th>
<th>Used in</th>
<th>Type</th></tr>
</thead>
<tbody>
<tr>
<td id="key-nameContentItem">
<emph>name</emph></td>
<td>PI, DT, ER</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-prefixContentItem">
<emph>prefix</emph></td>
<td>NS</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-indicatorContentItem">
<emph>local-element-ns</emph>
</td>
<td>NS</td>
<td>
<specref ref="encodingBoolean"/></td></tr>
<tr>
<td id="key-publicContentItem">
<emph>public</emph></td>
<td>DT</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-qnameContentItem">
<emph>qname</emph></td>
<td>SE, AT</td>
<td>
<specref ref="encodingQName"/></td></tr>
<tr>
<td id="key-systemContentItem">
<emph>system</emph></td>
<td>DT</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-textContentItem">
<emph>text</emph></td>
<td>CM, PI</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-uriContentItem">
<emph>uri</emph></td>
<td>NS</td>
<td>
<specref ref="encodingString"/></td></tr>
<tr>
<td id="key-valueContentItem">
<emph>value</emph></td>
<td>CH, AT</td>
<td>According to the schema type (see 
<specref ref="encodingValues"/>) if any is in effect, otherwise <specref ref="encodingString"/></td></tr></tbody></table>

<p>Content items other than <emph>value</emph> have inherent, fixed data types independent of their uses. The data type that governs each occurrence of the <emph>value</emph> item depends on the schema type, if any, that is in effect for the value in question. The type xsd:anySimpleType is used for <emph>value</emph>s that do not have an associated schema type, are schema-invalid, or occur in mixed content. Section 
<specref ref="encodingValues"/> describes how each of the types listed above is encoded in an EXI stream. </p>
<ednote>
<edtext>
The syntax and semantics of the NS event are formulated in favor of simplicity, so as to avoid the processing cost that operations such as sorting and conditional branching would otherwise incur, while keeping the number of additional bits required across an EXI stream to a minimum, given the observation that the number of namespace declarations in an EXI stream is generally small.
</edtext>
</ednote>
</div1>
<div1 id="header">
<head>EXI Header</head>
<p>
Each EXI stream begins with an EXI header.
<termdef id="key-exiheader" term="EXI header">
The <term>EXI header</term> can identify EXI streams, distinguish EXI streams from text XML documents,
identify the version of the EXI format being used, and specify the options used to process the body of the EXI stream.
</termdef>
The EXI header has the following structure:
</p>

<table border="1" rules="cols">
<tbody>
<tr>
<td align="center" rowspan="2">
<termref def="key-exiCookie">
&nbsp;[&nbsp;EXI&nbsp;Cookie&nbsp;]&nbsp;
</termref>

</td>
<td align="center" rowspan="2">
<termref def="key-distinguishingbits">
&nbsp;Distinguishing&nbsp;Bits&nbsp;
</termref></td>
<td align="center">
&nbsp;Presence&nbsp;Bit&nbsp;
</td>
<td align="center">
<termref def="key-version">
&nbsp;EXI&nbsp;Format&nbsp;
</termref>
</td>
<td align="center" rowspan="2">
&nbsp;[<termref def="key-options">EXI&nbsp;Options</termref>]&nbsp;
</td>
<td align="center" rowspan="2">
&nbsp;[Padding&nbsp;Bits]&nbsp;
</td>
</tr>
<tr>
<td align="center">
&nbsp;for&nbsp;EXI&nbsp;Options&nbsp;
</td>
<td align="center">
<termref def="key-version">
&nbsp;Version&nbsp;
</termref>
</td>
</tr>
</tbody>
</table>
<p>The EXI Options field within an EXI header is optional. Its presence is indicated by
the value of the presence bit that follows the <termref def="key-distinguishingbits">Distinguishing Bits</termref>: the value 1 indicates that EXI Options are present, and the value 0 indicates that they are absent.
</p>

<p>When either <termref def="key-compressionOption">compression</termref> is used, or the <termref def="key-alignmentOption">alignment</termref> used is one of <termref def="key-bytealignment">byte-alignment</termref> or <termref def="key-precompression">pre-compression</termref> as dictated by <termref def="key-options">EXI Options</termref>, 
padding bits of the minimum length required to make the whole length of 
the header byte-aligned are added at the end of the header. 
The padding bits field can contain any bit values.

</p>
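<p>The padding rule above can be sketched as follows. This is only an illustrative sketch, not part of the format definition: the function names and the idea of passing the header length in bits are assumptions made for the example, while the option values are those named in this section.</p>
<eg xml:space="preserve"><![CDATA[
```python
def needs_padding(compression: bool, alignment: str) -> bool:
    """True when the header must end on a byte boundary.

    `alignment` is one of "bit-packed", "byte-alignment" or
    "pre-compression" (the alignment option values of this section).
    """
    return compression or alignment in ("byte-alignment", "pre-compression")


def padding_bit_count(header_bits: int, compression: bool, alignment: str) -> int:
    """Minimum number of padding bits to append to a header that is
    `header_bits` bits long before padding (hypothetical parameter)."""
    if not needs_padding(compression, alignment):
        return 0
    # Distance from header_bits up to the next multiple of 8.
    return (-header_bits) % 8
```
]]></eg>
<p>For instance, under these assumptions a 13-bit header followed by a byte-aligned body would receive 3 padding bits, while the same header followed by a bit-packed, uncompressed body would receive none.</p>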

<p>
The details of 
<termref def="key-exiCookie">EXI Cookie</termref>, 

<termref def="key-distinguishingbits">Distinguishing Bits</termref>, <termref def="key-version">EXI Format Version</termref> and <termref def="key-options">EXI Options</termref> are described in the following sections.
</p>

<div2 id="EXICookie">
<head>EXI Cookie</head>
<p>
<termdef id="key-exiCookie" term="EXI Cookie">
An <termref def="key-exiheader">EXI header</termref> MAY start with an <term>EXI Cookie</term>,
which is a four byte field that serves to indicate that the stream of which it is a part is an EXI stream.</termdef> The four byte field consists of four characters 
"&nbsp;$&nbsp;" , "&nbsp;E&nbsp;", "&nbsp;X&nbsp;" and "&nbsp;I&nbsp;" 

in that order, each represented as an ASCII octet, as follows.
</p>

<table border="1">
<tbody>
<tr class="bitcell">
<td align="center" class="bitcell">'<sup>&nbsp;</sup>$<sub>&nbsp;</sub>'</td>
<td align="center" class="bitcell">'<sup>&nbsp;</sup>E<sub>&nbsp;</sub>'</td>
<td align="center" class="bitcell">'<sup>&nbsp;</sup>X<sub>&nbsp;</sub>'</td>
<td align="center" class="bitcell">'<sup>&nbsp;</sup>I<sub>&nbsp;</sub>'</td>
</tr>
</tbody>
</table>

<p>
This four-byte sequence is particular to EXI and specific enough to distinguish EXI streams from a broad range of data types currently used on the Web. While the EXI cookie is optional, its use is RECOMMENDED in the EXI header when the EXI stream is exchanged in a context where a longer, more robust content-based datatype identification is desired than that provided by <termref def="key-distinguishingbits">Distinguishing Bits</termref>, whose role is narrowly focused on distinguishing EXI streams from XML documents.
</p>

</div2>

<div2 id="DistinguishingBits">
<head>Distinguishing Bits</head>

<p>
<termdef id="key-distinguishingbits" term="Distinguishing Bits">
The second part in the EXI header is the <term>Distinguishing Bits</term>,
which is a two bit field of which the first bit contains the value 1 and the second bit contains the value 0, as follows.</termdef>
</p>

<table border="1">
<tbody>
<tr class="bitcell">
<td align="center" class="bitcell">1</td>
<td align="center" class="bitcell">0</td></tr>
</tbody>
</table>


<p>
Unlike the optional EXI cookie that MAY precede this field, the presence of Distinguishing Bits is REQUIRED in the EXI header. It is used to distinguish EXI streams from text XML documents in the absence of an <termref def="key-exiCookie">EXI cookie</termref>.
This two-bit sequence is the minimum that suffices to distinguish EXI
streams from XML documents since it is the minimum length bit
pattern that cannot occur as the first two bits of a well-formed XML
document represented in any one of the conventional character
encodings, such as UTF-8, UTF-16, UCS-2, UCS-4, EBCDIC, ISO 8859,
Shift-JIS and EUC, according to XML 1.0 <bibref ref="XML10"/>. Therefore, XML
processors are expected to reject an EXI stream as soon as they read
and process the first byte from the stream.</p>

<p>
Systems that use EXI streams as well as XML documents can reliably look at
the Distinguishing Bits to determine whether to interpret a particular
stream as XML or EXI.
</p>
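<p>A minimal sketch of such a check is shown below. It assumes that the bits of the stream are read from the most significant bit of each octet first, which this section does not state; the function name and the constant are likewise illustrative only.</p>
<eg xml:space="preserve"><![CDATA[
```python
EXI_COOKIE = b"$EXI"  # the four ASCII octets '$', 'E', 'X', 'I'


def looks_like_exi(data: bytes) -> bool:
    """Guess whether `data` starts an EXI stream rather than text XML.

    Assumption: bits are consumed most-significant-bit first, so the
    Distinguishing Bits are the top two bits of the first octet that
    follows the optional EXI cookie.
    """
    if data.startswith(EXI_COOKIE):
        data = data[len(EXI_COOKIE):]
    if not data:
        return False
    # Distinguishing Bits: first bit 1, second bit 0.
    return (data[0] >> 6) == 0b10
```
]]></eg>
<p>With this sketch, a stream beginning with the octet 0x80 is classified as EXI with or without a preceding cookie, whereas a document beginning with the character "&lt;" (0x3C) or with a UTF-8 byte order mark (0xEF) is not.</p>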
</div2>

<div2 id="version">
<head>EXI Format Version</head>
<p><termdef id="key-version" term="EXI Format Version">
The fourth part in the EXI header is the <term>EXI Format Version</term>, which identifies the version of the EXI format being used.</termdef>
EXI format version numbers are integers. Each version of the EXI Format Specification specifies the corresponding EXI format version number to be used by conforming implementations. The EXI format version number that corresponds with this version of the EXI format specification is 1 (one).</p>

<p>The first bit of the version field indicates whether the version is a preview or final version of the EXI format.
A value of 0 indicates this is a final version and a value of 1 indicates this is a preview
version. Final versions correspond to final, approved versions of the EXI format specification.
An <termref def="key-exiprocessor">EXI processor</termref> that implements a final version of the EXI format specification is REQUIRED to process EXI streams that have a version field with its first bit set to 0 followed by a version number that corresponds to the version of the EXI specification the processor implements.
<!-- <termref def="key-exiprocessor">EXI processors</termref> are REQUIRED to process EXI streams that have a version field with its first bit set to 0 and thereof MUST conform to the version of EXI specification that corresponds to the EXI format version that is in use in the stream. -->
Preview versions of the EXI format are useful for
gaining implementation and deployment experience prior to finalizing a
particular version of the EXI format. While preview versions may match drafts of this specification, they are not governed by this specification and the behaviour of EXI processors encountering preview versions of the EXI format is implementation dependent. Implementers are free to coordinate to achieve interoperability between different preview versions of the EXI format.
</p>

<p>Following the first bit of the version is a sequence of one or more
4-bit unsigned integers representing the version number. The version
number is determined by summing this sequence of 4-bit unsigned
values and adding 1 (one). The sequence is terminated by the first 4-bit unsigned integer with
a value in the range 0-14. As such, the first 15 version numbers are
represented by 4 bits, the next 15 are represented by 8 bits, etc.</p>

<p>Given an EXI stream with its stream cursor positioned just past the first bit of the EXI format version field, the EXI format version number can be computed by going through the following steps with version number initially set to 1.</p>
<olist>
<item>Read next 4 bits as an unsigned integer value.</item>
<item>Add the value that was just read to the version number.</item>
<item>If the value is 15, go to step 1, otherwise (i.e. the value being in the range of 0-14), use the current value of the version number as the EXI version number.</item>
</olist>
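<p>The steps above can be sketched as follows. This is an illustrative sketch only: the helper names and the use of an iterator of 0/1 integers to stand for the stream cursor are assumptions made for the example. Unlike the steps above, the function also consumes the leading preview bit of the version field.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import Iterator, Tuple


def bits_from_string(text: str) -> Iterator[int]:
    """Yield the bits written in `text`, ignoring spaces (test helper)."""
    for ch in text:
        if ch in "01":
            yield int(ch)


def read_uint(bits: Iterator[int], n: int) -> int:
    """Read the next n bits as an unsigned integer, first bit most significant."""
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    return value


def decode_version_field(bits: Iterator[int]) -> Tuple[bool, int]:
    """Return (is_preview, version_number) for an EXI format version field."""
    is_preview = next(bits) == 1   # first bit: 1 = preview, 0 = final
    version = 1                    # version number initially set to 1
    while True:
        chunk = read_uint(bits, 4)     # step 1: read next 4 bits
        version += chunk               # step 2: add to the version number
        if chunk != 15:                # step 3: a value in 0-14 terminates
            return is_preview, version
```
]]></eg>
<p>For example, <code>decode_version_field(bits_from_string("0 1111 0001"))</code> reads the chunks 15 and 1 and yields final version 17.</p>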

<p>The following are example EXI format version numbers.</p>

<example>
<head>EXI Format Version Examples</head>
<table border="1">
<!-- caption>EXI Version Examples</caption -->
<thead>
<tr>
<th width="200">EXI Format Version Field</th>
<th width="200">Description</th></tr>
</thead>
<tbody>
<tr>
<td>&nbsp;&nbsp;1 0000</td>
<td>&nbsp;&nbsp;Preview version 1</td>
</tr>
<tr>
<td>&nbsp;&nbsp;0 0000</td>
<td>&nbsp;&nbsp;Final version 1</td>
</tr>
<tr>
<td>&nbsp;&nbsp;0 1110</td>
<td>&nbsp;&nbsp;Final version 15</td>
</tr>
<tr>
<td>&nbsp;&nbsp;0 1111 0000</td>
<td>&nbsp;&nbsp;Final version 16</td>
</tr>
<tr>
<td>&nbsp;&nbsp;0 1111 0001</td>
<td>&nbsp;&nbsp;Final version 17</td>
</tr>
</tbody>
</table>
</example>

<p><termref def="key-exiprocessor">EXI processors</termref> conforming with the final version of this
specification MUST use the 5-bit value 0 0000 as the EXI format version
field, which denotes final version 1.</p>

</div2>
<div2 id="options">
<head>EXI Options</head>
<p><termdef id="key-options" term="EXI Options">The fifth part of the EXI
header is the <term>EXI Options</term>, which provides a way to specify the
options used to encode the body of the EXI stream.</termdef>
<termdef id="key-optionsDoc" term="EXI Options document">

The EXI Options are represented as an <term>EXI Options document</term>, which is an XML document encoded using the EXI format described in this specification.

</termdef>
This results in a very compact header
format that can be read and written with very little additional software.
</p>
<p>The presence of EXI Options in its entirety is optional in the EXI header,
and it is predicated on the value of the presence bit that follows the
<termref def="key-distinguishingbits">Distinguishing Bits</termref>.
When EXI Options are present in the header, an EXI processor MUST observe the
specified options to process the EXI stream that follows. Otherwise,
an EXI processor may obtain the EXI options using another mechanism. 
There are no fallback option values provided by this specification for use
in the absence of the whole EXI Options part.

</p>
<p>
<termref def="key-exiprocessor">EXI processors</termref> MAY provide external means for applications or users to
specify EXI Options when EXI Options are absent from the EXI header.
Such <termref def="key-exiprocessor">EXI processors</termref> are typically used in controlled systems
where the knowledge about the effective EXI Options is shared prior to
the exchange of EXI streams. The mechanism to communicate out-of-band
EXI Options and their representation used in such systems are implementation dependent.</p>
<p>The following table describes the EXI options specified in the
options field.</p>

<table border="1">
<caption>EXI Options in Options Field</caption>
<thead>
<tr>
<th>EXI Option</th>
<th>Description</th>

<th>Default Value</th>

</tr>
</thead>
<tbody>
<tr>
<td>
<termref def="key-alignmentOption">alignment</termref>
</td>
<td>
Alignment of event codes and content items
</td>
<td>
<termref def="key-unaligned">bit-packed</termref>
</td>
</tr>
<tr>
<td><termref def="key-compressionOption">compression</termref></td>
<td>EXI compression is used to achieve better compactness</td>
<td>false</td>
</tr>
<tr>
<td><termref def="key-strictOption">strict</termref></td>
<td>Strict interpretation of schemas is used to achieve better compactness</td>
<td>false</td>
</tr>
<tr>
<td><termref def="key-fragmentOption">fragment</termref></td>
<td>Body is encoded as an <termref def="key-exifragment">EXI fragment</termref> instead of an <termref def="key-exidocument">EXI document</termref>
</td>
<td>false</td>
</tr>
<tr>
<td><termref def="key-preserveOption">preserve</termref></td>
<td>Specifies whether comments, processing instructions, etc. are preserved</td>
<td>all false</td>
</tr>
<tr>
<td><termref def="key-selfContained">selfContained</termref></td>
<td>Enables self-contained elements</td>
<td>false</td>
</tr>
<tr>
<td><termref def="key-schemaIDOption">schemaID</termref></td>
<td>Identifies the schema information, if any, used to encode the body</td>
<td>none</td>
</tr>
<tr>
<td><termref def="key-datatypeRepresentationOption">
datatypeRepresentationMap

</termref></td>
<td>
Identifies datatype representations used to encode <termref def="key-valueContentItem"><emph>values</emph></termref> in the <termref def="key-exibody">EXI body</termref>

</td>
<td>none</td>
</tr>
<tr>
<td><termref def="key-blockSizeOption">blockSize</termref></td>
<td>Specifies the block size used for EXI compression</td>
<td>1,000,000</td>
</tr>
<tr>
<td>
<termref def="key-valueMaxLengthOption">valueMaxLength</termref>

</td>
<td>
Specifies the maximum string length of 
<termref def="key-valueContentItem"><emph>value</emph></termref> 
content items to be considered for addition to the string table.

</td>
<td>
unbounded

</td>
</tr>
<tr>
<td>
<termref def="key-valuePartitionCapacityOption">valuePartitionCapacity</termref>

</td>
<td>
Specifies the total capacity of value partitions in a string table

</td>
<td>
unbounded

</td>
</tr>
<tr>
<td>[user defined]</td>
<td>User defined options may be added</td>
<td>none</td>
</tr>
</tbody>
</table>

<p>Appendix <specref ref="optionsSchema"/> provides an XML Schema
describing <termref def="key-optionsDoc">the EXI Options document</termref>.
This schema is designed to produce smaller headers
for option combinations used when compactness is critical.</p>

<p>
The <termref def="key-optionsDoc">EXI Options document</termref> is
encoded as an <termref def="key-exibody">EXI body</termref> 
informed by the above-mentioned schema 
using the default options specified by the following XML document.
An EXI Options document consists only of an EXI body, and MUST NOT start with an EXI header.

<reprdef>
<head>Header options used for encoding the <termref def="key-optionsDoc">EXI Options document</termref></head>
<repr xml:space="preserve">
  &lt;header xmlns="&exins;"&gt;
    &lt;strict/&gt;
  &lt;/header&gt;
</repr>
</reprdef>

<p><termdef id="key-alignmentOption">The <term>alignment option</term> is used to control the alignment of event codes and content items.</termdef> The value is one of <termref def="key-unaligned">bit-packed</termref>, <termref def="key-bytealignment">byte-alignment</termref> or <termref def="key-precompression">pre-compression</termref>, of which <termref def="key-unaligned">bit-packed</termref> is the default value assumed when the "alignment" element is absent in the <termref def="key-optionsDoc">EXI Options document</termref>.
When the value of the <termref def="key-compressionOption">compression option</termref> is set to true, 
the way event codes and associated content are represented is governed by the rules specified in <specref ref="compression"/> instead of the alignment option value; thus the compression option value "true" effectively overrides any alignment option value.

</p>

<p><termdef id="key-unaligned">Alignment option value <term>bit-packed</term> indicates that the event codes and associated content are packed in bits without any padding in between.</termdef>
</p>

<p><termdef id="key-bytealignment">Alignment option value <term>byte-alignment</term> indicates that the event codes and associated content are aligned on byte boundaries.</termdef> While byte-alignment generally results in larger EXI streams than their bit-packed equivalents, byte-alignment may help in some use cases that involve frequent copying of large arrays of scalar data directly out of the stream. It can also make it possible to work with data in-place and can make it easier to debug encoded data by allowing items on aligned boundaries to be easily located in the stream.</p>

<p>
<termdef id="key-precompression">Alignment option value <term>pre-compression</term> indicates that all steps involved in compression (see section <specref ref="compression"/>) are to be done with the exception of the final step of applying the DEFLATE algorithm.</termdef> The primary use case of pre-compression is to avoid a duplicate compression step when compression capability is built into the transport protocol. In this case, pre-compression just prepares the stream for later compression.
</p>

<p>
<termdef id="key-compressionOption">The <term>compression option</term> is a Boolean used to increase compactness using additional computational resources.</termdef> The default value "false" is assumed when the "compression" element is absent in the <termref def="key-optionsDoc">EXI Options document</termref>.
When set to true, the event codes and associated content are compressed according to <specref ref="compression"/> regardless of the <termref def="key-alignmentOption">alignment</termref> option value.
</p>

<!-- p>If <termref def="key-compressionOption">compression</termref> or  
<termref def="key-alignmentOption">alignment</termref> are off, the event codes and 
associated content are represented as a sequence of bit-encoded values. 
</p -->

<p>
<termdef id="key-strictOption">The <term>strict option</term> is a Boolean used to increase compactness by using a strict interpretation of the schemas and omitting preservation of certain items, such as comments, processing instructions and namespace prefixes.</termdef> The default value "false" is assumed when the "strict" element is absent in the <termref def="key-optionsDoc">EXI Options document</termref>.
When set to true, 
NS, CM, PI, ER and SC events are pruned from EXI grammars, and schema-informed element and type grammars are restricted to only permit items declared in the schemas.

The "strict" element MUST NOT appear in an <termref def="key-optionsDoc">EXI options document</termref> when the "preserve" element is present in the same options document.

</p>

<p>
<termdef id="key-fragmentOption">The <term>fragment option</term> is a Boolean that indicates whether the <termref def="key-exibody">EXI body</termref> is an <termref def="key-exidocument">EXI document</termref> or an <termref def="key-exifragment">EXI fragment</termref>.</termdef>  When set to true, the <termref def="key-exibody">EXI body</termref> is an <termref def="key-exifragment">EXI fragment</termref>. Otherwise, the <termref def="key-exibody">EXI body</termref> is an <termref def="key-exidocument">EXI document</termref>. 
Unlike EXI documents, EXI fragments are capable of representing multiple elements at the root level. They are analogous in concept to <xspecref spec="XML" ref='wf-entities'>external general parsed entities</xspecref> in XML in that they consist of a sequence of elements, processing instructions and comments in containers of their own that are physically separate from the documents in which they are to be used.

An EXI fragment is formally defined in terms of its grammar in Sections <specref ref="builtinFragGrammars"/> and <specref ref="informedFragGrammars"/>. 
The XML Information Set onto which an EXI stream is mapped contains a document information item if the stream represents an EXI document, and does not contain one if the stream represents an EXI fragment. The order among elements, processing instructions and comments that appear at the root in an EXI fragment is deemed significant and MUST be preserved by <termref def="key-exiprocessor">EXI processors</termref>.</p>

<p><termdef id="key-preserveOption">The <term>preserve option</term> is a set of Booleans that can be set independently to control whether certain information items are preserved in the EXI stream.</termdef> <specref ref="fidelityOptions"/> describes the set of information items affected by the preserve option.
The "preserve" element MUST NOT appear in an <termref def="key-optionsDoc">EXI options document</termref> when the "strict" element is present in the same options document.

</p>

<p><termdef id="key-selfContained">The <term>selfContained option</term> is a Boolean used to enable the use of self-contained elements in the EXI stream.</termdef> Self-contained elements may be read independently from the rest of the EXI body, allowing them to be indexed for random access. The "selfContained" element MUST NOT appear in an <termref def="key-optionsDoc">EXI options document</termref> when either the "compression" or the "pre-compression" element is present in the same options document.</p>

<p><termdef id="key-schemaIDOption">The <term>schemaID option</term> may be used to identify the schema information used when encoding the EXI body.</termdef> When the 
"schemaID" element in the <termref def="key-optionsDoc">EXI options document</termref> contains the 
xsi:nil attribute, no schema information was used when encoding the EXI body. When the value of the "schemaID" element is empty, no user defined schema information was used when encoding the EXI body; however, the built-in XML Schema types may have been used with the xsi:type attribute to specify element types. When the schemaID option is absent (i.e., undefined), no statement is made about the schema information used to encode the EXI body and this information
MUST be communicated out of band.
This specification does not dictate the syntax or semantics of other values specified in this field. An example schemaID scheme is the use of URIs, which are apt for globally identifying schema resources on the Web.
The parties involved in the exchange are free to agree on a scheme for the schemaID field that is appropriate for their use to uniquely identify the schema information. 

</p>

<p><termdef id="key-datatypeRepresentationOption">The <term>datatypeRepresentationMap option</term>,
represented by a "datatypeRepresentationMap" element,
identifies datatype representations used to encode
<termref def="key-valueContentItem"><emph>values</emph></termref> in
the <termref def="key-exibody">EXI body</termref>
as described in <specref ref="datatypeRepresentationMap"/>.</termdef></p>

<p><termdef id="key-blockSizeOption">The <term>blockSize option</term> specifies the block size used for EXI compression.</termdef> When the blockSize option is absent, the default block size of 1,000,000 is used. The default block size is intentionally large but can be reduced for processing large documents on devices with limited memory.</p> 

<p>
<termdef id="key-valueMaxLengthOption">
The <term>valueMaxLength option</term> specifies the maximum length of string values representing value content items to be considered for addition to the string table.
</termdef> 
When the valueMaxLength option is absent, the maximum length is unbounded.
String values representing <termref def="key-valueContentItem"><emph>value</emph></termref> content items that have length larger than the valueMaxLength option value are excluded from further consideration on account of <termref def="key-valuePartitionCapacityOption">valuePartitionCapacity</termref> for addition to the string table.
</p> 

<p>
<termdef id="key-valuePartitionCapacityOption">
The <term>valuePartitionCapacity option</term> specifies the total capacity of the global and all local value partitions of a string table, where the measurement unit of the capacity is the number of unique entries.</termdef>
When the valuePartitionCapacity option is absent, an unbounded capacity is assumed. A string representing a <termref def="key-valueContentItem"><emph>value</emph></termref> content item that has length smaller than or equal to the <termref def="key-valueMaxLengthOption">valueMaxLength</termref> option value and is not found in the value partitions at the time of the value occurrence is to be added into the string table only when doing so would not cause the number of unique values in value partitions to exceed the capacity.
The use of valuePartitionCapacity option value and the way the number of unique values are counted for value partitions are described in <specref ref="stringTablePartitions"/>.
</p> 
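<p>The interplay of the two options can be sketched as a single admission test. This is a simplified illustration rather than the normative procedure: the function and parameter names are hypothetical, string length is measured with Python's <code>len</code>, and the value partitions are modelled as one set of unique strings, whereas the actual counting rules are those given in <specref ref="stringTablePartitions"/>.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import Optional, Set


def should_add_to_string_table(
    value: str,
    known_values: Set[str],
    value_max_length: Optional[int] = None,          # None = unbounded
    value_partition_capacity: Optional[int] = None,  # None = unbounded
) -> bool:
    """Decide whether `value` is added to the value partitions.

    `known_values` stands in for the unique values currently held in
    the value partitions (a simplifying assumption of this sketch).
    """
    # Values longer than valueMaxLength are excluded from consideration.
    if value_max_length is not None and len(value) > value_max_length:
        return False
    # Only values not yet found in the value partitions are candidates.
    if value in known_values:
        return False
    # Adding must not push the number of unique values past the capacity.
    if (value_partition_capacity is not None
            and len(known_values) + 1 > value_partition_capacity):
        return False
    return True
```
]]></eg>
<p>When both options are absent, every value not already present in the value partitions is admitted, which matches the unbounded defaults described above.</p>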
<!-- p>
<termdef id="key-valuePartitionCapacityOption">
The <term>valuePartitionCapacity option</term> specifies the total capacity of the global and all local value partitions of a string table, where the measurement unit of the capacity is the number of characters.</termdef>
When the valuePartitionCapacity option is absent, an unbounded capacity is assumed. A string representing a <termref def="key-valueContentItem"><emph>value</emph></termref> content item that has length smaller than or equal to the <termref def="key-valueMaxLengthOption">valueMaxLength</termref> option value and is not found in the value partitions at the time of the value occurrence is to be added into the string table only when doing so would not cause the total number of characters in value partitions to exceed the capacity.
The use of valuePartitionCapacity option value and the way the number of characters are metered for value partitions are described in <specref ref="stringTablePartitions"/>.
</p --> 

<!-- It is encoded using the schema in Appendix <specref ref="optionsSchema"/>
with the options specified by the following XML document: -->

</div2>
</div1>

<div1 id="encodingEvents">
<head>Encoding EXI Streams</head>
<p>The rules for encoding a series of events as an EXI stream are very
simple and are driven by a declarative set of grammars that describes
the structure of an EXI stream. Every event in the stream is
encoded using the same set of encoding rules, which are summarized as
follows: </p>
<olist>
<item>Get the next event to be encoded</item>
<item>If fidelity options indicate this event type is not processed,
go to step 1</item>
<item>Use the grammars to determine the <termref def="key-eventcode">event code</termref> of the event</item>
<item>Encode the event code followed by the event content</item>
<item>Evaluate the grammar production matched by the event</item>
<item>Repeat until the End Document (ED) event is encoded</item></olist>

<p>Self-contained (SC), namespace (NS) and attribute (AT) events associated with a given element occur directly after the start element (SE) event in the following order:</p>

<table border="1" rules="cols">
<tbody>
<tr>
<td align="center" width="50" height="30">SC</td>
<td align="center" width="50">NS</td>
<td align="center" width="50">NS</td>
<td align="center" width="50">...</td>
<td align="center" width="50">NS</td>
<td align="center" width="100">AT (xsi:type)</td>
<td align="center" width="100">AT (xsi:nil)</td>
<td align="center" width="50">AT</td>
<td align="center" width="50">AT</td>
<td align="center" width="50">...</td>
<td align="center" width="50">AT</td>
</tr>
</tbody>
</table>

<p>
Namespace (NS) events occur in document order. AT(xsi:type) and AT(xsi:nil) occur before all other AT events. In a <termref def="key-schemaless-existream">schema-less EXI stream</termref>, the remaining attribute (AT) events can occur in any order. 
In a <termref def="key-schemainformed-existream">schema-informed EXI stream</termref>, the remaining attribute (AT) events occur in lexical order sorted first by <emph>qname</emph>'s local-name then by <emph>qname</emph>'s URI.

</p>
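<p>The attribute ordering described above can be sketched as follows. This is an illustration under stated assumptions: attributes are modelled as hypothetical <code>(uri, local_name)</code> tuples, "lexical order" is taken to mean Python's code point string comparison, and in the schema-less case the sketch simply keeps the input order for the remaining attributes, which is one of the permitted orders.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import List, Tuple

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

QName = Tuple[str, str]  # (namespace URI, local-name)


def order_attributes(attributes: List[QName], schema_informed: bool) -> List[QName]:
    """Return the attribute qnames in the order their AT events occur."""
    xsi_type = [a for a in attributes if a == (XSI_NS, "type")]
    xsi_nil = [a for a in attributes if a == (XSI_NS, "nil")]
    rest = [a for a in attributes
            if a != (XSI_NS, "type") and a != (XSI_NS, "nil")]
    if schema_informed:
        # Sort first by local-name, then by URI.
        rest = sorted(rest, key=lambda a: (a[1], a[0]))
    # AT(xsi:type) and AT(xsi:nil) precede all other AT events.
    return xsi_type + xsi_nil + rest
```
]]></eg>
<p>In a schema-informed stream, two attributes that share a local-name are therefore ordered by their namespace URI, while attributes with different local-names are ordered by local-name regardless of URI.</p>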

<p>EXI uses the same simple procedure described above to encode well-formed documents, document fragments, schema-valid information items, schema-invalid information items, information items partially described by schemas and information items with no schema at all. Only the grammars that describe these items differ. For example, an element with no schema information is encoded according to the XML grammar defined by the XML specification, while an element with schema information is encoded according to the more specific grammar defined by that schema. </p>

<p><termdef id="key-eventcode" term="Event Code">An <term>event code</term> is a sequence of 1 to 3 non-negative integers called parts. Each production in a grammar has an event code that distinguishes its event from that of other productions that share the same left-hand-side non-terminal symbol. </termdef></p>

<p>Section 
<specref ref="eventCodes"/> describes in detail how the grammar is used to determine the event code of an event. Section 
<specref ref="encodingEventCodes"/> describes in detail how event codes are represented as bits. Section 
<specref ref="fidelityOptions"/> describes available fidelity options and how they affect the EXI stream. Section 
<specref ref="encodingValues"/> describes how the typed event contents are represented as bits. </p>
<div2 id="eventCodes">
<head>Determining Event Codes</head>
<p>The structure of an EXI stream is described by the EXI grammars, which are formally specified in section 
<specref ref="grammars"/>. Each grammar defines which events are permitted to occur at any given point in the EXI stream and provides a pre-assigned event code for each event.</p>

<p>For example, the grammar productions below describe the events that can occur in a schema-informed EXI stream after the Start-Document (SD) event provided there are four global elements defined in the schema and provide an event code for each event:
</p>
<example>
<head>Example productions with event codes</head>

<table width="95%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3"><emph>DocContent</emph></td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SE ("A") 
<emph>DocEnd</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SE ("B") 
<emph>DocEnd</emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SE ("C") 
<emph>DocEnd</emph></td>
<td>2</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SE ("D") 
<emph>DocEnd</emph></td>
<td>3</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SE (*) 
<emph>DocEnd</emph></td>
<td>4.0</td></tr>
<tr>
<td></td>
<td></td>
<td>DT 
<emph>DocContent</emph></td>
<td>4.1</td></tr>
<tr>
<td></td>
<td></td>
<td>CH 
<emph>DocContent</emph></td>
<td>4.2</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>DocContent</emph></td>
<td>4.3.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>DocContent</emph></td>
<td>4.3.1</td></tr></tbody></table></example>
<p>At the point in an EXI stream where the above grammar productions are in effect, the event code of Start Element "A" (i.e. SE("A")) is 0. The event code of a DOCTYPE (DT) event at this point in the stream is 4.1, and so on. 
</p>
</div2>

<div2 id="encodingEventCodes">
<head>Representing Event Codes</head>

<p>Each event code is represented by a sequence of 1 to 3 parts that uniquely identify an event. 
Event code parts are encoded in order starting with the first part followed by subsequent parts.</p>

<p>
When the value of <termref def="key-compressionOption">compression option</termref> is false, and

<termref def="key-unaligned">bit-packed</termref> alignment option is used for the current processing of the stream,
the <emph>i</emph>th part of an event code is encoded using the minimum number of bits required to distinguish it from the <emph>i</emph>th part of the other sibling event codes in the current grammar. Specifically, the 
<emph>i</emph>th part of an event code is encoded as an <emph>n</emph>-bit unsigned integer (<specref ref="encodingBoundedUnsigned" />), of which 
<emph>n</emph> is &lceil; log <sub>2</sub> <emph>m</emph> &rceil; where <emph>m</emph> is the number of distinct values used as the 
<emph>i</emph>th part of its own and all its sibling event codes in the current grammar.
<!-- In cases, where there is only one distinct value for a given part, the part is omitted (i.e., encoded in log 
<sub>2</sub> 1 = 0 bits). -->
Two event codes are siblings at the <emph>i</emph>th part if and only if they share the same values in all preceding parts. All event codes are siblings at the first part.
</p>
<p>
On the other hand, when the value of <termref def="key-compressionOption">compression option</termref> is true, or either <termref def="key-bytealignment">byte-alignment</termref> or <termref def="key-precompression">pre-compression</termref> alignment option is used, 
the <emph>i</emph>th part of an event code is encoded using the minimum number of bytes instead of 
bits required to distinguish it from the <emph>i</emph>th part of the other sibling event codes in 
the current grammar.  Each part is encoded as an <emph>n</emph>-bit unsigned integer 
(<specref ref="encodingBoundedUnsigned" />), of which 
<emph>n</emph> is &lceil; log <sub>2</sub> <emph>m</emph> &rceil; where <emph>m</emph> is the 
number of distinct values used as the 
<emph>i</emph>th part of its own and all its sibling event codes in the current grammar.
The number of bytes used for the <emph>n</emph>-bit unsigned integer representation in this case 
is equal to &lceil; <emph>n</emph> / 8 &rceil;.</p>

<p>Regardless of the 
values of <termref def="key-compressionOption">compression option</termref> and <termref def="key-alignmentOption">alignment option</termref>, 
if there is only one distinct value for a given part, the part is omitted (i.e., encoded in log <sub>2</sub> 1 = 0 bits = 0 bytes).
</p>
<p>For example, the nine event codes shown in the 
<emph>DocContent</emph> grammar above have a value ranging from 0 to 4 for their first part. There are five distinct values needed to identify the first part of these event codes. Therefore, when EXI compression and alignment are not in effect, the first part can be encoded in &lceil; log <sub>2</sub> 5 &rceil; = 3 bits. In the same fashion, the numbers of bits used for encoding the second and third parts (if present) are calculated as &lceil; log <sub>2</sub> 4 &rceil; = 2 bits and &lceil; log <sub>2</sub> 2 &rceil; = 1 bit, respectively.
On the other hand, when EXI compression or alignment is in effect, the number of bytes used for each part is &lceil; 3 / 8 &rceil; = 1 byte for the first part, &lceil; 2 / 8 &rceil; = 1 byte for the second part and &lceil; 1 / 8 &rceil; = 1 byte for the third part.</p>
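<p>The following non-normative sketch reproduces the arithmetic of the example above. It assumes that the caller already knows the number of distinct values <emph>m</emph> for each part. The function names <code>part_bits</code>, <code>part_bytes</code> and <code>bit_packed_event_code</code> are illustrative only and are not defined by this specification.</p>
<eg><![CDATA[
```python
def part_bits(distinct_values: int) -> int:
    """Return ceil(log2(m)) for m >= 1 using integer arithmetic."""
    if distinct_values < 1:
        raise ValueError("a part needs at least one distinct value")
    # (m - 1).bit_length() equals ceil(log2(m)); it is 0 when m == 1,
    # which corresponds to an omitted part.
    return (distinct_values - 1).bit_length()


def part_bytes(distinct_values: int) -> int:
    """Return ceil(n / 8), the width used when bits are not packed."""
    return (part_bits(distinct_values) + 7) // 8


def bit_packed_event_code(parts, distinct_counts) -> str:
    """Render an event code as space-separated bit fields, one per part."""
    fields = []
    for value, count in zip(parts, distinct_counts):
        bits = part_bits(count)
        if bits > 0:  # parts with a single distinct value are omitted
            fields.append(format(value, f"0{bits}b"))
    return " ".join(fields)


# The DocContent example: 5, 4 and 2 distinct values per part.
print([part_bits(m) for m in (5, 4, 2)])              # [3, 2, 1]
print([part_bytes(m) for m in (5, 4, 2)])             # [1, 1, 1]
print(bit_packed_event_code((4, 3, 0), (5, 4, 2)))    # 100 11 0
```
]]></eg>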

<p>The table below illustrates how the event code of each event in the 
<emph>DocContent</emph> grammar above is encoded. </p>
<example>
<head>Example event code encoding</head>
<p></p>
<table border="1" width="95%">
<caption>Example event code encoding 
when EXI compression is not in effect and <termref def="key-unaligned">bit-packed</termref> alignment option is used
</caption>
<colgroup></colgroup>
<colgroup span="3" align="center"></colgroup>
<colgroup></colgroup>
<colgroup align="center"></colgroup>
<thead>
<tr>
<th width="30%">Event</th>
<th colspan="3">Part values</th>
<th width="40%">Event Code Encoding</th>
<th width="10%"># bits</th></tr>
</thead>
<tbody>
<tr>
<td>SE ("A")</td>
<td>0</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>000</td><td>3</td></tr>
<tr>
<td>SE ("B")</td>
<td>1</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>001</td><td>3</td></tr>
<tr>
<td>SE ("C")</td>
<td>2</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>010</td><td>3</td></tr>
<tr>
<td>SE ("D")</td>
<td>3</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>011</td><td>3</td></tr>
<tr>
<td>SE (*)</td>
<td>4</td>
<td>0</td>
<td>&nbsp;</td>
<td>100&nbsp;&nbsp;00</td><td>5</td></tr>
<tr>
<td>DT</td>
<td>4</td>
<td>1</td>
<td>&nbsp;</td>
<td>100&nbsp;&nbsp;01</td><td>5</td></tr>
<tr>
<td>CH</td>
<td>4</td>
<td>2</td>
<td>&nbsp;</td>
<td>100&nbsp;&nbsp;10</td><td>5</td></tr>
<tr>
<td>CM</td>
<td>4</td>
<td>3</td>
<td>0</td>
<td>100&nbsp;&nbsp;11&nbsp;&nbsp;0</td><td>6</td></tr>
<tr>
<td>PI</td>
<td>4</td>
<td>3</td>
<td>1</td>
<td>100&nbsp;&nbsp;11&nbsp;&nbsp;1</td><td>6</td></tr></tbody></table>
<table border="1" width="95%">
<colgroup></colgroup>
<colgroup span="3" align="center"></colgroup>
<colgroup></colgroup>
<colgroup></colgroup>
<tbody>
<tr>
<td width="30%"># distinct values ( 
<emph>m</emph>)</td>
<td>5</td>
<td>4</td>
<td>2</td>
<td width="40%">&nbsp;</td><td width="10%">&nbsp;</td></tr>
<tr>
<td><table border="0">
<tr><td># bits per part</td></tr>
<tr><td>&nbsp;&nbsp;&lceil; log <sub>2</sub> <emph>m</emph> &rceil;</td></tr>
</table></td>
<td>3</td>
<td>2</td>
<td>1</td>
<td>&nbsp;</td><td>&nbsp;</td></tr></tbody></table>
<p></p>

<table border="1" width="95%">
<caption>Example event code encoding 



when EXI compression is in effect, or either
<termref def="key-bytealignment">byte-alignment</termref> or <termref def="key-precompression">pre-compression</termref> alignment option is used
</caption>
<colgroup></colgroup>
<colgroup span="3" align="center"></colgroup>
<colgroup></colgroup>
<colgroup align="center"></colgroup>
<thead>
<tr>
<th width="30%">Event</th>
<th colspan="3">Part values</th>
<th width="40%">Event Code Encoding</th>
<th width="10%"># bytes</th></tr>
</thead>
<tbody>
<tr>
<td>SE ("A")</td>
<td>0</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>00000000</td><td>1</td></tr>
<tr>
<td>SE ("B")</td>
<td>1</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>00000001</td><td>1</td></tr>
<tr>
<td>SE ("C")</td>
<td>2</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>00000010</td><td>1</td></tr>
<tr>
<td>SE ("D")</td>
<td>3</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>00000011</td><td>1</td></tr>
<tr>
<td>SE (*)</td>
<td>4</td>
<td>0</td>
<td>&nbsp;</td>
<td>00000100&nbsp;&nbsp;00000000</td><td>2</td></tr>
<tr>
<td>DT</td>
<td>4</td>
<td>1</td>
<td>&nbsp;</td>
<td>00000100&nbsp;&nbsp;00000001</td><td>2</td></tr>
<tr>
<td>CH</td>
<td>4</td>
<td>2</td>
<td>&nbsp;</td>
<td>00000100&nbsp;&nbsp;00000010</td><td>2</td></tr>
<tr>
<td>CM</td>
<td>4</td>
<td>3</td>
<td>0</td>
<td>00000100&nbsp;&nbsp;00000011&nbsp;&nbsp;00000000</td><td>3</td></tr>
<tr>
<td>PI</td>
<td>4</td>
<td>3</td>
<td>1</td>
<td>00000100&nbsp;&nbsp;00000011&nbsp;&nbsp;00000001</td><td>3</td></tr></tbody></table>
<table border="1" width="95%">
<colgroup></colgroup>
<colgroup span="3" align="center"></colgroup>
<colgroup></colgroup>
<colgroup></colgroup>
<tbody>
<tr>
<td width="30%"># distinct values (<emph>m</emph>)</td>
<td>5</td>
<td>4</td>
<td>2</td>
<td width="40%">&nbsp;</td><td width="10%">&nbsp;</td></tr>
<tr>
<td><table border="0">
<tr><td># bytes per part</td></tr>
<tr><td>&nbsp;&nbsp;&lceil; (log <sub>2</sub> <emph>m</emph>) / 8 &rceil;</td></tr>
</table></td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>&nbsp;</td><td>&nbsp;</td></tr>
</tbody></table>
</example>

</div2>
<div2 id="fidelityOptions">
<head>Fidelity Options</head>
<p>Some XML applications do not require the entire XML feature set and would prefer to eliminate the overhead associated with unused features. For example, the SOAP 1.2 specification 
<bibref ref="soap12" /> prohibits the use of XML processing-instructions. In addition, there are many data-exchange use cases that do not require XML comments or DTDs. </p>
<p>Applications can use a set of fidelity options to specify the XML features they require. As specified in section 
<specref ref="pruningProductions"/>, EXI processors MUST use these fidelity options to prune the events that are not required from the grammars, improving compactness and processing efficiency. </p>
<p>The table below lists the fidelity options supported by this version of the EXI specification and describes the effect setting these options has on the EXI stream. </p>
<table border="1">
<caption>Fidelity options</caption>
<thead>
<tr>
<th>Fidelity option</th>
<th>Effect</th></tr>
</thead>
<tbody>
<tr>
<td>Preserve.comments</td>
<td>CM events are preserved</td></tr>
<tr>
<td>Preserve.pis</td>
<td>PI events are preserved</td></tr>
<!-- tr>
<td>Preserve.whitespace</td>
<td>CH events containing only insignificant whitespace are preserved</td></tr -->
<tr>
<td>Preserve.dtd</td>
<td>DOCTYPE and ER events are preserved</td></tr>
<tr>
<td id="key-preservePrefixesOption">Preserve.prefixes</td>
<td>NS events and namespace prefixes are preserved</td></tr>
<tr>
<td id="key-preserveLexicalValuesOption">Preserve.lexicalValues</td>
<td>Lexical form of element and attribute values is preserved 
in <termref def="key-valueContentItem"><emph>value</emph></termref> content items

</td></tr></tbody></table>
<!-- p>Which whitespace is deemed to be insignificant, depends on the available schema information and the xml:space attribute. If xml:space=&quot;preserve&quot; for the current element context or a schema exists and specifies that the content model of the current element is mixed, then all whitespace inside the element is significant. Otherwise, only the whitespace that occurs between consecutive, corresponding start tags and end tags is significant. </p -->
</div2></div1>

<div1 id="encodingValues">
<head>Representing Event Content</head>
<p>The content of each event in an EXI body is represented according to its type (see <specref ref='table2'/>). In the absence of external type information, attribute and character <emph>values</emph> are typed as String. </p>

<p><termdef id="key-exidatatype" term="EXI Datatype">EXI defines a minimal set of 
datatype representations 
called 
<term>
Built-in EXI datatype representations 
</term> that define how values are represented in EXI streams.</termdef>  When the <termref def="key-preserveLexicalValuesOption">preserve.lexicalValues</termref> option is false, 
<termref def="key-valueContentItem">values</termref> are represented 
according to their schema datatypes per <specref ref="builtInEXITypes"/> below using built-in EXI datatype representations 
as described in <specref ref="encodingDatatypes"/>.
 Otherwise, 
<termref def="key-valueContentItem">values</termref>
are represented as Strings with restricted character sets (see <specref ref='builtInRestrictedStrings'/> below). The following table lists the 
built-in EXI datatype representations, associated type identifiers and the XML Schema Language <bibref ref="schema2" /> built-in datatypes each is used to represent by default.</p>

<table border="1" id="builtInEXITypes">
<caption>Built-in EXI Datatypes</caption>
<thead>
<tr>
<th>
Built-in EXI Datatype Representation

</th>
<th>EXI Datatype ID</th>
<th colspan="2">
<xspecref spec="XS2" ref="built-in-datatypes">XML Schema Datatypes</xspecref>
</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">
<xspecref href="#encodingBinary">Binary</xspecref>
</td>
<td>xsd:base64Binary</td>
<td colspan="2"><emph>base64Binary</emph></td>
</tr>
<tr>
<!-- td/ -->
<td>xsd:hexBinary</td>
<td colspan="2"><emph>hexBinary</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingBoolean">Boolean</xspecref>
</td>
<td>xsd:boolean</td>
<td colspan="2"><emph>boolean</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingDateTime">Date-Time</xspecref>
</td>
<td>xsd:dateTime</td>
<td colspan="2"><emph>dateTime</emph>, <emph>time</emph>, <emph>date</emph>, <emph>gYearMonth</emph>, <emph>gYear</emph>, <emph>gMonthDay</emph>, <emph>gDay</emph>, <emph>gMonth</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingDecimal">Decimal</xspecref>
</td>
<td>xsd:decimal</td>
<td colspan="2"><emph>decimal</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingFloat">Float</xspecref>
</td>
<td>xsd:double</td>
<td colspan="2"><emph>float</emph>, <emph>double</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingBoundedUnsigned">n-bit Unsigned Integer</xspecref>
</td>
<td rowspan="3">xsd:integer</td>
<td colspan="2" rowspan="3">
<p>
<emph>integer</emph>, the representation of which depends on the 
<xspecref spec="XS2" ref="dt-facet">facet</xspecref>
values as follows.
</p>
<p>
When the bounded range of <emph>integer</emph> is 4095 or smaller as determined by the values of 
<xspecref spec="XS2" ref="rf-minInclusive">minInclusive</xspecref>, 
<xspecref spec="XS2" ref="rf-minExclusive">minExclusive</xspecref>, 
<xspecref spec="XS2" ref="rf-maxInclusive">maxInclusive</xspecref>
and 
<xspecref spec="XS2" ref="rf-maxExclusive">maxExclusive</xspecref>
facets, use <xspecref href="#encodingBoundedUnsigned">n-bit Unsigned Integer</xspecref> representation.
</p>
<p>
Otherwise, when the <emph>integer</emph> satisfies one of the following conditions, use <xspecref href="#encodingUnsignedInteger">Unsigned Integer</xspecref> representation.
</p>
<ulist>
<item>It is <emph>nonNegativeInteger</emph>.</item>
<item>Either <xspecref spec="XS2" ref="rf-minInclusive">minInclusive</xspecref> facet is specified with a value equal to or greater than 0, or
<xspecref spec="XS2" ref="rf-minExclusive">minExclusive</xspecref>  facet is specified with a value equal to or greater than -1.
</item>
</ulist>
<p>
Otherwise, use <xspecref href="#encodingInteger">Integer</xspecref> representation.
</p>

</td>
</tr>
<tr>
<td>
<xspecref href="#encodingUnsignedInteger">Unsigned Integer</xspecref>
</td>
<!-- td>&nbsp;</td -->
<!-- td colspan="2">&nbsp;</td -->
</tr>
<tr>
<td>
<xspecref href="#encodingInteger">Integer</xspecref>
</td>
<!-- td>&nbsp;</td -->
<!-- td colspan="2">&nbsp;</td -->
</tr>
<tr>
<td>
<xspecref href="#encodingString">String</xspecref>
</td>
<td>xsd:string</td>
<td colspan="2"><emph>string</emph>, <emph>anySimpleType</emph>, <emph>anyURI</emph>, <emph>duration</emph>, All types derived by <emph>union</emph></td>
</tr>
<tr>
<td>
<xspecref href="#encodingList">List</xspecref>
</td>
<td>&nbsp;</td>
<td colspan="2">All types derived by <emph>list</emph>, including
<emph>IDREFS</emph> and <emph>ENTITIES</emph></td></tr>
<tr>
<td>
<xspecref href="#encodingQName">QName</xspecref>
</td>
<td>&nbsp;</td>
<td colspan="2">
<!-- All element and attribute <emph>qnames</emph>,--> <!-- NOTE: these are not schema types. -->
<!-- <xspecref href='http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#QName'>QName</xspecref>, <xspecref href='http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#Notation'>Notation</xspecref>--> 
<!-- note : the qname type is not used for element/attribute values - only for element/attribute names -->
&nbsp;</td></tr>
</tbody></table>
<p>By default, datatypes derived from the XML Schema datatypes above are also represented according to the associated <termref def="key-exidatatype">built-in EXI datatype representation</termref>. When there is more than one XML Schema datatype above from which a datatype is derived directly or indirectly, the closest ancestor is used to determine the <termref def="key-exidatatype">built-in EXI datatype representation</termref>. For example, a value of XML Schema datatype xsd:int is represented according to the same <termref def="key-exidatatype">built-in EXI datatype representation</termref> as a value of XML Schema datatype xsd:integer. Although xsd:int is derived indirectly from xsd:integer and also further from xsd:decimal, a value of xsd:int is processed as an instance of xsd:integer because xsd:integer is closer to xsd:int than xsd:decimal is in the datatype inheritance hierarchy.</p>

<p>Each EXI datatype identifier above is a <termref def="key-qname">qname</termref>. Each datatype identifier uniquely identifies one of the 
<termref def="key-exidatatype">built-in EXI datatype representations</termref>.
Datatype identifiers are used by <termref def="key-datatypeRepresentationMaps">Datatype Representation Map</termref> to assign XML Schema datatypes to <termref def="key-exidatatype">built-in EXI datatype representations</termref> different from the ones that are associated by default. Not all 
<termref def="key-exidatatype">built-in EXI datatype representations</termref>
are assigned datatype identifiers. Only those that have identifiers are usable by <termref def="key-datatypeRepresentationMaps">Datatype Representation Map</termref> for designating alternative representations.
</p>
<p>When the <termref def="key-preserveLexicalValuesOption">preserve.lexicalValues</termref> option is true, all 
<termref def="key-valueContentItem"><emph>values</emph></termref> 
are represented as Strings. Some 
<termref def="key-valueContentItem"><emph>values</emph></termref> that would have otherwise been designated to certain 
<termref def="key-exidatatype">built-in EXI datatype representations</termref> 
are represented as Strings with restricted character sets as defined by the table below.</p>

<table border="1" width="95%" id='builtInRestrictedStrings'>
<caption>Restricted Character Sets for Built-in EXI 
Datatype Representations

</caption>
<colgroup width="20%"/>
<colgroup width="80%"/>
<thead>
<tr>
<th>EXI Datatype ID</th>
<th>Restricted Character Set</th>
</tr>
</thead>
<tbody>
<tr>
<td>xsd:base64Binary</td>
<td>{ #x9, #xA, #xD, #x20, +, /, [0-9], =, [A-Z], [a-z] } </td>
</tr>
<tr>
<td>xsd:hexBinary</td>
<td>{ #x9, #xA, #xD, #x20, [0-9], [A-F], [a-f] } </td>
</tr>
<tr>
<td>xsd:boolean</td>
<td>{ #x9, #xA, #xD, #x20, 0, 1, a, e, f, l, r, s, t, u } </td>
</tr>
<tr>
<td>xsd:dateTime</td>
<td>{ #x9, #xA, #xD, #x20, +, -, ., [0-9], :, T, Z } </td>
</tr>
<tr>
<td>xsd:decimal</td>
<td>{ #x9, #xA, #xD, #x20, +, -, ., [0-9] } </td>
</tr>
<tr>
<td>xsd:double</td>
<td>{ #x9, #xA, #xD, #x20, +, -, ., [0-9], E, F, I, N, a, e } </td>
</tr>
<tr>
<td>xsd:integer</td>
<td>{ #x9, #xA, #xD, #x20, +, -, [0-9] } </td>
</tr>
</tbody></table>

<p>The restricted character set for the EXI List datatype representation is determined by the EXI datatype representation of the values in the List.
</p>

<p>The rules used to represent values of String depend on the content items to which the values belong. There are certain content items whose value representation involves the use of string tables, while other content items are represented using the encoding rule described in <specref ref="encodingString"/> without involvement of string tables. The content items that use string tables, and how each such content item uses string tables to represent its values, are described in <specref ref="stringTable"/>.</p>
<p>Schemas can provide one or more enumerated values for types. EXI exploits those pre-defined values when they are available to represent values of such types in a more efficient manner than it would otherwise using built-in EXI datatypes. The encoding rule for representing a type of enumerated values is described in <specref ref="encodingEnumerations"/>. Types that are derived from other types by union and their subtypes are always represented as String regardless of the availability of enumerated values. The representation of values of which the schema type is one of QName, Notation or a type derived therefrom by restriction is also not affected by enumerated values, if any.
</p>
<!-- p>The encoding rule to represent schema types that are derived by list and their subtypes, including <xspecref href='http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#IDREFS'>IDREFS</xspecref> and <xspecref href='http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#ENTITIES'>ENTITIES</xspecref> is described in <specref ref="encodingList"/>.
</p -->

<div2 id="encodingDatatypes">
<head>Built-in EXI Datatype Representations</head>
<p>The following sections describe the encoding rules of <termref def="key-exidatatype">built-in EXI datatype representations</termref> for representing <termref def="key-valueContentItem"><emph>values</emph></termref> in EXI streams.
</p>
<div3 id="encodingBinary">
<head>Binary</head>
<p>Values typed as Binary are represented as a length-prefixed sequence of octets representing the binary content. The length is represented as an Unsigned Integer (see 
<specref ref="encodingUnsignedInteger"/>). </p></div3>
<div3 id="encodingBoolean">
<head>Boolean</head>

<p>In the absence of pattern facets in the schema datatype, values typed as Boolean are represented as an <emph>n</emph>-bit unsigned integer (<specref ref="encodingBoundedUnsigned" />), where <emph>n</emph> is one (1) and the value zero (0) represents false and the value one (1) represents true.
</p>
<p>Otherwise, when pattern facets are available in the schema datatype, the Boolean datatype representation is able to distinguish values not only arithmetically (0 or 1) but also between lexical variants ("0", "1", "false" and "true"), and values typed as Boolean are represented as an <emph>n</emph>-bit unsigned integer (<specref ref="encodingBoundedUnsigned" />), where <emph>n</emph> is two (2) and the values zero (0), one (1), two (2) and three (3) represent the values "false", "0", "true" and "1", respectively.
</p>
</div3>

<div3 id="encodingDecimal">
<head>Decimal</head>
<p>Values typed as Decimal are represented as a Boolean sign (see <specref ref="encodingBoolean"/>) followed by two  Unsigned Integers (see <specref
ref="encodingUnsignedInteger"/>). A sign value of zero (0) is used to represent positive Decimal values and a sign value of one (1) is used to represent negative Decimal values. The first Unsigned Integer represents the integral portion of the Decimal value. The second Unsigned Integer represents the fractional portion of the Decimal value with the digits in reverse order to preserve leading zeros.</p>
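<p>As a non-normative illustration of the rule above, the sketch below splits a decimal lexical value into the three components (sign, integral portion, fractional portion with reversed digits) and reassembles it. The function names are hypothetical, and the sketch assumes a plain lexical form such as "-12.034" with no exponent.</p>
<eg><![CDATA[
```python
def decimal_components(lexical: str):
    """Split a decimal lexical value into (sign, integral, reversed fraction)."""
    text = lexical.strip()
    sign = 1 if text.startswith("-") else 0
    text = text.lstrip("+-")
    integral, _, fraction = text.partition(".")
    # Reversing the digits turns leading zeros of the fraction into
    # trailing zeros of the integer, so "034" survives as 430.
    reversed_fraction = int(fraction[::-1]) if fraction else 0
    return sign, int(integral or "0"), reversed_fraction


def decimal_from_components(sign: int, integral: int, reversed_fraction: int) -> str:
    """Rebuild a lexical value from the three decoded components."""
    fraction = str(reversed_fraction)[::-1]
    return ("-" if sign else "") + str(integral) + "." + fraction


print(decimal_components("-12.034"))        # (1, 12, 430)
print(decimal_from_components(1, 12, 430))  # -12.034
```
]]></eg>
<p>Under these assumptions, trailing zeros of the fractional portion are not preserved, which changes the lexical form but not the value.</p>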
</div3>

<div3 id="encodingFloat">
<head>Float</head>
<p>Values typed as Float are represented as two consecutive Integers (see 
<specref ref="encodingInteger"/>). The first Integer represents the mantissa of the floating point number and the second Integer represents the base-10 exponent of the floating point number. The range of the mantissa is - (2<sup>63</sup>) to 2<sup>63</sup>-1 and the range of the exponent is - (2<sup>14</sup>-1) to 2<sup>14</sup>-1. Values typed as Float with a mantissa or exponent outside the accepted range are represented as schema-invalid values.</p>

<p>The exponent value -(2<sup>14</sup>) is used to indicate one of the special values: infinity, negative infinity and not-a-number (NaN). An exponent value -(2<sup>14</sup>) with mantissa values 1 and -1 represents 
positive infinity (INF) and negative infinity (-INF) respectively. An exponent value -(2<sup>14</sup>) with any other mantissa value represents NaN.
</p>

<p>A value represented as Float can be decoded by going through the following steps.</p>
<olist>
<item>Retrieve the mantissa value using the procedure described in <specref ref="encodingInteger"/>.</item>
<item>Retrieve the exponent value using the procedure described in <specref ref="encodingInteger"/>.</item>
<item>If the exponent value is -(2<sup>14</sup>), the mantissa value 1 represents INF, the mantissa value -1 represents -INF and any other mantissa value represents NaN. If the exponent value is not -(2<sup>14</sup>), the float value is <emph>m</emph> &times; 10<sup><emph>e</emph></sup> where <emph>m</emph> is the mantissa and <emph>e</emph> is the exponent obtained in the preceding steps.
</item>
</olist>
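<p>Step 3 above can be sketched as follows. This is a non-normative illustration: it assumes that the mantissa and exponent have already been decoded as Integers, and the name <code>float_from_parts</code> is hypothetical. The result is an IEEE 754 double, so mantissa and exponent combinations beyond the range of a double are rounded or overflow to infinity.</p>
<eg><![CDATA[
```python
SPECIAL_EXPONENT = -(2 ** 14)  # reserved for INF, -INF and NaN


def float_from_parts(mantissa: int, exponent: int) -> float:
    """Combine a decoded mantissa and base-10 exponent into a float."""
    if exponent == SPECIAL_EXPONENT:
        if mantissa == 1:
            return float("inf")
        if mantissa == -1:
            return float("-inf")
        return float("nan")
    # Parsing the textual form avoids the OverflowError that
    # 10.0 ** exponent raises for very large exponents.
    return float(f"{mantissa}e{exponent}")


print(float_from_parts(125, -2))              # 1.25
print(float_from_parts(-1, SPECIAL_EXPONENT)) # -inf
```
]]></eg>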
</div3>

<div3 id="encodingInteger">
<head>Integer</head>
<p>The Integer type supports signed integer numbers of arbitrary magnitude. Values typed as Integer are represented as a Boolean sign (see <specref ref="encodingBoolean" />) followed by an Unsigned Integer (see <specref ref="encodingUnsignedInteger" />). A sign value of zero (0) is used to represent non-negative integers and a sign value of one (1) is used to represent negative integers. For non-negative values, the Unsigned Integer holds the magnitude of the value. For negative values, the Unsigned Integer holds the magnitude of the value minus 1. </p>
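<p>The mapping between a signed value and its sign and Unsigned Integer components can be sketched as below. The sketch is non-normative and the function names are hypothetical. Storing the magnitude minus 1 for negative values means that zero has a single representation, with the sign set to zero.</p>
<eg><![CDATA[
```python
def integer_to_components(value: int):
    """Return (sign, unsigned) for a signed integer value."""
    if value >= 0:
        return 0, value
    # Negative values store the magnitude minus 1, so -1 maps to (1, 0).
    return 1, -value - 1


def integer_from_components(sign: int, unsigned: int) -> int:
    """Inverse of integer_to_components."""
    return unsigned if sign == 0 else -(unsigned + 1)


print(integer_to_components(42))    # (0, 42)
print(integer_to_components(-1))    # (1, 0)
print(integer_to_components(-128))  # (1, 127)
```
]]></eg>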
</div3>

<div3 id="encodingUnsignedInteger">
<head>Unsigned Integer</head>
<p>The Unsigned Integer type supports unsigned integer numbers of arbitrary magnitude. Values typed as Unsigned Integer are represented using a sequence of octets. The sequence is terminated by an octet with its most significant bit set to 0. The value of the unsigned integer is stored in the least significant 7 bits of the octets as a sequence of 7-bit bytes, with the least significant byte first. </p>
<p>EXI processors SHOULD support arbitrarily large Unsigned Integer values. EXI processors MUST support Unsigned Integer values less than 4294967296.</p>
<!-- Unsigned Integer values SHOULD be stored in the minimum number of required octets. -->
<p>A value represented as Unsigned Integer can be decoded by going through the following steps.</p> <example>
<head>
Example algorithm for decoding an Unsigned Integer
</head>
<olist>
<item>Start with the initial value set to 0 and the initial multiplier set to 1.</item>
<item>Read the next octet.</item>
<item>Multiply the value of the unsigned number represented by the 7 least significant bits of the octet by the current multiplier and add the result to the current value.</item>
<item>Multiply the multiplier by 128.</item>
<item>If the most significant bit of the octet was 1, go back to step 2.</item>
</olist>
</example>
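<p>The decoding steps above, together with a matching encoder, can be sketched as follows. This is a non-normative illustration that operates on whole octets. The function names are hypothetical, and the encoder emits the minimum number of octets, which is a choice made by this sketch.</p>
<eg><![CDATA[
```python
def encode_unsigned_integer(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first."""
    if value < 0:
        raise ValueError("Unsigned Integer cannot represent negative values")
    octets = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            octets.append(low7 | 0x80)  # more octets follow
        else:
            octets.append(low7)         # most significant bit 0 terminates
            return bytes(octets)


def decode_unsigned_integer(octets):
    """Return (value, number of octets consumed) following steps 1 to 5."""
    value, multiplier = 0, 1
    for consumed, octet in enumerate(octets, start=1):
        value += (octet & 0x7F) * multiplier
        multiplier *= 128
        if not octet & 0x80:
            return value, consumed
    raise ValueError("input ended before the terminating octet")


print(encode_unsigned_integer(300).hex())        # ac02
print(decode_unsigned_integer(b"\xac\x02"))      # (300, 2)
```
]]></eg>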

<p/>
</div3>

<div3 id="encodingQName">
<head>QName</head>
<p>Values of type QName are encoded as a sequence of values representing the URI, local-name and prefix components of the QName in that order, where the prefix component is present only when the <termref def="key-preservePrefixesOption">preserve.prefixes</termref> option is set to true.
</p>
<p>When the QName value is specified by a schema-informed grammar using the SE(<emph>qname</emph>) or AT(<emph>qname</emph>) terminal symbols, the URI and local-name are implicit and are omitted.
Similarly, when the URI of the QName value is derived from a schema-informed grammar using SE(<emph>uri</emph>:&nbsp;*) terminal symbols, the URI is implicit and thus omitted in the representation, and only the local-name component is encoded as a String (see <specref ref="encodingString"/>).
Otherwise, URI and local-name components are encoded as Strings. 
If the QName is in no namespace, the URI is represented by a zero length String. 
</p>
<p>When present, prefixes are represented as <emph>n</emph>-bit unsigned integers (<specref ref="encodingBoundedUnsigned" />), where <emph>n</emph> is &lceil; log <sub>2</sub> <emph>N</emph> &rceil; and <emph>N</emph> is the number of unique <emph>prefix</emph>es specified for the URI of the QName by preceding NS events in the EXI stream. Each unique <emph>prefix</emph> is assigned a unique <emph>n</emph>-bit integer (0 ... <emph>N</emph>-1) according to the order in which the associated NS event occurs in the EXI stream. If there are no <emph>prefix</emph>es specified for the URI of the QName by preceding NS events in the EXI stream, the prefix is undefined. An undefined prefix is represented using zero bits (i.e., omitted).
</p>
<p>Given either an <emph>n</emph>-bit unsigned integer <emph>m</emph> that represents the prefix value or an undefined prefix, the effective prefix value is determined by following the rules described below in order. A QName is in error if it has an undefined prefix that cannot be resolved by the rules below.
</p>
<olist>
<item>If the prefix is defined, select the <emph>m</emph>-th <emph>prefix</emph> value associated with the URI of the QName as the candidate prefix value. Otherwise, there is no candidate prefix value.
</item>
<item>If the QName value is part of an SE event followed by an associated NS event with 
a <termref def="key-indicatorContentItem"><emph>local-element-ns</emph></termref> flag value 
being true, the prefix value is the <emph>prefix</emph> of such NS event. Otherwise, the prefix value is the candidate value, if any, selected in step 1 above.
</item>
</olist>
</div3>

<div3 id="encodingDateTime">
<head>Date-Time</head>
<p>Values typed as Date-Time are encoded as a sequence
of values representing the individual components of the Date-Time. The
following table specifies each of the possible date-time components
along with how they are encoded.</p>
<table border="1">
<caption>Date-Time components</caption>
<thead>
<tr>
<th>Component</th>
<th>Value</th>
<th>Type</th></tr>
</thead>
<tbody>
<!-- tr>
<td>Type</td>
<td>The type of date (see below)</td>
<td>3-bit Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>)</td></tr -->
<tr>
<td>Year</td>
<td>Offset from 2000</td>
<td>Integer ( 
<specref ref="encodingInteger"/>)</td></tr>
<tr>
<td>MonthDay</td>
<td>
Month * 32 + Day
</td>
<td>
9-bit Unsigned Integer (<specref
ref="encodingBoundedUnsigned"/>) where day is a value in the range 1-31 and month is a value in the range 1-12.
</td></tr>
<tr>
<td>Time</td>
<td>((Hour * 60) + Minutes) * 60 + seconds</td>
<td>17-bit Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>)</td></tr>
<!-- tr>
<td>FractionalSecs?</td>
<td>Boolean presence indicator</td>
<td>Boolean (<specref ref="encodingBoolean"/>)</td></tr -->
<tr>
<td>FractionalSecs</td>
<td>Fractional seconds</td>
<td>Unsigned Integer ( 
<specref ref="encodingUnsignedInteger"/>) representing the fractional part of the seconds with digits in reverse order to preserve leading zeros</td></tr>
<!-- tr>
<td>TimeZone?</td>
<td>Boolean presence indicator</td>
<td>Boolean (<specref ref="encodingBoolean"/>)</td></tr-->
<tr>
<td>TimeZone</td>
<td>TZHours * 60 + TZMinutes</td>
<td>11-bit Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>) representing a signed integer offset by 840 ( = 14 * 60 )</td></tr>
<tr>
<td>presence</td>
<td>Boolean presence indicator</td>
<td>Boolean (<specref ref="encodingBoolean"/>)</td></tr>
</tbody></table>
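<p>The component values in the table above can be computed as in the following non-normative sketch. The function names are hypothetical, and the time zone is assumed to be supplied as a single signed offset in minutes (TZHours * 60 + TZMinutes) in the range -840 to 840.</p>
<eg><![CDATA[
```python
def year_value(year: int) -> int:
    """Year component: offset from 2000, encoded as an Integer."""
    return year - 2000


def month_day_value(month: int, day: int) -> int:
    """MonthDay component: fits in 9 bits because 12 * 32 + 31 = 415."""
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("month must be 1-12 and day must be 1-31")
    return month * 32 + day


def time_value(hour: int, minutes: int, seconds: int) -> int:
    """Time component: fits in 17 bits because 86399 is below 2 ** 17."""
    return (hour * 60 + minutes) * 60 + seconds


def time_zone_value(offset_minutes: int) -> int:
    """TimeZone component: signed offset shifted by 840 to be non-negative."""
    if not -840 <= offset_minutes <= 840:
        raise ValueError("time zone offset must be within 14 hours")
    return offset_minutes + 840


# 2008-09-19T11:51:04-05:00
print(year_value(2008), month_day_value(9, 19),
      time_value(11, 51, 4), time_zone_value(-300))  # 8 307 42664 540
```
]]></eg>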
<p>
The variety of components that constitute a value and their appearance order depend on the XML Schema type associated with the value. The following table shows which components are included in a value of each XML Schema type that is relevant to the Date-Time datatype. Items listed in square brackets are included if and only if the value of their preceding presence indicator (specified above) is set to true.</p>
<table border="1">
<caption>Assortment of Date-Time components</caption>
<thead>
<tr>
<th>XML Schema Datatype</th>
<th>Included Components</th></tr>
</thead>
<tbody>
<tr>
<td><xspecref spec="XS2" ref='gYear'>gYear</xspecref></td>
<td>Year, presence, [TimeZone]</td></tr>
<tr>
<td><xspecref spec="XS2" ref='gYearMonth'>gYearMonth</xspecref></td>
<td rowspan="2">Year, MonthDay, presence, [TimeZone]</td></tr>
<tr>
<td><xspecref spec="XS2" ref='date'>date</xspecref></td>
<!-- td>Year, MonthDay, [TimeZone]</td --></tr>
<tr>
<td><xspecref spec="XS2" ref='dateTime'>dateTime</xspecref></td>
<td>Year, MonthDay, Time, presence, [FractionalSecs], presence, [TimeZone]</td></tr>
<tr>
<td><xspecref spec="XS2" ref='gMonth'>gMonth</xspecref></td>
<td rowspan="3">MonthDay, presence, [TimeZone]</td></tr>
<tr>
<td><xspecref spec="XS2" ref='gMonthDay'>gMonthDay</xspecref></td>
<!-- td>MonthDay, [TimeZone]</td --></tr>
<tr>
<td><xspecref spec="XS2" ref='gDay'>gDay</xspecref></td>
<!-- td>MonthDay, [TimeZone]</td --></tr>
<tr>
<td><xspecref spec="XS2" ref='time'>time</xspecref></td>
<td>Time, presence, [FractionalSecs], presence, [TimeZone]</td></tr></tbody></table></div3>

<div3 id="encodingBoundedUnsigned">
<head><emph>n</emph>-bit Unsigned Integer</head>
<p>
When the value of <termref def="key-compressionOption">compression option</termref> is false and
the value <termref def="key-unaligned">bit-packed</termref> is used for <termref def="key-alignmentOption">alignment options</termref>, 
values of type 
<emph>n</emph>-bit Unsigned Integer are represented as an unsigned binary integer using <emph>n</emph> bits. 
Otherwise, they are represented as an unsigned integer using the minimum number of bytes required to store 
<emph>n</emph> bits. Bytes are ordered with the least significant byte first.</p>

<!-- p>The n-bit unsigned integer encoding is also used to encode <emph>bounded integers</emph>. 
These are integer values that have been constrained explicitly through the use of schema facets 
(for example, XML schema minInclusive and maxInclusive facets) or implicitly through the use 
of a restricted data type (for example, the XML schema <emph>unsignedByte</emph> type).</p -->

<p>
The <emph>n</emph>-bit unsigned integer is used for encoding <termref def="key-eventcode">event codes</termref>, the prefix component of QName (see <specref ref="encodingQName" />), as well as certain value content items, as described in the respective relevant parts of this document. As shown in table <specref ref="builtInEXITypes" />, integers with bounded range size <emph>m</emph> equal to 4095 or smaller are encoded using an <emph>n</emph>-bit unsigned integer with <emph>n</emph> being &lceil; log <sub>2</sub> <emph>m</emph> &rceil;, as an offset from the minimum value in the range.
</p>
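<p>The following non-normative sketch illustrates the offset encoding of a bounded integer and the two representations described in this section. It assumes that the range size <emph>m</emph> counts the values between the minimum and the maximum inclusive, which is an interpretation made by this sketch. The function names are hypothetical.</p>
<eg><![CDATA[
```python
def bounded_integer_fields(value: int, minimum: int, maximum: int):
    """Return (offset, n) for a value in the inclusive range [minimum, maximum]."""
    if not minimum <= value <= maximum:
        raise ValueError("value lies outside the bounded range")
    range_size = maximum - minimum + 1        # assumed meaning of m
    n = (range_size - 1).bit_length()         # ceil(log2(m)) for m >= 1
    return value - minimum, n


def bit_packed(offset: int, n: int) -> str:
    """n-bit form used when compression is false and alignment is bit-packed."""
    return format(offset, f"0{n}b") if n > 0 else ""


def byte_aligned(offset: int, n: int) -> bytes:
    """Minimum number of bytes for n bits, least significant byte first."""
    return offset.to_bytes((n + 7) // 8, "little")


offset, n = bounded_integer_fields(7, 3, 10)
print(offset, n)                   # 4 3
print(bit_packed(offset, n))       # 100
print(byte_aligned(offset, n))     # b'\x04'
```
]]></eg>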

<!-- p>A bounded integer value is encoded as an offset (or delta) from the minimum value in the range. 
It is encoded in the minimum number of bits that would be necessary to hold any value within the 
full range.  For example, if an integer is constrained to have a value between 3 and 10 
(inclusively) and the value to be encoded is 7, the number encoded would be 7 - 3 = 4 and the 
number of bits needed would be 3.</p>

<p>If the range defined by the bounds is large, the average number of bits needed to encode a 
set of values can be larger than the number of bits needed if those values are encoded as 
variable-length integers (see <specref ref="encodingInteger"/>). For this reason, a maximum 
range value is imposed such that if the value to be encoded is larger than this maximum, 
variable-length integer encoding is done.  The maximum range value is 4095 which equates to a 
bit field length of no more than 12 bits.</p -->
</div3>

<div3 id="encodingString">
<head>String</head>
<p>Values of type String are represented as a length-prefixed sequence of
characters. The length indicates the number of characters in the
string and is represented as an Unsigned Integer (see <specref
ref="encodingUnsignedInteger"/>). If a restricted character set is defined for the string (see <specref ref="restrictedCharSet"/>), each character is represented as an <emph>n</emph>-bit Unsigned Integer (see <specref ref="encodingBoundedUnsigned"/>). Otherwise, each character is represented by its UCS
<bibref ref="ISO10646"/>
code point encoded as an Unsigned Integer (see <specref ref="encodingUnsignedInteger"/>).
</p>
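<p>The unrestricted case can be sketched as follows. This is a non-normative illustration: the helper names are hypothetical, and <code>encode_unsigned</code> assumes that the Unsigned Integer representation defined in <specref ref="encodingUnsignedInteger"/> uses 7-bit groups, least significant group first, with the high bit of each octet set when more octets follow.</p>
<example>
<head>Illustrative sketch (non-normative): length-prefixed String encoding without a restricted character set</head>
<eg xml:space="preserve">
```python
def encode_unsigned(value: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, least significant first.

    Assumption: the high bit of each octet signals that more octets follow.
    """
    out = bytearray()
    while value >= 128:
        out.append(value % 128 + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(text: str) -> bytes:
    """Encode the length in characters, then each UCS code point."""
    out = bytearray(encode_unsigned(len(text)))
    for ch in text:
        out += encode_unsigned(ord(ch))
    return bytes(out)


print(list(encode_string("hi")))
```
</eg>
</example>
<p>Python strings are sequences of code points, so <code>len(text)</code> matches the character count used for the length prefix.</p>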
<p>EXI uses a string table to represent certain
content items more efficiently. Section <specref ref="stringTable"/>
describes the string table and how it is applied to different content
items.</p>
<div4 id="restrictedCharSet">
<head>Restricted Character Sets</head>
<p>If a string value is associated with a schema datatype and one or more of the datatypes in its datatype hierarchy has one or more pattern facets, there may be a restricted character set defined for the string value. The following steps are used to determine the restricted character set, if any, defined for a given string value associated with such a schema datatype.
</p>
<p>First, determine the character set for each datatype in the datatype hierarchy of the string value that has one or more pattern facets according to section <specref ref="regexToCharset"/>. For each datatype with more than one pattern facet, compute the restricted character set based on the union of the regular expressions specified by its pattern facets. If the restricted character set for a datatype contains at least 255 characters or contains non-BMP characters, the character set of the datatype is not restricted and can be omitted from further consideration.</p>

<p>Then, compute the restricted character set for the string value as the intersection of all the character sets computed above. If the resulting character set contains fewer than 255 characters, the string value has a restricted character set and each character is represented using an <emph>n</emph>-bit Unsigned Integer (see <specref ref="encodingBoundedUnsigned"/>), where <emph>n</emph> is &lceil; log<sub>2</sub>(<emph>N</emph> + 1) &rceil; and <emph>N</emph> is the number of characters in the restricted character set.</p>

<p>The characters in the restricted character set are sorted by UCS 
<bibref ref="ISO10646"/>

code point and represented by integer values in the range (0 ... <emph>N</emph>-1) according to their ordinal position in the set. Characters that are not in this set are represented by the integer <emph>N</emph> followed by the UCS code point of the character represented as an Unsigned Integer.</p>
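<p>The mapping from characters to integer values can be sketched as follows. This is a non-normative illustration that assumes the restricted character set has already been determined; the function name <code>encode_restricted</code> and the tuple layout of its result are hypothetical.</p>
<example>
<head>Illustrative sketch (non-normative): character codes for a restricted character set</head>
<eg xml:space="preserve">
```python
def encode_restricted(charset, text):
    """Return (n, items), where each item is (code, escaped_code_point)."""
    ordered = sorted(set(charset))       # sorted by code point
    size = len(ordered)                  # N
    n = size.bit_length()                # equals ceil(log2(N + 1))
    position = {ch: i for i, ch in enumerate(ordered)}
    items = []
    for ch in text:
        if ch in position:
            items.append((position[ch], None))
        else:
            # Escape: the value N, then the code point as an Unsigned Integer.
            items.append((size, ord(ch)))
    return n, items


# A set of ten digits needs n = 4 bits; "a" is outside the set.
print(encode_restricted("0123456789", "42a"))
```
</eg>
</example>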

<!-- reworded for clarity: jcs 12/13/07
<ol>
<li> </li>
<li>If the string value does not have have a datatype If the datatype is an ur-type, the character set of the datatype is the entire XML character set.</li>
<li>Otherwise, the character set of the datatype is determined as follows.
</li>
<ol>
<li>If the datatype does not have pattern facets specified within its own definition, the character set of the datatype equals to the character set of its base datatype.
</li>
<li>Otherwise, "local character set" of the datatype is obtained by making union of all the character sets derived from the patterns (i.e. regular expressions) specified within its own datatype definition. See <specref ref="regexToCharset"/> for how to derive a character set from a pattern. Then the character set of the datatype equals to the intersection of the local character set and base datatype's character set.
</li></ol></ol>
<p>Given the number of member characters <emph>N</emph> in the character set in effect, when <emph>N</emph> is greater than 255 or non-BMP characters are included in the set, each character in the string is encoded as Unsigned Integer (see <specref ref="encodingUnsignedInteger"/>) representing its UCS <bibref ref="ISO10646"/> code point. Otherwise (i.e. <emph>N</emph> is equal to or smaller than 255 and only BMP <bibref ref="ISO10646"/> characters are contained in the set), <emph>n</emph>-bit Unsigned Integer (see <specref ref="encodingBoundedUnsigned"/>) is used for character representation, where <emph>n</emph> equals to log<sub>2</sub>(<emph>N</emph> + 1). The characters in the character set are sorted by UCS code point and character serial numbers are assigned sequentially (0 ... <emph>N</emph>-1, inclusively). These serial numbers are used in the <emph>n</emph>-bit representation, and the serial number <emph>N</emph> is reserved to indicate a character that does not participate in the character set. Character serial number <emph>N</emph> is always followed by the UCS code point of the character represented as an Unsigned Integer.
</p>
-->
<p>The figure below illustrates an overview of the process for determining and using restricted character sets described in this section. </p>
<graphic source="restrictedCharset.png" alt="String Processing Model"/>
</div4>
</div3>
<div3 id="encodingList">
<head>List</head>
<p>Values of type List are encoded as a length-prefixed
sequence of values. The length is encoded as an Unsigned Integer (see
<specref ref="encodingUnsignedInteger"/>) and each value is encoded according
to its type (see <specref ref="encodingValues"/>).</p>
</div3>

</div2>
<div2 id="encodingEnumerations">
<head>Enumerations</head>
<p>Values of enumerated types are encoded as
<emph>n</emph>-bit Unsigned Integers (<specref ref="encodingBoundedUnsigned"/>) where <emph>n</emph> = &lceil; log <sub>2</sub> <emph>m</emph> &rceil; and <emph>m</emph> is the number of items
in the enumerated type. The value assigned to each item corresponds to
its ordinal position in the enumeration in schema-order starting with
position zero (0).</p>
<p>The exceptions are schema types derived from others by union and their subtypes, as well as QName, Notation and types derived from them by restriction. The values of such types are processed by their respective built-in EXI datatype representations instead of being represented as enumerations.</p>
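<p>As a non-normative illustration, the ordinal and bit width for an enumerated value could be computed as sketched below. The function name <code>encode_enum</code> and the sample items are hypothetical.</p>
<example>
<head>Illustrative sketch (non-normative): ordinal position and bit width of an enumerated value</head>
<eg xml:space="preserve">
```python
def encode_enum(items, value):
    """Return (ordinal, n) for value, given the items in schema order."""
    n = (len(items) - 1).bit_length()    # ceil(log2(m)) for m >= 1
    return items.index(value), n         # positions start at zero


print(encode_enum(["red", "green", "blue"], "blue"))
```
</eg>
</example>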
</div2>

<div2 id="stringTable">
<head>String Table</head>
<p>EXI uses a string table to assign "compact identifiers" to some
string values. Occurrences of string values found in the string table
are represented using the associated compact identifier rather than
encoding the entire "string literal". The string table is initially pre-populated with
string values that are likely to occur in certain contexts and is
dynamically expanded to include additional string values encountered
in the document. The following content items are encoded using a
string table: </p>

<ulist>
<item>
<termref def="key-uriContentItem"><emph>uris</emph></termref></item>
<item>
<termref def="key-prefixContentItem"><emph>prefixes</emph></termref></item>
<item>
<emph>uri</emph> and 
<emph>local-name</emph>
in <termref def="key-qnameContentItem"><emph>qnames</emph></termref>
</item>
<item>
<termref def="key-valueContentItem"><emph>values</emph></termref></item></ulist>

<p>When a string value is found in the string table, the value is encoded
using the compact identifier and no changes are made to the string table as a result.
When a string value is not found in the string table, its string literal is encoded
as a String without using a compact identifier. Only after that is
the string table augmented with the string value and a newly assigned
compact identifier, unless the string value represents a value content item
and fails to satisfy the criteria in effect by virtue of the <termref def="key-valuePartitionCapacityOption">valuePartitionCapacity</termref> and <termref def="key-valueMaxLengthOption">valueMaxLength</termref> options.</p>

<p>The string table is divided into partitions and each partition is
optimized for more frequent use of either compact identifiers or string literals
depending on the purpose of the partition. Section <specref
ref="stringTablePartitions"/> describes how EXI string table is
partitioned. Section <specref ref="encodingOptimizedForHits"/>
describes how string values are encoded when the associated partition
is optimized for more frequent use of compact identifiers. Section <specref
ref="encodingOptimizedForMisses"/> describes how string values are
encoded when the associated partition is optimized for more frequent use
of string literals.</p>
<p>The life cycle of a string table spans the processing of 
a single EXI stream. String tables are not represented in an EXI stream or exchanged
between EXI processors. A string table cannot be reused across multiple EXI streams;
therefore, EXI processors MUST use a string table that is equivalent to
the one that would have been newly created and pre-populated with initial
values for processing each EXI stream.
</p>


<div3 id="stringTablePartitions">
<head>String Table Partitions</head>
<p>The string table is organized into partitions
so that the indices assigned to compact identifiers can stay relatively small.
Smaller indices improve average compactness and the efficiency
of table operations. Each partition has a separate set of compact identifiers and
content items are assigned to specific partitions as described below.
</p>
<p><termref def="key-uriContentItem"><emph>Uri</emph></termref> content items and the URI portion of <emph>qname</emph> content items are assigned to the uri
partition. The uri partition is optimized for frequent use of compact identifiers and is
pre-populated with initial entries as described in <specref ref="initialUriValues"/>.
When a schema is provided, the uri partition is also pre-populated with
each target namespace URI declared in the schema,
plus some of the namespace URIs used in wildcard terms (see section <specref ref="wildcardTerms"/> for the condition),
appended in lexicographical order.</p>

<p><termref def="key-prefixContentItem"><emph>Prefix</emph></termref> content items are assigned to partitions based
on their associated namespace URI. Partitions containing
<emph>prefix</emph> content items are optimized for frequent use of compact identifiers and the
string table is pre-populated with entries as described in
<specref ref="initialPrefixValues"/>.</p>

<p>
The local-name portions of <termref def="key-qnameContentItem"><emph>qname</emph></termref>
content items are assigned to partitions based
on the namespace URI of
the <emph>qname</emph> content item of which the local-name is a part. Partitions containing local-names are optimized for frequent use of string literals and the string table is pre-populated
with entries as described in <specref ref="initialLocalNames"/>.
When a schema is provided, the string table is also pre-populated with the
local name of each attribute, element and type declared in the
schema, partitioned by namespace URI and sorted lexicographically.</p>

<!-- p><termref def="key-nameContentItem"><emph>Name</emph></termref> content items are assigned to the
name partition. The name partition is
optimized for frequent use of string literals and is initially empty.</p -->

<p>
<termref def="key-valueContentItem"><emph>Value</emph></termref>
content items are assigned to both the global value partition
and the "local" value partition that corresponds to the
<emph>qname</emph> of the attribute or element in context at the time
the string table is looked up, provided that the string value is found in neither the global nor the "local" value partition.
Partitions containing <termref def="key-valueContentItem"><emph>value</emph></termref>
content items are optimized for frequent use of string literals and are initially empty.
<termdef id="key-valueAmount">
All value partitions in a string table share a single variable <term><emph>valueAmount</emph></term> the value of which is a non-negative integer that reflects the current number of unique values in value partitions.
</termdef>
<!-- termdef id="key-valueAmount">
All value partitions in a string table share a single variable <term><emph>valueAmount</emph></term> the value of which is a non-negative integer that reflects the current total number of characters in value partitions.
</termdef -->
Its value is initially set to 0 (zero) and changes while processing an EXI stream per the rule described in <specref ref="encodingOptimizedForMisses"/>.

</p>
</div3>

<div3 id="encodingOptimizedForHits">
<head>Partitions Optimized for Frequent Use of Compact Identifiers</head>
<p>String table partitions that are expected to contain a relatively
small number of entries used repeatedly throughout the document are
optimized for the frequent use of compact identifiers. This includes the <termref def="key-uriContentItem"><emph>uri</emph></termref> partition and
all partitions containing <termref def="key-prefixContentItem"><emph>prefix</emph></termref> content items. </p>

<p>When a string value is found in a partition optimized for frequent use of compact identifiers,
the string value is represented as the value (<emph>i</emph>+1)
encoded as an <emph>n</emph>-bit Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>), where
<emph>i</emph> is the value of the compact identifier, <emph>n</emph> is
&lceil; log<sub>2</sub> (<emph>m</emph>+1) &rceil; and <emph>m</emph> is the number of
entries in the string table partition at the time of the operation.
</p>

<p>When a string value is not found in a partition optimized for frequent use of compact identifiers,
the string value is represented as zero (0) encoded as an
<emph>n</emph>-bit Unsigned Integer, followed by the string literal
encoded as a String (<specref ref="encodingString"/>). After
the string value is encoded, it is added to the string table partition
and assigned the next available compact identifier <emph>m</emph>.</p>
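<p>The hit and miss cases above can be sketched as follows. This is a non-normative illustration: the class name <code>HitOptimizedPartition</code>, the returned tuple layout and the initial entries are hypothetical, and the actual initial entries are those given in the sections referenced above.</p>
<example>
<head>Illustrative sketch (non-normative): a partition optimized for compact identifiers</head>
<eg xml:space="preserve">
```python
class HitOptimizedPartition:
    """Sketch of a partition optimized for compact identifiers."""

    def __init__(self, initial=()):
        self.entries = list(initial)

    def encode(self, value):
        """Return (code, n, literal); literal is None on a hit."""
        m = len(self.entries)
        n = m.bit_length()               # equals ceil(log2(m + 1))
        if value in self.entries:
            i = self.entries.index(value)
            return i + 1, n, None        # hit: i + 1 as an n-bit integer
        self.entries.append(value)       # miss: the new identifier is m
        return 0, n, value               # zero, then the string literal


uris = HitOptimizedPartition(["", "urn:example:a"])  # hypothetical entries
print(uris.encode("urn:example:a"))   # hit
print(uris.encode("urn:example:b"))   # miss, entry added
print(uris.encode("urn:example:b"))   # hit on the new entry
```
</eg>
</example>
<p>Reserving the value zero for misses is why the bit width is computed from <emph>m</emph>+1 rather than <emph>m</emph>.</p>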
</div3>

<div3 id="encodingOptimizedForMisses">
<head>Partitions Optimized for Frequent Use of String Literals</head>
<p>The remaining string table partitions are optimized for
the frequent use of string literals. This includes all string table partitions containing
local-names
and all string table partitions containing <termref def="key-valueContentItem"><emph>value</emph></termref> content
items.</p>

<p>When a string value is found in the partitions containing
local-names, the
string value is represented as zero (0) encoded as an Unsigned Integer (see
<specref ref="encodingUnsignedInteger"/>) followed by the compact
identifier of the string value. The compact identifier of the string
value is encoded as an <emph>n</emph>-bit Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>), where
<emph>n</emph> is &lceil; log<sub>2</sub> <emph>m</emph> &rceil; and <emph>m</emph> is
the number of entries in the string table partition at the time of the operation.</p>

<p>When a string value is not found in the partitions containing
local-names, its
string literal is encoded as a String (see <specref
ref="encodingString"/>) with the length of the string incremented
by one. After the string value is encoded, it is added to the string
table partition and assigned the next available compact
identifier <emph>m</emph>.</p>
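<p>The local-name rules above can be sketched as follows. This is a non-normative illustration: the class name <code>LocalNamePartition</code>, the tuple layout and the sample entries are hypothetical.</p>
<example>
<head>Illustrative sketch (non-normative): a local-name partition optimized for string literals</head>
<eg xml:space="preserve">
```python
class LocalNamePartition:
    """Sketch of a partition optimized for string literals."""

    def __init__(self, initial=()):
        self.entries = list(initial)

    def encode(self, name):
        m = len(self.entries)
        if name in self.entries:
            n = (m - 1).bit_length()     # ceil(log2(m))
            # Hit: Unsigned Integer 0, then the identifier in n bits.
            return "hit", 0, self.entries.index(name), n
        self.entries.append(name)        # miss: the new identifier is m
        # Miss: String whose length field is incremented by one.
        return "miss", len(name) + 1, name


names = LocalNamePartition(["a", "b", "c"])   # hypothetical entries
print(names.encode("c"))
print(names.encode("note"))
print(names.encode("note"))
```
</eg>
</example>
<p>Because the length field of a literal is incremented by one, it is never zero, so a leading zero unambiguously signals a compact identifier.</p>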

<p>As described above, <termref def="key-valueContentItem"><emph>value</emph></termref> content items are assigned
to two partitions, a "local" value partition and the global
value partition. When a string value is found in the "local" value partition,
the string value is represented as zero (0) encoded as an Unsigned Integer (see
<specref ref="encodingUnsignedInteger"/>) followed by the compact identifier
of the string value in the "local" value partition.
When a string value is found in the global value partition, but not in the "local" value
partition, the string value is represented as one (1) encoded as an
Unsigned Integer (see <specref ref="encodingUnsignedInteger"/>) followed by the compact
identifier of the string value in the global value
partition. In both cases, the compact identifier is encoded as an <emph>n</emph>-bit
Unsigned Integer (<specref ref="encodingBoundedUnsigned"/>), where <emph>n</emph> is &lceil; log<sub>2</sub> <emph>m</emph> &rceil; and <emph>m</emph> is the number of entries in the
associated partition at the time of the operation.</p>

<p>When a string value is found in neither the global nor the "local"
<emph>value</emph> partition, its string literal is encoded as a
String (see <specref ref="encodingString"/>) with the length
<emph>L</emph> + 2 (incremented by two), where <emph>L</emph> is the number of characters in the string value.
After the string value is encoded, it is added to
both the associated "local" value string table partition and the global value
string table partition,
provided that <emph>L</emph> is equal to or smaller than the value of the <termref def="key-valueMaxLengthOption">valueMaxLength option</termref> and the value of <termref def="key-valueAmount"><emph>valueAmount</emph></termref> is smaller than the capacity specified by the
<termref def="key-valuePartitionCapacityOption">valuePartitionCapacity option</termref>.
When the string value is added to the value partitions, the value of <termref def="key-valueAmount"><emph>valueAmount</emph></termref> is incremented by 1.
</p>
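<p>The lookup order and the two admission criteria for value content items can be sketched as follows. This is a non-normative illustration: the class name <code>ValuePartitions</code>, its attribute names and the tuple layout are hypothetical, and both options are modelled as plain integers.</p>
<example>
<head>Illustrative sketch (non-normative): local and global value partitions with valueMaxLength and valuePartitionCapacity</head>
<eg xml:space="preserve">
```python
class ValuePartitions:
    """Sketch of the global and per-qname "local" value partitions."""

    def __init__(self, value_max_length, value_partition_capacity):
        self.global_entries = []
        self.local_entries = {}          # qname mapped to a list of values
        self.value_max_length = value_max_length
        self.capacity = value_partition_capacity
        self.value_amount = 0            # shared by all value partitions

    def encode(self, qname, value):
        local = self.local_entries.setdefault(qname, [])
        if value in local:
            n = (len(local) - 1).bit_length()
            return "local-hit", 0, local.index(value), n
        if value in self.global_entries:
            n = (len(self.global_entries) - 1).bit_length()
            return "global-hit", 1, self.global_entries.index(value), n
        length = len(value)              # L
        # Added only if L fits valueMaxLength and capacity is not exhausted.
        if self.value_max_length >= length and self.capacity > self.value_amount:
            local.append(value)
            self.global_entries.append(value)
            self.value_amount += 1
        return "miss", length + 2, value


table = ValuePartitions(value_max_length=8, value_partition_capacity=2)
print(table.encode("a", "x"))    # miss, added to both partitions
print(table.encode("a", "x"))    # found in the local partition
print(table.encode("b", "x"))    # found only in the global partition
```
</eg>
</example>
<p>Once <code>value_amount</code> reaches the capacity, later misses are still encoded as literals but no longer change the partitions.</p>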
</div3>
<ednote>
<edtext>
String values representing <termref def="key-valueContentItem"><emph>value</emph></termref> content items are never added to the string table once 
<termref def="key-valueAmount"><emph>valueAmount</emph></termref> reaches 
<termref def="key-valuePartitionCapacityOption">valuePartitionCapacity</termref>.
The working group is still looking at other alternatives to cap the amount of memory used for value partitions that can result in more compact representation of string values overall, including those that involve reassignment of compact identifiers using some sort of round-robin selection method, and the expected effect on processing efficiency of each alternative.

</edtext>
</ednote>

</div2>

<div2 id="datatypeRepresentationMap">
<head>Datatype Representation Map</head>
<p>By default, each typed value in an EXI stream is represented by the
associated built-in EXI datatype representation (e.g., see <specref
ref="builtInEXITypes"/>). However, <termdef id="key-datatypeRepresentationMaps"
term="Datatype Representation Map"><termref def="key-exiprocessor">EXI processors</termref> MAY provide the capability to specify different built-in EXI datatype representations or 
user-defined datatype representations for representing specific schema datatypes. 
This capability is called 
<term>
Datatype Representation Map

</term></termdef>.
</p>

<p>
EXI processors that support Datatype Representation Map MAY provide
external means to define and install user-defined datatype representations;
the mechanisms for doing so are implementation dependent. EXI
processors MAY also provide means for applications or users to specify
alternate built-in EXI datatype representations or
user-defined datatype representations for representing
specific schema datatypes; these mechanisms are likewise
implementation dependent.
</p>
<p>When an EXI processor encodes an EXI stream using
Datatype Representation Map, it MUST specify
in the EXI header each schema datatype that is not represented using the
default built-in EXI datatype representation, together with the alternate built-in EXI datatype representation or user-defined datatype representation used for it, unless the whole <termref def="key-options">EXI Options</termref> part of the header is omitted.
An EXI processor that attempts to decode an
EXI stream that specifies a user-defined datatype representation in the EXI header that
it does not recognize MAY report a warning, but this is not an
error. However, when an EXI processor encounters a typed value that
was encoded by a user-defined datatype representation that it does not support, it MUST
report an error.</p>
<p>The EXI options header, when it appears in an EXI stream, MUST include a
"datatypeRepresentationMap" element for each
schema datatype that is not represented using the default
built-in EXI datatype representation.
The "datatypeRepresentationMap" element includes two child elements. The <termref def="key-qname">qname</termref> of
the first child element identifies the schema datatype that is not
represented using the default built-in EXI datatype representation,
and the <termref def="key-qname">qname</termref> of the
second child element identifies the alternate
built-in EXI datatype representation or user-defined
datatype representation used to represent that type.
Built-in EXI datatype representations
are identified by the type identifiers in <specref
ref="builtInEXITypes"/>. </p>

<p>For example, the following
"datatypeRepresentationMap" element indicates all values of
type xsd:decimal in the EXI stream are represented using the built-in
String datatype representation, which has the type ID xsd:string: </p>

<example>
<head>datatypeRepresentationMap indicating all Decimal values are represented using
built-in String datatype representation</head>
<eg xml:space="preserve">
    &lt;datatypeRepresentationMap xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;
        &lt;xsd:decimal/&gt;
        &lt;xsd:string/&gt;
    &lt;/datatypeRepresentationMap&gt;
</eg>
</example>
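<p>As a non-normative illustration, a processor could read such an element into a lookup table as sketched below, using Python's standard <code>xml.etree.ElementTree</code> module. The function name <code>read_map</code> is hypothetical, the element is built programmatically to mirror the example above, and the sketch ignores any namespace the "datatypeRepresentationMap" element itself may carry.</p>
<example>
<head>Illustrative sketch (non-normative): reading a datatypeRepresentationMap element</head>
<eg xml:space="preserve">
```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"


def read_map(element):
    """Return (schema datatype qname, representation qname) from the two children."""
    schema_type, representation = list(element)
    return schema_type.tag, representation.tag


# Build the element from the example above programmatically.
entry = ET.Element("datatypeRepresentationMap")
ET.SubElement(entry, XSD + "decimal")
ET.SubElement(entry, XSD + "string")

# One dictionary entry per datatypeRepresentationMap element.
overrides = dict([read_map(entry)])
print(overrides[XSD + "decimal"])
```
</eg>
</example>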

<p>It is the responsibility of an EXI processor to interface properly with a particular implementation of
built-in EXI datatype representations or user-defined
datatype representations. In the example above, an EXI processor may need to provide a string value for data typed as xsd:decimal in order to interface with
an implementation of the built-in String datatype representation.
Some EXI processors may have started with a decimal value and would translate it into a string before passing the data to
the implementation of the built-in String datatype representation,
while other EXI processors may already hold the data as a string and can pass it directly to
that implementation without any translation.
</p>

<p>As another example, the following
"datatypeRepresentationMap" element indicates all
values of the user-defined datatype geo:geometricSurface are represented
using the user-defined datatype representation
geo:geometricInterpolator: </p>

<example>
<head>datatypeRepresentationMap illustrating a user-defined type represented by a user-defined datatype representation</head>
<eg xml:space="preserve">
    &lt;datatypeRepresentationMap xmlns:geo="http://www.example.com/Geometry"&gt;
        &lt;geo:geometricSurface/&gt;
        &lt;geo:geometricInterpolator/&gt;
    &lt;/datatypeRepresentationMap&gt;
</eg>
</example>

<note>
EXI only defines a way to indicate the use of user-defined
datatype representations
for representing values of specific datatypes.
Datatype representations, which are assigned to datatypes by <termref def="key-qname">qnames</termref>, are universally available only if the <termref def="key-qname">qname</termref> identifies one of the built-in EXI datatype representations. For
datatype representations
identified by other <termref def="key-qname">qnames</termref>, EXI neither provides nor suggests a method by which they are identified and shared between EXI processors. Their use should therefore be limited, after weighing the alternatives and the consequences of each, in order to avoid an uncontrolled proliferation of documents that use custom
datatype representations.
Applications that find
Datatype Representation Map
useful should make sure that they exchange such documents only among parties that are known in advance, or discovered, to be able to process the user-defined
datatype representations
that are in use. If it is not certain that a recipient understands the particular user-defined
datatype representations,
the sender should never attempt to send documents that use them
to that recipient.
</note>

</div2>

</div1>

<div1 id="grammars">
<head>EXI Grammars</head>
<p>EXI is a knowledge-based encoding that uses a set of grammars to
determine which events are most likely to occur at any given point in
an EXI stream and encodes the most likely alternatives in fewer
bits. It does this by mapping the stream of events to a lower-entropy
set of representative values and encoding those values using a set of
simple variable-length codes or an EXI compression algorithm. </p>
<p>The result is a very simple, small algorithm that uniformly handles
schema-less encoding, schema-informed encoding, schema deviations,
and any combination thereof in EXI streams. These variations do
not require different algorithms or different parsers; they are simply
informed by different combinations of grammars. </p>
<p>The following sections describe the grammars used to inform the EXI encoding. </p>
<!-- note>The grammars in this specification are intentionally permissive. They accept all valid documents, but also accept several invalid documents. </note -->
<note>The grammar semantics in this specification are written for clarity and generality. They do not prescribe a particular implementation approach. </note>
<div2 id="grammarNotation">
<head>Grammar Notation</head>
<!-- 
<p>In this specification, all terminal symbols are represented in plain text and all non- terminal symbols are represented in 
<emph>italics</emph>. Grammar productions are represented as follows: </p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>LeftHandSide</emph> : 
<emph>RightHandSide</emph></td></tr></tbody></table>
<p>A set of one or more grammar productions that share the same left-hand-side non- terminal symbol may be represented as follows: </p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>LeftHandSide</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>RightHandSide 
<sub>1</sub></emph> 
<emph>RightHandSize 
<sub>2</sub></emph></td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RightHandSide 
<sub>3</sub></emph></td></tr>
<tr>
<td></td>
<td></td>
<td>...</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RightHandSide 
<sub>n</sub></emph></td></tr></tbody></table -->

<div3 id="fixedEventCodes">
<head>Fixed Event Codes</head>
<p>Each grammar production has an <termref def="key-eventcode">event code</termref>, which is represented by a sequence of one to three parts separated by periods (&quot;.&quot;). Each part is an unsigned integer. The following are examples of grammar productions with event codes as they appear in this specification. </p>
<example>
<head>Example productions with fixed event codes</head>

<table width="95%">
<thead>
<tr>
<th colspan="3" align="left">Productions</th>
<th align="left">Event Codes</th></tr>
</thead>
<tbody>
<tr>
<td>&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="4">
<emph>LeftHandSide <sub>1</sub></emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>3</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>3</sub></emph></td>
<td>2.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>4</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>4</sub></emph></td>
<td>2.1</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>5</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>5</sub></emph></td>
<td>2.2.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>6</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>6</sub></emph></td>
<td>2.2.1</td></tr>
<tr>
<td colspan="5">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="4">
<emph>LeftHandSide <sub>2</sub></emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>3</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>3</sub></emph></td>
<td>1.1</td></tr></tbody></table>
</example>
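<p>As a non-normative illustration, the dotted notation used in the productions above can be modelled as tuples of unsigned integers. The function names <code>parse_event_code</code> and <code>codes_are_unique</code> are hypothetical; the sketch only checks the notation and does not show how event codes are written to an EXI stream.</p>
<example>
<head>Illustrative sketch (non-normative): parsing event codes and checking their uniqueness</head>
<eg xml:space="preserve">
```python
def parse_event_code(text):
    """Split a dotted event code such as '2.2.1' into a tuple of integers."""
    parts = tuple(int(part) for part in text.split("."))
    if len(parts) > 3:
        raise ValueError("an event code has one to three parts")
    return parts


def codes_are_unique(codes):
    """Check that no two productions of one non-terminal share an event code."""
    parsed = [parse_event_code(code) for code in codes]
    return len(set(parsed)) == len(parsed)


left_hand_side_1 = ["0", "1", "2.0", "2.1", "2.2.0", "2.2.1"]
print(len(parse_event_code("2.2.1")), codes_are_unique(left_hand_side_1))
```
</eg>
</example>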
<p>The number of parts in a given event code is called the event code's length. No two productions with the same non-terminal symbol on the left-hand-side are permitted to have the same event code. </p></div3>
<div3 id="variableEventCodes">
<head>Variable Event Codes</head>
<p>Some non-terminal symbols are used on the right-hand-side in a production without an event prefixed to them. Such non-terminal symbols are macros. Each macro names a recurring set of productions so that the grammar representation can use a single symbol instead of repeating all the productions the macro represents wherever they are needed.
</p>

<example>
<head>Example productions that use macro non-terminal symbols</head>
<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>ABigProduction <sub>1</sub></emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>LEFTHANDSIDE <sub>1</sub></emph> (2.0)</td>
<td>2.0</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>ABigProduction <sub>2</sub></emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>LEFTHANDSIDE <sub>1</sub></emph> (1.1)</td>
<td>1.1</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1.2</td></tr>
</tbody></table>

</example>

<p>
Because non-terminal macros are injected into the right-hand-side of more than one production,
the event codes of productions with these macro non-terminals on the left-hand-side are not fixed, but will have different event code values depending on the context in which the macro non-terminal appears. This specification calls these variable event codes and uses variables in place of individual event code parts to indicate the event code parts are determined by the context. Below are some examples of variable event codes: </p>
<example>
<head>Example non-terminal macro and its productions with variable event codes</head>

<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="4">
<emph>LEFTHANDSIDE <sub>1</sub> (n.m)</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">
EVENT <sub>1</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>1</sub></emph></td>
<td>
<emph>n</emph>.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>2</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>2</sub></emph></td>
<td>
<emph>n</emph>.1</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>3</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>3</sub></emph></td>
<td>
<emph>n</emph>. 
<emph>m</emph>+2</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>4</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>4</sub></emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+3)</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>5</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>5</sub></emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+4).0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>6</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>6</sub></emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+4).1</td></tr></tbody></table>
</example>
<p>Unless otherwise specified, the variable 
<emph>n</emph> evaluates to the first part of the event code of the production in which the macro non-terminal 
<emph>LEFTHANDSIDE 
<sub>1</sub></emph> appears on the right-hand-side. Similarly, the expression 
<emph>n</emph>.<emph>m</emph> represents the first two parts of the event code of the production in which the macro non-terminal 
<emph>LEFTHANDSIDE 
<sub>1</sub></emph> appears on the right-hand-side. For instance, where the macro non-terminal appears with the event code 2.0, <emph>n</emph> is 2 and <emph>m</emph> is 0. </p>
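<p>As an illustrative sketch only, the evaluation of the variable event codes of <emph>LEFTHANDSIDE <sub>1</sub> (n.m)</emph> can be expressed in Python as follows. The names <code>MACRO</code>, <code>evaluate</code> and <code>show</code> are hypothetical and are not defined by this specification; each variable event code from the example table above is modelled as a function of <emph>n</emph> and <emph>m</emph>.</p>
<eg>
```python
# Variable event codes of the macro LEFTHANDSIDE_1 (n.m), transcribed from the
# example table above. Each entry pairs a right-hand side with a function that
# yields the concrete event code once n and m are known.
MACRO = [
    ("EVENT1 NONTERMINAL1", lambda n, m: (n, 0)),
    ("EVENT2 NONTERMINAL2", lambda n, m: (n, 1)),
    ("EVENT3 NONTERMINAL3", lambda n, m: (n, m + 2)),
    ("EVENT4 NONTERMINAL4", lambda n, m: (n, m + 3)),
    ("EVENT5 NONTERMINAL5", lambda n, m: (n, m + 4, 0)),
    ("EVENT6 NONTERMINAL6", lambda n, m: (n, m + 4, 1)),
]

def evaluate(n, m):
    """Return (right-hand side, event code) pairs for the context (n.m)."""
    return [(rhs, code(n, m)) for rhs, code in MACRO]

def show(code):
    """Format an event code tuple such as (2, 4, 0) as '2.4.0'."""
    return ".".join(str(part) for part in code)

# The two contexts used by ABigProduction_1 and ABigProduction_2.
codes_2_0 = [show(code) for _, code in evaluate(2, 0)]
codes_1_1 = [show(code) for _, code in evaluate(1, 1)]
print(codes_2_0)
print(codes_1_1)
```
</eg>
<p>For the context (2.0) the sketch yields 2.0, 2.1, 2.2, 2.3, 2.4.0 and 2.4.1; for the context (1.1) it yields 1.0, 1.1, 1.3, 1.4, 1.5.0 and 1.5.1.</p>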

<p>Non-terminal macros are used in this specification for notational convenience only.
They are not non-terminals, even though they are used in place of non-terminals.
Productions that use non-terminal macros on the right-hand-side need to be expanded by macro substitution before such productions are interpreted.
Therefore, <emph>ABigProduction <sub>1</sub></emph> and <emph>ABigProduction <sub>2</sub></emph> shown in the preceding example are equivalent to the following set of productions derived by expanding the non-terminal macro symbol <emph>LEFTHANDSIDE 
<sub>1</sub></emph> and evaluating the variable event codes.
</p>
<example>
<head>Expanded productions equivalent to the productions used above</head>

<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="4">
<emph>ABigProduction <sub>1</sub></emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>1</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>1</sub></emph></td>
<td>2.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>2</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>2</sub></emph></td>
<td>2.1</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>3</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>3</sub></emph></td>
<td>2.2</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>4</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>4</sub></emph></td>
<td>2.3</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>5</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>5</sub></emph></td>
<td>2.4.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>6</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>6</sub></emph></td>
<td>2.4.1</td></tr>
<tr>
<td colspan="5">&nbsp;</td></tr>


<tr>
<td width="5%"></td>
<td colspan="4">
<emph>ABigProduction <sub>2</sub></emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>1</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>1</sub></emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td width="75%">
EVENT <sub>1</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>1</sub></emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>2</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>2</sub></emph></td>
<td>1.1</td></tr>
<tr>
<td></td>
<td></td>
<td>
Event <sub>2</sub>&nbsp;&nbsp;
<emph>NonTerminal <sub>2</sub></emph></td>
<td>1.2</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>3</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>3</sub></emph></td>
<td>1.3</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>4</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>4</sub></emph></td>
<td>1.4</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>5</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>5</sub></emph></td>
<td>1.5.0</td></tr>
<tr>
<td></td>
<td></td>
<td>
EVENT <sub>6</sub>&nbsp;&nbsp;<emph>NONTERMINAL 
<sub>6</sub></emph></td>
<td>1.5.1</td></tr>

</tbody></table>

</example></div3>
<!-- div3 id="productionBag">
<head>Production Bag</head>
<p>Some non-terminal symbols are used on the right-hand-side in a production with a pseudo event code of the form of a range (0 ... <emph>n</emph>). Such a non-terminal symbol represents a bag of productions where variable <emph>n</emph> used in the pseudo event code denotes the number of productions in the bag, and is used without an event prefixed to them.
</p>

<example>
<head>Example use of a production bag</head>
<table width="95%">
<thead>
<tr>
<th align="left" colspan="3">&nbsp;</th>
<th align="left">Event Code</th></tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>ABigProduction</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%"><emph>ProductionBag</emph></td>
<td>0 ... (<emph>n</emph>-1)</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>LEFTHANDSIDE (n)</emph></td>
<td><emph>n</emph></td></tr>

<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>LEFTHANDSIDE (n)</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>1</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>1</sub></emph></td>
<td><emph>n</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>2</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>2</sub></emph></td>
<td><emph>n</emph>+1</td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>3</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>3</sub></emph></td>
<td>(<emph>n</emph>+2).0</td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>4</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>4</sub></emph></td>
<td>(<emph>n</emph>+2).1</td></tr>
</tbody></table>
</example>

<p>The content of a production bag can be either static or dynamic. A static bag contains a fixed set of productions throughout its life cycle whereas a dynamic bag grows while processing an EXI stream. Production bags are used in this specification for notational convenience only. They are not non-terminals, even though they are used in place of non-terminals. Productions that use production bags on the right-hand-side need to be expanded by substituting the bags with their content before such productions are interpreted.
</p>

<p>The grammar <emph>ABigProduction</emph> shown in the preceding example is equivalent to the following set of productions when the production bag <emph>ProductionBag</emph> contains productions that have <emph>RightHandSide <sub>1</sub></emph>, <emph>RightHandSide <sub>2</sub></emph> and <emph>RightHandSide <sub>3</sub></emph> as the right-hand-side with respective event code 0, 1 and 2.
</p>

<example>
<head>Expanded productions equivalent to the productions used above</head>
<table width="95%">
<thead>
<tr>
<th align="left" colspan="3">&nbsp;</th>
<th align="left">Event Code</th></tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>ABigProduction</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%"><emph>RightHandSide <sub>1</sub></emph></td>
<td><emph>0</emph></td></tr>
<tr>
<td></td>
<td></td>
<td><emph>RightHandSide <sub>2</sub></emph></td>
<td><emph>1</emph></td></tr>
<tr>
<td></td>
<td></td>
<td><emph>RightHandSide <sub>3</sub></emph></td>
<td><emph>2</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>1</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>1</sub></emph></td>
<td><emph>3</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>2</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>2</sub></emph></td>
<td>4</td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>3</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>3</sub></emph></td>
<td>5.0</td></tr>
<tr>
<td></td>
<td></td>
<td>EVENT <sub>4</sub>&nbsp;&nbsp;<emph>NONTERMINAL <sub>4</sub></emph></td>
<td>5.1</td></tr>
</tbody></table>
</example>

</div3 -->
</div2>
<div2 id="grammarEventCodes">
<head>Grammar Event Codes</head>
<p>Each production rule in the EXI grammar includes an event code value that approximates the likelihood that the associated production rule will be matched over the other productions with the same left-hand-side non-terminal symbol. Ultimately, the event codes determine the value(s) by which each non-terminal symbol will be represented in the EXI stream. </p>
<p>To understand how a given event code approximates the likelihood that a given production will be matched, it is useful to visualize, as a tree, the event codes for a set of production rules that have the same non-terminal symbol on the left-hand-side. For example, the following set of productions: </p>
<example>
<head>Example productions with event codes</head>

<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="4">
<emph>ElementContent</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">EE</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>ElementContent</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>CH 
<emph>ElementContent</emph></td>
<td>1.1</td></tr>
<tr>
<td></td>
<td></td>
<td>ER 
<emph>ElementContent</emph></td>
<td>1.2</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>ElementContent</emph></td>
<td>1.3.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>ElementContent</emph></td>
<td>1.3.1</td></tr></tbody></table></example>
<p>represents a set of information items that might occur as element content after the start tag. Using the production event codes, we can visualize this set of productions as follows: </p>
<graphic source="eventCodeTree.png" alt="Event code tree for ElementContent grammar"/>
<p>where the terminal symbols (events) are represented by the leaf nodes of the tree, and the event code of each production rule defines a path from the root of the tree to the leaf node that represents the terminal symbol on the right-hand-side of that production. We call this the event code tree for a given set of productions. </p>
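<p>The construction of an event code tree can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the nested-dictionary representation and the names <code>build_tree</code> and <code>lookup</code> are assumptions made for this sketch, and each leaf is labelled with the event of the corresponding production.</p>
<eg>
```python
# Productions of ElementContent from the example above: (event, event code).
PRODUCTIONS = [
    ("EE", (0,)),
    ("SE (*)", (1, 0)),
    ("CH", (1, 1)),
    ("ER", (1, 2)),
    ("CM", (1, 3, 0)),
    ("PI", (1, 3, 1)),
]

def build_tree(productions):
    """Build a nested dict in which each event code part selects a child."""
    root = {}
    for event, code in productions:
        node = root
        for part in code[:-1]:
            node = node.setdefault(part, {})
        node[code[-1]] = event  # the last part leads to a leaf
    return root

def lookup(tree, code):
    """Follow an event code from the root of the tree to its leaf."""
    node = tree
    for part in code:
        node = node[part]
    return node

tree = build_tree(PRODUCTIONS)
print(tree)
print(lookup(tree, (1, 3, 0)))
```
</eg>
<p>In this sketch the event code 1.3.0 selects child 1 of the root, then child 3, then child 0, arriving at the CM leaf.</p>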
<p>An event code tree is similar to a Huffman tree <bibref ref="huffman"/> in that shorter paths are generally used for symbols that are considered more likely. However, event code trees are far simpler and less costly to compute and maintain. Event code trees are shallow and contain at most three levels. In addition, the length of each event code in the event code tree is assigned statically without analyzing the data. This classification provides some of the benefits of a Huffman tree without the cost. </p></div2>
<div2 id="pruningProductions">
<head>Pruning Unneeded Productions</head>
<p>As discussed in section 
<specref ref="fidelityOptions"/>, applications MAY provide a set of fidelity options to specify the XML features they require. EXI processors MUST use these fidelity options to prune the events that are not required from the grammars, improving compactness and processing efficiency.</p>
<p>For example, the following set of productions represents the set of information items that might occur as element content after the start tag.</p>
<example>
<head>Example productions with full fidelity</head>

<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>ElementContent</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">EE</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>ElementContent</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>CH 
<emph>ElementContent</emph></td>
<td>1.1</td></tr>
<tr>
<td></td>
<td></td>
<td>ER 
<emph>ElementContent</emph></td>
<td>1.2</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>ElementContent</emph></td>
<td>1.3.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>ElementContent</emph></td>
<td>1.3.1</td></tr></tbody></table>
</example>
<p>If an application sets the fidelity options preserve.comments, preserve.pis and preserve.dtd to false, the productions matching comment (CM), processing instruction (PI) and entity reference (ER) events are pruned from the grammar, producing the following set of productions: </p>
<example>
<head>Example productions after pruning</head>

<table width="95%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="4">
<emph>ElementContent</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">EE</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>ElementContent</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>CH 
<emph>ElementContent</emph></td>
<td>1.1</td></tr></tbody></table>
</example>
<p>Removing these productions from the grammar tells EXI processors that comments, processing instructions and entity references will never occur in the EXI stream, which reduces the entropy of the stream and allows it to be encoded in fewer bits. </p>
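<p>Pruning, together with a renumbering step that keeps the remaining event codes contiguous, can be sketched in Python as follows. The function names are hypothetical, and the renumbering strategy shown (closing gaps among sibling values while preserving the number of parts in each event code) is only one assumed reading of the contiguity requirement, not a normative algorithm.</p>
<eg>
```python
# ElementContent productions with full fidelity, as (event, event code) pairs.
FULL = [
    ("EE", (0,)),
    ("SE (*)", (1, 0)),
    ("CH", (1, 1)),
    ("ER", (1, 2)),
    ("CM", (1, 3, 0)),
    ("PI", (1, 3, 1)),
]

def prune(productions, unwanted):
    """Drop the productions whose event type is listed in unwanted."""
    return [(event, code) for event, code in productions if event not in unwanted]

def make_contiguous(productions):
    """Renumber every event code part so that sibling values have no gaps."""
    codes = [code for _, code in productions]
    result = []
    for event, code in productions:
        new_code = []
        for depth, part in enumerate(code):
            prefix = code[:depth]
            # Values used at this depth by productions sharing the same prefix.
            siblings = sorted({c[depth] for c in codes
                               if len(c) > depth and c[:depth] == prefix})
            new_code.append(siblings.index(part))
        result.append((event, tuple(new_code)))
    return result

# All three fidelity options off, as in the example above.
pruned_all = make_contiguous(prune(FULL, {"ER", "CM", "PI"}))
# Only entity references pruned: CM and PI move from 1.3.x to 1.2.x.
pruned_er = make_contiguous(prune(FULL, {"ER"}))
print(pruned_all)
print(pruned_er)
```
</eg>
<p>With all three event types pruned, the remaining event codes 0, 1.0 and 1.1 are already contiguous and are left unchanged by the sketch.</p>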
<p>Each time a production is removed from a grammar, the event codes of the remaining productions with the same non-terminal symbol on the left-hand-side MUST be adjusted to restore contiguity if the removal has left them with non-contiguous event codes.</p></div2>
<div2 id="builtinGrammars">
<head>Built-in XML Grammars</head>
<p>This section describes the built-in XML grammar used by EXI when no additional information is available to describe the contents of the EXI stream. The built-in XML grammar is used when no schema exists, 
and for schema extensions and deviations that are not declared by the schema. </p>
<p>A built-in XML grammar is self-evolving: while an EXI stream is processed, the grammar continuously incorporates the knowledge learned from the stream, refining itself for subsequent uses of the grammar within the processing of that same stream.</p>
<div3 id="builtinDocGrammars">
<head>Built-in Document Grammar</head>
<p>In the absence of additional information about the content of the EXI stream, the following grammar describes the events that will occur in an <termref def="key-exidocument">EXI document</termref>. </p>
<table width="100%">
<tbody>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Document</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="60%">SD 
<emph>DocContent</emph></td>
<td width="30%">0</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>DocContent</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>DocEnd</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>DT 
<emph>DocContent</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>DocContent</emph></td>
<td>1.1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>DocContent</emph></td>
<td>1.1.1</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>DocEnd</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>ED</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>DocEnd</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>DocEnd</emph></td>
<td>1.1</td></tr></tbody></table>
<p></p>
<table>
<tbody>
<tr>
<th align="left">Semantics:</th></tr></tbody></table>
<p>All productions in the built-in Document grammars of the form 
<emph>LeftHandSide</emph> : SE (*) <emph>RightHandSide</emph>
are evaluated as follows: </p>
<olist>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE (*) </item>
<item>If a grammar does not exist for element 
<emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar">Built-in Element Grammar</termref></item>
<item>Evaluate the element contents using a built-in grammar for element <emph>qname</emph></item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
</div3>
<div3 id="builtinFragGrammars">
<head>Built-in Fragment Grammar</head>
<p>In the absence of additional information about the contents of an EXI stream, the following grammar describes the events that will occur in an <termref def="key-exifragment">EXI fragment</termref>. The grammar shown below represents the initial set of productions that belong to a built-in fragment grammar at the start of stream processing. The semantic description that follows gives the rules by which the built-in fragment grammar evolves, so that it is better prepared for subsequent uses of the same grammar during the rest of the processing of the stream.</p>

<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr>
</thead>
<tbody>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Fragment</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="60%">SD 
<emph>FragmentContent</emph></td>
<td width="30%">0</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>FragmentContent</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>FragmentContent</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>ED</td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>FragmentContent</emph></td>
<td>2.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>FragmentContent</emph></td>
<td>2.1</td></tr>
</tbody></table>
<p></p>
<table>
<tbody>
<tr>
<th align="left">Semantics:</th></tr></tbody></table>
<p>All productions in the built-in Fragment grammars of the form 
<emph>LeftHandSide</emph> : SE (*) <emph>RightHandSide</emph>
are evaluated as follows: </p>
<olist>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE (*) </item>
<item>If a grammar does not exist for element 
<emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar">Built-in Element Grammar</termref></item>
<item>Evaluate the element contents using a built-in grammar for element <emph>qname</emph></item>
<item>Create a production of the form <emph>LeftHandSide</emph> : SE (<emph>qname</emph>) <emph>RightHandSide</emph> with an event code 0</item>
<item>Increment the first part of the event code of each production in the current grammar with the non-terminal <emph>LeftHandSide</emph> on the left hand side.</item>
<item>Add the production created in step 4 to the grammar</item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>

<p>All productions of the form <emph>LeftHandSide</emph> : SE (<emph>qname</emph>) <emph>RightHandSide</emph> that were previously added to the grammar upon the first occurrence of the element that has the <termref def="key-qname">qname</termref> <emph>qname</emph> are evaluated as follows when they are matched: </p>
<olist>
<item>Evaluate the element contents using a built-in grammar for element <emph>qname</emph></item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
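<p>The way the <emph>FragmentContent</emph> productions evolve can be sketched in Python as follows. This is a simplified model under stated assumptions: the class <code>FragmentContent</code> and its method are hypothetical, a qname is reduced to a plain string, and the evaluation of element contents and of the remainder of the event sequence is omitted.</p>
<eg>
```python
class FragmentContent:
    """Toy model of the evolving FragmentContent productions (illustrative)."""

    def __init__(self):
        # Initial productions as (terminal, event code), from the table above.
        self.productions = [
            ("SE (*)", (0,)),
            ("ED", (1,)),
            ("CM", (2, 0)),
            ("PI", (2, 1)),
        ]

    def start_element(self, qname):
        """Return the event code used for this element, learning it if new."""
        specific = "SE (" + qname + ")"
        for terminal, code in self.productions:
            if terminal == specific:
                return code  # already learned: matched directly
        # First occurrence: the element is matched by SE (*).
        used = dict(self.productions)["SE (*)"]
        # Steps 4 to 6: create SE (qname) with event code 0, increment the
        # first part of every existing event code, then add the new production.
        self.productions = [(t, (c[0] + 1,) + c[1:]) for t, c in self.productions]
        self.productions.insert(0, (specific, (0,)))
        return used

grammar = FragmentContent()
first = grammar.start_element("a")   # matched by SE (*)
second = grammar.start_element("a")  # matched by the learned SE (a)
third = grammar.start_element("b")   # SE (*) again, now with a shifted code
print(first, second, third)
print(grammar.productions)
```
</eg>
<p>In this sketch the first occurrence of an element is represented with the event code of SE (*), while later occurrences of the same qname use the learned production.</p>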

</div3>
<div3 id="builtinElemGrammars">
<head>Built-in Element Grammar</head>
<p><termdef id="key-builtinElementGrammar" term="Built-in Element Grammar">EXI defines a <term>built-in element grammar</term> that is used in the absence of additional information about the contents of an EXI element prior to its processing.</termdef> The built-in element grammar shown below is prescribed by EXI to reflect the events that will occur in an element, and the order among them, in general, without any further constraint on what is likely or unlikely to occur inside elements.</p>
<p>A single instance of the built-in element grammar is shared by all elements in a stream that have the same <termref def="key-qname">qname</termref> and have no additional a priori constraints on their content. A separate instance is assigned to each <termref def="key-qname">qname</termref> upon the first occurrence of an element with that <termref def="key-qname">qname</termref>; thereafter, the instance evolves continuously by reflecting the knowledge learned while processing the content of those elements. The grammar shown below represents the initial set of productions that belong to a built-in element grammar at the time a new instance is created. The semantic description that follows gives the rules by which an instance evolves, so that it is better prepared for subsequent uses of the same grammar instance during the rest of the processing of the stream.</p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr>
</thead>
<tbody>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>StartTagContent</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="60%">EE</td>
<td width="30%">0.0</td></tr>
<tr>
<td></td>
<td></td>
<td>AT (*) 
<emph>StartTagContent</emph></td>
<td>0.1</td></tr>
<tr>
<td></td>
<td></td>
<td>NS 
<emph>StartTagContent</emph></td>
<td>0.2</td></tr>
<tr>
<td></td>
<td></td>
<td>SC 
<emph>Fragment</emph></td>
<td>0.3</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>ChildContentItems</emph> (0.4)</td>
<td></td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>ElementContent</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>EE</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>ChildContentItems</emph> (1.0)</td>
<td></td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>ChildContentItems (n.m)</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) <emph>ElementContent</emph></td>
<td>
<emph>n</emph>. 
<emph>m</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>CH <emph>ElementContent</emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+1)</td></tr>
<tr>
<td></td>
<td></td>
<td>ER <emph>ElementContent</emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+2)</td></tr>
<tr>
<td></td>
<td></td>
<td>CM <emph>ElementContent</emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI <emph>ElementContent</emph></td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).1</td></tr>
</tbody></table>
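<p>As an informal illustration, the following Python sketch instantiates the macro <emph>ChildContentItems (n.m)</emph> for the two contexts in which it is used in the table above. The function name <code>child_content_items</code> and the tuple representation of event codes are assumptions made for this sketch.</p>
<eg>
```python
def child_content_items(n, m):
    """Instantiate the ChildContentItems (n.m) macro for a given context."""
    return [
        ("SE (*) ElementContent", (n, m)),
        ("CH ElementContent", (n, m + 1)),
        ("ER ElementContent", (n, m + 2)),
        ("CM ElementContent", (n, m + 3, 0)),
        ("PI ElementContent", (n, m + 3, 1)),
    ]

# StartTagContent uses ChildContentItems (0.4); ElementContent uses (1.0).
start_tag_content = [
    ("EE", (0, 0)),
    ("AT (*) StartTagContent", (0, 1)),
    ("NS StartTagContent", (0, 2)),
    ("SC Fragment", (0, 3)),
] + child_content_items(0, 4)

element_content = [("EE", (0,))] + child_content_items(1, 0)

for rhs, code in start_tag_content + element_content:
    print(".".join(map(str, code)), rhs)
```
</eg>
<p>Under these assumptions the macro contributes the event codes 0.4, 0.5, 0.6, 0.7.0 and 0.7.1 to <emph>StartTagContent</emph>, and 1.0, 1.1, 1.2, 1.3.0 and 1.3.1 to <emph>ElementContent</emph>.</p>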
<p></p>
<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT (xsi:type) event matching the AT(*) terminal is represented as a QName (see 
<specref ref="encodingQName"/>). If there is no namespace in scope for the specified <termref def="key-qname">qname</termref> prefix, the QName <emph>uri</emph> is set to empty ("") and the QName <emph>localName</emph> is set to the full lexical value of the QName, including the prefix. 

</item>
</ulist>
</td>
</tr>
</tbody></table>
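<p>The fallback described in the note above can be sketched in Python as follows. The function <code>resolve_type_qname</code> is hypothetical, as is the convention of storing the default namespace under the empty prefix; the treatment of unprefixed names is likewise an assumption of this sketch and is not stated by the note.</p>
<eg>
```python
def resolve_type_qname(lexical, in_scope):
    """Map a lexical QName to (uri, localName) using in-scope prefixes.

    in_scope maps prefixes to namespace URIs; the key "" stands for the
    default namespace (a convention chosen for this sketch).
    """
    prefix, sep, local = lexical.partition(":")
    if not sep:
        # Unprefixed name: use the default namespace, if any (assumption).
        return (in_scope.get("", ""), lexical)
    if prefix in in_scope:
        return (in_scope[prefix], local)
    # No namespace in scope for the prefix: uri is empty and localName keeps
    # the full lexical value, including the prefix.
    return ("", lexical)

resolved = resolve_type_qname("xsd:int", {"xsd": "http://www.w3.org/2001/XMLSchema"})
unresolved = resolve_type_qname("foo:bar", {})
print(resolved)
print(unresolved)
```
</eg>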

<table>
<tbody>
<tr>
<th align="left">Semantics:</th></tr>
<tr><td>&nbsp;</td></tr>
<tr><td>
<p>All productions in the built-in Element grammar of the form 
<emph>LeftHandSide</emph>: AT (*) 
<emph>RightHandSide</emph> are evaluated as follows: </p>
<olist>
<item>Let 
<emph>qname</emph> be the <termref def="key-qname">qname</termref> of the attribute matched by AT (*) </item>
<item>If <emph>qname</emph> is not xsi:type or xsi:nil, create a production of the form 
<emph>LeftHandSide</emph> : AT (<emph>qname</emph>) <emph>StartTagContent</emph>
with an event code 0 and increment the first part of the event code of each production in the current grammar with the non-terminal <emph>LeftHandSide</emph> on the left hand side. Add this production to the grammar.</item>
<item>
If <emph>qname</emph> is xsi:type, let <emph>type-qname</emph> be the value of the xsi:type attribute; if a grammar exists for the <emph>type-qname</emph> type, evaluate the remainder of the event sequence using the grammar for the <emph>type-qname</emph> type instead of <emph>RightHandSide</emph>. 
Otherwise, evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.

</item>
</olist>

<p>All productions of the form <emph>LeftHandSide</emph> : SC <emph>Fragment</emph> are evaluated as follows: </p>
<olist>
<item>
Save the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body.
</item>
<item>Initialize the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body to the state they held just prior to processing this EXI Body.
</item>
<item>Skip to the next byte-aligned boundary in the stream.
</item>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the SE event immediately preceding this SC event.</item>
<item>Let <emph>content</emph> be the sequence of events following this SC event that match the grammar for element <emph>qname</emph>, up to and including the terminating EE event.</item>
<item>Evaluate the sequence of events (SD, SE(<emph>qname</emph>), <emph>content</emph>, ED) according to the <emph>Fragment</emph> grammar.
</item>
<item>Restore the string table, grammars, namespace prefixes and implementation-specific state learned while processing this EXI Body to that saved in step 1 above.
</item>
</olist>
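<p>The save, initialize and restore steps above can be sketched in Python as follows. All names here (<code>ProcessorState</code>, <code>process_self_contained</code>, <code>evaluate_fragment</code>) are hypothetical; byte alignment and the evaluation against the <emph>Fragment</emph> grammar (steps 3 to 6) are delegated to a caller-supplied function and are not modelled.</p>
<eg>
```python
import copy

class ProcessorState:
    """Hypothetical container for the state named in steps 1, 2 and 7."""

    def __init__(self):
        self.string_table = []
        self.grammars = {}
        self.prefixes = {}

def process_self_contained(state, initial, evaluate_fragment):
    """Evaluate a self-contained element without leaking learned state.

    state is the live processor state, initial is the state held just prior
    to processing the EXI Body, and evaluate_fragment is a caller-supplied
    function that consumes the element using the Fragment grammar.
    """
    saved = copy.deepcopy(state.__dict__)             # step 1: save
    state.__dict__ = copy.deepcopy(initial.__dict__)  # step 2: initialize
    try:
        # Steps 3 to 6 (byte alignment and evaluation) are delegated.
        return evaluate_fragment(state)
    finally:
        state.__dict__ = saved                        # step 7: restore

initial = ProcessorState()
state = ProcessorState()
state.string_table.append("learned-before-SC")

def fake_fragment(s):
    """Stand-in evaluator: records what it sees, then learns a string."""
    seen = list(s.string_table)
    s.string_table.append("learned-inside-SC")
    return seen

seen_inside = process_self_contained(state, initial, fake_fragment)
print(seen_inside)
print(state.string_table)
```
</eg>
<p>In this sketch the self-contained element starts from the initial state, and nothing learned inside it survives once the saved state is restored.</p>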

<p>All productions in the built-in Element grammars of the form 
<emph>LeftHandSide</emph> : SE (*) <emph>RightHandSide</emph> are evaluated as follows: </p>
<olist>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE (*) </item>
<item>If a grammar does not exist for element 
<emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar">Built-in Element Grammar</termref></item>
<item>Evaluate the element contents using a built-in grammar for element <emph>qname</emph></item>
<item>Create a production of the form <emph>LeftHandSide</emph> : SE (<emph>qname</emph>) <emph>RightHandSide</emph> with an event code 0</item>
<item>Increment the first part of the event code of each production in the current grammar with the non-terminal <emph>LeftHandSide</emph> on the left hand side.</item>
<item>Add the production created in step 4 to the grammar</item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
<p>All productions of the form <emph>LeftHandSide</emph> : SE (<emph>qname</emph>) <emph>RightHandSide</emph> that were previously added to the grammar upon the first occurrence of the element that has the <termref def="key-qname">qname</termref> <emph>qname</emph> are evaluated as follows when they are matched: </p>
<olist>
<item>Evaluate the element contents using a built-in grammar for element <emph>qname</emph></item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
<p>All productions in the built-in Element grammar of the form 
<emph>LeftHandSide</emph> : CH 
<emph>RightHandSide</emph> are evaluated as follows: </p>
<olist>
<item>If a production of the form 
<emph>LeftHandSide</emph> : CH 
<emph>RightHandSide</emph> with an event code of length 1 does not exist in the current element grammar, create one with event code 0 and increment the first part of the event code of each production in the current grammar with the non-terminal 
<emph>LeftHandSide</emph> on the left hand side. </item>
<item>Add the production created in step 1 to the grammar
</item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
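<p>The CH rule above can be sketched in Python as follows. The function <code>learn_ch</code> is hypothetical; it operates on the productions of a single left-hand side, represented as (terminal, event code) pairs, and omits the evaluation of the remainder of the event sequence.</p>
<eg>
```python
def learn_ch(productions):
    """Apply the CH rule to the productions of one left-hand side.

    The list is returned unchanged when a CH production with a one-part
    event code already exists.
    """
    for terminal, code in productions:
        if terminal == "CH" and len(code) == 1:
            return productions
    # Increment the first part of every existing event code, then add the
    # new CH production with event code 0.
    shifted = [(t, (c[0] + 1,) + c[1:]) for t, c in productions]
    return [("CH", (0,))] + shifted

element_content = [
    ("EE", (0,)),
    ("SE (*)", (1, 0)),
    ("CH", (1, 1)),
    ("ER", (1, 2)),
    ("CM", (1, 3, 0)),
    ("PI", (1, 3, 1)),
]
once = learn_ch(element_content)
twice = learn_ch(once)
print(once)
print(twice == once)
```
</eg>
<p>After the first CH event the sketch adds a CH production with event code 0; a second CH event leaves the productions unchanged.</p>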
</td></tr>
</tbody></table>

</div3></div2>
<div2 id="informedGrammars">
<head>Schema-informed Grammars</head>
<p>This section describes the schema-informed grammars used by EXI when schema information is available to describe the contents of the EXI stream. Schema-informed grammars are independent of any particular schema language and can be derived from W3C XML Schemas, RELAX NG schemas, DTDs or other schema languages for describing what is likely to occur in an EXI stream. </p>
<p>Schema-informed grammars accept all XML documents and fragments regardless of whether and how closely they match the schema. The encoder encodes individual events using schema-informed grammars where they are available and falls back to the built-in XML grammars where they are not. In general, events for which a schema-informed grammar exists will be encoded more efficiently. </p>
<p>Unlike built-in XML grammars, schema-informed grammars are static and do not evolve, which permits the reuse of schema-informed grammars across the processing of multiple EXI streams. This is the single most notable difference between the two grammar systems. Despite this difference, their uses are not mutually exclusive; the two systems are connected at the level of individual grammars. In particular, built-in grammars that are called upon for the parts of a stream that deviate from the schema 
are still subject to dynamic grammar learning during the rest of the EXI stream processing, as described in <specref ref="builtinFragGrammars"/>. 
</p>
<div3 id="informedDocGrammars">
<head>Schema-informed Document Grammar</head>
<p>When schema information is available to describe the contents of an EXI stream, the following grammar describes the events that will occur in an <termref def="key-exidocument">EXI document</termref>. </p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr></thead>
<tbody><tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Document</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SD 
<emph>DocContent</emph></td>
<td>0</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>DocContent</emph> :</td></tr>

<tr>
<td></td>
<td></td>
<td>SE (G <sub>0</sub>) <emph>DocEnd</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (G <sub>1</sub>) <emph>DocEnd</emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&vellip;</td>
<td>&vellip;</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SE (G <sub><emph>n</emph>-1</sub>) <emph>DocEnd</emph></td>
<td><emph>n</emph>-1</td></tr>
<!-- tr>
<td></td>
<td></td>
<td>SE (G <emph><sub>i</sub></emph>) <emph>DocEnd</emph></td>
<td><emph>i</emph></td></tr>
<tr>
<td></td>
<td></td>
<td colspan="2">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;∀G 
<sub>0 &le; i &lt; n</sub> ∈ {qnames of global elements declared in schema}</td></tr -->
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>DocEnd</emph></td>
<td>
<emph>n</emph>.0</td></tr>
<tr>
<td></td>
<td></td>
<td>DT 
<emph>DocContent</emph></td>
<td>
<emph>n</emph>.1</td></tr>
<!-- tr>
<td></td>
<td></td>
<td>CH <emph>DocContent</emph></td>
<td>
<emph>n</emph>.2</td></tr -->
<tr>
<td></td>
<td></td>
<td>CM 
<emph>DocContent</emph></td>
<td>
<emph>n</emph>.2.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>DocContent</emph></td>
<td>
<emph>n</emph>.2.1</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<table>
<tr><th colspan="2" align="left">N.B.</th></tr>
</table></td></tr>
<tr>
<td></td>
<td colspan="3"><ulist><item>The variable 
<emph>n</emph> in the grammar above is the number of global elements declared in the schema.
G <sub>0</sub>, G <sub>1</sub>, ... G <sub><emph>n</emph>-1</sub> represent all the qnames of global elements sorted lexicographically, first by <emph>localName</emph>, then by <emph>uri</emph>.</item></ulist>
<ulist><item>The terminal symbol SE (*) is only matched if a more specific match does not exist among SE(G <sub>0</sub>), SE(G <sub>1</sub>), ... SE(G <sub><emph>n</emph>-1</sub>).</item></ulist></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>DocEnd</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>ED</td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>DocEnd</emph></td>
<td>1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>DocEnd</emph></td>
<td>1.1</td></tr></tbody></table>
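<p>The assignment of event codes to the <emph>DocContent</emph> productions can be sketched in Python as follows. The function <code>doc_content</code>, the representation of a qname as a (uri, localName) pair, the <code>{uri}localName</code> display notation and the example namespaces are all assumptions of this sketch; Python's default string ordering is used here as a stand-in for the lexicographic sort.</p>
<eg>
```python
def doc_content(global_elements):
    """Build the DocContent productions for the given global elements.

    global_elements is a list of (uri, localName) pairs.
    """
    # Sort lexicographically, first by localName, then by uri.
    ordered = sorted(global_elements, key=lambda q: (q[1], q[0]))
    n = len(ordered)
    productions = [("SE ({%s}%s) DocEnd" % q, (i,)) for i, q in enumerate(ordered)]
    productions += [
        ("SE (*) DocEnd", (n, 0)),
        ("DT DocContent", (n, 1)),
        ("CM DocContent", (n, 2, 0)),
        ("PI DocContent", (n, 2, 1)),
    ]
    return productions

# Three hypothetical global elements, deliberately given out of order.
productions = doc_content([
    ("urn:b", "order"),
    ("urn:a", "item"),
    ("urn:a", "order"),
])
for rhs, code in productions:
    print(".".join(map(str, code)), rhs)
```
</eg>
<p>With three global elements, <emph>n</emph> is 3, so the sketch assigns the event codes 0, 1 and 2 to the sorted element productions and 3.0, 3.1, 3.2.0 and 3.2.1 to the SE (*), DT, CM and PI productions.</p>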
<p></p>
<table>
<tbody>
<tr>
<th align="left">Semantics:</th></tr></tbody></table>
<p>All productions in the schema-informed document grammars of the form 
<emph>LeftHandSide</emph> : SE (*) 
<emph>RightHandSide</emph> are evaluated as follows: </p>
<olist>
<item>Let 
<emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE (*) </item>
<item>If a grammar does not exist for element 
<emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar">Built-in Element Grammar</termref></item>
<item>Evaluate the element contents using the SE (<emph>qname</emph>) grammar </item></olist></div3>
<div3 id="informedFragGrammars">
<head>Schema-informed Fragment Grammar</head>
<p>When schema information is available to describe the contents of an EXI stream, the following grammar describes the events that will occur in an <termref def="key-exifragment">EXI fragment</termref>. </p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr></thead>
<tbody><tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Fragment</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">SD 
<emph>FragmentContent</emph></td>
<td>0</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<emph>FragmentContent</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>0</sub>) <emph>FragmentContent</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>1</sub>) <emph>FragmentContent</emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&vellip;</td>
<td>&vellip;</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub><emph>n</emph>-1</sub>) <emph>FragmentContent</emph></td>
<td><emph>n</emph>-1</td></tr>
<!-- tr>
<td></td>
<td></td>
<td>SE (Fi) 
<emph>FragmentContent</emph></td>
<td>
<emph>i</emph></td></tr>
<tr>
<td></td>
<td></td>
<td colspan="2">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;∀F 
<sub>0 &le; i &lt; n</sub> ∈ {unique qnames for all elements declared in schema}</td></tr -->
<tr>
<td></td>
<td></td>
<td>ED</td>
<td>
<emph>n</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>SE (*) 
<emph>FragmentContent</emph></td>
<td>(<emph>n</emph>+1).0</td></tr>
<tr>
<td></td>
<td></td>
<td>CM 
<emph>FragmentContent</emph></td>
<td>(<emph>n</emph>+1).1.0</td></tr>
<tr>
<td></td>
<td></td>
<td>PI 
<emph>FragmentContent</emph></td>
<td>(<emph>n</emph>+1).1.1</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="3">
<table>
<tr><th colspan="2" align="left">N.B.</th></tr>
</table></td></tr>
<tr>
<td></td>
<td colspan="3">
<ulist><item>The variable <emph>n</emph> in the grammar above represents the number of unique element qnames declared in the schema. The variables F <sub>0</sub>, F <sub>1</sub>, ... F <sub><emph>n</emph>-1</sub> represent these qnames sorted lexicographically, first by <emph>localName</emph>, then by <emph>uri</emph>. If there is more than one element declared with the same qname, the qname is included only once and its content is evaluated according to the relaxed Element Fragment grammar described in <specref ref="informedElementFragGrammar"/>.</item>
</ulist><ulist>
<item>The terminal symbol SE (*) is only matched if a more specific match does not exist among SE(F <sub>0</sub>), SE(F <sub>1</sub>), ... SE(F <sub><emph>n</emph>-1</sub>).</item>
</ulist></td>
</tr></tbody></table>
<table>
<tbody>
<tr>
<th align="left">Semantics:</th></tr></tbody></table>
<p>All productions in the schema-informed fragment grammars of the form 
<emph>LeftHandSide</emph> : SE (*) 
<emph>RightHandSide</emph> are evaluated as follows: </p>
<olist>
<item>Let 
<emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE (*) </item>
<item>If a grammar does not exist for Element 
<emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar">Built-in Element Grammar</termref></item>
<item>Evaluate the element contents using the SE (<emph>qname</emph>) grammar </item></olist>
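<p>The ordering and event code assignment described in the notes above can be illustrated with a short, non-normative sketch. The function name <code>fragment_content_codes</code>, the (uri, localName) tuple representation of qnames, and the use of Python's default code point comparison for the lexicographic sort are assumptions made for illustration only; they are not defined by this specification.</p>
<eg><![CDATA[
```python
def fragment_content_codes(qnames):
    """Assign FragmentContent event codes to a collection of element qnames.

    Hypothetical helper: each qname is a (uri, localName) tuple and duplicates
    are included only once. Event codes are returned as tuples of parts, so
    (n + 1, 1, 0) stands for the event code (n+1).1.0.
    """
    # Sort first by localName, then by uri (assumed: code point order).
    unique = sorted(set(qnames), key=lambda q: (q[1], q[0]))
    n = len(unique)
    codes = {("SE", q): (i,) for i, q in enumerate(unique)}
    codes[("ED",)] = (n,)
    codes[("SE", "*")] = (n + 1, 0)
    codes[("CM",)] = (n + 1, 1, 0)
    codes[("PI",)] = (n + 1, 1, 1)
    return codes


# Four declarations, two of which share the qname {urn:b}item.
declared = [("urn:b", "item"), ("urn:a", "note"), ("urn:a", "item"), ("urn:b", "item")]
codes = fragment_content_codes(declared)
```
]]></eg>
<p>In this sketch the duplicated qname contributes a single SE production, so <emph>n</emph> is 3 and the ED production receives event code 3.</p>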
</div3>


<div3 id="informedElementFragGrammar">
<head>Schema-informed Element Fragment Grammar</head>
<p>When schema information is available to describe the contents of an EXI stream and more than one element is declared with the same qname, the following grammar describes the events that may occur in these elements when they occur inside an EXI fragment or EXI Element Fragment. </p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax</th>
<th align="left">Event Code</th></tr></thead>
<tbody>
<tr>
<td colspan="4">&nbsp;</td></tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>ElementFragmentStartTag</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td width="75%">AT (A <sub>0</sub>) <emph>ElementFragmentStartTag</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>AT (A <sub>1</sub>) <emph>ElementFragmentStartTag</emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&vellip;</td>
<td>&vellip;</td>
</tr>
<tr>
<td></td>
<td></td>
<td>AT (A <sub><emph>n</emph>-1</sub>) <emph>ElementFragmentStartTag</emph></td>
<td><emph>n</emph>-1</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>0</sub>) <emph>ElementFragmentContent</emph></td>
<td><emph>n</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>1</sub>) <emph>ElementFragmentContent</emph></td>
<td><emph>n</emph>+1</td></tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&vellip;</td>
<td>&vellip;</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub><emph>m</emph>-1</sub>) <emph>ElementFragmentContent</emph></td>
<td><emph>n</emph>+<emph>m</emph>-1</td></tr>
<tr>
<td></td>
<td></td>
<td>EE</td>
<td>
<emph>n</emph>+<emph>m</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>CH <emph>ElementFragmentContent</emph></td>
<td>
<emph>n</emph>+<emph>m</emph>+1</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>

<tr>
<td></td>
<td colspan="3">
<emph>ElementFragmentContent</emph> :</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>0</sub>) <emph>ElementFragmentContent</emph></td>
<td>0</td></tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub>1</sub>) <emph>ElementFragmentContent</emph></td>
<td>1</td></tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&vellip;</td>
<td>&vellip;</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SE (F <sub><emph>m</emph>-1</sub>) <emph>ElementFragmentContent</emph></td>
<td><emph>m</emph>-1</td></tr>
<tr>
<td></td>
<td></td>
<td>EE</td>
<td>
<emph>m</emph></td></tr>
<tr>
<td></td>
<td></td>
<td>CH <emph>ElementFragmentContent</emph></td>
<td>
<emph>m</emph>+1</td></tr>
<tr>
<td colspan="4">&nbsp;</td></tr>

<tr>
<td></td>
<td colspan="3">
<table>
<tr><th colspan="2" align="left">N.B.</th></tr>
</table></td></tr>
<tr>
<td></td>
<td colspan="3">
<ulist><item>The variable <emph>n</emph> in the grammar above represents the number of unique attribute qnames declared in the schema. The variables A <sub>0</sub>, A <sub>1</sub>, ... A <sub><emph>n</emph>-1</sub> represent these qnames sorted lexicographically, first by <emph>localName</emph>, then by <emph>uri</emph>. If there is more than one attribute declared with the same qname, the qname is included only once and its <emph>value</emph> is evaluated as a String.</item>
</ulist>
<ulist><item>The variable <emph>m</emph> in the grammar above represents the number of unique element qnames declared in the schema. The variables F <sub>0</sub>, F <sub>1</sub>, ... F <sub><emph>m</emph>-1</sub> represent these qnames sorted lexicographically, first by <emph>localName</emph>, then by <emph>uri</emph>. If there is more than one element declared with the same qname, the qname is included only once and its content is evaluated according to the relaxed Element Fragment grammar described above.</item>
</ulist>

</td>
</tr></tbody></table>
<p>As with all schema-informed element grammars, the Element Fragment grammar is augmented with additional productions that describe events that may occur in an EXI stream but are not explicitly declared in the schema. The process for augmenting the grammar is described in <specref ref="undeclaredProductions"/>.</p>
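<p>As a non-normative illustration, the event codes of the <emph>ElementFragmentStartTag</emph> productions listed in the table above, before any augmentation, might be computed as follows. The function name, the (uri, localName) tuple representation of qnames and the code point sort order are assumptions of this sketch.</p>
<eg><![CDATA[
```python
def element_fragment_start_tag_codes(attribute_qnames, element_qnames):
    """Hypothetical helper: map each declared production to its event code."""

    def by_local_name_then_uri(qname):
        uri, local_name = qname
        return (local_name, uri)

    # Each qname is included only once, however often it is declared.
    attributes = sorted(set(attribute_qnames), key=by_local_name_then_uri)
    elements = sorted(set(element_qnames), key=by_local_name_then_uri)
    n, m = len(attributes), len(elements)

    codes = {("AT", a): i for i, a in enumerate(attributes)}       # 0 .. n-1
    codes.update({("SE", f): n + j for j, f in enumerate(elements)})  # n .. n+m-1
    codes[("EE",)] = n + m
    codes[("CH",)] = n + m + 1
    return codes


start_tag_codes = element_fragment_start_tag_codes(
    [("", "id"), ("", "class")],
    [("urn:x", "b"), ("urn:x", "a"), ("urn:x", "b")],
)
```
]]></eg>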
</div3>

<div3 id="informedElemGrammars">
<head>Schema-informed Element and Type Grammars</head>
<p><termdef id="key-informedElementGrammar" term="Schema-informed Element Grammar">When one or more XML Schemas are available to describe the contents of an EXI stream, a <term>schema-informed element grammar</term> <emph>Element</emph><sub>&nbsp;i&nbsp;</sub> is derived for each element declaration <emph>E</emph><sub>&nbsp;i&nbsp;</sub> described by the schemas, where 0 &le; <emph>i</emph> &lt; <emph>n</emph> and <emph>n</emph> is the number of element declarations in the schemas.</termdef>
</p>
<p><termdef id="key-informedTypeGrammar" term="Schema-informed Type Grammar">When one or more XML Schemas are available to describe the contents of an EXI stream, a <term>schema-informed type grammar</term> <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> is derived 
for each named type declaration <emph>T</emph><sub>&nbsp;i&nbsp;</sub> described by the schemas as well as for each of the <xspecref spec="XS2" ref="built-in-primitive-datatypes">built-in primitive types</xspecref> and <xspecref spec="XS2" ref="built-in-derived">built-in derived types</xspecref>, the <xspecref spec="XS1" ref="key-urType">complex ur-type</xspecref> and the <xspecref spec="XS2" ref="dt-anySimpleType">simple ur-type</xspecref> defined by XML Schema specification <bibref ref="schema1"/><bibref ref="schema2"/>, where 0 &le; <emph>i</emph> &lt; <emph>n</emph> and <emph>n</emph> is the total number of such available types.
</termdef>
</p>
<p>Each schema-informed element grammar and type grammar is constructed according to the following four steps:</p>
<ol>
<li>Create a proto-grammar that describes the content model according to available schema information (see section <specref ref="protoGrammars"/>). 
</li>
<li>Normalize the proto-grammar into an EXI grammar (see section <specref ref="normalizedGrammars"/>).</li>
<li>Assign event codes to each production in the normalized EXI grammar (see section <specref ref="eventCodeAssignment"/>). 
</li>
<li>Add additional productions to the normalized EXI grammar to represent events that may occur in the EXI stream, but are not described by the schema, such as comments, processing-instructions, schema-deviations, etc. (see section <specref ref="undeclaredProductions"/>). 
</li></ol>
<p>Each element grammar <emph>Element</emph><sub>&nbsp;i&nbsp;</sub> includes a sequence of <emph>n</emph> non-terminals <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub>, where 0 &le; <emph>j</emph> &lt; <emph>n</emph>. The content of the entire element is described by the first non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub>. The remaining non-terminals describe portions of the element content. Likewise, each type grammar <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> includes a sequence of <emph>n</emph> non-terminals <emph>Type</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub> and the content of the entire type is described by the first non-terminal <emph>Type</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub>.</p>
<p>The algorithms expressed in this section provide a concise and formal description of the EXI grammars for a given set of XML Schema definitions. More efficient algorithms likely exist for generating these EXI grammars, and EXI implementations are free to use any algorithm that produces grammars and event codes that generate EXI encodings matching those produced by the grammars described here. </p>
<p>
An example is provided in the appendix (see <specref ref="grammarExamples"/>) that demonstrates the process described in this section to generate a complete schema-informed element grammar from an element declaration
in a schema.
</p>
<div4 id="protoGrammars">
<head>EXI Proto-Grammars</head>
<p>This section describes the process for creating the EXI proto-grammars from XML Schema declarations and definitions. EXI proto-grammars differ from normalized EXI grammars in that they may contain productions of the form:</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>LeftHandSide</emph> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td><emph>RightHandSide</emph></td>
</tr>
</tbody></table>
<p>where <emph>LeftHandSide</emph> and <emph>RightHandSide</emph> are both non-terminals. In contrast, every production in a normalized EXI grammar contains exactly one terminal symbol and at most one non-terminal symbol on the right-hand side. This is a restricted form of Greibach normal form <bibref ref="greibach"/>. </p>
<p>EXI proto-grammars are derived from XML Schema in a straightforward manner and can easily be normalized with a simple algorithm (see <specref ref="normalizedGrammars"/>).
</p>
<div5 id="grammarConcatOperator">
<head>Grammar Concatenation Operator</head>
<p>Proto-grammars are specified in a modular, constructive fashion. XML Schema components such as terms, particles and attribute uses are each transformed into a distinct proto-grammar, leveraging the proto-grammars of their sub-components. At various stages of proto-grammar construction, two or more proto-grammars are concatenated one after another to form more composite grammars.
</p>
<p>The grammar concatenation operator &oplus; is a binary, associative operator that creates a new grammar from its left and right grammar operands. The new grammar accepts any set of symbols accepted by its left operand followed by any set of symbols accepted by its right operand.
</p>
<p>Given a left operand <emph>Grammar<sup>&nbsp;L</sup></emph> and a right operand <emph>Grammar<sup>&nbsp;R</sup></emph>, the following operation
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>Grammar<sup>&nbsp;L</sup></emph> &oplus; <emph>Grammar<sup>&nbsp;R</sup></emph>
</td>
</tr>
</tbody></table>
<p>creates a combined grammar by replacing each production of the form
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Grammar<sup>&nbsp;L</sup></emph><sub>k</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
EE
</td>
</tr>
</tbody></table>
<p>where 0 &le; <emph>k</emph> &lt; <emph>n</emph> and <emph>n</emph> is the number of non-terminals that occur on the left hand side of productions in <emph>Grammar<sup>&nbsp;L</sup></emph>, with a production of the form
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Grammar<sup>&nbsp;L</sup></emph><sub>k</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>Grammar<sup>&nbsp;R</sup></emph><sub>0</sub>
</td>
</tr>
</tbody></table>
<p>connecting each accept state of <emph>Grammar<sup>&nbsp;L</sup></emph> with the start state of <emph>Grammar<sup>&nbsp;R</sup></emph>.
</p>
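<p>The following non-normative sketch shows one way the grammar concatenation operator might be realized. The data representation is an assumption of the sketch: a grammar is a list of non-terminals, each non-terminal is a list of productions, and a production is a pair (terminal, index of the next non-terminal). A proto-grammar production with only a non-terminal on its right-hand side is written with <code>None</code> as its terminal. The function name <code>concat</code> is hypothetical.</p>
<eg><![CDATA[
```python
EE = ("EE", None)  # the production "EE", which marks an accept state


def concat(left, right):
    """Return left (+) right without modifying either operand.

    Non-terminals of the right operand are renumbered to follow those of the
    left operand, and every EE production in the left operand is replaced by
    a production whose right-hand side is the right operand's non-terminal 0.
    """
    offset = len(left)
    combined = []
    for productions in left:
        combined.append([(None, offset) if p == EE else p for p in productions])
    for productions in right:
        combined.append(
            [(terminal, None if target is None else target + offset)
             for terminal, target in productions]
        )
    return combined


# Illustrative operands: a required attribute followed by character content.
attribute = [[("AT(a)", 1)], [EE]]
content = [[("CH", 1)], [EE]]
combined = concat(attribute, content)

# Associativity check with a third, trivial grammar that only accepts EE.
empty = [[EE]]
left_first = concat(concat(empty, attribute), content)
right_first = concat(empty, concat(attribute, content))
```
]]></eg>
<p>In the combined grammar, the former accept state of the left operand (non-terminal 1) now leads to non-terminal 2, which is the start state of the right operand.</p>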
</div5>
<div5 id="elementGrammars">
<head>Element Grammars</head>
<p>This section describes the process for creating an EXI element grammar from an XML Schema <xspecref spec="XS1" ref="cElement_Declarations">element declaration</xspecref>. </p>
<p>Given an element declaration <emph>E</emph><sub>&nbsp;i&nbsp;</sub>, with properties {name}, {target namespace}, {type definition}, {scope} and {nillable}, create a corresponding EXI grammar <emph>Element</emph><sub>&nbsp;i&nbsp;</sub> for evaluating the contents of elements in the specified {scope} with <emph>qname</emph> <emph>localName</emph> = {name} and <emph>qname</emph> <emph>uri</emph> = {target namespace}.</p>

<p>Let <emph>T</emph><sub>&nbsp;j</sub> be the {type definition} of <emph>E</emph><sub>&nbsp;i&nbsp;</sub> and <emph>Type</emph><sub>&nbsp;j</sub> be the type grammar created from <emph>T</emph><sub>&nbsp;j&nbsp;</sub>. The grammar <emph>Element</emph><sub>&nbsp;i&nbsp;</sub> describing the content model of <emph>E</emph><sub>&nbsp;i&nbsp;</sub> is created as follows.
</p>
<table width="100%">
<thead>
<tr>
<th>Syntax:</th>
<th colspan="2">&nbsp;</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>Type</emph><sub>&nbsp;j,&nbsp;0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody></table>

</div5>
<div5 id="typeGrammars">
<head>Type Grammars</head>
<p>Given an XML Schema type definition <emph>T</emph><sub>&nbsp;i&nbsp;</sub>, two type grammars are created, which are denoted by <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> and <emph>TypeEmpty</emph><sub>&nbsp;i&nbsp;</sub>. <emph>Type</emph><sub>&nbsp;i</sub> is a grammar that fully reflects the type definition of <emph>T</emph><sub>&nbsp;i&nbsp;</sub>, whereas <emph>TypeEmpty</emph><sub>&nbsp;i</sub> is a grammar that accepts only the attribute uses and attribute wildcards of <emph>T</emph><sub>&nbsp;i&nbsp;</sub>, if any.
</p>
<p>
<termdef id="key-contentIndex" term="content">For each type grammar <emph>Type</emph><sub>&nbsp;i&nbsp;</sub>, a unique index number <term><emph>content</emph></term> is determined such that all non-terminal symbols of indices smaller than <emph>content</emph> have at least one AT event and the rest of the non-terminal symbols in <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> do not have AT events on their right-hand side, where indices are assigned to non-terminal symbols in ascending order with the entry non-terminal symbol of <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> being assigned index 0 (zero).</termdef>
</p>
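<p>As a non-normative illustration, the <emph>content</emph> index of a normalized grammar might be determined as sketched below. The representation (a list of non-terminals, each a list of (terminal, next index) pairs with terminals written as strings) and the function names are assumptions of the sketch.</p>
<eg><![CDATA[
```python
def has_at(productions):
    """True if any production of this non-terminal has an AT terminal."""
    return any(terminal.startswith("AT") for terminal, _ in productions)


def content_index(grammar):
    """Count the leading non-terminals that carry at least one AT production.

    Raises ValueError if an AT production appears after that prefix, because
    no content index satisfying the definition would then exist.
    """
    content = 0
    while content < len(grammar) and has_at(grammar[content]):
        content += 1
    if any(has_at(productions) for productions in grammar[content:]):
        raise ValueError("AT production found at or after the content index")
    return content


# A grammar with no attributes: CH followed by EE.
simple_type = [[("CH", 1)], [("EE", None)]]

# A grammar whose first non-terminal accepts any attribute.
any_attributes = [
    [("AT(*)", 0), ("SE(*)", 1), ("EE", None), ("CH", 1)],
    [("SE(*)", 1), ("EE", None), ("CH", 1)],
]

simple_content = content_index(simple_type)
any_content = content_index(any_attributes)
```
]]></eg>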
<p>
Sections <specref ref="simpleTypeGrammars"/> and <specref ref="complexTypeGrammars"/> describe the processes for creating <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> and <emph>TypeEmpty</emph><sub>&nbsp;i&nbsp;</sub> from XML Schema <xspecref spec="XS1" ref="Simple_Type_Definitions">simple type definitions</xspecref> and <xspecref spec="XS1" ref="Complex_Type_Definitions">complex type definitions</xspecref> defined in schemas as well as <xspecref spec="XS2" ref="built-in-primitive-datatypes">built-in primitive types</xspecref>, <xspecref spec="XS2" ref="built-in-derived">built-in derived types</xspecref> and <xspecref spec="XS2" ref="dt-anySimpleType">simple ur-type</xspecref> defined by XML Schema specification <bibref ref="schema2"/>.

Section <specref ref="anyTypeGrammar"/> defines the grammar used for processing instances of element contents of type <xspecref spec="XS1" ref="d0e9252">xsd:anyType</xspecref>.
</p>

<div6 id="simpleTypeGrammars">
<head>SimpleType Grammars</head>
<p>This section describes the process for creating an EXI type grammar from an XML Schema <xspecref spec="XS1" ref="Simple_Type_Definitions">simple type definition</xspecref>.</p>
<p>Given a simple type definition <emph>T</emph><sub>&nbsp;i&nbsp;</sub>, with properties {name} and {target namespace},
 create two new EXI grammars <emph>Type</emph><sub>&nbsp;i</sub> and <emph>TypeEmpty</emph><sub>&nbsp;i</sub> for evaluating instances of types with qname localName = {name} and qname uri = {target namespace}.</p>
<p>Add the following grammar productions to <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> and  <emph>TypeEmpty</emph><sub>&nbsp;i</sub> : </p>

<table width="100%">
<thead>
<tr>
<th>Syntax:</th>
<th colspan="2">&nbsp;</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Type</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>CH [schema-valid
value
] <emph>Type</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub></td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Type</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td></td>
<td>EE</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>TypeEmpty</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody></table>
<table>
<tbody>
<tr>
<th align="left">Note:</th>
</tr>
<tr>
<td></td>
<td>
Productions of the form <emph>LeftHandSide</emph> : CH [schema-valid
value
] <emph>RightHandSide</emph> represent typed character data that is valid with respect to the schema. Schema-invalid character data is represented by productions of the form <emph>LeftHandSide</emph> : CH [schema-invalid
value
] <emph>RightHandSide</emph> described in section <specref ref="addingProductions"/>.
</td></tr>
</tbody></table>
<p>
The <termref def="key-contentIndex"><emph>content</emph></termref> index of grammar <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> created from an XML Schema simple type definition is always 0 (zero).
</p>

</div6>
<div6 id="complexTypeGrammars">
<head>Complex Type Grammars</head>
<p>This section describes the process for creating an EXI type grammar from an XML Schema <xspecref spec="XS1" ref="Complex_Type_Definitions">complex type definition</xspecref>.</p>
<p>Given a complex type definition <emph>T</emph><sub>&nbsp;i&nbsp;</sub>, with properties {name}, {target namespace}, 
{attribute uses}, {attribute wildcard} and {content type}, 
 create two EXI grammars <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> and <emph>TypeEmpty</emph><sub>&nbsp;i&nbsp;</sub> for evaluating instances of types with <emph>qname</emph> localName = {name} and <emph>qname</emph> uri = {target namespace}, as follows.
</p>
<!-- p>Given a complex type definition <emph>T</emph><sub>&nbsp;i</sub> , with properties {name}, {target namespace}, {base type definition}, {derivation method}, {abstract}, {attribute uses}, {attribute wildcard} and {content type}, create two EXI grammars <emph>Type</emph><sub>&nbsp;i</sub> and <emph>TypeEmpty</emph><sub>&nbsp;i</sub> for evaluating instances of types with <emph>qname</emph> localName = {name} and <emph>qname</emph> uri = {target namespace}, as follows.
</p -->
<p>Generate a grammar <emph>Attribute</emph><sub>&nbsp;i&nbsp;</sub>, for each attribute use <emph>A</emph><sub>&nbsp;i</sub> in {attribute uses} according to section <specref ref="attributeUses"/>.
</p>
<p>Sort the attribute use grammars first by <emph>qname</emph> localName, then by <emph>qname</emph> uri to form a sequence of grammars <emph>G</emph><sub>&nbsp;0&nbsp;</sub>, <emph>G</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;n-1&nbsp;</sub>, where <emph>n</emph> is the number of attribute uses in {attribute uses}. 
</p>
<p>
Generate an additional attribute use grammar <emph>G</emph><sub>&nbsp;n&nbsp;</sub> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;n,&nbsp;0&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE</td>
</tr>
</tbody></table>
<p>
 If an {attribute wildcard} is specified with the value <emph>any</emph>, add the following production to each grammar 
<emph>G</emph><sub>&nbsp;i&nbsp;</sub> 

generated above:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>AT(*) <emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub></td>
</tr>
</tbody></table>

<p>
If an {attribute wildcard} is specified with a set of values whose members are namespace names or the special value <emph>absent</emph> indicating no namespace, add the following production to each grammar 
<emph>G</emph><sub>&nbsp;i&nbsp;</sub> 
generated above:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>AT(<emph>uri<sub>x</sub></emph> : *) <emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub></td>
</tr>
</tbody></table>

<p>for each member <emph>uri<sub>x</sub></emph> in {attribute wildcard}.</p>

<p>Note that productions of the form <emph>LeftHandSide</emph> : AT(*) <emph>RightHandSide</emph> and <emph>LeftHandSide</emph> : AT(<emph>uri<sub>x</sub></emph> : *) <emph>RightHandSide</emph> are only matched if a more specific match for the current event does not exist in the grammar.
</p>
<p>
The grammar <emph>TypeEmpty</emph><sub>&nbsp;i</sub> is created by combining the sequence of attribute use grammars as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>TypeEmpty</emph><sub>&nbsp;i</sub> = <emph>G</emph><sub>&nbsp;0</sub> &oplus; <emph>G</emph><sub>&nbsp;1</sub> &oplus; &hellip; &oplus; <emph>G</emph><sub>&nbsp;n</sub>
</td>
</tr>
</tbody></table>

<p>
The grammar <emph>Type</emph><sub>&nbsp;i</sub> is generated as follows.
</p>

<p>If {content type} is a simple type definition
<emph>T</emph><sub>&nbsp;j</sub>
, generate a grammar <emph>Content</emph><sub>&nbsp;i</sub>
as <emph>Type</emph><sub>&nbsp;j</sub>
according to section <specref ref="simpleTypeGrammars"/>. 
If {content type} has a content model particle, generate a grammar <emph>Content</emph><sub>&nbsp;i</sub> according to section <specref ref="particles"/>.
Otherwise, if {content type} is <emph>empty</emph>, 
create a grammar <emph>Content</emph><sub>&nbsp;i</sub> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Content</emph><sub>&nbsp;i</sub> :
</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
EE
</td>
</tr>
</tbody></table>

<p>
If {content type} is a content model particle with mixed content, add a production for each non-terminal <emph>Content</emph><sub>&nbsp;i&nbsp;,&nbsp;j&nbsp;</sub> in <emph>Content</emph><sub>&nbsp;i&nbsp;</sub> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Content</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>CH <emph>Content</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub></td>
</tr>
</tbody></table>
<!-- table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Content</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>AT(*) <emph>Content</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub></td>
</tr>
</tbody></table -->
<p>Then, create a copy <emph>H</emph><sub>&nbsp;i&nbsp;</sub> of each attribute use grammar <emph>G</emph><sub>&nbsp;i&nbsp;</sub> and create the grammar <emph>Type</emph><sub>&nbsp;i</sub> by combining this sequence of attribute use grammars and the <emph>Content</emph><sub>&nbsp;i</sub> grammar using the grammar concatenation operator defined in section <specref ref="grammarConcatOperator"/> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>Type</emph><sub>&nbsp;i</sub> = <emph>H</emph><sub>&nbsp;0</sub> &oplus; <emph>H</emph><sub>&nbsp;1</sub> &oplus; &hellip; &oplus; <emph>H</emph><sub>&nbsp;n</sub> &oplus; <emph>Content</emph><sub>&nbsp;i</sub>
</td>
</tr>
</tbody></table>
<p>
The <termref def="key-contentIndex"><emph>content</emph></termref> index of grammar <emph>Type</emph><sub>&nbsp;i&nbsp;</sub> created from an XML Schema complex type definition is the index of the first non-terminal symbol of <emph>Content</emph><sub>&nbsp;i</sub> within the context of <emph>Type</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
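<p>The construction of <emph>TypeEmpty</emph><sub>&nbsp;i</sub> and <emph>Type</emph><sub>&nbsp;i</sub> described in this section can be sketched, non-normatively, as follows. The sketch covers attribute uses, the additional grammar <emph>G</emph><sub>&nbsp;n</sub> and an attribute wildcard with the value <emph>any</emph>; wildcards restricted to a set of namespaces are omitted. All function names and the list-based grammar representation are assumptions of the sketch, and the <emph>Content</emph><sub>&nbsp;i</sub> grammar is supplied by the caller.</p>
<eg><![CDATA[
```python
from functools import reduce

EE = ("EE", None)


def concat(left, right):
    """Grammar concatenation: each EE in left becomes a jump to right's start."""
    offset = len(left)
    joined = [[(None, offset) if p == EE else p for p in prods] for prods in left]
    joined += [[(t, None if nxt is None else nxt + offset) for t, nxt in prods]
               for prods in right]
    return joined


def attribute_grammar(qname, required):
    """Attribute use grammar; an optional attribute may be skipped via EE."""
    first = [(("AT", qname), 1)]
    if not required:
        first.append(EE)
    return [first, [EE]]


def complex_type_grammars(attribute_uses, content, wildcard_any=False):
    """attribute_uses is a list of (uri, localName, required) triples."""
    ordered = sorted(attribute_uses, key=lambda a: (a[1], a[0]))
    grammars = [attribute_grammar((uri, local), req) for uri, local, req in ordered]
    grammars.append([[EE]])  # the additional grammar G_n
    if wildcard_any:
        for g in grammars:
            g[0].append((("AT", "*"), 0))  # AT(*) loops back to G_i,0
    type_empty = reduce(concat, grammars)
    copies = [[list(prods) for prods in g] for g in grammars]  # H_0 .. H_n
    return reduce(concat, copies + [content]), type_empty


simple_content = [[("CH", 1)], [EE]]
type_grammar, type_empty = complex_type_grammars(
    [("", "b", True), ("", "a", False)], simple_content
)
```
]]></eg>
<p>The results are proto-grammars: they still contain productions with no terminal symbol and must be normalized before event codes are assigned.</p>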

</div6>
<div6 id="anyTypeGrammar">
<head>Complex Ur-Type Grammar</head>
<p>
XML Schema <bibref ref="schema1"/> defines a 
<xspecref spec="XS1" ref="key-urType">complex ur-type</xspecref>
called <xspecref spec="XS1" ref="d0e9252">xsd:anyType</xspecref>, which is the default type for declared elements when no type is specified in the declaration. The type xsd:anyType can be used as the type of declared elements in schemas, or as the explicit type given to elements by means of the xsi:type attribute in <termref def="key-schemainformed-existream">schema-informed EXI streams</termref>.
</p>
<p>When schemas are available to describe the body of an EXI stream, create an ur-type grammar <emph>Ur-Type</emph> that is used to process the element contents of type <xspecref spec="XS1" ref="d0e9252">xsd:anyType</xspecref> as follows.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Ur-Type</emph><sub>&nbsp;0</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT(*) &nbsp;<emph>Ur-Type</emph><sub>&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) &nbsp;<emph>Ur-Type</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH &nbsp;<emph>Ur-Type</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Ur-Type</emph><sub>&nbsp;1</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(*) &nbsp;<emph>Ur-Type</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH &nbsp;<emph>Ur-Type</emph><sub>&nbsp;1</sub>
</td>
</tr>
</tbody></table>
<p>
The <termref def="key-contentIndex"><emph>content</emph></termref> index of grammar <emph>Ur-Type</emph> is always 1 (one).
</p>
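<p>The ur-type grammar above can be read as a small state machine. The following non-normative sketch encodes its two non-terminals as a transition table and checks whether a sequence of event kinds is accepted. It considers only the events of a single element at one nesting level: the content of a child element matched by SE(*) is evaluated by that child's own grammar and is assumed to be already consumed. The names <code>UR_TYPE</code> and <code>accepts</code> are hypothetical.</p>
<eg><![CDATA[
```python
# Each entry maps an event kind to the index of the next non-terminal;
# None marks the EE production, which ends the element.
UR_TYPE = [
    {"AT": 0, "SE": 1, "EE": None, "CH": 1},  # Ur-Type 0
    {"SE": 1, "EE": None, "CH": 1},           # Ur-Type 1
]


def accepts(events):
    """Return True if the event kinds form a complete ur-type element."""
    state = 0
    for position, event in enumerate(events):
        if event not in UR_TYPE[state]:
            return False
        if event == "EE":
            # EE must be the final event of the sequence.
            return position == len(events) - 1
        state = UR_TYPE[state][event]
    return False  # the sequence ended before EE was seen


attributes_then_content = accepts(["AT", "AT", "CH", "SE", "CH", "EE"])
attribute_after_content = accepts(["CH", "AT", "EE"])
```
]]></eg>
<p>The second sequence is rejected because no AT production exists once the grammar has moved to its second non-terminal, which is consistent with a <emph>content</emph> index of 1.</p>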
</div6>
</div5>
<div5 id="attributeUses">
<head>Attribute Uses</head>
<p>
Given an attribute use <emph>A</emph><sub>&nbsp;i</sub> with properties {required} and {attribute declaration}, where {attribute declaration} has properties {name}, {target namespace} and {scope}, generate a new EXI grammar <emph>Attribute</emph><sub>&nbsp;i</sub> for evaluating attributes in the specified {scope} with qname localName = {name} and qname uri = {target namespace}. Add the following grammar productions to <emph>Attribute</emph><sub>&nbsp;i&nbsp;</sub>:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Attribute</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>AT(<emph>qname</emph>) <emph>Attribute</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub></td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Attribute</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub> :</td></tr>
<tr>
<td></td>
<td></td>
<td>EE</td>
</tr>
</tbody></table>
<p>If the {required} property of <emph>A</emph><sub>&nbsp;i</sub> is false, add the following grammar production to indicate this attribute occurrence may be omitted from the content model.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Attribute</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE</td>
</tr>
</tbody></table>
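<p>A non-normative sketch of the attribute use grammar follows. The function name <code>attribute_grammar</code>, the string form of the terminal and the list-based representation (non-terminals as lists of (terminal, next index) pairs) are assumptions made for illustration.</p>
<eg><![CDATA[
```python
EE = ("EE", None)


def attribute_grammar(qname, required):
    """Build the two non-terminals Attribute 0 and Attribute 1 for one use."""
    first = [("AT(%s)" % qname, 1)]  # AT(qname) leads to non-terminal 1
    if not required:
        first.append(EE)             # the attribute may be omitted
    return [first, [EE]]


required_id = attribute_grammar("id", required=True)
optional_lang = attribute_grammar("lang", required=False)
```
]]></eg>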
</div5>
<div5 id="particles">
<head>Particles</head>
<p>Given 
an XML Schema <xspecref spec="XS1" ref="cParticles">particle</xspecref> 
<emph>P</emph><sub>&nbsp;i</sub> with {min occurs}, {max occurs} and {term} properties, generate a grammar <emph>Particle</emph><sub>&nbsp;i</sub> for evaluating instances of <emph>P</emph><sub>&nbsp;i</sub> as follows.
</p>
<p>If {term} is an element declaration, generate the grammar <emph>Term</emph><sub>&nbsp;0</sub> according to section <specref ref="elementTerms"/>. If {term} is a wildcard, generate the grammar <emph>Term</emph><sub>&nbsp;0</sub> according to section <specref ref="wildcardTerms"/>. If {term} is a model group, generate the grammar <emph>Term</emph><sub>&nbsp;0</sub> according to section <specref ref="modelGroupTerms"/>.
</p>
<p>Create {min occurs} copies of <emph>Term</emph><sub>&nbsp;0&nbsp;</sub>.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>G</emph><sub>&nbsp;0&nbsp;</sub>, <emph>G</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;{min occurs}-1&nbsp;</sub>
</td>
</tr>
</tbody></table>
<p>If {max occurs} is not unbounded, create {max occurs} – {min occurs} additional copies of <emph>Term</emph><sub>&nbsp;0&nbsp;</sub>.</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>G</emph><sub>&nbsp;{min occurs}&nbsp;</sub>, <emph>G</emph><sub>&nbsp;{min occurs}+1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;{max occurs}-1</sub>
</td>
</tr>
</tbody></table>
<p>Add the following productions to each of the grammars that do not already have a production of this form.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE &nbsp;&nbsp;&nbsp;&nbsp;where {min occurs} &le; <emph>i</emph> &lt; {max occurs}</td>
</tr>
</tbody></table>
<p>indicating these instances of <emph>Term</emph><sub>&nbsp;0</sub> may be omitted from the content model. Then, create the grammar for <emph>Particle</emph><sub>&nbsp;i</sub> using the grammar concatenation operator defined in section <specref ref="grammarConcatOperator"/> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>Particle</emph><sub>&nbsp;i</sub> = <emph>G</emph><sub>&nbsp;0</sub> &oplus; <emph>G</emph><sub>&nbsp;1</sub> &oplus; &hellip; &oplus; <emph>G</emph><sub>&nbsp;{max occurs}-1</sub>
</td>
</tr>
</tbody></table>
<p>
Otherwise, if {max occurs} is unbounded, generate one additional copy of <emph>Term</emph><sub>&nbsp;0&nbsp;</sub>, denoted <emph>G</emph><sub>&nbsp;{min occurs}</sub>, and replace all productions of the form: 
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;{min occurs},&nbsp;k</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE</td>
</tr>
</tbody></table>
<p>
with productions of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;{min occurs},&nbsp;k</sub> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td><emph>G</emph><sub>&nbsp;{min occurs},&nbsp;0</sub></td>
</tr>
</tbody></table>
<p>indicating this term may be repeated indefinitely. Then if there is no production of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;{min occurs},&nbsp;0</sub> : </td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE</td>
</tr>
</tbody></table>
<p>add one after the other productions that have the non-terminal <emph>G</emph><sub>&nbsp;{min occurs},&nbsp;0</sub> on the left hand side, indicating that this term may be omitted from the content model. Then, create the grammar for <emph>Particle</emph><sub>&nbsp;i</sub> using the grammar concatenation operator defined in section <specref ref="grammarConcatOperator"/> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>Particle</emph><sub>&nbsp;i</sub> = <emph>G</emph><sub>&nbsp;0</sub> &oplus; <emph>G</emph><sub>&nbsp;1</sub> &oplus; &hellip; &oplus; <emph>G</emph><sub>&nbsp;{min occurs}</sub>
</td>
</tr>
</tbody></table>
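<p>The following non-normative sketch illustrates how the copies of <emph>Term</emph><sub>&nbsp;0</sub> described above could be planned from {min occurs} and {max occurs}. The names <code>Copy</code> and <code>plan_particle</code>, and the use of <code>None</code> to stand for an unbounded {max occurs}, are assumptions made for illustration only and are not part of this specification.</p>
<eg><![CDATA[
```python
from typing import List, NamedTuple, Optional


class Copy(NamedTuple):
    index: int       # k, identifying the copy G_k of Term_0
    optional: bool   # True if an EE production is added to G_k,0
    repeats: bool    # True if EE productions of G_k are redirected to G_k,0


def plan_particle(min_occurs: int, max_occurs: Optional[int]) -> List[Copy]:
    """Describe the copies of Term_0 that are concatenated into Particle_i.

    max_occurs=None is a hypothetical encoding of "unbounded".
    """
    if max_occurs is None:
        # {min occurs} mandatory copies plus one additional copy that may be
        # omitted and that loops back to its own first non-terminal.
        copies = [Copy(k, False, False) for k in range(min_occurs)]
        copies.append(Copy(min_occurs, True, True))
        return copies
    # Bounded case: copies with index >= {min occurs} may be omitted.
    return [Copy(k, k >= min_occurs, False) for k in range(max_occurs)]
```
]]></eg>
<p>For example, under these assumptions a particle with {min occurs} 1 and {max occurs} 3 yields three copies, of which the last two may be omitted.</p>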
</div5>
<div5 id="elementTerms">
<head>Element Terms</head>
<p>
Given a particle {term} <emph>PT</emph><sub>&nbsp;i</sub> that is an XML Schema <xspecref spec="XS1" ref="cElement_Declarations">element declaration</xspecref> 
with properties {name} and {target namespace}, let <emph>S</emph> be the set of element declarations that directly or indirectly reach the element declaration <emph>PT</emph><sub>&nbsp;i</sub> through the chain of {substitution group affiliation} properties of the elements, plus <emph>PT</emph><sub>&nbsp;i</sub> itself if it is not already in the set. Sort the element declarations in <emph>S</emph> lexicographically, first by {name} and then by {target namespace}, which yields a sorted list of element declarations <emph>E<sub>&nbsp;0&nbsp;</sub></emph>, <emph>E<sub>&nbsp;1&nbsp;</sub></emph>, &hellip; <emph>E<sub>&nbsp;n-1&nbsp;</sub></emph> where <emph>n</emph> is the cardinality of <emph>S</emph>. Then create the grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> with the following grammar productions:
</p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Syntax:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>qname</emph><sub>&nbsp;0&nbsp;</sub>) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>qname</emph><sub>&nbsp;1&nbsp;</sub>) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&vellip;</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>qname</emph><sub>&nbsp;n-1&nbsp;</sub>) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub></td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td colspan="3">
In the productions above, <emph>qname<sub>&nbsp;x&nbsp;</sub></emph> (where 0 &le; <emph>x</emph> &lt; <emph>n</emph>) represents a <emph>qname</emph> whose local-name and uri are the {name} property and the {target namespace} property of the element declaration <emph>E<sub>&nbsp;x&nbsp;</sub></emph>, respectively.</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td colspan="3">
<p>In a schema-informed grammar, all productions of the form <emph>LeftHandSide</emph> : SE(<emph>qname</emph>) <emph>RightHandSide</emph> are evaluated as follows:
</p>
<olist>
<item>Evaluate the element contents using the SE(<emph>qname</emph>) grammar.</item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>

</td>
</tr>
</tbody>
</table>
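<p>The ordering of the element declarations in <emph>S</emph> described above can be sketched as follows. This is a non-normative illustration: the function name <code>sort_substitution_members</code> is hypothetical, element declarations are modelled as simple (name, target namespace) pairs, an absent target namespace is assumed to be represented by <code>None</code> and ordered as the empty string, and Python string comparison (by code point) is assumed to be an acceptable stand-in for the lexicographic order.</p>
<eg><![CDATA[
```python
def sort_substitution_members(declarations):
    """Return the members of S sorted first by {name}, then by {target namespace}.

    declarations: iterable of (name, target_namespace) pairs; duplicates are
    collapsed because S is a set.
    """
    return sorted(set(declarations), key=lambda d: (d[0], d[1] or ""))
```
]]></eg>
<p>The position of each declaration in the returned list corresponds to the index <emph>x</emph> of <emph>qname<sub>&nbsp;x&nbsp;</sub></emph> in the productions of <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub>.</p>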

</div5>
<div5 id="wildcardTerms">
<head>Wildcard Terms</head>
<p>
Given a particle {term} <emph>PT</emph><sub>&nbsp;i</sub> that is an XML Schema <xspecref spec="XS1" ref="Wildcards">wildcard</xspecref>
with property {namespace constraint}, a grammar that reflects the wildcard definition is created as follows.
</p>
<p>
Create a grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> containing the following grammar production:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
EE
</td>
</tr>
</tbody>
</table>

<p>When the wildcard's {namespace constraint} is either <emph>any</emph> or <emph>other</emph>,
add the following production to <emph>ParticleTerm</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(*) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub>
</td>
</tr>
</tbody>
</table>
<p>Otherwise (i.e. {namespace constraint} being a set of namespace names), for each member value <emph>uri<sub>&nbsp;x&nbsp;</sub></emph> in {namespace constraint} where 0 &le; <emph>x</emph> &lt; <emph>n</emph>, and <emph>n</emph> is the number of members, augment the <emph>uri</emph> partition of the String table with <emph>uri<sub>&nbsp;x&nbsp;</sub></emph> (see section <specref ref="stringTablePartitions"/> for String table pre-population), and add the following production to <emph>ParticleTerm</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub>
<!-- 
&nbsp;&nbsp;&nbsp;&nbsp;(when <emph>uri<sub>&nbsp;x&nbsp;</sub></emph> is a namespace name)
-->
</td>
</tr>
<!-- tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>""<sub>&nbsp;</sub></emph>:&nbsp;*) <emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;1</sub>
&nbsp;&nbsp;&nbsp;&nbsp;(when <emph>uri<sub>&nbsp;x&nbsp;</sub></emph> is the value <xspecref spec="XS1" ref="key-null">absent</xspecref>)
</td>
</tr -->
</tbody>
</table>
<p>
Note that productions whose right hand side starts with the terminal SE(*) or SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*) are matched only if a more specific match for the current event does not exist in the grammar. Among the productions with the same <emph>LeftHandSide</emph>, terminals SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*) are tried before falling back to SE(*), if any.
</p>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="3">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td colspan="3">
</td>
</tr>
<tr>
<td colspan="3">
In a schema-informed grammar, all productions of the form <emph>LeftHandSide</emph> : Terminal <emph>RightHandSide</emph> where Terminal is one of SE(*) or SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*) are evaluated as follows:
</td>
</tr>
</tbody>
</table>
<olist>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the element matched by SE(*)
or SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*).
</item>
<item>If a grammar does not exist for Element <emph>qname</emph>, create one based on the <termref def="key-builtinElementGrammar"/>.</item>
<item>Evaluate the element contents using the SE(<emph>qname</emph>) grammar.</item>
<item>Evaluate the remainder of the event sequence using <emph>RightHandSide</emph>.</item>
</olist>
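<p>The matching precedence among SE(<emph>qname</emph>), SE(<emph>uri<sub>&nbsp;x&nbsp;</sub></emph>:&nbsp;*) and SE(*) described in this section can be sketched as follows. This is a non-normative illustration under stated assumptions: the function name <code>match_start_element</code> is hypothetical, and the SE terminals available for one <emph>LeftHandSide</emph> are modelled as a set of (uri, local-name) pairs in which <code>None</code> stands for the wildcard.</p>
<eg><![CDATA[
```python
def match_start_element(terminals, uri, local_name):
    """Pick the most specific SE terminal that matches an element.

    terminals: set of (uri, local_name) pairs, where
      (uri, local_name) models SE(qname),
      (uri, None)       models SE(uri:*),
      (None, None)      models SE(*).
    Returns the matched pair, or None if no SE terminal matches.
    """
    for candidate in ((uri, local_name), (uri, None), (None, None)):
        if candidate in terminals:
            return candidate
    return None
```
]]></eg>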
</div5>
<div5 id="modelGroupTerms">
<head>Model Group Terms</head>
<div6 id="sequenceGroupTerms">
<head>Sequence Model Groups</head>
<p>Given a particle {term} <emph>PT</emph><sub>&nbsp;i</sub> that is a model group with {compositor} equal to "sequence" and a list of <emph>n</emph> {particles} <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1&nbsp;</sub>, create a grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> as follows:
</p>
<p>If the value of <emph>n</emph> is 0, add the following productions to the grammar <emph>ParticleTerm</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
EE
</td>
</tr>
</tbody></table>
<p>Otherwise, generate a sequence of grammars <emph>Particle</emph><sub>&nbsp;0&nbsp;</sub>, <emph>Particle</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>Particle</emph><sub>&nbsp;n-1</sub> corresponding to the list of particles <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1</sub> according to section <specref ref="particles"/>. Then combine the sequence of grammars using the grammar concatenation operator defined in section <specref ref="grammarConcatOperator"/> as follows:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>ParticleTerm</emph><sub>&nbsp;i</sub> = <emph>Particle</emph><sub>&nbsp;0</sub> &oplus; <emph>Particle</emph><sub>&nbsp;1</sub> &oplus; &hellip; &oplus; <emph>Particle</emph><sub>&nbsp;n-1</sub></td>
</tr>
</tbody></table>
</div6>
<div6 id="choiceGroupTerms">
<head>Choice Model Groups</head>
<p>Given a particle {term} <emph>PT</emph><sub>&nbsp;i</sub> that is a model group with {compositor} equal to "choice" and a list of <emph>n</emph> {particles} <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1&nbsp;</sub>, create a grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> as follows:
</p>
<p>
If the value of <emph>n</emph> is 0, add the following productions to the grammar <emph>ParticleTerm</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
EE
</td>
</tr>
</tbody></table>
<p>Otherwise, generate a sequence of grammars <emph>Particle</emph><sub>&nbsp;0&nbsp;</sub>, <emph>Particle</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>Particle</emph><sub>&nbsp;n-1</sub> corresponding to the list of particles <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1</sub> according to section <specref ref="particles"/>. Then create the grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> with the following grammar productions:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>ParticleTerm</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>Particle</emph><sub>&nbsp;0,&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Particle</emph><sub>&nbsp;1,&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Particle</emph><sub>&nbsp;n-1,&nbsp;0</sub>
</td>
</tr>
</tbody></table>
<p>
indicating the grammar for the term may accept any one of the given {particles}.
</p>
</div6>
<div6 id="allGroupTerms">
<head>All Model Groups</head>
<p>Given a particle {term} <emph>PT</emph><sub>&nbsp;i</sub> that is a model group with {compositor} equal to "all" and a list of <emph>n</emph> {particles} <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1&nbsp;</sub>, create a grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> as follows:
</p>
<p>Generate a set of grammars <emph>S<sub>&nbsp;0</sub></emph> = { <emph>Particle</emph><sub>&nbsp;0&nbsp;</sub>, <emph>Particle</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>Particle</emph><sub>&nbsp;n-1&nbsp;</sub>} corresponding to the list of particles <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>P</emph><sub>&nbsp;n-1</sub> according to section <specref ref="particles"/>. Then, generate the grammar <emph>ParticleTerm</emph><sub>&nbsp;i</sub> from the set <emph>S</emph><sub>&nbsp;0</sub> by applying the following rules.</p>
<olist>
<item>Given a newly created grammar <emph>G</emph> and a set of <emph>m</emph> grammars, <emph>S</emph> = {<emph>G</emph><sub>&nbsp;0&nbsp;</sub>, <emph>G</emph><sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;m-1&nbsp;</sub>}, the productions of <emph>G</emph> are derived from <emph>S</emph> by the following steps. 
<table width="100%">
<tbody>
<tr>
<td>&nbsp;</td></tr>
</tbody></table>
</item>
<item>If the value of <emph>m</emph> is 0, add the following productions to the grammar <emph>G</emph>, which completes the grammar <emph>G</emph>.
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="2">
<emph>G</emph> :</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>EE</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody></table>
</item>
<item>Otherwise, if <emph>m</emph> &gt; 0, generate a sequence of grammars:
<emph>Part</emph><sub>&nbsp;j</sub> = <emph>C</emph><sub>&nbsp;j</sub> &oplus; <emph>All</emph>(<emph>S</emph> – {<emph>G</emph><sub>&nbsp;j&nbsp;</sub>}) where 0 &le; <emph>j</emph> &lt; <emph>m</emph>, <emph>C</emph><sub>&nbsp;j</sub> is a copy of <emph>G</emph><sub>&nbsp;j</sub> and <emph>All</emph>(<emph>S</emph> – {<emph>G</emph><sub>&nbsp;j&nbsp;</sub>}) is the All grammar for the set (<emph>S</emph> – {<emph>G</emph><sub>&nbsp;j&nbsp;</sub>}) created by applying this sequence of rules recursively, starting at step 1. Then add the following productions to the grammar <emph>G</emph>:
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">&nbsp;</td></tr>
<tr>
<td></td>
<td colspan="2">
<emph>G</emph> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>Part</emph><sub>&nbsp;0,&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Part</emph><sub>&nbsp;1,&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Part</emph><sub>&nbsp;m-1,&nbsp;0</sub>
</td>
</tr>
</tbody></table>
</item>
</olist>
<note>

Although the algorithm presented above for constructing <emph>all</emph> model group grammars is well suited to describing the grammar system clearly and in an implementation-independent manner, implementations may choose a more efficient method for grammar construction, because the number of grammars the algorithm would generate tends to grow exponentially relative to the number of particles.
An example of an alternative grammar generation strategy that may lend itself better to implementation is to use an array of boolean values, each of which corresponds to one of the particles, to keep track of particle occurrences, and to construct the grammars at run-time based on those particles that have yet to occur at the time of evaluation.
This sort of simplified special handling of <emph>all</emph> model groups might be appealing given that <emph>all</emph> model group instances are properly disconnected from other model group instances, in that an <emph>all</emph> model group can only appear as the immediate content model of a complex type and the particles that it contains always have element terms, not model group terms. Implementations are free to choose or invent any method or any combination of methods to construct grammars for <emph>all</emph> model groups.

</note>
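<p>The boolean-array strategy mentioned in the note above might be sketched as follows. This is a non-normative illustration and only one of many possible approaches: the class name <code>AllGroupState</code> and its methods are hypothetical, each particle is assumed to have an element term that occurs at most once, and the assignment of event codes to the resulting productions is not addressed.</p>
<eg><![CDATA[
```python
class AllGroupState:
    """Run-time state for an "all" model group, one boolean per particle."""

    def __init__(self, names, required):
        self.names = list(names)          # element names of the particles
        self.required = list(required)    # True where {min occurs} is 1
        self.seen = [False] * len(self.names)

    def expected(self):
        """Terminals acceptable next, derived from particles not yet seen."""
        terminals = ["SE(%s)" % n
                     for n, s in zip(self.names, self.seen) if not s]
        # EE is acceptable once every required particle has occurred.
        if all(s or not r for s, r in zip(self.seen, self.required)):
            terminals.append("EE")
        return terminals

    def accept(self, name):
        """Record the occurrence of the particle with the given element name."""
        i = self.names.index(name)
        if self.seen[i]:
            raise ValueError("particle already occurred: " + name)
        self.seen[i] = True
```
]]></eg>
<p>In this sketch the set of acceptable terminals is recomputed from the boolean array at each step, which avoids materializing a grammar for every subset of particles.</p>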
</div6>
</div5>
</div4>
<div4 id="normalizedGrammars">
<head>EXI Normalized Grammars</head>
<p>This section describes the process for converting an EXI proto-grammar derived from an XML Schema in accordance with section <specref ref="protoGrammars"/> into an EXI normalized grammar. Each production in an EXI normalized grammar has exactly one non-terminal symbol on the left hand side and one terminal symbol on the right hand side followed by at most one non-terminal symbol on the right hand side. In addition, EXI normalized grammars contain no two grammar productions with the same non-terminal on the left hand side and the same terminal symbol on the right hand side. This is a restricted form of Greibach normal form <bibref ref="greibach"/>.
</p>
<p>EXI proto-grammars differ from normalized EXI grammars in that they may contain productions of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>LeftHandSide</emph> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>RightHandSide</emph>
</td>
</tr>
</tbody></table>
<p>where <emph>LeftHandSide</emph> and <emph>RightHandSide</emph> are both non-terminals. Therefore, the first step of the normalization process focuses on replacing productions in this form with productions that conform to the EXI normalized grammar rules. This process can produce a grammar that has more than one production with the same non-terminal on the left hand side and the same terminal symbol on the right hand side. Therefore, the second step focuses on eliminating such productions.
</p>
<p>The first step of the normalization process is described in section <specref ref="eliminatingProductions"/>. The second step is described in section <specref ref="eliminatingSymbols"/>. Once these two steps are completed, the grammar will be an EXI normalized grammar.
</p>
<div5 id="eliminatingProductions">
<head>Eliminating Productions with no Terminal Symbol</head>
<p>
Given an EXI proto-grammar <emph>G</emph><sub>&nbsp;i&nbsp;</sub>, with non-terminals <emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub>, <emph>G</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;i,&nbsp;n-1&nbsp;</sub>, replace each production of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>&nbsp;&nbsp;&nbsp;&nbsp;where 0 &le; <emph>j</emph> &lt; <emph>n</emph> and 0 &le; <emph>k</emph> &lt; <emph>n</emph>
</td>
</tr>
</tbody></table>
<p>with a set of productions:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;m-1</sub>
</td>
</tr>
</tbody></table>
<p>where <emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;0&nbsp;</sub>, <emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;1&nbsp;</sub>, &hellip;, <emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;m-1</sub> represent the right hand side of each production in <emph>G</emph><sub>&nbsp;i</sub> that has the non-terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub> on the left hand side and <emph>m</emph> is the number of such productions.
</p>
<p>Repeat this process until there are no more productions of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>&nbsp;&nbsp;&nbsp;&nbsp;where 0 &le; <emph>j</emph> &lt; <emph>n</emph> and 0 &le; <emph>k</emph> &lt; <emph>n</emph>
</td>
</tr>
</tbody></table>
<p>in the grammar <emph>G</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
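<p>The elimination of productions with no terminal symbol can be sketched as follows. This is a non-normative illustration under stated assumptions: the function name <code>eliminate_nonterminal_only</code> and the grammar representation (a dictionary mapping each non-terminal to a list of right hand sides, where a bare string models a right hand side consisting of a single non-terminal and a pair models a terminal followed by an optional non-terminal) are hypothetical. The sketch expands each non-terminal recursively rather than by literal repeated replacement, and it assumes that a reference back to a non-terminal already being expanded contributes no productions.</p>
<eg><![CDATA[
```python
def eliminate_nonterminal_only(grammar):
    """Replace each production G_j : G_k with the right hand sides of G_k.

    grammar: dict mapping a non-terminal name to a list of right hand sides.
      A str models a right hand side that is just a non-terminal.
      A (terminal, next_non_terminal_or_None) pair models a normal production.
    Returns a new dict in which no right hand side is a bare non-terminal.
    """
    def expand(symbol, visiting):
        result = []
        for rhs in grammar[symbol]:
            if isinstance(rhs, str):
                # Assumption: skip non-terminals already being expanded so
                # that cyclic references terminate.
                if rhs not in visiting:
                    result.extend(expand(rhs, visiting | {rhs}))
            else:
                result.append(rhs)
        return result

    return {lhs: expand(lhs, {lhs}) for lhs in grammar}
```
]]></eg>
<p>The result may still contain several productions with the same terminal for one non-terminal; these are handled by the second normalization step.</p>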
</div5>
<div5 id="eliminatingSymbols">
<head>Eliminating Duplicate Terminal Symbols</head>
<p>Given an EXI proto-grammar <emph>G</emph><sub>&nbsp;i&nbsp;</sub>, with non-terminals <emph>G</emph><sub>&nbsp;i,&nbsp;0&nbsp;</sub>, <emph>G</emph><sub>&nbsp;i,&nbsp;1&nbsp;</sub>, &hellip;, <emph>G</emph><sub>&nbsp;i,&nbsp;n-1&nbsp;</sub>, identify all pairs of productions that have the same non-terminal on the left hand side and the same terminal symbol on the right hand side of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;l</sub>
</td>
</tr>
</tbody></table>
<p>where <emph>k</emph> &nbsp;&ne;&nbsp; <emph>l</emph> and Terminal represents a particular terminal symbol, and replace them with a single production:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;&sqcup;&nbsp;l</sub>
</td>
</tr>
</tbody></table>
<p>
where <emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;&sqcup;&nbsp;l</sub> is a distinct non-terminal that accepts the inputs accepted by <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub> and the inputs accepted by <emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>.

Here the notation "&nbsp;&nbsp;k&nbsp;&sqcup;&nbsp;l&nbsp;&nbsp;" denotes a union set of integers and is used to uniquely identify the index of such a non-terminal.
</p>
<p>
When <emph>G</emph><sub>&nbsp;i&nbsp;</sub> is a type grammar, if both <emph>k</emph> and <emph>l</emph> are smaller than the <termref def="key-contentIndex"><emph>content</emph></termref> index of <emph>G</emph><sub>&nbsp;i&nbsp;</sub>, k&nbsp;&sqcup;&nbsp;l is also considered to be smaller than <emph>content</emph> for the purpose of index comparison. Otherwise, if either <emph>k</emph> or <emph>l</emph> is not smaller than <emph>content</emph>, k&nbsp;&sqcup;&nbsp;l is considered to be larger than <emph>content</emph>.
</p>
<p>
 If the non-terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;&sqcup;&nbsp;l</sub> does not exist, create it as follows: 
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;&sqcup;&nbsp;l</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;m-1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;n-1</sub>
</td>
</tr>
</tbody></table>
<p>where <emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;0</sub>,
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;1</sub>,
&hellip;,
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;k&nbsp;</sub>)<sub>&nbsp;m-1</sub>
and
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;0</sub>, 
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;1</sub>,
&hellip;,
<emph>RHS</emph>(<emph>G</emph><sub>&nbsp;i,&nbsp;l&nbsp;</sub>)<sub>&nbsp;n-1</sub>
represent the right hand side of each production in the grammar <emph>G</emph><sub>&nbsp;i</sub> that has the non-terminals <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub> and <emph>G</emph><sub>&nbsp;i,&nbsp;l</sub> on the left hand side respectively, and <emph>m</emph> and <emph>n</emph> are the numbers of such productions.
</p>
<p>Repeat this process until there are no more productions in the grammar <emph>G</emph><sub>&nbsp;i</sub> of the form:
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;l</sub>
</td>
</tr>
</tbody></table>
<p>Then, identify any identical productions of the following form: 
</p>
<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
Terminal <emph>G</emph><sub>&nbsp;i,&nbsp;k</sub>
</td>
</tr>
</tbody></table>
<p>
where 0 &le; <emph>k</emph> &lt; <emph>n</emph>, <emph>n</emph> is the number of productions in <emph>G</emph><sub>&nbsp;i</sub> and Terminal represents a specific terminal symbol, then remove one of them until there are no more productions remaining in the grammar <emph>G</emph><sub>&nbsp;i</sub> of this form.
</p>
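<p>The elimination of duplicate terminal symbols can be sketched as follows. This is a non-normative illustration under stated assumptions: the function name <code>merge_duplicate_terminals</code> is hypothetical, each non-terminal is identified by a <code>frozenset</code> of integers so that the index k&nbsp;&sqcup;&nbsp;l is simply the set union, and a production is modelled as a (terminal, target) pair in which the target is <code>None</code> for productions such as EE that have no non-terminal on the right hand side.</p>
<eg><![CDATA[
```python
def merge_duplicate_terminals(grammar):
    """Merge productions that share a left hand side and a terminal.

    grammar: dict mapping frozenset-of-int indices to lists of
      (terminal, target) pairs, where target is a frozenset index or None.
    Returns a new dict with at most one production per terminal for each
    non-terminal; union non-terminals are created on demand.
    """
    g = {lhs: list(rhss) for lhs, rhss in grammar.items()}
    work = list(g)
    while work:
        lhs = work.pop()
        merged = {}  # terminal -> target, in first-seen order
        for terminal, target in g[lhs]:
            current = merged.get(terminal, target)
            if current != target:
                union = current | target
                if union not in g:
                    # New non-terminal: right hand sides of both members.
                    g[union] = g[current] + g[target]
                    work.append(union)
                target = union
            # Identical productions collapse to a single entry here.
            merged[terminal] = target
        g[lhs] = list(merged.items())
    return g
```
]]></eg>
<p>Newly created union non-terminals are themselves placed on the work list, so any duplicates they inherit from their members are merged in turn.</p>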
</div5>
</div4>
<div4 id="eventCodeAssignment">
<head>Event Code Assignment</head>
<p>This section describes the process for assigning unique event codes to each production in a normalized EXI grammar. Given a normalized EXI grammar <emph>G</emph><sub>&nbsp;i&nbsp;</sub>, apply the following process to each unique non-terminal <emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> that occurs on the left hand side of the productions in <emph>G</emph><sub>&nbsp;i</sub> where 0 &le; <emph>j</emph> &lt; <emph>n</emph> and <emph>n</emph> is the number of such non-terminals in <emph>G</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
<p>Sort all productions with <emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> on the left hand side in the following order:
</p>
<olist>
<item>
All productions with AT(<emph>qname</emph>) on the right hand side
sorted lexically by <emph>qname</emph> localName, then by <emph>qname</emph> uri, followed by
</item>
<item>
any production with AT(*) on the right hand side, followed by
</item>
<item>
all productions with SE(<emph>qname</emph>) on the right hand side sorted in schema order, followed by
</item>
<item>
any production with EE on the right hand side, followed by
</item>
<item>
any production with CH on the right hand side.
</item>
</olist>
<p>Given the sorted list of productions <emph>P</emph><sub>&nbsp;0&nbsp;</sub>, <emph>P</emph><sub>&nbsp;1&nbsp;</sub>, &hellip; <emph>P</emph><sub>&nbsp;n-1</sub> with the non-terminal <emph>G</emph><sub>&nbsp;i,&nbsp;j</sub> on the left hand side, assign event codes to each of the productions as follows:
</p>
<table width="100%">
<thead>
<tr>
<th colspan="2" align="left">Productions</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>
<emph>P</emph><sub>&nbsp;0</sub>
</td>
<td>
0
</td>
</tr>
<tr>
<td></td>
<td>
<emph>P</emph><sub>&nbsp;1</sub>
</td>
<td>
1
</td>
</tr>
<tr>
<td></td>
<td>
&vellip;
</td>
<td>
&vellip;
</td>
</tr>
<tr>
<td></td>
<td>
<emph>P</emph><sub>&nbsp;n-1</sub>
</td>
<td>
<emph>n</emph>-1
</td>
</tr>
</tbody></table>
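<p>The sorting and numbering of productions described in this section can be sketched as follows. This is a non-normative illustration: the names <code>Production</code>, <code>RANK</code> and <code>assign_event_codes</code> are hypothetical, the field <code>schema_pos</code> is an assumed stand-in for the schema order of SE(<emph>qname</emph>) productions, and only the five kinds of production listed above are handled.</p>
<eg><![CDATA[
```python
from typing import NamedTuple


class Production(NamedTuple):
    kind: str             # "AT", "AT(*)", "SE", "EE" or "CH"
    local_name: str = ""  # qname localName, for AT(qname) and SE(qname)
    uri: str = ""         # qname uri, for AT(qname) and SE(qname)
    schema_pos: int = 0   # hypothetical schema-order position for SE(qname)


# Relative order of the production kinds within one non-terminal.
RANK = {"AT": 0, "AT(*)": 1, "SE": 2, "EE": 3, "CH": 4}


def assign_event_codes(productions):
    """Return (event_code, production) pairs for one left hand side."""
    def key(p):
        if p.kind == "AT":
            # Sorted by localName first, then by uri.
            return (RANK[p.kind], p.local_name, p.uri)
        if p.kind == "SE":
            return (RANK[p.kind], p.schema_pos)
        return (RANK[p.kind],)

    return list(enumerate(sorted(productions, key=key)))
```
]]></eg>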

</div4>
<div4 id="undeclaredProductions">
<head>Undeclared Productions</head>
<p>The normalized element and type grammars derived from a schema describe the sequences of child elements, attributes and character events that may occur in a particular EXI stream. However, there are additional events that may occur in an EXI stream that are not described by the schema, for example events representing comments, processing-instructions, schema deviations, etc. 
</p>
<p>
This section first describes the process for augmenting the normalized element and type grammars, when the value of the <termref def="key-strictOption">strict option</termref> is false, with productions that describe events that may occur in the EXI stream but are not explicitly declared in the schema. It then describes how, when the value of the <termref def="key-strictOption">strict option</termref> is true, normalized element and type grammars are supplemented with productions that provide for the occurrences of xsi:type and xsi:nil attributes that are permitted by the schema.
</p>
<p>
In the normalized element and type grammars, terminal symbols AT and CH represent attributes or character events that have schema-valid values per the associated datatypes.
When <termref def="key-strictOption">strict option</termref> value is set to false, in order to efficiently permit schema-invalid values for these event types, terminal symbols AT and CH predicated as schema-invalid are introduced to convey that their values are schema-invalid. The following table shows the notation used for such AT and CH terminals along with their definitions.
</p>
<table width="100%" border="1">
<colgroup align="left" width="35%"></colgroup>
<colgroup/>
<thead>
<tr>
<th align="center">Notation</th>
<th>Definition</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;AT&nbsp;(<emph>qname</emph>)&nbsp;[schema-invalid]</td>
<td>Terminal symbol that matches an attribute (AT) event with <termref def="key-qname">qname</termref> <emph>qname</emph> and a schema-invalid value.</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;AT (*) [schema-invalid]</td>
<td>Terminal symbol that matches an attribute (AT) event with any <termref def="key-qname">qname</termref> and a schema-invalid value.</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;CH [schema-invalid]</td>
<td>Terminal symbol that matches a characters (CH) event with a schema-invalid value.</td>
</tr>
</tbody>
</table>


<!--
the value content items of events that match non-terminals are schema-valid 
-->

<div5 id="addingProductions">
<head>Adding Productions when Strict is False</head>
<p>This section describes the process for augmenting the normalized grammars when the value of the <termref def="key-strictOption">strict option</termref> is false. For each normalized element grammar <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>, create a copy <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub> of <emph>Element</emph><sub>&nbsp;i,&nbsp;content</sub> where the index "content" is the <termref def="key-contentIndex"><emph>content</emph></termref> of the type of the element from which <emph>Element</emph><sub>&nbsp;i&nbsp;</sub> was created. Then, apply the following procedures.</p>

<p>For each non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub> such that 0 &le; <emph>j</emph> &le; content, add the following production if the non-terminal does not already include a production of the form <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> : EE.
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
EE
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>

<p>Let <emph>E</emph><sub>&nbsp;i</sub> be the element declaration from which <emph>Element</emph><sub>&nbsp;i</sub> was created and <emph>T</emph><sub>&nbsp;k</sub> be the {type definition} of <emph>E</emph><sub>&nbsp;i&nbsp;</sub>. Let <emph>Type</emph><sub>&nbsp;k</sub> and <emph>TypeEmpty</emph><sub>&nbsp;k</sub> be the type grammars created from <emph>T</emph><sub>&nbsp;k</sub> (see section <specref ref="typeGrammars"/>). Add the following productions to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>.  
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
AT(xsi:type) <emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> 
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
AT(xsi:nil) <emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1)
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>
where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody>
</table>
<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT (xsi:type) event is represented as a QName (see 
<specref ref="encodingQName"/>). If there is no namespace in scope for the specified <termref def="key-qname">qname</termref> prefix, the QName <emph>uri</emph> is set to empty ("") and the QName <emph>localName</emph> is set to the full lexical value of the QName, including the prefix. 
</item>
</ulist>
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT (xsi:nil) event is represented as a Boolean (see 
<specref ref="encodingBoolean"/>) when it is a schema-valid Boolean. Otherwise, the AT (xsi:nil) event is represented by the AT(*) [schema-invalid value] terminal (see below). 
</item>
</ulist>
</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="2">&nbsp;</td>
</tr>

<tr>
<td colspan="2">
<ulist>
<item>
When using schemas, all productions of the form <emph>LeftHandSide</emph> : AT (xsi:type) are evaluated as follows: </item>
</ulist>
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Let <emph>qname</emph> be the value of the xsi:type attribute
</item>
<item>
If a grammar exists for the <emph>qname</emph> type, evaluate the element contents using the grammar for the <emph>qname</emph> type instead of the declared type for the current element
</item>
</olist>
</td>
</tr>
<tr>
<td colspan="2">&nbsp;</td>
</tr>

<tr>
<td colspan="2">
<ulist>
<item>
When using schemas, productions of the form <emph>LeftHandSide</emph> : AT (xsi:nil) are evaluated as follows: </item>
</ulist>
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Let <emph>nil</emph> be the value of the xsi:nil attribute
</item>
<item>
If <emph>nil</emph> is true, evaluate the element contents using the grammar for the <emph>TypeEmpty</emph><sub>&nbsp;k,&nbsp;0</sub> type instead of the declared type for the current element
</item>
</olist>
</td>
</tr>
</tbody></table>
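<p>The two evaluation rules above can be sketched as follows. This is a minimal, non-normative illustration: the function names and the dictionaries <code>type_grammars</code> and <code>empty_grammars</code> (mapping a type name to a label for its <emph>Type</emph> or <emph>TypeEmpty</emph> grammar) are assumptions made for this sketch, and the combination of xsi:type and xsi:nil on the same element is deliberately left out because the rules above treat them separately.</p>
<eg><![CDATA[
```python
from typing import Dict


def grammar_after_xsi_type(declared: str, qname: str,
                           type_grammars: Dict[str, str]) -> str:
    """Pick the grammar used for the element contents after AT(xsi:type).

    If a grammar exists for the qname type it replaces the grammar of the
    declared type; otherwise the declared type's grammar stays in effect.
    """
    return type_grammars.get(qname, type_grammars[declared])


def grammar_after_xsi_nil(declared: str, nil: bool,
                          type_grammars: Dict[str, str],
                          empty_grammars: Dict[str, str]) -> str:
    """Pick the grammar used for the element contents after AT(xsi:nil).

    When nil is true, the TypeEmpty grammar of the declared type is used.
    """
    return empty_grammars[declared] if nil else type_grammars[declared]


# Hypothetical grammar labels, for illustration only.
type_grammars = {"shape": "Type_shape", "circle": "Type_circle"}
empty_grammars = {"shape": "TypeEmpty_shape", "circle": "TypeEmpty_circle"}
```
]]></eg>
<p>With these hypothetical tables, an element declared with type "shape" that carries xsi:type "circle" is evaluated with the grammar labelled "Type_circle", while an unknown xsi:type value leaves the declared grammar in place.</p>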

<p>
Add the following productions to each non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub>, such that 0 &le; j &le; content&nbsp;.
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
AT (*) <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT (<emph>qname</emph><sub>&nbsp;0&nbsp;</sub>) [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1).0
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT (<emph>qname</emph><sub>&nbsp;1&nbsp;</sub>) [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1).1
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
<td>
&nbsp;&nbsp;&nbsp;&nbsp;&vellip;
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT (<emph>qname</emph><sub>&nbsp;<emph>x</emph>-1&nbsp;</sub>) [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1).(<emph>x</emph>-1)
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT (*) [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1).(<emph>x</emph>)
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2 and <emph>x</emph> represents the number of attributes declared in the schema for this context. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>
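<p>As a non-normative illustration of the event code assignment in the table above, the sketch below lists the added productions for a given next available event code <emph>n</emph>.<emph>m</emph> and a list of attribute qnames declared for the context. The function name and the tuple representation of event codes are assumptions made for this sketch.</p>
<eg><![CDATA[
```python
def at_wildcard_productions(n, m, attribute_qnames):
    """Return (terminal, event code) pairs for the AT productions added above.

    n, m             -- parts of the next available event code of length 2
    attribute_qnames -- qnames of the attributes declared for this context
    """
    x = len(attribute_qnames)
    # AT (*) takes the next available two-part code n.m.
    productions = [("AT (*)", (n, m))]
    # Each declared attribute gets a three-part code n.(m+1).k.
    for k, qname in enumerate(attribute_qnames):
        terminal = "AT (%s) [schema-invalid value]" % qname
        productions.append((terminal, (n, m + 1, k)))
    # The wildcard with a schema-invalid value takes n.(m+1).x.
    productions.append(("AT (*) [schema-invalid value]", (n, m + 1, x)))
    return productions
```
]]></eg>
<p>For example, with <emph>n</emph>.<emph>m</emph> equal to 1.2 and two declared attributes, the wildcard with a schema-invalid value receives the event code 1.3.2.</p>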

<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT event that has a [schema-invalid value] is represented as a String (see <specref ref="encodingString"/>).
</item>
</ulist>
</td>
</tr>
<tr>
<td>
<ulist>
<item>
Like an element, an attribute may occur in a schema-invalid context, have a schema-invalid value or both. However, unlike an element, whose occurrence and value are represented by separate SE and CH events, the occurrence and value of an attribute are represented by a single AT event. Consequently, four kinds of AT terminals are needed to represent the four possible validity states for an attribute. The table below shows the AT terminals that represent each of these four validity states along with the equivalent combinations of SE and CH events for representing elements. 
<p/>
<table border="1" width="95%" id='table-at-terminals'>
<caption>Events representing schema-valid and schema-invalid attributes and elements</caption>
<colgroup width="20%"/>
<colgroup width="40%"/>
<colgroup width="40%"/>
<thead>
<tr>
<th></th>
<th>Schema-valid value</th>
<th>Schema-invalid value</th></tr>
</thead>
<tbody>
<tr>
<th>Schema-valid occurrence</th>
<td>
<table border="0">
<tr><td>AT(<emph>qname</emph>)</td></tr>
<tr><td>SE(<emph>qname</emph>) CH</td></tr>
</table></td>
<td>
<table border="0">
<tr><td>AT(<emph>qname</emph>) [schema-invalid value]</td></tr>
<tr><td>SE(<emph>qname</emph>) CH [schema-invalid value]</td></tr>
</table></td>
</tr>
<tr>
<th>Schema-invalid occurrence</th>
<td>
<table border="0">
<tr><td>AT(*)</td></tr>
<tr><td>SE(*) CH</td></tr>
</table></td>
<td>
<table>
<tr><td>AT(*) [schema-invalid value]</td></tr>
<tr><td>SE(*) CH [schema-invalid value]</td></tr>
</table></td>
</tr>
</tbody></table>

</item>
</ulist>
</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
<ulist>
<item>
When using schemas, all productions of the form <emph>LeftHandSide</emph> : AT (*) are evaluated as follows: 
</item>
</ulist>
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the attribute matched by AT(*)
</item>
<item>If a global attribute definition exists for <emph>qname</emph>, represent the value of the attribute according to its datatype (see <specref ref="encodingValues"/>). Otherwise, represent the value of the attribute as a String (see <specref ref="encodingString"/>).
</item>
</olist>
</td>
</tr>

<tr>
<td>
</td>
</tr>
</tbody></table>

<p>
Add the following production to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
NS <emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> 
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
</tbody></table>

<p>
When the value of the <termref def="key-selfContained">selfContained option</termref> is true, add the following production to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
SC <emph>Fragment</emph> 
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
</tbody></table>
<p></p>

<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
All productions of the form <emph>LeftHandSide</emph> : SC <emph>Fragment</emph> are evaluated as follows: 
</td>
</tr>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Save the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body.
</item>
<item>Initialize the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body to the state they held just prior to processing this EXI Body.
</item>
<item>Skip to the next byte-aligned boundary in the stream.
</item>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the SE event immediately preceding this SC event.</item>
<item>Let <emph>content</emph> be the sequence of events following this SC event that match the grammar for element <emph>qname</emph>, up to and including the terminating EE event.</item>
<item>Evaluate the sequence of events (SD, SE(<emph>qname</emph>), <emph>content</emph>, ED) according to the <emph>Fragment</emph> grammar.
</item>
<item>Restore the string table, grammars, namespace prefixes and implementation-specific state learned while processing this EXI Body to that saved in step 1 above.
</item>
</olist>
</td>
</tr>
</tbody></table>
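<p>The save, initialize and restore steps above (steps 1, 2 and 7) and the byte alignment of step 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the processor state is modelled as a plain dictionary, and the names <code>self_contained</code>, <code>next_byte_boundary</code>, <code>state</code> and <code>initial</code> are hypothetical and not defined by this document.</p>
<eg><![CDATA[
```python
import copy
from contextlib import contextmanager


def next_byte_boundary(bit_position):
    """Step 3: round a bit offset up to the next multiple of 8."""
    return (bit_position + 7) // 8 * 8


@contextmanager
def self_contained(state, initial_state):
    """Run the body with the state held just prior to processing the EXI Body."""
    saved = copy.deepcopy(state)                 # step 1: save learned state
    state.clear()
    state.update(copy.deepcopy(initial_state))   # step 2: reset to initial state
    try:
        yield state                              # steps 4 to 6 run in the caller
    finally:
        state.clear()
        state.update(saved)                      # step 7: restore saved state


# Hypothetical state: only a string table and namespace prefixes are modelled.
state = {"string_table": ["a", "b"], "prefixes": {"p": "urn:example"}}
initial = {"string_table": [], "prefixes": {}}

with self_contained(state, initial) as inner:
    # Anything learned here is discarded when the block exits.
    inner["string_table"].append("only-in-fragment")
```
]]></eg>
<p>Deep copies are used so that entries learned inside the self-contained element cannot leak into either the saved state or the initial state.</p>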

<p>
Add the following productions to each non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub>, such that 0 &le; j &le; content&nbsp;.
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
SE (*) <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub>
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1)
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
ER <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+2)
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CM <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).0
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
PI <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).1
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>

<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
Productions of the form <emph>LeftHandSide</emph> : CH [schema-invalid value] <emph>RightHandSide</emph> match schema-invalid character data that is represented as untyped data in the EXI stream. Schema-valid character data is represented as typed data in the EXI stream matched by productions of the form <emph>LeftHandSide</emph> : CH [schema-valid value] <emph>RightHandSide</emph> described in section <specref ref="simpleTypeGrammars"/>.
</item>
</ulist>
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The terminal symbol SE (*) is only matched if a more specific match for the current event does not exist in the grammar. 
</item>
</ulist>
</td>
</tr>
</tbody></table>

<p>Add the following production to <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub> and to each non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub> that does not already include a production of the form <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> : EE, such that content &lt; <emph>j</emph> &lt; <emph>n</emph>, where <emph>n</emph> is the number of non-terminals in <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
EE
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>


<p>Add the following productions to <emph>Element</emph><sub>&nbsp;i,&nbsp;content2</sub> and to each non-terminal <emph>Element</emph><sub>&nbsp;i,&nbsp;j&nbsp;</sub>, such that content &lt; <emph>j</emph> &lt; <emph>n</emph>, where <emph>n</emph> is the number of non-terminals in <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>. 
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
SE (*) <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+1)
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
ER <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+2)
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CM <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).0
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
PI <emph>Element</emph><sub>&nbsp;i,&nbsp;j</sub>
</td>
<td>
<emph>n</emph>.(<emph>m</emph>+3).1
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td>
</td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>
<p>Apply the process described above for element grammars to each normalized type grammar <emph>Type</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
</div5>

<div5 id="addingProductionsStrict">
<head>Adding Productions when Strict is True</head>
<p>This section describes the process for augmenting the normalized grammars when the value of the <termref def="key-strictOption">strict option</termref> is true. For each normalized element grammar <emph>Element</emph><sub>&nbsp;i</sub>, apply the following procedures.</p>


<p>Let <emph>E</emph><sub>&nbsp;i</sub> be the element declaration from which <emph>Element</emph><sub>&nbsp;i</sub> was created and <emph>T</emph><sub>&nbsp;k</sub> be the {type definition} of <emph>E</emph><sub>&nbsp;i&nbsp;</sub>. If <emph>T</emph><sub>&nbsp;k</sub> has named sub-types, add the following production to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>.  
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
AT(xsi:type) <emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> 
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT (xsi:type) event is represented as a QName (see 
<specref ref="encodingQName"/>). If there is no namespace in scope for the specified <termref def="key-qname">qname</termref> prefix, the QName <emph>uri</emph> is set to empty ("") and the QName <emph>localName</emph> is set to the full lexical value of the QName, including the prefix.
</item>
</ulist>
</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="2">&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
<ulist>
<item>
When using schemas, all productions of the form <emph>LeftHandSide</emph> : AT (xsi:type) are evaluated as follows: 
</item>
</ulist>
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Let <emph>qname</emph> be the value of the xsi:type attribute
</item>
<item>
If a grammar exists for the <emph>qname</emph> type, evaluate the element contents using the grammar for the <emph>qname</emph> type instead of the declared type for the current element
</item>
</olist>
</td>
</tr>
</tbody></table>
<p>
Let <emph>Type</emph><sub>&nbsp;k</sub> and <emph>TypeEmpty</emph><sub>&nbsp;k</sub> be the type grammars created from <emph>T</emph><sub>&nbsp;k</sub> (see section <specref ref="typeGrammars"/>). If the {nillable} property of <emph>E</emph><sub>&nbsp;i</sub> is true, add the following production to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>.
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
AT(xsi:nil) <emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub>
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left">Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>
<ulist>
<item>
The value of each AT (xsi:nil) event is represented as a Boolean (see 
<specref ref="encodingBoolean"/>). 
</item>
</ulist>
</td>
</tr>
</tbody></table>
<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="2">&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
<ulist>
<item>
When using schemas, productions of the form <emph>LeftHandSide</emph> : AT (xsi:nil) are evaluated as follows: 
</item>
</ulist>
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Let <emph>nil</emph> be the value of the xsi:nil attribute
</item>
<item>
If the value of <emph>nil</emph> is true, evaluate the element contents using the grammar for the <emph>TypeEmpty</emph><sub>&nbsp;k,&nbsp;0</sub> type instead of the declared type for the current element
</item>
</olist>
</td>
</tr>
</tbody></table>
<p>
When the value of the <termref def="key-selfContained">selfContained option</termref> is true, add the following production to <emph>Element</emph><sub>&nbsp;i&nbsp;</sub>:
</p>
<table width="100%">
<thead>
<tr>
<th colspan="3" align="left">Syntax</th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%">&nbsp;</td>
<td width="5%"></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Element</emph><sub>&nbsp;i,&nbsp;0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td width="50%">
SC <emph>Fragment</emph> 
</td>
<td>
<emph>n</emph>.<emph>m</emph>
</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<p>where <emph>n</emph>.<emph>m</emph> represents the next available event code with length 2. 
</p>
</td>
</tr>
</tbody></table>
<p></p>

<table width="100%">
<thead>
<tr>
<th align="left" colspan="2">Semantics:</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
All productions of the form <emph>LeftHandSide</emph> : SC <emph>Fragment</emph> are evaluated as follows: 
</td>
</tr>
<tr>
<td>&nbsp;
</td>
</tr>
<tr>
<td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
<td>
<olist>
<item>
Save the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body.
</item>
<item>Initialize the string table, grammars, namespace prefixes and any implementation-specific state learned while processing this EXI Body to the state they held just prior to processing this EXI Body.
</item>
<item>Skip to the next byte-aligned boundary in the stream.
</item>
<item>Let <emph>qname</emph> be the <termref def="key-qname">qname</termref> of the SE event immediately preceding this SC event.</item>
<item>Let <emph>content</emph> be the sequence of events following this SC event that match the grammar for element <emph>qname</emph>, up to and including the terminating EE event.</item>
<item>Evaluate the sequence of events (SD, SE(<emph>qname</emph>), <emph>content</emph>, ED) according to the <emph>Fragment</emph> grammar.
</item>
<item>Restore the string table, grammars, namespace prefixes and implementation-specific state learned while processing this EXI Body to that saved in step 1 above.
</item>
</olist>
</td>
</tr>
</tbody></table>

</div5>

</div4>
</div3></div2></div1>
<div1 id="compression">
<head>EXI Compression</head>
<p>
EXI compression increases compactness at the cost of additional computational resources. It combines knowledge of XML with a widely adopted, standard compression algorithm to achieve higher compression ratios than would be achievable by applying compression to the entire stream.</p>
<p>
EXI compression is applied when <termref def="key-compressionOption">compression</termref> is turned on or when <termref def="key-alignmentOption">alignment</termref> is set to <termref def="key-precompression">pre-compression</termref>. Byte-aligned representations of event codes and content items are more amenable to compression than unaligned representations because most compression algorithms operate on series of bytes to identify redundancies in the octets. Therefore, when EXI compression is used, event codes and content items of EXI events are encoded as aligned bytes in accordance with <specref ref="encodingEventCodes"/> and <specref ref="encodingValues" />.</p>

<!-- p>There are three distinct phases in the compression process:</p>

<olist>
<item><termref def="key-alignmentOption">Alignment</termref></item>
<item><termref def="key-precompressOption">Pre-compression (or channelization)</termref></item>
<item><termref def="key-compressionOption">Compression</termref></item>
</olist>

<p>The user may select through EXI options to have only individual phases of this process completed 
when encoding or decoding a stream.  This provides additional flexibility for using additional 
compression algorithms.</p -->


<p>EXI compression splits a sequence of EXI events into a number of contiguous blocks of events.
Events that belong to the same block are transformed into lower entropy groups of similar values called <emph>channels</emph>, which are individually well suited for standard compression algorithms. To reduce compression overhead, smaller channels are combined before compressing them, while larger channels are compressed independently. The criteria EXI compression uses to define and combine channels are intentionally simple to facilitate implementation, reduce processing overhead, and avoid the need to encode channel ordering or grouping information in the format. The figure below presents a schematic view of the steps involved in EXI compression.
</p>

<graphic source="compression.png" alt="EXI Compression Overview"/>

<p>In the following sections, Section <specref ref="blocks"/> defines blocks and explains how EXI events are partitioned into blocks.
Section <specref ref="channels"/> defines channels and their organization, and explains how a group of channels correlates to its corresponding block of events.
Section <specref ref="CompressedStreams"/> describes how channels are combined, where needed, in preparation for applying a compression algorithm to them.
</p>

<div2 id="blocks">
<head>Blocks</head>
<p>EXI compression partitions the sequence of EXI events into a sequence of one or more non-overlapping blocks. Each block preceding the final block contains the minimum set of consecutive events that have exactly <termref def="key-blockSizeOption">blockSize</termref> Attribute (AT) and Character (CH) <emph>values</emph>, where blockSize is the block size of the EXI stream (see <specref ref="options"/>). The final block contains no more than blockSize Attribute (AT) and Character (CH) <emph>values</emph>.</p>
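<p>The partitioning rule can be sketched as follows. This is a non-normative illustration: events are modelled as tuples whose first item is the event type, the function name <code>partition_into_blocks</code> is hypothetical, and the sketch assumes that every AT and CH event contributes one <emph>value</emph> to the count.</p>
<eg><![CDATA[
```python
def partition_into_blocks(events, block_size):
    """Split an event sequence into blocks holding block_size AT/CH values.

    A block is closed as soon as it holds exactly block_size values, so it
    is the minimum set of consecutive events with that many values. The
    final block holds whatever remains (at most block_size values).
    """
    blocks, current, value_count = [], [], 0
    for event in events:
        current.append(event)
        if event[0] in ("AT", "CH"):
            value_count += 1
            if value_count == block_size:
                blocks.append(current)
                current, value_count = [], 0
    # Keep trailing events, and guarantee at least one block.
    if current or not blocks:
        blocks.append(current)
    return blocks


# Hypothetical event sequence for illustration.
events = [
    ("SD",), ("SE", "a"), ("AT", "id", "1"), ("CH", "x"), ("EE",),
    ("SE", "b"), ("CH", "y"), ("EE",), ("ED",),
]
blocks = partition_into_blocks(events, 2)
```
]]></eg>
<p>With a block size of 2, the first block in this example ends at the CH event that supplies the second value, and the remaining events, including the EE that closes element "a", fall into the final block.</p>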

</div2>

<div2 id="channels">
<head>Channels</head>
<p>Events inside each block are multiplexed into channels. The first channel of each block is the structure channel described in Section <specref ref="StructureChannel"/>. The remaining channels in each block are value channels described in Section <specref ref="ValueChannels"/>.
The diagram below illustrates the transformation: in one direction, events within a block are multiplexed into channels; in the other direction, channels are demultiplexed into events.</p>
<graphic source="channels.png" alt="Multiplexing EXI events into channels"/>

<div3 id="StructureChannel">
<head>Structure Channel</head>
<p>The structure channel of each block defines the overall order and structure of the events in that block. It contains the event codes and associated content for each event in the block, except for Attribute (AT) and Character (CH) <emph>values</emph>, which are stored in the value channels. There are two exceptions: the <emph>values</emph> of xsi:nil and xsi:type attributes that match a schema-informed grammar production are stored in the structure channel instead of in value channels. These attribute events are intrinsic to the grammar system and are therefore essential to processing the structure channel, because their values affect the grammar to be used for processing the rest of the elements on which they appear. All event codes and content in the structure channel occur in the same order as they occur in the EXI event sequence.</p>
</div3>

<div3 id="ValueChannels">
<head>Value Channels</head>
<p>The <emph>values</emph> of the Attribute (AT) and Character (CH) events in each block are organized into separate channels based on the <emph>qname</emph> of the associated attribute or element. Specifically, the <emph>value</emph> of each Attribute (AT) event is placed in the channel identified by the <emph>qname</emph> of the Attribute and the <emph>value</emph> of each Character (CH) event is placed in the channel identified by the <emph>qname</emph> of its parent Start Element (SE) event. Each block contains exactly one channel for each distinct element or attribute <emph>qname</emph> that occurs in the block. The <emph>values</emph> in each channel occur in the order they occur in the EXI event sequence.</p>
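<p>The assignment of <emph>values</emph> to channels can be sketched as follows. This is a non-normative illustration: events are modelled as tuples, the function name <code>multiplex</code> and the <code>open_elements</code> parameter (the qnames of elements already open when the block starts) are hypothetical, and the special handling of xsi:nil and xsi:type values is omitted.</p>
<eg><![CDATA[
```python
def multiplex(block, open_elements=()):
    """Split one block into a structure list and per-qname value channels.

    Events are ("SE", qname), ("AT", qname, value), ("CH", value) or ("EE",).
    AT values go to the channel named by the attribute qname; CH values go
    to the channel named by the qname of the enclosing SE event.
    """
    stack = list(open_elements)   # qnames of currently open elements
    structure = []                # events without their AT/CH values
    channels = {}                 # qname -> values, in first-value order
    for event in block:
        kind = event[0]
        if kind == "SE":
            stack.append(event[1])
            structure.append(event)
        elif kind == "EE":
            stack.pop()
            structure.append(event)
        elif kind == "AT":
            structure.append(("AT", event[1]))
            channels.setdefault(event[1], []).append(event[2])
        elif kind == "CH":
            structure.append(("CH",))
            channels.setdefault(stack[-1], []).append(event[1])
        else:
            structure.append(event)
    return structure, channels


# Hypothetical block for illustration.
block = [
    ("SE", "note"), ("AT", "date", "2007-09-12"),
    ("SE", "subject"), ("CH", "EXI"), ("EE",),
    ("SE", "body"), ("CH", "Do not forget it!"), ("EE",),
    ("EE",),
]
structure, channels = multiplex(block)
```
]]></eg>
<p>Because Python dictionaries preserve insertion order, iterating over <code>channels</code> yields the channels in the order in which their first <emph>value</emph> occurs in the event sequence.</p>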
</div3>
</div2>

<div2 id="CompressedStreams">
<head>Compressed Streams</head>
<p>The channels in a block are further organized into compressed streams. Smaller channels are combined into the same compressed stream, while others are each compressed separately. Below are the rules applied within the scope of a block used to determine the channels to be combined together, the order of the compressed streams and the order amongst the channels that are combined into the same compressed stream.</p>

<p>If the block contains at most 100 <emph>values</emph>, the block will contain only 1 compressed stream containing the structure channel followed by all of the value channels. The order of the value channels within the compressed stream is defined by the order in which the first <emph>value</emph> in each channel occurs in the EXI event sequence.</p>

<p>If the block contains more than 100 <emph>values</emph>, the first compressed stream contains only the structure channel. The second compressed stream contains all value channels that contain no more than 100 <emph>values</emph>. Each of the remaining compressed streams contains exactly one channel that has more than 100 <emph>values</emph>. The order of the value channels within the second compressed stream is defined by the order in which the first <emph>value</emph> in each channel occurs in the EXI event sequence. Similarly, the order of the compressed streams following the second compressed stream in the block is defined by the order in which the first <emph>value</emph> of the channel inside each compressed stream occurs in the EXI event sequence.</p>
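<p>The two rules above can be sketched as follows. This is a non-normative illustration: the function name <code>organize_streams</code> and the label "structure" for the structure channel are hypothetical, the input dictionary is assumed to list value channels in the order in which their first <emph>value</emph> occurs, and the sketch assumes that the second compressed stream is omitted when no channel has 100 <emph>values</emph> or fewer, a case the text above does not address.</p>
<eg><![CDATA[
```python
SMALL_CHANNEL_LIMIT = 100


def organize_streams(channels):
    """Group the channels of one block into compressed streams.

    channels maps a qname to its list of values, in first-value order.
    Returns a list of streams, each a list of channel names.
    """
    total = sum(len(values) for values in channels.values())
    if total <= SMALL_CHANNEL_LIMIT:
        # One stream: the structure channel followed by all value channels.
        return [["structure"] + list(channels)]
    small = [q for q, v in channels.items() if len(v) <= SMALL_CHANNEL_LIMIT]
    large = [[q] for q, v in channels.items() if len(v) > SMALL_CHANNEL_LIMIT]
    streams = [["structure"]]
    if small:  # assumption: an empty second stream is not emitted
        streams.append(small)
    return streams + large
```
]]></eg>
<p>For a block whose channels "a" and "c" hold more than 100 <emph>values</emph> while "b" and "d" hold 100 or fewer, this yields the streams (structure), (b, d), (a) and (c), in that order.</p>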

<note>EXI compression changes the order in which event codes and <emph>values</emph> are written to and read from an EXI stream. Implementations must encode and decode <emph>values</emph> in this revised order so that order-sensitive constructs such as the string table (see <specref ref="stringTable"/>) work properly.</note>


<p>
When the value of the <termref def="key-compressionOption">compression</termref> option is set to true, each compressed 
stream in a block is stored using the standard DEFLATE Compressed Data Format defined by 
RFC 1951 <bibref ref="RFC1951"/>. Otherwise, when the value of the <termref def="key-alignmentOption">alignment</termref> option is set to <termref def="key-precompression">pre-compression</termref>, each compressed stream in a block is stored directly without the DEFLATE algorithm.</p>
</div2>

</div1>

<div1 id="conformance">
<head>Conformance</head>

<div2 id="streamConformance">
<head>EXI Stream Conformance</head>
<p>
<termdef id="key-conformantExiStream" term="conformant EXI stream"><term>Conformant EXI streams</term> consist of a sequence of octets that follows the syntax of <termref def="key-existream">EXI stream</termref> that is defined in this document. </termdef>
<termdef id="key-extendedExiStream" term="extended EXI stream">
The EXI format provides a way to involve user-defined datatype representations in EXI stream processing. This is an extension point that, when used in conjunction with relevant datatype representation specifications external to this document, leads to the formulation of <term>extended EXI streams</term>.
</termdef>
</p>
<p>
Conformance of extended EXI streams is relative to the syntax defined by the relevant user-defined datatype representation specifications. The definitions of user-defined datatype representation syntax are out of the scope of this document.
<termdef id="key-conformantExtendedExiStream" term="conformant extended EXI stream">
An extended EXI stream is a <term>conformant extended EXI stream</term> if replacing the value items represented using user-defined datatype representations with their intrinsic representations would make the stream a <termref def="key-conformantExiStream">conformant EXI stream</termref>.
</termdef>
When the use of user-defined datatype representations is expected and agreed upon prior to the exchange of EXI streams, the parties that intend to participate in the exchange not only need to share knowledge of the datatype representations, but also MUST advertise the streams as "EXI streams with regards to datatype representations <emph>S</emph>" instead of simply as "EXI streams" when they are asked to do so, where <emph>S</emph> is an unordered set of datatype representations.
An "EXI stream with regards to datatype representations <emph>S</emph>" can be processed by an <termref def="key-exidecoder">EXI stream decoder</termref> only if the decoder has the shared knowledge about each of the datatype representations in the set <emph>S</emph>. <termref def="key-exidecoder">EXI stream decoders</termref> MAY fail with an error when they receive an extended EXI stream that uses a user-defined datatype representation that they do not understand.
</p>
<p>The structural syntax of <termref def="key-existream">EXI streams</termref> and 
<termref def="key-extendedExiStream">extended EXI streams</termref>
is described by the abstract EXI grammar system defined in this document. Although this document specifies the normative way in which XML Schema schemas are mapped into the EXI grammar system to make a schema-informed grammar, EXI allows the use of other schema languages to process EXI streams or extended EXI streams, as long as there is a well-known EXI grammar binding to the schema language and the binding preserves the semantics of the EXI grammar system. EXI streams or extended EXI streams generated using schemas of such a schema language are still conformant. The definition of grammar bindings to schema languages other than XML Schema is out of the scope of this document, and each schema language community is encouraged to define a binding that gets the most efficiency out of EXI when schemas of that language are available.
</p>
</div2>

<div2 id="processorConformance">
<head>EXI Processor Conformance</head>
<p>
The conformance of EXI processors is defined separately for each of the two processor roles, <termref def="key-exiencoder">EXI stream encoders</termref> and <termref def="key-exidecoder">EXI stream decoders</termref>. The conformance of the former is described in terms of the conformance of the <termref def="key-existream">EXI streams</termref> or <termref def="key-extendedExiStream">extended EXI streams</termref> that they produce, while that of the latter is based on the set of format features that EXI stream decoders are prepared with for processing <termref def="key-conformantExiStream">conformant EXI streams</termref> or <termref def="key-conformantExtendedExiStream">conformant extended EXI streams</termref>.
</p>
<p>
An <termref def="key-exiencoder">EXI stream encoder</termref> is conformant if and only if it is capable of generating <termref def="key-conformantExiStream">conformant EXI streams</termref> or <termref def="key-conformantExtendedExiStream">conformant extended EXI streams</termref> given any input structured data it is made to work on.
On the other hand, <termref def="key-exidecoder">EXI stream decoders</termref> MUST support all format features described in this document as they are explained, except for the capability of handling 
<termref def="key-datatypeRepresentationMaps">Datatype Representation Maps</termref>,
which is an optional feature. 
EXI stream decoders that do not implement the 
<termref def="key-datatypeRepresentationMaps">Datatype Representation Map</termref>
feature MUST report an error with a meaningful message upon encountering a <termref def="key-datatypeRepresentationOption">"datatypeRepresentationMap"</termref> element while processing <termref def="key-optionsDoc">EXI options documents</termref> in <termref def="key-exiheader">EXI headers</termref>.
</p>

<p>Both an EXI stream encoder and an EXI stream decoder MAY support only a certain range of values for the EXI header option <termref def="key-blockSizeOption">blockSize</termref>. For interoperability between processors, every EXI processor SHOULD support at least the blockSize option value of 1,000,000.
</p>

</div2>

</div1>
    </body>
    <back>
<div1 id="References">
<head>References</head>

    <div2 id='Normative-References'>
        <head>Normative References</head>

	<blist>
	  <bibl key="IETF RFC 1951" href="http://www.ietf.org/rfc/rfc1951.txt" id="RFC1951">
	    <titleref>DEFLATE Compressed Data Format Specification version 1.3</titleref>, P. Deutsch, Author. Internet
	    Engineering Task Force, May 1996. Available at
	    http://www.ietf.org/rfc/rfc1951.txt.
	  </bibl>

	  <bibl key="IETF RFC 2119" href="http://www.ietf.org/rfc/rfc2119.txt" id="RFC2119">
	    <titleref>Key words for use in RFCs to Indicate
	    Requirement Levels</titleref>, S. Bradner, Author. Internet
	    Engineering Task Force, June 1999. Available at
	    http://www.ietf.org/rfc/rfc2119.txt.
	  </bibl>

	  <bibl key="IETF RFC 3023" href="http://www.ietf.org/rfc/rfc3023.txt" id="RFC3023">
	    <titleref>XML Media Types</titleref>, 
	    M. Murata, S. St. Laurent and D. Kohn, Authors. Internet
	    Engineering Task Force, January 2001. Available at
	    http://www.ietf.org/rfc/rfc3023.txt.
	  </bibl>

	  <bibl id="ISO10646" key="ISO/IEC 10646">
	    <titleref>ISO/IEC 10646-1:2000. Information technology &mdash; Universal Multiple-Octet Coded Character Set (UCS) &mdash; Part 1: Architecture and Basic Multilingual Plane</titleref> and <titleref>ISO/IEC 10646-2:2001. Information technology &mdash; Universal Multiple-Octet Coded Character Set (UCS) &mdash; Part 2: Supplementary Planes</titleref>, as, from time to time, amended, replaced by a new edition or expanded by the addition of new parts. [Geneva]: International Organization for Standardization. (See <loc href="http://www.iso.org">http://www.iso.org</loc> for the latest version.)
</bibl>
	<bibl id="UnicodeDB" key="Unicode Database">
	The Unicode Consortium. <emph>Unicode Character Database</emph> (Revision 5.0.0).
	Available at: <loc href="http://www.unicode.org/Public/5.0.0/ucd/UCD.html">
	http://www.unicode.org/Public/5.0.0/ucd/UCD.html</loc>
	</bibl>
	  <bibl id="XML10" key="XML 1.0" href="http://www.w3.org/TR/2006/REC-xml-20060816/">
	    <titleref>Extensible Markup Language (XML) 1.0 (Fourth Edition)</titleref>,
	    T.  Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler, Editors.
	    World Wide Web Consortium, 10 February 1998, revised 16 August 2006.
	    This version is http://www.w3.org/TR/2006/REC-xml-20060816.
	    The latest version is available at
	    <loc href="http://www.w3.org/TR/REC-xml/">
	    http://www.w3.org/TR/REC-xml</loc>.
	  </bibl>
	  <bibl id='XMLInfoset' key='XML Information Set' href='http://www.w3.org/TR/2004/REC-xml-infoset-20040204/'>
	    <titleref>XML Information Set (Second Edition)</titleref>,
	    J. Cowan and R. Tobin, Editors. World Wide Web Consortium,
	    24 October 2001, revised 4 February 2004.
	    This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204.
	    The latest version is available at
	    <loc href='http://www.w3.org/TR/xml-infoset/'>
	    http://www.w3.org/TR/xml-infoset</loc>.
	  </bibl>
	  <bibl id="schema1" key="XML Schema Structures" href="http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/">
	    <titleref>XML Schema Part 1: Structures Second
	    Edition</titleref>, H. Thompson, D. Beech, M. Maloney, and
	    N. Mendelsohn, Editors. World Wide Web Consortium, 2 May
	    2001, revised 28 October 2004. 
	    This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028.
	    The latest version is available at
	    <loc href='http://www.w3.org/TR/xmlschema-1/'>
	    http://www.w3.org/TR/xmlschema-1</loc>.
	  </bibl>
	  <bibl key="XML Schema Datatypes" id="schema2"
		href="http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/">
	    <titleref>XML Schema Part 2: Datatypes Second
	    Edition</titleref>, P. Byron and A. Malhotra,
	    Editors. World Wide Web Consortium, 2 May 2001, revised 28
	    October 2004.
	    This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028.
	    The latest version is available at
	    <loc href='http://www.w3.org/TR/xmlschema-2/'>
	    http://www.w3.org/TR/xmlschema-2</loc>.
	  </bibl>
	</blist>
    </div2>
    <div2 id='Informative-References'>
      <head>Other References</head>
      <blist>
	<bibl id="efx" key="Efficient XML"
	  href="http://www.w3.org/TR/2007/WD-exi-measurements-20070725/#contributions-efx">
	  <titleref>Efficient XML</titleref>, part of <bibref ref="eximeas"/> independently referenced.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/exi-measurements/#contributions-efx">
	  http://www.w3.org/TR/exi-measurements/#contributions-efx</loc>.
	</bibl>
	<bibl id="exieval" key="EXI Evaluation Note"
	      href="http://www.w3.org/TR/2008/WD-exi-evaluation-20080728/">
	  <titleref>Efficient XML Interchange Evaluation</titleref>, 
	  Carine Bournez, Editor. 
	  World Wide Web Consortium. 
	  The latest version is available at 
	  <loc href="http://www.w3.org/TR/exi-evaluation/">
	  http://www.w3.org/TR/exi-evaluation/</loc>.
	</bibl>
	<bibl id="exiimpacts" key="EXI Impacts Note"
	      href="http://www.w3.org/TR/2008/WD-exi-impacts-20080903/">
	  <titleref>Efficient XML Interchange (EXI) Impacts</titleref>, 
	  Jaakko Kangasharju, Editor. 
	  World Wide Web Consortium. 
	  The latest version is available at 
	  <loc href="http://www.w3.org/TR/exi-impacts/">
	  http://www.w3.org/TR/exi-impacts/</loc>.
	</bibl>
	<bibl id="eximeas" key="EXI Measurements Note"
	      href="http://www.w3.org/TR/2007/WD-exi-measurements-20070725/">
	  <titleref>Efficient XML Interchange Measurements Note</titleref>,
	  Greg White, Jaakko Kangasharju, Don Brutzman and Stephen Williams, Editors.
	  World Wide Web Consortium.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/exi-measurements/">
	  http://www.w3.org/TR/exi-measurements/</loc>.
	</bibl>
	<!-- bibl id="eximeas" key="EXI Measurements Note"
	      href="http://www.w3.org/TR/2006/WD-exi-measurements-20060718/">
	  <titleref>Efficient XML Interchange Measurements Note</titleref>,
	  Greg White, Don Brutzman, Stephen Williams and Jaakko Kangasharju, Editors.
	  World Wide Web Consortium, 18 July 2006.
	  This version is http://www.w3.org/TR/2006/WD-exi-measurements-20060718/.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/exi-measurements/">
	  http://www.w3.org/TR/exi-measurements/</loc>.
	</bibl -->
	<bibl id="exiprimer" key="EXI Primer"
	      href="http://www.w3.org/TR/2007/WD-exi-primer-20071219/">
	  <titleref>Efficient XML Interchange (EXI) Primer</titleref>,
	  Daniel Peintner and Santiago Pericas-Geertsen, Editors.
	  World Wide Web Consortium.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/exi-primer/">
	  http://www.w3.org/TR/exi-primer/</loc>.
	</bibl>
	<bibl id="greibach" key="Greibach Normal Form">
	  <titleref>A New Normal-Form Theorem for Context-Free Phrase Structure Grammars</titleref>,
	  Sheila A. Greibach, Author.
	  Journal of the ACM, Volume 12, Issue 1, January 1965, pp. 42-52.
	</bibl>
	<bibl id="huffman" key="Huffman Coding"
	      href="http://compression.ru/download/articles/huff/huffman_1952_minimum-redundancy-codes.pdf">
	  <titleref>A Method for the Construction of
	  Minimum-Redundancy Codes</titleref>, D. A. Huffman,
	  Author. Proceedings of the I.R.E., September 1952, pp.
	  1098-1102.
	</bibl>
	<bibl id="relaxng" key="ISO/IEC 19757-2:2003"
	      href="http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37605">
	  <titleref>Document Schema Definition Language (DSDL) -- Part 2: Regular-grammar-based validation -- RELAX NG</titleref>
	</bibl>
	<bibl id="soap12" key="SOAP 1.2"
	      href="http://www.w3.org/TR/2003/REC-soap12-part1-20030624/">
	  <titleref>SOAP Version 1.2 Part 1: Messaging
	  Framework</titleref>, M. Gudgin, M.  Hadley, N. Mendelsohn,
	  J-J. Moreau, H. Frystyk Nielsen, Editors. World Wide Web
	  Consortium, 24 June 2003.
	  This version is http://www.w3.org/TR/2003/REC-soap12-part1-20030624/.
	  The latest version is available at
	  <loc href='http://www.w3.org/TR/soap12-part1/'>
	  http://www.w3.org/TR/soap12-part1/</loc>.
	</bibl>
	<bibl id="xbcmeas" key="XBC Measurement Methodologies"
	      href="http://www.w3.org/TR/2005/NOTE-xbc-measurement-20050331/">
	  <titleref>XML Binary Characterization Measurement
	  Methodologies</titleref>, S. D. Williams and P. Haggar,
	  Editors. World Wide Web Consortium, 31 March 2005.
	  This version is http://www.w3.org/TR/2005/NOTE-xbc-measurement-20050331/.
	  The latest version is available at
	  <loc href='http://www.w3.org/TR/xbc-measurement/'>
	  http://www.w3.org/TR/xbc-measurement</loc>.
	</bibl>
	<bibl id="xbcusecases" key="XBC Use Cases"
	      href="http://www.w3.org/TR/2005/NOTE-xbc-use-cases-20050331/">
	  <titleref>XML Binary Characterization Use Cases</titleref>,
	  Mike Cokus and Santiago Pericas-Geertsen, Editors.
	  World Wide Web Consortium, 31 March 2005.
	  This version is http://www.w3.org/TR/2005/NOTE-xbc-use-cases-20050331/.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/xbc-use-cases/">
	  http://www.w3.org/TR/xbc-use-cases</loc>.
	</bibl>
	<bibl id="xbcproperties" key="XBC Properties"
	      href="http://www.w3.org/TR/2005/NOTE-xbc-properties-20050331/">
	  <titleref>XML Binary Characterization Properties</titleref>,
	  Mike Cokus and Santiago Pericas-Geertsen, Editors.
	  World Wide Web Consortium, 31 March 2005.
	  This version is http://www.w3.org/TR/2005/NOTE-xbc-properties-20050331/.
	  The latest version is available at
	  <loc href="http://www.w3.org/TR/xbc-properties/">
	  http://www.w3.org/TR/xbc-properties/</loc>.
	</bibl>
      </blist>
    </div2>
</div1>
<!--
<div1 id="media-type">
    <head>The application/exi Media Type</head>
    <p>This appendix defines the <attval>application/exi</attval>
    media type which can be used to describe Efficient XML Interchange
    documents.</p>
	<ednote>
		<name>John</name>
		<edtext>This section is mainly a placeholder at this time.</edtext>
	</ednote>
    <div2 id="ietf-reg">
        <head>Registration</head>
        <glist>
            <gitem><label>MIME media type name:</label><def><p>application</p></def></gitem>
            <gitem><label>MIME subtype name:</label><def><p>exi</p></def></gitem>
            <gitem><label>Required parameters:</label><def><p>none</p></def></gitem>
            <gitem><label>Optional parameters:</label>
                <def>
                    <glist>
                        <gitem><label>charset</label>
                            <def><p>This parameter has identical semantics to the charset parameter
                                of the <attval>application/xml</attval> media type as specified in
                                <bibref ref="RFC3023"/>.</p></def></gitem>
                        
                    </glist></def></gitem>
            <gitem><label>Encoding considerations:</label>
                <def><p>Identical to those of <attval>application/xml</attval>
                    as described in <bibref ref="RFC3023"/>,
                    section 3.2, as applied to the Web Services Policy document Infoset.</p></def></gitem>
            <gitem><label>Security considerations:</label>
                <def><p>See section <specref ref="Security_Considerations"/> in this document.</p></def></gitem>
            <gitem><label>Interoperability considerations:</label>
                <def><p>TBD.</p></def></gitem>
            <gitem><label>Published specifications:</label>
                <def><p>This document.</p></def></gitem>
            <gitem><label>Applications which use this media type:</label>
                <def><p>This new media type is being registered to allow for deployment of Efficient
	    XML Interchange on the World Wide Web.</p></def></gitem>
            <gitem>
                <label>Additional information:</label>
                <def><glist>
                    <gitem>
                        <label>File extension:</label>
                        <def><p>exi</p></def>
                    </gitem>
                    <gitem>
                        <label>Fragment identifiers:</label>
                        <def><p>A syntax identical to that of
                            <attval>application/xml</attval> as described in <bibref
                                ref="RFC3023"/>.</p></def>
                    </gitem>
                    <gitem>
                        <label>Base URI:</label>
                        <def><p>As specified in <bibref ref="RFC3023"/>, section 6.</p>
                        </def>
                    </gitem>
                    <gitem>
                        <label>Macintosh File Type code:</label>
                        <def><p>TEXT</p></def>
                    </gitem>
                    <gitem>
                        <label>Person and email address to contact for further information:</label>
                        <def><p>World Wide Web Consortium &lt;web-human@w3.org&gt;</p></def></gitem>
                    <gitem>
                        <label>Intended usage:</label><def><p>COMMON</p></def></gitem>
                    <gitem>
                        <label>Author/Change controller:</label>
                        <def><p>The Efficient XML Interchange specification set is a work product of the World Wide
                    Web Consortium's 
                    <loc
                        href="http://www.w3.org/XML/EXI/"
                        >Efficient XML Interchange Working Group</loc>.
                    The W3C has change control over these specifications.</p></def></gitem>
                </glist>
            </def>
            </gitem>
        </glist>
    </div2>
</div1>
-->
<div1 id="InfosetMapping">
  <head>Infoset Mapping</head>

  <p>
    This appendix contains the mappings between the XML Information
    Set <bibref ref="XMLInfoset"/> model and the EXI format.
    Starting from the document information item,
    each <term>information item</term> definition is mapped to its respective
    unordered set of EXI event types. Where the order among information items is relevant, it reflects the order in which the corresponding EXI events, or references to them, occur in the EXI stream. As used in the XML Information
    Set specification, the Infoset property names are shown in square
    brackets, <emph role="infoset-property">thus</emph>.
  </p>

  <div2 id='DocumentInformationItem'>
      <head>Document Information Item</head>
      <p>
	A document information item maps to a pair of SD and ED events, with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the document information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">children</emph></td>
	    <td>CM* PI* DT? [SE, EE]</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">document element</emph></td>
	    <td>[SE, EE]</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">notations</emph></td>
	    <td>Computed based on the <emph>text</emph> content item of DT, to which each notation information item maps.</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">unparsed entities</emph></td>
	    <td>Computed based on the <emph>text</emph> content item of DT, to which each unparsed entity information item maps.</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">base URI</emph></td>
	    <td>The base URI of the EXI stream</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">character encoding scheme</emph></td>
	    <td>N/A</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">standalone</emph></td>
	    <td>Not available</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">version</emph></td>
	    <td>Not available</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">all declarations processed</emph></td>
	    <td>True if all declarations contained directly or indirectly in DT are processed, otherwise false. This is a property of the processor rather than information provided by the format.</td>
	  </tr>
	</tbody>
      </table>
  </div2>


    <div2 id='ElementInformationItem'>
      <head>Element Information Items</head>
      <p>
	An element information item maps to a pair consisting of an SE event and the corresponding EE event, with each of its properties subject to further mapping as shown in the following table.
      </p>

      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the element information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">namespace name</emph></td>
	    <td>SE</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">local name</emph></td>
	    <td>SE</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">prefix</emph></td>
	    <td>SE</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">children</emph></td>
	    <td>[SE, EE]* PI* CM* CH* ER*</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">attributes</emph></td>
	    <td>AT*</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">namespace attributes</emph></td>
	    <td>NS*</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">in-scope namespaces</emph></td>
	    <td>
	      The namespace information items computed using the <emph
	      role="infoset-property">namespace attributes</emph>
	      properties of this information item and its ancestors
	    </td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">base URI</emph></td>
	    <td>The base URI of the element information item</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event if any, or computed based on the SD event</td>
	  </tr>
	</tbody>
      </table>

  </div2>


    <div2 id='AttributeInformationItem'>
      <head>Attribute Information Item</head>
      <p>
	An attribute information item maps to an AT event with each of its properties subject to further mapping as shown in the following table.
      </p>

      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the attribute information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">namespace name</emph></td>
	    <td>AT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">local name</emph></td>
	    <td>AT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">prefix</emph></td>
	    <td>AT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">normalized value</emph></td>
	    <td>The <emph>value</emph> of AT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">specified</emph></td>
	    <td>True if the item maps to AT, otherwise false</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">attribute type</emph></td>
	    <td>
	      Computed based on AT and DT
	    </td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">references</emph></td>
	    <td>
	      Computed based on <emph role="infoset-property">attribute type</emph> and <emph>value</emph> of AT
	    </td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">owner element</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="ProcessingInstructionInformationItem">
      <head>Processing Instruction Information Item</head> 
      <p>
	A processing instruction information item maps to a PI event, with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the processing instruction information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">target</emph></td>
	    <td>PI</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">content</emph></td>
	    <td>PI</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">base URI</emph></td>
	    <td>The base URI of the processing instruction information item</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">notation</emph></td>
	    <td>
	      Computed based on the availability of the internal DTD subset
	    </td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="UnexpandedEntityInformationItem">
    <head>Unexpanded Entity Reference Information Item</head>

      <p>
	An unexpanded entity reference information item maps to an ER event, with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the unexpanded entity reference information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">name</emph></td>
	    <td>ER</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">system identifier</emph></td>
	    <td>Based on the availability of the internal DTD subset</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">public identifier</emph></td>
	    <td>Based on the availability of the internal DTD subset</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">declaration base URI</emph></td>
	    <td>The base URI of the unexpanded entity reference information item</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="CharacterInformationItem">
    <head>Character Information Item</head>

      <p>
	A character information item maps to an individual character contained in a CH event following an SE event that did not get a matching EE event.
      </p>
    
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the character information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">character code</emph></td>
	    <td>Each character in CH</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">element content whitespace</emph></td>
	    <td>Computed based on <emph role="infoset-property">parent</emph> and DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="CommentInformationItem">
    <head>Comment Information Item</head>

      <p>
	A comment information item maps to a CM event, with each of its properties subject to further mapping as shown in the following table.
      </p>
    
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the comment information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">content</emph></td>
	    <td><emph>text</emph> content item of CM</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the last SE event encountered that did
	    not get a matching EE event, or the SD event</td>
	  </tr>
	</tbody>
      </table>

  </div2>

  <div2 id="DocumentTypeDeclaractionInformationItem">
    <head>Document Type Declaration Information Item</head>

      <p>
	A document type declaration information item maps to a DT event with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the document type declaration information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">system identifier</emph></td>
	    <td>DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">public identifier</emph></td>
	    <td>DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">children</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">parent</emph></td>
	    <td>Computed based on the SD event</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="UnparsedEntityInformationItem">
      <head>Unparsed Entity Information Item</head>
      <p>
	An unparsed entity information item maps to part of the
	<emph>text</emph> content item of a DT event, with each of its properties subject to further mapping as shown in the following table.
      </p>

      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the unparsed entity information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">name</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">system identifier</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">public identifier</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">declaration base URI</emph></td>
	    <td>The base URI of the unparsed entity information item</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">notation name</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">notation</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="NotationMapping">
    <head>Notation Information Item</head>
      <p>
	A notation information item maps to part of the
	<emph>text</emph> content item of a DT event, with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the notation information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">name</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">system identifier</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">public identifier</emph></td>
	    <td>Computed based on <emph>text</emph> content item of DT</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">declaration base URI</emph></td>
	    <td>The base URI of the notation information item</td>
	  </tr>
	</tbody>
      </table>
  </div2>

  <div2 id="NamespaceInformationItem">
    <head>Namespace Information Item</head>
      <p>
	A namespace information item maps to an NS event with each of its properties subject to further mapping as shown in the following table.
      </p>
      <table border='1' cellpadding='3' width="100%">
	<caption>Mapping of the namespace information item properties to EXI event types</caption>
	<thead>
	  <tr>
	    <th width="35%">Property</th>
	    <th width="60%">EXI event types</th>
	  </tr>
	</thead>
	<tbody>
	  <tr>
	    <td><emph role="infoset-property">prefix</emph></td>
	    <td>NS</td>
	  </tr>
	  <tr>
	    <td><emph role="infoset-property">namespace name</emph></td>
	    <td>NS</td>
	  </tr>
	</tbody>
      </table>
  </div2>

</div1>
<div1 id="optionsSchema">
<head>XML Schema for EXI Options Header</head>
<p>The following schema describes the EXI options header. It is
designed to produce smaller headers for option combinations used when
compactness is critical.</p>

<eg xml:space="preserve">
&lt;xsd:schema targetNamespace="&exins;"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified"&gt;

  &lt;xsd:element name="header"&gt;
    &lt;xsd:complexType&gt;
      &lt;xsd:sequence&gt;
        &lt;xsd:element name="lesscommon" minOccurs="0"&gt;
          &lt;xsd:complexType&gt;
            &lt;xsd:sequence&gt;
              &lt;xsd:element name="uncommon" minOccurs="0"&gt;
                &lt;xsd:complexType&gt;
                  &lt;xsd:sequence&gt;
                    &lt;xsd:any namespace="##other" minOccurs="0" maxOccurs="unbounded" /&gt;
                    &lt;xsd:element name="alignment" minOccurs="0"&gt;
                      &lt;xsd:complexType&gt;
                        &lt;xsd:choice&gt;
                          &lt;xsd:element name="byte"&gt;
                            &lt;xsd:complexType /&gt;
                          &lt;/xsd:element&gt;
                          &lt;xsd:element name="pre-compress"&gt;
                            &lt;xsd:complexType /&gt;
                          &lt;/xsd:element&gt;
                        &lt;/xsd:choice&gt;
                      &lt;/xsd:complexType&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="selfContained" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="valueMaxLength" minOccurs="0"&gt;
                      &lt;xsd:simpleType&gt;
                        &lt;xsd:restriction base="xsd:unsignedInt" /&gt; 
                      &lt;/xsd:simpleType&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="valuePartitionCapacity" minOccurs="0"&gt;
                      &lt;xsd:simpleType&gt;
                        &lt;xsd:restriction base="xsd:unsignedInt" /&gt; 
                      &lt;/xsd:simpleType&gt;
                    &lt;/xsd:element&gt;

                    &lt;xsd:element name="datatypeRepresentationMap" minOccurs="0" maxOccurs="unbounded"&gt;
                      &lt;xsd:complexType&gt;
                        &lt;xsd:sequence&gt;
                          &lt;xsd:any namespace="##other" /&gt; &lt;!-- schema datatype --&gt;
                          &lt;xsd:any namespace="##other" /&gt; &lt;!-- datatype representation --&gt;
                        &lt;/xsd:sequence&gt;
                      &lt;/xsd:complexType&gt;
                    &lt;/xsd:element&gt;
                  &lt;/xsd:sequence&gt;
                &lt;/xsd:complexType&gt;
              &lt;/xsd:element&gt;
              &lt;xsd:element name="preserve" minOccurs="0"&gt;
                &lt;xsd:complexType&gt;
                  &lt;xsd:sequence&gt;
                    &lt;xsd:element name="dtd" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="prefixes" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="lexicalValues" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;<!--
                    &lt;xsd:element name="whitespace" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt; -->
                    &lt;xsd:element name="comments" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;
                    &lt;xsd:element name="pis" minOccurs="0"&gt;
                      &lt;xsd:complexType /&gt;
                    &lt;/xsd:element&gt;
                  &lt;/xsd:sequence&gt;
                &lt;/xsd:complexType&gt;
              &lt;/xsd:element&gt;
              &lt;xsd:element name="blockSize" minOccurs="0"&gt;
                &lt;xsd:simpleType&gt;
                  &lt;xsd:restriction base="xsd:unsignedInt" /&gt; 
                &lt;/xsd:simpleType&gt;
              &lt;/xsd:element&gt;                 
            &lt;/xsd:sequence&gt;
          &lt;/xsd:complexType&gt;
        &lt;/xsd:element&gt;
        &lt;xsd:element name="common" minOccurs="0"&gt;
          &lt;xsd:complexType&gt;
            &lt;xsd:sequence&gt;
              &lt;xsd:element name="compression" minOccurs="0"&gt;
                &lt;xsd:complexType /&gt;
              &lt;/xsd:element&gt;
              &lt;xsd:element name="fragment" minOccurs="0"&gt;
                &lt;xsd:complexType /&gt;
              &lt;/xsd:element&gt;
              &lt;xsd:element name="schemaId" minOccurs="0" nillable="true"&gt;
                &lt;xsd:simpleType&gt;
                  &lt;xsd:restriction base="xsd:string" /&gt;
                &lt;/xsd:simpleType&gt;
              &lt;/xsd:element&gt;
            &lt;/xsd:sequence&gt;
          &lt;/xsd:complexType&gt;
        &lt;/xsd:element&gt;
        &lt;xsd:element name="strict" minOccurs="0"&gt;
          &lt;xsd:complexType /&gt;
        &lt;/xsd:element&gt;
      &lt;/xsd:sequence&gt;
    &lt;/xsd:complexType&gt;
  &lt;/xsd:element&gt;

&lt;/xsd:schema&gt;

</eg>
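<p>As a non-normative illustration, an options document that conforms to the schema above might look like the following sketch. It selects byte alignment, preserves prefixes and comments, and names a schema. The <code>schemaId</code> value <code>urn:example:orders</code> is a hypothetical identifier chosen for this example only.</p>
<eg xml:space="preserve">
&lt;header xmlns="&exins;"&gt;
  &lt;lesscommon&gt;
    &lt;uncommon&gt;
      &lt;alignment&gt;
        &lt;byte/&gt;
      &lt;/alignment&gt;
    &lt;/uncommon&gt;
    &lt;preserve&gt;
      &lt;prefixes/&gt;
      &lt;comments/&gt;
    &lt;/preserve&gt;
  &lt;/lesscommon&gt;
  &lt;common&gt;
    &lt;!-- hypothetical schema identifier, for illustration only --&gt;
    &lt;schemaId&gt;urn:example:orders&lt;/schemaId&gt;
  &lt;/common&gt;
&lt;/header&gt;
</eg>
<p>The child elements appear in the order required by the sequences declared in the schema: <code>lesscommon</code> before <code>common</code>, <code>uncommon</code> before <code>preserve</code>, and <code>prefixes</code> before <code>comments</code>.</p>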
</div1>
<div1 id="initialStringValues">
<head>Initial Entries in String Table Partitions</head>
<div2 id="initialUriValues">
<head>Initial Entries in Uri Partition</head>
<p>The following table lists the entries that initially populate the uri partition. The partition name URI in the table denotes the uri partition.</p>
<table border="1">
<caption>Initial values in <emph>uri</emph> partition</caption>
<colgroup span="2" align="center"></colgroup>
<colgroup></colgroup>
<thead>
<tr>
<th>Partition</th>
<th>String ID</th>
<th>String Value</th></tr>
</thead>
<tbody>
<tr>
<td>URI</td>
<td>0</td>
<td>"" [empty string]</td></tr>
<tr>
<td>URI</td>
<td>1</td>
<td>"http://www.w3.org/XML/1998/namespace"</td></tr>
<tr>
<td>URI</td>
<td>2</td>
<td>"http://www.w3.org/2001/XMLSchema-instance"</td></tr>
<!-- tr>
<td>URI</td>
<td>3</td>
<td>"http://www.w3.org/2001/XMLSchema"</td></tr -->
</tbody></table>

<p>When XML Schemas are used to inform the grammars for processing the EXI body, one additional entry is appended to the uri partition.
</p>

<table border="1">
<caption>Additional entry when XML Schemas are used</caption>
<colgroup span="2" align="center"></colgroup>
<colgroup></colgroup>
<thead>
<tr>
<th>Partition</th>
<th>String ID</th>
<th>String Value</th></tr>
</thead>
<tbody>
<tr>
<td>URI</td>
<td>3</td>
<td>"http://www.w3.org/2001/XMLSchema"</td></tr></tbody></table>

</div2>

<div2 id="initialPrefixValues">
<head>Initial Entries in Prefix Partitions</head>
<p>The following table lists the entries that are initially populated in prefix partitions, 
where XML-PF represents the partition for <emph>prefixes</emph> in
the "http://www.w3.org/XML/1998/namespace" namespace and XSI-PF
represents the partition for <emph>prefixes</emph> in the
"http://www.w3.org/2001/XMLSchema-instance" namespace.</p>
<table border="1">
<caption>Initial 
<emph>prefix</emph> string table entries</caption>
<colgroup span="2" align="center"></colgroup>
<colgroup></colgroup>
<thead>
<tr>
<th>Partition</th>
<th>String ID</th>
<th>String Value</th></tr>
</thead>
<tbody>
<tr>
<td>""</td>
<td>0</td>
<td>"" [empty string]</td></tr>
<tr>
<td>XML-PF</td>
<td>0</td>
<td>"xml"</td></tr>
<tr>
<td>XSI-PF</td>
<td>0</td>
<td>"xsi"</td></tr>
</tbody></table>
</div2>
<div2 id="initialLocalNames">
<head>Initial Entries in Local-Name Partitions</head>
<p>The following table lists the entries that are initially populated in local-name partitions, 
where XML-NS represents the partition for <emph>local-names</emph>
in the "http://www.w3.org/XML/1998/namespace" namespace, XSI-NS
represents the partition for <emph>local-names</emph> in the
"http://www.w3.org/2001/XMLSchema-instance" namespace, and XSD-NS
represents the partition for <emph>local-names</emph> in the
"http://www.w3.org/2001/XMLSchema" namespace. </p>
<table border="1">
<caption>Initial <emph>local-name</emph> string table entries</caption>
<colgroup span="2" align="center"></colgroup>
<colgroup></colgroup>
<thead>
<tr>
<th>Partition</th>
<th>String ID</th>
<th>String Value</th></tr>
</thead>
<tbody>
<tr>
<td>XML-NS</td>
<td>0</td>
<td>"space"</td></tr>
<tr>
<td>XML-NS</td>
<td>1</td>
<td>"lang"</td></tr>
<tr>
<td>XML-NS</td>
<td>2</td>
<td>"id"</td></tr>
<tr>
<td>XML-NS</td>
<td>3</td>
<td>"base"</td></tr>
<tr>
<td>XSI-NS</td>
<td>0</td>
<td>"type"</td></tr>
<tr>
<td>XSI-NS</td>
<td>1</td>
<td>"nil"</td></tr>
<!-- tr>
<td>XSI-NS</td>
<td>2</td>
<td>"schemaLocation"</td></tr>
<tr>
<td>XSI-NS</td>
<td>3</td>
<td>"noNamespaceSchemaLocation"</td></tr -->
<tr>
<td>XSD-NS</td>
<td>0</td>
<td>"anyType"</td></tr>
<tr>
<td>XSD-NS</td>
<td>1</td>
<td>"anySimpleType"</td></tr>
<tr>
<td>XSD-NS</td>
<td>2</td>
<td>"string"</td></tr>
<tr>
<td>XSD-NS</td>
<td>3</td>
<td>"normalizedString"</td></tr>
<tr>
<td>XSD-NS</td>
<td>4</td>
<td>"token"</td></tr>
<tr>
<td>XSD-NS</td>
<td>5</td>
<td>"language"</td></tr>
<tr>
<td>XSD-NS</td>
<td>6</td>
<td>"Name"</td></tr>
<tr>
<td>XSD-NS</td>
<td>7</td>
<td>"NCName"</td></tr>
<tr>
<td>XSD-NS</td>
<td>8</td>
<td>"ID"</td></tr>
<tr>
<td>XSD-NS</td>
<td>9</td>
<td>"IDREF"</td></tr>
<tr>
<td>XSD-NS</td>
<td>10</td>
<td>"IDREFS"</td></tr>
<tr>
<td>XSD-NS</td>
<td>11</td>
<td>"ENTITY"</td></tr>
<tr>
<td>XSD-NS</td>
<td>12</td>
<td>"ENTITIES"</td></tr>
<tr>
<td>XSD-NS</td>
<td>13</td>
<td>"NMTOKEN"</td></tr>
<tr>
<td>XSD-NS</td>
<td>14</td>
<td>"NMTOKENS"</td></tr>
<tr>
<td>XSD-NS</td>
<td>15</td>
<td>"duration"</td></tr>
<tr>
<td>XSD-NS</td>
<td>16</td>
<td>"dateTime"</td></tr>
<tr>
<td>XSD-NS</td>
<td>17</td>
<td>"time"</td></tr>
<tr>
<td>XSD-NS</td>
<td>18</td>
<td>"date"</td></tr>
<tr>
<td>XSD-NS</td>
<td>19</td>
<td>"gYearMonth"</td></tr>
<tr>
<td>XSD-NS</td>
<td>20</td>
<td>"gYear"</td></tr>
<tr>
<td>XSD-NS</td>
<td>21</td>
<td>"gMonthDay"</td></tr>
<tr>
<td>XSD-NS</td>
<td>22</td>
<td>"gDay"</td></tr>
<tr>
<td>XSD-NS</td>
<td>23</td>
<td>"gMonth"</td></tr>
<tr>
<td>XSD-NS</td>
<td>24</td>
<td>"boolean"</td></tr>
<tr>
<td>XSD-NS</td>
<td>25</td>
<td>"base64Binary"</td></tr>
<tr>
<td>XSD-NS</td>
<td>26</td>
<td>"hexBinary"</td></tr>
<tr>
<td>XSD-NS</td>
<td>27</td>
<td>"float"</td></tr>
<tr>
<td>XSD-NS</td>
<td>28</td>
<td>"double"</td></tr>
<tr>
<td>XSD-NS</td>
<td>29</td>
<td>"anyURI"</td></tr>
<tr>
<td>XSD-NS</td>
<td>30</td>
<td>"QName"</td></tr>
<tr>
<td>XSD-NS</td>
<td>31</td>
<td>"NOTATION"</td></tr>
<tr>
<td>XSD-NS</td>
<td>32</td>
<td>"decimal"</td></tr>
<tr>
<td>XSD-NS</td>
<td>33</td>
<td>"integer"</td></tr>
<tr>
<td>XSD-NS</td>
<td>34</td>
<td>"nonPositiveInteger"</td></tr>
<tr>
<td>XSD-NS</td>
<td>35</td>
<td>"negativeInteger"</td></tr>
<tr>
<td>XSD-NS</td>
<td>36</td>
<td>"long"</td></tr>
<tr>
<td>XSD-NS</td>
<td>37</td>
<td>"int"</td></tr>
<tr>
<td>XSD-NS</td>
<td>38</td>
<td>"short"</td></tr>
<tr>
<td>XSD-NS</td>
<td>39</td>
<td>"byte"</td></tr>
<tr>
<td>XSD-NS</td>
<td>40</td>
<td>"nonNegativeInteger"</td></tr>
<tr>
<td>XSD-NS</td>
<td>41</td>
<td>"positiveInteger"</td></tr>
<tr>
<td>XSD-NS</td>
<td>42</td>
<td>"unsignedLong"</td></tr>
<tr>
<td>XSD-NS</td>
<td>43</td>
<td>"unsignedInt"</td></tr>
<tr>
<td>XSD-NS</td>
<td>44</td>
<td>"unsignedShort"</td></tr>
<tr>
<td>XSD-NS</td>
<td>45</td>
<td>"unsignedByte"</td></tr></tbody></table>
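<p>As an illustrative sketch of how the initial entries are looked up, the comments in the following XML fragment name the entries that correspond to its two attribute names. The element name <code>note</code> is hypothetical, and the way a string ID is represented in a stream is governed by the rest of this specification, not by this example.</p>
<eg xml:space="preserve">
&lt;note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xml:lang="en"
      xsi:nil="true"/&gt;

&lt;!-- xml:lang : uri ID 1; prefix "xml" is ID 0 in XML-PF; local-name "lang" is ID 1 in XML-NS --&gt;
&lt;!-- xsi:nil  : uri ID 2; prefix "xsi" is ID 0 in XSI-PF; local-name "nil"  is ID 1 in XSI-NS --&gt;
</eg>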
</div2>
</div1>
<div1 id="regexToCharset">
<head>Deriving Character Sets from XML Schema Regular Expressions</head>
<p>
The XML Schema datatypes specification <bibref ref="schema2"/> defines its own 
<xspecref spec="XS2" ref="regexs">regular expression</xspecref>
syntax for use in pattern facets of simple type definitions. Pattern facets are applied to the literal values to constrain the set of valid values to those that lexically match the specified regular expression. Although the regular expression syntax is defined by dozens of productions, at the finest granularity of the grammar a regular expression is made up of character sets, which are combined, concatenated, complemented or subtracted in a bottom-up fashion to form the regular expression. In this regard, a regular expression can be seen as a sort of micro-schema that suggests a concrete character set to which characters in a string are likely to belong. The remainder of this section describes a method for deriving a character set from an XML Schema regular expression. Hereinafter, "character set" and "XML Schema regular expression" are referred to as "charset" and "regexp", respectively.
Regexp syntax permits the use of <xspecref spec="XS2" ref="dt-cces">character class escapes</xspecref>, some of which depend on the mapping from code points to character properties. This document assumes the use of revision 5.0.0 of the Unicode Standard <bibref ref="UnicodeDB"/> to obtain the mapping.
</p>
<p>At the top level, regexp syntax is summarized by the following three productions, excerpted here from <bibref ref="schema2"/>. Note the notation used for the numbers that tag the productions. "XSD:" is prefixed to the original numeric tags to make it easier to discern them as belonging to XML Schema specification.</p>
<table>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="regex">[XSD:1]</xspecref>&nbsp;&nbsp;regExp&nbsp;&nbsp;::=&nbsp;&nbsp;branch&nbsp;&nbsp;(&nbsp;&nbsp;'|'&nbsp;&nbsp;branch&nbsp;&nbsp;)*  
</td>
</tr>
<tr>
<td/>
<td>
<xspecref spec="XS2" ref="branch">[XSD:2]</xspecref>&nbsp;&nbsp;branch&nbsp;&nbsp;::=&nbsp;&nbsp;piece* 
</td>
</tr>
<tr>
<td/>
<td>
<xspecref spec="XS2" ref="piece">[XSD:3]</xspecref>&nbsp;&nbsp;piece&nbsp;&nbsp;::=&nbsp;&nbsp;atom&nbsp;&nbsp;quantifier? 
</td>
</tr>
</table>
<p>These productions indicate that the charset of a regexp (i.e. <code>regExp</code> above) equals the union of the charsets of all the atoms that participate in the regexp. There are exceptions, based on empirical observations, that identify certain regexps as not being subject to the computation of charsets. 
If any atom is, or contains directly or indirectly, one of the following character groups, the charset of the whole regexp is defined to be the entire set of XML characters.
</p>
<ulist>
<item>
All <xspecref spec="XS2" ref="dt-ccesN">multi-character escapes</xspecref> (including 
meta-character <code>'.'</code>) except for <code>'\s'</code> and <code>'\d'</code>.
</item>
<item>
All <xspecref spec="XS2" ref="dt-ccescat">category escapes</xspecref> that carry one of the following character properties.
<ulist>
<item>
All category names that are of the forms:
<code>'L'[ulo]?</code>, 
<code>'M'[n]?</code>, 
<code>'N'</code>, 
<code>'P'</code>, 
<code>'Z'</code>, 
<code>'S'[mo]?</code> or
<code>'C'[o]?</code>&nbsp;.
</item>
<item>
The following block names: Ethiopic, UnifiedCanadianAboriginalSyllabics, CJKUnifiedIdeographs,
CJKCompatibilityIdeographs, ArabicPresentationForms-A, CJKUnifiedIdeographsExtensionA,
YiSyllables, HangulSyllables and PrivateUse.
</item>
</ulist>
</item>
<item>
<xspecref spec="XS2" ref="nt-complEsc"><code>complEsc</code></xspecref> (examples of which are <code>'\P{ L }'</code> and <code>'\P{ N }'</code> ).
</item>
<item>
<xspecref spec="XS2" ref="nt-negCharGroup"><code>negCharGroup</code></xspecref> as indicated by meta-character <code>'^'</code>. See [XSD:15].
</item>
</ulist>
<p>Most regexps that contain one of the character groups listed above yield a very large charset, and in the rare cases where they do not, there are usually alternative ways to express the same constraint more intuitively without using any of the above constructs.
The rest of this section describes the system for deriving character sets from regexps that do not contain any of the character groups listed above.
</p>
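<p>To make the exceptions above concrete, the following pattern facets are sketches of regexps whose charset is defined to be the entire set of XML characters. The patterns were chosen for illustration only and are not taken from any particular schema.</p>
<eg xml:space="preserve">
&lt;xsd:pattern value="\w+"/&gt;        &lt;!-- multi-character escape other than \s and \d --&gt;
&lt;xsd:pattern value=".*"/&gt;         &lt;!-- meta-character '.' --&gt;
&lt;xsd:pattern value="\p{Lu}\d*"/&gt;  &lt;!-- category escape with category name Lu --&gt;
&lt;xsd:pattern value="\P{Nd}+"/&gt;    &lt;!-- complEsc --&gt;
&lt;xsd:pattern value="[^0-9]*"/&gt;    &lt;!-- negCharGroup --&gt;
</eg>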
<p>Shown below is the rule for assembling the charset of a regexp from the list of atoms that are contained directly in the regexp. Atoms inside a parenthesized sub-regexp (see [XSD:9] below) are not listed individually; instead, the sub-regexp itself is the atom that is included in the list. The pseudo-function notation <code>CS(arg)</code>, where <code>arg</code> is a regexp construct, denotes the method of obtaining the character set of the argument.
</p>
<scrap>
<bnf>
[1] CS(regExp) := 

    union of every CS(atom[0...N-1])

        where N represents the number of atoms
</bnf>
</scrap>
<p>
Per its definition [XSD:9], an atom is one of three kinds.
</p>
<table id="atom">
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="atom">[XSD:9]</xspecref>&nbsp;&nbsp;atom&nbsp;&nbsp;::=&nbsp;&nbsp;Char&nbsp;&nbsp;|&nbsp;&nbsp;charClass&nbsp;&nbsp;|&nbsp;&nbsp;(&nbsp;&nbsp;'('&nbsp;&nbsp;regExp&nbsp;&nbsp;')'&nbsp;&nbsp;) 
</td>
</tr>
</table>
<p>This production directly translates to the following rule for acquiring the charset of an atom.
</p>
<scrap>
<bnf>
[2] CS(atom) :=

    a single char represented by Char (if atom is Char)

    or 

    CS(charClass) (if atom is charClass. See rule [3]) 

    or

    CS(regExp) (if atom is sub-regexp. See rule [1])
</bnf>
</scrap>
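<p>As a worked sketch of rules [1] and [2], consider the hypothetical pattern below, which contains only <code>Char</code> atoms and one parenthesized sub-regexp. Quantifiers do not contribute to the charset.</p>
<eg xml:space="preserve">
&lt;xsd:pattern value="ab(c|de)*f"/&gt;

&lt;!--
  Atoms contained directly in the regexp: 'a', 'b', '(c|de)', 'f'

  CS(ab(c|de)*f) = union of CS('a'), CS('b'), CS('(c|de)'), CS('f')    by rule [1]
  CS('(c|de)')   = CS(c|de)                                            by rule [2]
                 = union of CS('c'), CS('d'), CS('e')                  by rule [1]

  CS(ab(c|de)*f) = { a, b, c, d, e, f }
--&gt;
</eg>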
<p>Similarly, there are three choices for <code>charClass</code> per its definition [XSD:11] below, which is followed by the corresponding rule for acquiring the charset of a <code>charClass</code>.
</p>
<table>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="charClass">[XSD:11]</xspecref>&nbsp;&nbsp;charClass&nbsp;&nbsp;::=&nbsp;&nbsp;charClassEsc&nbsp;&nbsp;|&nbsp;&nbsp;charClassExpr&nbsp;&nbsp;|&nbsp;&nbsp;WildcardEsc
</td>
</tr>
<tr>
<td>&nbsp;</td>
</tr>
</table>
<scrap>
<bnf>
[3] CS(charClass) :=

    CS(charClassEsc) (if charClass is charClassEsc. See [XSD:23] that defines
                      the characters contained in CS(charClassEsc) for each kind
                      of charClassEsc)

    or

    CS(charClassExpr) (if charClass is charClassExpr. See rule [4])
</bnf>
</scrap>
<p>Note that there is no rule specified above for a <code>charClass</code> that is a <code>WildcardEsc</code>. This is because the presence of a <code>WildcardEsc</code> causes the charset of the <code>regExp</code> that contains this <code>charClass</code> to be the entire XML charset (see the exceptions that precede rule [1] above).</p>
<p>A <code>charClassExpr</code> is either <code>posCharGroup</code>, <code>negCharGroup</code> or <code>charClassSub</code> per productions [XSD:12] and [XSD:13] as excerpted below.
</p>
<table>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="charClassExpr">[XSD:12]</xspecref>&nbsp;&nbsp;charClassExpr&nbsp;&nbsp;::=&nbsp;&nbsp;'['&nbsp;&nbsp;charGroup&nbsp;&nbsp;']' 
</td>
</tr>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="chargroup">[XSD:13]</xspecref>&nbsp;&nbsp;charGroup&nbsp;&nbsp;::=&nbsp;&nbsp;posCharGroup&nbsp;&nbsp;|&nbsp;&nbsp;negCharGroup&nbsp;&nbsp;|&nbsp;&nbsp;charClassSub 
</td>
</tr>
</table>
<p>This directly translates to the rule for <code>charClassExpr</code> charset as shown below.</p>
<scrap>
<bnf>
[4] CS(charClassExpr) :=

    CS(posCharGroup) (if charClassExpr is posCharGroup. See rule [5])

    or

    CS(charClassSub) (if charClassExpr is charClassSub. See rule [6])
</bnf>
</scrap>
<p>Note that there is no rule specified above for a <code>charClassExpr</code> that is a <code>negCharGroup</code>. This is because the presence of a <code>negCharGroup</code> causes the charset of the <code>regExp</code> that contains this <code>charClassExpr</code> to be the entire XML charset (see the exceptions that precede rule [1] above).
</p>
<p><code>posCharGroup</code> is defined to be a sequence of one or more items, each of which is a <code>charRange</code> or a <code>charClassEsc</code>, per production [XSD:14].
</p>
<table>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="poschargroup">[XSD:14]</xspecref>&nbsp;&nbsp;posCharGroup&nbsp;&nbsp;::=&nbsp;&nbsp;(&nbsp;&nbsp;charRange&nbsp;&nbsp;|&nbsp;&nbsp;charClassEsc&nbsp;&nbsp;)+  
</td>
</tr>
</table>
<p>The above production translates to the following rule for acquiring the charset of a <code>posCharGroup</code>.
</p>
<scrap>
<bnf>
[5] CS(posCharGroup) := union of every CS(charRange[0...M-1]) and
                                 every CS(charClassEsc[0...N-1])

        where M and N represent the number of charRanges and charClassEscs
        contained in the posCharGroup, respectively.
</bnf>
</scrap>
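<p>As a worked sketch of rule [5], consider the hypothetical pattern below, whose <code>posCharGroup</code> contains two <code>charRange</code>s and one <code>charClassEsc</code>. The contents of <code>CS(\s)</code> follow the definition of <code>'\s'</code> in <bibref ref="schema2"/>.</p>
<eg xml:space="preserve">
&lt;xsd:pattern value="[a-f0-9\s]*"/&gt;

&lt;!--
  posCharGroup: a-f0-9\s
    charRange[0]    = a-f
    charRange[1]    = 0-9
    charClassEsc[0] = \s

  CS(posCharGroup) = union of CS(a-f), CS(0-9), CS(\s)                 by rule [5]
                   = { a, b, c, d, e, f,
                       0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
                       #x20, #x9, #xA, #xD }
--&gt;
</eg>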
<p>Lastly, <code>charClassSub</code> is defined using a subtraction operation as follows.
</p>
<table>
<tr>
<td width="5%"/>
<td>
<xspecref spec="XS2" ref="charclasssub">[XSD:16]</xspecref>&nbsp;&nbsp;charClassSub&nbsp;&nbsp;::=&nbsp;&nbsp;(&nbsp;&nbsp;posCharGroup&nbsp;&nbsp;|&nbsp;&nbsp;negCharGroup&nbsp;&nbsp;)&nbsp;&nbsp;'-'&nbsp;&nbsp;charClassExpr 
</td>
</tr>
</table>
<p>Because the presence of <code>negCharGroup</code> would already have caused the containing <code>regExp</code> to have the entire XML charset, <code>negCharGroup</code> can be pruned from the above production, which yields the following reduced version of [XSD:16].
</p>
<table>
<tr>
<td width="5%"/>
<td>
&nbsp;[XSD:16']&nbsp;&nbsp;charClassSub&nbsp;&nbsp;::=&nbsp;&nbsp;posCharGroup&nbsp;&nbsp;'-'&nbsp;&nbsp;charClassExpr 
</td>
</tr>
</table>
<p>The above production translates to the following rule for acquiring the charset of a <code>charClassSub</code>.
</p>
<scrap>
<bnf>
[6] CS(charClassSub) := characters that are found
                        in CS(posCharGroup) (See rule [5])
                        but not in CS(charClassExpr) (See rule [4])
</bnf>
</scrap>
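<p>As a worked sketch of rule [6], consider the hypothetical pattern below, which subtracts the lowercase vowels from the range of lowercase Latin letters.</p>
<eg xml:space="preserve">
&lt;xsd:pattern value="[a-z-[aeiou]]+"/&gt;

&lt;!--
  charClassSub:  a-z-[aeiou]
    posCharGroup  = a-z
    charClassExpr = [aeiou]

  CS(posCharGroup)  = { a, b, c, ..., z }                              by rule [5]
  CS(charClassExpr) = { a, e, i, o, u }                                by rules [4] and [5]

  CS(charClassSub)  = characters in CS(posCharGroup)
                      but not in CS(charClassExpr)                     by rule [6]
                    = { b, c, d, f, g, h, j, k, l, m, n,
                        p, q, r, s, t, v, w, x, y, z }
--&gt;
</eg>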
</div1>

<div1 id="mediaTypeRegistration">
<head>Content Coding and Internet Media Type</head>
<p>
Two labels are defined for use in the interchange of XML Information Set data encoded as EXI streams. They serve two distinct roles of indicating metadata in data interchange; one is for content coding, and the other is for internet media type. 
</p>
<p>
In protocols that support a mechanism to indicate the encoding transformation of the data being exchanged, the label "exi" is used as a content coding (see section <specref ref="contentCoding"/>) when an XML Information Set is transferred or requested with its document body encoded as an EXI stream. 
</p>
<p>
For other protocols that lack the capability of indicating the encoding transformation of the data being transferred, the other label "application/exi" is defined as an internet media type (see section <specref ref="internetMediaType"/>) in order to identify that the data being retrieved or sent is an XML Information Set represented as an EXI stream.
</p>

<div2 id="contentCoding">
<head>Content Coding</head>
<p>
Protocols that support a mechanism to indicate the encoding transformation of the data being transferred (e.g. HTTP 1.1) SHOULD use the label "exi" (case-insensitive) to annotate the transfer or the request of data structured as an XML Information Set to convey the actual use of or the acceptance of EXI encoding for the interchange that is underway.
</p>
</div2>

<div2 id="internetMediaType">
<head>Internet Media Type</head>
<p>
A new media type registration "application/exi" described below is being proposed for community review, with the intent to eventually submit it to the IESG for review, approval, and registration with IANA.

<!-- In the introduction to the relevant section, say that this registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA. -->


</p>
<glist>
<gitem>
<label>Type name:</label>
<def>
<p>
application
</p>
</def>
</gitem>
<gitem>
<label>Subtype name:</label>
<def>
<p>
exi
</p>
</def>
</gitem>
<gitem>
<label>Required parameters:</label>
<def>
<p>
none
</p>
</def>
</gitem>
<gitem>
<label>Optional parameters:</label>
<def>
<p>
none
</p>
</def>
</gitem>
<gitem>
<label>Encoding considerations:</label>
<def>
<p>
binary
</p>
</def>
</gitem>
<gitem>
<label>Security considerations:</label>
<def>
<p>
When used as an XML replacement in an application, EXI shares
the same security concerns as XML, described in IETF RFC 3023 <bibref ref="RFC3023"/>,
section 10.
</p>
<p>
In addition to concerns shared with XML, the schema identifier
refers to information external to the EXI document itself. If
an attacker is able to substitute another schema in place of
the intended one, the semantics of the EXI document could be
changed in some ways. As an example, EXI is sensitive to the
order of the values in an enumeration. It is not known whether
such an attack is possible on the actual structure of the
document.
</p>
<p>
Also, EXI supports user-defined datatype representations, and such
representations, if present in a document and purportedly understood by
a processor, can be a security weakness. Definitions of these
representations are expected to be external, often application- or
industry-specific, so any definition needs to be analyzed carefully from
the security perspective before being adopted.
</p>
</def>
</gitem>
<gitem>
<label>Interoperability considerations:</label>
<def>
<p>
The datatype representation map feature of EXI requires
coordination between the producer and consumer of an EXI
document, and is not recommended except in controlled
environments or using standardized datatype representations
potentially defined in the future.
</p>
<p>
EXI permits information necessary to decode a document to be
omitted with the expectation that such information has been
communicated out of band. Such omissions hinder
interoperability in uncontrolled environments.
</p>
</def>
</gitem>
<gitem>
<label>Published specification:</label>
<def>
<p>
Efficient XML Interchange (EXI) Format 1.0, World Wide Web
Consortium
</p>
</def>
</gitem>
<gitem>
<label>Applications that use this media type:</label>
<def>
<p>
No known applications currently use this media type.
</p>
</def>
</gitem>
<gitem>
<label>Additional information:</label>
<def>
<table width="100%">
<tbody>
<tr>
<td colspan="2">&nbsp;</td>
</tr>
<tr align="left">
<th colspan="2">
Magic number(s):
</th>
</tr>
<tr>
<td width="5%">&nbsp;</td>
<td>
The first four octets may be hexadecimal 24 45 58 49 ("$EXI").
The first octet after these, or the first octet of the whole
content if they are not present, has its high two bits set to
values 1 and 0 in that order.
</td>
</tr>
</tbody>
</table>
<table width="100%">
<tbody>
<tr align="left">
<th colspan="2">
File extension(s):
</th>
</tr>
<tr>
<td width="5%">&nbsp;</td>
<td>.exi</td>
</tr>
</tbody>
</table>
<table width="100%">
<tbody>
<tr align="left">
<th colspan="2">
Macintosh file type code(s):
</th>
</tr>
<tr>
<td width="5%">&nbsp;</td>
<td>APPL</td>
</tr>
<tr>
<td colspan="2">&nbsp;</td>
</tr>
</tbody>
</table>
</def>
</gitem>
<gitem>
<label>Person &amp; email address to contact for further information:</label>
<def>
<p>
World Wide Web Consortium &lt;web-human@w3.org&gt;
</p>
</def>
</gitem>
<gitem>
<label>Intended usage:</label>
<def>
<p>
COMMON
</p>
</def>
</gitem>
<gitem>
<label>Restrictions on usage:</label>
<def>
<p>
none
</p>
</def>
</gitem>
<gitem>
<label>Author/Change controller:</label>
<def>
<p>
The EXI specification is the product of the World Wide Web
Consortium's Efficient XML Interchange Working Group. The W3C
has change control over this specification.
</p>
</def>
</gitem>

</glist>


</div2>

</div1>

<inform-div1 id="example">
<head>Example Encoding</head>
<p>
The EXI Primer <bibref ref="exiprimer"/> contains a section that explains the workings of the EXI format using simple example documents. Those examples are intended to help readers confirm their understanding of the EXI format in action by going through the encoding and decoding processes step by step.
</p>
</inform-div1>
<inform-div1 id="grammarExamples">
<head>Schema-informed Grammar Examples</head>

<p>As an example to exercise the process to produce schema-informed element grammars, consider the following XML Schema fragment declaring two complex-typed elements, &lt;product&gt; and &lt;order&gt;: </p>
<example>
<head>Example XML Schema fragment</head>
<eg xml:space="preserve">
&lt;xs:element name=&quot;product&quot;&gt; 
  &lt;xs:complexType&gt; 
    &lt;xs:sequence maxOccurs=&quot;2&quot;&gt; 
      &lt;xs:element name=&quot;description&quot; type=&quot;xs:string&quot; minOccurs=&quot;0&quot;/&gt; 
      &lt;xs:element name=&quot;quantity&quot; type=&quot;xs:integer&quot; /&gt; 
      &lt;xs:element name=&quot;price&quot; type=&quot;xs:float&quot; /&gt; 
    &lt;/xs:sequence&gt; 
    &lt;xs:attribute name=&quot;sku&quot; type=&quot;xs:string&quot; use=&quot;required&quot; /&gt; 
    &lt;xs:attribute name=&quot;color&quot; type=&quot;xs:string&quot; use=&quot;optional&quot; /&gt; 
  &lt;/xs:complexType&gt; 
&lt;/xs:element&gt; 

&lt;xs:element name=&quot;order&quot;&gt; 
  &lt;xs:complexType&gt; 
    &lt;xs:sequence&gt; 
      &lt;xs:element ref=&quot;product&quot; maxOccurs=&quot;unbounded&quot; /&gt; 
    &lt;/xs:sequence&gt; 
  &lt;/xs:complexType&gt; 
&lt;/xs:element&gt; 
</eg>
</example>
<p>Section <specref ref="exampleProtoGrammars"/> guides you through the process of deriving EXI proto-grammars from the schema components available in the example schema above. EXI grammars in the normalized form that correspond to the proto-grammars are shown in section <specref ref="exampleNormGrammars"/>. Section <specref ref="exampleCompleteGrammars"/> shows the complete EXI grammars for elements &lt;product&gt; and &lt;order&gt;.
</p>
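<p>For orientation, the following is a minimal sketch of an instance that would be valid against the schema fragment above, assuming that both element declarations are global and in no target namespace. The attribute and element values are hypothetical and chosen only for illustration.
</p>
<example>
<head>Illustrative instance document (hypothetical values)</head>
<eg xml:space="preserve">
&lt;order&gt;
  &lt;!-- required &quot;sku&quot; and optional &quot;color&quot; present; one pass through the sequence --&gt;
  &lt;product sku=&quot;A-100&quot; color=&quot;red&quot;&gt;
    &lt;description&gt;Widget&lt;/description&gt;
    &lt;quantity&gt;2&lt;/quantity&gt;
    &lt;price&gt;9.5&lt;/price&gt;
  &lt;/product&gt;
  &lt;!-- &quot;color&quot; omitted; two passes through the sequence, the first without &quot;description&quot; --&gt;
  &lt;product sku=&quot;B-200&quot;&gt;
    &lt;quantity&gt;1&lt;/quantity&gt;
    &lt;price&gt;4.25&lt;/price&gt;
    &lt;description&gt;Spare part&lt;/description&gt;
    &lt;quantity&gt;3&lt;/quantity&gt;
    &lt;price&gt;1.5&lt;/price&gt;
  &lt;/product&gt;
&lt;/order&gt;
</eg>
</example>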
<div2 id="exampleProtoGrammars">
<head>Proto-Grammar Examples</head>
<p>Grammars for element declaration terms "description", "quantity" and "price" are as follows. See section <specref ref="elementTerms"/> for the rules used to derive grammars from element terms.
</p>
<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="termDescription">
<thead>
<tr>
<th align="left" colspan="3">Term_description</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_description</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"description"</emph>) <emph>Term_description</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="80%" id="termQuantity">
<thead>
<tr>
<th align="left" colspan="3">Term_quantity</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"quantity"</emph>) <emph>Term_quantity</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="80%" id="termPrice">
<thead>
<tr>
<th align="left" colspan="3">Term_price</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_price</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"price"</emph>) <emph>Term_price</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>
</example>

<p>Grammars for element particle "description" are derived from <termref def="termDescription"><emph>Term_description</emph></termref> given a { minOccurs } value of 0 and a { maxOccurs } value of 1. See section <specref ref="particles"/> for the rules used to derive grammars from particles.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="particleDescription">
<thead>
<tr>
<th align="left" colspan="3">Particle_description</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_description</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"description"</emph>) <emph>Term_description</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>
</example>

<p>Grammars for element particles "quantity" and "price" are the same as those of their terms (<termref def="termQuantity"><emph>Term_quantity</emph></termref> and <termref def="termPrice"><emph>Term_price</emph></termref>, respectively) because { minOccurs } and { maxOccurs } are both 1.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="particleQuantity">
<thead>
<tr>
<th align="left" colspan="3">Particle_quantity</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"quantity"</emph>) <emph>Term_quantity</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="80%" id="particlePrice">
<thead>
<tr>
<th align="left" colspan="3">Particle_price</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_price</emph><sub>&nbsp;0</sub> :</td></tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE(<emph>"price"</emph>) <emph>Term_price</emph><sub>&nbsp;1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>&nbsp;1</sub> :
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>
</example>

<p>The grammars for the sequence group term in the &lt;product&gt; element declaration are derived from the grammars of its subordinate particles as follows. See section <specref ref="sequenceGroupTerms"/> for the rules used to derive grammars from a sequence group.
</p>

<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>Term_sequence</emph> = <termref def="particleDescription"><emph>Particle_description</emph></termref> &oplus; <termref def="particleQuantity"><emph>Particle_quantity</emph></termref> &oplus; <termref def="particlePrice"><emph>Particle_price</emph></termref>
</td>
</tr>
</tbody></table>

<p>which yields the following grammars for <emph>Term_sequence</emph>.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="termSequence">
<thead>
<tr>
<th align="left" colspan="3"><emph>Term_sequence</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_description</emph><sub>0</sub> : 
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE("description") <emph>Term_description</emph><sub>1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph> <sub>0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph> <sub>1</sub> :  
</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph> <sub>0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph> <sub>0</sub> : 
</td></tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph> <sub>1</sub> : 
</td></tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_price</emph> <sub>0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph> <sub>0</sub> : 
</td></tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph> <sub>1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph> <sub>1</sub> :  
</td></tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
</tbody>
</table>
</example>

<p>Grammars for the particle that is the content model of element &lt;product&gt; are derived from <termref def="termSequence"><emph>Term_sequence</emph></termref> (shown above) given a { minOccurs } value of 1 and a { maxOccurs } value of 2. See section <specref ref="particles"/> for the rules used to derive grammars from particles.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="particleSequence">
<thead>
<tr>
<th align="left" colspan="3"><emph>Particle_sequence</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Term_description</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE("description") <emph>Term_description</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_price</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_description</emph><sub>1,0</sub>  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_price</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>
</example>

<p>Grammars for attribute uses of attributes "sku" and "color" are as follows. See section <specref ref="attributeUses"/> for the rules used to derive grammars from attribute uses.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="useSku">
<thead>
<tr>
<th align="left" colspan="3"><emph>Use_sku</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Use_sku</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Use_sku</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>

<table width="80%" id="useColor">
<thead>
<tr>
<th align="left" colspan="3"><emph>Use_color</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Use_color</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("color") <emph>Use_color</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Use_color</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>
</example>

<p>
Note the subtle difference between grammars <termref def="useSku"><emph>Use_sku</emph></termref> and <termref def="useColor"><emph>Use_color</emph></termref>. In the first grammar of each, only <termref def="useColor"><emph>Use_color</emph></termref> contains a production whose right-hand side starts with EE. This stems from the fact that the schema declares attribute "color" as optional and attribute "sku" as required.
</p>
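<p>As an informal illustration of this difference, and assuming the hypothetical attribute values shown, the first two start tags below conform to the example schema while the third does not, because only "color" may be omitted.
</p>
<example>
<head>Illustrative start tags for element &lt;product&gt; (hypothetical values)</head>
<eg xml:space="preserve">
&lt;!-- both attributes present: AT(&quot;color&quot;) and AT(&quot;sku&quot;) are both matched --&gt;
&lt;product sku=&quot;A-100&quot; color=&quot;red&quot;&gt;

&lt;!-- &quot;color&quot; omitted: permitted, because the first grammar of Use_color has a production starting with EE --&gt;
&lt;product sku=&quot;A-100&quot;&gt;

&lt;!-- &quot;sku&quot; omitted: not schema-valid, because the first grammar of Use_sku has no production starting with EE --&gt;
&lt;product color=&quot;red&quot;&gt;
</eg>
</example>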

<p>Finally, the grammars for the element &lt;product&gt; are derived from the grammars of its attribute uses and content model particle as follows. See section <specref ref="complexTypeGrammars"/> for the rules used to derive grammars from a complex type.
</p>

<table width="100%">
<tbody>
<tr>
<td width="5%"></td>
<td>
<emph>ProtoG_ProductElement</emph> = <termref def="useColor"><emph>Use_color</emph></termref> &oplus; <termref def="useSku"><emph>Use_sku</emph></termref> &oplus; <termref def="particleSequence"><emph>Particle_sequence</emph></termref>
</td>
</tr>
</tbody></table>

<p>which yields the following grammars for element &lt;product&gt;.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="protoProductElement">
<thead>
<tr>
<th align="left" colspan="3"><emph>ProtoG_ProductElement</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Use_color</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("color") <emph>Use_color</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Use_sku</emph> <sub>0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Use_color</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Use_sku</emph> <sub>0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td width="5%"></td>
<td colspan="2">
<emph>Use_sku</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="2">
<emph>Use_sku</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_description</emph><sub>0,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_price</emph><sub>0,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_description</emph><sub>1,0</sub>  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_quantity</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_price</emph><sub>1,0</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>
</example>

<p>The other element declaration, &lt;order&gt;, can be processed in the same fashion as shown above for element &lt;product&gt;, which generates the following proto-grammars.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%">
<!-- thead>
<tr>
<th align="left" colspan="3"><emph>ProtoG_OrderElement</emph></th>
</tr>
</thead -->
<tbody>
<tr>
<td width="5%"></td>
<td width="5%"></td>
<td>

</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>0,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>0,1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>0,1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>1,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,1</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>1,1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>
</example>

<p>In the above grammars, the two grammars <emph>Term_product</emph> <sub>0,1</sub> and <emph>Term_product</emph> <sub>1,1</sub> are redundant because they serve no purpose other than relaying one non-terminal to another. Though it is not required, the uses of non-terminals <emph>Term_product</emph> <sub>0,1</sub> and <emph>Term_product</emph> <sub>1,1</sub> are both replaced by <emph>Term_product</emph> <sub>1,0</sub>, which produces the following modified proto-grammars.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="protoOrderElement">
<thead>
<tr>
<th align="left" colspan="3"><emph>ProtoG_OrderElement</emph></th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="2">

</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>

</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>0,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>0,1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>1,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="2">
<emph>Term_product</emph> <sub>1,1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>Term_product</emph> <sub>1,0</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr>

</tbody>
</table>
</example>
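<p>To make the effect of the modified proto-grammars concrete, the sketch below annotates a hypothetical &lt;order&gt; instance with the production that is matched at each &lt;product&gt; start tag and at the end of &lt;order&gt;. The attribute values are illustrative only, and the content of each &lt;product&gt; is left out because it is governed by the grammars of &lt;product&gt; rather than by <emph>ProtoG_OrderElement</emph>.
</p>
<example>
<head>Illustrative traversal of ProtoG_OrderElement (hypothetical instance)</head>
<eg xml:space="preserve">
&lt;order&gt;
  &lt;!-- Term_product 0,0 : SE(&quot;product&quot;) Term_product 1,0 --&gt;
  &lt;product sku=&quot;A-100&quot;&gt; &lt;!-- content omitted --&gt; &lt;/product&gt;
  &lt;!-- Term_product 1,0 : SE(&quot;product&quot;) Term_product 1,0 --&gt;
  &lt;product sku=&quot;B-200&quot;&gt; &lt;!-- content omitted --&gt; &lt;/product&gt;
  &lt;!-- Term_product 1,0 : EE --&gt;
&lt;/order&gt;
</eg>
</example>
<p>Because <emph>Term_product</emph> <sub>1,0</sub> refers back to itself after each SE("product"), any further &lt;product&gt; elements would be matched by the same production, which reflects the unbounded { maxOccurs } of the particle.
</p>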

</div2>

<div2 id="exampleNormGrammars">
<head>Normalized Grammar Examples</head>

<p>The element proto-grammars <termref def="protoProductElement"><emph>ProtoG_ProductElement</emph></termref> and <termref def="protoOrderElement"><emph>ProtoG_OrderElement</emph></termref> produced in the previous section can be turned into their normalized forms which are shown below with an event code assigned to each production. See section <specref ref="normalizedGrammars"/> for the process that converts proto-grammars into normalized grammars, and section <specref ref="eventCodeAssignment"/> for the rules that determine the event codes of productions in normalized grammars.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="normProductElement">
<thead>
<tr>
<th align="left" colspan="4"><emph>NormG_ProductElement</emph></th>
</tr>
<tr>
<th colspan="3"></th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Use_color</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("color") <emph>Use_color</emph> <sub>1</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Use_color</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td width="5%"></td>
<td colspan="2">
<emph>Use_sku</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Use_sku</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
<td>1</td>
</tr>

<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_description</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_quantity</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>0,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>0,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_price</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
<td>2</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_description</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_description</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_quantity</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_quantity</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td></td>
<td colspan="2">
<emph>Term_price</emph><sub>1,0</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>1,1</sub> 
</td>
</tr>
<tr>
<td colspan="3">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_price</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="80%" id="normOrderElement">
<thead>
<tr>
<th align="left" colspan="4"><emph>NormG_OrderElement</emph></th>
</tr>
<tr>
<th colspan="3"></th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Term_product</emph> <sub>0,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>0</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Term_product</emph> <sub>1,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE
</td>
<td>1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody>
</table>
</example>

<p>Note that some grammars that were present in the proto-grammars have been removed from the normalized grammars. Those grammars were culled upon the completion of grammar normalization because their <emph>LeftHandSide</emph> non-terminals are not referenced from the <emph>RightHandSide</emph> of any available production.
</p>
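<p>As an informal walk-through, the hypothetical &lt;product&gt; instance below is annotated with the production of <emph>NormG_ProductElement</emph> matched by each event, together with the event code listed for that production in the table above. The values are illustrative only. The CH and EE events inside each child element are handled by that child's own grammar and are not shown, and the event codes are those of the normalized grammars only.
</p>
<example>
<head>Illustrative traversal of NormG_ProductElement (hypothetical instance)</head>
<eg xml:space="preserve">
&lt;!-- Use_color 0 : AT(&quot;sku&quot;) Use_sku 1, event code 1 --&gt;
&lt;product sku=&quot;A-100&quot;&gt;
  &lt;!-- Use_sku 1 : SE(&quot;quantity&quot;) Term_quantity 0,1, event code 1 --&gt;
  &lt;quantity&gt;2&lt;/quantity&gt;
  &lt;!-- Term_quantity 0,1 : SE(&quot;price&quot;) Term_price 0,1, event code 0 --&gt;
  &lt;price&gt;9.5&lt;/price&gt;
  &lt;!-- Term_price 0,1 : EE, event code 2 --&gt;
&lt;/product&gt;
</eg>
</example>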

</div2>

<div2 id="exampleCompleteGrammars">
<head>Complete Grammar Examples</head>
<p>The normalized grammars <termref def="normProductElement"><emph>NormG_ProductElement</emph></termref> and <termref def="normOrderElement"><emph>NormG_OrderElement</emph></termref> are augmented with undeclared productions to become complete grammars.
See section <specref ref="undeclaredProductions"/> for the process that augments normalized grammars with productions that represent terminal symbols not declared in schemas.
Productions that are not needed under the fidelity options in effect are pruned using the rules described in section <specref ref="pruningProductions"/>.
The resulting grammar with the default fidelity option settings is shown below.
</p>

<example>
<table width="80%">
<tbody>
<tr>
<td>&nbsp;</td>
</tr>
</tbody>
</table>
<table width="80%" id="completeProductElement">
<thead>
<tr>
<th align="left" colspan="4"><emph>CompleteG_ProductElement</emph></th>
</tr>
<tr>
<th colspan="3"></th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Use_color</emph> <sub>0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
AT("color") <emph>Use_color</emph> <sub>1</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(xsi:type) <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(xsi:nil) <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.2</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.3</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("color") [schema-invalid value] <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.4.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") [schema-invalid value] <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.4.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) [schema-invalid value] <emph>Use_color</emph> <sub>0</sub>
</td>
<td>2.4.2</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.5</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.6</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
<tr>
<td></td>
<td colspan="3">
<emph>Use_color</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) <emph>Use_color</emph> <sub>1</sub>
</td>
<td>1.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT("sku") [schema-invalid value] <emph>Use_color</emph> <sub>1</sub>
</td>
<td>1.2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) [schema-invalid value] <emph>Use_color</emph> <sub>1</sub>
</td>
<td>1.2.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>1.3</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>1.4</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Use_sku</emph> <sub>1</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) [schema-invalid value] <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.3</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Use_sku</emph> <sub>1</sub>
</td>
<td>2.4</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_description</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_description</emph><sub>0,1</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_description</emph><sub>0,1</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_quantity</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>0,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_quantity</emph><sub>0,1</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_quantity</emph><sub>0,1</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_price</emph><sub>0,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("description") <emph>Term_description</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE  <sub>&nbsp;</sub>
</td>
<td>2</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_price</emph><sub>0,1</sub>
</td>
<td>3.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_price</emph><sub>0,1</sub>
</td>
<td>3.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_description</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("quantity") <emph>Term_quantity</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_description</emph><sub>1,1</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_description</emph><sub>1,1</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_quantity</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("price") <emph>Term_price</emph><sub>1,1</sub> 
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_quantity</emph><sub>1,1</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_quantity</emph><sub>1,1</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<tr>
<td></td>
<td colspan="3">
<emph>Term_price</emph><sub>1,1</sub> :  
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_price</emph><sub>1,1</sub>
</td>
<td>1.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_price</emph><sub>1,1</sub>
</td>
<td>1.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody>
</table>

<table width="80%" id="completeOrderElement">
<thead>
<tr>
<th align="left" colspan="4"><emph>CompleteG_OrderElement</emph></th>
</tr>
<tr>
<th colspan="3"></th>
<th align="left">Event Code</th>
</tr>
</thead>
<tbody>
<tr>
<td width="5%"></td>
<td colspan="3">
<emph>Term_product</emph> <sub>0,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(xsi:type) <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(xsi:nil) <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.2</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.3</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
AT(*) [schema-invalid value] <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.4.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.5</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_product</emph> <sub>0,0</sub>
</td>
<td>1.6</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>

<!-- tr>
<td width="5%"></td>
<td colspan="3">
<emph>Term_product&nbsp;<sup>2</sup></emph> <sub>0,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td width="5%"></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>UndeclaredEE</emph>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
<emph>UndeclaredContentItems</emph> (2.0) &nbsp;<emph>Term_product&nbsp;<sup>2</sup></emph> <sub>0,0</sub>
</td>
<td>2.0</td>
</tr>

<tr>
<td colspan="4">&nbsp;</td>
</tr -->

<tr>
<td></td>
<td colspan="3">
<emph>Term_product</emph> <sub>1,0</sub> :
</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE("product") <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
EE <sub>&nbsp;</sub>
</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
SE(*) <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>2.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>
CH [schema-invalid value] <emph>Term_product</emph> <sub>1,0</sub>
</td>
<td>2.1</td>
</tr>
<tr>
<td colspan="4">&nbsp;</td>
</tr>
</tbody>
</table>
</example>
</div2>
</inform-div1>
<inform-div1 id="changes">
<head>Recent Specification Changes</head>
<div2 id="changes5">
<head>
Changes from Fourth Public Working Draft
</head>
<ulist>
<item>
Added the section <specref ref="mediaTypeRegistration"/>.
</item>
</ulist>
</div2>
<div2 id="changes4">
<head>
Changes from Third Public Working Draft
</head>
<ulist>
<item>
Added the section <specref ref="conformance"/>.
</item>
</ulist>
<ulist>
<item>
<termref def="key-exiCookie">EXI Cookie</termref> was introduced in the EXI header in order to provide a facility to distinguish EXI streams from a broader range of document types used on the Web. (see <specref ref="EXICookie"/>)
</item>
</ulist>
<ulist>
<item>
Header option fields <termref def="key-valueMaxLengthOption">valueMaxLength</termref> and <termref def="key-valuePartitionCapacityOption">valuePartitionCapacity</termref> have been added to the EXI options to limit the length of a string and the total number of strings that are put into value partitions of a string table.
</item>
</ulist>
<ulist>
<item>
Added support for mixed and empty content models in complex type grammar generation.
(see <specref ref="complexTypeGrammars"/>)
</item>
</ulist>
<ulist>
<item>
Described how built-in element grammars handle xsi:type and xsi:nil attribute occurrences in schema-informed EXI streams.
(see <specref ref="builtinElemGrammars"/>)
</item>
</ulist>
<ulist>
<item>
Defined the grammar that represents xsd:anyType.
(see <specref ref="anyTypeGrammar"/>)
</item>
</ulist>
<ulist>
<item>
Described the restricted character sets used for certain typed values when <termref def="key-preserveLexicalValuesOption">preserve.lexicalValues</termref> is true.
(see <specref ref="builtInRestrictedStrings"/>)
</item>
</ulist>
<ulist>
<item>
Added a section <specref ref="informedElementFragGrammar"/>
describing the content of elements declared with the same qname when they occur inside an EXI fragment or EXI Element Fragment.
</item>
</ulist>
<ulist>
<item>
Added the <termref def="key-selfContained">selfContained</termref> option for creating elements that can be indexed for random access.
(see <specref ref="options"/>)
</item>
</ulist>
<ulist>
<item>
Added a statement of the Unsigned Integer sizes that EXI processors SHOULD and MUST support.
(see <specref ref="encodingUnsignedInteger"/>)
</item>
</ulist>
<ulist>
<item>
Changed the order of CH events during event code assignment. 
(see <specref ref="eventCodeAssignment"/>)
</item>
</ulist>
<ulist>
<item>
Added semantics for empty "schemaID" element value.
(see <specref ref="options"/>)
</item>
</ulist>
<ulist>
<item>
The term "CODEC" has been replaced by the more appropriate term "datatype representation" throughout this document.
</item>
</ulist>
<ulist>
<item>
Added support for parameterized attribute wildcards in the generation of complex type grammars.
(see <specref ref="complexTypeGrammars"/>)
</item>
</ulist>

</div2>
<div2 id="changes3">
<head>
Changes from Second Public Working Draft
</head>
<ulist>
<item>
Added <termref def="key-strictOption">strict option</termref> (see <specref ref="options"/> and <specref ref="addingProductionsStrict"/>).
</item>
</ulist>
<ulist>
<item>
The order of content items for the NS event has been corrected (see <specref ref="eventTypes"/>).
</item>
</ulist>
<ulist>
<item>
Improved the description of <termref def="key-options">EXI Options document</termref> to make it clear that its representation does not start with an EXI header (see section <specref ref="options"/>). 
</item>
</ulist>
<ulist>
<item>
Reworked the section <specref ref="undeclaredProductions"/> for grammar system accuracy and to describe how grammars are augmented with undeclared productions when the <termref def="key-strictOption">strict option</termref> is true.
<!--
Reworked the note for the the set of productions <emph>UndeclaredStartTagItems</emph> to articulate the role of each participating production (see section <specref ref="undeclaredTerminalSymbols"/>).
-->
</item>
</ulist>
<ulist>
<item>
The indicator content item in the NS event type has been renamed to "local-element-ns", and its description has been improved for clarity (see section <specref ref="streams"/>).
</item>
</ulist>
<ulist>
<item>
The calculation of the MonthDay component used in the Date-Time representation has been corrected (see section <specref ref="encodingDateTime"/>).
</item>
</ulist>
<ulist>
<item>
<emph>TypeEmpty</emph> is now created for each type grammar to facilitate xsi:nil handling (see section <specref ref="typeGrammars"/>).
</item>
</ulist>
<ulist>
<item>
Described how the use of substitution groups in schemas affects the grammars (see section <specref ref="elementTerms"/>).
</item>
</ulist>
<ulist>
<item>
Described how the namespace constraints of wildcard terms are factored into the grammars (see section <specref ref="wildcardTerms"/>).
</item>
</ulist>
<ulist>
<item>
Regular expressions are not used to constrain character sets when they contain certain category escapes (see section <specref ref="regexToCharset"/>).
</item>
</ulist>
</div2>
<div2 id="changes2">
<head>
Changes from First Public Working Draft
</head>
<ulist>
<item>

Specified how schema-informed grammars are derived from available XML Schemas (see section <specref ref="informedElemGrammars"/>).

</item>
</ulist>
<ulist>
<item>

Described how QNames are represented with prefixes when prefix preservation is turned on (see section <specref ref="encodingQName"/>).

</item>
</ulist>
<ulist>
<item>

The <termref def="key-alignmentOption">alignment</termref> option was introduced in <termref def="key-options">EXI Options</termref> to support <termref def="key-precompression">pre-compression</termref> streams as well as plain <termref def="key-bytealignment">byte-alignment</termref> streams, in addition to the default <termref def="key-unaligned">bit-packed</termref> representation.

</item>
</ulist>
<ulist>
<item>

Described how the presence of pattern facets affects the encoding of the EXI Boolean (see section <specref ref="encodingBoolean"/>) and EXI String (see section <specref ref="encodingString"/>) representations.

</item>
</ulist>
<ulist>
<item>

Values typed as integer with a bounded range of 4095 or smaller are now represented as <emph>n</emph>-bit Unsigned Integers (see section <specref ref="encodingBoundedUnsigned"/>).

</item>
</ulist>
<ulist>
<item>

Added a section <!-- (see section <specref ref="otherProposedFeatures"/>) --> that lists additional items that have been suggested and are currently under consideration.

</item>
</ulist>
</div2>
</inform-div1>
<inform-div1 id="acknowledgements">
<head>Acknowledgements</head>

<p>This document is the work of the <loc href="http://www.w3.org/XML/EXI/">Efficient XML Interchange (EXI) WG</loc>.</p>

<p>Members of the Working Group are (at the time of writing, sorted alphabetically by last name): </p>
<ulist>
<item>Carine Bournez, W3C/ERCIM (<emph>staff contact</emph>)</item>
<item>Don Brutzman, Web3D Consortium</item>
<item>Alex Ceponkus, AgileDelta, Inc.</item>
<item>Michael Cokus, MITRE Corporation 
(<emph>co-chair</emph>)
</item>
<item>Roger Cutler, Chevron</item>
<item>Ed Day, Objective Systems, Inc.</item>
<item>Joerg Heuer, Siemens AG</item>
<item>Alan Hudson, Web3D Consortium</item>
<item>Takuki Kamiya, Fujitsu Laboratories of America, Inc. 
(<emph>co-chair</emph>)
</item>
<item>Jaakko Kangasharju, University of Helsinki</item>
<item>Richard Kuntschke, Siemens AG</item>
<item>Don McGregor, Web3D Consortium</item>
<item>Daniel Peintner, Siemens AG</item>
<item>Santiago Pericas-Geertsen, Sun Microsystems, Inc.</item>
<item>Liam Quin, W3C/MIT (<emph>staff contact</emph>)</item>
<item>Rich Rollman, AgileDelta, Inc.</item>
<item>Paul Sandoz, Sun Microsystems, Inc.</item>
<item>John Schneider, AgileDelta, Inc.</item>
<item>Young Wang, Intel Corporation</item>
<item>Greg White, Stanford University (<emph>former co-chair</emph>)</item>
</ulist>
<p>The EXI Working Group would like to acknowledge the following former members of the group for the leadership, guidance and expertise they provided throughout their individual tenures in the WG (sorted alphabetically by last name):
</p>
<ulist>
<item>Robin Berjon, Expway (<emph>former co-chair</emph>) (until 17 October 2006) </item>
<item>Oliver Goldman, Adobe Systems, Inc. (<emph>former co-chair</emph>) (until 08 June 2006) </item>
<item>Peter Haggar, IBM (until 07 March 2007) </item>
<item>
Kimmo Raatikainen, Nokia (until 13 March 2008)
</item>
<item>Paul Thorpe, OSS Nokalva, Inc. (until 11 Sept 2007)</item>
<item>
Daniel Vogelheim, Invited Expert (<emph>former co-chair</emph> then from Siemens AG) (until 15 July 2008)
</item>
<item>Stephen Williams, High Performance Technologies, Inc. (until 30 June 2008)</item>
</ulist>
<p>
The EXI Working Group owes much of the progress of its work to our distinguished colleague from Nokia, Kimmo Raatikainen (1955-2008), who succumbed to an ailment on March 13, 2008. His breadth of knowledge, depth of insight, ingenuity and courage to speak up constantly shed light for us whenever the group seemed to stray into a futile path of disagreement. We shall never forget and will always appreciate his presence among us, and his great contribution, which is present in every aspect of our work.
</p>
</inform-div1>
</back></spec>
