W3C

Efficient XML Interchange (EXI) Profile

W3C Working Draft 16 August 2011

This version:
http://www.w3.org/TR/2011/WD-exi-profile-20110816/
Latest version:
http://www.w3.org/TR/exi-profile/
Editors:
Youenn Fablet, Canon Research Centre France
Daniel Peintner, Siemens AG

Abstract

This document describes a profile of the EXI 1.0 specification that allows bounding the memory consumption of EXI internal structures such as grammars and string partitions.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Ths is the first Public Working Draft of the Efficient XML Interchange (EXI) Profile specification. It is intended for review by W3C members and other interested parties. Although still incomplete, this document identifies all the indexing structures that are currently in the final scope of this profile. Except for the prefix partitions, the processing rules that kick in when a memory bound is reached are defined. This document also includes Editorial notes on topics that the Working Group is still discussing. Feedback is welcome on the document content as well as on topics covered by the editorial notes.

Please send comments about this document to the public-exi@w3.org mailing list (Archives).

This document has been produced by the Efficient XML Interchange Working Group as part of the W3C XML Activity. The goals of the Efficient XML Interchange (EXI) Format are discussed in the Efficient XML Interchange (EXI) Format 1.0 document. The authors of this document are the members of the Efficient XML Interchange Working Group.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1. Introduction
2. Capping Rules
    2.1 Name Partitions Capping Definition
        2.1.1 Uri Partition Capping Definition
        2.1.2 Local-name Partition Capping Definition
        2.1.3 Prefix Partition Capping Definition
    2.2 Grammar Capping Definition
        2.2.1 Built-in Capping Definition
    2.3 Local Value Capping Definition
3. Capping Parameters
    3.1 Name Partitions Capping Thresholds
    3.2 Grammar Capping Thresholds
    3.3 Local Value Partitions Threshold
4. Integration with EXI 1.0
    4.1 EXI Stream Detection
    4.2 EXI Profile Header Parameters

Appendix

A References


1. Introduction

Many device classes and use-cases desire to use EXI as its exchange format. Due to various restrictions some of those application areas are not capable or allowed to require arbitrary memory growth at runtime. Certain evaluations of EXI in the context of such areas exposed some challenges to the attempt to restrict memory usage predictably within their limited respective threshold.

This EXI profile document specifies capping rules to ensure that the memory restrictions are respected. Section 2. Capping Rules states these rules that have to be followed by any EXI profile. Section 3. Capping Parameters provides means to parameterize these rules according different memory restriction levels. Other groups and standardization bodies may rely on these capping parameters or build their own profile if the suggested thresholds and parameters are not sufficient. Section 4. Integration with EXI 1.0 describes the integration of the EXI profile with EXI 1.0. In particular, it details how the profile parameters can be described in the EXI option header and how EXI Profile streams can be distinguished from EXI 1.0 streams.

As long as the introduced memory restrictions are not reached, an EXI Profile stream is fully compatible with an EXI 1.0 Stream.

Editorial note 
Control last compatibility statement after all parameters are elaborated. Check also how this statement fits with the EXI header subject.

2. Capping Rules

The [Efficient XML Interchange (EXI) Format 1.0] specification defines learning mechanisms that enable a very compact representation of the XML infoset. These mechanisms are based on EXI internal structures that are augmented during the EXI processing, for instance when a uri is added to the uri partition or when a production is inserted in a built-in element grammar.

This section defines the processing rules that enable capping the memory size of these structures. Generally, EXI profile stream processing must proceed without adding or inserting an item in a structure whenever it would exceed the defined memory restriction of that structure. This section also defines rules that ensure that different structures remain consistent one with the other. For instance, the rules ensure that a local-name can only be indexed if its related uri can also be indexed.

An EXI event may lead to the augmentation of several different structures. This specification assumes the following augmentation order:

  1. uri partition

  2. local-name partition

  3. prefix partition

  4. production creation and event-code incrementation

  5. value partition(s)

An implementation is free to choose any approach and any order as long as the result remains the same. The capping rules for an event value depend on the value datatype representation. If the value is represented as a string, the local value partition capping rules are used. If the value is represented as a QName, the rules are first applied to the QName uri and then to the QName local-name.

2.1 Name Partitions Capping Definition

Name partitions refer to the uri, prefix and local-name partitions. To limit the memory growing, string insertion in the name partitions is stopped at a given threshold according the rules described in the following sub-sections. By definition, when a string is not added to a partition, no compact identifier is assigned to the given string.

2.1.1 Uri Partition Capping Definition

A uri string is represented according section 7.3.2 of the EXI 1.0 specification to the exception that the uri string is NOT added to the uri partition if any of the following conditions is met:

  1. Adding the uri to the uri partition would exceed the defined uri partition restriction.

2.1.2 Local-name Partition Capping Definition

A local-name string is represented according section 7.3.3 of the EXI 1.0 specification to the exception that the local-name string is NOT added to the local-name partitions if any of the following conditions is met:

  1. Adding the local-name to the local-name partition would exceed the defined local-name partition restriction.

  2. The local-name is associated with a namespace uri that is not present in the uri partition.

2.1.3 Prefix Partition Capping Definition

A prefix string can be represented according section 7.3.2 of the EXI 1.0 specification if any of the following conditions is met:

  1. Adding the prefix to the prefix partition would NOT exceed the defined prefix partition restriction.

  2. The prefix is associated with a namespace uri that is present in the uri partition.

Otherwise, a prefix string cannot be represented.

Editorial note 
In some cases prefixes cannot be represented. The working group is currently investigating the issue.

2.2 Grammar Capping Definition

2.2.1 Built-in Capping Definition

To limit the memory growing due to grammar learning, the production creation and event-code incrementation is stopped at a given threshold according the following rules. This applies to built-in element and built-in fragment grammars.

A production is not added if any of the following conditions is met:

  1. Adding a production to the grammar would exceed the defined grammar restriction.

  2. The uri of the element QName, associated to its grammar, is not present in the uri partition.

  3. The local-name of the element QName, associated to its grammar, is not present in its local-name partition.

  4. The uri of the qname associated to the production, if any, is not present in the uri partition.

  5. The local-name of the qname associated to the production, if any, is not present in its local-name partition.

Note in particular that no other production than the default productions defined in sections 8.4.2 and 8.4.3 of the EXI 1.0 specification can be added to a built-in grammar whose element QName is not totally indexed.

By definition, a production that is not added to its grammar is evaluated as described in sections 8.4.2 and 8.4.3 of the EXI 1.0 specification to the exception that:

  • Step 2 is skipped for a built-in element grammar production of the form LeftHandSide: AT (*) RightHandSide.

  • Steps 3 to 5 are skipped for a built-in element or fragment grammar production of the form LeftHandSide: SE (*) RightHandSide.

  • Step 2 is skipped for a built-in element grammar production of the form LeftHandSide: CH RightHandSide.

  • Steps 1 and 2 are skipped for a built-in element grammar production of the form LeftHandSide: EE

2.3 Local Value Capping Definition

Local value partitions grow with the insertion of string values to the partitions, as described in section 7.3.3 of the EXI specification. To limit the memory growing, local value partitions insertion is stopped at a given threshold according the following rules.

A string value is represented according section 7.3.3 of the EXI 1.0 specification to the exception that it is NOT added to its associated local value partition if any of the following conditions is met:

  1. Adding the string value to its associated local value partition would exceed the defined local value partition restriction.

  2. The uri of the qname associated to the string value is not present in the uri partition.

  3. The local-name of the qname associated to the string value is not present in its local-name partition.

3. Capping Parameters

This section describes the threshold rules and associated parameters. Hereby it defines when a given operation would exceed the defined memory restrictions (see 2. Capping Rules).

3.1 Name Partitions Capping Thresholds

In the remainder of the sections, we will refer to name partitions entries, uri entries, local-name entries or prefix entries. In all cases, these terms refer to entries that are added after the pre-population of the name partitions. Schema-informed or schema-less name partitions pre-population is not constrained in this section and pre-populated entries are never used in the memory restriction computations.

Editorial note 
The working group currently elaborates different ways on how to restrict name partitions. The general idea is to introduce options to restrict the number of characters and/or the number of entries for different partitions. The chosen approach is not set yet and will be published in a future version of this document.

3.2 Grammar Capping Thresholds

[Definition: A built-in element grammar is considered to be an evolving built-in element grammar if a production has been dynamically inserted within the grammar. ]

[Definition: The maximumNumberOfEvolvingBuiltInElementGrammars option is the maximum number of elements for which evolving built-in element grammars can be instantiated. ]

[Definition: The maximumNumberOfBuiltInProductions option is the maximum number of top-level productions that can be dynamically inserted in built-in fragment and built-in element grammars. ] Note that only dynamically inserted top level productions are counted. In particular, the top level EE productions of the built-in ElementContent grammar are not counted since they are added when creating each ElementContent grammar.

Grammar augmentation of a built-in grammar does exceed grammar restriction if any of following condition is true:

  1. The number of dynamically inserted top level productions of all built-in fragment and element grammars is equal or greater than the maximumNumberOfBuiltInProductions value.

  2. The augmentation process makes it to be a new evolving built-in element grammar while the number of already existing evolving built-in element grammars is equal or greater than the maximumNumberOfBuiltInElementGrammars value.

Note that non evolving built-in element grammars can be shared by different elements and do not cause to exceed grammar restrictions in any case.

By default the maximumNumberOfBuiltInElementGrammars and maximumNumberOfBuiltInProductions values are unbounded.

3.3 Local Value Partitions Threshold

[Definition: The localValuePartitions option is a Boolean used to indicate whether local value partitions are used. ] The value "false" indicates that no local value patition is used while "true" represents the behavior of the EXI 1.0 specification.

A local value partition does exceed the defined local value partition restriction if the following condition is true:

  1. The value localValuePartitions is set to "false".

By default the localValuePartitions value "true" is assumed.

Editorial note 
The group is still assessing the benefit of this parameter. Besides more flexible solutions are discussed.

4. Integration with EXI 1.0

This section describes how this profile integrates with EXI1.0. In particular, it describes how an implementation can detect whether an EXI stream follows this profile or not. It also describes how the parameters, described in the profile, can be communicated in the EXI stream.

4.1 EXI Stream Detection

Editorial note 
The working group is considering different approaches such as defining a new version number or making use of the extensibility of the current EXI Header Options document. The solution space is not limited to these approaches.

4.2 EXI Profile Header Parameters

Editorial note 
Define which parameters appear in the header, if any and where. Default values will be defined in the above section, for instance like it was done in Table 5-1 of the EXI specification.

A References

Efficient XML Interchange (EXI) Format 1.0
Efficient XML Interchange (EXI) Format 1.0, John Schneider and Takuki Kamiya, Editors. World Wide Web Consortium. The latest version is available at http://www.w3.org/TR/exi/. (See http://www.w3.org/TR/2011/REC-exi-20110310/.)
EXI Evaluation Note
Efficient XML Interchange Evaluation, Carine Bournez, Editor. World Wide Web Consortium. The latest version is available at http://www.w3.org/TR/exi-evaluation/. (See http://www.w3.org/TR/2009/WD-exi-evaluation-20090407/.)
XML Schema Datatypes
XML Schema Part 2: Datatypes Second Edition, P. Byron and A. Malhotra, Editors. World Wide Web Consortium, 2 May 2001, revised 28 October 2004. The latest version is available at http://www.w3.org/TR/xmlschema-2. (See http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/.)