W3C

List of comments on “Efficient XML Interchange (EXI) Format 1.0” (dated 19 September 2008)

There are 47 comments (sorted by their types, and the section they are about).

1-20 21-40 41-47

question comments

Comment LC-2192
Commenter: Jochen Darley <joda@upb.de> (archived message)
Context: in
assigned to Takuki Kamiya
Resolution status:

Hello EXI WG,

I'll just start with an example scenario:

Let's assume www.markmail.org wants to use XML compression for their
services. Markmail offers personalized feeds of news and mailing lists
to its customers (as RSS/Atom feeds). The goal is to allow customers to
receive their personalized feed as a compressed XML stream.

If Markmail implemented the compressed streams by compressing each
personalized stream by itself, they would need a lot of resources. My
assumption is that they would have to use a separate EXI compressor for
each of the thousands of compressed customer streams.

The solution would be to pre-compress the feed's entries and just copy
them into the customized streams. Markmail can't use a single continuous
EXI stream because:

1) EXI has a global string table which can't be reset per block
2) EXI enforces a fixed blocksize "n" except for the last block

My solution would be to pre-compress multiple XML fragments and then
copy the compressed fragments into each customer's personalized stream.
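
The idea of copying independently compressed fragments into per-customer streams can be sketched as follows. This is a minimal illustration only: zlib stands in for an EXI compressor (no EXI library is assumed), and the 4-byte length prefix and all function names are assumptions made for this sketch, not part of EXI.

```python
import struct
import zlib


def precompress(fragment: str) -> bytes:
    """Compress one feed entry on its own, so it shares no state with other entries."""
    body = zlib.compress(fragment.encode("utf-8"))
    # The length prefix (an assumption of this sketch) marks the block boundary.
    return struct.pack(">I", len(body)) + body


def build_stream(blocks: list) -> bytes:
    """Assemble a customer stream by copying already-compressed blocks."""
    return b"".join(blocks)


def read_stream(stream: bytes) -> list:
    """Decode a stream produced by build_stream back into its fragments."""
    fragments = []
    pos = 0
    while pos < len(stream):
        (size,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        fragments.append(zlib.decompress(stream[pos:pos + size]).decode("utf-8"))
        pos += size
    return fragments


# Each entry is compressed once and then reused for every customer.
entries = {
    "exi": "<entry><title>EXI Last Call</title></entry>",
    "rss": "<entry><title>RSS news</title></entry>",
    "atom": "<entry><title>Atom news</title></entry>",
}
cache = {key: precompress(xml) for key, xml in entries.items()}

alice_stream = build_stream([cache["exi"], cache["atom"]])
bob_stream = build_stream([cache["rss"], cache["exi"]])
```

In this sketch every block starts from an empty compression state and may have any size, which corresponds to the two properties (a per-block reset and a variable block size) that the questions below ask about.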

My questions:

1) How will EXI support such a compressed, streaming
scenario?

2) Should EXI support this scenario?

3) What are the design intentions for the fixed blocksize?

4) Is it acceptable to remove the fixed blocksize?

5) Can a mode be added to EXI which resets the string table
for each block?

6) What are the design choices/constraints that require a global
string table or the fixed blocksize?

Regards,
Jochen Darley

Comment LC-2172
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Takuki Kamiya
Resolution status:

1) Some facets are supported, like minInclusive or maxExclusive.
What about support for the length, minLength and maxLength facets, which could be useful to better encode string or list sizes?
It should not be too difficult to support them based on the current facet support.
Is there a rationale for not including these facets?
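
To make the potential saving concrete, here is a minimal sketch. It is not part of the EXI draft: the function `length_prefix_bits` is hypothetical and simply assumes that a length bounded by minLength and maxLength could be written as an offset from minLength in a fixed number of bits.

```python
def length_prefix_bits(min_length: int, max_length: int) -> int:
    """Bits needed to encode a length known to lie in [min_length, max_length]."""
    if min_length < 0 or max_length < min_length:
        raise ValueError("expected 0 <= min_length <= max_length")
    distinct_lengths = max_length - min_length + 1
    # ceil(log2(n)) computed with integers only.
    return (distinct_lengths - 1).bit_length()


fixed = length_prefix_bits(8, 8)      # length facet: the length is implied
short = length_prefix_bits(1, 4)      # minLength=1, maxLength=4
byte = length_prefix_bits(0, 255)     # maxLength=255
```

Under this assumption a fixed-length string would need no length prefix at all, and a string limited to 255 characters would need exactly one byte.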

Comment LC-2173
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Michael Cokus
Resolution status:

2) Guidelines for schema modeling
Is there any guideline regarding the relationship between EXI and schema modeling?
Guidelines would be useful to understand the impact of some schema modeling decisions on EXI encoding/decoding in terms of efficiency and compression.
For instance, it seems that the more global constructs (elements, types, attributes) a schema has, the bigger the generated grammars will be, since all global schema constructs need to be kept (right?).
Having a lot of xs:all or maxOccurs="999" may also hurt efficiency.
See also question 3)

Comment LC-2193
Commenter: Youenn Fablet <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Takuki Kamiya
Resolution status:

5) ..............

Additionally, while EXI provides great flexibility in the amount of schema put in grammars,
the schemaID mechanism seems very minimal.
It seems that interoperable uses of schema-informed EXI will greatly restrict the use of this flexibility.
Is there additional work in that area that could or will be conducted?

Comment LC-2174
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to John Schneider
Resolution status:

3) DataTypeRepresentationType question
I would like a confirmation of the current DataTypeRepresentationType behaviour.
Let's have a schema with the following attribute definition:
<xs:attribute name="test" type="xs:string"/>
In that case, the only way to change the encoding for @test values with the DataTypeRepresentationType feature
is to redefine xs:string, which may have a great impact.
If we only want to change the @test values with the DataTypeRepresentationType feature, we would need to
change the schema as follows:
<xs:simpleType name="mystring">
<xs:restriction base="xs:string"/>
</xs:simpleType>
<xs:attribute name="test" type="mystring"/>
DataTypeRepresentationType could then be used to redefine mystring.
Is that correct?
If so, interoperability will generally be lost, since interoperable use of DataTypeRepresentationType is currently limited to redefining the predefined types of XML Schema Part 2 (end of section 7.4).
What about extending that behaviour to all simple types that have been gathered by consuming the schema in use?
Is there any rationale behind that specific constraint?

Comment LC-2198
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Richard Kuntschke
Resolution status:

Dear EXI WG,

I would like to have some clarification on two cases regarding SE(*) grammar selection.

0) A schema with several element definitions for the same QName.
We can have a schema with several local element definitions and at most one global element definition with the same QName.
I assume that we generate as many grammars as needed for the same QName element and that the selection of the right grammar in schema-informed mode is done using scope information. Is that assumption right or is a different approach being used?

1) Wildcard SE(*).
Which grammar should I pick for a SE(*) belonging to a wildcard term?

- If I have a global element definition and one or more local element definitions, should I pick the global element grammar?

- If I have only one local element definition, should I pick the local element grammar or pick/create a built-in grammar?
I did not find much description of this in the wildcard section. Some guidance would be good there.

2) Built in SE(*).
Which grammar should I pick for a SE(*) belonging to a built-in grammar?
If I have a global element definition (plus maybe local element definitions), should I pick/create a built-in grammar or the global element grammar?
If I have a local element definition, should I pick a built-in grammar or the local element grammar?
My understanding of the current spec (see the semantics section of 8.4.3) is that a SE(*) belonging to a built-in grammar may only lead to a built-in grammar for its content, but perhaps my understanding is too restrictive?
Since we can go from a built-in grammar to a schema-informed grammar using xsi:type, I would hope that at least when we have a GED grammar, we are able to go from a built-in to a schema-informed grammar directly through the SE mechanism.
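
The alternatives in cases 0) to 2) can be made concrete with a small sketch. It shows only one possible reading (scope-based lookup for declared elements; global element grammar, else built-in grammar, for SE(*)); the draft text under discussion does not confirm it. `GrammarRegistry`, its methods and the string labels standing in for grammars are all hypothetical.

```python
from typing import Dict, Optional, Tuple

QName = Tuple[str, str]  # (namespace URI, local name)
BUILT_IN = "built-in element grammar"


class GrammarRegistry:
    """Hypothetical lookup table; grammars are plain labels here."""

    def __init__(self) -> None:
        self.global_grammars: Dict[QName, str] = {}
        self.local_grammars: Dict[Tuple[str, QName], str] = {}

    def for_declared_element(self, scope: str, qname: QName) -> Optional[str]:
        """SE(qname) matched inside `scope`: prefer that scope's local declaration."""
        local = self.local_grammars.get((scope, qname))
        return local if local is not None else self.global_grammars.get(qname)

    def for_wildcard_element(self, qname: QName) -> str:
        """SE(*): candidate policy of global element grammar if any, else built-in."""
        return self.global_grammars.get(qname, BUILT_IN)


registry = GrammarRegistry()
item = ("http://example.org/ns", "item")
registry.global_grammars[item] = "global item grammar"
registry.local_grammars[("OrderType", item)] = "item grammar local to OrderType"
```

With this policy a local-only declaration is never reachable through SE(*), which is exactly the point on which guidance is requested.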

Regards,
Youenn
Related comment: LC-2248

Comment LC-2186
Commenter: SHIMIZU Wataru <shimizu.wataru@canon.co.jp> (archived message)
Context: in
assigned to Takuki Kamiya
Resolution status:

According to the specification, if I use a user-defined datatype
representation, I have to specify a datatype representation map. How do
I use my own encoding only for a specific element or attribute?

For example, I want to use my own float encoding only for b
elements.

<a>
<b>1.2</b> <!-- xsd:float -->
<c>3.4</c> <!-- xsd:float -->
</a>

If I define a datatype representation map as follows, the c element will
also be encoded with the myfloat encoding.

<datatypeRepresentationMap xmlns:myenc="http://example.org/myenc">
<xsd:float/>
<myenc:myfloat/>
</datatypeRepresentationMap>

Comment LC-2227
Commenter: Gengo Suzuki <suzuki.gengo@lab.ntt.co.jp> (archived message)
Context: in
assigned to Takuki Kamiya
Resolution status:

Hello,

I have a question about Document Grammars.

In 8.5.1 Schema-Informed Document Grammar, there is an 'SE(*)' event
which is evaluated by the 'Built-in' Element Grammar.

But this rule can (perhaps) also be applied in strict mode.

I think the principle of strict mode is that if an XML instance has
an element which isn't defined in the XML schema, the encoder should stop
with an error.

I feel there is a lack of consistency between the Schema-Informed Document
Grammar specification and strict mode.

What do you think about it?
Or has there been any discussion about it?

Regards,

//---------------------------------------------------------------
NTT Cyber Space Laboratories
Gengo Suzuki <suzuki.gengo@lab.ntt.co.jp>
TEL: +81-46-859-3412 FAX: +81-46-859-2768
----------------------------------------------------------------//

Comment LC-2176
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Daniel Peintner
Resolution status:

5) EXI schema-less/schema-informed modes
Based on internal discussions and internal feedback, there is a general assumption that the EXI specification somehow defines two separate modes (schema-less and schema-informed).
While it is clearly stated in the specification that both modes easily coexist in a single EXI stream,
additional advertisement (maybe in the primer) of that feature may be good for adoption.
The latest published primer (Dec 2007) could maybe be improved in that respect.

Comment LC-2188
Commenter: SHIMIZU Wataru <shimizu.wataru@canon.co.jp> (archived message)
Context: in
assigned to Michael Cokus
Resolution status:

The EXI specification has a lot of features that can reduce document size.
However, it seems too complex for small embedded devices, and I will have
to write a partial implementation. Of course, that will lose
interoperability. I think an additional conformance level or a tiny profile
would be useful for small devices and for interoperability. Is there a plan to
define something like that?

Comment LC-2178
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: in
assigned to Jaakko Kangasharju
Resolution status:

7) RDF/XMP use case
This is more of a general comment on specific XML/EXI use cases, notably RDF or XMP documents, where
no standard, well-defined XML schemas are available.
These documents generally have some defined structures and types (RDF schema, XMP schemas…) but no
well-defined XML schemas.
What would be the recommendation from the WG to enable good interoperable EXI compression? Stick with schema-less encoding? Create an XML schema, publish it and use it?

Comment LC-2189
Commenter: SHIMIZU Wataru <shimizu.wataru@canon.co.jp> (archived message)
Context: in
assigned to Daniel Peintner
Resolution status:

EXI documents do not include a data type identifier for each value. Thus
all data types other than string can be used only in schema-informed
documents. Is it impossible to encode attribute values as integers
without a schema? A Fast Infoset document has a data type identifier for each
value, and I think that is a good approach.

Comment LC-2165
Commenter: ISHIZAKI Tooru <ishizaki.tooru@canon.co.jp> (archived message)
Context: 4. EXI Streams
assigned to John Schneider
Resolution status:

Dear EXI members,

I have some feedback on the EXI specification.
In chapter 4, what is the advantage of the local-element-ns flag?

Best Regards,
Tooru Ishizaki.

Comment LC-2194: standalone pseudo-attribute
Commenter: TAMIYA Keisuke <tamiya.keisuke@canon.co.jp> (archived message)
Context: 5. EXI Header
assigned to John Schneider
Resolution status:

Dear W3C EXI WG members,

I have a question about this draft specification.

EXI does not support the XML declaration - character encoding scheme,
standalone, version (ref. B.1).
Why does it not support the XML declaration?
I think "character encoding scheme" is not necessary, but I cannot
understand why "standalone" and "version" are not supported.

Regards,
Keisuke Tamiya (tamiya.keisuke@canon.co.jp)
Related comment: LC-2185

Comment LC-2185: XML Version
Commenter: TAMIYA Keisuke <tamiya.keisuke@canon.co.jp> (archived message)
Context: 5. EXI Header
assigned to Takuki Kamiya
Resolution status:

EXI does not support the XML declaration - character encoding scheme,
standalone, version (ref. B.1).
Why does it not support the XML declaration?
I think "character encoding scheme" is not necessary, but I cannot
understand why "standalone" and "version" are not supported.
Related comment: LC-2194

Comment LC-2183
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: 8.4.3 Built-in Element Grammar
assigned to John Schneider
Resolution status:

13)

Section 8.4.3

xsi:schemaLocation attributes seem to be removed from the infoset before encoding in AgileDelta streams.
Is that by design or is it implementation-related?

Comment LC-2109
Commenter: Yuri Delendik <yury_exi@yahoo.com> (archived message)
Context: 8.5.4.2.1 Eliminating Productions with no Terminal Symbol
assigned to Takuki Kamiya
Resolution status:

Hello,

In some instances, during the elimination of productions with no terminal symbol (8.5.4.2.1), infinite loops can appear in the following forms:
G_(i,j):
G_(i,j)

Or
G_(i,j):
G_(i,k)
G_(i,k):
G_(i,l)
G_(i,l):
G_(i,k)

Eliminating them using the algorithm alone is not trivial and can be done in more than one way, so different implementations may produce different grammars for the same XSD schema.

The source of those productions is a particle with {max occurs} = unbounded.

Also, in the paragraph where an additional copy of Term_0 is generated for an unbounded particle, the restriction “k > 0” is missing. Without it,
G_({min occurs}, 0):
EE

is replaced by:
G_({min occurs}, 0):
G_({min occurs}, 0)

which is a circular production with no terminal symbol, and I cannot find a well-documented way to eliminate it.
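
For illustration, one deterministic way to remove such productions even when they form cycles is the textbook closure over empty productions with a visited list. The sketch below is not the procedure of section 8.5.4.2.1; the grammar representation, the function name and the non-terminal labels `G_0`, `G_k`, `G_l` are assumptions made for this example.

```python
from collections import deque


def eliminate_empty_productions(grammar):
    """Return a grammar in which every production starts with a terminal event.

    `grammar` maps a non-terminal name to a list of (event, target) pairs.
    `event` is None for a production with no terminal symbol; `target` is
    None after EE.
    """

    def reachable_without_terminal(start):
        # Breadth-first walk over empty productions; `seen` breaks the cycles.
        seen = [start]
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for event, target in grammar[current]:
                if event is None and target not in seen:
                    seen.append(target)
                    queue.append(target)
        return seen

    result = {}
    for name in grammar:
        productions = []
        for source in reachable_without_terminal(name):
            for production in grammar[source]:
                if production[0] is not None and production not in productions:
                    productions.append(production)
        result[name] = productions
    return result


grammar = {
    "G_0": [(None, "G_0"), ("EE", None)],          # self-loop
    "G_k": [(None, "G_l")],                        # two-step cycle
    "G_l": [(None, "G_k"), ("SE(el1_1)", "G_k")],
}
normalized = eliminate_empty_productions(grammar)
```

Because the walk follows productions in their listed order, two implementations using this approach on the same input would produce the same result, which is the interoperability property at stake here.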

Could you illustrate how to convert the following schema to EXI normalized grammars?

<xsd:element name="el1">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="el1_1" minOccurs="1" maxOccurs="unbounded">
<xsd:complexType />
</xsd:element>
</xsd:sequence>
<xsd:attribute name="at1" type="xsd:string" />
</xsd:complexType>
</xsd:element>

Thank you.

Comment LC-2110
Commenter: Yuri Delendik <yury_exi@yahoo.com> (archived message)
Context: 8.5.4.3 Event Code Assignment
assigned to John Schneider
Resolution status:

Hello,
The event code assignment section (8.5.4.3) describes the sorting of events in normalized EXI grammars. It does not show where SE(*), SE(uri:*) and AT(uri:*) events fall in the sort order.
Thanks.

Comment LC-2248
Commenter: FABLET Youenn <Youenn.Fablet@crf.canon.fr> (archived message)
Context: 8.5.4.4.1 Adding Productions when Strict is False (AT(*) semantics description)
assigned to Richard Kuntschke
Resolution status:

In the same spirit, it may be good to tighten the wording
concerning the typing of global attributes.

Currently, section 8.5.4.4.1 (strict = false section) states
that:

"when using schemas [...] If a global attribute definition
exists for qname, represent the value of the attribute
according to its datatype"

First, it seems that only section 8.5.4.4.1 deals with
this, although it seems quite applicable to strict mode in
the case of attribute wildcards.

Second, the "when using schemas" wording seems vague to me,
at least for that particular sentence.

A quick reading made me think that this meant "when some
schema information is available to the EXI processor", but
I think it actually means "when using a schema-informed
grammar".

The latter interpretation would also imply that a
schema containing only global attribute definitions would be
useless for typing attribute values.
Related comment: LC-2198

Comment LC-2108
Commenter: <pub@upokecenter.com> (archived message)
Context: E Deriving Character Sets from XML Schema Regular Expressions
assigned to Takuki Kamiya
Resolution status:

I want to make a suggestion on the section 'Deriving Character Sets from XML Schema Regular Expressions':

I want to propose that datatypes with a regular expression containing a "charClassSub" should have no restricted character set. The reason is that all the remaining parts of the regular expression derivation expect only a union of characters, which is very efficient in determining whether the expression contains a restricted character set or not. Having a 'charClassSub' as part of the derivation process may complicate this, as the program now has to subtract portions of the character set as well as add to them, which may be a problem if the character set contains a large number of characters, like this:

[&#x20;-&#xFF00;-[&#x60;-&#xFF00;]]

That regular expression above would yield a restricted character set of 64 characters; however the implementation may require storing thousands of characters (a naive implementation, yes) before it must exclude them in the 'charClassSub' portion of the regular expression. Another problem is nested 'charClassSub' sets. For example, the following regular expression is allowed:

[A-Z-[B-Z-[C-Z-[D-Z-[E-Z-[...]]]]]]

Both problems make 'charClassSub' problematic in restricted character set derivation. I thank you for your time.
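
For reference, the subtraction discussed above can also be carried out on code point ranges rather than on individual characters. The sketch below is only an illustration, not the derivation procedure of the specification; `subtract_ranges` and `count` are hypothetical helper names, and ranges are assumed to be inclusive and non-overlapping.

```python
def subtract_ranges(base, removed):
    """Return the parts of `base` not covered by `removed` (lists of (lo, hi))."""
    result = []
    for lo, hi in base:
        start = lo
        for r_lo, r_hi in sorted(removed):
            if r_hi < start or r_lo > hi:
                continue  # no overlap with what is left of this base range
            if r_lo > start:
                result.append((start, r_lo - 1))
            start = max(start, r_hi + 1)
            if start > hi:
                break
        if start <= hi:
            result.append((start, hi))
    return result


def count(ranges):
    """Number of code points covered by a list of inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in ranges)


# [&#x20;-&#xFF00;-[&#x60;-&#xFF00;]]
first = subtract_ranges([(0x20, 0xFF00)], [(0x60, 0xFF00)])

# [A-Z-[B-Z-[C-Z]]], evaluated from the innermost subtraction outwards
inner = subtract_ranges([(ord("B"), ord("Z"))], [(ord("C"), ord("Z"))])
nested = subtract_ranges([(ord("A"), ord("Z"))], inner)
```

Nested subtractions still have to be evaluated innermost first, so the concern about added complexity in the derivation remains.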


Developed and maintained by Dominique Hazaël-Massieux (dom@w3.org).
$Id: index.html,v 1.1 2017/08/11 06:44:23 dom Exp $
Please send bug reports and requests for enhancements to w3t-sys.org