RE: Support for Canonical EXI interoperability test in TTFMS

Hi Daniel,

> * if processing the current XML Information item fails by means of existing 
>   event codes of length 1 (i.e., no EE or SE event exists), and

Does this include the situation in which you are trying to encode an infoset SE
event, but the current grammar does not contain a production for the SE event type?

I am not sure whether you want to insert an empty CH at that point. Does doing so
help the encoding process?

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com] 
Sent: Wednesday, November 18, 2015 8:39 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

Hi Taki, all,

Thank you for your reply and your valuable comments.

I updated the proposal to incorporate your feedback. The description now also states the intent and lists the rules again.


--->

In general, Canonical EXI MUST NOT change the sequence of XML information items. However, in some rare cases (e.g., due to API characteristics) the XML Infoset may lack "Character Information items" for strings with the number of characters equal to 0 (zero). EXI encoding may fail without such an "empty" character information item (e.g., in strict schema-informed streams whose grammar expects a character string, even if empty).

Hence, Canonical EXI aims to add an "empty" character information item only where the intent requires it (e.g., an expected character string) and not in any other case (e.g., mixed content).

That said, a canonical EXI processor MUST add a CH event with a String of length 0 (zero)
* if processing the current XML Information item fails by means of existing event codes
  of length 1 (i.e., no EE or SE event exists), and
* when processing a schema-informed grammar where a CH event code of length 1 exists with
  Built-in EXI Datatype Representation "Binary" (exi:base64Binary and exi:hexBinary),
  "String", "List" or an Enumeration with an empty item.

In all other cases, further events MUST NOT be added.
<---
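
For illustration, the two conditions of the proposed rule can be sketched as a single predicate. This is a minimal sketch and not code from EXIficient or OpenEXI: the GrammarState class and all of its field names are hypothetical, and the mixed-content exclusion is taken from the intent paragraph rather than from the rule text itself.

```python
from dataclasses import dataclass
from typing import Optional

# Datatype representations named in the proposal as able to carry an empty value.
EMPTY_CAPABLE = {"Binary", "String", "List"}


@dataclass(frozen=True)
class GrammarState:
    """Hypothetical summary of the current grammar position."""
    # True if the current information item matches an event code of length 1
    # (for example an SE or EE production).
    item_matches_length1_code: bool
    # Datatype representation of a CH production with event-code length 1,
    # or None if no such production exists.
    ch_length1_datatype: Optional[str]
    # True if that CH production stems from mixed content (assumption: excluded).
    ch_from_mixed_content: bool = False
    # Only relevant for "Enumeration": True if one enumerated value is "".
    enumeration_has_empty_item: bool = False


def must_insert_empty_ch(state: GrammarState) -> bool:
    """Return True if the proposed rule asks for an inserted CH("")."""
    if state.item_matches_length1_code:
        return False  # processing does not fail, nothing to repair
    if state.ch_length1_datatype is None or state.ch_from_mixed_content:
        return False  # no suitable CH production at this position
    if state.ch_length1_datatype == "Enumeration":
        return state.enumeration_has_empty_item
    return state.ch_length1_datatype in EMPTY_CAPABLE
```

With this sketch, a strict xs:string element whose content is missing yields True, while a mixed-content CH or a position where SE/EE already matches yields False.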

What do you think?
Do you have any updates/proposals?

Thanks,

-- Daniel



________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Wednesday, November 11, 2015 10:40 PM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: Support for Canonical EXI interoperability test in TTFMS

Hi Daniel,

In a schema-informed context, a CH event type with event-code length 1 comes
from one of two different schema constructs: simple type content or mixed
content.

For CH event types that come from mixed content, there is no need to insert an
empty CH event. Therefore, I would suggest excluding mixed-content CH event
types from the rule you described below.

You listed three EXI datatype representations (i.e. Binary, String and List) as
applicable to the described empty CH event insertion rule. I would like to point
out that enumerated values where one of the values is an empty string (i.e. "")
should also apply. In other words, in every context where the EXI datatype
representation associated with the current CH event type allows for an empty CH,
an empty CH event should be inserted.

Thanks,

taki


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Wednesday, November 11, 2015 5:08 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

All,

Following yesterday's telecon, I explored the empty CH("") event a bit further.

There are various situations in which an empty CH could be added. One rather obvious case is a schema-informed stream that requires a character string (even if the string is empty). However, in schema-less mode one could also assume that a previously "learned" CH event means that a CH is expected even if it is not there...

In summary, I would like to propose the following requirement/addition to the Canonical EXI document.

--->
The XML Infoset in some rare cases (e.g., due to API characteristics) may lack "Character Information items" such as strings with the number of characters equal to 0 (zero). EXI encoding may fail without such an "empty" character information item. Hence, a canonical EXI processor MUST add a CH event with a String of length 0 (zero), if not already there, when in a schema-informed grammar where a CH event code of length 1 exists with Built-in EXI Datatype Representation "Binary" (exi:base64Binary and exi:hexBinary), "String" or "List". The availability of such a CH event in the grammar clearly states the intent, in this case the requirement of empty characters. In all other cases, further events MUST NOT be added.
<---

What do people think?

Thanks,

-- Daniel





________________________________
From: Peintner, Daniel (ext) [daniel.peintner.ext@siemens.com]
Sent: Monday, November 9, 2015 5:06 PM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

Taki, all,

we looked into the issue more closely and found the following issues.

1. How to deal with conflicting framework options

The framework (or the associated test cases) may define conflicting parameters (e.g., preserving processing instructions together with strict). In such a situation an EXI processor has to decide whether to use non-strict encoding to support processing instructions or to drop PI support.

As it turns out, the EXI processors (OpenEXI and EXIficient) tend to use different strategies. Both strategies are OK. Hence, I think we need to make the framework aware of such a situation so that the framework decides what the desired result is.
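
To make the two strategies concrete, here is a minimal sketch of how a framework could detect the conflict and apply one explicit policy. It is an illustration only: the option keys and the policy names "keep_strict" and "keep_fidelity" are hypothetical and are not TTFMS, OpenEXI or EXIficient identifiers. The list of options treated as incompatible with strict follows my reading of the EXI options (comments, pis, dtd, prefixes, selfContained).

```python
# Options assumed to be incompatible with strict (hypothetical key names).
STRICT_INCOMPATIBLE = (
    "preserve_comments",
    "preserve_pis",
    "preserve_dtd",
    "preserve_prefixes",
    "self_contained",
)


def find_conflicts(options):
    """Return the names of enabled options that clash with strict."""
    if not options.get("strict"):
        return []
    return [name for name in STRICT_INCOMPATIBLE if options.get(name)]


def resolve(options, policy):
    """Return a copy of options with the conflict resolved by policy.

    policy "keep_strict" drops the conflicting fidelity options;
    policy "keep_fidelity" switches strict off instead.
    """
    resolved = dict(options)
    conflicts = find_conflicts(options)
    if not conflicts:
        return resolved
    if policy == "keep_strict":
        for name in conflicts:
            resolved[name] = False
    elif policy == "keep_fidelity":
        resolved["strict"] = False
    else:
        raise ValueError("unknown policy: " + policy)
    return resolved
```

If the framework fixes the policy per test case, every processor starts from the same effective options and the encoded outputs become comparable.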


2. Empty CH("") events

An XML schema may define an element as follows
<xs:element name="foo" type="xs:string"/>

A valid instance may look as follows.

<foo></foo>

Depending on the EXI options and the mode (strict vs. non-strict) the following two EXI streams are possible

SE(foo) EE(foo)                --> applicable in non-strict only
SE(foo) CH("") EE(foo)      --> applicable in strict and non-strict

Again, we need to ensure all Canonical EXI processors behave the same.
Hence, I would argue for the latter case given that it is usable in both scenarios (strict and non-strict), but I am open to other ideas/thoughts.
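
As a rough illustration of the second form, the sketch below normalizes an event sequence so that SE(foo) EE(foo) becomes SE(foo) CH("") EE(foo). It is a simplified model, not processor code: events are plain (kind, value) tuples, and the set of element names with string-typed content is passed in by hand instead of being derived from a schema.

```python
def insert_empty_ch(events, string_typed_elements):
    """Insert CH("") before an EE that closes an element without content.

    events: list of (kind, value) tuples with kind in {"SE", "AT", "CH", "EE"}.
    string_typed_elements: names of elements whose grammar expects a CH.
    """
    out = []
    pending = None  # element opened so far with no content events
    for kind, value in events:
        if kind == "EE" and pending is not None and pending in string_typed_elements:
            out.append(("CH", ""))
        if kind == "SE":
            pending = value
        elif kind != "AT":
            # Any event other than an attribute means content was seen
            # or the element was closed.
            pending = None
        out.append((kind, value))
    return out


stream = [("SE", "foo"), ("EE", "foo")]
print(insert_empty_ch(stream, {"foo"}))
```

A stream that already contains CH("") is left unchanged, so applying the function twice gives the same result as applying it once.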

3. Whitespace handling

I wonder whether we need to define whitespace preservation rules in Canonical EXI similar to the TTFMS framework rules.


Thanks,

-- Daniel





________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Wednesday, October 21, 2015 12:05 AM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: Support for Canonical EXI interoperability test in TTFMS

Hi Daniel,

I fixed a bug in the TTFMS framework.

Next time you compile the framework and run the test,
you will be able to see schema-informed EXI files generated
when the test case provides one and schema use is enabled.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Wednesday, October 14, 2015 6:50 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

Hi Taki,

I uploaded a revised EXIficient library but I agree, I do still see some issues.
(in my test run 20 files out of 115 are still different)

Maybe this has to do with whitespace handling (will send separate email...)

Moreover, I am currently only able to run schema-less test runs, by calling
ant run-iot-c14n-classes -DtestCases=config/testCases-restricted/all-v1.xml

Maybe someone can point me to the configuration needed to run schema-informed or byteAligned test runs, to facilitate debugging.

Thanks,

-- Daniel



________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, October 13, 2015 3:23 AM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: Support for Canonical EXI interoperability test in TTFMS

Hi Daniel,

I also modified the OpenEXI driver so that it always outputs header options.

However, I still see many differences between EXIficient and OpenEXI
outputs. We will need to investigate this further.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Thursday, October 01, 2015 5:50 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

Hi Taki,

Thank you for pointing me to the parameter "measure" which indicates the type of the test run.

I also uploaded a first snapshot of the EXIficient library supporting Canonical EXI. Additional updates may be necessary.
When comparing the encoded files with OpenEXI, I see that most of them differ. I think this is because OpenEXI at the moment does not always include the EXI Options.

Please let me know if you encounter other issues.

Thanks,

-- Daniel

________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Thursday, October 1, 2015 1:45 AM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: Support for Canonical EXI interoperability test in TTFMS

Hi Daniel,

You should be able to get the test mode by accessing:
measure field (of class MeasureParam) that is in _driverParams (of class DriverParameters)

When it is iot_c14n_encode, you should change the behavior of the
processor to comply with c14n rules.

Do you plan to check in a new EXIficient jar to TTFMS soon?

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Wednesday, September 30, 2015 8:28 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: Support for Canonical EXI interoperability test in TTFMS

Hi Taki,

I did check out the new code and it worked as expected.
Thank you for your work!

The only thing I miss is a testCase option that indicates whether the EXI processor is required to produce canonical EXI.

Did I miss anything in that regard?

Thanks,

-- Daniel



P.S. EXIficient does not sort attributes in schema-less mode


________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, September 29, 2015 2:20 AM
To: public-exi@w3.org
Subject: Support for Canonical EXI interoperability test in TTFMS

Hi,

I added support for Canonical EXI interoperability test in TTFMS.

You need to invoke the target "run-iot-c14n-classes" in order to run the
encoding process.

After that, diff tools such as WinMerge (on Windows) can be used to
compare the encoded files output by various implementations.

An initial experimental run showed quite a lot of differences in encodings
between EXIficient and OpenEXI.

I found that at least some of the diffs are due to the attribute order in
the schema-less setting. Is it true that EXIficient sorts attributes whether
it is schema-less or schema-informed?
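
For reference, the kind of attribute ordering in question can be sketched as follows. This is only an illustration under an assumption: it applies the order that EXI uses for schema-informed grammars (by local-name first, then by namespace URI) to any attribute list. The Attribute tuple and its field names are hypothetical and do not come from either implementation.

```python
from collections import namedtuple

# Hypothetical attribute model: namespace URI, local-name and value.
Attribute = namedtuple("Attribute", ["uri", "local_name", "value"])


def sort_attributes(attributes):
    """Sort attributes by local-name first, then by namespace URI."""
    return sorted(attributes, key=lambda a: (a.local_name, a.uri))


attrs = [
    Attribute("", "id", "1"),
    Attribute("urn:b", "class", "x"),
    Attribute("urn:a", "class", "y"),
]
for a in sort_attributes(attrs):
    print(a.local_name, a.uri)
```

If one processor applies such a sort in schema-less mode and the other keeps document order, the encoded files differ even though both are valid EXI.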

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America

Received on Thursday, 19 November 2015 23:59:13 UTC