W3C

- DRAFT -

Efficient XML Interchange Working Group Teleconference

08 Dec 2015

See also: IRC log

Attendees

Present
Regrets
Chair
SV_MEETING_CHAIR
Scribe
TK

<scribe> scribe: TK

<scribe> scribeNick: taki

Preliminaries

TK: Next week's telecon will be the last one of the year.

DP: I will be back on 1/12.

TK: There won't be meetings on 12/22, 12/29 and 1/5.

EXI for JSON

DP: I did work a bit on EXI for JSON.
... I added a note that says we don't plan to provide a mapping between XML and JSON.
... I described how other datatypes can be used in section 3.1.3.
... Similarly, in 3.1.4 for numbers.
... I am not sure how decimal encoding is better than float.
... Section 3.2.7 talks about the <other/> element.
... I think it makes sense to always use strict mode.
... schemaId is "schema-for json"
... Once we decide what to do with decimal, we are ready for publication.
... I will go over the document again, then will ask the group for review, in preparation for publication.

<scribe> ACTION: DP to Go over EXI for JSON draft and ask the group for review in preparation for publication [recorded in http://www.w3.org/2015/12/08-exi-minutes.html#action01]

<trackbot> Created ACTION-733 - Go over exi for json draft and ask the group for review in preparation for publication [on Daniel Peintner - due 2015-12-15].

DP: We should publish sooner as a note, as suggested by Liam.

<brutzman> typo in final section C. Examples

<brutzman> http://www.w3.org/XML/EXI/docs/json/exi-for-json.html#examples

<brutzman> ... should the next-to-last } actually be a ] character?

DB: By using very simple types, you can infer the type just by inspecting the JSON: Boolean, number, string, array and object.

DP: It is correct.
... The distinction comes from JSON, and we just encode them. number to number, array to array.
... I already tried an implementation. It is very simple.
... It is straightforward to implement.

DB: The JSON data structure is unambiguous. Good approach.
... You avoid the need for JSON schema.
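A minimal sketch of the type-by-inspection idea discussed above, assuming Python's standard json module. The function name json_kind is hypothetical and not taken from the draft; null is included because it is part of the JSON grammar, although it was not in DB's list.

```python
import json

def json_kind(value):
    """Return the JSON type name of a value produced by json.loads."""
    if value is None:
        return "null"
    # bool must be tested before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")

doc = json.loads('{"name": "EXI", "strict": true, "sizes": [1, 2.5], "note": null}')
# Classify each member of the top-level object without any schema.
kinds = {key: json_kind(val) for key, val in doc.items()}
```

No schema is consulted at any point: the type of every value follows from the JSON text alone.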

<brutzman> This also sidesteps the fact that JSON schema is not yet final. http://json-schema.org

<brutzman> http://tools.ietf.org/html/draft-zyp-json-schema-04 draft 4 JSON Schema: core definitions and terminology

DP: I am not sure whether people are still working on JSON Schema.
... Given that the draft has expired.

<brutzman> Slightly encouraging that they got to v4 but nevertheless it is old, Expires: August 4, 2013

<brutzman> There are also drafts for JSON validation and JSON hyper-schema, but they only got to draft 0 and at the same time in 2013. So maybe they ran into a problem...

<brutzman> I think that your approach will remain stable because you are only using the basic Javascript/JSON data types, which are standardized by ECMA.

<brutzman> If a future JSON Schema set of RFCs identifies further types, hopefully there will be a compatibility path.

<brutzman> Thinking of other possible data types: is there any XML schema convention regarding representation of geographic location? doubles or strings could be applied.

DB: Do we expect other datatypes to emerge?

<brutzman> Double precision is handled OK by Javascript number type.

DB: Such as double-precision for geo-location?

DP: We selected binary data, datetime, time, integer and decimal, besides the straightforward mappings.
... When transforming back to JSON, you won't notice the use of those encodings.

DB: How do you notice the use of those types?

DP: If someone sees important types missing, we can still add them.

<brutzman> If you do detect binary/datetime/time value, how do you guarantee precise round-trip conversion?

<brutzman> A small point for conversion: note that number type is not quite the same as IEEE float/double; there must be a leading 0 before decimal point for a fractional value.
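The leading-zero point can be checked with a small sketch, assuming Python's standard json module as the JSON parser. The helper names is_json_number and _reject are hypothetical; parse_constant is used only to turn off Python's non-standard acceptance of NaN and Infinity.

```python
import json

def _reject(name):
    # Python's json accepts NaN/Infinity by default; JSON itself does not.
    raise ValueError(f"{name} is not a JSON number")

def is_json_number(text):
    """True if text parses as a single JSON number."""
    try:
        value = json.loads(text, parse_constant=_reject)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# ".5" is a valid float literal for many parsers, but not a JSON number.
as_float = float(".5")
accepted = is_json_number("0.5")
rejected = is_json_number(".5")
```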

DP: We base it on XML Schema.

<brutzman> I think we need to be careful about round-tripping because we don't really know the literal requirements or the type that the JSON author intended. The only thing that we can be sure of is an exact match to string type.

DP: Otherwise, the string representation needs to be used.
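One way to read the round-trip concern is sketched below: a typed encoding is chosen only when re-serializing the typed value reproduces the original string exactly, and plain string is the fallback. This is an illustration under stated assumptions, not the draft's algorithm. The function name choose_encoding and the returned labels are hypothetical, and Python's datetime and base64 modules stand in for the XML Schema datatypes.

```python
import base64
import binascii
from datetime import datetime

def choose_encoding(text):
    """Pick a typed representation only if it reproduces text exactly."""
    # Candidate 1: date-time. Accept only if isoformat() gives back the input.
    try:
        parsed = datetime.fromisoformat(text)
        if parsed.isoformat() == text:
            return ("dateTime", parsed)
    except ValueError:
        pass
    # Candidate 2: base64 binary. Accept only if re-encoding gives back the input.
    try:
        raw = base64.b64decode(text, validate=True)
        if raw and base64.b64encode(raw).decode("ascii") == text:
            return ("base64Binary", raw)
    except (binascii.Error, ValueError):
        pass
    # Otherwise the string representation is used unchanged.
    return ("string", text)

exact = choose_encoding("2015-12-08T17:05:59")
fallback = choose_encoding("2015-12-08 17:05:59")  # space separator would not survive
```

An ordinary word such as "test" also happens to be valid base64, so the exact-match check guarantees the text that comes back, not the type the JSON author intended.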

<brutzman> Wondering about base64 and other encodings.

DB: numbers in different bases?

DP: JSON uses a base-10 approach for the number type.

<brutzman> RFC 7159 https://tools.ietf.org/html/rfc7159 "JavaScript Object Notation (JSON) Data Interchange Format" is a very important resource for interoperability concerns that complement the ECMA specification.

<brutzman> RFC7159 section 8.1. Character Encoding states "JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32."

<brutzman> ... it also states there is no byte order mark (BOM)

DB: We should be able to preserve the original encoding.

<brutzman> ... thus it might be necessary for the EXI JSON Schema to note what the original encoding was, and return that encoding, if a perfect round trip is desired.

<brutzman> For example, UTF32 JSON -> UTF8 EXI. What result is provided when decoding, is UTF8 okay or should the original UTF32 be honored?

DP: Practically, it is very difficult.

DB: We can have "source-encoding" attribute in the schema.

<brutzman> Possibly adding an attribute for jsonSourceEncoding might help perfect round tripping be accomplished. An implementation that reads the JSON would need to determine what encoding was found, since it is implicit and not an explicit part of the data in the JSON object.
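A minimal sketch of how an implementation might determine the implicit encoding and carry it along for the return trip. The null-byte heuristic is the one described in RFC 4627 section 3 (RFC 7159 does not restate it), and it assumes no byte order mark and an ASCII first character. All function names are hypothetical, and how the recorded value would be stored in EXI is left open here.

```python
import json

def guess_json_encoding(data):
    """Guess the encoding of BOM-less JSON bytes from their null-byte pattern."""
    head = data[:4]
    if len(head) == 4:
        if head[0] == 0 and head[1] == 0 and head[2] == 0:
            return "utf-32-be"   # 00 00 00 xx
        if head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"   # xx 00 00 00
    if len(head) >= 2:
        if head[0] == 0:
            return "utf-16-be"   # 00 xx
        if head[1] == 0:
            return "utf-16-le"   # xx 00
    return "utf-8"

def read_json(data):
    """Parse JSON bytes and report which encoding was found."""
    encoding = guess_json_encoding(data)
    return json.loads(data.decode(encoding)), encoding

def write_json(value, source_encoding):
    """Serialize back using the encoding recorded when reading."""
    return json.dumps(value, ensure_ascii=False).encode(source_encoding)

original = '{"a": [1, 2]}'.encode("utf-32-be")
value, found = read_json(original)
restored = write_json(value, found)
```

Recording the encoding only restores the byte-level encoding; whitespace and member order in the output still depend on the serializer.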

DP: Canonical JSON will need to take care of it.

<brutzman> It may be that the same information can be preserved in different encodings; this issue examines whether original encoding is preserved.

DB: Preserving the JSON source encoding is an issue that needs to be described.

DP: For plain EXI, canonical EXI does the job.

<brutzman> Possibly this needs to be an EXI property for source encoding?

<brutzman> Does EXI (in general) have an option to preserve source encoding? Perhaps it should...

DP: I will add a section that states issues and decisions, asking for feedback.

<brutzman> Perhaps this is all a non-issue if a "UTF16" document primarily consists of UTF8 characters, and a parser has to be ready to read the intermingled UTF16 characters when they occur.

<brutzman> It would be interesting to compare EXI compression between two sources: XML-encoded schema-valid X3D scene and a corresponding X3D JSON scene.

<brutzman> Perhaps there are other XML encodings that also have a corresponding JSON encoding defined.

<brutzman> Regarding a more general mapping between JSON document and XML Schema: is such work timely, or more appropriate after a JSON schema RFC is completed?

<brutzman> Comparison of JSON EXI compactness to XML EXI compactness might become very interesting in this regard.

<brutzman> Prior thesis work by Bruce Hill compared XML EXI compactness to a variety of JSON compression algorithms for similar data. The XML data did not always have an XML schema to guide EXI compression. (EXI was consistently as good or better.)

Summary of Action Items

[NEW] ACTION: DP to Go over EXI for JSON draft and ask the group for review in preparation for publication [recorded in http://www.w3.org/2015/12/08-exi-minutes.html#action01]
 

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2015/12/08 17:05:59 $
