14:02:23 RRSAgent has joined #exi
14:02:23 logging to http://www.w3.org/2016/08/02-exi-irc
14:02:25 RRSAgent, make logs public
14:02:25 Zakim has joined #exi
14:02:27 Zakim, this will be EXIWG
14:02:27 ok, trackbot
14:02:28 Meeting: Efficient XML Interchange Working Group Teleconference
14:02:28 Date: 02 August 2016
14:04:56 brutzman has joined #exi
14:05:37 dape has joined #exi
14:08:52 scribe: TK
14:08:58 scribeNick: taki
14:09:11 TOPIC: EXI4JSON
14:09:24 TOPIC: Character escaping method to represent JSON names in XML names
14:10:27 TK: DB and DP commented on TK's proposal.
14:11:02 DP: I think we could just use unicode number.
14:11:48 DB: I liked numeric number as well.
14:12:37 DP: Alternative is we say explicitly people need to use upper or lower case.
14:12:57 the compacted size is the same for numeric values. however there is a drawback: in-memory size for processing is bigger because the numeric strings are bigger.
14:14:37 Using regular expression (regex) for example to convert upper to lower case is quite fast and efficient.
14:16:14 The case of special interest here is when the (perhaps very large) numeric array is in string form getting validated as XML, just prior to EXI compression.
14:17:09 In comparing the alternatives, i expect that hex unicode is simplest and best.
14:17:37 DP: test comparisons could show whether there is an impact, don't think that there will be much difference.
14:18:24 DP: We do have another situation. How to express names when they are equal to one of the predefined ones?
14:18:37 DP: of note is that this technique is only applicable to keys, not values
14:19:13 Aha then, that greatly reduces (eliminates) the potential of really long strings.
14:20:06 Perhaps we could look at what is common; likely hex
14:20:26 s/common/most common/
14:21:35 lower-case characters are smaller than upper-case hex characters 8)
14:28:23 since you two are implementers, you are welcome to decide hex/numeric later.
the general concept of escaping seems sound.
14:28:55 s/escaping/escaping with underscores + unicode value/
14:29:10 {"number": 123} would become <._number> to indicate that it is a special name that cannot be used
14:30:38 s/<._number>/<_.number>
14:32:18 DB: By going numeric, we could differentiate them better.
14:32:48 the two defaults that have a hex character as first character are j:array and j:boolean
14:34:08 123
14:34:23 ... so those might need some disambiguation from hex unicode when parsing, e.g. escaped special name _boolean versus escaped unicode value _b123
14:35:15 <_.number>123
14:36:59 good example. agreed this case - where a key name uses a reserved word - is important for the escaping to always work.
14:40:30 TOPIC: Flag to indicate the need of un-escaping
14:41:20 DP: Helping decoders to know whether it needs to cope with un-escaping or not.
14:42:10 DB: It is a good concept to consider.
14:43:08 We seem to have a good escaping mechanism now (unambiguous and complete). So, is the un-escaping flag a requirement or a hint?
14:44:15 it seems safer to not have a flag that might have an incorrect value which can break the decoding; however if it is a hint that speeds up decompression then that is worth considering
14:44:59 DB: If it is a hint, that is good. If not, it may cause inconsistency.
14:45:06 software engineering principle: DRY don't repeat yourself
14:45:40 https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
14:46:39 hint value seems fine if performance can be improved, but default value and mistaken value should not cause harm
14:48:41 it is a curious situation. most JSON keys will not need escaping, so the default value of a hint would likely be to not perform escaping. however if an escaped key existed, it would have to be checked anyway.
14:49:27 ... so the hint might not be relevant, if a parser always has to check for escaping regardless of the value of the hint.
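[editor's sketch, not part of the logged discussion] The escaping idea above ("underscores + unicode value", plus a "_." prefix for reserved names) might look roughly like the following. Everything concrete here is an assumption made for illustration: the group left hex versus decimal open, so this sketch picks the decimal code point terminated by "."; the RESERVED set is a guess built from the names mentioned in the log; and the validity check is an ASCII-only stand-in for the real XML name rules. The function names escape_key and unescape_key are hypothetical.

```python
import re

# Hypothetical list of names reserved by EXI4JSON. Only array, boolean and
# number appear in the discussion; the remaining entries are assumed.
RESERVED = {"map", "array", "string", "number", "boolean", "null", "other"}


def _is_plain(ch: str, first: bool) -> bool:
    """Simplified, ASCII-only stand-in for the XML name character rules."""
    if ch.isascii() and ch.isalpha():
        return True
    # Digits, '-' and '.' are accepted anywhere except the first position.
    return (not first) and ch.isascii() and (ch.isdigit() or ch in "-.")


def escape_key(key: str) -> str:
    """Turn a JSON key into a name usable as an XML element name."""
    if key in RESERVED:
        # Reserved names get the "_." prefix, as in <_.number>.
        return "_." + key
    out = []
    for i, ch in enumerate(key):
        if _is_plain(ch, first=(i == 0)):
            out.append(ch)
        else:
            # Assumed form: "_" + decimal code point + ".". The underscore
            # itself is escaped too, which keeps the mapping reversible.
            out.append(f"_{ord(ch)}.")
    return "".join(out)


def unescape_key(name: str) -> str:
    """Inverse of escape_key under the same assumptions."""
    if name.startswith("_."):
        return name[2:]
    return re.sub(r"_(\d+)\.", lambda m: chr(int(m.group(1))), name)


print(escape_key("number"))    # reserved name
print(escape_key("first name"))  # contains a space
```

With a decimal code point, an escaped character always starts with "_" followed by a digit, so it cannot be confused with the "_." prefix used for reserved names; that is one way to get the disambiguation raised at 14:34:23.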
14:50:30 we know that an escaping capability is essential for key names
14:50:55 DP: the escaping might be a contained character, not just the first character.
14:52:27 DP: Always-unescaping may not be a lot of overhead in implementations.
14:52:51 if instead of a hint, then we might avoid using a default value to ensure it is considered correctly. however this approach has several issues making it less desirable; possible inconsistency, and requiring an implementation to define the value.
14:53:13 s/define the value/define the value of a hint/
14:53:51 (discussion of tradeoffs)
14:54:29 it is not difficult for an encoder to set this value; either escaping occurred or not.
14:57:22 potential performance impact could be measured on parsing a large unmodified EXI4JSON document by comparing differences between escape checking and no escape checking
14:58:48 DP: I think there is overhead. If we introduce flags, flags also introduce overhead.
14:59:43 if the overhead of checking escaping on a character-by-character basis for each key is considered nontrivial, then that would lead us to requiring the encoder to set the value (which has no performance cost)
15:01:22 TK: should we revisit the topic later?
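[editor's sketch, not part of the logged discussion] The hint-versus-requirement trade-off above can be made concrete with a small sketch. It assumes, purely for illustration, that every escape sequence begins with "_", so a name without "_" never needs un-escaping. The names key_may_be_escaped, encoder_hint and decode_names are hypothetical, and the un-escaping function is passed in rather than defined here.

```python
def key_may_be_escaped(name: str) -> bool:
    # Assumption: every escape sequence starts with "_", so a single
    # substring test decides whether un-escaping needs to run at all.
    return "_" in name


def encoder_hint(emitted_names) -> bool:
    """Value an encoder could record: did any emitted name carry an escape?"""
    return any(key_may_be_escaped(n) for n in emitted_names)


def decode_names(names, hint, unescape):
    """Decode names; hint is True, False, or None when no hint is present."""
    if hint is False:
        # Fast path: trust the encoder and skip the per-key check entirely.
        # A wrong False value would silently leave escaped names in place,
        # which is the inconsistency risk raised in the discussion.
        return list(names)
    # A missing hint is treated like True: check each key.
    return [unescape(n) if key_may_be_escaped(n) else n for n in names]


names = ["title", "a_32.b"]
print(encoder_hint(names))
print(decode_names(names, None, lambda n: n.replace("_32.", " ")))
```

Treating an absent hint as "check every key" means a missing value costs only time, while an incorrect False is the one case that changes the decoded result.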
15:02:11 the analysis seems good here, a performance test will likely give us the answer pretty easily
15:07:41 http://www.w3.org/2005/06/tracker/exi/products/15
15:07:57 (discussion of readiness to release next draft after reconciling today's progress)
15:10:36 confirmation: today's discussion relates to ISSUE-116 and ISSUE-117 (and mostly resolves them)
15:11:10 http://www.w3.org/2005/06/tracker/exi/issues/116 JSON keys invalid as XML Names
15:11:54 http://www.w3.org/2005/06/tracker/exi/issues/117 Name Clash between JSON key names and names used by EXI4JSON
15:13:40 TOPIC: Canonical EXI
15:14:01 TOPIC: Communicating EXI-C14 options
15:14:19 DP: will consider today's discussion, update draft, review remaining issues and recommend to group whether to publish next EXI4JSON draft
15:16:06 DP: Option 2, fragment identifier may not be necessary.
15:16:36 DP: Option 3, is not feasible any more since we have more information that cannot be represented in EXI options document.
15:20:15 DP: If we count in bytes, there is only sometimes 1 byte difference.
15:23:25 DP: elements are more consistent with how the rest of options are described.
15:24:02 sent reply on member list (not yet archived) expressing that consistency is important
15:28:48 inconsistent = inefficient (and a source of error) so elements seems like the appropriate choice
15:30:35 suggested topic for upcoming call: can we write a paper together for WWW2017 on application of EXI to Open Web Architecture
15:30:49 suggested topic for upcoming call: future work
15:31:55 question: how amenable is EXI to insertion of binary data block (BDATA perhaps) similar to CDATA? The X3D binary encoding has a use case.
15:32:38 it reminds me of representation maps...
15:32:49 ...
presumably defining byte length of a follow-on block isn't too difficult
15:35:16 rrsagent, create minutes
15:35:16 I have made the request to generate http://www.w3.org/2016/08/02-exi-minutes.html taki
16:26:37 Zakim has left #exi
18:17:06 liam has joined #exi