JSON-LD Working Group Telco — Minutes

Date: 2020-01-24

Attendees

Present: Gregg Kellogg, Ruben Taelman, Rob Sanderson, Ivan Herman, Pierre-Antoine Champin, Benjamin Young, Dave Longley, Harold Solbrig, David I. Lehn, Adam Soroka

Regrets: Tim Cole

Guests:

Chair: Rob Sanderson

Scribe(s): Ruben Taelman, Rob Sanderson

Rob Sanderson: present_

1. Approve minutes of previous call

Proposed resolution: Approve Minutes of previous call: https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2020/2020-01-17-json-ld (Rob Sanderson)

Gregg Kellogg: +1

Rob Sanderson: +1

Ruben Taelman: +0

Pierre-Antoine Champin: +1

Resolution #1: Approve Minutes of previous call: https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2020/2020-01-17-json-ld

2. Announcements / Reminders?

Ivan Herman: +1

Gregg Kellogg: JS implementation from digitalbazaar is close to 100% for everything but framing.
… Treatment of entities in script elements is difficult with dom parser. So this will be difficult to get spec-compliant.

Rob Sanderson: Will there be difficulties for all JS implementations?

Dave Longley: I would propose adding at-risk.

Gregg Kellogg: jsonld.js PRs https://github.com/digitalbazaar/jsonld.js/pulls

Dave Longley: Browsers may not be able to do what we want to do. Different behaviour may be allowed if such difficulties are encountered.

Ivan Herman: XML entities in script elements are the problem?

Gregg Kellogg: Script elements should be unaltered. XML Entities inside script elements are typically decoded by parsers at the moment. But we don’t want that to happen.
… One way to handle JSON-LD in script tags would be to escape them, but we don’t want to escape them. According to what ivan said, we want script elements to remain unaltered.
… We could encourage authors to not include such entities in script elements.

Rob Sanderson: This is new in 1.1, so this is not a backwards-incompatibility. So we can change this as we want.

Action #1: create issue for ATRISK of entities within HTML script elements. (Gregg Kellogg)

Rob Sanderson: Shall we make it at-risk now? Or wait to see if we can solve it.

Benjamin Young: https://github.com/digitalbazaar/jsonld.js/pull/347

Benjamin Young: The PR gkellogg is for jsonld.js. Is this code running in the browser?

Gregg Kellogg: Testing environment mostly node. The use case within a browser is not considered in this PR.
… The browser may have different interfaces. Should be reconsidered for this use case.

Benjamin Young: Schema.org parsing by chrome is done using headless browser.

Dave Longley: the DOM APIs are what probably matter here: https://dom.spec.whatwg.org/#dom-node-textcontent

Ruben Taelman: to mention the parser I use, it’s HtmlParser2. I have not looked at this problem yet. It’s possible that I do not have the issue. Wondering if there are tests for this case already to see

Gregg Kellogg: Yes
… if you look at the PR, one of the changed files is test common, and shows a couple of commented out tests for this

Rob Sanderson: +1 to writing to the spec :D

Dave Longley: People should be writing the parsers to what the specs say. Whatever we ask should be compatible with the spec. So it’s important to know if xmldom is spec-compliant.

Gregg Kellogg: The specs may not specifically talk about this.

Benjamin Young: Do we provide any guidance on getting text out of script element?

Gregg Kellogg: There’s a section in the API spec on that.

Rob Sanderson: Is someone willing to do the research on what the spec says that should happen?

Benjamin Young: The linked section only talks about HTML spec, not DOM spec. So we may look into that.
… I can look at it this afternoon.

Rob Sanderson: The Greggs have worked together, and now the Perl implementation for expansion passes all of the tests. He will now work on compaction, so expect new issues on that.

Dave Longley: +1 to macros

Gregg Kellogg: There was a fairly large rewrite of the compaction algorithm, which was an improvement.

3. Issues

3.1. Normalize language tags

link: https://github.com/w3c/json-ld-api/issues/337

Rob Sanderson: This is about our workaround language tags in i18n namespace.

Gregg Kellogg: We removed requirements to normalize language tags to lowercase, because it is problematic for many people in i18n community. When creating RDF, we have possibility that 2 processors create different data types.
… The question is if that is what we intended. To allow 2 diff datatypes by 2 different processors.

Ivan Herman: That would be wrong in terms of RDF.
… If you have 2 datatypes with different cases, RDF sees those as not equal.
… Maybe pchampin_ will say I’m wrong…

Pierre-Antoine Champin: I slightly disagree. 2 different URLs may denote different things, but also the same, but this depends on implementation.
… It’s hard to require all impls to support these different i18n datatypes and consider them equal.
… We could say that they are semantically equivalent.

Ivan Herman: Yes, we can do that. I don’t know if we are discussing something that is insignificant.
… If we do that, and have an implementation that does datatype reasoning, then that impl will likely fall on its face.
… Datatype reasoning is quite a challenge. Many implementations just check char-by-char.
… We could say that if you use these datatypes, that you are supposed to lowercase language tags.
… It’s ugly, but I don’t see a better choice.

Rob Sanderson: Is there some i18n requirement?

Ivan Herman: No, it’s a habit that there is mixing of cases.
… Usual way is:

Ivan Herman: the usual way is : en-US and not en-us

Ivan Herman: We should not require normalization when using lang tags the old way, but we should when using i18n datatypes.

Rob Sanderson: Is the set of characters that is permissible in URIs and language tags compatible?

Ivan Herman: Just ASCII characters.

Pierre-Antoine Champin: There will be 2 kinds of RDF impls: ones not recognizing our custom IRIs, and those that do.
… Those that will take into account our custom datatypes, can interpret them as lang tags and do smart things.
… The roundtripping would be lost when direction is used. I’m still in favour of not normalizing them.

Gregg Kellogg: I’m neutral on normalization. We should add a non-normative note in any case.

Dave Longley: Was it a mistake to not normalize language tags when they were invented?

Ivan Herman: Invented by whom?

Dave Longley: Not JSON-LD, but the group that came up with it 30 years ago…

Ivan Herman: We can not change it because it’s out there already.

Rob Sanderson: We can fix it for reduced datatype IRI.

Dave Longley: It looks like this grew organically, so the spec was built around it.
… What we introduce is new, so we can enforce normalization.
… So we simplify part of the space.

Ivan Herman: I don’t disagree.
… How important is it to roundtrip on such a detail?
… Because that is why we are discussing this.
… I don’t think it’s important.
… So I would normalize it.

Gregg Kellogg: From RDF Concepts: “A literal is a language-tagged string if the third element is present. Lexical representations of language tags may be converted to lower case. The value space of language tags is always in lower case.”

Gregg Kellogg: We did change the language of JSON-LD, which always normalized language tags, which was over-strict.
… RDF spec says that language tags may be lowercased.
… We are talking here about special case: roundtripping.
… It’s a minor thing what we are going to do.
… I would support that we change the language in toRdf, that language tags be normalized in compound literals and i18n.

Rob Sanderson: We ran into this in practice when having to do case-insensitive language tag comparison.

Proposed resolution: In toRDF recommend normalization of language tag based URIs (Rob Sanderson)

Pierre-Antoine Champin: +1

Rob Sanderson: +1

Ivan Herman: +1

Benjamin Young: +0

Gregg Kellogg: +1

Dave Longley: +1

Ruben Taelman: +1

Harold Solbrig: +0

David I. Lehn: +1

Adam Soroka: +1

Resolution #2: In toRDF recommend normalization of language tag based URIs

Proposed resolution: … also compound literal form (Rob Sanderson)

Rob Sanderson: +1

Pierre-Antoine Champin: +1

Ruben Taelman: +1

Dave Longley: +1

Gregg Kellogg: +1

Adam Soroka: +1

Resolution #3: … also compound literal form

Ivan Herman: +1

David I. Lehn: +1

Benjamin Young: +1

3.2. Boolean comparison issue (JSON Datatype)

link: https://github.com/w3c/json-ld-syntax/issues/323))

Rob Sanderson: Last week, we concluded that we should fix 323.

Pierre-Antoine Champin: The value of the JSON value type should not be a structured representation of JS object, but canonical form of JSON representation.
… We have our own canonic process. But this was marked as non-normative. I think this should be marked as normative.

Ivan Herman: Yes, I agree.
… You avoided canonicalization term, which is a good idea.

pr: https://github.com/w3c/json-ld-syntax/pull/325
… There is a small part that needs to be changed.

Pierre-Antoine Champin: I did change it.

Ivan Herman: I may have missed something then.

Pierre-Antoine Champin: Currently lexical value should be re-serialized.

Gregg Kellogg: Reason it was non-normative was JSC was still in draft.
… Object keys are ordered by converting them by UTF16 may be controversial.

Pierre-Antoine Champin: We should update the API doc as a copy of the normalization text in processing document.

Gregg Kellogg: We should also change the test descriptions.

Ivan Herman: Also the API doc currently repeats the canonicalization steps, and we should refer to the proper place.
… We will get a similar situation as with language tags, where we can not fully guarantee roundtripping.

Gregg Kellogg: Yes. Ordering of keys in our case is just lexicographical, while JSC is much more detailed with localization concerns.

Proposed resolution: Update api document to be in line with syntax for json datatype, test descriptions, and “canonicalization” algorithm (Rob Sanderson)

Gregg Kellogg: Not including that in syntax doc would not be sufficient for interoperation.

Gregg Kellogg: +1 (modulo key ordering)

Pierre-Antoine Champin: +1

Gregg Kellogg: modulo key ordering

Dave Longley: +1 modulo gregg’s comments

Rob Sanderson: +1

Ruben Taelman: +1

Harold Solbrig: +1

Adam Soroka: +1

Benjamin Young: +1 modulo gregg’s comments

Ivan Herman: +1

Resolution #4: Update api document to be in line with syntax for json datatype, test descriptions, and “canonicalization” algorithm (modulo key ordering)

Dave Longley: (it is important to match JCS … and hope it sticks in the future)

3.3. Confusing context URL handling,

link: https://github.com/w3c/json-ld-api/issues/265

Rob Sanderson: We concluded that the change would be good, but was too big.
… Now that we dropped out of CR, we should discuss whether the change is useful.
… Should we do that now?
… gkellogg, will this be an editorial change that doesn’t need a resolution?

Gregg Kellogg: It will change the algorithm signature.
… We need a way to keep track of the resolved context URL to use when resolving future context URLs.
… It impacts the places that call it to pass a value around.

Ivan Herman: If you change the parameters, how many implementations will have to change?

Gregg Kellogg: We don’t change tests. Implementations do not have to follow the algorithm exactly.

Ivan Herman: Does anyone else next to kasei follow the spec exactly?
… Do we know if we are affecting anyone else?

Gregg Kellogg: This was behaviour that was expected in 1.0.

Ivan Herman: If you do this change, do the implementation of Ruben and Dave have to change.

Ruben Taelman: I don’t follow the algorithm.

Dave Longley: agree with gregg, i don’t think we need to change anything in JS

Proposed resolution: Add text to api to clarify the need for tracking of base IRIs during context processing (Rob Sanderson)

Ivan Herman: +1

Gregg Kellogg: +1

Rob Sanderson: +1

Ruben Taelman: +1

Dave Longley: +1

Benjamin Young: +1

Harold Solbrig: +1

Adam Soroka: +1

Resolution #5: Add text to api to clarify the need for tracking of base IRIs during context processing

Pierre-Antoine Champin: +1

4. New CR timing

Gregg Kellogg: Probably would take 4 weeks from last week.
… kasei working on framing should give us enough time.
… So around February 14th, or 21st.

Ivan Herman: We have to go through the whole charade, so it will take a while.

Gregg Kellogg: So we start of 17th.

Ivan Herman: Ok, so that would mean the end of February.

Harold Solbrig: Is there any wording that I can say that even though JSON-LD is not a CR yet, I can use it safely without things changing?

Ivan Herman: You can say that there is no intention to change the technical content, but we may find bugs in the spec.

Harold Solbrig: Ok, that’s what I need.

Rob Sanderson: For framing, I will require kasei’s time during office hours, after compaction.
… So he will probably not do an implementation, just a review.

5. Adjourn

6. Resolutions

Resolution #1: Approve Minutes of previous call: https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2020/2020-01-17-json-ld
Resolution #2: In toRDF recommend normalization of language tag based URIs
Resolution #3: … also compound literal form
Resolution #4: Update api document to be in line with syntax for json datatype, test descriptions, and “canonicalization” algorithm (modulo key ordering)
Resolution #5: Add text to api to clarify the need for tracking of base IRIs during context processing

7. Action Items

Action #1: create issue for ATRISK of entities within HTML script elements. (Gregg Kellogg)