JSON-LD Working Group Telco — Minutes

Date: 2018-06-29

See also the Agenda and the IRC Log

Attendees

Present: Simon Steyskal, Adam Soroka, Rob Sanderson, Leonard Rosenthol, Gregg Kellogg, Benjamin Young, David Lehn, David Newbury, Ivan Herman, Harold Solbrig, Tim Cole, Jeff Mixter

Regrets: Ivan Herman

Guests:

Chair: Rob Sanderson

Scribe(s): David Newbury, Rob Sanderson

Content:


1. approve minutes

Rob Sanderson: First topic–approving the minutes of the last call.

Benjamin Young: +1

Harold Solbrig: +1

David Newbury: +1

Gregg Kellogg: +1

Rob Sanderson: +1

Simon Steyskal: +1

Leonard Rosenthol: 0

David I. Lehn: +1

Ivan Herman: +1

Resolution #1: last week’s minutes approved

2. TPAC info

Rob Sanderson: Hotel block extended through the Friday night, which means that people who are leaving on Saturday (aka us) couldn’t even book. That’s been extended, so everyone can both register and et the hotel at the correct TPAC rate.
… There are diversity scholarships available, so if there are people who are typically underrepresented in the web standards community, and would like to attend, please encourage them to apply for that scholarship fund.

3. next week telco

Rob Sanderson: Next week is independence day, on July 4th, is anyone then not going able to make next week’s call?

Leonard Rosenthol: I cannot make it

Rob Sanderson: Nobody else will not be available, it seems, so we will have the call as per normal.

4. further introductions

Rob Sanderson: Introductions. Let’s go down the list. If you were on the call, name and introductions.

Rob Sanderson: Semantic architect, J. Paul Getty Trust, co-chair

Adam Soroka: Apache software foundation, as well as smithsonian

Jeff Mixter: OCLC, working with semantic web for the past six years, getting linked data in OCLC’s web standards, and working with Rob and others in advancing standards and the new IIIF presentation API.

Benjamin Young: Benjamin Young, John Wiley & Sons, Co-Chair of JSON-LD also participating in the Publishing WG

david lane: working at digital bazzar

Gregg Kellogg: Spec Ops, and driver of JSON-lD

Leonard Rosenthol: adobe systems, but chairs the XMP working group at the ISO, and one of the projects is to create a JSON-LD serialization of XMP, so here to participate.

Simon Steyskal: research scientist at Siemens

Tim Cole: Tim Cole, University of Illinois, Urbana Champaign
… also working with the web annotation spec

Harold Solbrig: harold solbrig: Mayo Clinic

5. overview of the CG documents

Rob Sanderson: JSON-LD community group has been working on things like framing and other things
… so, Gregg, as the main instigator of those drafts, can you go through them rapidly, since most people here understand JSON-LD 1.0 but may not have kept up with the community group changes?

Gregg Kellogg: https://www.w3.org/community/json-ld/

Gregg Kellogg: The community group made pages, and if you open them, you’ll find the updates to the 1.0 document which are obvious in a couple ways.

Gregg Kellogg: https://www.w3.org/2018/jsonld-cg-reports/json-ld/#node-type-indexing

Gregg Kellogg: for instance, if you check a section that has changed, you’ll see it’s highlighted and provides a visual indicator of the parts of the spec that have been changed.

Gregg Kellogg: https://www.w3.org/2018/jsonld-cg-reports/json-ld/#changes-since-1-0-recommendation-of-16-january-2014

Gregg Kellogg: there’s also a section describing what has changed.

Gregg Kellogg: https://json-ld.org/presentations/JSON-LD-Update-TPAC-2017/assets/player/KeynoteDHTMLPlayer.html#6

Gregg Kellogg: rather than walking through the document, I’ll point to the presentation that was given at 2017 TPAC, which went over the notable changes
… principal is the version announcement, to activate the new processing model
… which announces the version 1.1. Unfortunately, 1.0 did not call out as error conditions things like containers that were out of bounds.
… so new features, such as ID container, creating ID maps, requires that the version be specified as 1.1.
… ID maps were a requested feature, previously if you have a number of different values, and you needed to cycle through them–we had maps for non-semantic data tags, and this extends it

Ivan Herman: -> https://json-ld.org/presentations/JSON-LD-Update-TPAC-2017/assets/player/KeynoteDHTMLPlayer.html#7 @id Maps

Gregg Kellogg: similarly, type maps, which lets you index by type

Ivan Herman: -> https://json-ld.org/presentations/JSON-LD-Update-TPAC-2017/assets/player/KeynoteDHTMLPlayer.html#8 Nested property

Gregg Kellogg: nested properties–there are a number of JSON serializations that like to group properties under a non-semantic term
… This also allows for several indefinite recursion by fallout of the design

Ivan Herman: -> https://json-ld.org/presentations/JSON-LD-Update-TPAC-2017/assets/player/KeynoteDHTMLPlayer.html#9 scoped contexts

Gregg Kellogg: scoped contexts actually solved a number of different problems
… it allows you to define a context within a term definition, so the processor will apply the embedded context on top of the context already there

Ivan Herman: -> https://json-ld.org/presentations/JSON-LD-Update-TPAC-2017/assets/player/KeynoteDHTMLPlayer.html#10 scope for @type

Gregg Kellogg: scoped contexts can also be used for type values, and they can also be used for objects with multiple types
… and if any are used, that scope is applied
… that addresses the primary features added by the community group
… you can look at the changes for a more comprehensive list.

Gregg Kellogg: https://github.com/w3c/json-ld-syntax/pull/2

Gregg Kellogg: these versions of the documents were described in the charter as the basis of this
… there are pull requests that the working group should consider, that would pull in the update from the community group that makes it a draft for the working group
… there are similar pull requests for the other documents
… there is the nice feature that will create a preview of the documents and the diff, against the 1.0 spec, though it might not be useful given the number of changes

Rob Sanderson: worth going through the algorithm and framing changes? is there a summary around that?

Gregg Kellogg: https://www.w3.org/2018/jsonld-cg-reports/json-ld-api/#changes-since-1-0-recommendation-of-16-january-2014

Gregg Kellogg: there are also changes in the API document
… the primary one is to separate the processing from JSON-specific to web-idle (sp?), which allows this specification to be used for alternative serializations such as YAML.
… you can see where it’s been highlighted to be changed for 1.1, it didn’t need a whole re-write, just additional steps
… there is support for additional keys, to support the various features that we use, such as container having different values, prefixes explicit
… I’m going to digress–there is an incompatible change from 1.0–a processor running in 1.1–in term selection, when finding a prefix for a compact iris, it adheres to the intention of the 1.0 spec
… in 1.0, any term could be used for a prefix–if you had to find a terms for sports, but you had a term for sports_events

Rob Sanderson: e.g. http://schema.org/sports and http://schema.org/sportsEvent –> sports:Event :(

Gregg Kellogg: so there’s a change to only map the term to an iri, or in 1.1 a prefix key, which allows it to be explicitly used as a prefix
… there’s probably too much to go into in that much detail, but there are some changes to the API section, that rely upon portions of the algorithm implemented rather than the core algorithm itself
… there is a problem that we may run into
… the error codes that are defined in 9.4.2
… are phrases, and apparently web-idle wants them to be single words

Rob Sanderson: https://www.w3.org/2018/jsonld-cg-reports/json-ld-api/#jsonlderrorcode

Gregg Kellogg: so currently there is a RESPEC bug that when we cook this doc doesn’t create IRIs for them
… we may want to change these error codes, though that would be an incompatibility

Rob Sanderson: any questions about the syntax or the APIs or algorithm ?
… less discussion, and more understanding questions?

Adam Soroka: small question–what is web-idle?

Leonard Rosenthol: WebIDL

Benjamin Young: https://heycam.github.io/webidl/

Gregg Kellogg: it’s a specification–interface definition language
… the idea being that you could run a tool that would translate that into the actual interface def. in whatever language you want.
… works well for javascript, but not other languages

Benjamin Young: WebIDL Level 1 Recommendation https://www.w3.org/TR/WebIDL-1/

Gregg Kellogg: in particular, it allows us to describe promise interfaces
… (async interfaces)
… so if you look at the expand/compact/flatten interfaces, you can seem them defined
… webIDL lets us find these sorts of interfaces
… the only way that we have to signal an exceptional condition, causes a promise to fail, which raises an error
… otherwise, the alg. are defined to be written in a purer form that does not require any specific interface requirements

Rob Sanderson: any questions about anything?
… let’s go on to framing

Gregg Kellogg: https://www.w3.org/2018/jsonld-cg-reports/json-ld-framing/

Gregg Kellogg: framing doc is somewhat different, as framing was never defined as a spec.
… work was done, but it was not complete, and it was not of interest to 1.1, however it is widely implemented and people are dependent on it.
… framing was intended to be use if you have a flat representation–RDF to JSON-LD
… you’d get many objects defined at the top level, which is not how people tend to look at these things.
… JSON-LD allows you to have similar embedding rules–but how do you get from a flat representation to a embedded representation
… framing creates a programming by example, you create a schematic document, that can be matched on different terms
… if you have a document that has a specific term, it will match on the type
… if that framed object contains values, framing will then look for objects that have those values
… so we can get back to the types of values that we type to see
… it can also be viewed as a query language
… but 1.0 matching was somewhat constrained, but there was a lot of work to do on wildcards, literal values with language, things such as that.
… 1.0 framing did not support named graphs, 1.1 does, either having the top level be the union or all graphs

Gregg Kellogg: https://www.w3.org/2018/jsonld-cg-reports/json-ld-framing/#changes-since-1-0-draft-of-30-august-2012

Gregg Kellogg: there are changes since the 1.0 draft, but they are based on the document published along.

Tim Cole: Gregg, the community group has done great work
… in regard to framing, we did some early work with compacting, framing, and for moderately large docs it seemed framing, while powerful, was very slow.
… do we need to think about moving that into a recommendation, normalizing it, that the implementations be some real performance issues?
… or do we just worry about the spec, and not care about performance?
… the performance cost of that

Ivan Herman: from the process W3C, the CR phase, the implement checking phase, whether the recommendation is complete and consistent
… which also means that there are no speed requirements by W3C. We may decide to enforce that, but it would not be a generic requirement.

Tim Cole: what has been the recent experience:

Gregg Kellogg: there’s the larger issue of performance of JSON-LD for large values.
… It can be used for a dump format. A document with millions of records in it. The alg. as defined are not kind for this.
… you get into the combinatorial problem–it can be used for querying, but it’s not written to take advantage of fast algorithms.
… the specs do not require that the alg. precisely, but operate in a way consistent with them.
… for instance, they’re prescriptive around bnodes, so you must directly implement them
… personally, I think that there’s other mechanisms that rely on graph querying where framing is specified much more like the construct from SPARQL, and it’s not required that we do that, but we should provide some level of expectations around that.
… and the preponderance of those docs, say schema.org, webpages, it’s probably perfectly adequate, but you get into cases when there are thousands of objects involved.

Leonard Rosenthol: this seems to be an approach like XSLT for JSON

Gregg Kellogg: it’s similar to that

Benjamin Young: Gregg–can you paint a quick picture of framing vs. SHACL or SHeX

Gregg Kellogg: they’re used for matching RDF graphs, and there’s a certain matching component to this, but the audiences are somewhat different
… SHeX can also be used for creating new docs, so it’s quite similar.
… but they’re RDF graph based, so not concerned with topology

Rob Sanderson: we should look at pull requests
… and approve them to be merged

Gregg Kellogg: https://github.com/w3c/json-ld-syntax/pull/2

Simon Steyskal: SHACL https://www.w3.org/TR/shacl-af/ can do that too using rules (not part of the Rec though)

Gregg Kellogg: https://github.com/w3c/json-ld-api/pull/1

Gregg Kellogg: https://github.com/w3c/json-ld-framing/pull/1

Rob Sanderson: no point in reading every line, but can you describe?

Gregg Kellogg: nominally, the WG is moving from the CG, so we need to change those documents to no longer mention CG, etc. These define versions of the documents for that. Pull requests to create as editors drafts versions of these documents for doing changes the group approves of

Proposed resolution: the three PR-s (see above) define versions of the document as WG ED-s, serving as a starting point for the WG, hence merge them (Ivan Herman)

Gregg Kellogg: +1

Benjamin Young: +1

Rob Sanderson: +1

Tim Cole: +1

David I. Lehn: +1

Simon Steyskal: +1

Ivan Herman: +1

Leonard Rosenthol: +1

Rob Sanderson: no normative changes, just changes to make the WG drafts.

David Newbury: +1

Benjamin Young: +1

David Newbury: +1

Simon Steyskal: +1

Rob Sanderson: +1

Adam Soroka: +1

Leonard Rosenthol: +1

Gregg Kellogg: +1

Tim Cole: +1

Harold Solbrig: +1

Rob Sanderson: this doesn’t mean we ever need to publish these, but it gives us a starting point

Resolution #2: the three PR-s (see above) define versions of the document as WG ED-s, serving as a starting point for the WG, hence merge them.

Rob Sanderson: thank you, and are there any other issues or questions?

David I. Lehn: I just noted a common copy & paste typo in all the prs

David Newbury: We were running CONSTRUCT queries that made an index of 200k objects. The major bottleneck was framing, the docs were 1000 lines or so. 500 ms or 1second process, but with 200k objects that was a lot of processing
… so for indexing style projects, the performance indications are very real

Gregg Kellogg: might be to much bike shedding, but maybe there’s an alternative around querying…

Ivan Herman: just a remark: it might be bikesheding, but to make sure we know the reasons to see if the framing document will make it impossible to make optimizations. That might be vague, but
… the blank nodes part, for instance, specified how they are named–there are good reasons, but if that’s the bottleneck we should rethink that.

Gregg Kellogg: I agree, but I want to make sure that the complexity vs. something more precise.
… it might be that a note that describes an alg., or something with more detail

Rob Sanderson: for next call, it would be great to go through some design principles so that we have well-understood methods to assess features
… to help manage complexity vs. functionality, etc.

Ivan Herman: question on process–when I read through the syntax doc, some errors, editorial, some not–should we create those issues, or should we wait until we know the process?

Rob Sanderson: just migrate issues, create new issues.
… .thank you, and we’ll meet next week at the same time.


6. Resolutions