JSON-LD Working Group Telco — Minutes

Date: 2018-07-20

See also the Agenda and the IRC Log

Attendees

Present: Rob Sanderson, Adam Soroka, Ivan Herman, Simon Steyskal, Gregg Kellogg, David Lehn, Dave Longley, David Newbury, Jeff Mixter, Axel Polleres

Regrets: Benjamin Young, Harold Solbrig, Tim Cole, Leonard Rosenthol

Guests:

Chair: Rob Sanderson

Scribe(s): Dave Longley, Rob Sanderson

Content:


Proposed resolution: Approve minutes of last call: https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2018/2018-07-13-json-ld (Rob Sanderson)

Gregg Kellogg: +

Rob Sanderson: +1

David Newbury: +1

Gregg Kellogg: +1

Simon Steyskal: +1

Resolution #1: Approve minutes of last call: https://www.w3.org/2018/json-ld-wg/Meetings/Minutes/2018/2018-07-13-json-ld

1. Announcement / Reminders

Rob Sanderson: As of the 1st of August all the TPAC prices go up, so if you’re intending to come (please do) and haven’t registered please do so and book your hotel ASAP, etc.
… Save money and it will help us plan effectively.
… Any other announcements/reminders?

2. Introductions

Dave Longley: Work at Digital Bazaar, one of the original creators, worked on implementations. Looking forwards to helping out with new improvements in 1.1

Rob Sanderson: Rob Sanderson, co-chair along with Benjamin Young.

Ivan Herman: Staff contact of the group for W3C.

Adam Soroka: Adam from Apache Foundation

David Newbury: From software getty trust.

Gregg: I am awesome

3. Discuss additions to the regular process

Rob Sanderson: This is mostly brainstorming/approving. We want to make sure everyone agrees with the processes and we start use them as part of our work.
… How can we, in a lightweight way, track use cases for features so that we can express those use cases either in the specification to make it easier to understand or in a companion document providing the rationale and reasoning behind which features get entered and which might not make the cut.
… The second discussion will be valuable will be around selection of which issues make it onto the agenda for the call and there’s an easy way which is the chairs decide over the week, but we’re very very open to having either a more democratic process based on issues with most comments or whatever makes sense.
… Those are two process points I thought of, any others to add to this part of the discussion?

Gregg Kellogg: There may be some more overarching things that don’t resolve down to use cases or specific agendas. Particularly in the beginning of the WG where we might change directions/architecture. The general style of the specs, testing approach, broader structural approach, nomenclature, things like that might need broader discussion.

Rob Sanderson: Structural approach for the specs or ?

Gregg Kellogg: It includes both changing/moving things around in the syntax document and potentially in the API specs. Creating new specifications or NOTEs.

Rob Sanderson: Yup. There are two categories there, documentation of the spec and testing of the spec.
… We’ll add that to the list of things to talk about.
… Any further topics?

Axel Polleres: Axel from University of Economics, co-chaired the sparql WG … I was brought here because I was a bit unhappy with trying to do somethings in sparql w/JSON-LD and would like to help.

3.1. use case handling

Rob Sanderson: We have agreed over the last couple of calls that use cases are a good way to ensure that the specs are practically focused and have real requirements rather than hypothetical. In order to demonstrate that we need to be able to track those use cases. We also don’t want to do some heavy weight six month analysis and then deduplicating requirements and so on, want to get to work.
… So putting the use cases somewhere but not spending a lot of time agonizing over them.
… So a suggestion was a wiki page on the W3C site or on github. And use that to collect them. Once we’re in CR or later come back and clean them up.

Rob Sanderson: What do people think about that?

Adam Soroka: +1

Ivan Herman: +1

Gregg Kellogg: +1

Simon Steyskal: +1

Dave Longley: +1 to github to keep everything there

Jeff Mixter: +1

David Newbury: +1

AxelPollleres: +1

Ivan Herman: There’s a practical problem, I haven’t set up a separate wiki for the group at W3C so we can use the wiki on github.

Proposed resolution: Use the -wg wiki on github to capture use cases as they come up in discussion (Rob Sanderson)

Gregg Kellogg: +1

Dave Longley: +1

Rob Sanderson: +1

Simon Steyskal: +1

Adam Soroka: +1

David Newbury: +1

Ivan Herman: +1

David I. Lehn: +1

Resolution #2: Use the -wg wiki on github to capture use cases as they come up in discussion

Gregg Kellogg: https://github.com/w3c/json-ld-wg/wiki

Rob Sanderson: Any further questions/comments on that?

Rob Sanderson: We can modulate as we go.

Rob Sanderson: +1 to GH templating if possible

Adam Soroka: I agree that we don’t need some formal machinery. It’s helpful to use some of github’s machinery for templating. I believe it lets you offer some suggested fields, it creates some coherence. I’d be happy to look at that if we decide to go that way.

Rob Sanderson: Great, and if you’re willing to set up some templates that would be perfect.

Action #1: Adam to look into wiki templating on GH for use cases

3.2. Selection of issues for the agenda

Rob Sanderson: I think we’re all happy to do whatever makes the most sense and we can change if we’re not happy with the current approach.

Rob Sanderson: Any suggestions for changes or should we just keep going as we’re doing it and change later if needed?

Ivan Herman: Let’s not overcomplicate things, if it works, keep going. People can always suggest by email or if they want to keep it private they can email the chairs list and say what they want to talk about.

Ivan Herman: That’s why we have two chairs.

Dave Longley: +1 to ivan

Gregg Kellogg: To the degree that we want to sort of use the issue list to drive much of our work, I’m wondering whether an issue there … maybe one exists, some needs discussion that we can use to filter through the issues that need to be discussed on the call.

Rob Sanderson: You mean a label or a tag?

Rob Sanderson: +1 to a priority label for GH issues

Gregg Kellogg: Yes, a label – so we can label issues that need discussion.

Dave Longley: +1

Gregg Kellogg: Given that label, everyone in the group, I presume can also give them labels.

Ivan Herman: If you are signed up in the team then you have sent me your github handle, etc. and you can do it.

Simon Steyskal: +1

Gregg Kellogg: Is there a particular label?

Simon Steyskal: Needs discussion was the label. Needs WG discussion, something like that.

Simon Steyskal: https://github.com/w3c/poe/issues

Ivan Herman: I am happy to add the “Needs Discussion” label if you want.

Dave Longley: +1

3.3. style/ structure of specification documents

Rob Sanderson: One thing that was noted in the CG with the 1.0 style and structure was that the normative text was hard to find. The lovely examples were treated as if they were the specification.
… The implementations, if they are not strictly following the API and algorithm they have a lot more work to do. To make it do what should be happening, not mindlessly/dogmatically following the algorithms.
… The single flat list of features gets quite long, 30 something. That could probably do with some further structure to group together similar topics.
… Anything else Gregg?

Gregg Kellogg: There is a difference between syntax and framing/api specs. The syntax spec – the bulk of it is larger informative, the grammar section being normative. We didn’t know how to structure it differently, the spec would limit itself to the grammar. Most of the discussion is the motivation for “why the grammar”.
… The API document on the other hand, the algorithms are normative. That sort of maybe creates the opposite problem. We found some of the discussion around the performance implementations of following the framing formula for non-trivial graphs can run in exponential time.
… Something sticking to the algorithm is somewhat doomed. I think if we were able to discuss behavior more that would be better (what vs how).

Rob Sanderson: +1 to gathering (e.g. indexing)

Gregg Kellogg: With the syntax spec, in terms of grouping things, talking about indexing, graph/identifier/etc, if we had a more general section on indexing or use of dictionaries that would be a way to gather those types of things. Ivan perhaps provoked perhaps separating the named graph/data set aspects of the specification into another section or graph.

Rob Sanderson: +1 to separating named graph / dataset into separate section

Gregg Kellogg: So the primary way to read the syntax spec would be as a graph like most other RDF specs, but leaving the other data set parts into another spec or gathered in the same spec elsewhere to keep those concepts from getting confused.

Ivan Herman: I mostly concentrated on the syntax document when I read through it. There is a big difference between the APIs and the expanded/compacted/flattened forms and that it was all in the same section. I think it should be a separate section to make it clear their role is different from previous explanations.
… One thing we discussed maybe – was to be more systematic for all examples, to represent them in various forms, with … each example, in my view, should be represented in a table in a visual graph. Various readers look in different ways.

Rob Sanderson: GH Issue: https://github.com/w3c/json-ld-syntax/issues/26

Ivan Herman: That’s a lot of work because there are a lot of examples, that’s a good thing, but it would make things much more readable.
… There is a section that says “How to read this document” but I think it should be much more emphasized that what is normative is section 5 or 6 (don’t remember). It should be very very clear that most of section 4 and all of section 4 … is non-normative and so, if as an implementer you want to understand some of the details, that doesn’t count.
… We tried to trace back a problem Gregg and I at some point, and we had to jump back to the examples and the normative section to understand things. That should be reviewed, it may be a problem.

Dave Longley: IIRC, I think we said that the algorithms can be done however, so long as the input/output is the same. I think that statement survived in the spec.
… I think Gregg mentioned that it’s a good idea to separate graph / dataset aspects in the spec, but not as a separate spec completely.
… Not everyone will be super familiar with RDF, and won’t think of things in this way. Need to be careful with categorization, but we can address both audiences.

Gregg Kellogg: Yeah, to Dave’s point, there is something in the conformance section that specifically said that the algorithms are written for clarity and that processors may implement the specification in any way desired. Those sections are described as being normative, however. They are also described in quite a lot of detail. In so much detail, that they might actually get in the way of doing an implementation if you need to stick to it.

AxelPollleres: just as a remark… I think that assuming many are not familiar with RDF is ok, but we still would like RDF tooling to work as intuitive/expected with JSON-LD, at least I would (and hope I am not alone with that in the group).

Gregg Kellogg: I think we gave up on list of lists because it was difficult to specify as an example. Different libraries that work on such things may have tools that allow those things to be done without being described in such details. When I updated the framing spec I tried to be more descriptive than prescriptive.
… A general update to make them more generalized in terms of what needs to be done rather than getting down to the point of specific variables, loops, steps, things to implement.

Ivan Herman: Whether I like it or not, there are number of other specs that have this approach of describing the algorithm. Some are positively unreadable but they are there because there is a style that is there and those things are normative. We need to be careful not to open up to a critique from members that are used to this style.

Rob Sanderson: +1 to it being very hard to find the right balance!

Ivan Herman: We should have a look at some of those around DOM, WebRTC, etc.

Dave Longley: +1 for sure and don’t want to make it difficult for people to just implement the steps mindless too :)

Dave Longley: it’s not always a bad thing.

Rob Sanderson: A piece of feedback we had from a non-W3C member was that they were trying to use X-Query to implement JSON-LD … I don’t know why. But the way XSLT worked … it was quite different and functional and they gave up trying to implement the algorithm. Instead they tried to understand the exact outputs from the syntax doc.
… That is something that we can something improve on – it’s a difficult balancing act to get right.

AxelPollleres: Regarding lists of lists… I was puzzled that things didn’t work for a relatively simple use case that’s why I showed up in the WG. Is it difficult to define a uniform way of handling it or extend the existing algorithm? Why wasn’t list of lists addressed?

Gregg Kellogg: When researching this, in the original CG discussion, the comment was that I was not able to come up with a simple formulation of the algorithm that supported lists of lists. More directly, at the time, we didn’t have use cases to support it. The group decided to abandon further work on that, that doesn’t mean it can’t be done.
… I go down to the nature that we describe these algorithms possibly being in the way.
… There was also an issue we had at the time for the way that a list is described. There was a problem with the way that the list is described within a @context and understanding how that might be applied recursively. We’ve defined some additional features or proposed some things that might help there now.

Gregg Kellogg: It needs some fresh eyes now.

Axel Polleres: a use case I do have, though it is a simple one, admittedly. and if thats’ what it takes, I’d be happy to help think about the formulation of the algorithm, so happy to support looking at it with fressh eyes :-)

Gregg Kellogg: Regarding bnode naming in the spec, particularly in the flattening algorithm which tends to be used for a lot of different things. If you’re expected to produce a document in which the bnode names are predictable – which is really useful for comparing two JSON documents, that prohibits different types or parallel algorithms.

Gregg Kellogg: The problem there is that … if we could come up with a way – easily testing two different JSON documents to see if they are the same … that might be a way to ease that algorithmic requirement.

Dave Longley: we need to be careful that that feature isn’t just for us to test – that maybe others rely on that outside of us

Gregg Kellogg: see https://github.com/w3c/json-ld-syntax/pull/35

Ivan Herman: I was surprised by that, but as Dave says, not everybody knows about RDF. The impression that people get that these bnodes are just like the others and we should give them a semantically meaningful name and they will be kept by other conversions … and this won’t happen.
… I think that’s something we should change all over the place.
… The way we did it for RDFa back then – when we tested our implementations, purely on the test cases, we ran strings through some sparql engines that looked at what’s there. We could do something similar and convert the JSON-LD to RDF and let the RDF processor look at the isomorphisms and then we are fine.

Adam Soroka: Just a note on what was just suggested, I think it’s an excellent and promising suggestion. It shifts locus of authority to the test cases rather than the actual wording of the spec.
… As a programmer, that’s attractive in many ways. It’s also unattractive in other ways. It’s difficult to have a discussion about the test cases.

Rob Sanderson: +1 to that

3.4. testing issues

Gregg Kellogg: We have a pretty comprehensive test suite that was originally created for 1.0 and the CG maintained it so that everything that’s currently described in the spec has tests, it lives in json-ld.org.

Gregg Kellogg: There’s reasonable support for keeping it there. Creates questions about distinguishing CG from ongoing WG. Not sure if that’s important or not.
… There’s a notion of how you determine equivalence, reducing things to graphs is appealing, however, some of the algorithms are specifically intended to look at the structure. For instance, the choice of a particular term when compacting, or the form being a string, or a value object, or in the certainly in the sense of a frame you have the layering of different objects linked together is lost when reduced to a graph.
… If the algorithms describe things that would be lost when reduced to RDF we can’t rely upon that.

Dave Longley: +1

Simon Steyskal: https://github.com/w3c/json-ld-wg/blob/e07e37d59b1df9d04d8ca389b31d118cd85832ea/WorkMode/guiding_principles.md#proposed-the-underlying-data-model-is-rdf

Simon Steyskal: As a follow up to what Gregg just said. I can remember in the last call we discussed those guiding principles. One of those things was that the data model was RDF, despite me being for that. There’s also this objection to it, we should not constrain ourselves to it. If a feature comes up that can’t be modeled with RDF would we still implement it?
… If we say that we transform to RDF and use isomorphism checks for conformance, doesn’t that mean that what we want to test must be representable with RDF. This would mean any new features would not be representable as RDF.

Rob Sanderson: A very good question to come back to.

Ivan Herman: A very practical point, I have no point in keeping the tests in json-ld.org github. One thing we have to think about, W3C has set up an environment where the whole W3C organizational part of github is archived.
… If ever there is a problem with github, god forbid, the test cases for json-ld might become lost.

David Newbury: Going back to the RDF. When understanding JSON-LD from my point of view. Making that distinction … a big chunk of JSON-LD is a serialization of RDF but another chunk is about structural but not semantic manipulation of JSON documents. Might be helpful to think about those as two separate parts when writing the spec.
… I’m thinking about the ways people use JSON-LD in my experience there are those two parts and they are reasonable divorced from each other.

Dave Longley: +1 to that

Gregg Kellogg: I’d say that probably 80% of the algorithm are about structural manipulation, maybe not even 20% are about transformation to RDF. Most things do transform to RDF, some things that don’t. But the algorithms are really about changing between different JSON/more syntactic like representations, compaction<=>expansion, framing, flattening, specific use of @contexts, applying @context documents when expanding/compacting.
… Maybe there’s something we can do within the API documents that emphasizes that.

Dave Longley: JSON-LD is not just a “transport” format … the specific structures are important because people code to them – this is what makes JSON-LD different from other RDF syntaxes.

Rob Sanderson: We should call out and test when things are not based on the RDF data model such that the tests can be implemented, not in an RDF model coherent way, but don’t prevent us from adding features that aren’t RDF specific. It gives us an incentive not to, and a reminder that we’re stepping into untested waters when we do it but doesn’t a priori prevent us from doing it.

Adam Soroka: I like that a lot, it doesn’t cut off the possibility of JSON-LD features that can’t be modeled in RDF. But we should keep in mind there are things that can be modeled in RDF as the center of gravity.

Ivan Herman: +1 to ajs6f

AxelPollleres: just to understand … RDFization looses document order if the coding against the JSON-LD relies on document order, obviously there’s something lost… is that (oner of the things) what you mean?

Rob Sanderson: Does that move the chains forward?

Rob Sanderson: +1 to simon

Simon Steyskal: To highlight this issue, more or less, maybe we should also think about adding some form … we have these guidelines for referring to a future RDF WG these features. Maybe treat these features differently by testing differently, but would also add RDF test cases if certain features are added to RDF at one point. I don’t know if this is just additional work.
… This is not then forever a special test case but if added to RDF we would add the respective RDF test case.

Gregg Kellogg: Right now there are a few things, that I can think of, in JSON-LD which aren’t in the RDF data model. Principally there’s the data indexing and as Axel pointed out the ordering of things in graphs that we say don’t have a semantic meaning. It’s worth enhancing the data model section to describe these things and ascribe to limit them.

Gregg Kellogg: No RDF syntax describes ways in the same way that JSON-LD does, it’s fairly unique.


4. Resolutions

5. Action Items