Meeting minutes
<gb> Issue 35 CBOR-LD, YAML-LD & the JSON-LD Recharter (by mandyvenables) [session]
bigbluehat: Welcome to the group, thanks for coming. In the framing of JSON-LD recharter, we'll talk about CBOR-LD as well.
bigbluehat: I'm Benjamin Young, with a few co-presenters.
bigbluehat: The recharter is still under discussion, you can observe on Thursday. The charter is a big work item and we would like to qualify what goes in it, ship it, etc.
bigbluehat: Some history, JSON-LD has been around since 2014, been around since RDF WG. Syntax and API... 2018, we did a JSON-LD WG ... minor version bump to v1.1 (that's where I got involved). v1.1 is used in Verifiable Credentials, schema.org, etc.
bigbluehat: In last epoch, we were a maintenance WG, haven't done much, did some errata, still in v1.1
bigbluehat: What we're doing next -- we'll maintain v1.1, don't know if v1.2 is next or v2.0. We hope there is never a backwards compat break.
bigbluehat: There is intended to be a new version, that's why we're rechartering.
bigbluehat: We would like to make these new features a reality. We have two major new documents: CBOR-LD, which has multiple inputs, then YAML-LD, which will be a presentation from Anatoly.
bigbluehat: We'll look at RDF* stuff as well
wes: Hi, Wes Smith from Digital Bazaar, will talk about CBOR-LD, high level overview.
Wes: What is CBOR-LD -- semantic compression format for JSON-LD... if we want to shrink Verifiable Credentials down to a very small size, we can use CBOR-LD.
Wes: We are also proposing extensible registry mechanism for CBOR-LD to be self-describing, smooth interop w/ JSON-LD.
Wes: How does compression work: You build a compression dictionary from processing @context files. You assign integer values to terms and you do this in a deterministic way.
Wes: After compression, it's binary -- someone else needs to decompress... the algorithms used to process contexts and make the compression dictionary can invert the compression.
Wes: CBOR has tags, tag registry, type of data that ... range of CBOR tags, additional precision in range describes what kind of CBOR-LD... self-describing CBOR. If I'm a general CBOR-LD supporting verifier of a document, I can decompress it. All this happens w/o any modifications for JSON-LD.
Wes: CBOR-LD Registry - you get a range of CBOR tags for CBOR-LD, you look at CBOR-LD tags, look up data in registry, get compression.
Wes: Compression table -- literally a mapping from terms in JSON-LD to integers.
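Illustrative sketch (not presented in the session): the idea of a term-to-integer compression table can be shown with a few lines of Python. This is a simplified illustration under stated assumptions, not the CBOR-LD algorithm itself: the function names, the starting integer 100, and the use of sorted term order as the deterministic rule are all hypothetical, only flat documents are handled, and no CBOR byte encoding is performed.

```python
def build_term_table(contexts, first_id=100):
    """Assign an integer to every term found in the given context maps.

    Assumption: sorting the terms is used here as the deterministic rule,
    and first_id=100 is an arbitrary starting value chosen for the example.
    """
    terms = set()
    for ctx in contexts:
        terms.update(ctx.keys())
    # Sorting makes the result independent of the order of contexts and keys.
    return {term: first_id + i for i, term in enumerate(sorted(terms))}


def compress(doc, table):
    """Replace known term keys with their integers (flat documents only)."""
    return {table.get(key, key): value for key, value in doc.items()}


def decompress(doc, table):
    """Invert compress() by rebuilding the same table and reversing it."""
    inverse = {number: term for term, number in table.items()}
    return {inverse.get(key, key): value for key, value in doc.items()}


# Hypothetical context and document, for illustration only.
context = {
    "name": "https://schema.org/name",
    "birthDate": "https://schema.org/birthDate",
}
table = build_term_table([context])
packed = compress({"name": "Alice", "birthDate": "1990-01-01"}, table)
restored = decompress(packed, table)
```

Because both sides derive the same table from the same contexts, the receiver can invert the mapping without the table being transmitted.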
Wes: Here's an example of JSON-LD VC on left, and on right CBOR-LD compressed form. High level here is terms get mapped to integers and you get an order of magnitude compression out of that.
Wes: Here is part of a VC, example driver's license...
Wes explains how JSON-LD document goes to compression map goes to compressed data.
Wes: You can also compress typed values.
Wes: Here's the result (binary blob of data)
Wes: Back out to big picture... JSON-LD VC is 1,200 bytes, and the CBOR-LD is 325 bytes.
bigbluehat: CBOR-LD is in the charter currently, being developed in JSON-LD CG. There are some others, JSON-LD in CBOR... CBL is another option for compression, we will want to take a look at their work
bigbluehat: CBOR-LD is in production at DB and TruAge.
Wip: Are there other implementations?
Wes: Yes, Java, Rust, JavaScript, and other implementations.
bigbluehat: There will also be a test suite... I'm not writing that one!
PA: The contexts are in the registry, can I use CBOR-LD with an unregistered context?
Wes: Yes, short answer. You will have a long URL. There are a couple of ways to make that happen.
Wes: Don't compress URL
Wes: Provide application-specific URL compression.
Wes: Put stuff in registry.
Wes: The registry is more use case specific... mappings of context URLs to integers are big lists that a lot of different people can add their context to. It's not going to be California DL, each state can add entries.
iherman: To follow up on formal side for W3C, will there be a W3C Registry? Somewhere else? Registry should be accessible to all.
bigbluehat: Yes, that's what the group needs to decide.
iherman: If you want to do a registry, it needs to go into the charter.
iherman: You have to define the policy that the registry gets updated with, establish then run the policy. It has to be in the Charter.
PA: Technically, we can add deliverables if they're in scope.
iherman: We should probably include it in the charter.
bengo: is the registry specific to CBOR-LD? Or is it useful for other encodings?
bigbluehat: It's not constrained at all, string to number map.
iherman: It would be nice to see an example
bengo: It would be a nice RDF -> CBOR code point registry.
bigbluehat: Same term to number thing would work... you could go from JSON-LD to CBOR-LD.
bengo: It's deterministic? The decoding of some blob is deterministic.
bengo: Every possible JSON-LD can go from JSON-LD to CBOR-LD.
Wes: If I have some set of contexts associated with the document, if I implement CBOR-LD, same compression map.
bengo: If we use this for ActivityPub, other non-ActivityPub people can use this?
Wes: Yes, definitely. We can do two things 1) compression in general, 2) highly specific CBOR-LD. What you're describing is not tied to registries.
gkellogg: Any context in registry cannot change.
Wes: What goes in registry doesn't have to be static.
YAML-LD
Anatoly: YAML-LD is targeted at humans, CBOR-LD is targeted at machines.
anatoly-scherbakov: We have implementations, test suite, WG is picking up to formalize.
anatoly-scherbakov: Going to show driver's license vc in YAML-LD.
anatoly-scherbakov: YAML is a superset of JSON, every valid JSON document is valid YAML document. Simplifies testing. You can run json-ld tests.
anatoly-scherbakov: You can make improvements, typing manually, this will be easier
anatoly-scherbakov shows how simplifications can happen using YAML-LD
anatoly-scherbakov: You can get rid of quotes as well. For the most part, this gets rid of lots of typing and makes documents smaller. Unfortunately "@" needs to be quoted.
anatoly-scherbakov: We can also do multiline strings in YAML, issuer id code, use greater-than dash (">-") in order to say extra line breaks get stripped, avoids super long lines.
anatoly-scherbakov: super killer feature is comments... if people author/read linked data, comments are vital for conveying the intent of the author.
anatoly-scherbakov: Comments are especially pleasant
anatoly-scherbakov: We have a few use cases for YAML, writing VCs, FAIR metadata publications, nanopublications -- JSON-LD snippets describing knowledge.
anatoly-scherbakov: YAML is widely used in industry
anatoly-scherbakov: Theoretically can interpret existing YAML... arbitrary JSON can be interpreted as JSON-LD.
bigbluehat: Gets really interesting on YAML front matter on files.
anatoly-scherbakov: We have two implementations Ruby and Python.
anatoly-scherbakov: YAML LD is a very thin layer on top of JSON-LD.
anatoly-scherbakov: One thing to note, "@context" has to be quoted... but we could remove quoting w/ other keywords... convenience context on JSON-LD to map JSON-LD keywords to dollar signs, only one context, other maps them to plain keywords w/o stuff in front.
anatoly-scherbakov: User can choose any convenience context.
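Illustrative sketch (not presented in the session): a convenience context of the kind described relies on JSON-LD keyword aliasing, where a term such as "$id" is defined to mean "@id". The Python below only mimics the renaming step so the effect is visible. The context contents and the function name are assumptions made for this example, and a real JSON-LD processor would resolve the aliases itself during expansion.

```python
# Hypothetical convenience context: dollar-prefixed aliases for keywords.
# "@context" itself is deliberately absent, since it cannot be aliased.
DOLLAR_CONTEXT = {
    "$id": "@id",
    "$type": "@type",
    "$value": "@value",
    "$language": "@language",
}


def expand_aliases(node, aliases):
    """Recursively rename aliased keys back to their JSON-LD keywords."""
    if isinstance(node, dict):
        return {
            aliases.get(key, key): expand_aliases(value, aliases)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [expand_aliases(item, aliases) for item in node]
    return node  # strings, numbers, booleans, None are left untouched


# A document as it might look after parsing YAML-LD written with the aliases.
authored = {
    "$id": "urn:example:alice",
    "$type": "Person",
    "knows": [{"$id": "urn:example:bob"}],
}
expanded = expand_aliases(authored, DOLLAR_CONTEXT)
```

The author types unquoted keys like `$id` in YAML, while consumers still see the standard keywords after processing.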
anatoly-scherbakov: Next steps -- refining text of specification, feedback welcome, I don't know what else to write there, please find bugs/mistakes/misprints.
anatoly-scherbakov: Future features -- tag any YAML node, can be used for datatypes and language types. YAML allows mapping from one mapping to another.
bigbluehat: Thank you, Anatoly.
bigbluehat: A couple WG NOTE ideas: JSON-LD's value object pattern on itself, encourage implementations of value objects. encourage JSON documents that lack "@context"... "referenceable sec note for other specs"?
iherman: We got stuff into other documents, maybe leave it there?
bigbluehat: What about hashes for @context, hashlink is general, iteration on that for JSON-LD document loaders.
bigbluehat: As long as context file is served, document loader can use hash to do something interesting, optional but confirmation that you got schema.org that you thought you wanted.
iherman: We had an issue we postponed, it's this issue?
bigbluehat: Yes, it's this issue.
bigbluehat: This is a document loader thing, relates to media type, hash fragment, not a fragment of document, metadata about the document. Good stuff to debate here.
Wip: If you use this, is it OK to resolve contexts live?
bigbluehat: Depends on your use case... wasteful to get 1.5MB schema.org file. If you have this, document loader could use as etag to server. Get 304 not modified, all sorts of other stuff to do that. If you have a local cache, multiple versions of schema.org, author wanted it to be that one.
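Illustrative sketch (not presented in the session): one way a document loader could use a context hash is to key its local cache on the URL plus the expected digest, and to reject a fetched body whose digest does not match. Everything here is an assumption for the sake of the example: the function names, the choice of SHA-256 hex digests, and the `fetch` callable standing in for whatever network layer a loader uses. How the hash would be carried in the document was still under discussion.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the body as a hex string."""
    return hashlib.sha256(body).hexdigest()


def load_context(url, expected_digest, cache, fetch):
    """Return the context bytes for url, verified against expected_digest.

    cache is a dict keyed by (url, digest), so several versions of the
    same context can be stored side by side. fetch is a hypothetical
    callable taking a URL and returning bytes.
    """
    key = (url, expected_digest)
    if key in cache:
        return cache[key]  # no network round trip needed
    body = fetch(url)
    if sha256_hex(body) != expected_digest:
        raise ValueError(f"context at {url} does not match expected hash")
    cache[key] = body
    return body


# Stand-in for a network fetch, so the example runs offline.
calls = []

def fake_fetch(url):
    calls.append(url)
    return b'{"@context": {"name": "https://schema.org/name"}}'


digest = sha256_hex(fake_fetch("https://example.org/context"))
calls.clear()
cache = {}
first = load_context("https://example.org/context", digest, cache, fake_fetch)
second = load_context("https://example.org/context", digest, cache, fake_fetch)
```

The second call is served from the cache, which matches the point about not refetching a large context file when the author's intended version is already held locally.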
JSON-LD Star
gkellogg: Quick summary of RDF Star -- eek into property graph space to make statements about statements. RDF had a way to do this called reification, new triple term resource type. That brings in ability to talk about triples, indirection on triples by reifier, some identifier which references triple term using well known predicate.
gkellogg: Annotations allow one to talk about a triple or set of triples and construct resources from that... in case of reifier, triple that is not asserted in a graph. In annotation, always asserted in document, otherwise very similar.
gkellogg: Object contains reifier, triple has a single triple... constrained from describing more than one thing. Identifier and type... one value we're describing... TURTLE is on the right.
gkellogg: Example using @reifier, similar except we have @id, which is identifier for reifier which references triple term. Could have multiple triples described.
gkellogg: Another value on reifier.
iherman: Subject of @reifies might be an array, a whole graph could be put in there.
gkellogg: or object with multiple entries... gnarly issues are if you have objects, do they get reifiers.
TallTed: Some people might pronounce "rayifies" rather than "reifies"
gkellogg: In case here, object contains reified triples and additional properties, indirection between triple terms and they become properties of reifier.
gkellogg: Last case, we have annotation, JSON-LD node object which contains values and @annotation with triples contained in there. Annotation given an ID, reifier ID for triples, statements made are made about reified triple.
iherman: Annotation can only be on one triple.
gkellogg: Reifier can be used against multiple triples.
iherman: How do you do it in JSON-LD?
gkellogg: You can put more properties in object.
Discussion around details of example.
gkellogg: Reification looks like a named graph.
gkellogg: Are named graphs described using reification?
bigbluehat: We have a joint WG meeting Thursday morning.
bigbluehat: Next big thing is plenary
iherman: This part of spec can only be done when 1.2 is done.
iherman: Don't know how it will be reflected in charter
gkellogg: We have a dependency on RDF 1.2 -- we could do this in JSON-LD v1.2, as long as we don't go to CR we're within process
<Zakim> manu_, you wanted to note context injection? and to how to safely interpret JSON-LD w/o JSON-LD processor
manu_: context injection has come up often
… people get confused and think it's illegal
… so stating that it's fine is something we should do
… we also get push back around processing JSON-LD as JSON
… and we often have to explain that one can use the data model of JSON-LD without using a JSON-LD API processor
… it's been a very helpful thing for our community
… and it would be very nice to have that text make it into the main JSON-LD Syntax spec.
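Illustrative sketch (not presented in the session): the two points above, context injection and reading JSON-LD as plain JSON, can be pictured as follows. This is one possible pattern under stated assumptions, not text from any specification: the function names are hypothetical, and the expected context value is only an example of a context an application might pin to.

```python
import json

# Example context an application might pin to (assumed for this sketch).
EXPECTED_CONTEXT = ["https://www.w3.org/ns/credentials/v2"]


def inject_context(doc, context=EXPECTED_CONTEXT):
    """Return a copy of doc with "@context" added when it is absent."""
    if "@context" in doc:
        return dict(doc)
    return {"@context": list(context), **doc}


def read_as_plain_json(text, context=EXPECTED_CONTEXT):
    """Parse JSON-LD as ordinary JSON, without a JSON-LD processor.

    The keys are only trusted to mean what the application expects when
    "@context" is exactly the pinned value, so anything else is rejected.
    """
    doc = json.loads(text)
    if doc.get("@context") != context:
        raise ValueError("unexpected @context; refusing plain-JSON reading")
    return doc


plain = {"type": ["VerifiableCredential"], "issuer": "urn:example:issuer"}
with_context = inject_context(plain)
parsed = read_as_plain_json(json.dumps(with_context))
issuer = parsed["issuer"]
```

Pinning to an exact context value is what lets the application treat the keys as fixed JSON field names.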
Thanks everyone, adjourned!