DID WG Telco — Minutes

Date: 2020-08-04

Attendees

Present: Daniel Burnett, Markus Sabadello, Amy Guy, Manu Sporny, Adrian Gropper, Orie Steele, Dmitri Zagidulin, Wayne Chang, Jonathan Holt, Dave Longley, Michael Jones, Eugeniu Rusu, James Qu, Kaliya Young, Oliver Terbu, Chris Winczewski, Daniel Buchner, Justin Richer, Joe Andrieu, Pamela Dingle

Regrets: Ivan Herman

Guests:

Chair: Daniel Burnett

Scribe(s): Wayne Chang, Kaliya Young

Content:

1. Agenda Review, Introductions, Re-introductions
2. Next Topic Call
3. CBOR

1. Agenda Review, Introductions, Re-introductions

Daniel Burnett: agenda is short, intro, re-intro, so on…briefly mention and give info about next topic call, then discussion
… taking a break from issue review, discuss CBOR for majority of meeting
… any other agenda items?
… okay, good. is there anyone joining for the first time?
… james, would you like to introduce yourself again, and possibly your question for the group?

James Qu: worked for UBS and morgan stanley for more than 20 years, familiar with traditional banking. now i’m working for a blockchain company introducing DIDs into the blockchain ecosystem, asking my team to implement DIDs
… my question is: I am trying to contribute a bit, having worked on the FIX protocol (Financial Information eXchange), but I find it difficult to catch up quickly
… i’d like to get some advice from experts on what’s the best way to get up to speed quickly and contribute? thanks

Daniel Burnett: for the next 2 minutes, if you’d like to volunteer materials + pointers, please do

Manu Sporny: There is the DID Primer – a bit out of date, but good background reading: https://w3c-ccg.github.io/did-primer/

Wayne Chang: another resource, intro by burn and brent: https://hackmd.io/ME8SfZymRfK1XtM69oBOcA?both

Manu Sporny: … and honestly, the intro portions of the spec are probably the best thing we have

Dmitri Zagidulin: JamesQU_ - I’d also recommend my slide deck on DIDs + existing domains (did:web method), which gives a historical overview of DIDs - https://docs.google.com/presentation/d/1wWI2qeQfgOgFdDp5Adt9hwHxVTt-ctG9naBEpNjOSTo/edit

2. Next Topic Call

Daniel Burnett: https://github.com/w3c/did-core/labels/keys

Justin Richer: https://hardjono.mit.edu/sites/default/files/documents/WSJ_Digital_Identity_is_Broken.pdf

Daniel Burnett: topic call in 7 hours on keys

Wayne Chang: another resource, intro by burn and brent: https://hackmd.io/ME8SfZymRfK1XtM69oBOcA?both

Wayne Chang: (draft)

3. CBOR

Daniel Burnett: https://github.com/w3c/did-core/issues?q=is%3Aissue+is%3Aopen+label%3ACBOR

Daniel Burnett: main topic is on CBOR
… it is an open discussion

Manu Sporny: the goal for the call today is to try and figure out the direction we are taking
… data model can have several representations - JSON, JSON-LD, CBOR
… we have the most implementations in JSON-LD, and are waiting on JSON - this call is to explore CBOR
… we are hoping to explore directions to go
… we are working on figuring out what should go in the specification
… ideally what can we write so that it articulates it well so we can get through the candidate recommendation process

Kaliya Young: wayne would like to understand more to help level set for discussion

Manu Sporny: Would like Jonathan to give a quick outline on DAG-CBOR, Orie to talk about his implementation, and I can talk about CBOR-LD

Jonathan Holt: CBOR has batteries included - JSON strings to keys to Nats

Manu Sporny: Wikipedia has a decent intro to CBOR: https://en.wikipedia.org/wiki/CBOR

Jonathan Holt: JSON has this thing of unordered lists.
… Batteries-included semantics with tags - fast, good at machine parsing. Settled on CBOR, and DAG-CBOR is natively supported in IPLD
… the DID document is represented as a DAG; the base format is CBOR, with DAG-CBOR for deterministic serialization - it will always come out the same way if you approach it with the same constraints. Remove all tags except 42, which is registered with IANA.
… IPLD is standalone; it just exists and you can resolve it. It is self-describing and links are embedded in the hashes in a non-cyclic way.
… it has to do with the subsection ___ DID document maps must be strings, integer types … all floating points must be 64 bits. There is obviously a conflict in the document - maps must be strings - this flies in the face of conciseness.
… why not just put an integer instead of long names for key types? This is why it is compact.
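
To make the size argument concrete, here is a minimal sketch of how CBOR encodes a one-entry map with text keys and values versus integer ones. The encoder is hand-rolled for illustration and covers only unsigned integers, text strings, and maps. The integer codes (1 for the "type" key, 1 for the key type name) are hypothetical and are not taken from the spec or from any registry.

```python
import struct


def _head(major: int, arg: int) -> bytes:
    """Build a CBOR head: 3-bit major type plus an unsigned argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, fmt, limit in ((24, ">B", 2**8), (25, ">H", 2**16),
                             (26, ">I", 2**32), (27, ">Q", 2**64)):
        if arg < limit:
            return bytes([(major << 5) | info]) + struct.pack(fmt, arg)
    raise ValueError("argument too large for CBOR")


def encode(value) -> bytes:
    """Encode a small subset of CBOR: unsigned ints, text strings, maps."""
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return _head(0, value)                      # major type 0: unsigned int
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw             # major type 3: text string
    if isinstance(value, dict):
        out = _head(5, len(value))                  # major type 5: map
        for key, item in value.items():
            out += encode(key) + encode(item)
        return out
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


# Text key and value, as in a JSON-style DID document.
TEXT_KEYED = encode({"type": "Ed25519VerificationKey2018"})
# Hypothetical integer codes standing in for the same key and value.
INT_KEYED = encode({1: 1})

if __name__ == "__main__":
    print(len(TEXT_KEYED), "bytes with text terms")
    print(len(INT_KEYED), "bytes with integer codes")
```

The saving comes entirely from agreeing on what the integers mean, which is where the registry question raised later in the call comes in.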

Manu Sporny: This is the section of the spec that Jonathan is talking to: https://w3c.github.io/did-core/#cbor

Jonathan Holt: I didn’t represent that that is why I left the strings in there.
… I did spend the weekend reading the proposed CBOR-LD representation and I have lots of questions.
… nested objects and complex data types, and the integer transformations done in a way that is dynamic - not going to any central repository or calling home for the @context.

Orie Steele: cool, I have been experimenting with various CBOR representations of the abstract model of a DID document. How can you get a structure that is bidirectionally transformable between CBOR and JSON?
… if you just transfer between CBOR and JSON, I don’t get it - you don’t have any size reductions. The other approach is DAG-CBOR: it is the same size as vanilla CBOR from JSON, but it is so much more valuable than vanilla CBOR - defined and implemented by the Protocol Labs folks and it is emerging - I have a tremendous amount of respect for it. There is an IANA registry for CBOR -
… so if you are planning to get the vanilla version of CBOR to be smaller then you have to register a million things with IANA.
… but the size reduction is not significant.
… prior to CBOR being a thing I wrote a ___
… this compressed N-Quads stuff that you put in - it is about 1/2 the size of CBOR and DAG-CBOR and JSON, but you have to transform it to work on it; it does have real size reduction.
… what else are you doing with CBOR-LD - bidirectional transformation
… when considering the CBOR stuff 1) Bidirectional to compressed representation
… 2) ability to operate semantically on compressed
… 3) ability to perform cryptographic operations on compressed representations
… someone else is going to talk about CBOR LD
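
The trade-off described for the compressed N-Quads approach can be sketched with Python's zlib module. The N-Quads statements below are made up for illustration, and this is not the code from the experiments mentioned in the discussion. It only shows the shape of the approach: the bytes get smaller, but they must be inflated again before anything can operate on the statements.

```python
import zlib

# Made-up N-Quads statements for a hypothetical DID document.
NQUADS = "\n".join([
    "<did:example:123> <https://w3id.org/security#publicKey> <did:example:123#key-1> .",
    "<did:example:123#key-1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<https://w3id.org/security#Ed25519VerificationKey2018> .",
    "<did:example:123#key-1> <https://w3id.org/security#controller> <did:example:123> .",
    "<did:example:123> <https://w3id.org/security#authenticationMethod> <did:example:123#key-1> .",
])


def pack(text: str) -> bytes:
    """Deflate the statements; the result is opaque until unpacked."""
    return zlib.compress(text.encode("utf-8"), 9)


def unpack(blob: bytes) -> str:
    """Inflate back to the original statements (bidirectional, lossless)."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    blob = pack(NQUADS)
    print(len(NQUADS.encode("utf-8")), "bytes raw,", len(blob), "bytes packed")
    # Any semantic operation needs the inflated text first.
    print(unpack(blob).count("\n") + 1, "statements after unpacking")
```

Measured against the three criteria listed above, this sketch satisfies the first (bidirectional) but not the second or third, since nothing can be queried or signed in the deflated form without unpacking.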

Manu Sporny: hopefully folks are getting the idea that we have a number of different ways to do CBOR, and each one has different properties - I think Orie’s summary of the three things to look out for is great.
… some of them are important from a use-case perspective.
… CBOR-LD is another representation that we are looking at that could give us a way to express JSON-LD in a semantic binary format, to try and get these documents as small as possible. You care about the file size and you are trying to get it to be small.
… one of the big things: you have a string “verifiable credential” and you can convert that to a single number - what was 16 bytes you can get down to 1 byte - so a huge reduction.
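
The term-to-number idea can be sketched as a shared term table applied in both directions. This is a simplified illustration, not the CBOR-LD algorithm: the table below is hypothetical and hard-coded, and only map keys and the values of "type" are replaced, so that ordinary integer data is never mistaken for a term code.

```python
# Hypothetical shared term table, hard-coded for this sketch.
TERMS = {"@context": 0, "id": 1, "type": 2, "VerifiableCredential": 3, "issuer": 4}
REVERSE = {code: term for term, code in TERMS.items()}


def compress(node, parent_key=None):
    """Replace known keys and known "type" values with integer codes."""
    if isinstance(node, dict):
        return {TERMS.get(k, k): compress(v, k) for k, v in node.items()}
    if isinstance(node, list):
        return [compress(v, parent_key) for v in node]
    if parent_key == "type" and isinstance(node, str):
        return TERMS.get(node, node)
    return node


def decompress(node, parent_key=None):
    """Invert compress(): integer keys and "type" codes become terms again."""
    if isinstance(node, dict):
        out = {}
        for k, v in node.items():
            term = REVERSE.get(k, k) if isinstance(k, int) else k
            out[term] = decompress(v, term)
        return out
    if isinstance(node, list):
        return [decompress(v, parent_key) for v in node]
    if parent_key == "type" and isinstance(node, int) and not isinstance(node, bool):
        return REVERSE.get(node, node)
    return node


if __name__ == "__main__":
    doc = {"id": "did:example:123", "type": ["VerifiableCredential"],
           "issuer": "did:example:456"}
    packed = compress(doc)
    print(packed)
    # Code can test the compressed form directly, without expanding it.
    print(TERMS["VerifiableCredential"] in packed[TERMS["type"]])
    print(decompress(packed) == doc)
```

Comparing against an integer code rather than a string is also the processing-cost point raised in the next comment.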

Justin Richer: A small correction: this isn’t just a compression thing, it’s a speed/complexity of processing thing. Reading a string vs. a switch on a byte.

Manu Sporny: this is why people are excited about CBOR - usually when we talk about it, many of the CBOR formats that exist at IETF are kind of hand-created - artisanal CBOR.
… so the upside there is that you maximize compression; the downside is that you need to go through a long process - lots of entries into the CBOR registry at IANA - and the higher the number, the less value there is in shrinking.
… we are trying to take data formats that have been worked with for years and apply a general semantic compression algorithm to get them as small as possible.

Manu Sporny: https://docs.google.com/presentation/d/199svsHQXt2j1GqcvEXHgpENZIk1AZ53tEcUWuEYSsp4/edit#slide=id.g866980c4a6_0_14

Manu Sporny: this slide shows you the type of compression that you can get.
… the top bar, the JSON-LD bar, is big: 1.2 KB. DID documents weigh in at about that.
… if you look at the bar in red you can see how much it is compressed
… CBOR-LD is optimized to work on the type of documents that we are working on - it tries to compress as much as it possibly can.
… you CAN operate on the compressed data.
… you can run code against it - because you don’t have to decompress it. You can store 5x the number of DID docs and VCs in your system if you use this. It does have some interesting overlaps with DAG-CBOR
… we are considering some multi-values __

Jonathan Holt: https://github.com/cabo/cbor-diag

Jonathan Holt: compression wasn’t my motivation
… I really wanted deterministic mapping
… I was imagining the compaction and conciseness was a reach goal.
… to get to a canonical representation across encoding types.
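
A deterministic mapping means that two logically equal documents always serialize to the same bytes, so a hash of those bytes can serve as a stable identifier. The sketch below uses JSON with sorted keys as a stand-in, because the Python standard library has no CBOR encoder; DAG-CBOR defines its own key-ordering and encoding rules, which are not reproduced here. The function names are made up for this example.

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so output is stable."""
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_id(doc: dict) -> str:
    """Hash the canonical bytes; equal documents give equal identifiers."""
    return hashlib.sha256(canonical_bytes(doc)).hexdigest()


if __name__ == "__main__":
    a = {"id": "did:example:123", "controller": "did:example:456"}
    b = {"controller": "did:example:456", "id": "did:example:123"}
    # Plain serialization depends on insertion order...
    print(json.dumps(a) == json.dumps(b))
    # ...while the canonical form does not.
    print(content_id(a) == content_id(b))
```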

Orie Steele: the canonical vanilla cbor representation is entirely controlled by IANA…. you can’t get more centralized than handing control over your data model to them.

Wayne Chang: to me CBOR something in between stringy JSON thing - there are some other formats (missed them)
… did method specifying how to compact that like that.

Adrian Gropper: just wondering, if the encoding is not readable, how that would interact with user agents and situations where the verifier needs to trust the control. There are benefits for machine processing - are there issues in making it readable?

Wayne Chang: spectrum of full JSON -> CBOR -> protocol formats like avro/protobuf, where do we draw the line?

Dmitri Zagidulin: I wanted to address the question that Wayne had about protocol buffers. All compression methods that are relevant here are dictionary compression. Does the method pass the dictionary with the object or does it not? __- protocol buffers compress the object by forcing you to exchange the dictionary out of band. What is remarkable about CBOR-LD is that it doesn’t include the full compression dictionary but links to it instead. simil[CUT]
… buffer without the need to agree on dictionary
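
The difference between shipping a dictionary with the object and agreeing on it out of band can be sketched with zlib's preset-dictionary support (the zdict argument). This is only an analogy for the point above, not how protocol buffers or CBOR-LD work internally; the dictionary contents and function names are made up for this example.

```python
import zlib

# Hypothetical dictionary that both sides must already hold (out of band).
SHARED_DICT = (b'"@context""id""type""controller""verificationMethod"'
               b'"publicKeyBase58"Ed25519VerificationKey2018did:example:')


def pack_with_dict(data: bytes) -> bytes:
    """Compress against the shared dictionary; the dictionary is not sent."""
    compressor = zlib.compressobj(level=9, zdict=SHARED_DICT)
    return compressor.compress(data) + compressor.flush()


def unpack_with_dict(blob: bytes) -> bytes:
    """Decompress; this only works if the receiver has the same dictionary."""
    decompressor = zlib.decompressobj(zdict=SHARED_DICT)
    return decompressor.decompress(blob) + decompressor.flush()


if __name__ == "__main__":
    doc = (b'{"id":"did:example:123","type":"Ed25519VerificationKey2018",'
           b'"controller":"did:example:123"}')
    in_band = zlib.compress(doc, 9)       # self-contained stream
    out_of_band = pack_with_dict(doc)     # relies on SHARED_DICT
    print(len(doc), len(in_band), len(out_of_band))
    print(unpack_with_dict(out_of_band) == doc)
```

A receiver without SHARED_DICT cannot decode the second stream at all, which is the coordination cost of the out-of-band approach. Linking to the dictionary from the object, as described for CBOR-LD, is a middle ground between the two.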

Justin Richer: I proposed that we write this out in Amsterdam. The purpose is a deterministic mapping into and out of the abstract data model that IS a DID document. What about protocol buffers? We can define those - the running joke is that we could have a linked data version of a PDF
… a protocol buffer encoding would have to specify, as part of its definition, what that dictionary is and how to specify it - into and out of that specific format.

Manu Sporny: +1 to what Justin just said: that part of the spec is about how to go to and from CBOR in a deterministic way. There is a clear desire to have some CBOR representation - the question is which one.
… the fundamental thing is deterministic mapping to and from. We may not need to discuss that - all of us want CBOR and we all want a deterministic mapping. Do we have 1, 2, or 3 representations that are interoperable?

Orie Steele: we don’t need to discuss deterministic mappings… we have many ways to do this… it’s about what other features we get along with deterministic mapping, and who controls how the mapping is handled.

Manu Sporny: to Adrian’s point about human readability - honestly, if we are exposing a DID document to a human we are failing. A layer above this, software will translate it to something human readable.

Orie Steele: CBOR is a trojan horse for centralization if not implemented correctly.

Dave Longley: +1 to Orie, what’s the purpose of using a particular concrete representation – there should be some features/benefits for it.

Manu Sporny: the thing I would like to shift to discussing is whether or not we can go into articulating the format - my concern is that this work is very new and we don’t have a lot of experience. It is going to be a lot of work for us to feel comfortable about it. Ways that we can get to agreement quickly - ways to keep it open to give this time to mature.

Orie Steele: For the record, my tests on CBOR / DAG_CBOR / ZLIB_NQUADS_CBOR are here…. https://github.com/transmute-industries/decentralized-cbor…. entirely experimental, for education purposes only.

Jonathan Holt: to respond to Adrian re human readability - will be addressed by
… mapped keys to get to short
… vanilla CBOR and DAG-CBOR, and it has an IANA tag name - that facilitates the core representations that move between - it is as simple as that: having a core representation in CBOR that is lossless. I did listen in on the CBOR-LD calls with IETF
… it is straightforward to hang our hat on - I think what I have put in accomplishes that goal.

Orie Steele: …. we have 4 deterministic methods already :) and they are not interoperable.

Michael Jones: at an emotional level let CBOR be CBOR - you don’t need external representations in the CBOR.

Justin Richer: CBOR-LD could be a thing, but it should be separate from CBOR representation

Orie Steele: ^ same for DAG_CBOR

Orie Steele: and ZLIB_NQUADS_CBOR

Michael Jones: you define particular numbers for particular things. I would oppose trying to add linked data to CBOR; if someone later wants to do CBOR-LD as an extension then knock yourself out.

Orie Steele: let’s just let IANA control every aspect of what a DID document is… what could be more decentralized than having IANA control everything?

Manu Sporny: the stuff you wrote in the spec is fairly ready. The way that is expressed, “let CBOR be CBOR”, prevents DAG-CBOR and prevents CBOR-LD as valid representations. We have three, maybe 4, representations. We are being forced to choose DAG-CBOR based on the spec text.

Orie Steele: we could have 4 representations in CBOR… it’s just going to be an embarrassment from a standards perspective… as was argued for key representations.

Manu Sporny: I would like to put text into the spec that says we are working on it. We still have other things we need to do as a group. Put in some text that says we all want CBOR. Say that we are working on it.

Adrian Gropper: What I am hearing is that this is an extremely useful feature - it introduces privacy and security implications but they are on the negative side. We need to be careful in the spec text - to provide some guidance. This overlaps with this discussion.

Daniel Burnett: trying to wrap up

Jonathan Holt: the math strings must be - CBOR is so new and there are so many security concerns. It may be as simple as the integer representation living in the registry without having to go to JSON-LD
… want it to be semantically interoperable and secure

Orie Steele: CBOR introduces security and privacy - relies on registries which is an attack vector. Generally speaking there are privacy and security considerations associated with the registries - note that it is ongoing work. Not close the door to representations.

Kaliya Young: Manu - we sure have a lot of questions about CBOR-LD. Next up at CCG we have a high-level overview and a deep dive. Would be happy to go into those questions there.

Daniel Burnett: we may be moving in a certain direction with what goes into the spec.
… thank you to those that presented the different options.
… any final comments before ending the call?
… thank you everyone - thanks Kaliya, it’s not easy to scribe this conversation