DID WG Topic Call on CBOR and the DID Spec Registries — Minutes

Date: 2020-09-10

See also the Agenda and the IRC Log

Attendees

Present: Brent Zundel, Amy Guy, Dave Longley, Manu Sporny, Orie Steele, Markus Sabadello, Jonathan Holt, Dmitri Zagidulin, Daniel Burnett, Adrian Gropper

Regrets:

Guests:

Chair: Brent Zundel

Scribe(s): Amy Guy, Dave Longley

Content:


Amy Guy: Zakim: start meeting

1. CBOR representations and the registries

Orie Steele: we had the f2f in amsterdam and decided on the registries to help with interop
… part of what they are supposed to do is support bidirectional lossless transformations between representations
… that has not happened for any of the representations
… one of the challenging aspects is what data is in the registries, who is updating the registries, how are representations connected to the registries
… all of these questions are largely unanswered by the WG
… one piece that is defined is how you use JSON LD to define your document
… there’s language about how to do it with JSON LD extensions, but you don’t if you’re using the core context
… the purpose of this is to think through how we handle bidirectional lossless transformations between JSON-LD and CBOR
… would be a good place to start. JSON-LD is the most well defined thing so far
… the purpose of this call is to talk about CBOR

Manu Sporny: I’ve got a set of proposals
… the CBOR stuff has been floating out there without any progress. I’m very concerned about our ability to get to anything
… Some things are completely out of reach because of how busy people are and the other things the group needs to do
… there’s very strong interest in a CBOR representation or more
… experiments continue
… we’ve seen a couple of experiments on CBOR representations, they don’t hit the level of maturity that we’d expect to see
… I know no DID method that uses CBOR as the primary representation mechanism, still unsure what we’re doing for tags
… still unsure what the bidirectional lossless conversion requirement means
… at this point I think we’re out of time
… to the point where we can come to consensus on one CBOR representation
… my suggestion is we pull back
… say yes there’s interest and ongoing experiments, here they are, but the group didn’t have time to come to consensus on any one representation
… we expect the work to continue after the spec hits recommendation
… I suggest we modify the language in the spec right now to say that
… if we don’t do that, whoever argues against doing that is going to have to write the test suite tests, all the extensions to the DID spec registry, and we’re going to have to see that happening at a rate we haven’t seen for anything
… it’s doable, but I don’t think anyone is working on it
… we need to figure out our priorities as a WG. My suggestion is we cut our losses on CBOR
… and focus on the other sections that need attention

Jonathan Holt: Where to start..
… The necessity of having a 1-1 mapping or lossless translation.. I’ve demo’d it using DAG CBOR and the serialization strategies with specific constraints in the CBOR section
… that facilitates canonical deterministic CBOR so it’s reproducible into different serialization formats
… that exists, in the native way of representing it in IPLD
… Getting between JSON and YAML
… as long as there’s no edge cases with large integers - that’s still an issue unresolved in DAG CBOR
… but seems to be covered by the attributes in the DID Doc being byte strings. IF there’s a byte string representation in the registry, those map natively on top of natively
… on top of CBOR with lossless translation
… I can demo that right now
… It still opens the door for CBOR-LD with ints as keys
… either we do that in the registry, we just state what an id field number should be in the registry
… or we wait until CBOR-LD is a thing
… There are.. it’s CBOR, it’s very well specified
… method of data representation
… The challenge you’re alluding to in the DID spec registries is we need to have public key jwk represented in CBOR
… or even the pem representation
… but those things are really easily 1-1 map into CBOR
… all of these examples, you can do that in binary in CBOR with a tag
… a tag facilities the semantic interoperability with batteries included functionality

Markus Sabadello: one thought i had was every time we talk about bidirectional translation between representations it’s not entirely accurate
… what we’re really doing with the different representations with the registries I think is to produce and consume different representations
… if I want to convert between the JSON representation and the CBOR representation I first consume the JSON into the abstract data model
… and then from there I produce a CBOR representation of the same data model
… I think that is how it really works
… with regard to CBOR… are we in agreement which one of the approaches we originally wanted to have? I always hear there are 4 different ways of doing DID documents in CBOR
… for me that’s part of the confusion that there seem to be so many different ways

Orie Steele: I don’t necessarily agree with what markus just said about the abstract data model, I don’t know how that in practice is achieved, I don’t understand
… from a technical standpoint it is possible to have bidirectional lossless transformation between JSON, JSON-LD and CBOR
… if both JSON documents contain a context, and the CBOR is a translation of vanilla JSON
… if they are true it’s very trivial
… if those things are not true then the process of managing this is somewhat related to what markus says, is you have to pull out your abstract data model transformer software (which nobody has written) and use that to convert the different representations
… I’m strongly against having that conversation in this WG
… what we have today is off the shelf software that supports CBOR, DAG CBOR, JSON, JSON-LD
… for this call we should focus on what do we mean when we say the word CBOR? How does that map to what we have in JSON-LD in the registries today?
… we should not talk about vanilla JSON because the registries don’t do a great job of that
… we have lots of documentation around JSON-LD
… if we’re going to add similar support to CBOR we should try and do that mapping with something that has that definition around it

Manu Sporny: responding to jonathan
… If we continue to miscommunicate about what the issue is..
… the issue isn’t “is it possible to represent the abstract data model in CBOR?”
… what we’re saying is we need to write tests and spec tests and entries in the DID Spec registries
… and nobody has been working on that consistently enough for us to feel comfortable enough saying this is how you encode the did doc in CBOR
… we’ve done it for JSON and JSON-LD and nobody has for CBOR and we’re out of time
… by the end of this month we’re going to be well into implementing the test suite

Orie Steele: WE HAVE NOT DONE THAT FOR JSON!… unless you count doing it for JSON-LD as doing it for JSON :)

Manu Sporny: i have never seen a group pull off a discussion like this when there are 4 options in front of them in 20 days

Dave Longley: yes, doing it for JSON-LD does it for JSON

Manu Sporny: to be clear this is about someone stepping forward and going I will get the group to agree on one normative representation for CBOR
… among DAG CBOR, CBOR-LD, compressed nquads stuff, direct JSON to CBOR translation
… whoever says I’m going to volunteer to do this needs to do in the next 20 days
… Knowing all of the discussions we need to have to get there, I doubt it will be successful
… and it will take the focus of the group away from what they’re doing just to do that

Dave Longley: note: even if the CBOR work gets done, the group has to agree on whatever gets done as well, which is a high bar

Manu Sporny: Someone needs to step forward to say they will get this done by the end of September or it gets cut
… we should be making that decision now
… everyone wants CBOR to happen, it’s really interesting, the group is interested in pursuing and you should keep track, but the group as a whole is not going to assert any particular representation
… What I’d like to do today is pass some proposals effectively stating we think CBOR is important
… here are the things going on in CBOR land, we were unable to come to consensus on a specific representation

Jonathan Holt: Can I do a demo? (screenshare)
… here is some vanilla JSON that i’ve piped to jq for easy readability
… here it is in IPLD, stored natively as CBOR
… piping that to hexadecimal
… i can convert that to CBOR to JSON which is a ruby functionality
… I can pipe that to pretty.. there’s the diagnostic mode
… and I can do it to YAML
… there it is, serialization with specific constraints, to JSON, CBOR and YAML properly ordered
… what’s the problem

Orie Steele: the problem is defining which of these representations we want to use and how it’s related to the registries
… the simple one is where you have generated your CBOR from vanilla JSON
… if you were to say “yep that’s the one we’re going to talk about” I’d say yep, that works
… if we agreed on the WG call today that the only version of CBOR we’re going to discuss is the version that wraps around vanilla JSON
… the demo you’ve shown is great example that there’s tooling out there to support all of the CBOR
… the problem is: do we have to keep talking about all of them? Or do we have a way forward where we limit to one?
… or do we say CBOR is great, it would be great to try to half ass this, let’s give it a non normative note where we cover as much detail as we can, and preserve CBOR and work on CBOR in a place where w’ere not going to have to write normative spec text for CBOR for everyone of those statements we saw on the last call
… my preference would be to create a NOTE about CBOR and pump all of the valuable stuff about CBOR into it
… describe all of the things jonathan just showed
… create a NOTE that won’t hold up the WG
… and put all that stuff there
… and not have everyone be wrapped around how to support CBOR in the spec today
… who is going to do the work for the test suite? for all of those normative statements for CBOR?
… right now you’re kind of the CBOR guy… we should just be honest about how much resources we have to do CBOR right
… the thing I’m afraid of is doing CBOR in a way that isn’t right or doesn’t do justice is dangerous and a mistake
… i’d much rather preserve the ability to address CBOR appropriate in its own place without trying to rush that into the spec

Manu Sporny: I couldn’t tell from jonathan’s demo if that was a DAG CBOR thing or a plain vanilla conversion of something to CBOR and back

Jonathan Holt: it’s DAG CBOR, on IPLD

Manu Sporny: I want to get a proposal on the table that the group adopt DAG CBOR
… can we put that out there? that’s what you’re arguing for right?

Jonathan Holt: that’s what i prefer. that’s how I wrote the section. I needed to represent CBOR in general
… I like CBOR-LD

Manu Sporny: we have to pick one if we’re going to write normative text
… I’m putting a proposal forward that we use DAG CBOR
… would that work for you jonathan?

Jonathan Holt: yes, I’m mostly focussed on DAG CBOR and I left the rest of the document to be open ended for people to adopt other subsections including CBOR-LD

Manu Sporny: but we have to specify this format to do anything normative around it

Proposed resolution: DAG CBOR will be the CBOR format that the DID WG will specify. (Manu Sporny)

Orie Steele: -1 and I will be same for Pure CBOR and CBOR-LD

Dave Longley: -1

Manu Sporny: -1 we need to consider other formats.

Amy Guy: -0 seems like we’d be closing a door rather than opening one if I understand right

Brent Zundel: +0

Jonathan Holt: 0, reservations

Dmitri Zagidulin: +0

Daniel Burnett: +0

Markus Sabadello: -1 my sense is other CBOR representations make more sense

Brent Zundel: clear that no-one is in favour
… proposal rejected

Manu Sporny: let’s do it for the other formats

Proposed resolution: CBOR-LD will be the CBOR format that the DID WG will specify. (Manu Sporny)

Manu Sporny: -1, too early

Orie Steele: -1

Dave Longley: -1 too early, but interested

Dmitri Zagidulin: specify in a separate note or the main spec?

Jonathan Holt: -1, too early

Manu Sporny: main spec

Brent Zundel: +0

Dmitri Zagidulin: -0 (agree, too early)

Amy Guy: -0

Markus Sabadello: +0

Brent Zundel: no support for this proposal

Proposed resolution: ZLIB_NQUADS will be the CBOR format that the DID WG will specify in DID Core. (Manu Sporny)

Orie Steele: +1

Manu Sporny: -1, too early

Amy Guy: -0

Dave Longley: -1 too early

Jonathan Holt: -0, not familiar

Brent Zundel: +0

Markus Sabadello: +0 haven’t seen a spec or code for this

Orie Steele: please don’t hold up the WG with my +1

Brent Zundel: minimal support, but still a rejected proposal

Proposed resolution: Plain JSON->CBOR conversion will be the CBOR format that the DID WG will specify in DID Core. (Manu Sporny)

Manu Sporny: this next one means take the JSON strings, shove into CBOR, do no magic, spit them back out

Orie Steele: +1 maximum interop potential

Manu Sporny: -1, doesn’t take advantage of CBOR

Dave Longley: -1 don’t see the utility of this

Amy Guy: -0, no magic

Dmitri Zagidulin: -0 don’t see the benefit

Markus Sabadello: -1

Brent Zundel: +0 would be easy to finish specifying

Jonathan Holt: 0, q+

Jonathan Holt: that is a good starting off point, that’s what I did in the CBOR section to get us started
… was to do byte string representation
… you’re representing the left hand side, the vocabulary
… the additional functionality of IPLD is you get the multicodecs of the linking
… it’s this type of link
… it’s hex, base 58, etc
… that is a starting off point
… it doesn’t have the maximum compression
… there’s multiple different ways of approaching ints as keys, including representing that in a registry of agreement of what a key int would represent

Dave Longley: we shouldn’t be specifying a standard that has no consensus – we have interesting uses of technology, but nothing to standardize yet

Jonathan Holt: but I do have reservations about, even myself, about only specifying DAG CBOR as being the CBOR representation
… what you do benefit from in CBOR is the ability to semantics of the tagging with expressivity
… as long as you have specific constraints so you get deterministic canonical CBOR
… with the one caveat being large floating points
… that’s the edge condition with friction

Manu Sporny: based on the proposals, where we are is the only consensus is we have no consensus
… if we’re going to focus on anything, there’s no consensus on what to focus on
… dmitriz did mention that it means there’s not a strong interest in CBOR
… I don’t think that’s the case.. I want to run another proposal to see if there’s a strong interest
… we want to see it, we just haven’t’ figured out which mechanism we’re going to use to do it
… doesn’t seem like anyone wants to pick one of them

Brent Zundel: request from the group - we’ve had proposals with very limited support. I would like to see a proposal for what concretely we as a group can do to move forward on this
… we need to know what our next steps are and who is going to be taking them
… and what will happen if those steps aren’t taken

Dmitri Zagidulin: based on the polls during this call I’m not seeing strong interest in standardising a CBOR representation
… we think it’s neat and everyone is curious to experiment with implementing it but we don’t have enough experience to specify it
… one of the most interesting developments in this WG in the last months was the decision to support multiple representations
… the implication of that is resolvers will need some sort of content negotiation mechanism
… that in itself should comfort those interested in CBOR but don’t’ want to specify right now
… content negotiation gives us future proofing
… when we have enough knowledge to specify CBOR our resolvers will still be able to use it
… maybe there’s no urgency to specify CBOR, given we have content negotiation
… we can specify after the fact

Jonathan Holt: the challenge of conneg is getting to a deterministic encoding between representations
… what I showed in the demo is it’s possible with CBOR
… it has maximum expressivity with the constraints we’d need
… in a way that is safe and we use this in IPFS and IPLD as being a deterministic hash based linking of the world’s data
… that’s the purpose of IPLD
… to link data by hashes
… the functional requirements of the DID doc is to do that
… to be able to input and output into different serializations including raw JSON, JSON-LD, and including others we haven’t even mentioned including YAML
… that all is functional today
… this isn’t magic code
… it’s off the shelf CBOR
… and pretty printing
… and the CBOR to YAML
… I’m still struggling as far as what the block is
… I think this solves a lot of our problems of serialization into and out of different encodings

Brent Zundel: my ask of jonathan would be - how can you formulate that into a proposal that gives us our exact next steps, and add yourself to the q

Manu Sporny: I have a series of proposals with concrete next steps
… I want to get dmitri’s implied proposal on the table
… proposal is that we have a strong interest in CBOR
… if you +1 it, say how strong your interest is

Proposed resolution: The DID WG has a interest in CBOR as a representation format. (Manu Sporny)

Manu Sporny: mild, strong, don’t care, no interest whatsoever

Orie Steele: +1 strong

Manu Sporny: +1 strong

Markus Sabadello: +1 strong

Jonathan Holt: +1 strong

Dave Longley: +1 strong, just too early to figure out what to actually standardize

Brent Zundel: +1 strong

Dmitri Zagidulin: +0 (mild intellectual interest. not enough to put implementation pixels on screen. Also, we get a CBOR implementation for free by having a JSON-LD rep. (JSON-LD -> CBOR-LD, deterministic)

Manu Sporny: the followup proposals I have is basically to add a note in the CBOR section stating that there was interest in CBOR representations but that the WG ran out of time to come to consensus on a specific one. Another proposal is to refactor the section to not be normative
… third is to talk about the various different types and point to them

Justin Richer: I’m not able to follow the audio but I want to note that resolvers ALREADY need content negotiation

Manu Sporny: we could point to Orie’s note, or we can point to the specs themselves, or Jonathan’s demo, or things of that nature in a non normative fashion, just that we were aware of these things
… looks like there is strong interest
… justin_r in chat is correct about content negotiation

Markus Sabadello: support what dmitri has said
… this is very good we have this architecture with the core data model and different representations and content negotiation
… one additional reason why it would be nice at some point to have CBOR is to show that this actually works, this model
… right now we have two representations that are well understood
… however they are also very close to each other
… this other discussion right now on how JSON and JSON ld can be interoperable, or byte to byte equivalent
… how JSON parsers can parse a JSON-LD representation
… they only do a bit of a job to demo this general architecture with multiple reps
… if we had a third different one it would be valuable
… if we don’t get it in right now for CR at least we have to make sure that future proof enough with how th eregistry works and how content negotiation works so in the future we know how to add additional representations

Jonathan Holt: the specific next steps would be to continue to show the seamless lossless translation between the CBOR representation and different formats
… the struggle i have of the registries is a 1-1 mapping which doesn’t exist for the publickey jwk
… it’s fundamentally a different method of representation that requires a transformation of the data
… in the test suites where the rubber meets the road to show the seamless encodings in order for it to be functional
… and that it’s fit for purpose for what we’re trying to accomplish with the core data model

Daniel Burnett: jonathan, the issue here is over the past 6 months that you’ve been arguing for this you haven’t managed to convince people to select one
… each time you say we have CBOR working, we have a CBOR representation, I can show this… every time you say that you’re implying we have agreement on a version of CBOR which we don’t
… which is precisely the problem as made clear through the straw polls that just happened
… there is not agreement on one specific version
… I would strongly suggest we stop talking about there ‘being CBOR’ because practically speaking there isn’t
… let’s focus on CBOR in general and what we can make statements about and what we can’t. We can only put in normative statements we can write tests for
… the last comment - there is no shame in creating a note and starting to write things there
… i’ve been in a number of WGs who did that. Started working in a NOTE that wasn’t ready for the spec. Time went on, work continued, by the time you came to the next version the NOTE was ready with material that could go into a normative specification
… I have also seen companies who implement things from NOTEs, point to them, and say we implement this already
… there’s no negative to that
… let’s please, jonathan, let’s stop saying there “is CBOR”, because there is no single CBOR on which we have agreement

Manu Sporny: the first proposal is to add a note to the CBOR section stating strong interest in CBOR but ran out of time

Proposed resolution: Add a note to the CBOR section stating that there was strong interest in CBOR representations, but the WG ran out of time to come to consensus on a specific representation. (Manu Sporny)

Manu Sporny: +1

Jonathan Holt: -1

Orie Steele: +1

Dave Longley: +1 provided someone steps up to do the work

Manu Sporny: I will volunteer to do all of these

Brent Zundel: +1

Adrian Gropper: +1

Manu Sporny: jonathan could you highlight your -1?

Daniel Burnett: +1

Orie Steele: lol, how will anyone write tests for an undefined representation

Daniel Burnett: exactly, test suites for what

Jonathan Holt: This seems like “corporate jousting” to negate a possible solution because it’s better faster and easier than other ones.

Justin Richer: -1

Manu Sporny: This doesn’t negate anything, this adds a NOTE.

Justin Richer: (if that’s still running, not sure)

Brent Zundel: It seems you’re saying we should wait for tests, but we don’t know what the tests will be. What we need for CBOR is to know which normative statements we’d make and then we’d write tests for that.
… I’m not understanding what you mean by saying we should go to the test suite first.

Jonathan Holt: Really the functionality of the interoperability is what I’m thinking about. Ordering by lowest byte, etc. those are testable. One of the proposals above is using the JSON-LD representation on top of CBOR, it gets us beneficial occupancy and then easing into it like its JSON.
… It facilitates the CBOR discussion and the future benefit of including things like CBOR-LD when that’s a thing.

Justin Richer: my -1 is due to such a statement being a weak hand-wave instead of actionable specification

Jonathan Holt: The statements I made should be testable.
… And I’d be happy to contribute to the development of those tests.

Manu Sporny: I can continue with the other proposals just to get feedback on where we are.
… This only leaves us at a stagnant point – the only thing that can move us forward is if Jonathan signs up to do the work.

Jonathan Holt: I’ve been holding back on the DID spec registries because it seems to be a dead end
… I mentioned in the other issues, the cryptographic golem of creating a name, work that doesn’t need to be done
… I’d be happy to create test suites for CBOR sections I wrote
… to test the operability of it
… and to fine tune the language to make sure it’s crystal clear

Manu Sporny: I’m trying to figure out if it would be easier to put a proposal forward removing CBOR entirely

Proposed resolution: Refactor the CBOR representation section of the specification to not speak normatively about any specific representation format. (Manu Sporny)

Manu Sporny: oh no also not what Justin wants

Manu Sporny: +1

Orie Steele: +1

Justin Richer: -1

Jonathan Holt: -1

Amy Guy: -0 that seems a bit confusing tbh

Adrian Gropper: +0

Dave Longley: -1 can’t see the point of that

Daniel Burnett: -1

Justin Richer: (because this section should be fully normative)

Orie Steele: no, no point.

Orie Steele: I am now in favor of removing CBOR competely.

Proposed resolution: Add a non-normative statement to the CBOR section about the various types of CBOR representations that are being developed and link to each representation type. (Manu Sporny)

Manu Sporny: +1

Adrian Gropper: +1

Jonathan Holt: -1

Orie Steele: -1

Manu Sporny: +1

Markus Sabadello: +1

Brent Zundel: no clear support for any of the proposals

Dave Longley: +0 can’t figure out he difference between this and the last one

Brent Zundel: what we need before CR is not only language in the DID Spec that talks about a CBOR representation, but language in the DID SPec Registries that tells anyone what they need to do when adding new properties to the registry
… so that property can be properly supported in JSON, JSON-LD and CBOR

Proposed resolution: DID Core will NOT define CBOR and DID Spec Registries will not define CBOR. (Orie Steele)

Brent Zundel: that language, for CBOR, does not exist

Manu Sporny: +1

Orie Steele: +1

Brent Zundel: we cannot got to CR and claim support for CBOR without that language

Daniel Burnett: there’s not enough time to get into more discussion
… that’s what we need. this is no longer what would be nice.. without concrete language in the spec, not tests first, without concrete normative language in the spec that gets approved, those sections will eventually get removed
… that’s just the way it is

Dave Longley: +1 except that maybe we should say that there’s a way to add another representation to the DID spec registries later

Daniel Burnett: saying you have an alternative is not the same as saying the world agrees
… standardization would be very easy if everyone just did what I wanted
… you have to write text, and you have to get agreement

Daniel Burnett: agree with dlongley

Brent Zundel: orie threw a proposal in at the last moment, you’re welcome to indicate for that proposal but we don’t have time to discuss
… thanks everyone. Meeting ends