DID WG Topical Call on DID resolution contracts and metadata — Minutes

Date: 2020-06-25

See also the Agenda and the IRC Log

Attendees

Present: Brent Zundel, Dave Longley, Amy Guy, Orie Steele, Manu Sporny, Markus Sabadello, Drummond Reed, Jonathan Holt, Michael Jones, Justin Richer

Regrets:

Guests:

Chair: Brent Zundel

Scribe(s): Amy Guy

Content:


1. current status with DID resolution contract

Manu Sporny: https://github.com/w3c/did-core/pull/331

Markus Sabadello: we’ve had several special topic calls on the resolution contracts, and a series of PRs that have been incrementally done on each other
… my understanding is on the last call we had some consensus that what we could do was to have two functions that we define in the resolution contract
… with their inputs and outputs, they would be very similar but one returns a DID doc in its abstract form as an abstract data model
… that’s useful in cases such as did:key and other methods with no concrete representation
… and the second function returns a DID document as a stream of bytes in one of the conformant representations
… the advantage of that is that the byte stream can be parsed and there can be some content negotiation, accepting certain mime types
… there is a PR where I try to capture this
… a few comments already
… one objective could be to see whether this is acceptable for everyone as one step forward
… may not be perfect, could be improved afterwards
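
Illustrative sketch (not from the call): the two functions described above could look roughly like this in Python. The names `resolve` and `resolve_stream`, the ordering of the outputs, the `not-found` error value, and the `application/did+json` media type are assumptions made for the example, not text from PR 331.

```python
import json
from typing import Any, Dict, Optional, Tuple

Metadata = Dict[str, Any]

# Hypothetical in-memory registry standing in for a real DID method.
_DOCUMENTS: Dict[str, Dict[str, Any]] = {
    "did:example:123": {"id": "did:example:123"},
}


def resolve(did: str, input_metadata: Metadata) -> Tuple[Metadata, Optional[Dict[str, Any]], Metadata]:
    """Return (resolution metadata, DID document as a data model, document metadata)."""
    document = _DOCUMENTS.get(did)
    if document is None:
        return {"error": "not-found"}, None, {}
    return {}, document, {}


def resolve_stream(did: str, input_metadata: Metadata) -> Tuple[Metadata, bytes, Metadata]:
    """Same contract, but the document comes back as bytes in one representation."""
    resolution_metadata, document, document_metadata = resolve(did, input_metadata)
    if document is None:
        return resolution_metadata, b"", document_metadata
    stream = json.dumps(document).encode("utf-8")
    # The resolver states which representation the bytes are in.
    resolution_metadata = {**resolution_metadata, "content-type": "application/did+json"}
    return resolution_metadata, stream, document_metadata
```

A caller that only needs the data model uses `resolve`; a caller that wants to hand bytes to its own parser uses `resolve_stream` and reads `content-type` from the returned metadata.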

Brent Zundel: https://github.com/w3c/did-core/pull/331

Markus Sabadello: https://github.com/w3c/did-core/pull/331

Markus Sabadello: let’s talk about whether this goes in the right direction

Manu Sporny: I read and reviewed the PR
… I think it goes in the right direction
… I’d be happy with it being merged in
… some minor things need to be discussed, but even if those changes are not made I’d be fine with this going in as a good advance on the problem

Brent Zundel: any other comments on PR 331?
… any objections to merging?
… hearing no objections..
… I’ve somewhat lost track of which PR is which for the contract
… How does this compare to the others that are still open?

Manu Sporny: I think PR 331 specifically has achieved consensus
… markus put it in two days ago
… we have two options… merge it immediately because we’ve been talking about this for a very long time, it’s gone through multiple iterations
… it shouldn’t come as a surprise
… the other option is we can keep it open for 5 more days and if we don’t see any objections go ahead and merge it
… I didn’t hear any objections
… the safest thing is to signal to the group we’re ready to merge and wait for another 5 days

Justin Richer: I only just now got a chance to read through 331
… thank you markus
… i think it has probably the right content, i’ve not had chance to read it in detail
… but i think it’s probably right
… i would not have organised the two functions next to each other like that, editorially better to define them separately and have their inputs described separately
… otherwise you end up with if it’s this you must use this otherwise don’t do this… it’s messy

Manu Sporny: agree with Justin, and that it’s editorial.

Justin Richer: the functions should probably be defined separately but that can happen editorially after

Manu Sporny: (so we can do that later)

Justin Richer: best as I can tell the content makes sense but i’ve not had chance to look through it in depth
… if i can comment about the two comments that are on there right now
… the idea about implementations not altering the signatures was precisely because we do not have any other formal language within this spec that we can lean on at the moment to say you’re not allowed to add another parameter to this function and claim that it’s the same function
… and you’re not allowed to return another output variable and claim it’s the same function
… that’s what i meant about not altering the signature, if there’s a more formal way to do that at a later date i’m fine with dropping that requirement
… this came from early conversations i had with people who seemed to think it would be okay for them to implement this function but also have their own args they would require for their resolver, but that’s counter to interop
… as far as in terms of the mime type for the accept header
… the idea behind that is it’s really intended to be a signal to the resolver

Dave Longley: -1 to additional required parameters, but optional ones should be possible

Justin Richer: and just like with http the resolver is fully allowed to ignore that
… the resolver is required to tell you what the encoding is when it returns it
… i’m fine with it being a should, i do not believe it should be a must
… i think a must is not testable in a reasonably interoperable way
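
Illustrative sketch (not from the call): one way to read "accept is a signal the resolver may ignore, but the resolver must report what it returned". The helper name `choose_content_type` and the media type strings are assumptions for the example.

```python
from typing import Optional, Sequence


def choose_content_type(accept: Optional[str], supported: Sequence[str]) -> str:
    """Honour the caller's hint when possible, otherwise fall back to a default.

    The hint is advisory, as with the HTTP Accept header: an unsupported or
    missing value is not an error in this sketch.
    """
    if accept in supported:
        return accept
    return supported[0]


# Hypothetical resolver that can only serialize one representation.
SUPPORTED = ["application/did+json"]

honoured = choose_content_type("application/did+json", SUPPORTED)
ignored = choose_content_type("application/did+cbor", SUPPORTED)
```

Because the hint can be ignored, the caller has to check the content type reported in the output metadata rather than assume the request was honoured.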

Manu Sporny: Agree that it should not be a MUST.

Markus Sabadello: this current PR reuses one of the commits from one of justin’s earlier PRs, the language that adds things like accept and content type and error
… these metadata properties that’s from one of justin’s PRs being reused here
… and updated a bit also the PRs that justin created regarding the metadata structure, they are still open and should still fit on top of this, this PR doesn’t break or interfere with the metadata PRs
… regarding separating the two functions, I thought about that too, I agree, it’s an editorial change we could make
… this PR started like this because it’s shorter
… and the functions are very similar
… it felt cleaner to do it like this to make it less confusing
… but i can change that later
… your comments about not altering the signature, and if we want to say may should or must for the mime types, I have some thoughts but we can discuss in the PR

Justin Richer: responding to dave’s comment in chat about optional parameters
… optional params are a function (no pun intended) of your implementation language
… because some languages don’t have that capability
… i would argue that what you’re doing is creating multiple functions with different signatures, because as long as you can call it without the optional params, you can call it without a modified signature
… you can add additional functions with optional params on top of that, and if your implementation folds those together and calls it optional params that’s fine, but it’s different functions in that case

Dave Longley: i agree, that’s one way to conceptualise it
… i want to make sure an implementer reading the spec knows that that’s okay
… they might think now I need to go off and do this in another way
… I want to make sure the language allows for that
… i agree the goal here is interop and whatever driver or resolver you’re implementing must require this function with these params and get the output you want
… if you can do something additionally that’s fine, but it’s optional and we need people to understand they’re allowed to do that

Justin Richer: that is the intent of the following sentence, so if that can be reworded to be clearer that’s fine
… I think adding infra might help that

Brent Zundel: the path I see forward, I propose we indicate we’re going to merge this in 5 days to allow cleanup to happen
… it feels like the bulk of it is acceptable

Manu Sporny: brent’s next steps are aligned with my next steps :)

Brent Zundel: there are a couple of tweaks, and then editorial things for a future PR
… how do people feel about those next steps?

Manu Sporny: +1 to those next steps

Drummond Reed: +1

Dmitri Zagidulin: +1

Dave Longley: +1

Markus Sabadello: +1

Brent Zundel: anybody object?

Michael Jones: Which PR?

Michael Jones: I had to join late because the OpenID Self-Issued OpenID Provider workshop ran long

Brent Zundel: pr 331, raised by markus and seems to reflect the consensus of this group
… Is there more resolution contract conversation before moving into metadata?

Markus Sabadello: we have also talked about a function called dereference that would take a did url rather than just a DID but I would propose to address the metadata topic first
… then maybe later revisit the dereference contract

Brent Zundel: that is a good path forward
… once we know what metadata looks like and resolution gets in, building on that might allow the group to come to agreement around those

2. Metadata

Manu Sporny: Current status (at the end): https://www.w3.org/2019/did-wg/Meetings/Minutes/2020-06-18-did-topic

Brent Zundel: several PRs that recommend a few different designs

Manu Sporny: summarise where we ended last - there were two straw polls with no intent to do a resolution
… just to get a sense of where everyone is
… one was..

Manu Sporny: one straw poll was: Use simple string key-value pairs, where the keys are strings, and the values are strings, for metadata expression.

Manu Sporny: the other one was: Use an Infra unordered map for metadata expression where the keys are simple strings and the values are compatible with Infra and serializable as JSON.

Manu Sporny: I don’t think there was any objection to the use of infra

Justin Richer: https://infra.spec.whatwg.org/

Manu Sporny: the debate is over whether using simple strings for all of the values or more complex values, which one is in fact simpler
… and at the end of the last call you could see justin was very clearly for the values-are-strings proposal

Markus Sabadello: Current PRs related to the metadata structure: https://github.com/w3c/did-core/pulls?q=is%3Apr+is%3Aopen+label%3Ametadata

Manu Sporny: and a number of people are -1ing it
… the counter proposal really only has two people supporting it, with other people on the fence or with no strong opinion

Manu Sporny: once we have the decision on what the right hand side can be, everything else feels like it falls into place
… does anyone feel there’s something more metadata related we need to talk about or do we have decisions on all the other things people are concerned about?
… so the editors can start pulling in PRs to match consensus in the group

Dmitri Zagidulin: i’d like to offer a third option to that of either simple key value strings, and json
… and that is to still have two arguments, a metadata argument and a document argument, but have the metadata only be content type
… and the second argument is a byte stream. This is for resolveStream
… the problem with having a string complex metadata object is that we still need to either hard code or give the parser what content type in order to parse the metadata, which then discovers the content type of the did doc and then you have to parse that
… i’d like to remove that level of indirection
… and have the string content type which determines how to parse the did document, which is now not just the did doc but the resolution result, which includes the resolution metadata, did document metadata and did doc itself

Justin Richer: markus may address this in a bit. that is a method we had talked about on the resolution calls

Manu Sporny: dmitriz, I got lost in your proposal… sounded like it was a part of what we had just resolved?

Justin Richer: and stepped away from
… not the least of which because we also need this kind of structure for input into the functions
… it’s not just about output metadata, we need to be able to send signals in both directions
… it’s potentially problematic to provide metadata alongside the document in a single super structure that is dependent on encoding type
… that adds a lot of the cases dave’s been talking about where you don’t have an encoded version of this, so the concept should map and stay consistent
… ultimately the last thing we want for interop to function is to have this other layer of indirection that dmitri is hinting at which is precisely why the metadata values need to have very strict definitions about what can go in
… not just fields in general, but specific fields

Justin Richer: the example from last week, say we have something timestamped
… there’s no infra timestamp data type that we can use
… there’s no json time stamp
… so whenever we have a timestamp field we need to define for example it is a string that contains an ISO8601 datetime value
… it’s easy to do
… but we have to declare it at the definition of the field
… and enter jsonld folks saying if it’s jsonld all your problems go away..
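
Illustrative sketch (not from the call): the timestamp case, where the field definition itself says "this is a string containing an ISO 8601 datetime" and the consumer parses it only when needed. The field name `updated` and the sample value are assumptions for the example.

```python
from datetime import datetime, timezone
from typing import Dict

# The metadata carries a plain string; the field's definition gives it meaning.
metadata: Dict[str, str] = {"updated": "2020-06-25T15:00:00+00:00"}


def parse_updated(meta: Dict[str, str]) -> datetime:
    """Interpret the hypothetical 'updated' field as an ISO 8601 datetime.

    datetime.fromisoformat accepts only a subset of ISO 8601 on older Python
    versions, so the sample uses an explicit +00:00 offset rather than 'Z'.
    """
    return datetime.fromisoformat(meta["updated"])


timestamp = parse_updated(metadata)
```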

Markus Sabadello: i understand what dmitri is saying and to have a concept of a resolution result data structure which contains the did doc and metadata, that is what the did resolution spec is currently saying

Manu Sporny: to be clear – JSON-LD folks rarely say stuff like that, and when they do, they tend to be wrong. :)

Markus Sabadello: i would argue that these contracts we’re discussing now for did core don’t preclude that at all
… behind the scenes we can still have this, an umbrella data structure
… the implementation of the contract, there may be a unified data structure with its own mime type and byte stream
… the reason we wanted to do it like this in did core is because there are several ways of doing that
… the metadata could also be inside the did document
… there could be an extended version of a did doc with a special prefix properties that contain the metadata
… or the metadata could be external to the did doc, in http headers expressed using another protocol
… there’s a lot of potential complexity there and we want to avoid that
… that’s why we said the contract would hide that, how exactly the metadata is represented
… whether it’s in the same document or merged into the document or similar
… how these contracts would be implemented that’s something we can still define in the resolution spec or elsewhere

Manu Sporny: to talk about the metadata structure
… there are a couple of use cases i’m concerned about
… specifically on the right hand value side how do we do arrays, maps, complex data structures
… we already know there is a need for fairly complex data structures on the RHS, eg. if you want to return back debugging information in a structured form as part of the resolution metadata
… the resolver may let you get back debugging information
… it’s the more complex use cases that are the issue
… wrt we are going to need stuff beyond infra, i think that’s true
… clearly datetimes are one example of that
… there are string based data formats we can point to for doing that
… if there was a string based data format that we could point to that allowed us to do arrays or unordered maps, we could use it
… one example of that is we could just say it’s json encoded
… and then lock it into json only and be done with it
… but i think the issue is that cuts off potential mechanisms we could use instead, like having cbor
… all that to say, i think the driving use cases for more than just strings on the rhs are complex data structures that at least some subset of the community believes we need
… and a clear way to express that through infra
… if we need things like datetimes that infra doesn’t have, that’s a case by case basis
… we can point to specs for that
… I’m curious to hear if anyone in the group feels like arrays and maps on the rhs is just absolutely not a requirement
… or if they feel it’s a requirement then there’s another way we can encode it, what spec would we use

Justin Richer: i don’t think it is a requirement
… i think the complexity that’s being pulled in is going to be to the detriment of the utility of this structure
… I think it should be kept simple
… I think manu described a great way to handle everything as strings
… ultimately I think that these are things that are going to have to be serialised in some fashion or another that we don’t want to prescribe and control from the core spec
… into underlying systems
… the idea here was originally to have a lowest common denominator that could be used to express lots of things
… adding this complexity does not get us away from the need for the specific escape hatches that have been described
… I’d argue we should do that for all of the values

Dave Longley: to sum up, we have two competing use cases or groups of use cases
… the set that justin wants to make sure we support - you want to pluck out pieces of metadata and decide if you want to parse them instead of parsing the entire structure
… the other case is a concern around having lots of different ways of parsing the values such that implementers are going to have to implement lots of different parsers with special rules

Justin Richer: @manu because processing a did document and processing a resolution response/request are two different layers of functionality

Dave Longley: the problem with http headers

Justin Richer: (to pre-empt your proposed question)

Dave Longley: i think we need to weigh those two different groups of use cases and issues and decide which one we think is the more common case, and think about it in terms of priority
… is it the case that it’s going to be very common that people will look at any one of these values and parse what they want
… or are some very common? can we cover 80% of cases for expressing content type as a string?
… could we cover these optimisations people want to make by taking some approach like that?
… or is it the case that there are any number of things and lots of people want to pull out individual values, and pulling out content type isn’t going to help us that much
… i think we need to look at it like that and make sure we’re not repeating mistakes made in the http space
… it is absolutely a concern that if we don’t have a generalised way of parsing this stuff, people will invent all kinds of ways of doing it
… maybe they want to have their own way, but there would be an additional layering of serialisation and parsing, because you can have data travel inside a string
… that is not prohibited
… by more generally saying we’re going to accept other kinds of values
… we have to make a decision as to which one of these issues we want to support and is there a path forward that would cover most cases

Manu Sporny: +1 to that
… the other thing I’m a bit confused about is I don’t understand why we have the DID doc, it is a complex data structure, an expectation on the clientside that you’re going to parse it
… it has arrays and maps and things that are more than just strings
… it is just there
… so the client software is expected to deconstruct that in some way

Justin Richer: -1 to this point, this is a false equivalence of different parts of the software stack. It’s a layering violation.

Manu Sporny: in the same way there are metadata values.. i guess i don’t understand what the difference is, why one is a burden and one is not
… especially given the broad implementation of these data structures
… that’s at least one of the sticking points
… a did doc is a complex data structure
… the metadata we already have use cases where we need complex data structures

Dave Longley: i would expect most people that are adding meta data fields don’t want to define their own serialization (and we don’t want them to pick different things)

Manu Sporny: they’re not theoretical
… if using these more complex data types doesn’t prevent any of the use cases that the string only stuff would be supporting i guess i’m not understanding why there’s pushback on the more complex data types

Dmitri Zagidulin: https://w3c-ccg.github.io/did-resolution/#example

Dmitri Zagidulin: here’s a link to an example of some of the metadata we’re talking about
… the method and resolver metadata
… they are already complex structures, nested
… i’m not sure why we’re saying hypothetically we could fit into simple structures, because we already have these requirements
… Also, the two options being presented so far are an undefined key value mapping similar to http headers, and JSON? are those the two? can someone restate the proposals?

Manu Sporny: the two options are a structure that has strings as keys and strings as values
… or a structure that has strings as keys and anything supported by infra as values

Dmitri Zagidulin: what format?
… http header?

Manu Sporny: no, we’re not asserting anything related to http header or json or cbor or any stuff

Dave Longley: values are either strings or they can be more complex (per Infra spec)

Manu Sporny: it could in theory be represented in any of those formats

Justin Richer: they could be represented as a JSON object, as a set of hTTP headers, as a set of parallel arrays, anything really

Dave Longley: complex == lists/arrays, dictionaries/maps, etc.
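
Illustrative sketch (not from the call): the two proposed shapes, expressed as checks on an in-memory structure. The function names and sample keys are hypothetical, and "Infra value serializable as JSON" is approximated here by the JSON types (null, boolean, number, string, list, map with string keys).

```python
from typing import Any


def is_string_map(meta: Any) -> bool:
    """Proposal 1: string keys, string values only."""
    return isinstance(meta, dict) and all(
        isinstance(key, str) and isinstance(value, str) for key, value in meta.items()
    )


def is_json_like(value: Any) -> bool:
    """Approximation of an Infra value that can be serialized as JSON."""
    if value is None or isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        return all(is_json_like(item) for item in value)
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_json_like(v) for k, v in value.items())
    return False


def is_infra_map(meta: Any) -> bool:
    """Proposal 2: string keys, values of any JSON-serializable Infra type."""
    return isinstance(meta, dict) and is_json_like(meta)


simple = {"content-type": "application/did+json"}
rich = {"content-type": "application/did+json", "debug": {"steps": ["lookup", "verify"]}}
```

In terms of data shape alone, `simple` satisfies both checks and `rich` only the second; the checks say nothing about how either structure is put on the wire or how much parsing a consumer has to do.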

Jonathan Holt: i’m looking at the link. do we have a running list of the objects required to represent?

Drummond Reed: +1 to using Infra types

Jonathan Holt: certainly did document is one, content type is two, is the rest of the example that dmitri gave did document resolve metadata, method metadata..? are those the attributes we’re trying to represent?

Dmitri Zagidulin: that’s an example from one method

Brent Zundel: the first proposal with string value and property, are use cases that want to use those precluded by infra supported property value?

Dave Longley: strings are supported already
… infra is any type you would see in json

Brent Zundel: proposal one is a subset of proposal two?

Dave Longley: not quite
… the part that justin is concerned with is having to have a parser parse every value
… if the way that we define the parsing rules is such that you parse the entire metadata structure and the values themselves would also be getting parsed
… the values could be any infra type, a string, array, ordered map or whatever, a number
… what justin is looking for is a case where the parser would just parse strings for keys and strings for values, and the parser would not look into those values
… there are use cases where you don’t want to do that additional parsing, it’s an optimisation
… that’s the difference
… if there are cases of metadata where you want to avoid having to parse the value, how many of those use cases are there?
… can we arrive at a compromise where some of those are just special values like content type that we could output in a different place or way?
… to allow for that optimisation
… or is it the case that there’s lots of those and no simple way to allow for that optimisation

Daniel Burnett: Sounds like you need two value types: parsed and unparsed

Dave Longley: in which case is it better to optimise for that, or to optimise for people inventing serialisation mechanisms

Justin Richer: burn: that would solve the tension but how to indicate that?

Dave Longley: that’s the tension we have to try and resolve

Justin Richer: that’s accurate interpretation of my views

Daniel Burnett: justin_r, that is an exercise left for the reader

Justin Richer: I would not qualify it as an optimisation but i do think it is almost a subset

Daniel Burnett: where I am not the reader :)

Justin Richer: i think there are ways you could manage partial parsing, depending on how the metadata types are delivered across the wire
… you could do lazy resolution and other implementation tricks that would solve the case i’m looking at
… the real question that’s driving this point for me is whether the additional complexity is actually worth it
… what I have heard so far is it feels like it should be, and in my experience things that feel like they should be are not usually the best routes to build standard normative language on
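
Illustrative sketch (not from the call): one possible form of the "lazy" parsing trick, assuming each metadata value arrives as JSON text that is only decoded when a caller asks for it. The class name `LazyMetadata` and the sample keys are hypothetical.

```python
import json
from typing import Any, Dict, List


class LazyMetadata:
    """Holds raw string values and parses a value only on first access."""

    def __init__(self, raw: Dict[str, str]) -> None:
        self._raw = raw
        self._parsed: Dict[str, Any] = {}

    def raw(self, key: str) -> str:
        """Return the unparsed string, with no decoding work done."""
        return self._raw[key]

    def get(self, key: str) -> Any:
        """Parse the value as JSON on first use and remember the result."""
        if key not in self._parsed:
            self._parsed[key] = json.loads(self._raw[key])
        return self._parsed[key]

    def parsed_keys(self) -> List[str]:
        """Keys that have actually been parsed so far."""
        return sorted(self._parsed)


meta = LazyMetadata({
    "content-type": '"application/did+json"',
    "debug": '{"steps": ["lookup", "verify"]}',
})
debug = meta.get("debug")
```

Only `debug` is decoded here; `content-type` stays an untouched string until someone calls `get` on it.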

Markus Sabadello: i think what justin just said is a good point
… what should drive this decision is, on one hand, is the added complexity worth it?
… I think dave said something similar
… are we going to have a lot of metadata properties that have complex values, or a lot that are only strings?

Dave Longley: it’s also a question of where that “added complexity” is – it could be argued it’s in both places here from different perspectives.

Markus Sabadello: the other question is how will the metadata be serialised on the wire
… we’re not defining that here
… we’re just saying there’s input metadata and on the output side there’s resolution metadata and did doc metadata
… but not defining if it’s part of the same data structure or external to it
… manu said something interesting, we already have complex values in the did document
… dmitriz’s proposal is that the metadata is somehow part of the same data structure
… that may be an argument for supporting the complex values
… if we have them already in the did doc, and we’re going to serialise and represent metadata in a very similar way or as part of the same document, perhaps it’s more intuitive to have complex values since we already have them in the did doc
… on the other hand, i understand the arguments for plain string values because it’s something people are more used to when you think of http

Dave Longley: from one perspective, the complexity is in having multiple different ways of telling (or not telling) people how to express their values where a complex “optimized” parser implementation could do “lazy” parsing as justin suggests

Brent Zundel: is there a specific PR or set of PRs that proposes specific language we should look at as we explore these questions?

Dave Longley: from the other perspective, if we never need anything but strings except in a few special cases, parsers can be simpler

Manu Sporny: we’ve been talking about this for 45 minutes.. maybe we can put the same proposals forward and see if folks have stronger opinions

Daniel Burnett: disagreements over “justifying the extra effort” are often challenging because the answer for one implementer may differ from another. If we don’t get agreement, the typical standards solution is to “make the simple simple and the complex possible”, suggesting we focus on the simple case (unparsed string) but enable the multi-types with a more-involved syntax.

Manu Sporny: from my understanding there’s still only two proposals on the table
… i want to see if anyone’s mind has been shifted

Dave Longley: (but in reality, the likelihood is that the same parsers will be used in both cases for both parsing DID documents and the meta data)
Dave Longley: (i.e., JSON.parse)

Manu Sporny: and spend the rest of the time figuring out what to do
… just straw polls, attempt to get a reading from the group

Proposed resolution: Use simple string key-value pairs, where the keys are strings, and the values are strings, for metadata expression. (Manu Sporny)

Dmitri Zagidulin: -0 I genuinely do not understand either proposal. Both sound incredibly vague, since we’re not actually specifying the format of the key-value pair presentation.

Drummond Reed: +1

Manu Sporny: -1, this will result in something /more/ complex.

Orie Steele: -1

Justin Richer: +1

Michael Jones: -1

Dave Longley: -1 this will actually increase complexity, though i think we could find a compromise for certain use cases

Orie Steele: (same reason as before, it will lead to string encoding of complex structures)

Michael Jones: -1 People will need to pass JSON. Limiting it to strings just means that the JSON will be passed as a base64url-encoded string.

Jonathan Holt: -1 i can think of reasons why this won’t work, need more complex values

Proposed resolution: Use an Infra unordered map for metadata expression where the keys are simple strings and the values are compatible with Infra and serializable as JSON. (Manu Sporny)

Manu Sporny: +1

Dave Longley: +1

Dmitri Zagidulin: +0 - still do not understand it. (How do we get this structure off the wire?)

Orie Steele: +1

Jonathan Holt: -1 i need to understand infra more, can’t find any examples

Michael Jones: What is Infra?

Justin Richer: https://infra.spec.whatwg.org/

Brent Zundel: a couple of questions about infra
… it is a spec that defines stuff you can put in specs

Dmitri Zagidulin: dlongley: so in both cases, the metadata is a json string object?

Manu Sporny: it’s an infrastructure spec, allows you to talk about bytes and strings and links and stacks and queues and maps
… with a very specific concrete definition for all of those things
… concrete meaning all w3c specs are being urged to start using this language so we’re more precise about what a string or a map is

Markus Sabadello: 0, no strong opinion, i see roughly equal advantages and disadvantages in both..

Drummond Reed: 0

Dmitri Zagidulin: +1 to second (if we’re saying both proposals are just JSON, and are talking about complexity of values).

Jonathan Holt: any examples of where infra is being used?

Michael Jones: +0 It sounds like this is at least better than strings, but I’d like to understand the details before going +1

Manu Sporny: jonathan_holt, in most Web specs these days…

Daniel Burnett: general statement.. meta statement.. i’ve seen conversations like this before, frequently
… i’m sure others have as well

Dmitri Zagidulin: jonathan_holt - the WebAuthn and WebCrypto specs are examples of Infra usage

Daniel Burnett: if we can’t agree on one approach, the standards usually end up going to try to figure out what the simplest cases are and giving them the simplest syntax, then creating something uglier but still possible for greater complexity
… one of the challenges will be what is the simple cases? selecting specific properties?
… i think there will be difference of opinion
… that’s probably where we’re going to have to end up
… the next discussion may need to move in the direction of understand what constitutes the most common and simplest cases

Justin Richer: this really is about something above the transmission layer
… able to be separated from the did document itself
… it may be stored and transmitted along with the did doc, but ultimately the caller of this function is going to want access to this stuff separate from understanding what’s in the did doc at all
… the fundamental feature we have to have is the fields are going to have to declare their types
… and fields are going to have to be able to declare a specific parsable format beyond just saying oh it’s an infra structure deal with it

Michael Jones: +1 to metadata fields being typed

Dave Longley: wrt parsers.. i would expect in reality that the parsers people are going to use are going to be the same ones they use for parsing the did doc
… that’s not to bring up a layer violation, but in reality if you’re going to do json.parse on the did doc, you’re going to use json.parse on other layers as well

Justin Richer: -1, did document can be in CBOR but metadata structure shouldn’t have to be

Dave Longley: i’m expecting the same parsers to be used as opposed to custom ones that we should avoid

Brent Zundel: what steps can we take before we talk again?

Dave Longley: that’s true, justin, but i would expect people to use an out of the box JSON parser for that too vs. writing a custom one

Manu Sporny: noting time, i think we’re going to have to keep talking about this
… still confusion
… dmitri said there’s confusion on his side, some questions, and jonathan_holt and mike have noted that they need to be able to look into infra to have a stronger opinion
… everyone’s homework is if you have not looked at infra please do

Markus Sabadello: There are open PRs for supporting string values only, someone should propose an alternative PR for complex values.

Michael Jones: What’s a link to the infra spec?

Jonathan Holt: Orie, thanks this example helps!

Manu Sporny: and engage in issues if it makes sense

Drummond Reed: The Infra spec is very solid. I like it a lot.

Dave Longley: selfissued, https://infra.spec.whatwg.org/

Manu Sporny: feels like the next special topic call or virtual f2f we’ll need time on this

Justin Richer: dlongley: I would expect us to put guidance on the metadata structure root definition that says properties should use json serialization

Brent Zundel: those proposing more complex value types, we don’t have a PR that represents that view, that could be beneficial
… we’re at time
… thanks!

Dave Longley: justin_r: understood, my only worry there is that the common case will be to just do lots of “double parsing”

Justin Richer: dlongley: I think double parsing is unavoidable tbh
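
Illustrative sketch (not from the call): what "double parsing" can look like if values are limited to strings and a structured value is carried as base64url-encoded JSON, as raised during the first straw poll. The key names and the choice of JSON for the outer structure are assumptions for the example.

```python
import base64
import json

# Structured value a resolver might want to return (hypothetical content).
debug_info = {"steps": ["lookup", "verify"]}

# String-only metadata: the structured value has to be packed into a string.
encoded = base64.urlsafe_b64encode(json.dumps(debug_info).encode("utf-8")).decode("ascii")
wire = json.dumps({"content-type": "application/did+json", "debug": encoded})

# Consumer side: first parse yields strings only...
outer = json.loads(wire)
# ...and a second decode-and-parse step is needed to recover the structure.
inner = json.loads(base64.urlsafe_b64decode(outer["debug"]))
```

With values allowed to be maps and lists directly, the second step disappears, at the cost of the outer parser always descending into every value.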