DID WG Topical Call on metadata — Minutes

Date: 2020-07-16

See also the Agenda and the IRC Log

Attendees

Present: Markus Sabadello, Michael Jones, Dave Longley, Brent Zundel, Manu Sporny, Orie Steele, Jonathan Holt, Dmitri Zagidulin

Regrets: Amy Guy

Guests:

Chair: Brent Zundel

Scribe(s): Dave Longley, Manu Sporny

Content:


1. Metadata

Brent Zundel: Welcome to the special topic call.
… The topic is Metadata.

Brent Zundel: https://github.com/w3c/did-core/labels/metadata

Brent Zundel: The link shows all the things in DID core currently labeled metadata. Our discussion can touch on all of this. I think it’s more productive if we focus on specific PRs.

Manu Sporny: https://github.com/w3c/did-core/pull/347

Manu Sporny: Two days ago I put in a PR to hopefully form the basis of the discussion for this call.
… Adds some language to the spec on metadata. I tried to stay very high level in general.

Manu Sporny: https://pr-preview.s3.amazonaws.com/w3c/did-core/pull/347.html#metadata-structure

Manu Sporny: In a combination of Markus and Justin’s PRs there was a section called Metadata Structure.
… Link in IRC on the proposed text. The current text has been moved over to the Infra spec.
… It’s a specification infrastructure spec. Other specs have adopted it, it talks about abstract data types like strings, maps, lists, booleans, etc. It talks about how to make them real/concrete and so on.
… People seem to like using it and things are clearer what we mean by lists and maps and so on.
… Based on that I put together a PR to talk about the metadata structure in the same way. I tried to strike a compromise with what Justin was asking for with strings as keys and strings as values – and others wanting support for more complex values.
… The metadata structure proposed says you should use strings for values but you can use others and those have to be lists, maps, booleans, numbers, or null.
… If you have to use something other than a string for the value, you can use these other things.
… There are two examples on what this would look like and shows what it would look like if you encoded the structure to JSON.
… Thoughts on the direction?

Markus Sabadello: I think this PR is a great basis, I really like it. I like the compromise. I think it should be able to support all the use cases we’ve heard. I want to point out one thing which is that what Justin would say and I also believe, is that we’re specifying this format of metadata in an abstract way on a certain level.
… We are not defining if the metadata goes into the DID Document or if it is external or if it gets sent in a protocol or whatever. Just like with the contract we’ve defined for the other data, we aren’t saying too much here. I think that’s completely fine.
… So +1 on the PR.

Orie Steele: +1 to JSON, with a recommendation to keep things simple

Michael Jones: As I said on the previous time we discussed this, I believe that the metadata values should be arbitrary JSON objects. There are plenty of times you want to use the value true or have lists of things. There are reasons that the connect metadata and the oauth metadata are not just strings. We’ve found it to be very useful. I don’t know why we would choose string only.

Manu Sporny: The current specification does not say string only. That may be a misunderstanding.
… The PR says the value “should” be a string, but you can use any kind of JSON encodable value.
… The PR says it must be possible to losslessly support transformations across all representations. And, Mike, there is an option to use all values that can be serialized to JSON and back (bools, lists, numbers, etc.).
… The PR is trying to strike a balance between what Justin wants and what others do – it strongly recommends strings but you can allow other types.

Michael Jones: But that is splitting the baby in half. The text is not pejorative. Now people will write strings when another value would be better. We should strike that sentence and say it just has to be serializable with strings, bools, arrays, maps, and null.
… Don’t bias it. That’s not good spec practice.

Manu Sporny: Could you state that in the PR and specifically call out Justin. This is a discussion you need to have with him.
… Justin is the one who is stating very strongly that values should be strings.

Michael Jones: I don’t view it as optionality, it’s generality.

Manu Sporny: The person that needs to be convinced is not me, it’s Justin. I’m fine with the text as is or as more general as you’re saying.

Michael Jones: Ok.

Dave Longley: I’m fine with the text as is, and am fine if we make the changes that Mike is suggesting. People are going to want to support complex types, but do understand desire to keep things simple if you can.

Orie Steele: I agree with dlongely

Dmitri Zagidulin: +1 to what dlongley says.

Brent Zundel: I encourage people to make comments, add +1’s -1’s make specific concrete suggestions as needed.
… I feel like those on this call are in favor of #347 which is the latest from Manu.
… Does anyone have a proposal on where we can go next, there’s more metadata discussion to have.

Dave Longley: the text could say “If you need a more complex type, don’t invent your own, but try to keep it simple”

Dave Longley: if Justin isn’t ok with Mike’s simpler change.

Manu Sporny: I don’t know if you’d object to the current PR as is.

Michael Jones: Yes, I would object now, I’m writing the comment as we speak.

Manu Sporny: If we changed it to remove the “should be string” you’d be fine?

Michael Jones: That’s correct.

Markus Sabadello: I wanted to respond to Brent – if we’re done with this PR the other main topic would be “what metadata are we going to define”, what properties do we want to talk about.
… One is called define DID Document metadata. Another issuer is about the “updated” property. We could talk about those things.

Dave Longley: I don’t have comments on the other PR, I know there are other outstanding issues, I’d rather take those issues and get those knocked out and closed.

Orie Steele: yes, +1 to reviewing URLs :)

Manu Sporny: I agree with that, should be issue driven. I’d like to start with the “updated”/”created” one that should be straightforward.

Manu Sporny: Justin’s Metadata PR – https://pr-preview.s3.amazonaws.com/jricher/did-core/pull/253.html#metadata-structure

Manu Sporny: I wanted to make sure we gave enough time to the other metadata PRs. Since Justin isn’t here, I’d like to see if there is stronger support for that other PR from him.

Manu Sporny: Here is 347 preview – https://pr-preview.s3.amazonaws.com/w3c/did-core/pull/347.html#metadata-structure

Manu Sporny: I wanted to make sure people looked at (drawing attention to it since Justin can’t do so since he’s not here).

Jonathan Holt: Looking at the Infra specs, regarding maps, the alias is for an ordered map.

Dave Longley: it’s based on key insertion order

Dave Longley: which is often irrelevant here.

Manu Sporny: Yes, it is deterministic and it’s the most widely implemented thing.
… I think it’s a “first inserted wins” order. It’s based on the order that is inserted.

Dave Longley: If you are talking about the serialized version of a map, the order doesn’t matter… don’t know if we need to make a comment to that… ordering isn’t that relevant to us.

Michael Jones: Did you really mean #253? That’s about DID dereferencing and DID resolution …

Manu Sporny: That’s where the metadata section lives that Justin wrote.

Orie Steele: yep, its really hard to review that PR

Manu Sporny: So that’s the one I believe he wanted us to take a look at, so you’re right that it covers other things, but it also has the metadata structure in it.

Markus Sabadello: I suspect that Mike might be correct – the #253 does have metadata but the other one #299 only adds the metadata stuff. That’s the more recent one.

Markus Sabadello: https://pr-preview.s3.amazonaws.com/jricher/did-core/pull/299.html#metadata-structure

Markus Sabadello: I think Justin would have liked us to look at that one.
… Rather than #253.

Manu Sporny: Yes, you’re right, Markus, we should look at the newer one.
… The question to the group is – what language do you prefer, the one in #299 or the one in #347.

Michael Jones: So not #253?

Manu Sporny: That’s correct, I was mistaken. Look at #299.

Michael Jones: I can’t live with #299 because it says it must be a string. That’s just shooting ourselves in the foot.

Manu Sporny: From an editorial perspective, #299 has a lot of normative language that is hard to follow, from an editors standpoint, I’d rather use simpler language. There is simpler language with one paragraph in the #347 PR that I believe says almost everything that is said in the #299 PR.

Markus Sabadello: I think that’s the one substantial difference between the two PRs, whether we allow just strings or allowing other data types for values.
… And it sounds like there’s more support for the latter (more types for values). I see basically two other differences, #299 has lots of MUSTs and MAYs and normative language, #347 just saves this space/doesn’t have to specify in such detail because it reuses the Infra spec. But it should be the same rigor.
… But we save the language by just referencing Infra. I think also #299 talks more about metadata coming in different representations and can look different based on how it’s implemented and I personally like that. My preferred way of proceeding is to use #347 and then add things from #299 on top of it.

Orie Steele: -1 to #299… because of the string restriction, too much normative language…. prefer to take the manu’s pr, and then see a smaller, focused change set against it

Dave Longley: +1 to starting with #347

Manu Sporny: With that, would there be any objection to starting with #347? Including with dealing with Mike’s objection? It may be hard to convince Justin of Mike’s change.

Brent Zundel: I think if we as a group aren’t hearing any objections to starting with #347 then we should do so. We encourage folks with different opinions to join us but let’s move forward with #347.

Manu Sporny: I think we can knock out “updated” and “created” properties really quickly.
… I will make some assertions about those and see if people agree. I think the attempt here was to talk about when the DID Document itself was updated/created and it is metadata about the DID Document, full stop. That doesn’t mean we can’t have other metadata properties, etc. but I’m trying to define what we have so far.

Markus Sabadello: Just as a background as properties – those are why we started the metadata discussion. Are we talking about when the DID subject was created or the DID Document? I completely agree with Manu that these are about the DID Document and are metadata. I would describe them as the time when the create/update operation of the appropriate DID method were executed.
… I think it’s when those operations were executed.

Dave Longley: +1 agree with Manu and Markus

Manu Sporny: Then we have an answer to the issue, those properties do not belong in the DID Document, that could be the answer to #165.
… We could put those into the metadata structure and those properties live outside the doc and we can define metadata in a space in the spec and talk about those properties there. We could knock out a number of issues #65, #174, #203.
… Anyway, we’d be able to knock out a number of those.

Dmitri Zagidulin: no objections from me, although I imagine some DID methods will want to store those two metadata fields in the did doc itself, for optimization

Jonathan Holt: To clarify, the “updated” and “created” fields are metadata about the DID Document as asserted by the DID author, not during the resolution for DID methods. Is that correct?

Markus Sabadello: It’s metadata about the DID Document, it’s not about when it was resolved, it doesn’t come from the resolver – we’re not really saying who has asserted it. It could be the DID controller it could also be the DID method. We would rely on the DID method to be able to read/access those metadata properties like “created” and “updated”.

Orie Steele: we don’t know who sets that metadata…

Orie Steele: agree with markus

Dave Longley: +1 agree with Markus and +1 strongly agree with Manu’s line of reasoning above

Markus Sabadello: I think we’d move these things to the metadata section.

Manu Sporny: Before I put a proposal in, did that answer your questions, Jonathan?

Dmitri Zagidulin: I’m reluctant to bring this up, but this does bring up the question of - so where do common metadata fields get defined? did core registry?

Orie Steele: we don’t know who is creating that meta data…. we don’t know ifs its the did controller.

Orie Steele: there is no signature / crypto associated with “updated” / “created”

Jonathan Holt: If it’s true that it’s metadata about the DID Document as determined by the DID author who published it – it’s just convenient to sort and find the appropriate keys based on metadata. Without it being resolving; we have issues with BTC-R where publish times might be different from when updates happen. There’s wiggle room. The date and time from a block would be different from what was asserted by the individual.

Markus Sabadello: In my comment https://github.com/w3c/did-core/issues/65#issuecomment-597030882 I said about the “DID document metadata”: “The sources of this metadata are the DID controller and/or the DID registry.”

Manu Sporny: So right, will try to get very pedantic, so will nitpick on some of that language. We don’t have a “DID author” definition. We talk about DID subjects and DID controllers. The “updated” and “created” fields typically come from the DID registry, the DLT/Verifiable Data Registry itself, for example.
… You could say for did:key that the created and updated fields change since there is no such registry. I’m trying to stay away from who sets it. I’m trying to say it’s up to the DID method.
… Instead of saying it’s coming from the DID controller.

Orie Steele: +1 to what manu is saying… its definitely not “asserted by the did controller” all the time

Jonathan Holt: Publishing to IPLD is the distributed ledger, but it’s asserted by the person who publishes it.

Brent Zundel: I think what we’re going to say is that because every DID method is different – the “updated” and “created” properties are up to the DID method.

Jonathan Holt: +1 that’s fine

Orie Steele: +1 the origins of “updated” and “created” are 100% method specific

Dave Longley: Where those fields come from is up to the DID method, what they mean is the same.

Markus Sabadello: I agree with all that.

Orie Steele: security-ness of how did methods handle “created” and “updated” are for sure reasons to choose or not choose a did method.

Markus Sabadello: When we came up with lists of buckets at the DID F2F we agreed with this.
… We could have made additional buckets for DID controller asserted data and DID method asserted data and so on.
… But due to the reasons that were already mentioned we felt it was sufficient to leave it up to the DID methods to describe where it comes from.
… An interesting example is the did:web method is that the DID controller who set the value for “created”.
… I agree that the one bucket with DID Document metadata is sufficient to serve the needs that we have.

Proposed resolution: The updated and created DID Document metadata properties are about the DID Document itself. They are defined as DID Document Metadata and expressed using a Metadata Structure. They are typically set using information from the Verifiable Data Registry, but might not be – how they are set are DID Method specific. Their values are Datetime values. (Manu Sporny)

Dave Longley: I don’t object to the proposal, not sure if we’re covering … seems to be implicit that metadata structure is not in the DID Document.

Orie Steele: +1 to saying meta data is not inside the did documetn

Michael Jones: I don’t think that saying that it might be in a “Verifiable Data Registry” is necessary.
… It might be defined any place, it might be defined any place.

Manu Sporny: You would prefer to say how it’s set is DID method specific?

Michael Jones: Yes. I think “typically” is also misleading.

Dave Longley: +1 to not saying “typically” in specs as much as possible

Orie Steele: +1

Orie Steele: i was agreeing with selfissued and dlongley

Markus Sabadello: Regarding the language about metadata being outside of the DID Document. The motivation of the resolution contract has been that we don’t specify where the metadata goes. We just specify that it exists. Like I said earlier, there could be different ways it could be expressed and where it lives. It could even be part of the DID Document. We have had a few issues that say that there could be a special section within the DID

Dave Longley: Document where it lives.

Markus Sabadello: There could also be a way to express it in HTTP headers. The whole point has been to avoid expressing where it lives.
… It would not be core properties but in terms of serialization it could show up perhaps as a nested structure within the DID Document.

Jonathan Holt: +1 to that, it doesn’t have to be in the core, how it’s actually outside I need clarification. If it’s not part of the core DID Document registration I think that’s fine.

Dave Longley: +1 strongly opposed to putting metadata inside the DID Document, conflation really bad for data modeling

Manu Sporny: +1 to what Orie is saying.

Dave Longley: +1 to Orie I mean

Manu Sporny: +1 to strongly oppose DID Document metadata /inside/ the DID Document (which are statements about the DID Subject).

Manu Sporny: s/+1 strongly opposed/dlongley: +1 strongly opposed/

Dmitri Zagidulin: -0 to what Orie is saying. (in that, I suspect did:web will need to store ‘updated’ and ‘created’ in the DID DOcument itself. to be protected by the proof over the whole doc)

Manu Sporny: s/+1 to Orie/dlongley: +1 to Orie/

Orie Steele: I am strongly opposed to mixing metadata into the DID Document. It makes it confusing whether statements are about the DID subject or are metadata. We should not duplicate things in the DID Document either. We can’t stop people from misusing extensions that do it, but we should advise against. We should add specific language to make it clear to not mix those things.
… We should advise people not to mix metadata about the DID Document into the DID Document.

Dave Longley: Orie said what I was thinking, conflating these things are dangerous for consumers, we can’t stop people from doing bad things, but we should strongly advise against it.

Manu Sporny: Please make concrete changes to the proposal so we can see if we get to consensus.
… I feel that we could get to consensus if we remove the sentence about keeping metadata out of the DID Document but I agree with Orie and Dave that it would be a mistake not to strongly discourage that.

Markus Sabadello: I’m not strongly opposed to it either, I’m fine with it. Just wanted to get that background information in there.
… Have also seen a VC have a single data structure that has data about a credential but also a section about the subject.

Dmitri Zagidulin: So I don’t have an objection to the proposal. That seems fine. I think it will be fairly common to see “created” and “updated” in the DID Document itself. I don’t object to the proposal.

Orie Steele: -1 to a class of did methods that are signed… we just removed that from did core.

Dmitri Zagidulin: There will be a common class of DID methods that will keep a signature in the DID Document itself. There won’t be an outer envelope that protects the integrity or an underlying blockchain. A class of DID methods will have DID Documents that are self-signed/self-protected or whatever, for those it only makes sense to put the metadata in the DID Document there.

Jonathan Holt: What Dmitri said. It’s method specific and I think … as an author of a DID Document, I’m signing it and attesting to it. It seems like it’s verbiage trying to block specific DID methods.

Dmitri Zagidulin: manu - re “subject-asserted” - I’m not sure it was proposed to be subject-asserted.

Dave Longley: There is a clearly better way to implement that: Wrap the DID Document into a structure with metadata and the DID Document and sign that instead.

Dmitri Zagidulin: dlongley: why is that better?

Dmitri Zagidulin: it’s not clear at all to me.

Dave Longley: It doesn’t mix metadata with other data in confusing ways.

Markus Sabadello: DID Documents might not be exposed in the same way as the resolution function. They may be internal things.

Orie Steele: I want to echo what Markus just said – internal representations of DID Documents and external ones are two separate things.

Dave Longley: +1 to Orie and Markus, this helps solve this problem!

Manu Sporny: +1 to dlongley, Orie, and Markus is saying – we need to get the layering right, shoving everything into the DID Document is not proper layering… comingling security stuff is bad.

Orie Steele: There are things that are not controlled by the DID controller (or may not be) and we should not co-mingle data that doesn’t have those same guarantees across DID methods; DID methods may do so internally but not expose that externally.

Proposed resolution: The created and updated DID Document metadata properties are about the DID Document itself and are held outside of the DID Document. They are defined as DID Document Metadata and expressed using a separate Metadata Structure. How they are set is DID Method specific. Their values are Datetime values. (Manu Sporny)

Orie Steele: +1

Brent Zundel: +1

Dave Longley: +1

Manu Sporny: +1

Markus Sabadello: +1

Dmitri Zagidulin: +1

Jonathan Holt: 0 reservation about the how

Michael Jones: +1

Manu Sporny: If this holds, we will be able to close issues #65, #174, #203.
… Mike, I’d like to hear from you on issue #85 to see if this would address your concern as well.

Michael Jones: I would not try to do that in real time but could someone put that in the minutes?

Manu Sporny: https://github.com/w3c/did-core/issues/85

Manu Sporny: Yes, link in IRC.

Orie Steele: I was saying that the working group should recommend against mixing of statements about the subject, made by the controller, and statements about the did document… its a security issue, we should recommend against inviting people to experience it

Manu Sporny: +1 to what Orie just said above ^

Markus Sabadello: Just to say, what we just discussed about the DID Document metadata, I think the understanding is that this is the case for all DID Document metadata and “created” and “updated” just being examples.

Dave Longley: +1 to that Markus

Dave Longley: +1 to Orie as well

Brent Zundel: +1 to Markus
… Thank you everybody, we made good progress here. Please checkout #347 and #299, and comment on what you’d like changed, how to get to acceptance.
… In addition to that we have several issues we’d like to get addressed as this subgroup. Let’s keep moving forward. Thanks to all of you for taking the time to get this moved forward. You guys are awesome. See you next Tuesday.