DID WG (Virtual) F2F Meeting, 3rd Day — Minutes

Date: 2020-11-04

See also the Agenda and the IRC Log

Attendees

Present: Daniel Burnett, Justin Richer, Ivan Herman, Brent Zundel, Dave Longley, Drummond Reed, Manu Sporny, Amy Guy, Shigeya Suzuki, Markus Sabadello, Kristina Yasuda, Orie Steele, Adrian Gropper, Kaliya Young, Michael Jones, Jonathan Holt, Eugeniu Rusu, Yue Jing (景越), Dmitri Zagidulin, Daniel Buchner, Juan Caballero

Regrets:

Guests: Keiko Itakura, Kazuaki Arai, Takashi Minamii, Tomoaki Mizushima, Jean-Yves Rossi, Jay Kishigani, Titusz Pan

Chair: Daniel Burnett, Brent Zundel

Scribe(s): Manu Sporny, Dave Longley, Amy Guy, Wayne Chang

Content:


Ivan Herman: Slides: https://tinyurl.com/yydapmu3

Daniel Burnett: Welcome to the 3rd day of our VF2F!
… Let’s do quick introductions for those that haven’t yet.

Keiko Itakura: I have joined from the first day, but couldn’t introduce myself. This is Keiko Itakura from Japan, responsible for project management for Digital Identity, glad to meet everyone!

Daniel Burnett: Thank you and welcome, anyone else?

Manu Sporny: No one else is new.

1. Agenda, starting remarks

Daniel Burnett: As a reminder, we’re on day 3 - we’ve made some good progress on privacy, yesterday we began conversation on unknown properties - exciting conversation.
… We will continue that conversation tomorrow.
… But for today, we’re going to have Brent talk about the new W3C Process… we need to make a decision about that.
… We will also then have Daniel Buchner tell us about deterministic equivalence ID.
… We’ve been talking about that for a while.
… Daniel has a new take that he’d like to present to us.
… After the break, we’ll have a brief presentation on content identifier proposal from ISCC.
… We will then spend working time on test suite led by Orie.
… It is often the case in WGs that a small group of people work on test suites, but often others don’t know about what’s required to write tests… Orie is going to talk about how test suite works.
… Before we get started, there’s something I’d like to talk about — Brent and I were discussing the discussion about Unknown Properties — we were reminded of the disagreements that existed in the group before the Amsterdam F2F meeting… that was very frustrating for both of us.
… You may remember that WebRTC adoption suffered for years because of a split — an attempt to create competing standards, horrible for the community.
… I wanted to talk about what it means to do work here in W3C and why we’re here. I know that, and you will hear people say, what we want is interoperability. Yes, that’s true, but at a higher level, we want success and adoption of the technology.
… We want DIDs and VCs to be used broadly in the world.
… It is standards that enable that. I’ve seen a number of standards that have come out, that had poor interop, maybe 80%-90% — but that was good enough. We should strive for more, but we need to remember the end goal.
… The end goal isn’t a beautiful or elegant or ideal or perfect specification, the goal is something that works.
… Something that works for those that must implement — this is a practical commercial endeavor — we need to know that it’ll work for you and that’s the requirement. Some standards start off beautiful and get ugly, we might be making an ugly baby here, but if it’s adopted in the market… that is a success.

Dave Longley: +1 to burn, +1 to priority of constituencies … we need it to work well/simply/etc. for users of the spec

Daniel Burnett: I think we can do it, there is a path forward here — just as we did in Amsterdam, we believe it’s possible for us to come together and accomplish what we need to — we have a lot of intelligence and … in this community.
… Please lets work towards getting something that works… not the ideal, but something that works well enough — that’s possible, this group can do it. This group has done it in the past, and this group can be that success story going forward.

2. W3C Process and Patent Policy 2020

Ivan Herman: See slides.

Manu Sporny: Brent said deeply profound things on mute.

Manu Sporny: We all missed it.

Brent Zundel: These are the highlights of Process 2020 — we are operating under Process 2020.
… There is a great wiki explainer for Process 2020.

Ivan Herman: see P2020 explainer

Brent Zundel: Some highlights — makes it easier to update RECs and CRs, strengthens patent policy, provides Living Standards capability.
… Add yourself to the queue for questions.
… The big changes — in order to make substantive changes to Recommendations — you can republish them and make changes inline. You can republish editorial stuff easily.
… That can just sort of happen — changes to RECs, you have candidates, you review just those changes, allows patent/AC reviews, then it’s as simple as update request to republish REC.
… This stuff was going to apply more to the VCWG than this group…
… It is possible to publish a REC that identifies features that can be expanded upon.
… We can say what features can be updated — can be added following process above. Allows for us as spec writers to identify things we wish we could have done, but couldn’t in first round.
… In all of these — the whole Director’s Approval process is more streamlined — for things that are not controversial, it’s automatic.
… That reduces some of the bureaucracy.
… The thing that applies to this group — CR Snapshots and CR Drafts.
… You can trigger patent review process at this stage (it used to happen in later stages)

Ivan Herman: We need to be precise, patent review on CR is subject to WG accepting new patent policy. At the moment, we are in the old patent policy… all patent policy implications are not relevant to us.

Brent Zundel: Yes, that is what we need to decide today, we’re straddling both worlds today.
… This is language that I’ve pulled out of the wiki — thank you for sharing that Ivan.
… Between snapshots, we can publish CR Drafts
… If WG wants to do CR Draft, we can do so like Echidna works for us now…
… Pushes it out to TR space, as you’d expect it to happen.
… Streamlines the process of updating a CR and working toward the next step for a CR.
… Process 2020 is designed with new Patent Policy.

Ivan Herman: What it means in practice, the publishing CR drafts can be done in Echidna as we do now.

Brent Zundel: Yes, just as we made a decision to publish WDs, we can use Echidna to publish CRDs.

Ivan Herman: Unlike drafts, CRSes still go through the full-blown process.

Manu Sporny: CRD — Candidate Recommendation Draft

Manu Sporny: CRS - Candidate Recommendation Snapshot.

Daniel Burnett: Where CRS is approx. equiv to the prior “Candidate Recommendation”

Brent Zundel: Patent policy 2020 update — CR snapshot can be done as a patent review draft… previously it was a Proposed Recommendation (much later in the process), which didn’t provide implementers with patent protection.
… Any questions on Process 2020 before we jump into Patent Policy 2020

Daniel Burnett: So, the statement that you just made — is that the case even if we don’t accept new Patent Policy
… Or only if we adopt the new one.

Brent Zundel: If we don’t accept the new patent policy, the new one will come in during REC.
… Process 2020 and Patent Policy 2020 were written to work hand in hand… probably not much thought put into mix/matching them.
… CRs are now a CR Snapshot and could contain patent protection
… CR drafts make it easier to publish
… Flow is more automated
… We are not operating under Patent Policy 2020, we’re under Patent Policy 2017
… We could operate under PP2017, where patent review doesn’t exist until REC… but in my opinion, things would be much smoother if we operate under the new one. Having said that, I am not a lawyer. I am definitely not your lawyer.

Drummond Reed: +1 to moving to Patent Policy 2020 based on what I’ve heard so far

Brent Zundel: Changes to Patent 2020 — Patent Review Drafts… new process, do patent review earlier… that could be better.

Drummond Reed: Doing patent review earlier IS better.

Manu Sporny: Agree with Drummond.

Ivan Herman: For people new to W3C — might be worth emphasizing what patent policy exists and why it’s important.
… The thing that led to the current patent policy is that implementers should be able to implement things w/o being afraid of being sued by infringing a patent.

Drummond Reed: That’s also a big plus.

Ivan Herman: For those who are new to all this: if you are a member of the WG, a company, the patent policy means that you sign a commitment that even if you have a patent somewhere that might be relevant for this specification, you will not go after another company who uses that patent to implement this specification.
… This is what the patent policy is all about at W3C… the big difference is that this kind of commitment kicked in once the document was published as a REC.
… In practice, what that meant in some WGs, where implementers needed to implement in patent shark tanks… that meant implementers were scared to implement. Implementers didn’t want to implement until they were protected.
… As soon as something is a CR Snapshot, that is a patent review draft… meaning, you will be protected… in practical terms, can’t judge if this community is operating in a patent heavy environment. If the answer is no, we may not care, if the answer is yes, then we are better moving to new patent policy as soon as it can happen.

Brent Zundel: Other slightly more extensive changes…

Manu Sporny: I just wanted to speak in favor of this. There could be patents in there, but the sooner we can deal with this the better, it’s easier for larger entities to start pushing things when they know about patent protection. We can do patent policy about a year in advance — let’s do all the positive signaling we can.

Brent Zundel: What we went through were the definitional differences: by “specification”, we mean a REC-track document — a “patent review draft” is a specification that is under review.

Drummond Reed: +1 to Manu’s point about the earlier we can enable the market signaling, the better

Brent Zundel: Changes to commitments, make explicit that anything that you don’t add an exception for, an exclusion for, becomes a part of the RF license.
… Added persistence of licensing commitment… if you didn’t have exclusions for previous review draft, next draft will use same essential claims… this section adds subsequent patent review drafts…. doesn’t change anything, just clarifies.

Adrian Gropper: Who pays for the patent review?
… Who is doing patent reviews and who is paying for them?

Drummond Reed: Each member company does their own patent review.

Ivan Herman: The member companies do that patent review for their own patents. It’s always a discussion when we have potential members, but we do not necessarily ask a company to go through their entire patent pool all the time.
… Rather, it’s a statement by each organization that effectively says: “If we have a patent, it does not apply to THIS specification.”
… So, when we say “patent review” it doesn’t necessarily mean that institution will go through their entire patent database… it’s more of a promise for their own patent pool.

Adrian Gropper: Just want to say it back to all of us — I specifically have a patent, as a signatory of the IPR policy, I’m simply saying I will not enforce this patent against implementers of the specification.
… It doesn’t require me to do any work as long as I continue to stand by this thing that I signed.

Manu Sporny: adrian what you said is totally fine. By the way, do not bring up patents in the group, do not tell us if you know of one, that leads to all kinds of horrible things
… if you do know of a patent in the space do not tell us, do not bring it up in this meeting
… even saying that you know one exists is dangerous

Daniel Burnett: I had that happen in a group — someone on a call said “I know of this patent” — and everyone said “stop talking” — if you don’t know why this is important, consult your legal counsel.

Drummond Reed: In short, much bad stuff

Brent Zundel: If you are not excluding your patent — you don’t need to speak up.
… I am not a lawyer
… Minor changes for being able to do multiple patent disclosures…

Drummond Reed: I agree that this is an improvement. +1 to accept.

Brent Zundel: I suggest we agree to Patent Policy 2020 — we need to revise our charter… if we do, the Director will approve in December 2020.

Manu Sporny: (like a big mass wedding) like they do for Star Wars weddings sometimes. but for Patents.

Adrian Gropper: If you are the holder of a patent AND you / your lawyer say you can benefit from excluding it — you mention it because you’re the holder… then there is a separate process that has to be done… the spec has to find its way around the patent

Daniel Burnett: For patents that you hold: license or disclose. Do not mention patents that you do not hold.

Manu Sporny: to be more lawyery than that, don’t disclose them to the group, but expose them to the staff contact. There’s a chance people change their minds which is bad. there is a process for disclosure, please follow the process for disclosure

Adrian Gropper: When someone discloses this to the staff, it causes work for someone — just want to be clear that that’s true and to be sure we understand how that works.
… I don’t think any of us want to make work for anyone.

Ivan Herman: If that happens, then yes, you create work for your lawyers and for W3C staff. Maybe I was lucky — I was busy in WGs for 15 years and personally haven’t had to do a PAG.
… Though this PP was a big drama when it was created, it has become widely accepted by the community; there have been very few cases.

Manu Sporny: ivan maybe it would be helpful to talk about the patent advisory group that would be formed if a patent issue happens. I hesitate to talk too much more about this. The likelihood of it happening is rare
… we went through it in web payments and it was a horrible terrible experience and we never want to do it again. It took almost a year. Patents got chucked into the group in an attempt to disrupt it. it can be a terrible process, but the w3c has a really good process to navigate it

Brent Zundel: What you need to do is discuss this with your organizations — we are going to ask the question on PP2020 — make a proposal to the group that we do this. I hope you will be ready at that time to make that determination.

Adrian Gropper: I heard two different things from Manu and Ivan.

Daniel Burnett: If you wish to disclose a patent you have in order to not have to license it royalty-free, please reach out directly to Ivan and he will involve W3C legal. Do not mention it to the group.

Daniel Burnett: Change of agenda.
… We are putting Orie on now, he needs to leave later… don’t know where Daniel Buchner is.

3. Test Suite - Working Session

Ivan Herman: See slides.

Orie Steele: This session is progress on test suite…
… Our roadmap for how we’re getting tests for normative statements in the spec… if you take one thing away, you’ll be able to understand how we’re testing normative statements in spec.
… Before we dive into details… provide some background on testing… my background is in cybersecurity and computer science… I write tests on a regular basis — personally committed to them… but others may not be familiar with them.
… hard to test randomness… you want deterministic tests.
… Choose Determinism.
… You want to commit test vectors/fixtures to Version Control… if you are generating data and testing data — you won’t see when data changes in ways that are breaking.
… People break up tests differently — you want to group by common feature set — positive and negative tests…
… You don’t want individual test cases to be too long… then move on to the next thing — smaller, more precise tests… document what your scenarios are testing in plain English…
… If your business folks can’t understand the tests, it doesn’t matter if engineers think it’s right.
… When you write tests, link to issues and spec text, have as much info as possible, whether tests are good enough or not…
… You want to use realistic data — avoid unhelpful examples — hard to do for negative tests — important that your test data reflects production data.
… You don’t want to overfit to unrealistic data… you want to know what your test coverage is… for software libraries, you want to know that 90% of code is covered.
… Don’t repeat yourself, write simple tests.
… Tests are a formal proof mechanism showing that some behavior is supported…. prove that behavior exists, don’t trust that it exists.
… Architectural Approach…
… We need a way to test normative statements in DID Core — how is that done? It’s different for each W3C spec… we proposed this Dockerized test suite structure… inspired by Jest, used by React and Facebook and JS companies out there — a way of writing tests… describe blocks describe a suite of functionality.
… We create “Scenario” — loosely like “Describe block” in Jest.
… Basically, input, code, output…
… for example, “did:Example” contains no upper case letters is ‘false’.
… That would be the assertion.
… did:Example is the structured input
… ‘contains no upper case letters’ is an assertion
… ‘false’ is the expected value of the assertion.
… This is an example of a negative test case.
… Even if you don’t know Javascript, you should be able to read this and understand what we’re testing.
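The scenario just described can be sketched in plain JavaScript; the helper name below is an illustrative assumption, not the suite’s actual API:

```javascript
// Hypothetical sketch of the scenario above. An assertion is just a
// function of structured input that returns a boolean.
const containsNoUpperCaseLetters = (did) => !/[A-Z]/.test(did);

// Structured input: "did:Example"; assertion: "contains no upper case
// letters"; expected value: false (the "E" is upper case).
const input = 'did:Example';
const expected = false;
console.log(containsNoUpperCaseLetters(input) === expected); // true
```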
… What are we doing — one scenario per statement or not…
… It depends, but when in doubt, a single scenario per statement.
… It’s ok to have many smaller separate test files… less ok to have one test file that covers everything.
… What if you can’t test a statement?
… Raise a concern that the statement can’t be tested… someone else will solve the issue or the statement will be removed.
… if you try to write a test around something, communicate it to everyone else… people might help
… if you don’t know how to program, the first step for writing great tests is to help with the English test plan — provide examples of what you’d like to see tested.
… Things that do/don’t support capitalized letters.
… Getting Started….
… We have a repo called the DID Test Suite … we have a list of normative statements… may want to discuss how this is broken down further.
… Previous possible recommendations… to put a single recommendation on the table to start… we want to see companies step forward to commit to statements in sections.
… We’d like to see multiple companies commit to doing that — you’d know, you’ll be assigned a group of statements and you’ll complete tests for all of them… that group of statements will be an issue assigned to you… you can communicate on how to close that issue.
… The key part being that in addition to that, if you are concerned about whether or not you can write a test about something — check issues to see if it’s assigned to anyone already — start tackling it by thinking through things on issue itself.
… Also, don’t just start working on something w/o having an issue assigned to it — we want to avoid double work, we don’t want two people working on the same thing.
… The next piece of this is how we do breaking up of tests… how we do assignment of test for companies — I can speak for Transmute — we’re obviously members of the WG — we’ve written some for DID Parameters… both positive and negative tests… Markus is already working on DID Syntax tests.
… Already some examples in repo for you to follow along.

Daniel Burnett: Would it be possible for you or Markus to do screen share and go to repo and pull up representative tests?
… For many people, they are nervous about going there.
… Hopefully people won’t be scared off by Javascript.
… But maybe you can show what an assertion looks like.

Orie Steele: See the statements and tests…
… normative statements around DID Parameters.
… How do we get these things to be green and why are these things red?
… This is the test suite repo in Github, normal repo, it’s a monorepo, packages are two modules to test and generate test report… test server is where tests are implemented that are processing input/output
… This test server is dockerized, you can run it locally, as easy as possible, abstracting the engineering side — can run tests from the server regardless of language… you will just provide structured JSON input.
… An HTTP server setup w/ docker… you can easily run it in your environment… each scenario, did parameters, negative and positive tests.
… These tests make sure that server is processing scenarios correctly, these are tests about tests.
… You may find it helpful to write these as you go — everything that you care about has a test associated with it… it’s not required to have tests about tests.
… Let’s look at an example…
… We start by importing a fixture… then we pass that data to the web server, which runs it… the response from the web server is the same as what’s committed to source. No part of the DID Parameter test scenario has changed.
… Tells you that things are testable, doesn’t say what’s tested.
… What are we passing in first — look at what’s tested… positive and negative tests.
… Here are valid inputs for DID parameters.
… The service parameter, version id, version time… under this are response assertions… you would expect 100% of these to be true, these are positive tests, these are valid examples.
… A valid example and positive test assertion should be true… they all are.
… Let’s look at negative tests… random base64url encoded binary… in order to test some structures, you need to encode JSON and test the unencoded value… that’s a confusing part of that piece here…
… if you b64decode it, is it ascii string, it’s not… negative test, these values, expecting them to fail.
… Negative tests are a bit more confusing, if you find them difficult, just skip them, let someone else do them.
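A minimal sketch of such a base64url negative test, assuming Node.js Buffer for decoding; the sample bytes are illustrative:

```javascript
// Negative-test sketch: random binary is base64url-encoded so it can
// live in a JSON scenario; the test decodes it and checks whether the
// result is an ASCII string, expecting the assertion to be false.
const isAsciiString = (bytes) => [...bytes].every((b) => b <= 0x7f);

// Illustrative stand-in for "random base64url encoded binary".
const encoded = Buffer.from([0xff, 0x00, 0x9c]).toString('base64url');

// Decode and check: binary bytes are not ASCII, so this is false.
console.log(isAsciiString(Buffer.from(encoded, 'base64url'))); // false
```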
… DID Parameter scenarios…
… How are we sending this information to the server…
… Let’s look at positive one first… assertions map to normative statements… the key for assertion, value is some JS function of the scenario which contains input and expected output… program returns boolean.
… Anything that you can test in a program you can test in this format and you can get a true/false value from it…
… This hl parameter’s associated value must be an ASCII string; we’re pulling the parameter’s value from the URL and checking whether it’s an ASCII string or not.
… Most of the tests in the block pull a different param from the DID URL and then all call isASCIIString… that’s an example of Don’t Repeat Yourself
… When you post scenario for positive tests, this is the code that runs, that’s what’s used to generate responses.
… You have to handle awkward base64url encoding — had to represent this in JSON, have to represent negative test cases — pulling that encoded value apart and then asserting that it’s not ASCII.
… scenario is JSON, program is Javascript, you can put those two things together and always solve the problem.
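The scenario-plus-program pattern described here might look like the following sketch; the helper names, statement keys, and DID URL are illustrative assumptions, not the actual test-suite code:

```javascript
// A scenario is plain JSON; each assertion is a small JavaScript
// function of the scenario that returns a boolean. All parameter
// checks reuse the same helpers (Don't Repeat Yourself).
const isAsciiString = (value) =>
  typeof value === 'string' && /^[\x00-\x7F]*$/.test(value);

const getParam = (didUrl, name) =>
  new URLSearchParams(didUrl.split('?')[1] || '').get(name);

// Keys map normative statements to checks over the scenario input.
const assertions = {
  'hl value MUST be an ASCII string': (s) =>
    isAsciiString(getParam(s.didUrl, 'hl')),
  'versionId value MUST be an ASCII string': (s) =>
    isAsciiString(getParam(s.didUrl, 'versionId')),
};

// A positive scenario: every assertion is expected to be true.
const scenario = { didUrl: 'did:example:123?hl=zQmWvQx&versionId=4' };
for (const [statement, check] of Object.entries(assertions)) {
  console.log(statement, check(scenario));
}
```

Because the scenario is pure JSON and each assertion is a pure function of it, the same inputs can be posted to the dockerized server from any language.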
… Final point about this repo - test vectors… that is structure used to generate test suite… this entire respec document is programmatically generated — says when it was last generated…
… If you submit scenarios with positive/negative merged together…
… In addition to testing all statements, update-respec-test generates the respec document… it takes all inputs, sends them to the test server, then takes the responses, injects them, and generates a table that looks nice… this is complicated, you probably won’t need to touch it, but you might have questions… how are tests written and where do they live… did-core-test-server under services — under scenarios
… Each scenario covers positive/negative — any questions?

Manu Sporny: first of all +1 this is awesome and a lot of work, really appreciate you putting in the time to get this set up

Drummond Reed: +1 to Orie really carrying the water here.

Manu Sporny: I wanted to commit digital bazaar to writing a chunk of tests
… I expect we may have some time to talk about assigning companies certain sections
… and underscore that we need to see other companies step up to do this, Danube Tech and Transmute have, we need 3 to 5 other companies to contribute

Markus Sabadello: This is great, spent a bit of time adding scenarios for DID Syntax, in general elegant and easy to figure out where to add scenarios.
… If there is something that says “DID Method” — would you reuse most of the code and just expect result to be different…

Orie Steele: yes, that’s best practice, write a function to do one thing, then return true/false.

Markus Sabadello: Can you talk a bit about how to integrate with implementations, libraries — how tests pick up files/fixtures from file system… but how do they get there?
… What are inputs/outputs — what part does implementation generate, how do I feed it into test suite?

Orie Steele: We have two competing objectives for test suite — test normative sections… doesn’t really help you if you’re an implementer.
… You may not care that there is a test out there that says did method needs to be uppercase.
… Two goals - first goal is test for all normative statements… second goal is, as a DID Method implementer, generate structured data to prove that your method is conformant.
… First comment — provide one example of a DID Method test that proves that the DID Method is conformant… right now, all tests are tests for normative statements… how that example works… you, as a DID Method, use version ID, you generate a set of JSON files showing how you use version id — those files match files in the test suite… programmatically generate test fixtures and then use the test fixtures to show that your implementation is conformant.
… You use your DID Method to generate data that’s conformant, thereby proving that your DID Method is conformant.

Shigeya Suzuki: I am happy to see this presentation, I have experience in test-driven development, do we have any idea of the timeframe for this work?

Orie Steele: As quickly as we can possibly go.

Shigeya Suzuki: I’d like to help

Daniel Burnett: Thank you, Shigeya!

Jonathan Holt: I’m a big fan of test-driven development, but I’m limited by my ability to write tests… what is the role of CDDL?
… I did write this for all items in the registry - using regular expressions to constrain them.

Orie Steele: I wrote the test suite assuming CDDL wouldn’t get adoption…
… The process of writing these tests is miserable because we don’t have CDDL… I’m a huge +1 to get CDDL into DID Core for test conformance; in the absence of that, we need to write JS programs by hand.

Brent Zundel: Just wanted to jump in — previous WG, jumped in and wrote tests, don’t be afraid.

Orie Steele: and ask for help.

Daniel Burnett: Much of the work is copy-paste for the most part.

Daniel Burnett: Thanks Wayne!

Wayne Chang: I am publicly committing Spruce to write tests, we want to comply with the tests coming out of the test suite, see you all in the JavaScript mines.

Drummond Reed: Are they right next to the Gringotts mines? So filled with lots of gold??? ;-)

Manu Sporny: orie one of the things that we have to think ahead to is there are going to be multiple people submitting tests and usually people reviewing how the group is doing, should we go to REC, it’s easy if you look at a side-by-side comparison where you see a feature and five implementations all passing one test
… is that built into the test report?
… how do we show that there are did methods all conforming?

Orie Steele: There are tests of normative statements and of DID Methods… the 2nd category is not yet supported in the suite… with a collection of fixtures for DID Methods, when you render their test results, you can see 5 people implemented version id… it’s fundamentally possible to represent the information, but we need examples around a single DID Method.

Daniel Burnett: End of queue, any last words, Orie?

Orie Steele: Let’s collaborate on Github issues, if you are a company that wants to sign up to help support — open issue, we’ll figure out how to divide up sections for companies.

Brent Zundel: Anyone can raise issues?

Orie Steele: Be careful about doing that, because the spec isn’t stable… editors/chairs should guide which statements we’ll be testing.

Daniel Burnett: Thank you, Orie — thank you for doing this, helpful to walk through it — may seem obvious to some, but it’s not always clear where to go.
… This was helpful to me and helpful for others, I imagine.

Drummond Reed: Thank you Orie!

Daniel Burnett: Next item is our break — as usual, this particular room will remain open, as will breakout room.
… Everyone be back here at top of hour… 12pm ET.

Brent Zundel: breakout room: https://zoom.us/j/97932508552?pwd=REFrMXF0NVBreTBhN0lzTVhYYS94Zz09

Daniel Burnett: Thank you Manu!

4. break chit-chat

Orie Steele: Issues related to serviceEndpoints and privacy concerns which need to be addressed: https://github.com/w3c/did-core/issues/382 https://github.com/w3c/did-core/issues/72

5. ISCC Presentation

Ivan Herman: See slides.

Brent Zundel: Our next session is by the ISCC, we thought it would be valuable to the WG. We are joined by Titusz Pan.
… It’s up to you if you want questions during or after the presentation

Titusz Pan: I think it would be best to Q&A after
… This is my first contact with the DID community, I’m from Germany and an open source developer with a small business. I’m the architect and developer of the ISCC.
… Thanks for inviting me. Let’s get started
… ISCC is the abbreviation for International Standard Content Code. It is meant as a universal identifier for content types such as audio, video, etc. It’s a lightweight fingerprint for digital content, a mix between an identifier and a fingerprint.
… It should be cross-sector applicable, such as for user generated content, meant to be used as a generalized identifier. We are planning on cross-ledger interoperability.
… The goal is to use these identifiers as the subjects of claims.
… Why are we working on this? On the Internet, content does not currently enjoy the benefit of standard identifiers. Facebook or other sites do not have identifiers for content, for example. Existing identifiers such as digital object identifiers have considerable overhead, costs, and centralization.
… The idea is to negotiate what an identifier means without the need for a third party, communicating “above” the identifier. Proprietary systems typically create a competitive imbalance, lock-in to platforms, etc.
… With DID, we are commoditizing machine-to-machine interaction, so we need to know how to talk about these things.
… If we have a multi-sided ecosystem, there could be an interest from any participant in the ecosystem to come up with identifiers for digital content. This is not limited to content creators, but also other users of content such as archivers, ebook distributors, or others.
… Authorship & copyright is not a requirement to have an identifier, but an identifier could be a requirement for authorship & copyright.
… We can base these on algorithms on content, such as hashes
… What’s exactly being identified? We have 6 layers of content identification. Content is an abstract concept and we seek to deconflate the different threads composing it.
… We have semantic views, such as meaningful descriptions in the form of vectorized data. We may have a generic manifestations such as pixels or text. We have format-specific manifestations to a specific data type. we have exact format specifications, and also the notion of original copies.
… The ISCC is made of four components, hashes in the general sense, with different perspectives of the data: (1) metadata about the data and a hash of that, (2) perceptual similarities of the content such as pixels for images or text for documents, (3) a raw data component based on a cryptographic hash changing in an uncorrelatable fashion.
… We will go through each layer.
… Meta-ID requires title for content, such as the filename. There is opportunity to include more details and structured formats, we want to be able to estimate similarity across content using this. We can then cluster content based on similar metadata.
… Semantic-ID is the identification of meaning, such as using a multi-lingual text embedding where ‘king - man + woman ~= queen’
… We should be able to measure semantic similarity across human languages.
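The ‘king - man + woman ~= queen’ relationship can be illustrated with toy vectors (the values below are invented for illustration; real embeddings come from a trained model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical 3-dimensional embeddings, made up for this sketch.
king  = [0.9, 0.8, 0.1]
man   = [0.1, 0.9, 0.0]
woman = [0.1, 0.1, 0.9]
queen = [0.9, 0.1, 0.95]

# king - man + woman should land near queen, and far from man.
result = [k - m + w for k, m, w in zip(king, man, woman)]
assert cosine(result, queen) > 0.95
assert cosine(result, man) < 0.5
```

The same cosine test, applied across embeddings of different human languages, is what allows semantic similarity to be measured language-independently.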
… Content-ID helps us identify content-level similarity, such as two “identical” images, even across different data types, file formats, etc. Instead of raw data, we encode the information structure.
… Data-ID does not extract the content; it only looks at the raw data and performs content-level chunking that is shift-resistant against content changes (most of the chunks stay the same, enabling set-similarity measurements)
… This lets us measure data similarity.
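The shift-resistance point can be sketched with a toy content-defined chunker. Real implementations (including ISCC's data layer) use a rolling hash over a sliding window; the single-byte boundary rule below is an illustrative stand-in chosen for brevity:

```python
def chunk(data: bytes):
    """Toy content-defined chunking: boundaries depend on content,
    not on byte offsets, so they re-synchronize after an insertion."""
    chunks, start = [], 0
    for i, byte in enumerate(data):
        # Arbitrary illustrative boundary rule: cut after any byte
        # whose low three bits are all set.
        if (byte & 0x07) == 0x07:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

text = b"the quick brown fox jumps over the lazy dog" * 4
a = chunk(text)
b = chunk(b"XX" + text)  # two bytes inserted at the front

# Only the first chunk changes; every later boundary falls on the
# same content, so the chunk sets stay almost identical.
assert b[0] == b"XX" + a[0]
assert b[1:] == a[1:]
```

With fixed-size chunking, the two inserted bytes would shift every chunk and destroy all overlap; content-defined boundaries are what make set-similarity measurement possible.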
… Instance-ID allows us to compute Merkle root hashes over the raw data, focused on data integrity.
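A Merkle root over data chunks can be sketched as follows (a minimal illustration using SHA-256 and promoting odd leftover nodes unchanged; this is not a claim about ISCC's exact construction):

```python
import hashlib

def merkle_root(chunks):
    """Binary Merkle root over a list of byte chunks.
    Minimal sketch: an odd leftover node carries up unchanged."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        if len(level) % 2:  # odd node out
            nxt.append(level[-1])
        level = nxt
    return level[0]

chunks = [b"alpha", b"beta", b"gamma"]
root = merkle_root(chunks)
# Any change to any chunk changes the root, which is the
# data-integrity property Instance-ID relies on.
assert merkle_root([b"alpha", b"beta", b"gamm4"]) != root
```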
… The high-level overview of creating an ISCC: seed metadata from the data itself, then processing based on the content using fingerprinting. We have a full implementation in Python in just over 500 lines of code, and also a dockerized HTTP service so you can try it out. There are libraries to interact with the content and extract features.
… How does it relate to the digital object identifier used in scientific publications? They have the same prefix but differ further into the identifier string, and the strings may be measured against each other for similarity.
… It’s not just an identifier for digital objects, but any assets.
… ISCC takes different perspectives across the layers of data for dimensions of similarity measurements. They can be used in a matrix representation, as seen on the slide.
… Compared to UUIDs and SHA256, ISCC brings potential semantic understanding to the identifier itself.
… We don’t want to build a hash function that changes on bitflips, instead we’re interested in a format that reflects changes to the underlying content.
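The contrast with a cryptographic hash can be sketched with a toy similarity-preserving hash in the style of simhash. This is an illustrative construction over word features, not ISCC's actual content-code algorithm:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """Toy simhash: each word hash votes per bit; similar inputs
    end up a small Hamming distance apart."""
    weights = [0] * bits
    for word in text.lower().split():
        h = int.from_bytes(hashlib.sha256(word.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

s1 = "the quick brown fox jumps over the lazy dog"
s2 = "the quick brown fox leaps over the lazy dog"   # one word changed
s3 = "an entirely different sentence about something else"

d_similar = hamming(simhash(s1), simhash(s2))
d_different = hamming(simhash(s1), simhash(s3))
# Near-duplicates land close together in Hamming space, whereas a
# cryptographic hash of s1 vs s2 would differ in roughly half its bits.
```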
… ISCC-ID allows us to bridge to stores such as on DLTs to bring additional context.
… ISCC-Codes are decentralized, content-based, and similarity-preserving, while ISCC-IDs are short, unique, owned, persistent, and resolvable. We want to be able to put ISCC-Codes on blockchains in a way that anyone can add them. The ISCC-ID contains information about the blockchain id and other deduplicating features to prevent collisions.

Orie Steele: https://github.com/iscc/iscc-specs/issues/90

Titusz Pan: We want anyone to resolve ISCC-Codes in a universal way, from anywhere. This is where we think DIDs are relevant due to their resolvability properties. We are considering a did method, but need further research. We found a potential fit after reading the did-core specification.
… This is related to existing identifiers such as ISWC for works, and subcategories such as recordings.
… We have been approached by DIN, who brought it to ISO for standardization of the ISCC. We want to standardize the procedure and structure of the code. There is no central authoritative registry, so this is a collaboration opportunity with the DID specification and related items from this group. That’s why we’re here.
… We are starting with places such as audio identifiers where there’s some precedent.

Adrian Gropper: This is Fabulous work!

Manu Sporny: Yes, you can use DIDs for non-person entities (NPEs)
… There’s been some chat about the base58 encoding you’re using, which doesn’t seem to be base58btc. We’re trying hard to keep base58 encodings from splitting into incompatible variants.

Titusz Pan: We may switch to base 32 to be case-insensitive.

Wayne Chang: my question is to the group: what are the semantics of resolving a DID that refers to a non-person entity such as an algorithm or a content hash?
… when I find the verification method there, what does that mean?

Drummond Reed: This is indeed very interesting work. Very nice presentation Titusz.

Titusz Pan: The idea is that I can take an ISCC code and put it on the blockchain, and it’s metadata about the content. It’s open and part of the metadata specification.

Wayne Chang: if I resolve an ISCC based DID and I get a public key as part of the verification method, what would that mean? is that to be defined?

Titusz Pan: there would be two pieces of information (1) ISCC-Code itself, and (2) a content link such as IPFS hash with extended metadata. Perhaps with license information or anything of interest.

Adrian Gropper: How does this interact with steganography?

Titusz Pan: There is no direct relationship and no such information is encoded; the codes are agnostic to it.

Dmitri Zagidulin: multihash! :)

Jay Kishigani: Regarding fingerprints, if ISCC supports several types of fingerprinting, how do you know which fingerprinting is used?

Titusz Pan: You query the metaregistry to determine this about the content. You won’t get a 1-1 mapping, but multiple views from multiple parties.

6. Equivalent Identifiers

Ivan Herman: See slides.

Brent Zundel: Moving on to Daniel Buchner, q&a during or after?

Daniel Buchner: after
… Not all forms of equivalence are created equal. We will examine the Ship of Theseus. You can change anything in your DID document over its lifetime. At the end…
… At T0 I could have created a unique DID document, but by T3 every “plank” of the document could have been changed in Theseus’ DID document
… Even if at T3 the document looks exactly the same, we can distinguish the two through the process that produced them. DIDs can change entirely over their lifetime (via their documents).

Drummond Reed: I would argue that the DID would still be different across any two DID documents.

Manu Sporny: Yes, agree with Drummond. (that is, fundamentally, how you know one thing from another)

Daniel Buchner: DIDs are not URI entries of an exact form
… A logical and deterministic process is able to determine a logical entry with several valid inbound logical inputs to form the entry.
… Can many forms of a DID string identify the same DID? I think yes.
… alsoKnownAs is present already in the spec. sameAs/formOf, and canonical/preferred may be candidates to add.

Daniel Burnett: Agree Manu, a DID is an identifier whose controller is explicit and clear, cryptographically. What the DID, or DID Subject, “means” is in the eye of the controller.

Daniel Buchner: Are these related? They serve as investigatory hints.

Drummond Reed: I like “formOf” much more than “sameAs”

Manu Sporny: I like the opposite, Drummond :P (for the definition provided)

Daniel Buchner: sameAs/formOf are exactly logically equivalent identifiers, determined to be so and filtered by the method, such that Theseus’ DID and the hash refer to the same logical entry. It provides awareness of variants and update support for new formats, such as base58 to base32.
… Only the method can determine this, through resolution.
… We have a separate one, canonical/preferred, which is different in that it’s a singular value; it encourages you to modify your held references and possibly gain new awareness going forward. It supports method evolution and signals for migration processes. We are ensured that only an exact logical equivalent is populated.
… Q&A?

Jonathan Holt: It depends on how you do the resolution. If one DID string is base64 and another is base32, you still have only the same document. You may be able to use an array of sameAs…

Daniel Buchner: I would assert we have a bit of freedom here across formats after resolution.

Manu Sporny: From a high level, I want to assert that we can always do this through the DID spec registries. You’ve been pushing for these features for the sidetree ION work, and it makes sense how it’s relevant to that. However, if the question is whether we put it into did-core or not, I want to underscore that these values can always go into the registry, so no one is blocked. The other point is that we talked about sameAs and its security implications, but if the DID method can assert sameAs using metadata, that’s a perfectly reasonable thing for the DID method to do. The ledger should know if two DID documents are the same thing or meant to be the same thing.
… The canonical/preferred item seems to be a requirement in layer 2 where the ledger is slow. They feel hacky, but I understand why they must exist due to requirements.

Daniel Buchner: The underlying need here is that in any system that avoids centralized constructions, such as bitcoin, time is the difficulty. In some methods, such as sidetree-based ones like ION, you can create a DID instantaneously on your client that is immediately usable. The DID itself is a URI string of the initial base patches across time.

Manu Sporny: Yes, but would you need that feature if the ledger had finality within seconds? I’d argue that you don’t.
… The need here is because the underlying ledgers are slow.
… (which is reality, and is fine, and is why I’m not completely against sameAs being asserted by the DID Method/Ledger)

Daniel Buchner: It’s still resolvable, because if you hand it to a method, it can automatically resolve it by checking the underlying records for existence or mutations. The issue that arises is that whenever an anchoring occurs, people may want to use a short form. So we want a way, across all DID methods, to allow specification of logical equivalence across these different string representations.

Daniel Buchner: That’s what this is in service of.

Markus Sabadello: Confused about this: canonical vs. alsoKnownAs
… It seems to be semantically something else. I don’t have a strong opinion about whether it should be in DID core. I kind of agree with manu about unblocking, but I see universal applicability.

Drummond Reed: I think it has to be in did-core to allow methods to ensure exact logical equivalences are populated. This is a requirement on DID methods, therefore it must go into did-core. In OASIS and other groups, we found the need for canonical and sameAs come up over and over again. We didn’t need alsoKnownAs because it’s a weaker form. dbuc should come up with the PR as soon as possible. We also need to make sure the DID method requirements section includes the assurance text.

Dave Longley: We should have requirements, consumers must check the value upon use, or we place a requirement on DID methods to determine whether you accept the value from controllers or not.

Drummond Reed: I agree with Dave that this is the “tax” of supporting verification of sameAs or canonical values.

Dave Longley: it’s more complicated than just trusting the method — the method must have knowledge of these properties (or blanket disallow properties the DID method does not understand) so that you can trust them

Drummond Reed: I’m hoping Daniel is referring to the new, decentralized “DTwitter” :-)

Dmitri Zagidulin: wait wait. that’s a DID migration use case, which is very different than this formOf or sameAs…

Daniel Buchner: A lot of methods might not even have this or use it. It acts the same across all methods that use it, but some methods would never populate the field. Assurances are the same across methods. There’s another interesting thing: being able to create lots of IDs for a person, mostly pairwise and shielded, but with one or two purposefully used across contexts such as Twitter and Instagram. I want people to know it’s my common DID. A DID could have to live for 50 years, so the methods would need to get it perfect off the bat if there were no in-built means to “move you over”. These features also serve security.

Wayne Chang: the equivalence criteria seems to be an enablement of different representations of a DID that are logically equivalent after you go through the DID method
… i might have some ietf json patches that are signed that are jwt encoded and they’re part of the did itself
… this might be logically equivalent to one that is anchored with rotations back onto a ledger and they technically refer to the same thing
… so we should be able to use them across contexts
… an alternative is to look at the resolution metadata param that is passed into did resolution as a place to include these updates
… its separate because you would have a DID stand the test of time but also specify updates during resolution
… have you considered this approach?

Daniel Buchner: we have security concerns about this approach: ancillary metadata has the potential to be omitted by accident, to ill effect. You could also have porting schemes in OpenID at the 2nd and 3rd layers, but that increases the complexity of more systems and actors, which adds difficulty even at the ‘transport’ level. It also allows for stuff like did:example:xxx --> did:example:xxx-yyy, wherein you could add extra segment parts to do more things, such as faster resolution, as the DID Method evolves.

Manu Sporny: Wanted to raise some security repercussions. I agree with dlongley: if we implement this stuff, it has to be implemented across all DID methods; it’s super dangerous if only a handful of DID methods implement it. I was feeling better about it when we were saying it’s the DID method expressing these things, but at the document level there is a new security concern.
… for canonical, if you don’t have a backreference, you can be attacked. If I wanted to attack dbuc, I might set up an account that does illegal things on the internet and reference it to him. We must acknowledge that canonical needs to be bidirectional.

Daniel Buchner: DID Methods that have no code for it would just not populate this. The Method obviates that because these are tied to the same security assurance

Dave Longley: IMO, the choices are: 1. consuming canonicalBikeshed values means you MUST check the method to see if it’s supported, 2. require all DID methods to properly validate or disallow values for canonicalBikeshed
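Dave's first option can be sketched in a few lines of consumer-side logic. The property name "canonicalId" and the method name "example-sidetree" are hypothetical placeholders, not normative DID Core vocabulary:

```python
from typing import Optional

# Assumed allow list: DID methods known to validate that the
# equivalence property holds only exact logical equivalents.
METHODS_VALIDATING_EQUIVALENCE = {"example-sidetree"}

def did_method(did: str) -> str:
    return did.split(":")[1]  # did:<method>:<method-specific-id>

def trusted_canonical(doc: dict) -> Optional[str]:
    """Return the canonical DID only if the resolving method is
    known to enforce equivalence; otherwise ignore the property
    rather than trust a self-asserted value."""
    canonical = doc.get("canonicalId")
    if canonical is None:
        return None
    if did_method(doc["id"]) not in METHODS_VALIDATING_EQUIVALENCE:
        # e.g. a did:web controller could claim anything here.
        return None
    return canonical

doc = {"id": "did:example-sidetree:abc",
       "canonicalId": "did:example-sidetree:xyz"}
web = {"id": "did:web:example.com",
       "canonicalId": "did:web:attacker.example"}
```

Option 2 would move this burden off the consumer by requiring every DID method to either validate or forbid the property, so consumers could trust any value that survives resolution.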

Brent Zundel: can the DID Method populate the field in the DID Document?

Manu Sporny: The canonical entry itself can change, leading to complex scenarios rife with security concerns. I retract what I said before: there isn’t necessarily a way to do this in the DID spec registries. All methods must do this or face security ramifications.

Daniel Buchner: You will have far scarier issues if you tell users they have to do this sort of thing out of band across N transport/handshake protocols and other types of DID Methods

Dmitri Zagidulin: I just want to be sure we’re not talking about DID migration.

Dave Longley: brentz, if the DID method knows about the field, yes, but if it doesn’t, we could have trouble, depending on our consumption rules

Dmitri Zagidulin: It’s a different thing if i’m migrating from ledger 1 to ledger 2. There’s an implication that canonical can address this, but i want to make sure we are clear that this is out of scope.
… How do you imagine did methods validating canonical or formOf

Manu Sporny: dbuc, hrm, well, LD Security already does this “out of band” across N protocols… or rather, there is a bi-directional protocol for this.

Brent Zundel: dlongley, so this really needs to be defined in DID Core?

Dave Longley: brentz, see my comment above about our choices: it’s either DID core (all methods must handle the property by validating or disallowing it) or the consumption rules require the consumer to check the DID method to see if it supports the property or not

Daniel Buchner: It would be a nightmare for RPs and users to have to dance like this, and if there’s any urgency involved, lord help you. If they had to talk over active conduits, I mean. Unlike the type param, this is not commingling different logical IDs; it does not do that.

Dmitri Zagidulin: @brentz / dlongley - the implication is more, that it /can’t/ be in DID Core. since each different did method validates it differently

Dmitri Zagidulin: +1 markus

Markus Sabadello: Related to manu’s comments on security implications, just wanted to say during our metadata discussion, we had some properties such as created or updated timestamp with the same meaning, but dependent on the did method if the values are guaranteed by the method or self-asserted by the controller. At that point, we decided not to differentiate this yet. Right now, we don’t have different names or buckets for properties on this basis. If we follow this pattern, we could register this either in did-core or spec registries. It would depend on how it’s guaranteed.

Juan Caballero: which type had :D

Adrian Gropper: It sounds like this has the same privacy implications as the type discussion around DIDs, in that it expands the privacy attack surface, opening us up to unintended consequences. Can we do this without putting it in did-core? We should do it that way if possible, even if it adds friction.

Drummond Reed: No, this does not have the same privacy issues as ‘type’ or any other privacy-violating properties

Daniel Buchner: This is all about the same logical ID, with zero deviation from that.

Adrian Gropper: I also raised the issue of allow lists for methods, and depending on the implementation thereof, then it’s fine.
… If we’re doing something to cause people to raise their eyebrows at dids as a whole, then reject

Daniel Buchner: The privacy implications are not the same as with type: if I give you another string identifier with the same logical reference, then it’s exactly the same thing.
… If methods can’t perform the requirements, then they can simply decide not to implement this. You don’t need to be encumbered by this if you don’t want to support it, it won’t change the document in that case. If you trust the method to resolve the ID, then you can trust it to do this.

Dmitri Zagidulin: dbuc - got it, thanks.

Daniel Buchner: Spec text will be clear about what is allowed or not.

Manu Sporny: so, this is very clearly not cross-ledger, which is good, but now sameAs is a bad name :P

Dmitri Zagidulin: manu - I thought the term being proposed was ‘formOf’, not sameAs?

Manu Sporny: the slide says sameAs / formOf and I don’t like formOf :)

Daniel Buchner: Manu: i have no idea about the naming, don’t care so much about that. If you want bananaOctpusEquivalent, I am down

Manu Sporny: I prefer bananaOctupi

Drummond Reed: responding to agropper, the new thing is that the method must ensure that only exact logical equivalences are populated.
… because you’re only asserting the equivalence that an existing did is there….because a method can assert the use of one or both of these properties, the resolver code should be able to verify this for anyone who needs it. these expressly cannot be used for migration across methods. it’s possible then that the property names can capture the constrained meanings, such as putting ‘logical’ explicitly in front of them.

Drummond Reed: I still believe we must define these properties in DID Core so that we can state the requirement that a DID method that uses either of these properties MUST only allow an exact logical equivalent.

Manu Sporny: Now we are back to being able to do this in the spec registries. I thought canonical was a mechanism for cross-ledger, but it is expressly not for that. It’s only for a single did method without the ability to point outside of the ledger, which makes it safer and easier to use. Because there is no cross-talk, we can let this play out on its own timeline, in spec registries as opposed to did-core, while still achieving the desired goals without adding to the spec authors’ burdens towards CR.

Brent Zundel: what is the concrete proposal?

Kristina Yasuda: +1 Brent

Drummond Reed: I believe it is that exact logical equivalence requirement that makes it safe from a security standpoint.

Brent Zundel: we have alsoKnownAs, sameAs/formOf, canonical/preferred. All of these could solve problems. What do we want to happen here?

Adrian Gropper: given drummond and manu’s responses, it only applies to dids and is entirely within a single method. If people put this in, and FISMA deems it a risk, then they would just disallow the specific method.

Drummond Reed: Yes, Adrian, you heard me correctly. I agree with Dave about these security concerns

Dave Longley: thinking about this in terms of writing code to consume this property. DID methods are aware of this and specifically prohibit it, or consumers know about it and prohibit it. If we allow methods to arbitrarily allow this as a property, then it could be used to deceive a consumer of did-web for example to misunderstand the relationship. We must be explicit for did methods in the acceptance or rejection of these properties.

Daniel Buchner: A did:web resolver would blow those away

Dave Longley: software that resolves dids must now look into the did method to understand the implications, or the did methods must be required to address this.

Dmitri Zagidulin: +1 to what dlongley said. either a) it’s in DID Core, in which case /all/ DID Methods have to process it. or b) it’s in registries, and only methods that need it use it. But not a + b

Daniel Buchner: to maintain the best security, I think we must include this in did-core to ensure enforcement of these requirements. Otherwise people would be less vigilant.

Daniel Buchner: I have one PR, which may need some mods, and another for canonical

Drummond Reed: In terms of next steps, I agree with dlongley. The key is that there are attacks. There are 3 parts to it: dbuc will sign up to submit PRs, definitions of the properties, and an update to the DID method requirements referring to those constraints. That’s why it has to go into did-core. We must also add a security consideration to cover that. dbuc must sign up for those PRs quickly.

Daniel Buchner: sameAs/formOf exists as a pr, i can generate one for canonical/preferred

Dave Longley: i don’t see how this works with did:web at all … since the DID Doc contents are entirely under the control of the DID controller. I guess did:web resolver software needs to reject DID docs with that property as entirely invalid (can’t just drop the property if did:web DID Docs are signed).

Daniel Buchner: I would probably just leave the current id prop language then

Manu Sporny: feeling nervous about this PR because of timelines, security ramifications, and implementation learnings. I am very concerned that this is something a bitcoin/ethereum ledger needs but other ledgers don’t. Perhaps only sidetree needs it.

Juan Caballero: +1

Jonathan Holt: It’d be great to have a process to manage security concerns and threat vectors.

Brent Zundel: Our process is to use github issues and raise concerns there.

Daniel Buchner: any system that is robustly decentralized will encounter these issues.

Manu Sporny: You can have a decentralized system that has fast finality w/o needing this feature :)

Dave Longley: -1 to presuming we know how all decentralized technology works :), it just depends on how fast consensus is

Daniel Buchner: It’s not just sidetree, but anyone with a deterministic initiating form who wants to modify the form. If it’s not in did-core, then we may be rejecting the entire thing.

Manu Sporny: Agree — for example, here are examples of decentralized systems w/ fast consensus times: Hashgraph, Sovrin, Veres One, Hyperledger Fabric, etc.

Manu Sporny: So, it’s not true that you will always hit this issue

Manu Sporny: also, I’m not saying we can’t do the work here — I’m saying that it will be a non-trivial lift.

Manu Sporny: and this is asking the WG to do a bunch of work.

Adrian Gropper: NIST writes a lot about derived PIV credentials. Is that potentially helpful?

Adrian Gropper: https://content.govdelivery.com/accounts/USNIST/bulletins/2a9c676

Jonathan Holt: dmitriz : yes potentially. github issues might be good for now, but we should make a process that is well documented

Justin Richer: FIPS201-3 is the PIV and Derived PIV document that agropper is referencing: https://pages.nist.gov/FIPS201/

Dmitri Zagidulin: @dbuc - sign it with the same key in the DID? that’s the typical way to bind stuff, out of band…

Dave Longley: just wanted to add — even with slow consensus ledgers, you don’t need this property unless you need more than just key material in your DID Document (e.g., you need service endpoints from the beginning)

Daniel Buchner: If it got pushed out of did-core, then it’s effectively a NOOP. The big issue with doing this out of band is that I don’t know how to do it out of band. If I have a DID in form A and it must be in form B, then I go back to the RP I gave form A to, and I want to communicate that it’s now form B. How do I prove that they are equivalent? If I don’t have the method, the arbiter of logical equivalence, how…

Wayne Chang: …do I do this? Dual resolution and similarity checking?

Daniel Buchner: it becomes a quagmire if we do it out of the security basis of the method.

Dave Longley: there are other did methods that can securely create things that might take time to come to consensus. for cases where consensus takes a while, the things that would not be supported are arbitrary service endpoint updates. if you can communicate them out of band, then you can do this securely even with slow consensus times

Jonathan Holt: dmitriz: looks like we do have a issue tag “security-consideration”, but perhaps a template for keeping track of them before they become a CVE

Justin Richer: just to respond to agropper, FIPS 201-3 is the PIV and Derived PIV document; I am a coauthor, and it was published this week. The important thing about derived credentials, which deal with smart cards and related accounts for US feds, is that derived credentials do not necessarily need to be cryptographically derived from the same kind of keys; this linked relationship, without having to prove possession of the whole chain of keys, is important.
… we created the notion of an underlying account that these keys are attached to, differing from the previous notion that “everyone gets a certificate and that’s the end of the day”

Daniel Buchner: we hope to tighten the language to say that the property must be an exact logical equivalent

Justin Richer: thank you for the correction

Drummond Reed: It feels stronger to have it explicit using these proposed properties.

Manu Sporny: I’m aware of the language :)

Daniel Buchner: https://w3c.github.io/did-core/#did-subject

Wayne Chang: i said that, not justin ^ the thanks for correction

Manu Sporny: +1 for the language in the spec being updated first

Jonathan Holt: https://www.w3.org/TR/did-core/#alsoknownas

Brent Zundel: https://www.w3.org/TR/did-core/#did-subject

Markus Sabadello: there are existing tests asserting that the resolved DID must match the id in the document; I would prefer this requirement be reflected in the spec.

Manu Sporny: But if we go back, that throws Daniel’s use case back out again? I’d be -1 for that. If we take that out, Daniel is back to having no solution, right?

Daniel Buchner: yep

Brent Zundel: The link about the DID subject pasted in IRC states a requirement for resolution of identifiers in a note

Drummond Reed: I have been uncomfortable with that switch, we should have really clear rules around these terms such as sameAs/canonical/etc. and clear usage requirements for methods

Brent Zundel: sent an email for the ADR session tomorrow, please check
… thank you all for coming, especially those in weird hours
… pls scribes for tomorrow

Jonathan Holt: regarding ADM, easy just look at: https://github.com/w3c/did-spec-registries/pull/138