DID WG Topic Call on Issue Processing Working Session and Test Suites — Minutes

Date: 2021-02-25

See also the Agenda and the IRC Log

Attendees

Present: Brent Zundel, Amy Guy, Ted Thibodeau Jr., Shigeya Suzuki, Manu Sporny, Ivan Herman, Orie Steele, Markus Sabadello, Drummond Reed, Adrian Gropper, Justin Richer

Regrets:

Guests:

Chair: Brent Zundel

Scribe(s): Amy Guy

Content:


Manu Sporny: the topic today was to figure out how we’re going to test all the sections in excruciating detail
… everyone writing tests has their marching orders
… some sections are easier than others

Manu Sporny: https://github.com/w3c/did-test-suite/issues?q=is%3Aissue+is%3Aopen+tests+for

Manu Sporny: reminder - there’s a link of the tests that need to be written, for volunteers
… about 127 of them
… I would like us to decide on exactly what kind of tests we’re writing
… whether they’re spec only tests, static tests vector tests, or any live integrations
… that’s meant to be a gradient of possibilities not just three possibilities
… once we get that settled then we can talk about how each section should be tested, highlighting the ones where we’re very concerned about testability
… depending on the approach we take

Markus Sabadello: I thought the topic of this call was to work on the open PRs on did core
… in which case we could look at the remaining diagrams
… and some ways of finishing those
… more than happy to first discuss the test suite, also important
… when we say there are some sections that are hard to test we probably mean resolution and dereferencing sections, which I signed up for
… it would be valuable to discuss how to approach those
… and also the sections where we may be able to do something that manu said about dynamic and live testing rather than just static files

Brent Zundel: we are hoping to have time today for both
… if you would like we can timebox and spend a certain amount of time and re-asses

Orie Steele: I’d be happy to do the open PRs first, I feel like thats going to be more productive
… and they’re going to be related to knowing whether or not anything is testable
… those PRs should be addressed first

Manu Sporny: I’m confused, I thought markus was talk about the diagram PRs, not the test suite PRs

Orie Steele: I mean the diagrams in DID core make it untestable in its current form, and markus’ PR addresses that issue

Manu Sporny: I’d rather not do that, but I’m hearing people wanting to do that… I think we need to come to ground on the test suite. I don’t think talking about the diagrams is going to get us there. I would like to timebox to 20 past the our

1. Diagrams

See github pull request #596, #597.

Markus Sabadello: there are two open PRs with diagrams that I was working on
… one is #596
… on production and consumption
… one on resolve and resolveRepresentation
… 597, I have not worked on updating that yet
… I have tried to update the production and consumption one
… I came up with this version
… latest attempt maybe not perfect, but how close is it to what people want to see?
… tries to implement this separation of the maps that we’re saying we’re passing into the production function and getting out of the consumption function
… and the did document data model and representation specific entries as a separate map

Manu Sporny: this is not necessarily aligned with where I thought we were going
… its’ confusing because the bottom one looks like it’s just using the representation specific entries
… some changes
… if you take did document data model at the top and figure out how to move it into that dashed box
… and representation specific entries, strike json-ld, and move into the dashed box below
… and do a larger dashed box around both
… and have produce and consume go from that would solve the issue for me

Markus Sabadello: live edits diagram

Manu Sporny: it does dodge some of the questions, but addresses my concern, if everyone agrees

Markus Sabadello: I can live with it

Ivan Herman: +1 with these changes

Orie Steele: what was proposed is consistent with the current diagram that has been merged into did core
… the upper left hand box with dashed lines is already represented in did core under the data model section
… i’d be fine if we used the existing diagram and didn’t add another diagram that in some ways conflicts
… the existing diagram we have in did core is acceptable for me
… as a result of consumption
… i’m struggling why we should show different pictures of the same data model
… almost there, stop representing the data model differently in different diagrams
… use the same diagram for each of them

Shigeya Suzuki: While I almost agree to the diagram… +1 to Orie’s opinion

Shigeya Suzuki: (reuse same diagram on same concept)

Ted Thibodeau Jr.: i’m concerned that surrounding dashed box suggests that those representation specific entries are part of all of the productions

Ivan Herman: orie, when you talk about the diagram in the spec right now, are you talking about that one or the change that is part of another PR?

Manu Sporny: merged few minutes ago

Ivan Herman: you make our life difficult :-)

Orie Steele: I mean use this diagram: https://w3c.github.io/did-core/#data-model

Orie Steele: :)

Manu Sporny: I like where orie is going with this, we do have something already, can we reuse that
… that would be better I think

Ivan Herman: there is a difference, what markus is doing is with code extract which here is not the case
… reusing the same style and colours would make sense
… but it’s not the same diagram
… here we list some of the core properties, not all

Shigeya Suzuki: +1 to Ivan

Justin Richer: +1 this is not the same diagram, this is how we get TO that diagram (and from it)

Ted Thibodeau Jr.: +1 these are different diagrams for different purposes

Brent Zundel: +1

Markus Sabadello: also some similarities and differences
… both show the data model, but this one here that ivan made is intended to be more generic and higher level, whereas the other has a concrete example
… the other difference is that this is showing the concepts of core things and registered extensions and unregistered extensions, which are irrelevant in the other diagram which is about production and consumption

Drummond Reed: +1 to the diagram we are seeing now on “production and consumption”

Markus Sabadello: using similar colours and style, I’d be happy to make an attempt to do that

Manu Sporny: orie, would you be okay with that? I buy the argument that they’re slightly different.. would you be okay that kind of set up?

Orie Steele: we have a resolution that hasn’t made it into the spec yet around ignoring and consuming properties
… somewhat related to what we’re talking about here
… this diagram that’s showing concrete production and consumption for three representations
… aligning the colours is a good idea
… dangerously to represent the adm abstractly and concretely in slightly different ways
… an issue of colouring and consistency
… and boxes coloured one way in the abstract sense, should show how properties are preserved during consumption should show up in those boxes
… when you consume your properties are going into one of many different coloured boxes, and when you produce they’re coming out of the coloured boxes and going into one representation
… I like the image in did core right now, and this one doesn’t look like that
… the subtle differences are harmful
… we should use the same labels
… representation specific entries, core representations specific entries
… all of the term labels should be the same

Manu Sporny: I agree with what orie is saying in general
… is markus willing to give that a shot?
… Ted raised that it make sit look like you use all those things to produce and consume
… that’s why I was saying there’s a bit of a handwave that is okay
… you can

Orie Steele: For the record, I am fine with the “content” of the diagram, just concerned about colors / labels

Manu Sporny: nothing to say that an application can’t go into these boxes and modify them between production and consumption
… I might consume json-ld, and if I know i’m going to produce json and my application feels strongly about removing @context it has the choice of doing that
… we don’t have anything in the spec that says you can’t
… we suggest that you should preserve everything
… but apps can process these boxes for their own reasons
… in consumption we say you have to preserve it
… but we don’t say you have to all the way through to production
… you can modify them between those two
… thats why we’re handwaving a bit over this
… I think that’s okay because we need to get something that we’re more or less okay with, otherwise this is going to keep getting blocked

Orie Steele: For example, you are not required to preserve prototype pollution or sql injection attacks :)

Markus Sabadello: agree that what we have now would allow both
… it would allow having @context int he representations specific entries then producing plain json, and including @context there though personally I don’t like it I think it’s allowed and possible with the spec language
… what I could also do is what we discussed before to split this up into multiple diagrams where we show different flows or different ways of using this
… first consumption, then production
… sometimes @context is there, sometimes it’s not
… I’d rather have one diagram

Brent Zundel: feels like we have consensus on direction
… I’d also rather have one diagram
… opposition to moving on to testing?

Orie Steele: Also for the record, when you advocate for producing application/did+json without an @context, you are advocating for incompatibility / non-interoperability with JSON-LD verifiable credentials.

2. Test suite

Manu Sporny: on tuesday I heard people stretching further than I was suggesting
… as brent mentioned, there are three general classes of things we can shoot for
… bare minimum make sure spec is testable
… human testable comments throughout the spec
… my suggestion is that’s the weakest form of making sure the ecosystem si following a spec
… not a very useful place to get to
… most wgs shoot higher than that
… orie has already done that kind of work
… it’s good work and we should pull it in
… that’s the baseline
… we could say we’re done…..
… but I have heard a number of people complain about the VC test suite and how we didn’t do enough testing, and all these people saying theyr’e conformant with the VC test suite when they’re not really

Drummond Reed: +1

Manu Sporny: the vaccination credentials initiatives are an example of that - they’re doing things wholly outside of what the WG would consider as conforming implementations
… we should avoid that
… what orie proposed for the test suite, the second higher class of testing,w as good
… the static test fixtures
… if we did that would be good
… and allow implementers to submit their test fixtures and for us to run tests against them to make sure they’re within the ballpark
… machine testability, trying to get all the statements in the spec machine testable, that’s what I was targeting
… the suggestion we have static tests fixtures for everything
… I believe the spec is in that shape today and we can do that
… some disagreements between markus and orie about committing certain types of implementation things and we need to get that resolved now
… if that’s all we do I’m happy with stopping there
… what I heard on tuesday was that a subset of people are like why shouldn’t have live testing, or a subset of live testing
… or that you have to have a live version for it to be testable
… I think we should have that discussion, but after we decide whether we’re trying for test fixtures
… that’s the first decision we need to make, are we going to stop at spec only or try for test fixtures with implementations submitted

Orie Steele: it’s accurate what manu said
… markus and i were working on the test suite in the w3c repo, looking at how can you provide evidence ot show your method is conformant with the spec, disagreement on how to do that
… so I proved that the spec was testable in a separate repo
… the spec being testable is not helpful to you as a vendor
… it doesn’t give you any ability to show up as a green checkmark
… when you want a list of vendors that have implementations that are conformant, they’re going ot need to provide their implementations in some format, and the test suite is going to have to process the implementations
… disagreement around how are we going to do that in the did test suite repo
… we need to have that conversation
… how do methods show they conform to the spec
… I heard markus and justin saying it’s not a thing we care about

Markus Sabadello: the disagreement we had on some of the work orie contributed
… we all owe orie a lot for doing that work and taking leadership
… we wouldn’t have anything that looks like a test suite without him
… I opposed some of the PRs because my understanding of them is that they would claim to be testing something different to what they’re actually testing
… if what I did was to block a PR that simply added data form one specific vendor, if that’s what was going on in the PR, then I misunderstood, and I apologise for that
… that was not my intention
… I did think that here was a problem with how production and consumption and resolution was designed
… that was my concern, not the vendor specific choices
… having said that, going back to resolution tests, I think it would be very exciting if we did dynamic tests
… if we had resolver endpoints for example
… that can be given a did and can return a did document plus the other result values
… that would be nice ot do
… there may have been some concerns that it may not be possible with the way the resolution spec is currently written but maybe we can disucss that
… can we do that and do we want to to test the resolution sections with live dids and live endpoints?

Brent Zundel: stopping now sounds nice…. with my chair hat on, as long as we have tests that sufficiently allow us to exit CR, anything beyond that is admirable but we’re not going to let it delay us
… let’s do the things we need to to exit CR first
… if the test suite folks want to continue expanding, two thumbs up

Justin Richer: live endpoint testing can require an adapter to talk to the test suite

Orie Steele: +1 to what brent said

Ivan Herman: +1 to Brent

Orie Steele: we should aim for a layered approach where if we fail to meet the upper layers we’re okay
… a weekend ago I tested the normative statements in the spec without testing any specific method
… the next layer up is show that specific methods can conform to the normative statements in the spec
… we were trying to figure out how to do that
… I don’t hold any grudges about markus blocking the way I was trying to test that… it is hard ot figure out how to test that
… we don’t have perfect answers to show how an implementation of a method is conformant
… but we need to get to that
… to show what is conformant, which methods are equivalent
… we need a chart to show it
… the next layer on top of that is the dynamic testing, maybe wiring into universal resolvers or vendor specific resolvers, I’d love to get there
… I think we can get there, but not if we get through the intermediate step
… suggest a three phased approach
… phase 1 is that normative statements are testable
… phase 2 is that vendors can implement and show conformance to the spec - we’ve been working on that, no great solution
… phase 3 is live realtime testing with a resolver or something like that
… markus and I have worked on various versions of that in other work previously, I believe we can get to that
… encourage us to shoot for all 3 phases in that order

Drummond Reed: I love the idea of a full interop test suite for did methods
… but is that in our scope? is that realistic?
… vs something that’s just a test suite that tests the spec? they seem like two different objectives
… seems like more work than we signed up to take on

Manu Sporny: I think we’re in agreement
… I like the 3 phase approach
… my assumption was to say to exit CR all we had to do was phase 1
… I don’t think we need to resolve the first one, it’s to do the minimum necessary to get to CR
… the second is the static test suite
… the third phase is the live testing
… I don’t want to put the live testing between us and getting to PR
… the harder these things are, the less likely it is that things stay in the spec
… there are combinations here that I’m concerned about
… eg. if we hit phase 2 on everything except resolution, and resolution is this phase 1 thing I personally have a problem with that because it’s going to result in a week part
… but the group is going to vote for the minimum bar, which is fairly useless from a vendor perspective
… the other issue i’ve seen is if we have guaranteed success the second we go to cr there’s no motivation to work on anything
… there’s a motivation issue created by making the bar low
… all that said.. skip the first proposal and pick up the second?

Drummond Reed: first one is redundant

Manu Sporny: anyone objecting to static tests to exit CR?

Justin Richer: proposal two is “submit serializations of did documents in conformant representations” not just “did documents”

Brent Zundel: I’d prefer to have a more minimal bar set, but I believe we’ll achieve it
… minimal bar with stretch goals

Manu Sporny: it evaporates the motivation to do more… is my concern

Orie Steele: I would see the stretch goal as live testing with resolvers
… I don’t see static test fixtures a stretch goal, that’s basic ability to demonstrate your method conforms
… it’s important to meet that bar

Amy Guy: +1 Orie

Markus Sabadello: +1 to Orie

Manu Sporny: no objections to making the minimum bar testing serializations?

Shigeya Suzuki: +1 to Orie

Manu Sporny: we will enable implementers to submit did docs to test against the applicable normative statements?
… that’s the minimum bar to exit CR

Orie Steele: we can’t use the term did document any more. It has to be all of the data models in the spec
… it’s hard to write tests around an adm and part of the problem of the resolution sections is its hard to imagine how you’d test them statically

Markus Sabadello: +1

Orie Steele: the short answer is we can write tests that will work for one way, and note that you could do it another way
… it can’t just be the did doc data model, it has to be ‘the data model’ and cover metadata and all the other data structures
… the fixtures might not be the only way that stuff gets tested or used
… if we’re to make those sections testable we’re going to have to cover the data model and production and consumption, concretely, to meet that phase 2 bar

Manu Sporny: +1 to that

Markus Sabadello: I thought we made the abstract parts of the resolution sections testable by adding statements saying when one of thes eabstract data structures is returned here is a way to serialize it
… we added a sentence that says you can turn it into a concrete representation and test that
… for metadata you can represent the as json using the standard infra section that defines that
… is it solved? we can’t test the abstract things directly, but we can turn them into concrete things and test those, therefore implicitly testing the abstract part

Manu Sporny: exactly, that’s what we were shooting for
… I believe this is achievable
… we should talk about it because the specifics are important
… Yes. We have a path to testing all of the data structures in DID core
… by serializing them to something
… that goes for all the consumption/production tests, resolution, core properties
… we can do all of those
… the phase 2 thing is achievable, just details
… rewrote the proposal
… any changes?

Brent Zundel: I’d get rid of “all of” and say “for the data structures” because not every submission needs all

Proposed resolution: The DID Core Test Suite will, at a minimum, enable implementers to submit at least one set of serializations for the data structures defined in DID Core to be tested against the applicable normative statements in the specification. (Manu Sporny)

Manu Sporny: +1

Orie Steele: +1

Drummond Reed: +1

Brent Zundel: +1

Amy Guy: +1

Ivan Herman: +1

Markus Sabadello: +1

Shigeya Suzuki: +1

Ted Thibodeau Jr.: +1

Resolution #1: The DID Core Test Suite will, at a minimum, enable implementers to submit at least one set of serializations for the data structures defined in DID Core to be tested against the applicable normative statements in the specification.

Manu Sporny: should we talk about how every section and how we test it with this proposal, and treat phase 3 as a stretch goal
… everyone agrees it’d be awesome to hook it up to live systems
… let’s prove that what we resolved is testable and how

Manu Sporny: https://github.com/w3c/did-test-suite/issues?q=is%3Aissue+is%3Aopen+tests+for

Manu Sporny: I’m going to assert some are easy
… core properties is easy. the implementer presents a did doc and the test suite runs a bunch of tests. does the id conform to the abnf? is a verification method right? we can use json schema to test a lot of that stuff
… production into json, jsonld and cbor is easy
… they’re going to provide a document and tell us what the media type is
… the test suite is going to pull that in, use a standard read rules to put it into a data structure and do all the tests
… resolution metadata, easy
… just a json structure
… take it as a fixture input and check it
… did parameters, easy, just a string
… same thing for did syntax and url syntax
… most difficult is consumption portion, and some resolution and dereferencing
… for consumption the only thing we need to do is two people need to write separate implementations on consumers
… eg. things that say a consumer must throw an error
… i don’t know how we’re going to do that
… for resolution and dereferencing I’m suggesting we provide inputs to the function and outputs from the function and we test those for conformance

Orie Steele: good approach
… as markus mentioned, a difference between serializations of the adm and serializations of a representation
… what’s confusing is that the fixtures are json
… There’s going to be json that is used to represent the ADM that isn’t a JSON DID Doc
… and we’ll have to make that very clear

Justin Richer: +1 to that

Orie Steele: I don’t know how we’re going to test those consumption rules outside of providing a standard way of reporting that a consumer would throw an error
… I think we have some of those kinds of errors in the spec, invalid did and did not found
… testing those from static fixtures, you have some concrete function that is processing expected inputs and outputs, and just checking for string equivalents
… which is why it’s better to have live tests

Markus Sabadello: agree with Orie
… about confusion that we will probably have since we are going to deal with the adm but will use json to express that
… at the same time json is one of the supported representations
… a consequences of the fact we’re doing the test suite in javascript
… like Orie said as long s we name that in some smart way confusion can be minimised
… agree with everything manu said on how to test different sections wrt resolution and dereferencing I have some idea
… having inputs and outputs and running tests against those
… I’d add to what manu said, about testing production we’d want to test the inputs and outputs since we’re now saying production rules take two maps plus options
… in the test sweet we want to model that
… here are the contents of those inputs of the production function and here’s the expected output representation

Orie Steele: +1 to what markus is saying

Manu Sporny: +1 to markus_sabadello

Justin Richer: we need to be really careful with being clear to ourselves and to the community about what we are testing
… when we’re testing production what we’re saying is I will give you an ADM and I want to see how you serialize that to make sure it’s okay
… to test your output I’m going to parse that
… to consume it and makes sure it maps to the model I suspect
… I cannot test you’re not doing some sort of weird non conformant magic trickery in order to generate that serialization up to and including just spitting out a static file
… but I can set a test harness such that itt sets expectations
… I have a description of the model, you tell me the serialization and I’ll handle it
… the inverse applies to consumption
… I give you a serialized document to consume and I expect you to come up with this model. Error cases are harder to detect because the end result is an internal state of the system under test
… in order to test those a lot of the results are going to be asserted by the person running the test
… we ran into this with openid testing

Orie Steele: yep, +1 to justin_r… its hard to test error conditions that are not externalized.

Justin Richer: error conditions in that protocol don’t end up as messages within the protocol, but screens shown to a user
… so in that case we had a way to insert a screenshot or a webpage capture as part of your test results to say i can prove this is what happened
… not saying we do that here
… but there is precedent for having a useful set of tests to allow developers to say this is the stuff I need to deal with

Brent Zundel: who is not here that this information is vital for, besides wayne?

Manu Sporny: dan buchner

Brent Zundel: I will ping each of them with the minutes

Manu Sporny: I’m hearing a lot of alignment which is great
… hearing a fairly clear implementation path forward
… want to see if anyone disagrees?
… if we submit a vendor implementation of did:key and I specifically write these core property tests against that implementation I would expect there nt to be a lot of disagreement on those things existing in the test suite
… anyone disagree?

Markus Sabadello: Agree with Manu seems like we’re all aligned

Orie Steele: https://github.com/w3c/did-test-suite/issues/32

Orie Steele: I opened issue 32
… the main problem that we will have without some guidance from someone willing to do some initial work is for somebody to a PR similar to oen markus objected to previously that has support for both concrete and abstract data model fixtures
… we need an example of a vendor submitting both abstract and concrete
… the tests are written over expected data structures
… you can’t write tests unless you have some expected data structures

Amy Guy: .. that’s the way the test suite is written today, it’ll be hard tow rite tests untl we have that contribution

Manu Sporny: I don’t think we had… I think we start with testing concrete things, if we get around to the abstract things great
… all of the normative statements in the spec are around concrete things
… nothing abstract, modulo did method stuff
… if we focus on concrete representations and testing those first, we shouldn’t block on abstract stuff
… Also: implementor not vendor. Vendors may have multiple implementations
… you can include the vendor’s name in the implementation
… but using vendor language causes people to get upset about it being vendor focussed vs implementation focussed

Brent Zundel: thanks everyone for coming
… incredibly productive
… I’ll reach out to wayne and dbuc with the minutes so they can see what they signed up for


3. Resolutions