DID WG Topic Call on the Test Suite — Minutes

Date: 2021-04-01

See also the Agenda and the IRC Log

Attendees

Present: Amy Guy, Ivan Herman, Orie Steele, Markus Sabadello, Charles Lehner, Brent Zundel

Regrets:

Guests:

Chair: Orie Steele

Scribe(s): Amy Guy

Content:


1. PR reviews

Orie Steele: I’d like to be able to merge PRs but I’m hoping other folks will be able to merge PRs. We need more than one person to be able to merge a PR
… to progress at a reasonable pace
… I want others to feel empowered to do that

Markus Sabadello: it’s you and manu right now

Orie Steele: that’s problematic. We need more
… agree not enough of us on the call to do much, other than merge some of your PRs if they’ve been open for some time

Markus Sabadello: 6 days, but only one review

Orie Steele: we need more review

Markus Sabadello: biggest challenge is overlap between work that different people are doing
… eg. at least three people have implemented a test to see if a property value conforms to an xml datetime
… because it’s a requirement that comes up in several parts of the spec
… so people implement that separately, so duplication in the code
… and references between sections
… your view was this isn’t necessarily a bad thing and people should work in parallel and harmonize later

Orie Steele: people should own their section, then we should do cleanup, and repeat
… not worry about code duplication immediately
… at some point a tremendous amount of helper functions that should be centralised and reused
… check url, did url, ascii, datetime, etc
… which PR should be merged markus_sabadello? 42?

Markus Sabadello: I’d rather have 43 merged. A lot of code on testing the did resolution and url dereferencing sections
… a lot of stuff
… to avoid future conflicts
… and to start realising what kind of structural change we might want

Orie Steele: I’m merging it.
… Feel the pain f not doing reviews.
… clean it up later..

Orie Steele: https://github.com/w3c/did-test-suite/pull/44

Markus Sabadello: charles has 2 open PRs, I reviewed one

Orie Steele: let’s look at 44 together. i reviewed
… Organise DID param tests by section
… section annotation prefixing, so we can tell what sections are tested, looks good
… thrust of this PR, charles?

Charles Lehner: yes
… in response to a comment by manu on the previous PR

Orie Steele: cool. it’s been open for 5 days. Any reviews before I just merge it?
… can I merge it?

Markus Sabadello: approved

Orie Steele: merged
… PR 41… not looking at that

Orie Steele: https://github.com/w3c/did-test-suite/pull/45

Orie Steele: PR 45.. test conforming consumer
… this is asserting that errors are produced for different representations that when consumed might result in an error?

Charles Lehner: that’s part of it

Orie Steele: sections are annotated well, expectations for errors are basically reflected through the fixture, nice DID regex there

Markus Sabadello: this is the kind of thing we probably want to reuse
… I also have tests that test if something is a valid DID
… and a regex but it’s less cool than this one

Orie Steele: I do too… there’s a million.. they’re all broken because of the % encoding
… % encoding is included in this one. Great.

Markus Sabadello: I like about the input data that the config.json with the data is the errors. If I understand correctly it allows negative tests
… inputs that are not valid and test if the result of the consumption in this case is an error as expected

Orie Steele: looks great

Markus Sabadello: I tried to do that too
… in resolution
… we may want to reuse some conventions on how we express negative outcomes

Orie Steele: agreed
… any objection to merging 45? it’s been open for 3 days and has 2 approvals
… merging
… shigeya, missing dependency… non spec related… going to merge it
… just trusting him

Orie Steele: https://github.com/w3c/did-test-suite/pull/47

Orie Steele: now his most recent PR 47
… is about metadata structures. I started a review
… we might ask him to include the section headers in the descriptions
… but he does have one section header for metadata structure
… I didn’t get all the way through reviewing
… this may be sufficient
… Looks good. Approving.

Markus Sabadello: I worked in a different suite

Orie Steele: that’s really helpful at this stage to have separate files and directories, smaller files are always safer
… I propose everybody on the call take a look at 47, call out anything problematic
… it’s short so you should be able to request changes or approve it in the next couple of minutes

everyone: sounds of concentration as PR is carefully reviewed

Markus Sabadello: one test that can’t easily be tested in code. I also had some of those in my section. He made it a ?? which is also fine

Orie Steele: it’s really good, it’ll show up as a thing that we’ll have to look at
… if we can’t get rid of it we’ll remove it
… if this were automatic our lives would be totally different
… it’s beyond the reasonable expectation of a test author
… any objection to merging 47?

Markus Sabadello: I think it’s fine

Orie Steele: we can always open a PR to update it in the future
… we know these tests are going to be changed around a lot
… no merge conflicts, 3 approvals, going ot merge
… now we’re talking about the production tests
… markus had a changeset requested on this originally because the language was stale, we were testing statements from a pre CR version
… manu claims to have resolved that, you have change suggestions, he said he implemented some
… wondering if we can resole the outdated conversation so folks aren’t tripping up over it
… under your review, where he says he fixed the requested change
… resolve if that’s true

Markus Sabadello: definitely resolved
… in general this PR has a very insufficient.. the language here in the spec talks about the entries in the data model, the representation specific entries
… and using them both to produce the representation
… and charles did that in a good way in his PR, the one of the two we just merged on conforming consumers
… in the data he distinguished between the ADM and representations
… in manu’s PR it doesn’t happen at all
… its’ doing something but not what the spec says
… I wonder that’s not a reason to not merge it
… could say it’s better to have something than nothing

Orie Steele: we can open an issue for that
… and we should
… if we think this test is better than nothing but flawed we should open an issue noting its a problem, describe changes we’d like to see, contineu that conversaiton and the issue will move along and we’ll do a PR to address those concerns
… seems like looking at the suite config here.. it’s a little.. interesting structure
reads code
… markus, you think the primary issue with this is reliance on did resolution result?
… it should instead be relying on some ADM representation?

Markus Sabadello: yes
… if you look at what charles did

Orie Steele: I commented on the generate did producer test line which is the beginning of the problem
… if you can link to the part of charles’ test that is excellent you should do that

Markus Sabadello: I will post the comment, just need a minute

Orie Steele: we don’t have any merge conflicts.. if I were to open an issue..
… producer tests should be of the ADM

Markus Sabadello: yep
… posted comment

Orie Steele: let’s see if this issue if resolved would address your concerns and if you’d be willing to approve
… manu is asserting that the resolution result is the ADM
… with that issue opened acknowledging it has to be addressed for these tests to meet our needs would you be willing to re review

Markus Sabadello: I’d be fine with that
… wondering if manu is already working on something. His last comment was he’ll take a look. Maybe he doesn’t want us to merge it at this point becuase he’s improving it
… but ot merge now and improve later is fine

Orie Steele: if he doesn’t want it merged he should close the PR. It’s open 11 days
… we’ve gone around, there’s no merge conflict
… a smaller PR which addresses these concerns would be easier to review
… in favour of merging and future fixes in a smaller PR

Markus Sabadello: fine

Orie Steele: do we think he’s coming…?

Brent Zundel: he got double booked

Orie Steele: perhaps while we wait to hear back from manu we should discuss the next primary roadblock. Markus, what are you looking at? Are you blocked
… and same for charles

Markus Sabadello: we should answer the question if we want one test suite or multiple suites
… if multiple then how do we split them up
… do we have a different suite for each section rather than having multiple files in a single test suite?
… that would be helpful
… we’re far enough along that we could actually try to harmonize the different input files
… the default test suite there’s a lot of duplication
… a lot of did documents
… one input file is used for testing production and consumption and a different test suite fro resolution which has a lot of did documents
… from a perspective of a did method implementer right now they’d have to submit a lot of unnecessarily duplicated data
… if you have a did method and you want to submit five did docs some with or without errors, right now that’s spread out across too many files
… we could start having this discussion now we merged a lot of PRs

Orie Steele: https://github.com/w3c/did-test-suite/issues/49

Orie Steele: agreed
… I opened issue 49
… which is proposal to restructure the json to be js files that can include smaller json files
… helping us not repeat ourselves
… . that’s one piece of cleanup we could do
… regarding the suites, I’m in favour of a suite per section for now
… assuming we get the configuration pieces sorted out
… it’s good to have smaller test suites more independant, different folders, will lead to less merge conflicts and more autonomy and velocity
… we should make a proposal to manage multiple suites
… I’ll create an issue for that

2. test suite organization, issues

Orie Steele: https://github.com/w3c/did-test-suite/issues/50

Orie Steele: I think it will allow us ot test better in isolation, own yoru section of the spec more easily and make it easier to work on tests, testing only the things you care to improve

Markus Sabadello: sounds good. Also interested in charles’ opinion

Charles Lehner: either could work

Markus Sabadello: we mean a different directory in the repo. You submitted yours in the did spec path where all the others are so far, the proposal is to split that out into folders

Charles Lehner: I think that’d be good

Orie Steele: related issue is we’ve been talking about testing the spec, but there’s a desire to test specific did methods
… a case where we want separate tests
… i know manu talked about the desire to have the tests for the spec could be reused for the did method
… concerned that’s a dream and hard in reality
… easier to implement did method conformance tests as an isolated suite
… anyone have opinions on that?

Charles Lehner: sounds good to me because this test suite is focussed on testing the normative statements

Orie Steele: yep

Markus Sabadello: not sure I understand, was thinking the opposite
… I thought to test the spec, what we’re doing right now, example data and testing the normative statements. In the future people submit their did method results then I didn’t think we’d have to write additional code or tests, just additional input data that we add
… rather than being fictional example data it’s real data
… but other than that we wouldn’t do anything different

Orie Steele: as an implementer wishing to show your method conforms to the 15 different sections of the spec you’d craft payloads that replaced exmaple.com with your did method if you wanted to show conformance with that section
… and then we’d have a directory of different configurations for each method across each of the sections of the spec
… how you imagine it?

Markus Sabadello: i think so yeah

Orie Steele: I think that’s also how manu imagines it
… concerned that the experience for actually doing that for a real did method might be more frustrating than having an independent test suite that reuses pieces of the tests for the spec
… but I can see both ways

Charles Lehner: either could work but there would be more changes needed here to test the did methods fully because currently some tests are only triggered if conditions are met
… and the input might not meet those conditions

Orie Steele: yeah
… that’s part of… generally with tests when you try to make a test do more than one thing you end up sad
… an opportunity to make a really great test experience for did methods that want to show conformance
… by giving them a more structured input format
… we can reuse a lot of code to support them
… but asking them to submit input matching the same structure of the inputs to test normative statements is potentially burdensome
… maybe we’re not far enough along on the tests to answer whether that’s true or not

Markus Sabadello: talking bout two repos?

Orie Steele: no
… like you had created a separate suite for a section of the did core spec. We might create a suite for did method conformance that would call into this different sections fo ra structured input document
… separating the concerns of normative statement tests vs make it easy to make a did method to conform to the spec
… two separate testing goals
… we know we can force the did methods to produce input fixtures that look like what we have today
… that might be frustrating for folks, they don’t care about a good number of the normative statements, they just want to provide an example of a did doc after resolution for their did method
… provide dids with url parameters
… they migth e hoping for an easier set of input fixtures than what we provide if we require them to submit json fixtures for a bunch of different suites

Markus Sabadello: the fixtures we have no have to be harmonized anyway, they will become simpler than they are now
… the ask of the did method implementers will not be that difficult
… on the other hand with the special test suite for did method conformance sounds nice. Don’t have a strong opinion
… let’s clean up what we have now then we’ll see

Orie Steele: that makes sense, agree
… We only have one PR open, no objections from other editors for merging the producer PR
… two approvals, open for 11 days, issue open to track concerns
… gonna merge it
… Now we have no work to do \o/

Charles Lehner: april fools

Orie Steele: I asked markus what was blocking, you gave some feedback, opened issues to track your feedback
… charles, do you have any direct feedback? what’s blocking you?
… any things you’d like to discuss?

Charles Lehner: I’m not sure, thanks. I’m writing my feedback on the new issue
… and I’m open to taking on some part of the work

Orie Steele: awesome
… declare yourself as taking on that work on the issue

Charles Lehner: in general, but I’m writing feedback on issue 48

Orie Steele: that’s good. Feels like we might want to make manu address 48
… it was his PR
… want to make sure he felt comfortable with it
… but if you’re willing to do the work I’m happy to assign you
… leave a comment, he’ll be happy to review a PR instead of make one

Charles Lehner: okay

various: github misbehaves

Orie Steele: other unassigned issues
… proposal to refactor input json
… markus this was one of your feedback, would you like to tackle it?
… I guess manu had mentioned on a previous call that he wanted to tackle some amount of refactoring

Markus Sabadello: I’m far less competent with this js architectural things than other people

Orie Steele: let’s assign manu, that’ll be good.
… proposal to create a suite for major sections of the spec… also assigning to manu
… I know he doesn’t want to be critical path, but..
… we’ll leave the producer tests should be of the ADM
… Every issue is assigned a person
… we have no open PRs
… Anything we would like to discuss?
… I have a suggestion if no-one else has anything

3. test report formats

Orie Steele: I’d like to talk about how we’ll produce visualisation of the test results

Orie Steele: https://github.com/w3c/did-test-suite/issues/51

Ivan Herman: visualisation seems a very elegant word…

Orie Steele: could be a green/red checkmarks you’ve seen

Ivan Herman: exactly
… don’t overcomplicate things

Orie Steele: okay
… I’ve opened issue 51 and make a comment
… minimum green checkmarks and red xs

Ivan Herman: what you have in such a document is the list of implementors where every test has the greens and reds
… and the goal is that in has row you would have at least two greens
… now, having a human readable description or something you can pull out of metadata of what each tests does or what it is for is a good thing to have
… and also a good thing to have a few words on a metadata level
… for each implementation that you have
… but that’s it

Orie Steele: awesome
… wrote that
… tried in the past to programmatically… after we run the suites you get a json object which contains the test results
… if you iterate that json object you can produce something close enough to a test report
… it is frustrating to do that
… it’s a lot of html building from json objects
… if the object payloads change stuff .. it can became brittle
… I more recently abandoned spending any time on this because we know the test shape is changing
… I would hold off on trying any form of this until we feel really comfortable with the way the tests are feeling

Ivan Herman: yes, clearly a better way to do this
… Did you have examples of other reports?

Orie Steele: I looked at the VCs test suite report
… we had been in other projects that use jest as the underlying test infrastructure there is some automated visualisation and reporting
… but it is not super friendly from a reading perspective, but you get it for free
… also an argument that we should add that free reporting here
… very little work to do it, another thing for folks to look at and it’s automated

Ivan Herman: one example of one we use in a different wg

Ivan Herman: See example for report

Ivan Herman: See implementer report data

Ivan Herman: See test suite metadata

Ivan Herman: is automated from json
… not very exciting
… not a big deal to do

Orie Steele: this is very readable

Ivan Herman: one json file for each implementation
… custom format
… not complicated
… and there is this json file which gives you data about each test
… separate metadata for the test suite
… and each implementer puts in a simple one
… and javascript combines all that

Orie Steele: we’ll need something like this
… we saw evidence of this in manu’s PR around the did:key implementation
… overloading the describe block with metadata about the implementation
… where perhaps we’d rather keep that describe block focussed on what the tests are about and pull that metadata from somewhere else
… this looks good
… Thank you all

Charles Lehner: thanks