13:58:11 RRSAgent has joined #did 13:58:15 logging to https://www.w3.org/2025/07/23-did-irc 13:58:16 rrsagent, draft minutes 13:58:17 I have made the request to generate https://www.w3.org/2025/07/23-did-minutes.html Wip 13:58:23 rrsagent, make logs public 13:58:38 Meeting: Decentralized Identifier WG Special Topic Call 13:58:49 Chair: Wip 13:58:53 Agenda: DID Resolution Test Suite 13:59:08 present+ 14:03:16 KevinDean has joined #did 14:03:21 present+ 14:03:32 bigbluehat has joined #did 14:04:51 present+ KevinDean 14:04:59 present+ 14:04:59 present+ wip 14:05:10 present+ denkeni 14:05:20 scribe+ 14:05:23 brent has joined #did 14:06:48 Topic: https://github.com/w3c/did-resolution/issues/92 14:07:09 bengo has joined #did 14:07:15 wip: This is about the spec. The spec has normative statements, and we need to develop a test suite to demonstrate that there are multiple conformant DID resolver implementations. 14:07:30 https://hackmd.io/pxOH-leKR8KD8LOxdb7w5A 14:07:47 https://github.com/digitalbazaar/test-suite-coverage-metrics/blob/122a836631dceb8a665fbda0bfa9445adb9de31b/app/plugins/analysers.py#L82 14:07:55 ...In that issue, there is a link to HackMD from Manu's company (Digital Bazaar). 14:08:07 ...You pass in a spec, and it spits out all the information about it. 14:08:56 q+ 14:08:59 ...It locates "MUST" statements and outputs them. It lacks context about what the MUST statements are, however. I'm working through the output to clarify it. 14:09:20 ack bigbluehat 14:09:22 ...We need to make the statements clearer and describe what a test for each statement would look like. 14:09:38 bigbluehat: I got Patrick to write the text extractor, so more context on why it does what it does. 14:10:04 ...The reason for the exact quotes is so that it's easy to link back to the spec using text fragment URLs that Chrome and possibly Firefox support. 14:10:10 q+ 14:10:27 ...You can't deviate too far from the language in the spec.
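[Editor's note: the extraction and linking idea described above (locate "MUST" statements, keep the verbatim quote, link back with a text fragment URL) could be sketched roughly as follows. This is an illustration only, not the Digital Bazaar analyser. The function names, the keyword list, the naive sentence splitter, the sample sentences, and the `spec.example` URL are all assumptions made for the example.]

```python
import re
from urllib.parse import quote

# Assumed keyword list; the real extractor may use a different set.
NORMATIVE = re.compile(r"\b(MUST NOT|MUST|SHALL NOT|SHALL|REQUIRED)\b")


def split_sentences(text):
    """Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normative_statements(text):
    """Return the sentences, verbatim, that contain a normative keyword."""
    return [s for s in split_sentences(text) if NORMATIVE.search(s)]


def text_fragment_url(base_url, quoted_text):
    """Build a '#:~:text=' link that points at an exact quote in the spec.

    quote() leaves '-' unescaped, but '-' is part of the text fragment
    syntax, so it is percent-encoded explicitly here.
    """
    encoded = quote(quoted_text, safe="").replace("-", "%2D")
    return f"{base_url}#:~:text={encoded}"


if __name__ == "__main__":
    # Hypothetical spec text, not taken from the DID Resolution spec.
    sample = (
        "A resolver MUST return a result. It may log requests. "
        "The input-DID MUST be valid."
    )
    for statement in normative_statements(sample):
        print(text_fragment_url("https://spec.example/", statement))
```

[Because the link is built from the exact quote, any rewording of the statement in the spec breaks the link, which is why the verbatim text matters.]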
14:10:55 ...There is no way to go from test suite results to the spec without association with the original text. You would have to reason through every test and through the spec to align the two. 14:11:13 ...Every tester would have to understand everything about the test and the spec to be effective. 14:11:35 Some thoughts and code on test 14:11:35 tl;dr I like the focus on 'testable statements' a la w3c test methodology 14:11:35 https://bengo.is/blogging/use-test-specifications-to-improve-testability-of-activitypub-requirements/ 14:11:35 https://bengo.is/blogging/easy-to-test-activitypub-requirements/ 14:11:37 https://bengo.is/blogging/testing-ecosystem-wishlist/ 14:11:37 https://bengo.is/activitypub/apply-w3-test-methodology/ 14:11:37 https://bengo.is/activitypub/projects/activitypub-testing/announcement/ 14:11:38 https://codeberg.org/socialweb.coop/activitypub-testing 14:11:38 oh ya and like how could we test FEPs i.e. fediverse enhancement proposals like this one that talks about how to put a Multikey on your ActivityPub Actor https://activitypub-testing-website.socialweb.coop/fep/521a/ 14:11:46 ...Using verbatim quotations for a one-to-one mapping explains what's being tested. 14:12:06 ...Linking throughout the test suite and comments makes it easy to go back to the spec. 14:12:09 ack Wip 14:12:14 https://github.com/w3c/did-resolution/issues/162 14:12:20 wip: Maybe I should have saved the original statement. It ties to issue 162. 14:12:33 q+ 14:12:43 ack manu 14:12:44 ...I saw the statements come through and thought we should have normative statements in the spec that can be extracted without losing context. 14:13:17 manu: The main thing I want to make sure we're doing is looking at the other test suites that are being done in the community to see if we want to align.
14:13:27 I also like how WCAG has standard 'test rules' that anyone can author and then the groups can legitimize https://www.w3.org/WAI/WCAG22/Understanding/understanding-act-rules.html 14:13:28 q+ bengo 14:13:38 ...The mistake we made in version 1.0 is that we asked people to submit documents for their tests, and then they never ran the tests again. 14:14:10 ...The good thing we did in the VC 2.0 test suites is provide weekly integration tests to provide information about where the implementer stands. 14:14:23 sorry linkbomb one more for posterity / respec https://socialweb.coop/activitypub/test-cases/ 14:14:29 ...There was a subset of people who didn't like the idea that they had to set up a server and would have preferred a Docker image. 14:14:42 ...It can be an enormous amount of work putting the test suites together. 14:14:49 q+ 14:15:08 ...What we ended up doing with the VC test suites is that we provided the section number in the test so that people can refer to the spec. 14:15:36 ...It's labour-intensive for the testers to read the spec and reference the normative statements. It took a couple days to write each test. 14:16:04 ...We need to know who is going to do the work. When my company did them, it cost a lot of money, upward of $100,000 per test suite. 14:16:39 ...It can differ, but something new like a DID resolution test suite is going to be expensive, and funding is hard to come by. 14:17:07 ...We are willing to help, but we're not willing to shoulder the entire burden of it. 14:17:47 ...What we may want to do is reduce the number of normative statements. We need to understand who is implementing what. 14:18:03 ...We should reuse as much of the work from the VC as we can. 14:18:21 ...We also want the test suites to be repeatable, so that people aren't running them once and never again. 14:18:56 wip: Bengo, could you speak to Manu's points about funding? 14:19:41 bengo: I put a link to standard test rules. Everyone thinks they're using ???
but they're using the Mastodon protocol. 14:20:19 ...I asked around how some people have done the Mocha test development. So many of the tests in the W3C in the past have been focused on things like "how does this render in the user agent?" rather than networked services. 14:21:21 ...There's a big gap between the set of normative statements and what people implement. It was an interesting exercise, and we should have working sessions on what should be tested, what the accessibility working group calls "test rules", which provides what to be tested and how it is to be tested. 14:22:09 ...Frankly, a lot of that is upstream of any automation. Much of the work is about finding which of the normative statements and then taking perhaps the top five rather than boiling the ocean and targeting something for which there is no funding. 14:22:45 ...Maybe the Sovereign group would be open to being approached. 14:22:53 ack Wip 14:23:28 wip: My purpose for the HackMD is that that is the place where we come to a decision on what normative statements to test and what each test would look like. 14:23:45 q+ 14:23:46 ...Once we have that document, we can share it with the rest of the group to get a rough consensus. 14:23:52 FWIW https://www.sovereign.tech/tech/activitypub-test-suite/ (imho the grant writer put way too much into the grant app and it wasn't just about testing) 14:24:14 ...For each of those statements, there should be a clear statement as to what it refers to in the spec. 14:24:27 ...I was at a session in Geneva about open source technology, standards, and sovereign states. 14:24:47 q+ 14:24:49 ...How do states support open source without compromising it? 14:25:14 KevinDean: I had said 'everyone thinks they're using ActivityPub but really they're using the Mastodon Protocol' i.e.
there are almost no 'activitypub conformant servers' (or clients) that implement the client to server part of ActivityPub because Mastodon has its own whole C2S protocol and never even tried to implement AP C2S (mastodon predates AP) 14:25:38 ...From the W3C's perspective, if we can connect with some of these people with sovereign wealth behind them, we could find some funding with the right proposal. 14:25:43 acl bengo 14:25:45 ack manu 14:25:54 At least now there's a web component to test it https://socialweb.coop/activitypub/actor/tester/ 14:26:05 q- 14:26:13 q- bengo 14:26:16 q+ 14:26:23 manu: +1 to follow on what Wip said. Our company was funded by the National Science Foundation multiple years ago to do just that. That's where the funding came from for the VC test suite. 14:26:40 ...One of the things we need to do is land that stuff at a standards organization, one of which is W3C. 14:28:18 ...I want to go back to actually "getting the work done". What we're talking about is fine, but I want to make sure that we get to implementing this stuff sooner rather than later. 14:28:43 ...We don't really find out how testable a statement is until someone actually implements it and writes a test for it. 14:29:16 ...Until we get the first test written and the first resolver hooked up, we won't have clarity. We need to do that in the next month or two, for just one statement, so that we understand the architecture of what we're testing. 14:29:43 ...There is a DID resolution test suite, created two to three years ago, that has resolver endpoints and how you run the test. This might be something good to start from. 14:30:02 q+ to ask manu and whoever else: tradeoffs of the initial test(s) being in code vs plain language? 14:30:11 ...It would be really nice if we could get this integrated with reporting so we could see who is running the test suite and who is confirmant.
14:30:20 s/confirmant/conformant/ 14:30:50 ...The output of the resolver test suite should look familiar to people with organizations, tests, and results in the output. 14:30:59 ...This was done a long time ago and hasn't been updated. 14:31:24 ...The infrastructure already exists and we can leverage it. I think all we have to do is focus on writing the tests as the infrastructure is already there. 14:31:47 JoeAndrieu has joined #did 14:31:54 ...The concern about extracting the language out is that we found that as the specification changes over time it creates a significant amount of chaos for the test writers. 14:32:06 q+ oh yeah would love to +1 that that one of the reasons I stopped on the test suite is a non-editor started changing the normative statements in the ActivityPub editors draft outside of a WG ! 14:32:27 Oh yeah one reason I stopped on AP test suite is a non-editor started changing all the normative text in the 'editor's draft' of ActivityPub :( 14:32:36 ...We don't like writing test suites until the candidate recommendation phase, because the spec is still in flux and it costs a lot to keep rewriting the tests. 14:33:03 ...We can automate this to a degree by having the editor verify that the normative statement is still present. 14:33:17 ack Wip 14:33:24 ...Getting something running, locking in the architecture, is most important right now. 14:33:48 wip: What I'm hearing from you is that we need to come together to select five or even just one statement to test the architecture. 14:34:25 ...How I see one way to do that architecture is that, as an implementor, I would like my library to pass the test suites. 14:34:49 ...You need to be able to provide an HTTPS URL that can accept resolution requests in the form defined in the spec. 14:35:21 ...I would pass a URL, a DID, and a resolution option that would provide a response that conforms to the test. 
14:35:34 q+ 14:35:44 acl bengo 14:35:44 ack bengo 14:35:44 bengo, you wanted to ask manu and whoever else: tradeoffs of the initial test(s) being in code vs plain language? 14:35:55 ...I think it's fairer on the working group to have multiple test developers. 14:36:48 bengo: The things you're showing are test suites in JavaScript. It's important that there be plain language about what the tests are. 14:37:22 ...What are the tradeoffs if we pick one statement and document it in plain text versus just providing the test? 14:37:45 manu: Our experience with doing these test suites is that nobody really cares about what's in the test suite. 14:38:04 ...The only point at which anyone cares is when their code fails. 14:38:39 ...I don't think plain language helps. The specification language is supposed to be plain enough for people to read. If they can't, then there's something wrong with the specification. 14:39:16 ...Plain language is time-intensive, as you're writing yet another spec. It bumps the costs way up. We trust the person writing the test suite that they're doing a good job. 14:39:33 ...What often happens is that one person writes the test and they're the only one that understands it. 14:39:54 TallTed has joined #did 14:40:14 From a lean / good ROI perspective I don't disagree with what Manu is saying at all 14:40:14 ...Let's get to implementations, an understanding of who is writing the test, and the architecture. 14:40:39 q? 14:40:39 ...After creating a test on the architecture, review it with the group and get agreement on moving forward. 14:40:48 ack manu 14:41:25 bengo: In terms of funding, people in the community self-funded to change normative statements in the editor draft to align with the tests. 14:41:30 q? 14:41:32 q+ to note we're happy to help, but can't be the central implementer this time around. 14:41:48 KevinDean not to align with the tests or spec but to change the spec to their implementations 14:41:50 wip: Can we do a poll?
Who here is willing to help with the development of test suites? 14:41:55 ack manu 14:41:55 manu, you wanted to note we're happy to help, but can't be the central implementer this time around. 14:42:03 q+ 14:42:17 q+ 14:42:21 manu: We're happy to help, if what we do is aligned with what we did in the VC working group. 14:42:43 ...It can get through the standardization process, it can provide daily or weekly output, it can integrate with reporting/tooling. 14:43:00 ...We can help, but we can't be the central implementer of this test suite. 14:43:02 present+ 14:43:31 ...We have some understanding of how we could make this as effective as the VC 2.0 test suite, but we need confirmation of the architecture. 14:43:31 ack bigbluehat 14:43:58 bigbluehat: +1 to what Manu said. We would not want to carry the water for this one after spending the time on the VC one. 14:44:42 ...Test suites are valuable to government agencies and other users looking for implementations that pass the test suites. 14:44:54 ...It may be possible to get funding for that. 14:45:05 ack Wip 14:45:17 ...There's a lot of foundational work done, and we'd be happy to help to keep it moving. 14:45:55 q+ 14:46:28 ack manu 14:46:28 wip: Manu, how do you define the architecture as possible? One of the differences with a DID resolver is that VCs are about verifying a payload, but resolvers are implementation-specific and will implement the normative statements only for their supported DID methods. 14:47:07 manu: The general testing architecture that we have right now is that the testing infrastructure presumes that you're going to call some kind of URL with some kind of payload. 14:47:33 ...The good news is that we could presume a similar architecture for DID resolution, which specifies an HTTP endpoint. 14:47:34 +1 14:47:44 q+ 14:47:55 ...If you expose that endpoint, you can reuse the same architecture that we used for VC 2.0. 
14:48:09 q+ 14:48:24 ...If someone wants to test a library, you can wrap it in a Docker image, run the image locally, and access it over HTTP locally. 14:48:49 ...The other problem is, what DID are we going to use for interoperability testing? 14:49:13 ...You can provide the URL as well as test DIDs (good and bad results). 14:49:47 ...Every resolver can come to the implementation supporting its own DID method. Ideally, we don't care what DID method is being used. We just care about the interface working. 14:50:14 ...There might be some DID methods that don't support deactivation, so how do we support that? 14:50:46 ...We may be stuck with the DID resolvers having to specify in their configuration various DID documents that result in testable outcomes. 14:50:48 ack Wip 14:51:11 wip: We are developing a library, and we'll easily be able to spin up a Docker image around it. 14:51:43 ...DID resolver implementations need proper test cases for their own DID methods. 14:52:05 ack bigbluehat 14:52:08 ...This is work that DID method developers need to do to ensure that they're developing a good DID method. 14:52:26 q+ 14:52:42 bigbluehat: I want to confirm that the Docker file exists. The Docker approach isn't truly necessary as the amount of code is minimal. 14:53:13 ...There's typically one GET and one POST endpoint that take JSON as input and return JSON output. 14:53:21 ack Wip 14:53:59 wip: The default and easy way is to provide a DID URL. If you don't do that, you will have to do extra work to run the test. 14:54:09 q? 14:54:19 q+ to get to concrete next steps. 14:54:24 ack manu 14:54:24 manu, you wanted to get to concrete next steps. 14:54:46 manu: I want to make sure we don't end without an action plan. Are we on the same page about reusing the VC infrastructure? 14:54:56 q+ 14:55:31 ...If you're doing a DID resolver, you're exposing an HTTP endpoint, providing DIDs that can be input, and providing expected results. 14:55:41 ack Wip 14:56:05 wip: I think we do.
It's on the agenda for tomorrow to feed back to the main group. We should get a decision there. 14:56:35 ...I feel strongly that we should require resolvers to provide their own testing rather than provide a single central test. 14:56:45 ...Perhaps that's something we can get into a proposal that we may pass. 14:57:25 manu: I'll try to send something to the mailing list in preparation for tomorrow.
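[Editor's note: the architecture discussed in the call (each resolver exposes an HTTP endpoint, supplies its own test DIDs, and states the expected outcome for each) might be captured in a per-implementation configuration along these lines. This is a rough sketch under stated assumptions, not the VC 2.0 test infrastructure or any agreed design. The class and function names, the three outcome labels, and the `resolver.example` endpoint path are hypothetical. The result property names (`didDocument`, `didResolutionMetadata`, `didDocumentMetadata`, `deactivated`) follow DID Core, and the shape of the `error` value is deliberately not inspected.]

```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass
class TestDid:
    did: str
    expect: str  # hypothetical labels: "resolves", "error", or "deactivated"


@dataclass
class ResolverUnderTest:
    name: str
    endpoint: str  # HTTP(S) endpoint supplied by the implementer
    dids: list = field(default_factory=list)


def request_url(resolver, did):
    """Append the DID to the endpoint; ':' is kept readable, the rest is escaped."""
    return resolver.endpoint.rstrip("/") + "/" + quote(did, safe=":")


def check_result(expect, result):
    """Compare a parsed JSON resolution result with the declared outcome."""
    res_meta = result.get("didResolutionMetadata") or {}
    doc_meta = result.get("didDocumentMetadata") or {}
    if expect == "error":
        return "error" in res_meta
    if expect == "deactivated":
        return doc_meta.get("deactivated") is True
    if expect == "resolves":
        return "error" not in res_meta and result.get("didDocument") is not None
    raise ValueError(f"unknown expectation: {expect}")


if __name__ == "__main__":
    # Hypothetical implementer entry; a method without deactivation support
    # would simply omit the "deactivated" case.
    resolver = ResolverUnderTest(
        name="example-resolver",
        endpoint="https://resolver.example/1.0/identifiers/",
        dids=[
            TestDid("did:example:123", "resolves"),
            TestDid("did:example:missing", "error"),
        ],
    )
    for case in resolver.dids:
        print(case.expect, request_url(resolver, case.did))
```

[A test runner would fetch each URL, parse the JSON body, and pass it to `check_result`. Letting each implementer declare its own DIDs keeps the suite independent of any one DID method, which matches the point that only the interface is being tested.]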