SocialWeb Test suite discussion

13 September 2023


btsavage, dmitriz, eprodrom, michielbdejong_, pchampin, pfefferle, sandro, tantek
Dmitri Zagidulin, James Gallagher, nightpool
michielbdejong_, pchampin

Meeting minutes

<mro> Hi all, are you already preparing? In 10 minutes the meeting is supposed to start, right? https://www.w3.org/events/meetings/919c2a7d-e925-4249-8a29-09001c15b48a/

<mro> Hi tantek, is there a way to see the presented material other than zoom?

<eprodrom> Hello!


<eprodrom> mro: are you here?

<mro> yes

<eprodrom> :thumbsup:

Dmitri: welcome everyone to the test suite special topic breakout session

<mro> Hi eprodrom

we wanted to continue our series of calls

we started a few weeks ago, where we discussed the general challenges

in developing a test suite for the fediverse, specifically activity pub

discussing various test suite efforts

demos of in-progress test suites and tools

today let's continue with a similar pattern

give updates on tests in progress

any sort of demos

and if there's time, there will be talk about starting test suites on other protocols aside from activitypub

and schedule the next test suite call

any questions already?

Evan: yes, i'll do the queueing thing and add myself

I think our previous discussion was just about the AP federation protocol between implementations

<mro> yes, one very general one: what are the concrete goals and audiences for the test suites discussed?

i think we have two other important specifications that we may want to include test suites for

parsing and producing

and the AP API (client-server)

<dmitriz> https://github.com/swicg/meetings/tree/main/2023-08-11

Dmitri: and here's the link to the previous meeting

Dmitri: that's a great topic to begin with

focus on the AP test suite. specifically the aim is to involve only as much hands-on command-line or human participation as is required

but the general goal is to automate the process as much as possible

to be able to point the test suite to one or two live instances

and check all the musts and the must-nots and should-nots from the AP spec

to that end, several projects started on listing the requirements in a machine readable format

and publishing those to a git repo

on top of which, classical unit tests in javascript or python or go can be built

<dmitriz> https://socialweb.coop/activitypub/behaviors/

Dmitri: so one such repo of behaviours was done by Ben

<eprodrom> Cool domain!

<mro> Are the tests supposed to be run by an independent 3rd party or by each individual self-hoster?

That link is a rendered list of those behaviours, but there will be a git repo for the raw objects as well

Another one was done by ...

Evan: before we move on, i would like to ask about the behaviours implementation

my main question is, is this system complete?

is there additional work that needs to be done on the socialweb.... behaviours list

Dmitri: next step is the shapes, we welcome help on that front

The MUSTs are complete; the SHOULDs still need to be added

Evan: are there mechanisms for this?

Dmitri: yes

I'll show you some of the ..

writing the actual unit tests in javascript

we decided to go with the native test runner

we're using json objects to wrap the test suite

here's an example

<dmitriz> https://codeberg.org/socialweb.coop/socialweb.coop/src/branch/main/activitypub/testing/src/actor.test.mjs

<mro> 404

Dmitri: those are nodejs test runners that wrap some of the behaviours

Dmitri: ah, it was moved, one moment

<dmitriz> sorry, this is the public list of behaviors: https://codeberg.org/socialweb.coop/activitypub-behaviors/src/branch/main/behaviors

Evan: this on codeberg will be roughly the structure that you're talking about

Dmitri: yes, that's the idea

This is the open list of required behaviours. the readme describes the format

<mro> I see yaml.

in terms of an example of it being used by the test suite ...

that ought to provide a starting point

Ben from Meta: I'm not sure how to use it

<mro> I guess I need ssl + subdomain to run - is that described somewhere?

Dmitri: we use the yaml files as a source code from which we can generate JSON objects, JSON schema,

graph insert statements, JSON-LD

It's a general purpose syntax

but for the test suite we compile the yaml into JSON objects

and just use the attributes as described in the ...
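For context, a behavior entry in that YAML source might look something like the sketch below. The field names and the id are invented for illustration, not copied from the activitypub-behaviors repo; the point is that the YAML is the single source from which JSON, JSON Schema, and JSON-LD can all be generated.

```yaml
# Hypothetical behavior entry (illustrative field names).
# Compiled to JSON for consumption by test runners.
id: actor-objects-must-have-inbox
requirementLevel: MUST
content: Actor objects MUST have an inbox property.
attributedTo: https://www.w3.org/TR/activitypub/#actor-objects
```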

<mro> is the process documented?

Ben: i'm not sure how to use the yaml files to test my implementation

Evan: let me illustrate my own interpretation

this behaviours repo is not a test suite

it is a list of requirements that a test suite would use

Tantek: it's a step towards creating a test suite

<mro> is there such a suite?

Evan: correct. because it gives a unique id to each requirement, it lets us implement multiple test suites and still compare apples to apples

and point to the correct way things should work according to the spec

Dmitri: yes, the idea is to 1)

allow these objects to be used in the test naming and the output

but the idea is to give these stable identifiers

and publish the outcomes of various test suites

so our next step in terms of that is

giving each of these behaviours inboxes so that testers and implementers can ...

and this needs multiple inboxes

make them their own topics or even AP actors

so that we can help coordinate with other testers

Dmitri: any things i've missed?

Tantek: i want to first comment and then make a suggestion

first of all thank you Dmitri, this list is key to developing a good test suite

it's a common pattern, step 1, extract the testable assertions into a discrete list

step 2 create an actual test runner

<Zakim> tantek, you wanted to suggest ActivityPub S2S equivalent of https://webmention.rocks/

dmitriz: thanks!

Tantek: the second point was the AP S2S equivalent of webmention.rocks
… some automated way to do such tests
… we've seen great success that once implementations pass the test suite, it should Just Work (tm) with other implementations
… speaking from a Mozilla perspective as well, also for mozilla.social we want to help this drive
… and help with the goal setting
… as a client of the test suite, let us know how we can help with that

Dmitri: that link you mentioned webmention.rocks is exactly the kind of thing we're aiming for. that's our goal yes.

Evan: i'm going to put you on the spot Dmitri, are you building one?

Dmitri: yes, we are, i thought it was public

Dmitri: actually i'm going to put Juan on the spot for that :)

Juan: the two priorities are: test suite for CI, for feature branches etc, most urgent is supporting people who want to add support for something
… so they can test both trunk and their branch
… the other goal is buy-in from other people working on other test suites
… it's almost more important we get the attention from other people doing other form factors etc
… we hope for the end of the year to have semantic anchors linking with the Gherkin work and then we have the apples to apples comparison

<dmitriz> this is Helge's Gherkin behaviors repository, that he demo'd during the previous testing call: https://codeberg.org/helge/fediverse-features. And I believe Helge expressed interest in collaborating with the behaviors list from the spec

Juan: how you compare architectures, p2p vs client-server, they will have different test runners, but they can agree on the requirements identifiers

Evan: so i can run this and see the output?

Juan: ideally, yes

Juan: and i was on the queue for another thing
… to say the yaml files include one counter-intuitive key-value pair
… they depend on each other; if you do this, then that should happen etc
… so if you're comparing a P2P that does only one of the code paths of a dependency
… then the idea was to have a little bit of structure to say "I'm ignoring this code path" - we were assuming this would prevent people down the road from finding fault with it
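The dependency encoding Juan describes might look something like the sketch below. The key names here are invented for illustration; they are not the actual keys used in the YAML files.

```yaml
# Hypothetical sketch: one behavior depends on another, and a runner
# that never exercises a code path (e.g. a P2P implementation) can
# declare it skipped rather than failed.
id: inbox-delivery-triggers-side-effects   # invented id
dependsOn:
  - actor-objects-must-have-inbox          # must hold before this applies
skippable: true                            # runner may opt out of this path
```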

<Zakim> by_caballero, you wanted to mention the dependency-encoding of the YAML files

Ben: it occurs to me it would be extremely useful to have something like web platform tests
… is that something that we're trying to work towards?

Dmitri: yes and we welcome contributions but that's the goal

Tantek: a manual integration report would be a good start
… I think it's OK to have a more modest goal of manual implementation reports

<tantek> https://webmention.net/implementation-reports/summary/

Tantek: for webmention there is this giant table
… light green is like barely passing, dark green is like 'oh yeah'
… not every implementation needs to do every feature
… yellow would be 'oh only one implementation did that'

<tantek> https://webmention.net/implementation-reports/

Dmitri: is there a repo link?

Tantek: yes

<tantek> https://github.com/w3c/webmention/tree/master/implementation-reports

Tantek: and if that structure works for you great, if you have a better structure, also great
… as much as we can help point out building blocks, I'm here for you

Ben: I'm looking for advice, if i was today to write unit tests, to see if Threads passes this list of behaviours
… how have others done this in the past? have sample behaviours and then manually craft the unit tests?

Dmitri: great question, the answer is going to be all of the above
… even if it works between two or more servers, we're going to have to have example data, example create and subscribe messages, follow, unfollow, etc
… so you're absolutely right that the behaviours themselves are not enough

Ben: yes, a repo of mock data would be super awesome

Dmitri: yes, and we would love to host that in the same repo
… where should we host it? on the social CG github space?
… it is git so it doesn't matter that much in the end
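Mock data of the kind Ben and Dmitri discuss could be as small as a minimal Actor document; the shape below follows the ActivityPub spec's actor-object requirements, with placeholder URLs.

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Person",
  "id": "https://example.test/users/alice",
  "preferredUsername": "alice",
  "inbox": "https://example.test/users/alice/inbox",
  "outbox": "https://example.test/users/alice/outbox"
}
```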

<mro> pfefferle: you mean https://fedidevs.org/category/reference

Evan: yeah, i think, i believe social web coop are the only ones so far implementing a test suite with this
… to answer Ben's question, the way people do it now is to just go through the spec

<pfefferle> yes

Evan: ideally set up locally, but you can also set up test users on public instances and send messages from both sides
… from your own implementation to Mastodon and back for instance. a very manual process. saying this as someone who gets test follows sometimes :)

Pfefferle: I just wanted to mention two projects. https://fedidevs.org/
… is kind of a project that collects sample data
… how actors look, etc. you can at least use that

<by_caballero> https://fedidevs.org/reference/actor/ <-- mock data!

<tantek> https://fedidevs.org/

Pfefferle: and there is an nlnet project that tries to fund an AP test suite. Started by Johannes Ernst
… but maybe it is a good idea to fund some of the existing projects

Ben: so i'm wondering if it would be useful for implementers to set up some kind of test user affordance
… so you can ask please create test user so and so
… without needing to send out any sort of test follows

Dmitri: one answer is "yes, let's!"
… another is that it's creating a small parallel spec, where we're testing the test affordance
… but some instance have signup requirements and manual approval

<mro> I found some projects refuse to name a sample instance to create accounts at. Problem.

Ben: then you have n^2 combinations of implementations. but theoretically you could do just n tests if there would be an oracle perfect counterparty instance to test against
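Ben's point about test effort can be made precise: pairwise interop testing of n implementations requires one run per unordered pair, which grows quadratically, while testing each implementation against a single well-known counterparty grows linearly. A quick illustration:

```javascript
// Pairwise interop testing grows quadratically with the number of
// implementations; testing against one "oracle" counterparty grows linearly.
function pairwiseTests(n) {
  // every unordered pair of distinct implementations
  return (n * (n - 1)) / 2;
}

function oracleTests(n) {
  // each implementation tested once against the oracle
  return n;
}

console.log(pairwiseTests(20)); // 190 pairs for 20 implementations
console.log(oracleTests(20));   // 20 runs against one oracle
```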

Evan: i think the term of art is a "reference implementation" :)

<mro> I wrote things down in a "Federation Fairness" note with a more cultural approach.

Dmitri: that term has been incredibly contentious, it's such a political statement

<tantek> +1 dmitriz "reference implementation" is a very contentious phrase

Tantek: the whole test models thing is very much about how stuff is bootstrapped
… Jeremy Keith (?) has this post where they say 'you can test your replies implementation against this post'
… even before there is a formal test suite

<dmitriz> haha yeah, so, it IS really useful! we'll just have to name it carefully "This thing over here is not a reference implementation we swear" <- has a nice ring to it.

Tantek: and in chat channels people ask 'can someone send me a test follow?' it shows the health of the community
… let's shy away from reference implementation. but a test suite server should also be open source
… rather than calling it a reference implementation

<Zakim> tantek, you wanted to suggest a wiki page inside w3.org/wiki/SocialCG/ that at least keeps a flat list of current AP test suite sites/efforts (like we are collecting here)

<dmitriz> "test suite server" <- that's a great name too, yeah!

Tantek: there are a lot of different efforts for test suites, some are disconnected

<by_caballero> https://socialhub.activitypub.rocks/t/wiki-collected-feedback-on-interop-testing-methods-living-docs-and-specs/3538

Tantek: and that's ok but i was going to suggest we add a page on the wiki, listing all test suites we know about

<by_caballero> ^ Here's the current linklist on socialhub

Tantek: for others who want to start yet another one
… and they can hopefully get in contact with each other
… it almost sounds like a coordination channel more than an interest channel
… and people working on different pieces and then putting them together.. I'm hinting and signing you up for that, Evan

Dmitri: yes

<eprodrom> w3c/activitypub#387

Dmitri: there is a very exhaustive collation of testing efforts
… but this wiki page could be more summarised

Tantek: yes, let's not divert existing conversation, but at least link to it to make it more discoverable

Evan: first, i created an issue to remind myself of the task of creating a testing page
… and also link to any testing implementations
… i think the question i wanted to ask is - we have a number of automated testing projects happening here. do we want to have an official "Social CG test suite" that is the single or primary way to put our efforts together
… do we have momentum to say the CG runs and maintains a test suite?

<btsavage> I support an official test suite

Evan: and if so how do we get there

<tantek> +1 if we have a volunteer to lead it and coordinate

Juan: I was on the queue for something similar
… after Johannes talked about his nlnet plan, i started a blogpost. it's not ready to share yet but

<tantek> if/when we have a Social Web WG, we'd likely want that as well, a Social WG test suite, which could live at a different domain (e.g. like webmention.rocks) but would still be an official effort from the group

Juan: but i was thinking of a framework for working from the spec down to the apis, and Johannes was talking about the opposite direction
… one of the trickier things for the Social CG is framing it in a non-zero-sum way
… directing people based on if your primary goal is 100% interoperability, this might be the test suite for you
… there can be a layer in between, like architectural decision-making
… i keep chewing on that, that discussion about the python test runner that tests fediverse servers. it is not agnostic to client-server vs P2P
… you don't want to pick between your children
… all this to say that i'm thinking about that as a question to answer before we take a decision about the official CG test suite

Dmitri: i'll summarize

michielbdejong_: about reference implementations / open-source test servers;
… we have a similar thing, opencloud XXX
… for all implementations we tell the server to do this or that
… or the other way around, tell the OCM to go to the API to test

<dmitriz> +1 michiel, sounds like another vote for the usefulness of an "oracle" like test server

michielbdejong_: both directions are possible; use of stubs that just *look* like they implement the protocol
… that's a way to avoid a "reference implementation"

Evan: i lost my thought
… oh, my main item was to say we don't maintain the Mastodon API :)
… i don't think it's necessary for us to test that but we do have AP API and ... S2S and activity streams
… yes my next question is about funding
… we had some discussion about two different systems, where some cons were raised
… does it make sense for a CG to solicit funding from other sources
… but is there a way we can get this test suite aligned


michielbdejong_: for Solid, we did that in parallel, independent from the CG

Dmitri: that's a great topic to take to the mailing list

Juan: could someone commit to coordinating?
… CC me on all the emails
… i offered this informally in a previous call. i am already funded for this volunteer work if coordinating testing is an activity that we want to happen
… that money is specifically for coordinating volunteer efforts
… not product testing, it has to be anchored in the spec
… laser focused

<Zakim> by_caballero, you wanted to speak to funding

Ben: i am happy to try to scale up funding
… to support test suite if the CG could provide us a recommended thing to fund
… that would be super useful

Dmitri: thank you
… let's take this discussion back to async

<eprodrom> PROPOSED: create a Testing Task Force within the SocialCG with by_caballero as Task Force Lead

<dmitriz> +1 to that (Testing task force)

<tantek> +1

<eprodrom> +1

<pfefferle> +1

<by_caballero> +1 danke!


<mro> +1

RESOLUTION: create a Testing Task Force within the SocialCG with by_caballero as Task Force Lead

Summary of resolutions

  1. create a Testing Task Force within the SocialCG with by_caballero as Task Force Lead
Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).


Succeeded: s/parsign/parsing

Succeeded: s/Dimitri/Dmitri

Succeeded: s/guilt/built

Succeeded: s/suitee/suite

Succeeded: s/(...)/inboxes

Warning: ‘s/... dev.org/https://fedidevs.org/’ interpreted as replacing ‘... dev.org’ by ‘https://fedidevs.org’

Succeeded: s/... dev.org/https://fedidevs.org/

Succeeded: s|https:|https://fedidevs.org/

Succeeded: s|https://fedidevs.org///fedidevs.org|https://fedidevs.org/

Succeeded: s/parents/combinations of implementations

Succeeded: s/itmore/it more

Succeeded: s/buty/but

Succeeded: s/like want/likely want

Maybe present: Ben, Dimitri, Dmitri, Evan, Juan

All speakers: Ben, Dimitri, Dmitri, dmitriz, Evan, Juan, michielbdejong_, Pfefferle, Tantek

Active on IRC: btsavage, by_caballero, dmitriz, eprodrom, Ian, michielbdejong_, mro, pchampin, pfefferle, sandro, tantek