W3C

– DRAFT –
All your spec are belong to us - irrigating dev resources from specs

29 October 2020

Attendees

Present
csarven, denis, Jemma, smcgruer_[EST]
Regrets
-
Chair
-
Scribe
tidoust

Meeting minutes

Dom: Francois and I thought we would use TPAC as an excuse to look at how specifications are authored today, what ecosystem of tools exists around them, what benefits can be derived, and who benefits (e.g. developers).
… If you look at how the Open Web Platform is developed today, you're looking at a very diverse set of people working together.
… 26 WGs at W3C developing specs for browsers.
… At the very least 9 CGs incubating ideas.
… 15 work streams at WHATWG, including DOM, HTML, Fetch
… The core of JS is developed at ECMA under TC39
… On top of that, you need to also count the WebGL work done in the Khronos Group
… And that's only for specifications used in browsers
… That diversity is great, but we're all working together on a single platform, so diversity must not reduce the consistency of the work we produce.
… I like Conway's law.
… If you look at the way the web platform gets developed, you see quite a lot of cases where this happens. In many cases, the culture or background of the group that develops a spec appears in the spec.
… We need to improve the communications across the groups to ensure that the design patterns flow better among them.
… Communication needs to be asynchronous and decentralized or distributed, and that's hard to get right.
… Tools can help make the problem much more tractable.
… That's what we'd like to highlight here.
… One core trend is that we've seen a lot of inspiration drawn from how software gets developed being applied to specification work.
… The notion of versioning, commits, dependencies, etc.
… There is broad consensus that this goes in the right direction.
… One of the ways we believe this can go further is by looking at dependency management.
… So far, in spec land, dependencies have been managed in a very coarse fashion.
… We're talking about specs that are several hundreds of pages long. Saying that a spec has a dependency on HTML does not mean much!
… It does matter, because a dependency in a spec means that if the spec you depend on changes, you may need to adapt your own spec to align with the change.
… A link may no longer work, a concept may no longer make any sense.
… It kind of works today, but it's not smooth.
… Tooling can help make that more precise.
… The idea is to track which features of spec A depend on which features of spec B, etc.
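
[Illustration: a hypothetical sketch, in TypeScript, of the kind of fine-grained dependency record being described. The shape, names and URL are invented for this example and do not correspond to an actual webref or ReSpec data format.]

    // Hypothetical shape of a fine-grained dependency record (illustration only):
    // a feature of spec A points at the exact definition it relies on in spec B,
    // so a change to that definition can be flagged automatically.
    interface FineGrainedDependency {
      from: { spec: string; feature: string };
      to: { spec: string; definition: string; url: string };
    }

    const example: FineGrainedDependency = {
      from: { spec: "payment-request", feature: "PaymentRequest.show()" },
      to: {
        spec: "html",
        definition: "in parallel",
        url: "https://html.spec.whatwg.org/#in-parallel"
      }
    };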

[scribe goes offline for 3 min]

[scribe resumes]

Dom: Two main authoring tools, Bikeshed and ReSpec.
… Thanks to developments in those tools, we're able to bring more and more of this added value to specs.

[scribe got disconnected again]

Dom: We're seeing the emergence of tools able to make use of this data. We don't even need to know about all of them. It's clear that there is a demand from the dev ecosystem to have access to the data.
… We should make sure that they are able to access it.
… I mentioned Specref, developed by Tobie Langel, which lets you reference other specs without having to worry about versioning. That's the double square brackets syntax that you're probably familiar with.
… One piece that has emerged more recently is the ability to reference definitions in other specs. Right now, it's based on Shepherd by Peter Linss. Plan is to switch to webref data, which covers more specs.
… Again, the idea is that you just use a specific syntax (see example on the slide)
… Once you do that, you get all the linking for free. And the tool will also warn you when you target a definition that no longer exists.
… For that to be useful, we need groups to step up to flag definitions that they agree to export and conversely those that they do not want to export.
… These are important messages to convey.
… That also forces groups to negotiate exported definitions, and our experience is that this is a good thing!
… You may have seen other examples showing which sections are supported in which browsers, based on the MDN Web Docs project. This is done semi-manually today. Being able to know where a definition appears in the spec makes it possible to create this type of annotation.
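
[Illustration: a minimal sketch of the authoring syntax being described, as it appears in Bikeshed/ReSpec spec sources. The shorthands ([[...]], [=...=], {{...}}) are supported by both tools; the export/noexport attribute names differ between them (bare export in Bikeshed, data-export in ReSpec), so check the respective documentation.]

    <!-- Reference another spec via Specref: the double square brackets syntax -->
    <p>Requests are processed as described in [[FETCH]].</p>

    <!-- Link to definitions exported by other specs: [=...=] for concepts,
         {{...}} for WebIDL constructs -->
    <p>Run these steps [=in parallel=]; on failure, reject with a {{DOMException}}.</p>

    <!-- Flag which local definitions are exported for other specs to reference,
         and which are not (attribute names shown are the ReSpec-style ones) -->
    <dfn data-export="">widget</dfn>
    <dfn data-noexport="">internal bookkeeping slot</dfn>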

<MikeSmith> https://github.com/foolip/mdn-bcd-collector is a recently-created tool that’s being used to automatically populate and audit BCD data

Dom: The IDL fragments in webref are automatically brought to Web Platform Tests, used by browser vendors to test their implementation, thanks to integration done by Philip Jägenstedt.
… I mentioned earlier the MDN Browser Compat Data. That data is also well-positioned to benefit from the automatic extraction we're doing.
… They need to know what definitions exist.
… Once you're able to deal with this extracted data, you can feed the documentation for the feature, and in some cases tell whether browsers support the said feature.
… That's what the MDN BCD Collector is targeting.
… Can I Use nowadays reuses data from MDN BCD, so, again, there are deep repercussions on the dev ecosystem.
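
[Illustration: a sketch of what such an IDL test looks like in Web Platform Tests, using the idl_test() helper from idlharness.js. The spec short names and objects are illustrative.]

    // In a WPT test file, after loading /resources/testharness.js,
    // /resources/testharnessreport.js, /resources/WebIDL2/webidl2.js
    // and /resources/idlharness.js:
    idl_test(
      ['geolocation'],   // spec whose extracted IDL is under test
      ['html', 'dom'],   // specs whose IDL is needed to resolve dependencies
      idl_array => {
        // Map interfaces to concrete objects so their members can be checked
        idl_array.add_objects({
          Geolocation: ['navigator.geolocation']
        });
      }
    );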

Dom: Another tool that you may have come across, developed by Kagami Rosylight, automatically updates the IDL in specs when there is a change in the WebIDL specification. For example, the recent void to undefined switch.
… Getting all specs that used "void" to switch to "undefined" would have been very difficult without it.
… That again is based on the list of specs identified by the processing tools.
… Another project from Kagami is linked to TypeScript to derive types from the WebIDL defined in the specifications.
… That is what is being explored right now in the PR linked from the slide.
… Again based on webref.
… TypeScript is a pretty popular variant of JS nowadays, so exposing all our API definitions to TypeScript developers helps the ecosystem.
… A tool I developed to explore the platform is Webidlpedia.
… As a spec writer, if you can reuse the naming conventions of other specs, that's always good.
… Being able to tell which specifications extend an interface or use a definition is also useful, to know which ones may need to change when you change that interface.
… Fuzz testing is being used more and more to make sure that implementations don't crash. It's also useful for developing validators.
… Surely, there are more uses than the ones we've bumped into.
… We see huge value out of this extraction work.
… We think that requires a bit more authoring work on the way things get defined.
… Also, there are lots of legacy specs, which may make cross-references hard.
… Heads-up that we may push in the next few months for updates in that direction.
… If you plan to remove a definition, having a way to tell which specs reference it would be useful; we're planning on making that feature readily available.
… Algorithms could be improved too, to specify the number and types of their arguments.
… Again, the goal is to have a more fine-grained view of dependencies.
… We think that there is some useful work to be done still to integrate with MDN BCD and WPT.
… One way to expose the data for developers is by packaging it.
… We've started to do it for IDL extracts, with CSS extracts to follow.
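
[Illustration: a sketch of consuming extracted IDL with the webidl2 parser. The IDL fragment and interface name are made up; the exact npm package names under which the webref extracts are published should be checked against the project's documentation.]

    // Sketch: parse an IDL fragment of the kind webref extracts from specs.
    // In practice the text would come from the published extract packages
    // rather than being inlined here.
    const { parse } = require('webidl2');

    const idl = `
      [Exposed=Window]
      interface HypotheticalWidget {
        undefined activate();  // "void" return types were renamed to "undefined"
      };
    `;

    const ast = parse(idl);
    console.log(ast[0].name);             // "HypotheticalWidget"
    console.log(ast[0].members[0].name);  // "activate"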

Dom: I also wanted to take the opportunity to mention that some of these projects are done as side projects.
… The slide links to two Open Collective pages, for Specref and ReSpec.
… You may have seen this XKCD on the fragility of foundational infrastructure.
… You can contribute to these Open Collective funds. But we're also looking for volunteers to help with the work.
… And to work with the various spec editors to help them adopt these new features.
… We thought we would end with a few questions (but also happy to take others!):
… Are you aware of other tools?
… Other useful scenarios to consider for the data?
… A finer-grained dependency map: any idea on how you would use it?
… Sometimes we get pushback about the difficulty of adopting the new features; feedback there would be appreciated.
… It's not clear, at least to me, that most of the groups and editors are aware of the evolution of the tools.
… Beyond this breakout session, spec-prod@w3.org is a mailing list that is used for that purpose, but it's not clear that many people are aware of it.
… Thanks.

Discussion

csarven: I've been using some of the tools to author specs, as well as using plain HTML. Maybe this is more feedback as an author on some of these tools.
… I find it a bit cumbersome still to have to set up the environment, and to learn the syntax and language that you need to respect.
… Was there any consideration to offer more WYSIWYG tools for spec authors?
… For instance, to help connect the parts of the requirements which could be referred to from test suites.
… Any interest in going in that direction?
… Move away from the command-line, and provide a richer experience.

Dom: Fantastic question. On the automatic inter-linking between specifications and the test suites, it's one of the ways I'm hoping to extend the uses of the extracts.
… I believe we can improve on what exists here.
… What I have in mind does not address your bigger point, on the editing experience.
… It is pretty hardcore today. I know some groups have been experimenting with using Google Docs as the source.
… Whether the W3C community would be interested, absolutely. What that tool would look like, I cannot say.
… I was daydreaming a bit while listening to one of the breakout sessions yesterday on an editing API for the browser. If we could use that to improve spec editing tools, that would be fantastic.

Jemma: I'm a co-chair of the authoring practices group.
… Recently, we've been discussing ways to improve accessibility of edited content.
… I see the benefit of linking with Can I Use-like projects. I see the value with e.g. ARIA roles.
… Link with Education and Outreach WG.

Dom: I mentioned that BCD was using our data, which is great. But they are not constrained by it.
… For ARIA practices, the roles are not defined in that document, but in a separate document, which will remain in the crawl.
… Also, I don't think that today there is good data on which screen reader supports which ARIA role.
… Getting more data on technology support would be a great addition to BCD and Can I Use, but collecting data is hard.

Jemma: OK, I'm kind of relieved that we could do that.

<Zakim> MikeSmith, you wanted to mention adding spec URLs to BCD

MikeSmith: I work for W3C. I'm also closely involved with the BCD work.
… One thing that we're working on currently for BCD is adding specification URLs.
… BCD does not currently show you, for a given feature, which spec and which section in the spec define that feature.

<MikeSmith> https://github.com/w3c/mdn-spec-links

<MikeSmith> https://github.com/w3c/browser-compat-data

MikeSmith: We have that data in MDN, and what I've been doing is running a mechanism that does old-fashioned scraping to collect the URLs.
… And we have a fork of BCD where the only diff is the URL.
… Ultimately, I set this up so that annotations in the spec that link to MDN data also link to BCD.

<MikeSmith> https://github.com/mdn/browser-compat-data/issues/6765#issuecomment-717855656

<Jemma> These are great info, Mike

MikeSmith: But the data needs to be manually checked, otherwise it's garbage in, garbage out. Foolip looked at CSS Flexbox and found a number of issues; see the issue comment I just pasted on IRC.
… There is an opportunity here to try to figure out how we can automate this spec_url and make sure that things are linked to the right place.
… How do we provide some mechanism for spec editors (through Bikeshed and ReSpec) to be able to structure the content so that this all works?
… Given a feature, we would like to have only one spec_url. In some cases, there are multiple signatures.
… How should we handle that better?
… Another opportunity to do something interesting with automation in MDN and BCD.
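
[Illustration: roughly what a BCD entry carrying the proposed spec_url field would look like. The structure (__compat, support, status) follows BCD's JSON schema; the feature, versions and URL are invented values, not actual BCD content.]

    // Illustrative BCD-style entry with the proposed spec_url field.
    const entry = {
      css: {
        properties: {
          "hypothetical-property": {
            __compat: {
              spec_url: "https://drafts.csswg.org/css-example-1/#hypothetical-property",
              support: {
                chrome:  { version_added: "86" },
                firefox: { version_added: "82" }
              },
              status: { experimental: false, standard_track: true, deprecated: false }
            }
          }
        }
      }
    };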

Jemma: This is really great information. I'm not sure whether EOWG is using ReSpec; if they switch away from it, will they lose that ability? Also, what's the timeline for this?

MikeSmith: I cannot answer the authoring question.
… The advice to WGs is to be careful about tooling that may not be giving them the right automation. That's probably less of a concern for guidelines docs.
… We could just dump the URL in BCD. The reason we're not doing that is that some of these URLs are actually not good enough, so we want to improve on that.
… Other useful advice: have someone designated in a group to make sure that the links between specs and BCD, etc. are done correctly.
… Some things don't get done because no one's putting them as priorities.

<Jemma> Thanks for your answer, Mike.

Dom: What Mike was saying applies more broadly to the whole ecosystem. Happy to serve as point of contact there. Lots of really interesting discussions to be had.

<Jemma> I may want to be that contact person. ;-)

Dom: Your help, your input on all this would be more than welcome.

<Jemma> Thanks for the session.

<csarven> Thanks dom tidoust. As you asked, we've been working on https://dokie.li/ - mostly in the context of the Solid project. One way of looking at dokieli is along the lines of Amaya. It is a client-side authoring and publishing tool for articles, annotations, notifications, etc. While it wasn't used in its entirety to author the LDN spec: https://www.w3.org/TR/ldn/ - the underlying HTML+RDFa is what dokieli can output. Still a tonne of work ahead

<csarven> to make it more accessible and usable. This is along the lines of what I was thinking regarding my question about the possibility of transitioning to an environment that may be preferable to authors and editors, and any contributor for that matter. Note that the LDN test suite and the implementation reports that we've collected refer to specific requirements in the spec, and they are all self-describing. Perhaps an overview diagram will help:

<csarven> https://csarven.ca/media/images/articles/linked-specifications-reports.svg . All of the URLs in that diagram are in HTML+RDFa - you can parse and serialize them as you like (e.g. Turtle, JSON-LD). I wrote an article on this if you'd like more detail: https://csarven.ca/linked-specifications-reports - e.g. there is detail on how EARL is used. If there is room for a dokieli-like kind of thing in the W3C spec ecosystem, it'd be great to exchange notes.

<pchampin> hi Sarven :)

<dom> csarven, thanks for the reference - will look into dokie.li!

Minutes manually created (not a transcript), formatted by scribe.perl version 124 (Wed Oct 28 18:08:33 2020 UTC).

Diagnostics

Succeeded: s/Fecth/Fetch

Maybe present: Dom, MikeSmith