
Reffy - a tool to analyse Web specifications

Presenter: François Daoust
Duration: 4 min

Hello, world! This is an intro to Reffy, a command-line tool that allows you to extract and analyze content from Web specifications.

You need Node.js as a prerequisite. Then you can install Reffy globally through npm.

npm install -g reffy
    

So, the first question is, of course, "why would you want to do that?" Well, let's say that you want to extract the Web IDL that WebTransport defines to run some analysis on it.

I can say: "Reffy, please crawl the spec whose shortname is webtransport, run the IDL extraction module on it. And don't bother wrapping the result in some complex JSON structure".

reffy --spec webtransport --module idl --terse
    

There you go! Of course, most specs have an IDL index these days, so for one spec, you could just as well do a copy-and-paste from the spec directly. The power of Reffy is that it can run extraction modules on multiple specs at once. For instance, here, I'm going to create a dump of the IDL defined in all media capture specs.

reffy --spec screen-capture mediacapture-streams mediacapture-fromelement \
        image-capture html-media-capture mediacapture-depth --module idl \
        | jq -r '.[].idl.idl' | less
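For readers who prefer to post-process the crawl output in Node.js rather than with jq, here is a minimal sketch of the same extraction. It assumes, based only on the jq filter above, that the output is an array of spec entries with the IDL text under `idl.idl`. The sample entries and the `dumpIdl` name are hypothetical. Unlike the jq filter, this version skips entries that have no IDL.

```javascript
// Hypothetical crawl results. The idl.idl nesting is inferred from the
// jq filter used in the demo, not from Reffy's documentation.
const results = [
  { idl: { idl: "interface A {};" } },
  {}, // a spec without any IDL
  { idl: { idl: "interface C {};" } },
];

// Collect the IDL text of each entry that has some, one entry per line.
function dumpIdl(entries) {
  return entries
    .map((entry) => entry.idl?.idl)
    .filter(Boolean)
    .join("\n");
}

console.log(dumpIdl(results));
```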
    

Reffy includes processing modules to extract CSS definitions, IDL, markup elements, concept definitions, headings, links, and references. But you can also provide your own processing module. Let's give it a try!

Our manual of style – yes, we do have a manual of style – recommends avoiding contractions such as "don't". Can we quickly find out which specs use "don't" or "doesn't"?

To do that, I'm going to create a contractions.mjs module. The module exports a function that will be run against the loaded spec in a browser. That function must return something that can be serialized as JSON. I kept it really basic, just some regular expression magic.

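// Runs in the browser against the loaded spec.
// The return value must be serializable as JSON.
export default function () {
  // Match "don't" or "doesn't" (straight or curly apostrophe),
  // case-insensitive, when surrounded by whitespace.
  const occurrences = document.body.textContent
    .match(/\sdo(?:es)?n['’]t\s/gi);
  // match() returns null when nothing is found: the result is then undefined.
  return occurrences?.length;
}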
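To see what that regular expression does and does not catch, here is a small sketch that applies it to plain strings in Node.js instead of `document.body.textContent`. The `countContractions` name and the sample sentences are made up for illustration.

```javascript
// Same regular expression as in contractions.mjs, applied to a string.
function countContractions(text) {
  const occurrences = text.match(/\sdo(?:es)?n['’]t\s/gi);
  // match() returns null when nothing matches, hence the optional chaining.
  return occurrences?.length;
}

// Hypothetical sample text, not taken from any spec.
const sample = "Agents don't cache this. It doesn’t matter. Just don't!";

// Prints 2: the last "don't" is followed by "!" and not by whitespace.
console.log(countContractions(sample));

// Prints undefined: there is nothing to count.
console.log(countContractions("No contractions here."));

// Prints {}: an undefined value is dropped when serialized as JSON.
console.log(JSON.stringify({ contractions: countContractions("none") }));
```

Since the expression requires whitespace on both sides, a contraction followed by punctuation is not counted, so the reported numbers are a lower bound.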
    

Now let's give it a try on the Web App Manifest spec, because why not? If it works, I will run it on all specs.

reffy -s appmanifest -m contractions.mjs
    

It works! See the contractions property at the end of the structure? It tells us that the spec has 21 occurrences of "don't" or "doesn't". Bad spec!

reffy -m contractions.mjs \
 | jq -r "sort_by(.contractions) | .[] | select(.contractions) \
   | (.contractions | tostring) + \" found in \" + .title + \" - \" + .crawled" > result.txt
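The jq filter above sorts the results by number of contractions, keeps the specs that have some, and formats one line per spec. The same steps can be sketched in JavaScript as follows, assuming that the crawl output is an array of entries with `contractions`, `title` and `crawled` properties (the properties the jq filter reads). The sample entries and the `report` name are hypothetical.

```javascript
// Hypothetical crawl results, shaped after the properties the jq filter uses.
const results = [
  { title: "Spec A", crawled: "https://example.org/a/", contractions: 21 },
  { title: "Spec B", crawled: "https://example.org/b/" }, // no contractions
  { title: "Spec C", crawled: "https://example.org/c/", contractions: 3 },
];

// Keep specs with contractions, sort by ascending count, format one line each.
function report(entries) {
  return entries
    .filter((spec) => spec.contractions) // filter() copies, input is untouched
    .sort((a, b) => a.contractions - b.contractions)
    .map((spec) =>
      `${spec.contractions} found in ${spec.title} - ${spec.crawled}`);
}

console.log(report(results).join("\n"));
```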
    

Reffy knows about 450 specs or so that roughly match what browsers implement. Crawling takes time, like 5 to 10 minutes. In the interest of time, I'm skipping it. The final report tells me that some specs contain lots of "don't" and "doesn't". There may be more urgent things to fix in our specs, but at least I know which specs don't, I mean "do not", follow the guideline.

You may also pass the URL of a spec that Reffy knows absolutely nothing about. Typically, Reffy knows nothing about the Manual of Style:

reffy -s https://w3c.github.io/manual-of-style/ -m contractions.mjs \
 | jq -r "sort_by(.contractions) | .[] | select(.contractions) \
   | (.contractions | tostring) + \" found in \" + .title + \" - \" + .crawled"
    

Ahem. It seems the Manual of Style does not follow its own guidelines. Oh well...

I created a processing module of very limited interest here. The point that I'm trying to make is that specs have a lot of content that may be worth processing for some reason. Reffy gives you the power to extract the info you need from the specs.

Not convinced that this can be of any use? What if I told you that, today, Reffy powers Webref, huh? OK, in turn, Webref is used:

  • in Web Platform Tests to maintain IDL tests;
  • in TypeScript to maintain IDL types;
  • in ReSpec (and soon Bikeshed in theory) to fill the cross-reference database;
  • and perhaps soon in MDN to maintain CSS property definitions.

If you want to learn more about Reffy, I encourage you to read the tool's help:

reffy --help
    

Or to check Reffy's open source code on GitHub.

Things remain clunky here and there. Please report issues that you may encounter! Thanks for watching and happy spec processing!
