W3C

Data interchange problems come in all sizes

I had a pretty small data interchange problem the other day: I just wanted to archive some playlists that I had compiled using various Music Player Daemon (mpd) clients. The mpd server stores playlists as simple m3u files, i.e. line-oriented files with a path to the media file on each line. But that’s too fragile for archival and interchange purposes.
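To make the fragility concrete, here is a minimal Python sketch of reading such a file. It is not the code from m3uin.py; the function names are hypothetical, and the `Artist/Album/Title.ext` directory layout is purely an assumption about how a music library might be organized. The point is that the only metadata an m3u file carries is whatever can be guessed from the path.

```python
from pathlib import PurePosixPath


def read_m3u_paths(text):
    """Return the media paths listed in the text of an m3u playlist."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        # Extended m3u files add '#' directive lines (e.g. #EXTM3U); skip
        # those and blank lines, keeping only the path lines.
        if not line or line.startswith("#"):
            continue
        paths.append(line)
    return paths


def guess_track_info(path):
    """Guess (artist, album, title) from a path.

    Assumes a hypothetical 'Artist/Album/Title.ext' layout; returns None
    when the path is too short to support even that guess.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        return None
    artist, album, filename = parts[-3:]
    return artist, album, PurePosixPath(filename).stem
```

Rename a directory or move the library and every guess changes or fails, which is why an archive format should state the artist, album, and title explicitly.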

I had a similar problem a while back with iTunes playlists. In that episode, I chose hAudio, an HTML dialect in progress in the microformats community, as my target.

Unfortunately, hAudio changed out from under me between when I started and when I finished. So this time, a simple search found the music ontology and I tried it with RDFa, which lets you use any RDF vocabulary in HTML*. I’m mostly pleased with the results:

  1. from A Song’s Best Friend_ The Very Best Of John Denver [Disc 1]
    by John Denver
    Poems, Prayers And Promises
  2. from WOW Worship (orange)
    by Compilations
    Did you Feel the Mountains Tremble
  3. from Family Music Party
    by Trout Fishing In America
    Back When I Could Fly

The album names come before the track names because I didn’t read enough of the RDFa primer when I was coding; RDFa includes @rev as well as @rel, which reverses the subject/object order of a statement. See an advogato episode on m3uin.py for details about the code.
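A hedged sketch of what the @rev approach could look like, written as a small Python generator rather than taken from m3uin.py: the track is the outer subject, so its title can come first, and `rev="mo:track"` says that the nested record has this track, instead of the track pointing at the record. The vocabulary terms (`mo:Track`, `mo:Record`, `mo:track`, `mo:MusicArtist`, `foaf:maker`, `foaf:name`, `dc:title`) are my reading of the Music Ontology, FOAF, and Dublin Core; the function name is hypothetical, and the markup has not been checked against an RDFa parser.

```python
from html import escape

# Prefixes the fragment relies on; in RDFa these would be declared with
# xmlns:* attributes on an enclosing element. Listed here as assumptions.
PREFIXES = {
    "mo": "http://purl.org/ontology/mo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def track_item(title, artist, album):
    """Return one list item describing a track, title first.

    The nested typeof elements introduce new resources for the artist
    and the record; rev="mo:track" makes the record the subject and
    the enclosing track the object of the mo:track statement.
    """
    return (
        '<li typeof="mo:Track">'
        f'<span property="dc:title">{escape(title)}</span>'
        ' by <span rel="foaf:maker"><span typeof="mo:MusicArtist">'
        f'<span property="foaf:name">{escape(artist)}</span></span></span>'
        ' from <span rev="mo:track"><span typeof="mo:Record">'
        f'<span property="dc:title">{escape(album)}</span></span></span>'
        "</li>"
    )
```

Calling `track_item("Back When I Could Fly", "Trout Fishing In America", "Family Music Party")` yields an item that reads title, artist, album in that order, while still stating that the record contains the track.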

The Music Ontology was developed by a handful of people who staked out a claim in URI space (http://musicontology.org/...) and happily took comments from as big a review community as they could manage, but they had no obligation to get a really global consensus. The microformats process is intended to reach a global consensus so that staking out a claim in URI space is superfluous; it works well given certain initial conditions about how common the problem is and the availability of pre-web designs to draw from. Perhaps playlists (and media syndication, as hAudio seems to be expanding in scope to hMedia) will eventually reach these conditions, but the Music Ontology already meets my needs, since I’m the sort who doesn’t mind declaring my data vocabulary with URIs.

My view of Web architecture is shaped by episodes such as this one. While giga-scale deployment is always impressive and definitely something we should design for, small scale deployment is just as important. The Web spread, initially, not because of global phenomena such as Wikipedia and Facebook but because you didn’t need your manager’s permission to try it out; you didn’t even need a domain name; you could just run it on your LAN or even on just one machine with no server at all.

In an Oct 2008 tech plenary session on web architecture, Henri Sivonen said:

I see the Web as the public Web that people can access. The resources you can navigate publicly. I define Web as the information space accessible to the public via a browser.
If a mobile operator operates behind walls, this is not part of the Web.

I can’t say that I agree with that perspective. I’m no great fan of walled gardens either, but freedom means freedom to do things we don’t like as well as freedom to do things we do like. And architecture and policy should have a sort of church-and-state separation between them.

Plus, data interchange happens not just at planetary scale, but also within mobile devices, across devices, and across communities and enterprises of all shapes and sizes.

* I’ve gone a little outside the scope of current standards; RDFa has only been specified for use in modular XHTML, with the application/xhtml+xml media type, so far.


See also: