Palm webOS approach to HTML extensibility: x-mojo-*

I got pretty excited about the iPhone,
and even more about the openness of Android and the G1, and then I
learned that the Palm Pre developer platform is basically just the open
web platform: HTML, CSS, and JavaScript.

Just after the mobile buzz at Web Directions North, and just after the TAG declared victory on how to build The Self-Describing Web with URI-based Extensibility, I got some details on how Palm is building on the open web platform:

A widget is declared within your HTML as an empty div with an x-mojo-element attribute.

<div x-mojo-element="ToggleButton" id="my-toggle"></div>

Oh great; x- tokens… aren’t those passé by now?

The suggestion in the HTML 5 draft is data-* attributes. The ARIA draft suggests @role. The Palm design looks like new information for issue-41, Decentralized-extensibility, in the HTML WG.
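For comparison, here is roughly how the same widget declaration might look under those alternatives (illustrative markup only; these spellings are my guesses, not anything Palm or the working groups have specified):

<!-- HTML 5 draft style: a data-* attribute -->
<div data-mojo-element="ToggleButton" id="my-toggle"></div>

<!-- ARIA style: a role plus a state attribute -->
<div role="button" aria-pressed="false" id="my-toggle"></div>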

Anybody know how frozen the Palm design is? Or if they looked at ARIA, data-* or URI-based namespaces?

JavaScript required for basic textual info? TRY AGAIN

Sam says he’s Online and Airborne. “Needless to say, this is seriously cool.” I’ll say! But when I follow the link to details from the service provider, I get:

Sorry. You must have JavaScript enabled to view this page. Click the
BACK button below or enable JavaScript in your browser preferences and
click TRY AGAIN.

Let’s turn that around, shall we? Sorry, if you’re a network provider and you want my business, read up on unobtrusive JavaScript (aka the rule of least power), go BACK to work on your web site design and TRY AGAIN.
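For the record, the unobtrusive pattern isn’t hard; a minimal sketch (hypothetical markup, obviously not the provider’s actual page):

<!-- The basic textual info is plain HTML, readable by any agent: -->
<p id="status">Flight 123: delayed 20 minutes, gate B7.</p>

<script type="text/javascript">
// Script, where available, merely enhances what is already there;
// nothing essential breaks when it is disabled.
document.getElementById("status").className = "live";
</script>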

How to evaluate Web Applications security designs?

I could use some help getting my head around security for Web
Applications and mashups.

The first time someone told me W3C should be working on specs to help
the browser prevent sensitive data from leaking out of enterprises, I
didn’t get it. “Use the browser as part of the trusted computing base?
Are you kidding?” was my response. I didn’t see the bigger picture.
Crockford explains in an April 2008 item:

… there are multiple interests involved in a web
application. We have here the interests of the user, of the site, and
of the advertiser. If we have a mashup, there can be many more
interests.

Most of my study of security protocols concentrated on whether a
request from party A should be granted by party B. You know, Alice and
Bob. Using BAN logic to analyze the Kerberos protocols was very interesting.

I also enjoyed studying capability security and the E system, which is
a fascinating model of secure multi-party communication (not to
mention lockless concurrency), though it seems an impossibly high bar
to reach, given the worse-is-better tendency in software deployment,
and it seemed to me that capabilities are a poor match for the way
linking and access control work in the Web:

The Web provides several mechanisms
to control access to resources; these mechanisms do not rely on
hiding or suppressing URIs for those resources.

On the other hand, after wrestling with the patchwork of JavaScript
security policies in browsers in the past few weeks, the capability
approach in ADsafe looks simple and elegant by comparison. Is there any
chance we can move the state of the art that far? And what do we do in
the meantime? Crockford’s Jan 2008 post is quite critical of W3C’s current
work:

This same sort of wrong-end-of-the-network thinking can be seen today
in the HTML 5 working group’s crazy XHR access control language.

Access Control for Cross-Site Requests
is a mouthful, and “Access Control” is too generic, which leads to “W3C
Access Control”. Didn’t we already go through this with “W3C XML
Schema”? Generic names are awkward. I think I’ll call it WACL…
yeah… rhymes with spackle… let’s see if it sticks. Anyway…

Crockford’s comment cites his proposal and argues…

JSONRequest
does not allow the server to abdicate its responsibility of deciding if
the data should be delivered to the browser. Therefore, no policy
language is needed. JSONRequest requires explicit authorization.
Cookies and other tokens of ambient authority are neither sent nor
delivered.

I’m not sure I understand that. I’m glad to learn there’s more to
the difference between XMLHttpRequest and JSONRequest than just
<pointy-brackets> vs {curly-braces}, but I’d like to understand
better how “ambient authority” relates to the interests of users,
sites, advertisers, and the like.
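My tentative reading, sketched as code (hypothetical endpoints; withCredentials per the cross-site XHR drafts, JSONRequest per Crockford’s proposal, roughly, which no browser ships):

// Cross-site XMLHttpRequest under the W3C draft can carry the user's
// cookies automatically (ambient authority), so a page the user visits
// may be able to borrow the user's standing with other sites:
var xhr = new XMLHttpRequest();
xhr.open("GET", "http://bank.example.com/balance", true);
xhr.withCredentials = true;   // cookies and auth headers ride along
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) { /* use xhr.responseText */ }
};
xhr.send(null);

// JSONRequest never sends cookies; any credential must appear explicitly
// in the payload, so the server must decide for itself whether this
// particular request deserves the data:
JSONRequest.post("http://bank.example.com/balance",
                 {sessionToken: "..."},    // explicit, not ambient
                 function (seq, value, exception) { /* use value */ });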

In response, the FAQ in the WACL spec says:

JSONRequest has been considered by the Web Applications Working
Group and the group has concluded that it does not meet the documented
requirements. E.g., requests originating from the JSONRequest API
cannot include credentials and JSONRequest is format specific.

Including credentials seems more like a solution than a
requirement; can someone help me understand how it relates to the
multiple interests involved in a web application?

Caching XML data at install time

The W3C web server is spending most of its time serving DTDs to
various bits of XML processing software. In a follow-up comment on an item on DTD traffic, Gerald says:

To try to help put these numbers into perspective, this blog post
is currently #1 on slashdot, #7 on reddit, the top page of
del.icio.us/popular , etc; yet www.w3.org is still serving more than
650 times as many DTDs as this blog post, according to a 10-min
sample of the logs I just checked.

Evidently there’s software out there that makes heavy use of the
DTDs at W3C, fetching a new copy over the Web for each use. As
far as this software is concerned, these DTDs are just data files,
much like the timezone database your operating system uses to convert
between UTC and local times. The tz database
is updated with respect to changes by various jurisdictions from time
to time and the latest version is published on the Web, but your
operating system doesn’t go fetch it over the Web for each use. It
uses a cached copy. A copy was included when your operating system
was installed and your machine checks for updates once a week or so
when it contacts the operating system vendor for security updates and
such. So why doesn’t XML software do likewise?

It’s pretty easy to put together an application out of components
in such a way that you don’t even realize that it’s fetching DTDs
all the time. For example, if you use xsltproc like this…

$ xsltproc agendaData.xsl weekly-agenda.html >,out.xml

… you might not even notice that it’s fetching the DTD and several
related files. But with a tiny HTTP proxy, we can see the traffic. In one window, start the proxy:

$ python TinyHTTPProxy.py
Any clients will be served...
Serving HTTP on 0.0.0.0 port 8000 ...

And in another, run the same XSLT transformation with a proxy:

$ http_proxy=http://127.0.0.1:8000 xsltproc agendaData.xsl weekly-agenda.html

Now we can see what’s going on:

connect to www.w3.org:80
localhost - - [05/Sep/2008 15:35:00] "GET http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd HTTP/1.0" - -
connect to www.w3.org:80
localhost - - [05/Sep/2008 15:35:01] "GET http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent HTTP/1.0" - -
bye
bye
connect to www.w3.org:80
localhost - - [05/Sep/2008 15:35:01] "GET http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent HTTP/1.0" - -
bye
connect to www.w3.org:80
localhost - - [05/Sep/2008 15:35:01] "GET http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent HTTP/1.0" - -
bye

This is the default behaviour of xsltproc, but
it’s not the only choice:

  • You can run xsltproc --novalid, which tells it to skip DTDs altogether.
  • You can set up an XML catalog as a form of local cache.

To set up this sort of cache, first grab copies of
what you need:

$ mkdir xhtml1
$ cd xhtml1/
$ wget http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
--15:29:04--  http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
=> `xhtml1-transitional.dtd'
Resolving www.w3.org... 128.30.52.53, 128.30.52.52, 128.30.52.51, ...
Connecting to www.w3.org|128.30.52.53|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32,111 (31K) [application/xml-dtd]
100%[====================================>] 32,111       170.79K/s
15:29:04 (170.65 KB/s) - `xhtml1-transitional.dtd' saved [32111/32111]
$ wget http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
...
$ wget http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent
...
$ wget http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
...
$ ls
xhtml1-transitional.dtd  xhtml-lat1.ent  xhtml-special.ent  xhtml-symbol.ent

And then in a file such as
xhtml-cache.xml, put a little catalog:

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<rewriteURI
uriStartString="http://www.w3.org/TR/xhtml1/DTD/"
rewritePrefix="./" />
</catalog>

Then point xsltproc to the catalog file and try it again:

$ export XML_CATALOG_FILES=~/xhtml1/xhtml-cache.xml
$ http_proxy=http://127.0.0.1:8000 xsltproc agendaData.xsl weekly-agenda.html

This time, the proxy won’t show any traffic. The data was all
accessed from local copies.
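As a cross-check on the catalog itself, the xmlcatalog tool that ships with libxml2 should resolve a URI against it, printing the rewritten form; something like:

$ xmlcatalog ~/xhtml1/xhtml-cache.xml http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
./xhtml1-transitional.dtd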

While XSLT processors such as xsltproc and Xalan have no
technical dependency on the XHTML DTDs, I suspect they’re used with
XHTML enough that shipping copies of the DTDs along with the XSLT
processing software is a win all around. Or perhaps the traffic comes
from the use of XSLT processors embedded in applications, and the DTDs
should be shipped with those applications. Or perhaps shipping the
DTDs with underlying operating systems makes more sense. I’d have to
study the traffic patterns more to be sure.

p.s. I’d rather not deal with DTDs at all; newer schema technologies make them obsolete as far as I’m concerned. But

  • some systems were designed before better schema technology came along, and W3C’s commitment to persistence applies to those systems as well, and
  • the point I’m making here isn’t specific to DTDs; catalogs work for all sorts of XML data, and the general principle of caching at install time goes beyond XML altogether.

The details of data in documents: GRDDL, profiles, and HTML5

GRDDL, a mechanism for putting RDF data in XML/XHTML documents, is
specified mostly at the XPath data model level. Some GRDDL software
goes beyond XML and supports HTML as she are spoke, aka tag soup. HTML 5 is intended to standardize the connection between tag soup and XPath. The tidy use case for GRDDL anticipates that using HTML 5 concrete syntax rather than
XHTML 1.x concrete syntax involves no changes at the XPath level.

But in GRDDL and HTML5,
Ian Hickson, editor of HTML 5, advocates dropping the profile attribute
of the HTML head element in favor of rel=”profile” or some such. I
dropped by the #microformats channel to think out loud about this stuff, and Tantek said similarly, “we may solve this with rel=”profile” anyway.” The rel-profile topic in the microformats wiki shows the idea goes pretty far back.
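In markup, the difference looks roughly like this (a sketch only; the rel=”profile” spelling is still just a proposal), using the GRDDL profile URI as the example:

<!-- head/@profile, per XHTML 1.x and GRDDL today -->
<head profile="http://www.w3.org/2003/g/data-view">
...
</head>

<!-- rel="profile", as advocated for HTML 5 -->
<head>
<link rel="profile" href="http://www.w3.org/2003/g/data-view">
...
</head>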

Possibilities I see include:

  • GRDDL implementations add support for rel=”profile” along with HTML 5 concrete syntax.
  • GRDDL
    implementors don’t change their code, so people who want to use GRDDL
    with HTML 5 features such as <video> stick to XML-wf-happy HTML 5
    syntax and they use the head/@profile attribute anyway, despite what
    the HTML 5 spec says.
  • People who want to use GRDDL stick to XHTML 1.x.
  • People who want to put data in their HTML documents use RDFa.

I don’t particularly care for the rel=”profile” design, but one should
choose one’s battles, and I’m not inclined to choose this one. I’m
content for the market to choose.

life without MIME type sniffing?

In a recent item on IE8 Security, Eric Lawrence, Security Program Manager for Internet Explorer, introduced a work-around to the security risks associated with content-type sniffing: an authoritative=true parameter on the Content-Type header in HTTP. This re-started discussion of the content-type sniffing rules and the Support Existing Content design principle of HTML 5. In response to a challenge asking for evidence that supporting existing content requires sniffing, Adam made a suggestion that I’d like to pass along:

I encourage you to build a copy of Firefox without content sniffing
and try surfing the web. I tried this for a while, and I remember
there being a lot of broken sites …
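Returning to Eric Lawrence’s proposal for a moment: as I understand it, a server that wants its declared type taken at face value would send something like

HTTP/1.1 200 OK
Content-Type: text/plain; authoritative=true

and the browser would then render the response strictly as plain text, with no sniffing.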

Adam’s suggestion reminded me of an idea I heard in TAG discussions of MIME types and error recovery: a browser mode for “This is my content, show me problems rather than fixing them for me silently.”

Though Adam offered a patch, building Firefox is not something I have mastered yet, so I’m interested to learn about run-time configuration options in IE (notes Julian) and Opera (notes Michael). Eric Lawrence’s reply points out:

Please do keep in mind, however, that most folks (even the ultra-web engaged on these lists) see but a small fraction of the web, especially considering private address space/intranets, etc.

A report from one developer suggests there’s light at the end of the tunnel, at least for sniffing associated with feeds:

I did, partly as an experiment, stop sniffing text/plain in the latest release of SimplePie (which, inevitably, isn’t the nicest of things to do, seeming there are tens of thousands of users). Next to nothing broke. I know for a fact this couldn’t have been done a year or two ago: things have certainly moved on in terms of the MIME types feeds are served with …

If you get a chance to try life without MIME type sniffing, please let us know how it goes.

Syntax for ARIA: Cost-benefit analysis

1. Introduction

This analysis is intended to be neutral with respect to ideology,
history and constituency. For a useful overview of how we got here, see WAI-ARIA Implementation Concerns (member-only link) by Michael Cooper.

The W3C’s WAI PF Working Group recently published the first
public working draft of the Accessible Rich Internet Applications (WAI-ARIA) specification, which “describes mappings of user interface controls and navigation to accessibility APIs”.

The ARIA spec. defines roles, states and properties to manage the interface
between rich web documents and assistive technologies. The primary expression
of roles, states and properties in markup languages is via attributes. Since
ARIA is meant to augment web applications across a range of languages and user
agents, ARIA has to specify how its vocabulary of attributes and values can be
integrated into both existing and future languages.

In preparing this analysis, I have reviewed the available concrete evidence
bearing on the matter, and have carried out a considerable amount of work to
replicate and, in some cases, correct or extend, testing which has been done in the
past. The details are available in a report entitled Some test results concerning ARIA attribute syntax.

2. The core issue: How should the ARIA attributes be spelled?

ARIA is useful only if it is widely supported. It therefore needs to
integrate as cleanly as possible into existing and future languages. Before looking at possible answers to the spelling question, we need to consider
exactly what supporting ARIA means.

We can distinguish two levels of support for ARIA on the part of user
agents, which I’ll call ‘passive’ and ‘active’ support. By passive support, I
mean that documents with ARIA-conformant markup are not rejected by the agent,
and the markup is available in the same way any other markup is, e.g. via a DOM
API or for matching by CSS selectors. By ‘active’ support
I mean the user agents actually implement their part of ARIA semantics, that is, reflecting changes to ARIA-defined states and properties via
accessibility APIs.

Although already deployed implementations cannot offer active support, an
optimal answer to the spelling question would maximise passive support from
existing languages, as well as encouraging active support from subsequent implementations.

3. Possible approaches: land-grab, colon or dash

There are in principle three possible approaches to the spelling question:

  • land-grab  Just use ‘role’ and the names of the properties (e.g.
    ‘checked’, ‘hidden’) as attribute names.
  • colon  Use ‘aria:’ as a distinguishing prefix, giving e.g. ‘aria:role’,
    ‘aria:checked’ as attribute names.
  • dash  Use ‘aria’ plus some other punctuation character, e.g.
    dash, as a distinguishing prefix, giving e.g. ‘aria-role’,
    ‘aria-checked’ as attribute names.
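For concreteness, here is how a single checked-checkbox widget might be spelled under each approach (illustrative markup only):

<!-- land-grab -->
<div role="checkbox" checked="true">...</div>

<!-- colon; in an XML host, with the prefix bound to the ARIA namespace -->
<div aria:role="checkbox" aria:checked="true"
     xmlns:aria="http://www.w3.org/2005/07/aaa">...</div>

<!-- dash -->
<div aria-role="checkbox" aria-checked="true">...</div>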

The land-grab approach is pretty clearly unacceptable, because
of clashes with existing vocabularies and the likelihood
of clashes with future ones, and will not be considered further.

The current ARIA WD specifies a combination of the colon and
dash approaches: the colon approach is specified for use with XML-based
languages, with the additional requirement that ‘aria’ be bound to
the ARIA namespace in the usual way, i.e.
xmlns:aria="http://www.w3.org/2005/07/aaa", and the
dash approach is specified for use with non-XML languages. We’ll
call this the mixed approach hereafter.

My understanding is that, as of the date of this note, the WAI PF working
group have indicated that their intention is that the next draft of
the ARIA specs will move to the dash approach.

4. The status quo: languages and implementations

Choosing an approach is made complicated by the landscape of language
and infrastructure standards it has to fit into, and by the fact that these are
moving targets. We therefore have to distinguish between what is currently
in place, what we have reason to expect in the near future, and what we can
foresee in the longer term. Furthermore, for existing languages we have
two categories: XML-based languages, with more or less explicit provision for
extensibility in general, typically namespace-based, and non-XML languages,
which for the purposes of this analysis we will take to be HTML 4.01 and nothing else.

As noted above, the best we can expect from deployed user agents is passive
support. The table below sets out the extent of passive support which is
available for the colon and dash approaches for each
of three host languages, which exemplify the major relevant categories: HTML
4.01 (for the non-XML languages), XHTML (an XML language, but not always treated
as such, so it gets two entries below) and SVG (only an XML language).

Passive support:

Allowed at all
  • HTML 4.01: colon: Yes, by ‘should ignore’ advice; dash: Yes, by ‘should ignore’ advice
  • XHTML (as if HTML)[0]: colon: Yes, by ‘should ignore’ advice; dash: Yes, by ‘should ignore’ advice
  • XHTML (as XML): colon: Yes, by ‘must ignore’ rule; dash: Yes, by ‘must ignore’ rule
  • SVG: colon: Yes, by ‘must ignore’ rule; dash: in principle, no; in practice[1], yes

Available via DOM
  • HTML 4.01: colon: Yes, via GetAttribute; dash: Yes, via GetAttribute
  • XHTML (as if HTML)[0]: colon: Yes, via GetAttribute; dash: Yes, via GetAttribute
  • XHTML (as XML): colon: Yes[2], via GetAttributeNS and GetAttribute; dash: Yes[2], via GetAttribute
  • SVG: colon: Yes[3], via GetAttributeNS and GetAttribute; dash: Yes[3], via GetAttribute

Matches CSS selector
  • HTML 4.01: colon: Yes[4], using [aria\:attr]; dash: Yes[5]
  • XHTML (as if HTML)[0]: colon: Yes[4], using [aria\:attr]; dash: Yes[5]
  • XHTML (as XML): colon: Yes, using [aria|attr]; dash: Yes[5]
  • SVG: colon: No; dash: No

Notes:

  • [0] This entry applies to the IE family, and to other browsers
    whenever treating XHTML as HTML
  • [1] Firefox 2.0.0.14, IE7 + Adobe 3.03 SVG plugin
  • [2] All browsers which treat XHTML as XML
  • [3] Firefox 2.0.0.14 (unable to test IE+plugin so far)
  • [4] Except the IE family
  • [5] If attribute selectors are supported at all, i.e. not IE5 or IE6

It should be noted that some of the entries above disagree with assertions
made in the past about browser behaviour. At least some of those assertions
were based on flawed test materials—see the discussion
of experiments 1 and 2
in my testing report for details on the information
summarised above.

5. The near future

A number of browser implementors have responded positively to the ARIA
initiative and have included experimental active support for ARIA in pre-release
versions of their products. Most of the test materials and implementation
information I can find suggests that only the dash approach, and only
HTML or XHTML, are currently being implemented.

With regard to improving passive support, it seems very possible that
IE8 will support attribute selectors of the form [aaa\:checked],
which would remove the qualification recorded in the table above by footnote 4.

5.1. HTML5

The situation with respect to HTML5 is complicated. As it
currently stands, the HTML5 draft specification supports namespaces
internally, and all HTML elements are parsed into DOM nodes in the
HTML namespace, regardless of whether they are parsed “as HTML” or
“as XML”. But when parsing documents “as HTML”, no other namespaces
are recognised. Unless this changes before HTML5 is completed, the
HTML 4.01 and “XHTML (as if HTML)” entries above will apply to
HTML5-conformant user agents in at least some cases.

6. Cost-benefit analysis

On the basis of the above survey, there follows below an attempt at a
cost-benefit analysis with respect to the colon and
dash approaches, as well as the mixed
approach as currently specced in the ARIA working draft and a fourth approach,
proposed by me in a message to www-tag, which I’ll call the xcolon approach.
The xcolon approach attempts to address some of the problems
revealed in the passive support table by defining a
pair of getter/setter JavaScript functions for access to ARIA information in the
DOM, and giving a design pattern for duplicated CSS selectors (one using
[aria\:xxx] and the other [aria|xxx]).
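To make the xcolon approach concrete, here is a minimal sketch of the sort of accessor pair I have in mind (my own illustration, not normative code):

// Read an ARIA property under whichever spelling the DOM actually holds.
function ariaGet(elem, name) {
    if (elem.getAttributeNS) {
        var v = elem.getAttributeNS("http://www.w3.org/2005/07/aaa", name);
        if (v) {
            return v;                              // namespace-aware DOM
        }
    }
    return elem.getAttribute("aria:" + name);      // literal-name fallback
}

// Write an ARIA property the same way.
function ariaSet(elem, name, value) {
    if (elem.setAttributeNS) {
        elem.setAttributeNS("http://www.w3.org/2005/07/aaa",
                            "aria:" + name, value);
    } else {
        elem.setAttribute("aria:" + name, value);
    }
}

The duplicated-selector pattern is two separate CSS rules rather than one comma-separated rule, because a browser which rejects either selector would drop a combined rule entirely:

@namespace aria url(http://www.w3.org/2005/07/aaa);
[aria\:checked="true"] { outline: 2px solid blue; }
[aria|checked="true"] { outline: 2px solid blue; }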

colon
  Benefits: consistency for page authors; uniform DOM access (using
    Get/SetAttribute); orthogonal in XML languages; consistent with
    namespace-based extensibility for XML (and for HTML5?[1])
  Costs: uniform DOM access ignores namespace[2]; no uniform CSS selector;
    no CSS selector at all for IE legacy[3]; modest re-implementation cost[4]

dash
  Benefits: consistency for page authors; uniform DOM access; uniform CSS selector
  Costs: inconsistent with XML namespace-based extensibility[5]; new paradigm
    for ‘namespace’[6]; scope creep[7]

mixed
  Benefits: orthogonal in XML languages; consistent with namespace-based
    extensibility for XML (and for HTML5?[1])
  Costs: confusing for authors; no uniform DOM access; no uniform CSS selector;
    uncertainty wrt XHTML; new paradigm for ‘namespace’[6]; scope creep[7]

xcolon
  Benefits: consistency for page authors; orthogonal for XML languages;
    consistent with namespace-based extensibility for XML (and for HTML5?[1]);
    uniform DOM access; uniform CSS selector
  Costs: requires indirection through accessor functions for DOM access;
    requires duplicate CSS selectors; no uniform DOM representation; no CSS
    selector at all for IE legacy[3]; modest re-implementation cost[4]
  • [1] HTML5’s provision for extensibility, whether compatible with
    XML namespaces or not, is an open area of discussion at the moment.
  • [2] That is, it requires the use of a fixed aria prefix
    and may not (i.e. in some browsers) correctly set the namespaceURI
    property even when targeting an XML DOM.
  • [3] That is, in the IE family, only (putatively) IE8 and successors
    will recognize [aria\:...] selectors
  • [4] See discussion of re-implementation cost below
  • [5] See discussion of XML extensibility below
  • [6] That is, adds the concept of a fixed, dash-delimited prefix as
    a way of managing distinct symbol spaces to the existing non-fixed, colon-delimited
    prefix for the same purpose.
  • [7] That is, requires all embedding languages to explicitly allow
    and manage an inventory of fixed prefixes and, possibly, their vocabularies.

6.1. Implementation cost

For wholly commendable reasons, development of the ARIA spec. and pilot
implementation work have proceeded in parallel. Most if not all existing
implementations support only the dash approach. What is the likely
cost for those implementations of any decision to adopt any other approach? My
conclusion, having examined one implementation in some detail, is that the cost is
likely to be very modest.

Michael Cooper, WAI PF staff contact, captured the reason for this very
well, albeit unintentionally:

“The ARIA roles and properties are conceptually simple enough, but
they are designed to provide a bridge between HTML and desktop accessibility APIs,
a bridge which is exploited by the operating system, user agent, and assistive
technology all working together. There’s a complex set of interdependencies there
and the feasibility and details of many of the ARIA features could only be worked
out by testing in deployed systems, and therefore doing early implementation.”

The complexity referred to above is fundamentally one of architecture, both
static and dynamic. Not surprisingly, therefore, syntactic concerns account for a
tiny fraction of the code needed to implement ARIA as it stands. Furthermore, and
again not surprisingly, as it’s what sound software engineering practice requires,
the details of the concrete syntax are isolated, and the vast bulk of
the code I looked at refers to it only indirectly. The consequence of all this is
that the changes necessary to manage any change away from the dash
approach will be very straightforward. For more details, see the discussion
of experiment 3
in my testing report.

6.2. XML extensibility and SVG

Many existing XML languages make explicit, generic, provision for
extensibility by including in their formal schemas and/or spec. prose allowance for
any namespace-qualified elements and attributes from namespaces other than those
which make up the language itself. Tools such as NVDL and, to a lesser extent, W3C
XML Schema and RelaxNG, make it possible to combine the schemas for multiple XML
languages to give a complete characterisation of mixed-language documents.

One particularly important example of this approach is SVG. ARIA
integration into SVG is clean and straightforward under the colon or
mixed
approaches, but will require amending the spec. under the dash approach.

6.3. Short- vs. long-range considerations

In trying to weigh the tradeoffs which must of necessity be considered when
confronted by the information given above, the matter of timescale is particularly
hard to address. Any assertion about how things will look five, or even two, years
hence can always be countered with a contrary assertion. None-the-less, the
centrality of the HTML languages for the Web, and the fundamental importance of
accessibility for all of us, suggest that we must take the long-term
impact of this decision seriously, and be prepared to discount some short-term
discomfort in return for long-term stability and simplicity.

Proposed Activity for Video on the Web

W3C organized a workshop on Video on the Web in December 2007 in order
to share current experiences and examine the technologies (see report). Online video content and demand are increasing rapidly; video is becoming
omnipresent on the Web, and the trend will continue for at least a few
years. These rapid changes are posing challenges to the underlying
technologies and standards that support the platform-independent
creation, authoring, encoding/decoding, and description of video. To
ensure the success of video as a “first class citizen” of the Web, the
community needs to build a solid architectural foundation that enables
people to create, navigate, search, and distribute video, and to manage
digital rights.

The general scope of the proposed Video on the Web activity is to
provide cohesion in the video-related activities of W3C, as well as to help
other W3C Groups in their efforts to provide video functionality. In
addition, this activity will focus on implementing the next steps from
the W3C workshop on Video on the Web. The proposal is to create 3 new Working Groups around Video on the Web. Please have a look at the following documents:

  1. Activity proposal
  2. Media Fragments Working Group Charter
  3. Media Best Practices and Guidelines Working Group Charter
  4. Media Annotations Working Group Charter

We welcome general feedback, general expressions of interest (or lack of!) and
comments on the discussion list public-video-comments@w3.org.

If you should have questions or need further information, please feel free to
contact me as well. I will be presenting the activity proposal during the Web Conference next week, on Thursday afternoon.

Simple things make firm foundations

You can look at the development of web technology in many ways, but one
way is as a major software project. In software projects, the independence of specs has always been really important, I have felt. A classic example is
the independence of the HTTP and HTML specifications: you can introduce many
forms of new markup language to the web through the MIME Content-Type system,
without changing HTTP at all.

The modularity of HTML itself has been discussed recently, for example by
Ian Hickson, co-Editor of
HTML5:

Note that it really isn’t that easy. For example, the HTML parsing rules
are deeply integrated with the handling of <script> elements, due to
document.write(), and also are deeply integrated with the definition of
innerHTML. Scripting, in turn, is deeply related to the concept of
scripting contexts, which depends directly on the definition of the Window
object and browsing contexts, which, in turn, are deeply linked with
session history and the History object (which depends on the Location
object) and with arbitrary browsing context navigation (which is related to
hyperlinks and image maps) and its related algorithms (namely content
sniffing and encoding detection, which, to complete the circle, is part of
the HTML parsing algorithm). – Brainstorming test cases, issues and goals, etc., Ian Hickson

and in reply by Laurens Holst:

I don’t know the spec well enough to answer that question, but I’d
say modularization (if I may call it so) would make it both easier to grasp
as individual chunks, for both the reviewing process and the implementing
process. – brainstorming: test cases, issues, goals, etc., Laurens Holst

The <canvas> element introduces a complex
2D drawing API
different in nature from the other interfaces, which concentrate on
setting and retrieving values in the markup itself; the client-side database storage section
of the specification is another such interface. While the
<canvas> element has a place in the specification, the drawing
API should be defined in a separate document. Hixie expressed
a similar sentiment (and see the group’s issues about scope):

The actual 2D graphics context APIs probably should be split out
on the long term, like many other parts of the spec. On the short
term, if anyone actually is willing to edit this as a separate spec,
there are much higher priority items that need splitting out and
editing…

It would also be nice if the <canvas>
element and the SVG elements which embed in HTML did so in just the
same way, in terms of the context (style, etc.) which is passed (or not passed)
across the interface, in terms of the things an implementer has to learn
about, and things which users have to learn about. So that <canvas> and
SVG can be perhaps extended to include say 3D virtual reality later, and
so that all of these can be plugged into other languages just as they are
plugged into HTML.

There are lots of reasons for modularity. The basic one is that one module
can evolve or be replaced without affecting the others. If the interfaces are
clean, and there are no side effects, then a developer can redesign a module
without having to deeply understand the neighboring modules.

It is the independence of the technology which is important. This doesn’t,
of course, have to directly align with the boundaries of documents, but
equally obviously it makes sense to have the different technologies in
different documents so that they can be reviewed, edited, and implemented by
different people.

The web architecture should not be seen as a finished product, nor as the
final application. We must design for new applications to be built on top of
it. There will be more modules to come, which we cannot imagine now. The
Internet transport layer folks might regard the Web as an application of the
Net, as it is, but the Web design should always be to make a continuing
series of platforms, each based on the last. This works well when each layer
provides a simple interface to the next. IP is simple, and so TCP can be
powerfully built on top of it. The TCP layer has a simple byte-stream
interface, and so powerful protocols like HTTP can be built on top of it.
The HTTP layer provides, basically, a simple mapping of URIs to
representations: data and the metadata you need to interpret it. That
mapping, which is the core of Web architecture, provides a simple interface
on top of which a variety of systems — hypertext, data, scripting and so on
— can be built.

So we should always be looking to make a clean system with an interface
ready to be used by a system which hasn’t yet been invented. We should expect
there to be many developers to come who will want to use the platform without
looking under the hood. Clean interfaces give you invariants, which
developers use as foundations of the next layer. Messy interfaces introduce
complexity which we may later regret.

Let us try, as we make new technology, or plan a path for old technology,
always to keep things as clean as we can.

Version Identifiers Reconsidered

The Architecture of the World Wide Web includes a section on extensibility and versioning of languages and data formats. Quoting from the architecture document:

Good practice: Version information

A data format specification SHOULD provide for version information.

So, it’s always a good idea when you design a language or data format to provide a way for instance documents to include something like a version attribute, probably near the beginning of the document, to indicate what version of the language is being used.

Or is it always a good idea?

What does a version identifier convey?

In fact, do we even agree on what it means to put something like a language version marker on a document? Let’s imagine a simple XML language designed for setting down recipes. In the first version of the language, the markup looks like this:


<recipe name="Tuna Salad" recipeLanguageVersion="1.0">
<ingredients>
<ingredient name="Tuna Fish"  amount="1 can"/>
<ingredient name="Mayonnaise" amount="3 tablespoons"/>
<ingredient name="Capers" amount="a few"/>
</ingredients>
<steps>
<step>Open can</step>
<step>Drain liquid from can.  Put fish in bowl.</step>
<step>Add mayonnaise.  Stir well.</step>
<step>Add capers.  Stir gently.</step>
</steps>
</recipe>

The allowed markup in version 1.0 of the recipe is just what’s shown above: an outer <recipe> containing <ingredients> and <steps>, etc.
Eventually it’s decided that it would be useful to provide optional pictures for ingredients or steps. So in version 2 of the language we can do things like:


<recipe name="Tuna Salad" recipeLanguageVersion="2.0">
<ingredients>
<ingredient name="Tuna Fish"  amount="1 can"
picture="./CanPicture.jpg"/>
<ingredient name="Mayonnaise" amount="3 tablespoons"/>
<ingredient name="Capers" amount="a few"/>
</ingredients>
<steps>
<step>Open can</step>
<step picture="./DrainCanPicture.jpg">
Drain liquid from can.  Put fish in bowl.</step>
<step>Add mayonnaise.  Stir well.</step>
<step>Add capers.  Stir gently.</step>
</steps>
</recipe>

Question: let’s imagine that version 2 of the language, the one that supports the optional pictures, has been out for awhile, but I still want to write a simple recipe with no pictures:


<recipe name="ice cubes" recipeLanguageVersion="??">
<ingredients>
<ingredient name="water"  amount="1.5 cups"
</ingredients>
<steps>
<step>Put water into ice cube tray.</step>
<step>Freeze.</step>
</steps>
</recipe>

What’s the best value to put in the version attribute? I know that version 2.0 is the latest version of the recipe language. In fact, that’s the only version of the specification I have next to me, so maybe I should use that?
There’s a problem, though. That version="2.0" marker might not work with software that’s written to version 1.0, and in fact, my document would otherwise be a fine 1.0 recipe document.

So, maybe I should label it 1.0? Unfortunately, that’s a bit hard for me. I don’t want to have to go through the specifications for every version of the recipe language that’s ever existed just to find the oldest that works. I really don’t want to do that if the language has been revised a lot! Also, these sample recipes are small, but if I were using software to write very long documents, then that software would either have to keep track of the latest features used, or else search the entire document before writing it to a file, in order to get that version identifier at the front.
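To make that concrete, a small sketch of what “keep track of the latest features used” amounts to (my own illustration, assuming the recipe has been parsed into a simple object):

// Return the oldest recipe-language version this document can be labeled
// with: 1.0 unless some version-2.0 feature (a picture) is actually used.
function usesPicture(items) {
    for (var i = 0; i < items.length; i++) {
        if (items[i].picture) { return true; }
    }
    return false;
}

function minimumRecipeVersion(recipe) {
    return (usesPicture(recipe.ingredients) || usesPicture(recipe.steps))
        ? "2.0" : "1.0";
}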

Indeed, just these complexities have proven troublesome for the deployment of
languages like XML 1.1. XML 1.1 is similar to XML 1.0, but it enables the use of some new Unicode characters (just as recipe language V2 allows the use of the new picture attributes).
The XML 1.1 Recommendation suggests that:

XML Programs which generate XML SHOULD generate XML 1.0, unless one of the specific features of XML 1.1 is required.

In fact, it has often proven difficult to write software that generates documents labeled as XML 1.1 only when necessary: it’s much easier for XML 1.1-compatible software to label all output as <?xml version="1.1"?>, resulting in documents that are unusable with widely deployed XML 1.0 software.
Perhaps for reasons like this, adoption of XML 1.1 has been slow.

Returning to the recipe example, maybe the version attribute should take a list of versions, and I should put in both 1.0 and 2.0? That could be helpful to consuming software, but it still means that I (or my software) must be familiar with all the previous versions of the specification.

So, we need to ask, is the version identifier used to convey:

  • The earliest version of the language with which the document is compatible (1.0 in the recipe example)?
  • The version of the specification I used as a guide when writing the document (2.0)?
  • A list of versions with which the document is compatible?
  • Something else?

The best answer is probably different depending on the language, how often it’s revised, whether revisions tend to maintain backwards compatibility, etc.

Is having some sort of version identifier always a good idea?

That Good Practice Note quoted above says “provide a version indicator”, but we’ve just shown that we’re not always quite clear on what that would do anyway. Is it still good advice to suggest that surely each language should provide for something in the instance? If so, should its use be required or optional?

As shown above, it’s common for the same instance document to be legal in many versions of a language.
As long as such documents are likely to have the same or sufficiently compatible meanings per the different versions, then it may be better to omit any indication of version in the instance, and leave it to the receiving software to decide whether the document can be processed. After all, with the second recipe above, the receiver will soon enough discover that it can or can’t process picture attributes, and if not, it either will or won’t know that they can be safely ignored. Version attributes can be helpful in giving early warning of incompatibilities, or as a crosscheck for catching errors, but they’re usually not essential to correct operation.
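A sketch of such a tolerant receiver (assuming the recipe document has been parsed into a DOM):

// A version-1.0 consumer: it walks the steps and never asks for the
// picture attribute it doesn't know about, so an unlabeled version-2
// document still processes correctly as long as pictures stay optional.
function listSteps(recipeDoc) {
    var steps = recipeDoc.getElementsByTagName("step");
    var out = [];
    for (var i = 0; i < steps.length; i++) {
        out.push(steps[i].firstChild.nodeValue);
    }
    return out;
}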

One important exception is in the case where the language is likely to change in incompatible ways. If the same document means different things in different versions of a language, then it’s very important to indicate which version the author had in mind when creating the document. Putting that version indicator into the document itself is one good way to do it. So maybe the right advice is:

Proposal for Future Architecture Document: Version information


If a language or data format will change in incompatible ways, then indicate the language version used for each instance.


Are namespaces a good way to identify language versions?

If version identifiers aren’t always a good bet, what about namespaces? Many modern languages allow the creation of globally unique names, identifiers, tags, etc. In XML this is done through use of Namespaces. In RDF, it’s done by using URIs as identifiers.

Sometimes it’s appropriate to use new identifiers for each version of a language, and mechanisms like namespaces can make that easier:


<r:step xmlns:r="http://example.org/recipeLanguage1">

vs.


<r2:step picture="./food.jpg"
xmlns:r2="http://example.org/recipeLanguage2">

In this example, the element with expanded name {http://example.org/recipeLanguage2, step} allows a picture attribute, but {http://example.org/recipeLanguage1, step} does not.

A full discussion of the pros and cons of using namespaces this way is beyond the scope of this note. One important advantage of using namespaces is that they can be easily applied not just to the root element for the language as a whole, but to mixtures of compound document markup, in which each sublanguage evolves with its own namespaces. Also, because namespace names are URIs, you can use the Web itself to get information about them.

Namespaces do have drawbacks. Imagine if there were 50 different namespaces for a language just because 50 separate bugs had been fixed in different errata. Would you republish all the markup in 50 namespaces? Would each document have lots of namespaces, with each element named with the last namespace in which it had been revised? Namespaces can be very useful for designating language versions, but there’s no one idiom that’s right for all languages. We note that most widely deployed tag-based languages for the Web (HTML, XML Schema, XSLT) have chosen either to use the same namespace(s) across multiple versions, or in the case of some flavors of HTML, not to use namespaces at all.

Conclusions

So, the TAG is having second thoughts about the suggestion that all data formats SHOULD provide for version identification. Sometimes it’s a good thing to do, but sometimes not.
Perhaps the right advice will be what’s proposed in the revised Good Practice Note above.
In any case, the TAG has been working for several years on a finding that will explore in detail many issues relating to versioning, and version attributes are likely to be among the topics covered. In the meantime, we thought we’d take the opportunity to signal that we’re not so sure that the advice in the Architecture Document is as good as we thought.

By the way, TAG member David Orchard has covered some of the same topics as well as many others relating to versioning in his personal blog. Links to a few of his postings follow my signature below. Dave is also the principal author of the TAG’s draft finding on versioning. Working drafts covering Terminology, Strategies, and Versioning of XML Languages are available for review. New drafts come out every few months, and we’re hoping to have something more or less complete, well, real soon now.

Noah Mendelsohn

Note: unless otherwise indicated, opinions expressed in the TAG’s blog are those of the individual authors, and do not necessarily represent consensus of the TAG as a whole.

Links to Dave Orchard’s Blog Postings on Versioning