Markup Validator Updated


I tend to keep an eye on things done at CERN. Not just because this is the Web's mothership, but also because there is always a very slim chance that one of their experiments happens to recreate the big bang, kill us all, re-shape the laws of the universe or something else equally exciting and dreadful. After all, it would really be a waste to plan a release of one of our tools after the end of time. So when I started reading about the countdown to the launch of the Large Hadron Collider for August 8th, 2008, I knew it was time to push that maintenance release of the Markup Validator I had been promising “real soon now” for… the past months.

As it turns out, our friends in Switzerland will only start recreating the time just after the big bang in a month. Ah well. Until then, we will have time to enjoy sports on TV, and the Markup Validator, release 0.8.3.

This is mostly a maintenance release, fixing a few bugs and adding support for recently added or updated document types such as XHTML Basic 1.1, but it does have a number of valuable tricks up its sleeve.

For those of us using the validator not just as a web service but as a web platform, a couple of new features will make our lives even easier. First, JSON has been added to the validator's possible output formats. The format is modeled after the JSON output built by our friends at validator.nu. Try this:

GET "http://validator.w3.org/check?uri=http://qa-dev.w3.org/wmvs/HEAD/dev/tests/2342-opensp_type_X.html&output=json"

…you get:

{
    "url": "http://qa-dev.w3.org/wmvs/HEAD/dev/tests/2342-opensp_type_X.html",
    "messages": [
        
          {
              
              "type": "info",
              "subtype": "warning",
              "lastLine": "11",
              "lastColumn": 20,
              "message": "reference to non-existent ID \"MMIARCH\"",
              "messageid": 183,
              "explanation": "    
        [...]
    <div class="ve mid-183">        
    <p>This error can be triggered by:</p>        
    <ul>        
      <li>A non-existent input, select or textarea element</li>        
      <li>A missing id attribute</li>        
      <li>A typographical error in the id attribute</li>        
    </ul>        
    <p>Try to check the spelling and case of the id you are referring to.</p>        
  </div>        
"
          }
        
        ],
    "source": {
        "encoding": "utf-8"
    }
}
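
For readers who would rather script this than use the command line, here is a minimal Python sketch of the same call. The endpoint and the `uri` and `output=json` parameters come from the example above; the function names (`json_check_url`, `summarize`, `fetch_result`) are made up for illustration, and the field names are simply those visible in the sample output, so treat it as a starting point rather than a reference client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example above.
VALIDATOR = "http://validator.w3.org/check"


def json_check_url(document_uri):
    """Build the validator URL that asks for JSON output for one document."""
    return VALIDATOR + "?" + urlencode({"uri": document_uri, "output": "json"})


def summarize(result):
    """Return (type, subtype, line, text) for each message in a parsed result."""
    rows = []
    for msg in result.get("messages", []):
        rows.append((
            msg.get("type"),
            msg.get("subtype"),
            # lastLine appears as a string in the sample, so normalise to int
            int(msg["lastLine"]) if "lastLine" in msg else None,
            msg.get("message"),
        ))
    return rows


def fetch_result(document_uri):
    """Fetch and parse the JSON report (needs network access)."""
    with urlopen(json_check_url(document_uri)) as response:
        return json.load(response)
```

Calling `summarize(fetch_result(some_uri))` would then give one tuple per message, which is usually all a build script needs.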

While we are looking at calling the validator and getting quick, easy-to-process results, did you know that the fastest way to get basic info on validation is the validator's custom HTTP headers? They have been around for a while, are now properly documented, and we have added information about the number of warnings, too. Try this:

HEAD http://validator.w3.org/check?uri=http://qa-dev.w3.org/wmvs/HEAD/dev/tests/2342-opensp_type_X.html

    200 OK
    Date: Fri, 08 Aug 2008 15:00:49 GMT
    Content-Language: en
    Content-Type: text/html; charset=utf-8
    Client-Date: Fri, 08 Aug 2008 15:00:52 GMT
    Client-Peer: 128.30.52.49:80
    Client-Response-Num: 1
    X-W3C-Validator-Errors: 0
    X-W3C-Validator-Recursion: 1
    X-W3C-Validator-Status: Valid
    X-W3C-Validator-Warnings: 1
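
The same check can be scripted. The sketch below, in Python, pulls the `X-W3C-Validator-*` headers shown above out of a response; the helper names (`validator_summary`, `head_check`) are invented for this example, and it assumes only the four headers visible in the sample.

```python
from urllib.request import Request, urlopen

PREFIX = "x-w3c-validator-"


def validator_summary(headers):
    """Pick the X-W3C-Validator-* headers out of a header mapping."""
    summary = {}
    for name, value in headers.items():
        if name.lower().startswith(PREFIX):
            key = name[len(PREFIX):].lower()
            # Errors, Warnings and Recursion are counts; Status is text
            summary[key] = int(value) if value.isdigit() else value
    return summary


def head_check(check_url):
    """Send a HEAD request to a validator check URL (needs network access)."""
    with urlopen(Request(check_url, method="HEAD")) as response:
        return validator_summary(response.headers)
```

Because only headers are transferred, this is a cheap way to ask "is it valid, and how many warnings?" without downloading the full report.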

Another good piece of news. If you have a vested interest in XHTML, you will know this dilemma fairly well:

  • XHTML is supposed to be served with the application/xhtml+xml media type. That XHTML media type has a few issues, however, in particular the fact that the most widely deployed browser, up to now, still hasn't added support for it.
  • XHTML 1.0 defined an informative way to be “served as (legacy) HTML”, which kind of worked. But for the rest of the XHTML family…? Some people came up with clever hacks, using HTTP content negotiation to serve XHTML as application/xhtml+xml only to the agents that clearly specify they support this media type, and as text/html, by default, to the others.
  • What does that have to do with the Markup Validator? It does not declare an authoritative list of the media types it accepts. Actually, it can't, since there is no way in HTTP to say "Accept HTML, SVG, MathML… and any kind of XML". It does not have to, either, since HTTP makes the Accept header optional, and its absence just means “send me what you've got”.
  • When checking a resource set up with the Accept hack for XHTML, the validator would be served content as text/html, and, since that is not supposed to happen, the validator would yield a warning stating, in essence, “are you certain you really want to serve XHTML 1.1 content as text/html?”
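
To make the hack concrete, here is a rough Python sketch of the server-side decision described in the list. It is not taken from any particular server; `choose_media_type` is a hypothetical helper, and the Accept parsing is deliberately simplified (it only looks at an explicit mention of the XHTML media type and its q-value).

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def choose_media_type(accept_header):
    """Serve XHTML only to agents that explicitly list its media type."""
    if not accept_header:
        return HTML  # no Accept header: fall back to the legacy type
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as a refusal
        if media_type == XHTML and quality > 0:
            return XHTML
    return HTML  # wildcards such as */* do not count as explicit support
```

A client that sends no Accept header at all, as the validator does by default, lands in the first branch and receives text/html, which is exactly the situation that used to trigger the warning.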

It may have been a mere warning, but it made a lot, lot, lot of people anxious and upset. So, by popular demand – and also because the XHTML working group are preparing a revised note on XHTML and media types – the warning is gone.

Those interested in HTTP content negotiation beyond the issue with the XHTML media type will be interested in some new features in the validator. In version 0.8.2 we added a way to specify the Accept and Accept-Language headers sent by the validator to the server holding the documents it checks, and in 0.8.3 we also added Accept-Charset and User-Agent. These options are still experimental, but should be useful for content-negotiated resources that do not have a specific URI for each representation.
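
To picture what these options change, here is a small Python sketch of a request carrying the four headers named above. This is not the validator's own code, and it does not show how the options are passed to the validator; `negotiation_headers` and `build_request` are hypothetical names used only to illustrate the kind of request the origin server ends up seeing.

```python
from urllib.request import Request


def negotiation_headers(accept=None, accept_language=None,
                        accept_charset=None, user_agent=None):
    """Collect only the request headers that were actually specified."""
    candidates = {
        "Accept": accept,
        "Accept-Language": accept_language,
        "Accept-Charset": accept_charset,
        "User-Agent": user_agent,
    }
    return {name: value for name, value in candidates.items() if value}


def build_request(url, **options):
    """Build (but do not send) a request carrying the chosen headers."""
    return Request(url, headers=negotiation_headers(**options))
```

For example, `build_request(url, accept="application/xhtml+xml", accept_language="fr")` asks for the French XHTML representation of a resource that has no URI of its own for that variant.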

There is more in this version, and more to come. Read the 0.8.3 release notes, learn how to send feedback or participate in the project, and join me in thanking everyone involved in this release.
