Online htmldiff service

Many webmasters have heard of or used the W3C Link Checker to find dead links on their pages. Far fewer know that this service was initially created to help editors of W3C specifications find broken links in their documents, as required by the W3C publication rules as a corollary of our motto on stable cool URIs.

Every once in a while, we build new services to make life easier for our collaborators, and we offer them to the public at large whenever we can. Our latest toy in this category is an htmldiff service: given two online HTML documents, it creates a new document highlighting the differences between them.

This is of course mostly useful for finding the changes between two versions of a given document; indeed, it was created to help show the variations between two versions of a given Technical Report.

The tool itself is a pretty simple Python wrapper around Shane McCarron’s htmldiff Perl script; I’m happy to share the code of the Python wrapper if anyone is interested.
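For readers curious about the general shape of such a wrapper, here is a minimal sketch. It is not the actual W3C code: the function names `fetch` and `run_diff_tool` are hypothetical, and the idea that the external tool takes two file paths and prints its result to standard output is an assumption made for illustration.

```python
import subprocess
import tempfile
import urllib.request
from pathlib import Path


def fetch(url):
    """Download a document and decode it with the charset the server declares."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def run_diff_tool(command, old_html, new_html):
    """Save both documents to temporary files, run an external diff command
    on them, and return whatever the command prints to standard output.

    `command` is the argv prefix of the external tool; the two file paths
    are appended to it. That calling convention is an assumption.
    """
    with tempfile.TemporaryDirectory() as workdir:
        old_path = Path(workdir) / "old.html"
        new_path = Path(workdir) / "new.html"
        old_path.write_text(old_html, encoding="utf-8")
        new_path.write_text(new_html, encoding="utf-8")
        result = subprocess.run(
            [*command, str(old_path), str(new_path)],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if the tool fails
        )
    return result.stdout
```

A call along the lines of `run_diff_tool(["perl", "htmldiff.pl"], fetch(old_url), fetch(new_url))` would tie the pieces together, provided the script really does accept its inputs that way; check its own usage notes before relying on this.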

7 Responses to Online htmldiff service

  1. as a blind user dependent upon speech-output, it would be of INESTIMABLE benefit if the DIFF-marker used actual semantic markup when inserting a DIFF, rather than the generic SPAN. the obvious candidates are: INS and DEL, and to a lesser extent STRONG and EM; SPAN carries no semantics, so it is difficult to communicate to a non-visual user what the color conventions defined for each SPAN actually mean; instead of using SPAN, PLEASE use semantically meaningful elements that can trigger voice characteristic changes (regardless of styling) that alert the non-visual user to the status of the text marked as a DIFF. currently, the use of color coding alone (albeit through CSS) to convey meaning/context is a clear and inexcusable violation of the Web Content Accessibility Guidelines, versions 1.0 and 2.0 ( and

  2. Good point, indeed; as I mentioned, we’re re-using an existing script that we didn’t develop, but I’ll try to see if the original author has a good reason for having used <span> rather than <ins> and <del>.
    Also, I wonder if “simply” adding aural CSS indications (and if so, which ones) could help work around that issue?

  3. thank you, dom, for investigating the reasoning behind the use of, and alternatives to, the SPAN element…
    “simply” putting aural CSS would help, but would only benefit a small segment of users, although with developments like charles chen’s FireVox extension for Firefox (which makes the UA a self-voicing application and supports the CSS3-speech module) the number of users who can take advantage of aural CSS is steadily growing… and, of course, there is tv raman’s emacspeak, which (as one might expect from the originator of the concept of aural CSS) has the most robust support for aural CSS of any implementation currently available…
    i can suggest some concrete markup for @media aural (which would be CSS2 compliant, as Section 19 of CSS2 is the normative section on aural CSS) properties in email — is there a list to which i should CC my suggestions/proposals?
    generated content using CSS is very spottily supported; even when the generated content added by the :before and :after pseudo-elements is rendered by a user agent, the CSS-generated text isn’t included in the DOM, so it doesn’t get passed to screen readers or other assistive technologies. for more data on this topic, consult the thread that unspools from:
    one consideration is a variation on the WCAG 2.0 Technique C7:
    which utilizes the “overflow” property of CSS to hide declarative text in “plain sight”, so that there is an aural indicator available within the actual document source to indicate the beginning and end of an inserted, deleted or proposed text (it’s a “what works now” kludge to a problem that would benefit from actual pseudo-elemental text, were it available to a user’s assistive technology)… note that the WCAG2 technique does not use visibility:hidden or display:none as those are universal properties that affect the aural and tactile, as well as the visual canvas, the latter with a non-perceptible gap in the aural canvas, the former with a period of silence (equivalent to the way visibility affects the visual canvas)
    i hope to work with you and enlist the aid of others with pertinent expertise in these areas, to ensure that DIFF-marked documents which bear the imprint of the W3C conform to WCAG 1.0 as stipulated by the W3C publications criteria. the use of color or stylistic conventions alone to convey meaning is a clear violation of the Priority 1 checkpoint under WCAG 1.0, Guideline 2:

    Guideline 2. Don’t rely on color alone. Ensure that text and graphics are understandable when viewed without color.
    If color alone is used to convey information, people who cannot differentiate between certain colors and users with devices that have non-color or non-visual displays will not receive the information. When foreground and background colors are too close to the same hue, they may not provide sufficient contrast when viewed using monochrome displays or by people with different types of color deficits.


    2.1 Ensure that all information conveyed with color is also available without color, for example from context or markup. [Priority 1]
    Techniques for checkpoint 2.1 (

  4. I have contacted the script’s author: (Member-only link, unfortunately).
    He’s offering to implement the said change, so hopefully I’ll be able to set up the new script to achieve this better effect.
    That said, I would be interested to read your proposal for CSS aural properties for a diff document, so if you could send it (you could cc, or, or, I don’t have a strong feeling about it), I would appreciate it.
    Thanks for your detailed feedback!

  5. OK, I’ve upgraded the script, and it now returns <ins> and <del> as recommended.
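To make the difference discussed in this thread concrete, here is a small sketch of diff output marked up with `<ins>` and `<del>`. It is not the algorithm of the htmldiff script: it swaps in Python's standard `difflib` module, compares plain text word by word, and makes no attempt to parse HTML. The function name `mark_changes` is hypothetical.

```python
import difflib
import html


def mark_changes(old_text, new_text):
    """Return an HTML fragment where words removed from old_text are wrapped
    in <del> and words added in new_text are wrapped in <ins>."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = html.escape(" ".join(old_words[i1:i2]))
        added = html.escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            pieces.append(added)
            continue
        if removed:  # opcode is "delete" or "replace"
            pieces.append(f"<del>{removed}</del>")
        if added:  # opcode is "insert" or "replace"
            pieces.append(f"<ins>{added}</ins>")
    return " ".join(pieces)


print(mark_changes("the quick brown fox", "the slow brown fox"))
# prints: the <del>quick</del> <ins>slow</ins> brown fox
```

Because the change is carried by the element itself rather than by a class on a `<span>`, a user agent can convey it without relying on color.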
