W3C

The W3C Document Object Model (DOM)

This version:
http://www.w3.org/2002/07/26-dom-article
Author:
Philippe Le Hégaret, W3C

Introduction

Web designers like to add effects to their pages.

Have you ever read "Your browser cannot access the requested service. Click here to upgrade your browser."? If not, then you are probably the very fortunate owner of the most-used operating system and browser on the Web, and you have downloaded all the possible extensions (plug-ins) for it. If yes, your only chance to access the content is to spend hours convincing the company that provides it to change its means of delivery, or to spend hours and/or money making changes to your software. You might even have to buy a new computer and new software for it.

Why can't they give you access to the content in the first place? The reasons vary. Possibly they require a level of security that your browser does not provide. They may have written content that can only be read with a specific extension. Or, perhaps they include programming code in the page that is not compatible with your browser.

Writing programming code that works in any kind of browser is not easy, and the W3C has been, and still is, trying to find common ground between them. Such common ground facilitates the work of Web designers and programmers and, furthermore, improves accessibility by requiring assistive functionalities.

How the DOM Works

Accessing Document Content

An HTML or XML document can be represented as a tree. Each document has one root, or root node, which contains children, or nodes, which can themselves have children. Generally speaking, the leaves contain the text of the document. For example, here is an HTML table:

Shady Grove               Aeolian
Over the River, Charlie   Dorian

and a graphical representation of the tree:

[Figure: graphical representation of the DOM of the example table]
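
The tree described above can also be printed as text. The following is a minimal sketch in Python, using the standard library's xml.dom.minidom module. The markup for the table (table, tbody, tr, and td elements) is an assumption, since only the rendered cells are shown above, and the outline helper is a hypothetical name chosen for illustration.

```python
from xml.dom import minidom

# Assumed markup for the example table; the article shows only the cells.
TABLE = (
    "<table><tbody>"
    "<tr><td>Shady Grove</td><td>Aeolian</td></tr>"
    "<tr><td>Over the River, Charlie</td><td>Dorian</td></tr>"
    "</tbody></table>"
)


def outline(node, depth=0):
    """Return one indented line per node, visiting the tree depth first."""
    if node.nodeType == node.TEXT_NODE:
        label = '"' + node.data + '"'  # leaves hold the text
    else:
        label = node.nodeName          # inner nodes are elements
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(TABLE)
tree_lines = outline(document.documentElement)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of the tree, with the text of the cells appearing at the leaves.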

Accessing the Style Associated With the Document

Each node in the document is associated with stylistic effects such as color, position, and borders. These stylistic effects are not always part of the document and might be defined in a separate section called a style sheet.

Reacting to User Actions

To make Web pages more dynamic, many Web designers add stylistic effects that respond when the user manipulates input devices. For example, with the mouse, the user might move the cursor over the page or click in a certain area; with the keyboard, the user might enter information in a Web form. The DOM introduces the ability to capture those events and react to them.

Given the tree structure explained above, a click with a mouse on an image could be propagated in the structure from the root to the target, the image itself, and go back to the root again. Using the DOM, Web developers can attach little pieces of code called observers to react to the event.
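
The round trip from the root to the target and back can be illustrated with a small simulation. This is only a sketch under stated assumptions: the Node class, the observe and dispatch names, and the "capture" and "bubble" phase labels are hypothetical and are not the W3C DOM API. The sketch also simplifies the model by letting the target take part in both the downward and the upward pass.

```python
class Node:
    """Toy tree node for illustration only; not the W3C DOM API."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.observers = {"capture": [], "bubble": []}

    def observe(self, phase, callback):
        """Attach a piece of code to run during the given phase."""
        self.observers[phase].append(callback)


def dispatch(target, event_type):
    """Deliver an event from the root down to the target, then back up."""
    # Build the path root ... target by following the parent links.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = node.parent
    path.reverse()

    # Way down: root first, target last.
    for node in path:
        for callback in node.observers["capture"]:
            callback(event_type, node, "capture")
    # Way back: target first, root last.
    for node in reversed(path):
        for callback in node.observers["bubble"]:
            callback(event_type, node, "bubble")


log = []


def record(event_type, node, phase):
    log.append((phase, node.name))


html = Node("html")
body = Node("body", html)
img = Node("img", body)
for n in (html, body, img):
    n.observe("capture", record)
    n.observe("bubble", record)

dispatch(img, "click")
print(log)
```

After the click is dispatched on the image, the log lists html, body, and img on the way down, followed by img, body, and html on the way back.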

DOM and Accessibility

Many assistive technologies, such as screen readers, are developed on top of existing applications. To ensure that functionalities are accessible, the applications must expose them through an API: the content of the document displayed on the screen must be available programmatically, as must the menu items in the user interface. Applications accomplish the first step of this requirement when they expose documents through the DOM API. It is then possible to simulate a mouse device by creating mouse events and propagating them in the DOM tree, to replace images with their alternative contents, to remove blinking text, etc.
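
As an illustration of replacing images with their alternative contents, here is a minimal sketch using Python's xml.dom.minidom module. The sample page, the file names in it, and the replace_images_with_alt helper are assumptions made for the example; a real assistive tool would work on the document exposed by the browser.

```python
from xml.dom import minidom

# Hypothetical page fragment; the second image has no alt attribute.
PAGE = (
    '<p>Logo: <img src="logo.png" alt="W3C logo"/> '
    'and <img src="spacer.png"/> done.</p>'
)


def replace_images_with_alt(document):
    """Swap every img element for a text node holding its alt text."""
    # Copy the result into a list so the tree can be changed while looping.
    for image in list(document.getElementsByTagName("img")):
        alt_text = image.getAttribute("alt")  # "" when the attribute is absent
        replacement = document.createTextNode(alt_text)
        image.parentNode.replaceChild(replacement, image)


document = minidom.parseString(PAGE)
replace_images_with_alt(document)
text_only = document.documentElement.toxml()
print(text_only)
```

The same node operations (getElementsByTagName, getAttribute, createTextNode, replaceChild) are defined by the DOM itself, so the approach carries over to other language bindings.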

A Little Bit of History

The history of the Document Object Model, known as the DOM, is tightly coupled with the beginning of the JavaScript and JScript scripting languages.

JavaScript

The LiveScript language was designed at Netscape Communications to make the Java support in Netscape Navigator more accessible to non-Java programmers. LiveScript, like any scripting language, is a loosely-typed language. It is intended for a large audience of Web designers and developers.

In December 1995, LiveScript was renamed JavaScript and released as part of Netscape Navigator 2.0. Except for marketing purposes, JavaScript has nothing to do with the Java language developed and maintained by Sun Microsystems. The Web community then started to manipulate the content of Web documents, in order to bring interactivity and typography to the formerly static Web.

JScript

In July 1996, Microsoft released Internet Explorer 3.0 with a port of JavaScript called JScript.

ECMAScript

In June 1997, ECMA adopted a hybrid version of the scripting languages called ECMAScript. The International Organization for Standardization (ISO) followed suit in 1998. Unfortunately, ECMAScript arrived too late for the 4.0 releases of Netscape Navigator and Internet Explorer. Each introduced their own document object model, DHTML and dHTML, that came to be called Dynamic HTML.

The third edition of ECMA-262, released in December 1999, is still not followed by Microsoft's Internet Explorer. Netscape claims to support ECMA-262 in Netscape Navigator versions 6 and 7.

The World Wide Web Consortium

In 1994, Tim Berners-Lee, inventor of the World Wide Web, created the World Wide Web Consortium (W3C) to lead the Web to its full potential. At the beginning of 1997, the companies involved in this consortium — including Netscape Communications and Microsoft — decided to find a consensus around their object models to access and manipulate documents. While trying to stay as backward compatible as possible with the original browser object models, the W3C's Document Object Model (DOM) provided a better object representation of HTML documents.

HTML and XML

In 1996, a new markup language, the Extensible Markup Language (XML), was developed in the W3C as well. XML was meant to remove the HTML language's extensibility restrictions, and developing an object model for XML quickly became another goal of the DOM effort.

Scope of the W3C DOM

Application Programming Interface

The DOM aims to provide as much interoperability as possible. How does one access a node and its children? How about the color of a node? The DOM groups functionalities into a set of programming interfaces. Each interface contains a precise definition of the methods used to provide the functionalities. Here is an example interface:

interface HTMLImageElement {
    attribute DOMString alt;
}

In this example, the HTML attribute alt contains the alternative textual representation of an image, and this programming access could be used to develop a speech application for Web pages.

Platform Independent

The W3C, as a vendor-neutral organization, does not support a specific platform or operating system. Following the directions provided by HTML and XML, the DOM does not rely on functionalities provided by only one platform, even if most Web users, developers, and designers are using one of the versions of the Microsoft Windows platform. This also means that users are not forced to upgrade their operating systems when using W3C technologies. Neither Microsoft nor Netscape provides Windows 3.1 support for the latest versions of their respective browsers (IE 6 and NN 6), even if this platform remains widely deployed. Opera Software does provide support for the latest W3C technologies on Windows 3.1 and intends to release a version with DOM support soon.

Language Independent

The W3C, again as a vendor-neutral organization, does not support a specific programming language. To describe the interfaces provided by the API, the W3C uses an abstract language called the Interface Definition Language (IDL), introduced by the Object Management Group (OMG), another neutral consortium.

The advantage of using IDL is that the developer learns how to use the DOM with his or her favorite language and can switch easily to a different language because the functionalities and philosophy of the DOM remain the same. The disadvantage is that, since it is abstract, the IDL cannot be used directly by Web developers. Due to the differences between programming languages, they need to have a mapping — or binding — between the abstract interfaces and their concrete languages. This inconvenience is minimal as the approach and basic model used by the DOM remain the same from one language to another.
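
To make the idea of a binding concrete, here is a minimal sketch of how the alt attribute of the HTMLImageElement interface shown earlier might be mapped onto a Python property. This is an illustration only, not an official W3C binding: the wrapper class and the sample markup are assumptions, and the sketch relies on the getAttribute and setAttribute methods of Python's xml.dom.minidom elements.

```python
from xml.dom import minidom


class HTMLImageElement:
    """Hypothetical Python mapping of the IDL interface of the same name."""

    def __init__(self, element):
        self._element = element  # underlying DOM element node

    @property
    def alt(self):
        # Read access for the IDL declaration "attribute DOMString alt"
        return self._element.getAttribute("alt")

    @alt.setter
    def alt(self, value):
        # Write access for the IDL declaration "attribute DOMString alt"
        self._element.setAttribute("alt", value)


# Hypothetical sample markup used only to exercise the wrapper.
document = minidom.parseString('<img src="chart.png" alt="Sales chart"/>')
image = HTMLImageElement(document.documentElement)

before = image.alt
image.alt = "Quarterly sales chart"
after = document.documentElement.getAttribute("alt")
print(before, "->", after)
```

The abstract declaration stays the same in every language; only the concrete syntax for reading and writing the attribute changes with the binding.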

At the beginning, the W3C DOM Working Group participants agreed to provide two official bindings: one for ECMAScript, the ECMA standard version of JavaScript and JScript, and one for Java, the programming language developed by Sun Microsystems. Java was being promoted by Sun and used as the programming language for the Web. Other languages did not attract enough interest within the Working Group. Since then, the DOM has been mapped to other programming languages such as C, C++, PL/SQL, Python, and Perl.

The other main disadvantage of using an abstract language for defining the interfaces is the requirement not to use any language-specific approach when providing functionalities. For example, a part of the Java community is using a different tree representation in order to take full advantage of the Java language.

While some of these issues could have been avoided by breaking backward compatibility with existing pre-DOM software, some would remain because there is no overall consensus even within the Java community.

Conclusion

If we use a specific platform or language, and provide a simplistic and/or proprietary view, we are in danger of limiting the potential of the Web. The World Wide Web is and must remain platform-neutral in order to exist in the future. Web pages developed for specific Web browsers constrain their accessibility. A bank account or the ability to choose a seat for an airplane flight must be accessible independently of the user agent. Having proprietary extensions to the standards is an acceptable approach. But a lack of agreement on functionalities will force Web developers to work in a specific language or platform and will leave only a restricted and useless common ground.

Bringing companies and users around a table and working together on vendor-neutral solutions must remain the approach of the W3C and public organizations.
