Copyright ©2002 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
Web designers like to add effects to their pages.
Have you ever read "Your browser cannot access the requested service. Click here to upgrade your browser."? If not, then you are probably the very fortunate owner of the most-used operating system and browser on the Web, and you have downloaded all the possible extensions (plug-ins) for it. If yes, your only chance to access the content is to spend hours convincing the company that provides it to change its means of delivery, or to spend hours and/or money making changes to your software. You might even have to buy a new computer and new software for it.
Why can't they give you access to the content in the first place? The reasons vary. Possibly they require a level of security that your browser does not provide. They may have written content that can only be read with a specific extension. Or, perhaps they include programming code in the page that is not compatible with your browser.
Writing programming code that works on any kind of browser is not an easy task, and the W3C has been, and still is, trying to find common ground between browsers. Such common ground facilitates the work of Web designers and programmers and, furthermore, improves accessibility by requiring assistive functionalities.
An HTML or XML document can be represented as a tree. Each document has one root, or root node, which contains children, or nodes, which themselves can have children. Generally speaking, the leaves contain the text of the document. For example, here is an HTML table:
| Shady Grove | Aeolian |
| Over the River, Charlie | Dorian |
and a graphical representation of the tree:

Each node in the document is associated with stylistic effects such as color, position, and borders. These stylistic effects are not always part of the document and might be defined in a separate section called a style sheet.
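The tree structure of the table example can also be printed as an indented outline. The following is a minimal sketch using Python's standard `xml.dom.minidom` module. The element names in `TABLE_MARKUP` are an assumption, since only the rendered cells of the table are shown above, and `outline` is a helper name chosen for illustration.

```python
from xml.dom import minidom

# The example table written out as markup. The element names are an
# assumption: the text only shows the rendered cells.
TABLE_MARKUP = (
    "<table><tbody>"
    "<tr><td>Shady Grove</td><td>Aeolian</td></tr>"
    "<tr><td>Over the River, Charlie</td><td>Dorian</td></tr>"
    "</tbody></table>"
)


def outline(node, depth=0):
    """Return one indented line per node, visiting children in document order."""
    if node.nodeType == node.TEXT_NODE:
        label = repr(node.data)   # leaves carry the text
    else:
        label = node.nodeName     # element name, or '#document' for the root
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(TABLE_MARKUP)
tree_lines = outline(document)
print("\n".join(tree_lines))
```

The root node appears first, each element is indented under its parent, and the text of each cell shows up as a leaf.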
In order to make Web pages more dynamic, many Web designers apply stylistic effects when the user manipulates input devices. For example, with the mouse, the user might move the cursor over the page or click in a certain area; with the keyboard, the user might enter information in a Web form. The DOM introduces the ability to capture those events and react to them.
Given the tree structure explained above, a mouse click on an image could be propagated through the structure from the root to the target (the image itself) and then back to the root again. Using the DOM, Web developers can attach little pieces of code called observers to react to the event.
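The round trip from the root to the target and back can be modelled in a few lines. This is only a sketch of the idea: `xml.dom.minidom` does not implement DOM events, so the `observers` registry, `add_observer`, `dispatch`, and the phase names "down" and "up" are all hypothetical names introduced here (the DOM Level 2 Events specification calls the two directions capturing and bubbling). As a simplification, the target is visited once in each direction.

```python
from xml.dom import minidom

PAGE = "<html><body><p><img alt='logo'/></p></body></html>"

# Hypothetical observer registry: (node identity, phase) -> list of callables.
# minidom has no event support, so this only models the propagation idea.
observers = {}


def add_observer(node, phase, callback):
    """Attach a callback to a node for the 'down' or 'up' direction."""
    observers.setdefault((id(node), phase), []).append(callback)


def dispatch(target, event_name):
    """Deliver an event from the root down to the target, then back up."""
    # Build the path root -> ... -> target by following parentNode links.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = node.parentNode
    path.reverse()

    for node in path:                      # root to target
        for callback in observers.get((id(node), "down"), []):
            callback(event_name, node, target)
    for node in reversed(path):            # target back to root
        for callback in observers.get((id(node), "up"), []):
            callback(event_name, node, target)


log = []


def record(label):
    """Build an observer that appends a description of each call to the log."""
    def callback(event_name, current, target):
        log.append(
            f"{label}: {event_name} on <{target.nodeName}> seen at <{current.nodeName}>"
        )
    return callback


document = minidom.parseString(PAGE)
image = document.getElementsByTagName("img")[0]
body = document.getElementsByTagName("body")[0]

add_observer(body, "down", record("down"))
add_observer(image, "down", record("at image"))
add_observer(body, "up", record("up"))

dispatch(image, "click")
print("\n".join(log))
```

The observer on `body` is called twice, once on the way down and once on the way back up, while the observer on the image runs in between.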
Many assistive technologies, such as screen readers, are developed on top of existing applications. In order to ensure that functionalities are accessible, the applications must expose them through an API: the content of the document displayed on the screen must be accessible programmatically, as must the menu items in the user interface. Applications accomplish the first step of this requirement when they expose documents through the DOM API. It is then possible to simulate a mouse device by creating mouse events and propagating them in the DOM tree, to replace images with their alternative content, to remove blinking text, etc.
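As an illustration of one of these transformations, the sketch below replaces every image with its alternative text using Python's `xml.dom.minidom`. The markup in `PAGE`, the file names in it, and the function name `replace_images_with_alt` are assumptions made for the example; a real assistive tool would work on the tree exposed by the browser.

```python
from xml.dom import minidom

# Hypothetical page fragment; the second image has no alt attribute.
PAGE = (
    "<body>"
    "<p>Welcome <img src='logo.png' alt='Example Corp logo'/> to our site.</p>"
    "<p><img src='spacer.gif'/></p>"
    "</body>"
)


def replace_images_with_alt(document):
    """Swap every <img> for a text node holding its alt text (empty if missing)."""
    # Copy the result into a plain list before modifying the tree in the loop.
    for image in list(document.getElementsByTagName("img")):
        alt_text = image.getAttribute("alt")  # "" when the attribute is absent
        replacement = document.createTextNode(alt_text)
        image.parentNode.replaceChild(replacement, image)
    return document


document = replace_images_with_alt(minidom.parseString(PAGE))
text_only = document.documentElement.toxml()
print(text_only)
```

Only standard DOM operations are used (`getElementsByTagName`, `getAttribute`, `createTextNode`, `replaceChild`), so the same steps carry over to other language bindings.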
The history of the Document Object Model, known as the DOM, is tightly coupled with the beginning of the JavaScript and JScript scripting languages.
The LiveScript language was designed at Netscape Communications to make the Java support in Netscape Navigator more accessible to non-Java programmers. LiveScript, like any scripting language, is a loosely-typed language. It is intended for a large audience of Web designers and developers.
In December 1995, LiveScript was renamed JavaScript and released as part of Netscape Navigator 2.0. Except for marketing purposes, JavaScript has nothing to do with the Java language developed and maintained by Sun Microsystems. The Web community then started to manipulate the content of Web documents, in order to bring interactivity and typography to the formerly static Web.
In July 1996, Microsoft released Internet Explorer 3.0 with a port of JavaScript called JScript.
In June 1997, ECMA adopted a hybrid version of the scripting languages called ECMAScript. The International Organization for Standardization (ISO) followed suit in 1998. Unfortunately, ECMAScript arrived too late for the 4.0 releases of Netscape Navigator and Internet Explorer. Each introduced its own document object model, DHTML and dHTML, that came to be called Dynamic HTML.
The third edition of ECMA-262, released in December 1999, is still not followed by Microsoft in Internet Explorer. Netscape claims to support ECMA-262 in Netscape Navigator versions 6 and 7.
In 1994, Tim Berners-Lee, inventor of the World Wide Web, created the World Wide Web Consortium (W3C) "to lead the Web to its full potential". At the beginning of 1997, the companies involved in this consortium, including Netscape Communications and Microsoft, decided to find a consensus around their object models to access and manipulate documents. While trying to stay as backward compatible as possible with the original browser object models, the W3C's Document Object Model (DOM) provided a better object representation of HTML documents.
In 1996, a new markup language, the Extensible Markup Language (XML), was developed in the W3C as well. XML was meant to remove the HTML language's extensibility restrictions, and developing an object model for XML quickly became another goal of the DOM effort.
The functionalities provided by the DOM are designed to offer as much interoperability as possible. How does one access a node and its children? How about the color of a node? The DOM groups functionalities in a set of programming interfaces. Each interface contains a precise definition of the methods used to provide the functionalities. Here is an example interface:
interface HTMLImageElement {
  attribute DOMString alt;
}
In this example, the HTML attribute alt contains the alternative textual representation of an image, and this programming access could be used to develop a speech application for Web pages.
The W3C, as a vendor-neutral organization, does not support a specific platform or operating system. Following the direction set by HTML and XML, the DOM does not rely on functionalities provided by only one platform, even if most Web users, developers, and designers are using one of the versions of the Microsoft Windows platform. This also means that users are not forced to upgrade their operating systems when using W3C technologies. Neither Microsoft nor Netscape provides Windows 3.1 support for the latest versions of their respective browsers (IE 6 and NN 6), even though this platform remains widely deployed. Opera Software does provide support for the latest W3C technologies on Windows 3.1 and intends to release a version with DOM support soon.
Again, as a vendor-neutral organization, the W3C does not support a specific programming language. In order to describe the interfaces provided by the API, the W3C uses an abstract language called the Interface Definition Language (IDL), introduced by the Object Management Group (OMG), another neutral consortium.
The advantage of using IDL is that developers learn how to use the DOM with their favorite language and can switch easily to a different language, because the functionalities and philosophy of the DOM remain the same. The disadvantage is that, since it is abstract, the IDL cannot be used directly by Web developers. Due to the differences between programming languages, they need a mapping, or binding, between the abstract interfaces and their concrete languages. This inconvenience is minimal, as the approach and basic model used by the DOM remain the same from one language to another.
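To make the idea of a binding concrete, here is a rough sketch of how the `HTMLImageElement` interface shown earlier might be mapped to Python, with the IDL attribute `alt` becoming a property. The wrapper class is hypothetical: it is not part of `xml.dom.minidom` or of any official W3C binding, and the markup and file name are invented for the example. In the Java binding, the same IDL attribute is exposed as the accessor methods `getAlt()` and `setAlt()`.

```python
from xml.dom import minidom


class HTMLImageElement:
    """Hypothetical Python binding for the IDL interface HTMLImageElement.

    The IDL line `attribute DOMString alt;` becomes a readable and writable
    Python property. Illustrative only, not an official binding.
    """

    def __init__(self, element):
        self._element = element  # underlying generic DOM Element

    @property
    def alt(self):
        return self._element.getAttribute("alt")

    @alt.setter
    def alt(self, value):
        self._element.setAttribute("alt", value)


document = minidom.parseString(
    "<p><img src='map.png' alt='Map of the venue'/></p>"
)
image = HTMLImageElement(document.getElementsByTagName("img")[0])

spoken = image.alt                      # read through the binding
image.alt = "Street map of the venue"   # write through the binding
print(spoken)
print(image.alt)
```

The abstract definition stays the same in every language; only the surface syntax for reading and writing `alt` changes.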
At the beginning, the W3C DOM Working Group participants agreed to provide two official bindings: one for ECMAScript, the ECMA standard version of JavaScript and JScript, and one for Java, the programming language developed by Sun Microsystems, which Sun was promoting as the programming language for the Web. Other languages did not attract enough interest within the Working Group. Since then, the DOM has been mapped to other programming languages such as C, C++, PL/SQL, Python, and Perl.
The other main disadvantage of using an abstract language to define the interfaces is the requirement not to use any language-specific approach when providing functionalities. For example, part of the Java community uses a different tree representation in order to take full advantage of the Java language.
While some of these issues could have been avoided by breaking backward compatibility with pre-DOM existing software, some would remain because there is no overall consensus even within the Java community.
If we use a specific platform or language, and provide a simplistic and/or proprietary view, we are in danger of limiting the potential of the Web. The World Wide Web is and must remain platform-neutral in order to exist in the future. Web pages developed for specific Web browsers constrain their own accessibility. A bank account or the ability to choose a seat for an airplane flight must be accessible independently of the user agent. Having proprietary extensions to the standards is an acceptable approach, but a lack of agreement on functionalities would force Web developers to work in a specific language or on a specific platform, and would leave only a restricted and useless common ground.
Bringing companies and users around a table and working together on vendor-neutral solutions must remain the approach of the W3C and public organizations.