Document Object Model (DOM) Open Issues List

23 October 2001

Introduction

This document is a "wish list" of enhancements to and concerns about the DOM that were not addressed in Level 1, Level 2, or Level 3. Note that the DOM Working Group has not committed to work on most of these points.

If you know of something that should be included here, please let us know so we can update the document. If you can provide a summary, and citations of past discussion, that would help the DOM Working Group a lot.

Comments on this document are invited and are to be sent to the public mailing list www-dom@w3.org. An archive is available at http://lists.w3.org/Archives/Public/www-dom/.

This document has been produced as part of the W3C DOM Activity. The authors of this document are the DOM WG members.

1. Deferred major functions

1.1 Views, Formatting

As raised by the CSS discussions. There may be multiple views of a document; access to data for each view. (Editor, for example, may have source and tree and rendered views active simultaneously. Might be aural etc.) Batching of events and minimizing reformatting costs plays into this. Also, accessing formatting results of applying style, after rendering (but not including the render itself). Presentation-specific. Partly a browser/engine-standardization issue; see what we've got that we can agree on. Browser users looking for standards beyond DOM Level 0. WAI wants selections/highlighting ideas, view-specific computed values from CSS (which is why abstract View went into CSS support -- but not Actual, i.e. final-rendered, values). Note that the XSL folks also want us to start incorporating XSL support.

1.2 Transactions, batching of operations

E.g. roll-back if an operation fails. Database-style (failure) and/or editor-style (rollback). May have to be able to handle complex compound operations; may need to allow users to aggregate on top of that.
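
A minimal sketch of the editor-style rollback described above, assuming Python's xml.dom.minidom as the DOM implementation. The transaction() helper is hypothetical and is not part of any DOM Level; it snapshots a subtree with cloneNode(true) and swaps the snapshot back in if any step of the compound operation fails.

```python
from contextlib import contextmanager
from xml.dom import minidom


@contextmanager
def transaction(node):
    """Hypothetical helper: restore `node`'s subtree if the body raises."""
    snapshot = node.cloneNode(True)  # deep copy taken before any edits
    try:
        yield node
    except Exception:
        # Roll back by swapping the untouched snapshot in for the edited node.
        node.parentNode.replaceChild(snapshot, node)
        raise


doc = minidom.parseString("<orders><order><item>pen</item></order></orders>")
order = doc.documentElement.firstChild

try:
    with transaction(order) as target:
        target.appendChild(doc.createElement("item"))  # first step succeeds
        raise ValueError("second step of the compound operation failed")
except ValueError:
    pass

rolled_back_xml = doc.documentElement.toxml()
```

One limitation of this approach: references that user code still holds to the original node point at a detached subtree after rollback, which is part of why a real design would probably need support inside the DOM implementation.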

1.3 XSLT stylesheets and/or transformation support

Members of the XSL Working Group have strongly requested that we support XSL at least to the degree that we do CSS. On a simple level that isn't hard. It becomes much more complex if we attempt to allow mapping nodes in the generated Document back to nodes in the source, for selection and the like.

1.4 Repository interface

There is a significant amount of interest in providing tools to manage a group of documents within a repository. There is not yet a clear consensus on what a repository is or whether the DOM is the right place to define an API for accessing it. Jeroen van Rotterdam has proposed the following characteristics; they are included here for discussion, but the DOM WG has neither endorsed nor rejected them. For example, one of the open questions in defining a repository is whether it contains "DOM Documents", XML documents, or simply documents which may be XML-based.

A Repository should be able to contain several DOM Documents.
A Repository should be able to contain a collection of Repositories which is a recursive structure much like directories in an operating system.
A Repository should be able to retrieve, add, delete or update DOM Documents.
A Repository should be able to retrieve, add, delete or update its child Repositories.
A Repository should be able to handle checking out and checking in of DOM Documents.
A repository should be able to hold a collection of DTD interfaces
Every Document which is valid according to a DTD should be able to access its DTD from the collection of DTD interfaces of its corresponding Repository or any ancestor Repository.
Repositories should be part of the DOM tree structure.
Query languages should be able to execute their queries relative to a Repository node.

1.5 Query Language interface

More input desired from XML Querying Working Group. Known issues: query representation (string or object), response representation (iterator/treewalker?), and whether querying is a DOM API or a more general API that XML models, including DOMs, might implement. See also the "XPath Support" item later.

1.6 Multi-threading support

Old description: "Subclassing of objects is not supported at level 1. Higher order locking will not be included in level one. Question of how to have some people with read access at the same time as others have write access while maintaining predictable results across implementations. Minutes of 19971003 talk about handling concurrency problems in level two."

1.7 Security

From DOM requirements (October 9, 1997 version) document: "an external security API will be provided after level one." Unclear whether this means digital signatures, protecting an in-memory DOM fragment from being written out to storage, controlling document-to-document scripting access, or other.

1.8 Additional event sets

Sense-of-the-group is that we'd like others to design these, but we may want to "bless" some of the most critical ones.

2. Individual Issues and Enhancements

2.1 Lightweight DOM

Sometimes called "server DOM", this is intended to be a "core of the core" subset, reducing power but gaining performance/efficiency.

Its value depends in part on whether full DOM can be clever/efficient enough. "Unless order of magnitude improvement over naive (at least) DOM, may not be worth considering."
The challenge is that there seem to be many different overlapping subsets. For example, palmtops may also want a reduced model but with a different batch of functions. Some (Perl) implementors have asked whether NodeLists, and even NamedNodeMaps, could be made static snapshots rather than live accessors. And so on. These domains are not yet well defined and seem to be a morass of special interests.
One solution might be to break core into hasFeature sets. It's unclear whether the divisions will always be clean ones, or whether people would really want to deal with that fine a granularity. If too much of the DOM becomes optional, interoperability suffers.
Do we also need to represent "this works, but performance may be far below what you would expect from a 'full' DOM"? (Contrived example: get-parent implemented by search downward from root nodes.)
Seems to need an Editorial Team to analyze requirements and opportunities. That may make this a "deferred function" rather than a localized enhancement.

2.2 isReadOnly test

It might be useful to be able to test whether a node is currently read-only, without having to attempt to modify it and then either catch the resulting exception or undo the change. Proposed solution: adding an isReadOnly flag to the Node interface.
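
The difference between the two approaches can be sketched as follows, assuming Python's xml.dom package. GuardedElement, is_read_only and probe_by_mutation are hypothetical names invented for this illustration; minidom itself has no read-only nodes, so the wrapper simulates one by raising the standard NoModificationAllowedErr.

```python
import xml.dom
from xml.dom import minidom


class GuardedElement:
    """Hypothetical wrapper (not a DOM interface) that adds a read-only flag."""

    def __init__(self, element, read_only=False):
        self.element = element
        self.is_read_only = read_only

    def set_attribute(self, name, value):
        if self.is_read_only:
            raise xml.dom.NoModificationAllowedErr("node is read-only")
        self.element.setAttribute(name, value)


def probe_by_mutation(guarded):
    """Current workaround: attempt a change and see whether it is refused."""
    try:
        guarded.set_attribute("probe", "x")
    except xml.dom.NoModificationAllowedErr:
        return True
    guarded.element.removeAttribute("probe")  # writable: undo the probe edit
    return False


doc = minidom.parseString("<config/>")
locked = GuardedElement(doc.documentElement, read_only=True)
unlocked = GuardedElement(doc.documentElement)

locked_by_probe = probe_by_mutation(locked)      # needs exception handling
unlocked_by_probe = probe_by_mutation(unlocked)  # needs a cleanup step
```

With a flag, the same question is answered by reading locked.is_read_only, with no exception handling and no temporary change to the document.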

2.3 Read-only DOM

There have been requests for a way to make part or all of the DOM tree read-only. One use case would be when an immutable data structure wishes to permit users to view its contents via a DOM API. Another would be if a standard DOM was to be used to present a document that the users should not be permitted to alter. The latter state might be either transient or permanent.

For example: XSLT's extension mechanism wants to expose the source document via the DOM API, but does not want to permit extensions to alter that document... even if the input was provided as a read/write DOM.

This can be viewed either as a kind of "Lightweight DOM" (which see), or as one instance/application of "data locking" (see "Multi-threading support"), depending on whether it's inherent in the implementation or switchable via an API.
DOM Level 1 defined the concept of read-only nodes as children of Entity References. Making other nodes/trees read-only should not involve much new behavioral definition beyond deciding when and how that status can be set, cleared, or queried. Note that we would _not_ want to allow an EntityReference's contents to be set read-write!
If we define a setReadOnly() operation (or equivalent), then permanently read-only DOMs can simply pretend that this method was called before the DOM was made available to the user's code. (Read-only DOM implementations could almost claim to be compliant now, since we state when a node is readonly but not when it isn't.)
ISSUE: We haven't decided how normalize() behaves when faced with an unnormalized readonly subtree. Throw an exception and leave the document unchanged? Exception but fix all read/write nodes? Exception and only fix up to the exception point? Silently fix only read/write nodes? Fix even the readonly nodes?
ISSUE: Is this a read-only Document (i.e., on a doc-by-doc basis, maybe node-by-node), or a read-only DOM? If the latter, then arguably the node factory methods should throw exceptions; if the former, only attempts to insert nodes should be refused. This also has implications for the statement that cloneNode() produces a mutable copy.
Some folks really don't want to have to implement mutable nodes and node factories at all. The proper solution for that, architecturally, appears to be to build another layer into the DOM -- a read-only superclass which is subclassed to produce the original read/write DOM. That's a nontrivial change, and there has not been complete agreement over what functions belong in each layer.
Keith suggested that higher-level objects (NodeLists, Ranges, and the like) may want to be able to be made immutable independently of the nodes they refer to -- e.g., to support a non-live version of these views as a cached query result (cf. his post for a proposed approach).
The general concept seems to be reasonable. But there is some creeping featurism setting in, and there were enough open questions to prevent it from immediately going into DOM Level 2. The challenge is agreeing on exactly how "a read-only DOM" is defined, and deciding whether a single solution will indeed be widely useful.

2.4 Enhance Traversal to merge adjacent Text Nodes

Ability to traverse a normalized view of an un-normalized document. Same issues as "Text Node Amelioration".

Should it really merge/normalize the adjacent text nodes, or simulate a single node?
How does this interact with filtering?
Should this be able to "normalize" through CDATASections?
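
One possible reading of the "simulate a single node" option, sketched with Python's xml.dom.minidom. The normalized_children() helper is hypothetical; it yields a plain string for each run of adjacent Text nodes and leaves the tree unchanged. In this sketch CDATASection nodes are not merged, which is only one of the possible answers to the last question above.

```python
from xml.dom import minidom, Node


def normalized_children(parent):
    """Yield children of `parent`, presenting each run of adjacent Text
    nodes as one merged string. The document itself is not modified."""
    run = []
    for child in parent.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            run.append(child.data)
            continue
        if run:
            yield "".join(run)
            run = []
        yield child
    if run:
        yield "".join(run)


doc = minidom.parseString("<p>Hello<b>bold</b></p>")
p = doc.documentElement
# Make the tree un-normalized: two Text nodes now sit side by side.
p.insertBefore(doc.createTextNode(", world"), p.childNodes[1])

view = list(normalized_children(p))
```

Because the merged text is a string rather than a Node, callers cannot edit through it, which sidesteps the question of which underlying Text node an edit should land in.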

2.5 Should NodeIterator be cloneable?

Since NodeIterator doesn't have the ability to setCurrentNode, saving and reusing an iteration state is difficult; one has to create a new iterator, then step it until it reaches the right place. The suggestion has been made to add NodeIterator.clone().

This may be expensive if the NodeIterator is doing something fairly fancy under the covers, such as running a query engine. But it should be no more expensive than the step-until-synchronized approach.
Some have put this in terms of NodeIterators produced by other than our factory methods. As the Traversal team anticipated, folks are trying to generalize the concept to handle richer kinds of query and lookup.
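
A rough sketch of what clone() would buy, written against Python's xml.dom.minidom, which has no Traversal module. SimpleNodeIterator is a hypothetical, forward-only stand-in for NodeIterator with no filtering; it only shows that copying the iteration state replaces the create-and-step-until-synchronized workaround.

```python
from xml.dom import minidom


class SimpleNodeIterator:
    """Hypothetical stand-in for NodeIterator: forward-only, document order."""

    def __init__(self, root, current=None):
        self._root = root
        self._current = current  # last node returned; None before the first

    def next_node(self):
        node = self._current
        if node is None:
            nxt = self._root
        elif node.firstChild is not None:
            nxt = node.firstChild
        else:
            # Climb until a following sibling exists, staying inside root.
            while node is not self._root and node.nextSibling is None:
                node = node.parentNode
            nxt = None if node is self._root else node.nextSibling
        if nxt is not None:
            self._current = nxt
        return nxt

    def clone(self):
        """Copy the iteration state without replaying earlier steps."""
        return SimpleNodeIterator(self._root, self._current)


doc = minidom.parseString("<a><b/><c><d/></c><e/></a>")
it = SimpleNodeIterator(doc.documentElement)
first_two = [it.next_node().tagName, it.next_node().tagName]

saved = it.clone()  # remember the position after <b>
ahead = [it.next_node().tagName, it.next_node().tagName]
resumed = saved.next_node().tagName
```

Here the state is just two references, so the copy is cheap. An iterator backed by a query engine would have more state to duplicate, which is the cost concern raised above.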

2.6 Pseudo-attributes on Processing Instructions

While the value of a processing instruction is defined by XML as being an unstructured string, one approach to using that to contain multiple parameters is to borrow XML's Attribute syntax. If this turns out to be a common usage, the DOM might want to facilitate it.

What should happen with PIs that don't follow this syntax, or instances which intended to but didn't do so properly? We'd need to make the calls smart enough to report parsing errors.
Do we support only XML-style attrs, or HTML-style unquoted values?
Should this allow setting pseudo-attributes as well as retrieving them? If so, do we attempt to emulate Attr's structural model, or are pseudo-attrs text only?
Related to that last: What do we do with Entity References?
Current sense of the WG is that this should be left in user code.
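
As an illustration of how this can live in user code, here is a minimal sketch using Python's xml.dom.minidom and a regular expression. The parse_pseudo_attributes() helper is hypothetical. It accepts only XML-style quoted values, is read-only, returns text only, does not expand entity references, and reports malformed data by raising ValueError.

```python
import re
from xml.dom import minidom

# One XML-style pseudo-attribute: name="value" or name='value'.
_PSEUDO_ATTR = re.compile(r"""\s*([A-Za-z_][\w.\-]*)\s*=\s*("[^"]*"|'[^']*')""")


def parse_pseudo_attributes(pi):
    """Hypothetical helper: return the PI's data as a dict of strings."""
    data = pi.data
    attrs, pos = {}, 0
    while pos < len(data):
        if data[pos:].strip() == "":
            break  # only trailing whitespace remains
        match = _PSEUDO_ATTR.match(data, pos)
        if match is None:
            raise ValueError(f"not pseudo-attribute syntax at offset {pos}")
        attrs[match.group(1)] = match.group(2)[1:-1]  # strip the quotes
        pos = match.end()
    return attrs


doc = minidom.parseString(
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?><root/>'
)
stylesheet_attrs = parse_pseudo_attributes(doc.firstChild)

free_form = doc.createProcessingInstruction("note", "just some text")
try:
    parse_pseudo_attributes(free_form)
    free_form_error = None
except ValueError as exc:
    free_form_error = str(exc)
```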

2.7 Attr.getValue() containing unresolved EntityReference

As of DOM2, what should happen in this case is undefined. The definition of Attr.getValue() refers to the "value" of the EntRef -- but EntRefs have no node value.

We could throw an exception. Problem: SIGNATURE CHANGE?
We could have attr treat undefined EntRef as empty. Problem: Not diagnostically useful.

2.8 getElementsByAttributeValue()

getElementById() has been added to DOM Level 2, but evaluating this depends on DTD support (non-DOM, in Level 2) to determine what attributes are or aren't IDs. It was suggested that for non-validated documents or other situations where the document's author didn't declare the attribute as ID, we offer a lookup by attribute value (falling back on the possibly-optimized ID retrieval when possible). There is some sense that this more general search might be independently useful even if it can't be optimized.

Some DOMs are providing implementation-specific versions now.
The generalized version, unlike IDs, may return multiple nodes. There isn't consensus on how to do so. Some strongly prefer Iterators, suggesting that this is just a Filter and that implementations could recognize a special Filter and optimize the Iterator into a lookup. Others argue that iterators aren't Core and don't want the prerequisite, leading them to favor NodeLists. Some say "Our real goal is IDs, and general querying will be addressed eventually, so for now just return the first match and let it go at that."
Note that XML Schemas introduce "keys", another way of uniquely identifying a portion of a document.
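
A minimal, unoptimized sketch of the generalized lookup, assuming Python's xml.dom.minidom. The function name follows this section's title but is hypothetical, and the return type (a static list in document order) is just one of the options debated above.

```python
from xml.dom import minidom, Node


def get_elements_by_attribute_value(root, name, value):
    """Hypothetical helper: elements at or below `root`, in document
    order, whose attribute `name` is present and equal to `value`."""
    matches = []
    stack = [root]
    while stack:
        node = stack.pop()
        if (node.nodeType == Node.ELEMENT_NODE
                and node.hasAttribute(name)
                and node.getAttribute(name) == value):
            matches.append(node)
        # Push children reversed so they are popped in document order.
        stack.extend(reversed(node.childNodes))
    return matches


doc = minidom.parseString(
    '<doc><sec name="a"/><sec name="b"><p name="a"/></sec></doc>'
)
found = get_elements_by_attribute_value(doc, "name", "a")
found_tags = [element.tagName for element in found]
```

Returning only found[0] would correspond to the "just return the first match" position.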

2.9 Allow event to suppress capture phase

Level 2 allows us to turn off bubbling of events (by setting "bubbles" to false), but there is no corresponding capability to suppress capture on an event-by-event basis. In the public list, John Ky has suggested that there are times when this would be desirable. While it is certainly possible to simply not register any Capturing listeners, and a sufficiently clever implementation may be able to avoid spending cycles on the Capture phase unless listeners exist which need it, it may be useful to be able to say "This specific event really is intended specifically for the stated Event Target and should not be captured."

2.10 Make TreeWalker attributes writable

Since TreeWalker follows strict current-node semantics, it could arguably be made safe to alter its filter, whatToShow, or even root node properties. Miles suggests this as an alternative to generating/maintaining multiple TreeWalkers. We currently recommend against this change; it seems to have significant risk for minimal benefit:

The assumed efficiency gains are highly dependent upon the exact details of how the DOM is implemented. You'd avoid constructing a new TreeWalker object, but might wind up paying significantly more than that inside the DOM.
One example of that: Implementations which generate specialized implementations of Traversal objects -- optimized for particular predefined filters, perhaps as the result of a complex query -- would find this feature hard to support. The best approach we've found would involve proxying, which would impose significant performance overhead.
In fact, one can already change filtering conditions in mid-traversal, by writing a filter which can change the criteria used for its tests. If the nodeType test is moved into the filter and SHOW_ALL is used, that too can be altered on the fly. We recommend against doing so -- indecisive filters can be hard to debug and may surface portability problems -- but it's theoretically possible.

2.11 Iterator.beforeFirst()/afterLast()

Quick reset to initial/final states. See last paragraph of http://lists.w3.org/Archives/Member/w3c-dom-ig/1999Aug/0088.html and follow-ups.

2.12 Diamond inheritance proposed

Architecturally, many (not all) of the additional interfaces added in DOM Level 2 should inherit from the base classes that will implement them -- for example, DocumentTraversal should inherit from Document -- to document and enforce that association, and to avoid unnecessary typecasting. This introduces "diamond inheritance" (B is-an A, C is-an A, D is both a B and a C), which was a problem for some object-oriented languages.

Is diamond inheritance safe for all modern languages likely to implement the DOM, and for all implementations thereof? Probably, these days. But we aren't 100% confident. If not, it probably doesn't belong in our architecture.
There was a concern that CORBA bindings might have trouble with typecasting. Latest evidence is that they shouldn't, but that casting of this sort may be a higher-overhead operation. On the other hand, this is not an operation that will be performed frequently.
Possible alternative: publish separate architectural and CORBA-binding IDL, and only close the diamond in the latter. This has some risk of confusing the readers.
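
For concreteness, this is the shape of the diamond in one language that handles it, Python. The class bodies are placeholders invented for this sketch, and DocumentImpl is a hypothetical implementation class; other languages and bindings resolve the shared base differently, which is exactly the concern above.

```python
class Document:
    def create_element(self, tag_name):
        return {"tag": tag_name}


class DocumentTraversal(Document):  # B is-an A
    def create_node_iterator(self, root):
        return iter([root])


class DocumentEvent(Document):      # C is-an A
    def create_event(self, event_type):
        return {"type": event_type}


class DocumentImpl(DocumentTraversal, DocumentEvent):  # D is both B and C
    pass


doc = DocumentImpl()
element = doc.create_element("p")         # no typecast needed
nodes = list(doc.create_node_iterator(element))
resolution_order = [cls.__name__ for cls in DocumentImpl.__mro__]
```

Python linearizes the hierarchy so the shared Document base appears once in the method resolution order.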

2.13 OMG XML RFP

The OMG wants a way to represent XML content as a datatype... with or without "data binding" to application-specific datatypes. The request, and proposals, should be available at http://www.omg.org/techprocess/meetings/schedule/XML_Value_RFP.html.

At least one proposal based on the DOM API has been submitted. Essentially, this provides an IDL binding for the DOM which emphasizes access to local documents.

2.14 I18N Indexing

The DOM is currently expressed as storing data in 16-bit "units". Unicode permits characters longer than this, via diacritics and so on. The I18N folks requested that the DOM provide mapping between character-offsets and unit-offsets for text we have stored. We seem to agree that there's a legitimate need to do this, but have doubts about whether it's an operation that should be defined by and within the DOM.

Strong sense in DOM WG that this proposal was at the wrong granularity. Converting counts would yield an expensive version of substring, for example; you'd have to recount from the beginning, rather than continuing the count from the last known position.
Fixing it inside the DOM doesn't help applications much -- what about info not currently stored in a DOM node?
But there are some things that could be done with a string type specifically designed to handle this, e.g. cache offset points to avoid some of the repetitive scanning.
Some opinion was expressed that I18N should define a UCSstring API, which could address this properly, and which we could then incorporate into our API as a returned type. (Perhaps "the preferred implementation of DOMString".)
Most recent proposal was to include non-normative language, passing the buck to the binding's DOMString and showing the interfaces I18N thinks it should provide.
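
The conversion being requested can be sketched in Python, whose strings are indexed by code point. This sketch assumes a "character" means one Unicode code point, so that code points above U+FFFF occupy two 16-bit units; combining sequences are not treated specially. Both function names are hypothetical.

```python
def char_to_unit_offset(text, char_offset):
    """16-bit unit offset of the character at code-point offset `char_offset`."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text[:char_offset])


def unit_to_char_offset(text, unit_offset):
    """Code-point offset for a 16-bit unit offset. An offset that falls
    inside a surrogate pair maps to the following character."""
    units = 0
    for index, ch in enumerate(text):
        if units >= unit_offset:
            return index
        units += 2 if ord(ch) > 0xFFFF else 1
    return len(text)


text = "a\U0001D11Eb"  # 'a', MUSICAL SYMBOL G CLEF (outside the BMP), 'b'
length_in_units = len(text.encode("utf-16-le")) // 2
b_unit_offset = char_to_unit_offset(text, 2)
b_char_offset = unit_to_char_offset(text, 3)
```

Each call rescans from the start of the string, which is the cost the WG objected to; a string type that caches known offset pairs could continue from the last position instead.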

2.15 Encoding support

The DOM is currently expressed as storing data in UTF-16. Some annoyance has been expressed with the need to convert data from an environment's native string encodings into this one, especially when there may be multiple encodings within a country (EBCDIC and ASCII, several Japanese encodings, etc.)

Given that DOM Level 3 Serialization is implicitly going to have to deal with encoding conversions, should we expose them as subroutines?
Or is this another "There ought to be an internationalized text string datatype" point, passed to whoever eventually winds up dealing with that problem?
See the I18N Indexing issue for related discussion.

2.16 Non-Well-Formed Entities

Currently createDocument can only be used to create an XML document with a single Document Element. Some applications want to use the DOM to create a text document, or an external parsed entity, not contained in a single root. Essentially, they would like to broaden the concept of "document".

Use case: XSLT is now able to output things other than well-formed XML. Our own spec, with non-XML bindings generated from XML, is an example of that sort of processing. It would make a certain amount of sense for the DOM to be able to capture XSLT's output even in this case.

Currently available solution: Use a standard Document, build the contents as children of an Element or DocumentFragment, and make the serializer smart enough to know that it should discard this container.
Proposal 1: modify createDocument so that it doesn't need a documentElement. For example if that parameter is set to null, then no documentElement is created. In this case, the doctype should be null for documents containing text, comments, PIs, but may not be null for documents containing elements. ISSUE: Document can't currently accept all the intended children (text nodes, for example), so this is a distinctly different mode of operation, and arguably a different kind of node.
Proposal 2: allow a stand-alone document fragment that doesn't have an owner document and push the rest onto the serialization module. ISSUE: Is this really significantly better than the current solution of creating a Document that you don't intend to directly reference, creating a normal DocumentFragment from that, and using the same specialized serializer?
Proposal 3: add a createExternalParsedEntityContent method (or some other name) which is like createDocument but doesn't have a documentElement and can be used for text files or external parsed entities with markup.
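
The currently available solution can be sketched with Python's xml.dom.minidom. The serialize_content() helper is hypothetical; it plays the part of the serializer that knows to discard the container.

```python
from xml.dom import minidom

doc = minidom.Document()  # throwaway owner; never serialized itself
fragment = doc.createDocumentFragment()

# Content with no single root: text, an element, then more text.
fragment.appendChild(doc.createTextNode("Dear "))
name = doc.createElement("name")
name.appendChild(doc.createTextNode("reader"))
fragment.appendChild(name)
fragment.appendChild(doc.createTextNode(", hello."))


def serialize_content(container):
    """Serialize the children only, discarding the container node itself."""
    return "".join(child.toxml() for child in container.childNodes)


entity_text = serialize_content(fragment)
```

The result is a legal external parsed entity but not a well-formed document, and the owner Document exists only to act as a node factory.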

2.17 Visitor design pattern

We get periodic requests to implement the Visitor pattern [Design Patterns, 1994, Gamma et al].

Visitor is a standard object design which allows you to execute a unique method based on two object types. This is known as "double-dispatching". The intent in this case would be to execute different code based on the combination of a specific Node subtype and a specific Operation type. (The Operation would correspond to the Visitor).

In order to synthesize double-dispatching in a language that only supports single-dispatching (e.g. C++, Java, Perl) a method must be defined in the Operation interface which corresponds to each Node type, and each Node must implement an Accept(Operation) method which invokes the appropriate support method.

Pro: Visitor provides a well-known idiom for defining an operation to be performed on elements of an object structure.
Con: Visitor is intended to address structures containing objects with varying interfaces. Since the DOM tree is made up entirely of Nodes, it is unclear that this capability is really advantageous in our case.
Con: The Visitor model makes adding new operations easy... at the cost of making it hard to add new types/subtypes of objects. Quoting James Melton: "If client code extended the Node hierarchy it would break any existing [Visitor] implementation. Further, if the DOM was extended to include a new Node subtype the Operator interface would have to be extended and every Operator implementation would require updating before the client could use the new DOM version."
Con: And in fact, XML processing will often want to dispatch on namespace and node name as well as node type. If the nodes aren't subclassed to that level, the Visitor's dispatch is only the first, and possibly least, step in determining the node's semantic role and the proper behavior of this Operation.
Conclusion: Writing an explicit dispatch which examines the node and switches appropriately for this Operation may not be as elegantly object-oriented, but it accomplishes essentially the same goal, seems to allow better extensibility, and our current best estimate is that it's likely to yield better performance in most DOM implementations. So at this time, we don't see significant value in adding Accept/Visitor to the DOM API.
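
A minimal sketch of such an explicit dispatch, assuming Python's xml.dom.minidom. The "describe" Operation and its handler names are invented for this illustration.

```python
from xml.dom import minidom, Node


def describe_element(node):
    return f"element {node.tagName}"


def describe_text(node):
    return f"text {node.data!r}"


def describe_other(node):
    return f"node type {node.nodeType}"


# Dispatch table for one Operation ("describe"), keyed on node type.
DESCRIBE_HANDLERS = {
    Node.ELEMENT_NODE: describe_element,
    Node.TEXT_NODE: describe_text,
}


def describe(node):
    """Examine the node and switch to the handler for this Operation."""
    handler = DESCRIBE_HANDLERS.get(node.nodeType, describe_other)
    return handler(node)


doc = minidom.parseString("<p>hi<!-- note --></p>")
p = doc.documentElement
descriptions = [describe(node) for node in [p, *p.childNodes]]
```

A node type the table does not know about falls through to a default handler instead of breaking the Operation, and the key could be widened to (namespace, name) without touching the Node classes.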

2.18 Error reporting

A list of error codes could be developed for USLs and other languages that don't support exceptions.

2.19 Meta data

"For level 1, just have all meta data be of type string/UTF-8 (for the 5 minimal things), but the type system must be extensible in the long run. Level two will have more on meta data."

2.20 URIs and systemIds

(from David Brownell's list.)

For nodes exposing a SystemId, there are cases where it needs to be a fully resolved URI; and there are cases where it needs to be unresolved.