From HTML WG Wiki

Document fragment pointers

Bug Report

Problem statement / use cases

  • authors do not always provide ID attribute values on every element that users and other authors may want to reference
  • authors sometimes fail to ensure that ID values are unique, which blocks references to one or both of the affected document fragments

Proposed solutions

Identify semantics and syntax

The first step is to identify the semantics that we might want authors to be capable of referencing within HTML documents.

This is just a first shot at the semantics that might be needed and a syntax that might work. I've tried to choose special characters that are reserved for special use in URIs, and I've also tried to stay within ASCII. If there are any collisions with other syntax, please note them on the list or in the bug report.

This is also an attempt to fit within the space of XPointers. In that sense the ChildIndex is a generalized form of the XPointer ChildSeq, which in turn becomes one part of a RelSeq. In this way authors can express many more document fragment references without resorting to the complications associated with full XPointer functions.

Pointer ::= RelSeq | Clip | Range | FullXPtr

Expression ::= Possible Syntax
  Note

RelSeq ::= ("#" & Name)?
           IDRefDelimiter &
           (SiblingIndex | Ancestor)? &
           (ChildIndex | ChildName)* &
           (AttrName)? &
           (CharIndex)?

IDRefDelimiter ::=
  Note: A delimiter might make it possible for the URL to degrade gracefully in UAs that do not support this new pointer approach, though on the other hand it would also be an incompatibility with XPointer syntax.

SiblingIndex ::= ("+" | "-") & 1-9 & (0-9)*
  Note: Sibling element index. "+1" indicates the next sibling and "-3" the third previous sibling (implying 0 for the element itself).

Ancestor ::= "../" & ("../")*

ChildIndex ::= '/' & "!"? & [1-9] [0-9]*
  Note: Child element index. The "!" character causes the index to be reversed, counting from the last child element in the element’s contents.

ChildName ::= '/' & Name
  Note: Child element name (IDENT). Using the proposed IdAndTypeID: IDENT data type.

AttrName ::= "$" & Name

CharIndex ::= "=" & "!"? & 1-9 & (0-9)*
  Note: The "!" character causes the index to be reversed, counting from the last character in the element’s contents. Characters are counted as Unicode characters (where each surrogate pair is one character), ignoring markup characters and after applying XML whitespace normalization (line breaks are always one character) but independent of CSS whitespace handling (more whitespace characters may be present than are rendered for display).

ClipBookmark: Clip ::= "@" & NMToken

Range ::= RelSeq & "~" & RelSeq
  Note: The first child sequence is relative to the initial document ID, while the second is relative to the first unless a second "#" or "/" commences the RelSeq.

FullXPtr ::= Reference XPointer Recommendation
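
To make the RelSeq production more concrete, here is a minimal Python sketch of a parser for it. It is an illustration only, not part of the proposal. It rests on several assumptions: the IDRefDelimiter is omitted because no syntax has been chosen for it, Name is approximated as an ASCII identifier without "-" or "." (so that it cannot swallow a following SiblingIndex or Ancestor), and the function name parse_rel_seq and the shape of its result are hypothetical.

```python
import re

# Assumption: Name is approximated as an ASCII identifier. The real Name/IDENT
# production is broader; "-" and "." are excluded here to avoid ambiguity.
NAME = r"[A-Za-z_][A-Za-z0-9_]*"

# Assumption: IDRefDelimiter is skipped, since the table gives it no syntax.
REL_SEQ = re.compile(
    rf"(?:#(?P<id>{NAME}))?"                          # "#" & Name
    rf"(?P<rel>[+-][1-9][0-9]*|(?:\.\./)+)?"          # SiblingIndex | Ancestor
    rf"(?P<children>(?:/(?:!?[1-9][0-9]*|{NAME}))*)"  # (ChildIndex | ChildName)*
    rf"(?:\$(?P<attr>{NAME}))?"                       # AttrName
    rf"(?:=(?P<char>!?[1-9][0-9]*))?"                 # CharIndex
)

CHILD_STEP = re.compile(rf"/(!?[1-9][0-9]*|{NAME})")


def parse_rel_seq(text):
    """Split a RelSeq string into its parts, or raise ValueError."""
    match = REL_SEQ.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"not a RelSeq: {text!r}")
    children = []
    for step in CHILD_STEP.findall(match["children"]):
        if step.startswith("!"):
            # "!" reverses the index: count from the last child element.
            children.append(("index", -int(step[1:])))
        elif step[0].isdigit():
            children.append(("index", int(step)))
        else:
            children.append(("name", step))
    return {
        "id": match["id"],
        "relative": match["rel"],
        "children": children,
        "attr": match["attr"],
        "char": match["char"],
    }
```

Under these assumptions, "#intro+1/2/!1/p$title=!3" reads as: start at the element with ID "intro", move to its next sibling, take that element's second child, then its last child, then a child named "p", then the "title" attribute, and finally the third character from the end.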

Error handling

The other part HTML5 should define is how processors should handle situations where the pointer fails, either partially or completely, including how much notification should be passed on to users of an interactive UA.
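
One possible behavior for partial failure, sketched in Python for discussion only: resolve as many child steps as possible and report how far resolution got, so that a UA could fall back to the deepest element found. Nothing here is specified by the proposal. The function name resolve_children, the step representation (("index", n) with a negative n for "!"-reversed indexes, or ("name", tag)), and the choice to match a ChildName against the first child with that name are all assumptions.

```python
import xml.etree.ElementTree as ET


def resolve_children(start, steps):
    """Follow child steps from `start`.

    Returns (node, resolved_count). If resolved_count < len(steps), the
    pointer failed partially and `node` is the deepest element reached.
    """
    node = start
    for done, (kind, value) in enumerate(steps):
        children = list(node)  # element children only
        if kind == "index":
            # Positive: 1-based from the first child.
            # Negative: 1-based from the last child ("!" form).
            pos = value - 1 if value > 0 else len(children) + value
            found = children[pos] if 0 <= pos < len(children) else None
        else:
            # Assumption: a name step selects the first child with that name.
            found = next((c for c in children if c.tag == value), None)
        if found is None:
            return node, done  # partial (or complete) failure
        node = found
    return node, len(steps)


doc = ET.fromstring("<div><p>a</p><ul><li>x</li><li>y</li></ul></div>")
node, resolved = resolve_children(doc, [("name", "ul"), ("index", 5)])
# The "ul" step resolves but the fifth child does not exist, so the
# caller receives the "ul" element and a count of 1 out of 2 steps.
```

A resolved count of 0 corresponds to complete failure of the child sequence. Whether a UA should then scroll to the starting element, notify the user, or ignore the pointer is exactly the question left open above.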

Discussion and evaluation


WG members should post feedback and other discussion to the WG’s mailing list (the URIs of the links below provide date information). Search on this email subject.

See also