Document fragment pointers
Problem statement / use cases
- authors do not always provide ID attribute values for all elements for users and other authors to reference
- authors sometimes do not ensure that ID values are unique, which prevents referencing one or both of the affected document fragments
Identify semantics and syntax
The first step is to identify the semantics that we might want authors to be capable of referencing within HTML documents.
This is just a first shot at the semantics that might be needed and a syntax that might work. I've tried to choose special characters that are reserved for special use in URIs, and also tried to stay within ASCII. If there are any collisions with other syntax, please note it on the list or in the bug report.
This is also an attempt to fit within the space of XPointers. In that sense the ChildIndex is a generalized form of the XPointer ChildSeq, which then becomes one part of a RelSeq. In this way authors can express many more document fragment references without resorting to the complications associated with full XPointer functions.
Pointer ::= RelSeq | Clip | Range | FullXPtr
|RelSeq::=||("#" & Name)? & (SiblingIndex | Ancestor)? & (ChildIndex | ChildName)* & (AttrName)? & (CharIndex)?|
|IDRefDelimiter::=||a delimiter might make it possible for the URL to degrade gracefully in UAs that do not support this new pointer approach, though on the other hand it would also be an incompatibility with XPointer syntax|
|SiblingIndex::=||("+" | "-") & [1-9] [0-9]*||Sibling element index. "+1" indicates the next sibling, "-3" the third previous sibling (implying 0 for the element itself)|
|Ancestor::=||"../" & ("../")*|
|ChildIndex::=||'/' & "!"? & [1-9] [0-9]*||Child element index. The "!" character causes the index to be reversed counting from the last child element in the element’s contents|
|ChildName::=||'/' & Name||Child element name (IDENT). Using the proposed IdAndTypeID: IDENT data type|
|AttrName::=||"$" & Name|
|CharIndex::=||"=" & "!"? & [1-9] [0-9]*||The "!" character causes the index to be reversed, counting from the last character in the element’s contents. Characters are counted as Unicode characters (where each surrogate pair is one character), ignoring markup characters and after applying XML whitespace normalization (line breaks are always one character), but independent of CSS whitespace handling (more whitespace characters may be present than are rendered for display).|
|ClipBookmark: Clip::=||"@" & NMToken|
|Range::=||RelSeq & "~" & RelSeq||The first child sequence is relative to the initial document ID, while the second is relative to the first unless a second "#" or "/" commences the RelSeq|
|FullXPtr::=||Reference XPointer Recommendation|
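To make the RelSeq production easier to discuss, here is a minimal parsing sketch in Python. It is only one possible reading of the grammar above, not a reference implementation. The function names, the dictionary layout, and the choice to store a "!" index as a negative number are all assumptions of this sketch. Name is also deliberately narrowed to ASCII letters, digits, and "_": real XML Names may contain "-", "." and ":", which would be ambiguous with SiblingIndex and Ancestor directly after an ID, so that point probably needs settling in the syntax itself.

```python
import re

# Simplified Name (an assumption of this sketch): ASCII letters, digits, "_".
# "-", "." and ":" are left out because they would clash with "-3" and "../".
NAME = r"[A-Za-z_][A-Za-z0-9_]*"
INDEX = r"!?[1-9][0-9]*"

RELSEQ = re.compile(
    rf"(?:#(?P<id>{NAME}))?"
    rf"(?:(?P<sibling>[+-][1-9][0-9]*)|(?P<ancestor>(?:\.\./)+))?"
    rf"(?P<children>(?:/(?:{INDEX}|{NAME}))*)"
    rf"(?:\$(?P<attr>{NAME}))?"
    rf"(?:=(?P<char>{INDEX}))?"
)
CHILD_STEP = re.compile(rf"/(?:(?P<index>{INDEX})|(?P<name>{NAME}))")


def parse_index(text):
    """Turn "3" into 3 and "!3" into -3 (negative means counted from the end)."""
    return -int(text[1:]) if text.startswith("!") else int(text)


def parse_relseq(pointer):
    """Parse a RelSeq string into a dict; raise ValueError if it does not match."""
    m = RELSEQ.fullmatch(pointer)
    if m is None:
        raise ValueError(f"not a RelSeq: {pointer!r}")
    children = []
    for step in CHILD_STEP.finditer(m["children"]):
        if step["index"] is not None:
            children.append(("index", parse_index(step["index"])))
        else:
            children.append(("name", step["name"]))
    return {
        "id": m["id"],
        "sibling": int(m["sibling"]) if m["sibling"] else 0,
        "ancestor": len(m["ancestor"]) // 3 if m["ancestor"] else 0,
        "children": children,
        "attr": m["attr"],
        "char": parse_index(m["char"]) if m["char"] else None,
    }


# "#intro/2/!1=5": from the element with ID "intro", take its second child,
# then that element's last child, then the fifth character of its contents.
example = parse_relseq("#intro/2/!1=5")
```

Range could then be handled by splitting on "~" and parsing each side with the same function, though how the second RelSeq is anchored to the first would still need to be worked out.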
The other part HTML5 should define is how processors should handle situations where the pointer fails, either partially or completely, including how much notification should be passed on to users of an interactive UA.
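As one possible starting point for that discussion, the sketch below resolves child steps against a parsed document and, on failure, returns the deepest element it did reach together with the number of steps that matched. This is an assumption made for illustration, not proposed behavior. The helper names and the representation of steps as ("index", n) or ("name", tag) tuples are hypothetical, negative indexes stand for "!" (counted from the last child), and a ChildName step is assumed to select the first child with that name.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, id_value):
    """Return the first element whose id attribute equals id_value, or None."""
    for element in root.iter():
        if element.get("id") == id_value:
            return element
    return None


def resolve_children(start, steps):
    """Follow child steps from start; return (last element reached, steps matched)."""
    current = start
    for matched, (kind, value) in enumerate(steps):
        children = list(current)
        if kind == "index":
            # Positive values count from the first child, negative from the last.
            position = value - 1 if value > 0 else len(children) + value
            if not 0 <= position < len(children):
                return current, matched  # partial failure: stop at this element
            current = children[position]
        else:
            # Assumption: a name step picks the first child with that tag name.
            named = [child for child in children if child.tag == value]
            if not named:
                return current, matched
            current = named[0]
    return current, len(steps)


doc = ET.fromstring(
    '<body><div id="intro"><p>one</p><ul><li>a</li><li>b</li></ul></div></body>'
)
start = find_by_id(doc, "intro")

# Full match: second child of "intro" (the ul), then its last child.
target, matched = resolve_children(start, [("index", 2), ("index", -1)])

# Partial match: the ul has no fifth child, so resolution stops at the ul.
fallback, matched_partial = resolve_children(start, [("index", 2), ("index", 5)])
```

With a return value like this, a UA could scroll to the nearest resolvable ancestor and decide separately whether to tell the user that the pointer only partly matched.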
Discussion and evaluation