Best Practices for Fragment Identifiers and Media Type Definitions

Abstract

Fragment identifiers within URIs are specified as being interpreted based on the media type of a representation. Media type definitions therefore have to provide details about how fragment identifiers are interpreted for that media type. This document recommends best practices for the authors of media type definitions, for the authors of structured syntax suffix definitions (such as +xml), for the authors of specifications that define syntax for fragment identifiers, and for authors that publish documents that are intended to be used with fragment identifiers or who refer to URIs using fragment identifiers.

Media type registrations should avoid "inheriting" generic fragment identifier rules from both the top-level type and any structured syntax suffix that they use if the fragment identifier syntaxes defined for these overlap and may provide different meanings for the same fragment identifier. Where fragment identifier syntax does overlap, media type registrations should specify which syntax takes priority in resolving a fragment identifier. Media type registrations should also reserve the use of plain name fragment identifiers for content named by authors, and specify any restrictions on the interpretation of fragment identifiers by scripts. They should avoid defining new fragment identifier structures within the registration document itself, and should avoid constraining how applications handle fragment identifiers that do not resolve.

A structured syntax suffix registration is based on a structured syntax, which will usually have its own media type registration. The structured syntax suffix registration should define the same fragment identifier rules as are used in that media type registration. Further, it should specify that any fragment identifiers that do not resolve according to these rules should be handled in the way specified by the specific media type that is using the structured syntax.

The designers of fragment identifier structures (such as XPointer) should avoid syntactic overlaps with existing fragment identifier structures and ensure that fragment identifiers can be used across formats with similar semantics.

Publishers should ensure that any addressable structures within documents that are served through content negotiation are consistent across content-negotiated variants. They should also ensure that scripts handle fragment identifiers consistently with the fragment identifier rules for the relevant media type. Authors referring to URIs with fragment identifiers should avoid using fragment identifiers that are specific to a particular syntax (such as XPointer, which is specific to XML) unless they can ascertain that the base URI only serves one representation.

Introduction

Fragment identifiers within URIs are used in three main ways:

to jump to, highlight, or zoom in to a particular piece of content when displaying a larger document
to identify a piece of content for extraction, for example for embedding within another document
to provide an identifier for either a piece of content or something described within a document that can be used as the basis of annotation

When URIs contain fragment identifiers, they are interpreted based on the media type of the representation that is retrieved when the URI is requested. The Generic Syntax for URIs [URI] states:

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained. Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications.

Individual media types may define their own restrictions on or structures within the fragment identifier syntax for specifying different types of subsets, views, or external references that are identifiable as secondary resources by that media type. If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found).

Media Type Specifications and Registration Procedures includes a "Fragment identifier considerations" section within the template for registering media types and says:

Media type registrations can specify how applications should interpret fragment identifiers [RFC3986] associated with the media type.

Media types are encouraged to adopt fragment identifier schemes that are used with semantically similar media types. In particular, media types that use a named structured syntax with a registered "+suffix" must follow whatever fragment identifier rules are given in the structured syntax suffix registration.

Problems arise when a media type wishes to adopt several sets of fragment identifier semantics because of its similarity with other media types and its use of a structured syntax. For example, as well as defining its own method of interpreting fragment identifiers, SVG [SVG11] has the media type image/svg+xml and therefore must follow the rules for fragment identifiers that are common to all XML documents (XML Media Types Draft). As an image format, it should also use the common fragment identifier semantics for images (Media Fragments URI 1.0 [MEDIA-FRAGMENTS]). If RDF is embedded within the SVG through RDF/XML or RDFa, fragment identifiers might have RDF semantics and be used to refer to real-world things pictured within the SVG. Finally, if fragment identifiers are interpreted by scripts embedded within the SVG, they may have yet another purpose: to encode local application state.

SVG is only one example of a media type in which conflicts between different uses of fragment identifiers occur. XHTML, which also uses the +xml structured syntax suffix, contains scripts that may interpret fragment identifiers and may be used to carry data interpreted according to RDF semantics.

This document recommends some Best Practices for those registering media types, those registering structured syntax suffixes, the authors of fragment identifier schemes and individual document authors.

1. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].

The RFC2119 keywords are only used within Best Practices.

2. Terminology

fragment identifier structure: a defined set of fragment identifier syntax, semantics and processing requirements
plain name fragment identifier: a fragment identifier that does not include any special internal syntax
semantic fragment identifier structure: a fragment identifier structure that provides access to semantic fragments of a document based on application-level understanding of its meaning, and may be used across multiple media types that use different syntaxes; media fragment URIs are an example
syntax-based fragment identifier structure: a fragment identifier structure that provides access to structures within a document based on the syntax used by the document; XPointer is an example

3. Best Practices Summary

Best Practice 1: Ensure Consistent Generic Processing of Overlapping Fragment Identifier Structures
Best Practice 2: Specify Resolution of Fragments that Comply with Multiple Fragment Identifier Structures
Best Practice 3: Reserve Plain Name Fragment Identifiers
Best Practice 4: Define Active Content Processing of Fragment Identifiers
Best Practice 5: Avoid Specifying Fragment Identifier Structures within Media Type Registrations
Best Practice 6: Process Fragment Identifiers in the Same Way as for the Generic Media Type
Best Practice 7: Enable Additional Fragment Identifiers to be Processed According to Media Type
Best Practice 8: Avoid Conflicting with Existing Fragment Identifier Structures
Best Practice 9: Define Fragment Identifier Structures for General Use
Best Practice 10: Name Structures Consistently Across Content Negotiated Representations
Best Practice 11: Scripts Must Adhere to Constraints Specified within Media Type Registrations
Best Practice 12: Do Not Use Syntax-Based Fragment Identifiers with Multiple Content-Negotiated Variants

4. Best Practices for Media Type Registrations

Individual media type registrations define how fragment identifiers should be interpreted when found in documents of that media type. These registrations must balance the following goals:

enable consistent processing of fragment identifiers by applications that are aware of the specific media type and by generic processors for types that share the same structured syntax
facilitate content negotiation between documents of the media type and other media types that might be used for representations of the same resource
if the media type supports scripting, enable publishers to use fragment identifiers to encode application state where appropriate

Generic applications may process documents of a particular media type without knowing about the specific rules that apply to that media type as specified in its registration. For example, a browser might always attempt to display any text/* document as text, or any application/*+xml document using XML syntax highlighting. The interpretation of fragment identifiers by media-type-aware applications should match the behaviour of these generic applications, so that the same fragment is identified whether or not the application has built-in knowledge of the media type.

As specified in Media Type Specifications and Registration Procedures [RFCXXXX], media type registrations that adopt a named structured syntax with a registered suffix, such as XML or JSON, must follow whatever fragment identifier rules are specified in the structured syntax suffix registration. This ensures that there is consistency in processing between generic applications that understand the structured syntax and those that are aware of the specific media type. Similar considerations also apply, however, to other fragment identifier structures that may be used by generic processors, for example those that perform generic processing based on the top-level media type (text, image and so on).

Another source of constraints on fragment identifier structures supported by a media type is support for content negotiation. When multiple representations with different media types are served for the same URL, fragment identifiers should be used consistently across those documents, either identifying content with the same semantic content in each representation, or giving an error.

For example, it might be anticipated that documents in ABC Music Notation with a media type of text/vnd.abc would frequently be served up through content negotiation alongside documents in MusicXML with a media type of application/vnd.recordare.musicxml+xml. People referring to music may want to reference particular bars within the musical score using fragment identifiers. To enable this to happen, both media types would need to support the same fragment structure, so that the same bar could be identified regardless of which content negotiated representation was served up. In this example, the two formats do not share the same structured syntax and are not within the same top-level media type: the need for consistency arises because the two formats have the same semantic content.

With multiple potential fragment identifier structures to comply with, there's the potential for the syntaxes of those structures to overlap with each other, which means that any given fragment identifier might:

identify the same part of the document under both interpretations
only identify a part of the document under one interpretation (the fragment identifier being an error under the alternative interpretation)
identify different parts of the document under the different interpretations

The last of these possibilities is problematic, and it can be hard for someone writing a URI reference to know which of these three categories a given fragment identifier falls into. In addition, if the base document is changed after a given URI reference is created, a fragment identifier might switch category, and suddenly become problematic without the creator of the URI reference being aware of the error.
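
The three cases above can be made concrete with a small sketch. The Python below is illustrative only: the resolver functions, the toy mappings and the category names are assumptions made for this example and do not come from any specification. A fourth category is included for fragment identifiers that resolve under neither interpretation.

```python
from enum import Enum
from typing import Callable, Optional

# A resolver maps a fragment identifier to the part of the document it
# identifies, or None if the fragment is an error under that interpretation.
Resolver = Callable[[str], Optional[str]]


class Overlap(Enum):
    SAME_PART = "same part under both interpretations"
    ONE_ONLY = "resolves under one interpretation only"
    DIFFERENT_PARTS = "different parts under each interpretation"
    NEITHER = "resolves under neither interpretation"


def classify(fragment: str, first: Resolver, second: Resolver) -> Overlap:
    """Compare what two interpretations say a fragment identifies."""
    a, b = first(fragment), second(fragment)
    if a is None and b is None:
        return Overlap.NEITHER
    if a is None or b is None:
        return Overlap.ONE_ONLY
    return Overlap.SAME_PART if a == b else Overlap.DIFFERENT_PARTS


# Hypothetical interpretations of fragments for one toy document.
by_name = {"intro": "section-1", "t=10": "section-2"}.get
by_time = {"t=10": "section-3"}.get

print(classify("intro", by_name, by_time).name)  # ONE_ONLY
print(classify("t=10", by_name, by_time).name)   # DIFFERENT_PARTS
```

Only the DIFFERENT_PARTS outcome is the problematic one. Because the result depends on the current content of the document, the same check can give a different answer after the document is edited.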

For these reasons, it is best if media types avoid syntactic conflicts between fragment identifier structures. When syntactic overlaps occur due to a requirement to support different types of generic processing (i.e., generic processing based on the top-level media type and generic processing based on a structured syntax suffix), the media type registrant should ensure that every fragment identifier in the overlapping syntax identifies the same fragment under each; if that is not possible, the media type should not use the structured syntax suffix.

Best Practice 1: Ensure Consistent Generic Processing of Overlapping Fragment Identifier Structures

If two or more fragment identifier structures used by different generic processors applicable to the media type overlap in syntax, they should have consistent semantics for any fragments that use that common syntax.

There may also be syntactic overlaps between fragment identifier structures that address application-specific fragments (as in the music notation example above), and between application-specific fragment identifier structures and generic fragment identifier structures. Applications that are aware of the application-specific fragment structures will know about and can therefore follow guidance within the media type registration about how to interpret fragment identifiers, so fragment identifiers that follow the syntax of more than one application-specific fragment identifier structure can be resolved in a predictable and consistent manner across applications. The media type registration simply has to specify how this happens.

Best Practice 2: Specify Resolution of Fragments that Comply with Multiple Fragment Identifier Structures

If there's the possibility for a fragment identifier to comply with the syntax of multiple fragment identifier structures used by the media type, and the fragment identifier would identify different fragments in those cases, the registration should specify how such fragment identifiers are resolved.
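
One simple way for a registration to specify resolution is to give the fragment identifier structures an order of precedence. The following Python is a minimal sketch of that idea, assuming a hypothetical media type with two overlapping structures; the structure names, the mappings and the chosen order are invented for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Resolver = Callable[[str], Optional[str]]


def resolve(fragment: str,
            structures: Sequence[Tuple[str, Resolver]]) -> Optional[Tuple[str, str]]:
    """Try each structure in the order stated by the registration.

    Returns (structure name, identified part) for the first structure that
    resolves the fragment, or None if no structure resolves it.
    """
    for name, resolver in structures:
        part = resolver(fragment)
        if part is not None:
            return name, part
    return None


# Hypothetical registration: named structures take priority over a second
# structure whose syntax overlaps with plain names.
named = {"bar12": "element with id bar12"}.get
numbered = {"bar12": "twelfth bar of the score",
            "bar3": "third bar of the score"}.get

order = [("plain name", named), ("bar number", numbered)]
print(resolve("bar12", order))  # ('plain name', 'element with id bar12')
print(resolve("bar3", order))   # ('bar number', 'third bar of the score')
```

Any application that follows the same stated order reaches the same answer for "bar12", which is the predictability the Best Practice asks for.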

Plain names are a common type of fragment identifier structure. A plain name fragment identifier is a fragment identifier that is used to identify a named structure that is local to a document, such as one identified by an @id attribute in HTML, an @xml:id attribute in XML, or the name of a function within a Python program. These fragment identifiers are opaque to processors and as such they do not normally include punctuation characters, though this depends on the language: in XML, for example, they usually match the NCName production from XML Namespaces [XML-NAMES11], which means they can contain hyphens (-) and periods (.).

Plain name fragment identifiers are usually created by human authors but may also be generated by applications. They provide a good method of identifying content that is equivalent across content negotiated variants of a document, for example paragraphs of text in French and Chinese that contain the same semantic content. Plain name fragment identifiers that do not identify a portion of a document are frequently used in Semantic Web applications as a way of providing an identifier for a thing described by the document.

Best Practice 3: Reserve Plain Name Fragment Identifiers

If the media type includes structures that can be given local names or identifiers, plain name fragment identifiers should be reserved for addressing those structures.
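
As an illustration of reserving plain names for named structures, the Python sketch below resolves a plain name fragment identifier against an XML document. It rests on two assumptions that should be read as such: the pattern is an ASCII-only approximation of the NCName production, and a bare id attribute is treated as an identifier alongside xml:id.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# ASCII-only approximation of NCName; the real production also admits many
# non-ASCII characters.
PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def resolve_plain_name(fragment: str, root: ET.Element) -> Optional[ET.Element]:
    """Return the element named by a plain name fragment, or None."""
    if not PLAIN_NAME.fullmatch(fragment):
        return None  # not a plain name; some other structure may apply
    for element in root.iter():
        # Treating a bare "id" attribute as an identifier is an assumption
        # of this sketch, not a general rule for XML documents.
        if fragment in (element.get(XML_ID), element.get("id")):
            return element
    return None


doc = ET.fromstring('<doc><p xml:id="intro"/><q id="section-2.1"/></doc>')
print(resolve_plain_name("intro", doc).tag)        # p
print(resolve_plain_name("section-2.1", doc).tag)  # q
print(resolve_plain_name("xywh=0,0,10,10", doc))   # None
```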

Some media types support active content, whereby scripts provided by the publisher are used to manipulate the document while it is being viewed. Depending on the scripting support in the media type, such scripts may use the fragment identifier to encode application state (see Identifying Application State for details). The presence of a script does not change what a given fragment identifier identifies, but individual scripts may extend the space of meaningful fragment identifiers for a particular document, by virtue of interpreting those fragment identifiers.

As described in section 7. Best Practices for Document Authors, the developers of scripts need to ensure that any fragment identifiers whose behaviour is defined by the media type of the document the script is used with are handled consistently with that definition. Media type registrations therefore need to make it easy for such developers to understand how fragment identifiers will be interpreted by other applications, and what syntax can be used by script developers to encode application state. For example, in HTML, hash-bang URIs, in which the fragment identifier starts with #!, are commonly reserved for interpretation by scripts.

Best Practice 4: Define Active Content Processing of Fragment Identifiers

If the media type supports active content, the registration should specify any constraints on how scripts may process fragment identifiers adhering to known fragment identifier structures. The registration may define a reserved syntax for fragment identifiers that are interpreted by scripts.
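
A reserved syntax makes it easy to decide who interprets a fragment identifier. The Python below is a minimal sketch assuming a registration that reserves fragment identifiers beginning with "!" (the hash-bang convention) for scripts; the function name, the return labels and the example URIs are hypothetical.

```python
from urllib.parse import urldefrag


def handler_for(uri: str, script_prefix: str = "!") -> str:
    """Say whether the script or the media type rules interpret the fragment."""
    fragment = urldefrag(uri).fragment
    if not fragment:
        return "none"
    return "script" if fragment.startswith(script_prefix) else "media type"


print(handler_for("http://example.org/app#!/inbox/42"))  # script
print(handler_for("http://example.org/app#section-2"))   # media type
print(handler_for("http://example.org/app"))             # none
```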

Aside from specifying support for plain name fragment identifiers and any fragment identifier syntax reserved for use by scripts, individual media type registrations should not contain the specifications for media-type-specific fragment identifier structures. Instead, registrants should consider specifying a separate fragment identifier structure, following the guidelines in section 6. Best Practices for Fragment Identifier Structures, and referencing that specification from the media type registration. This ensures that other media types with similar content can easily reference and reuse the same fragment identifier structure.

Best Practice 5: Avoid Specifying Fragment Identifier Structures within Media Type Registrations

Media type registrations should reference external fragment identifier structure specifications where they exist, and the registrant should create them if required, rather than embedding such definitions within the media type registration.

It is possible for a given fragment identifier used with a document to be an error in two ways:

the fragment identifier might not match the syntax of any of the fragment identifier structures used by the media type
the fragment identifier might match the syntax but not resolve to a fragment of the document (for example because there is no named structure with a given plain name identifier)

There are several legitimate reasons for fragment identifiers to error in these ways. Fragment identifiers, particularly plain name fragments, are sometimes used within URIs to identify things described by the document, rather than a fragment within the document. Active content may interpret otherwise unrecognised fragment identifiers. A given document may have a content negotiated variant for which the fragment identifier is meaningful. Thus the purpose of a media type registration is to define how recognised fragment identifiers are to be resolved, not to constrain the syntax of fragment identifiers used for a given document. The behaviour of an application faced with a fragment identifier that does not resolve to a fragment for whatever reason should be implementation defined.
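
The two kinds of error can be told apart mechanically, as in the Python sketch below. The single structure used here (ASCII plain names with a fixed set of targets) and the outcome labels are assumptions for illustration. The function reports an outcome rather than raising an exception, leaving the reaction to the caller in keeping with the implementation-defined behaviour described above.

```python
import re
from enum import Enum
from typing import List, Set, Tuple


class Outcome(Enum):
    RESOLVED = "resolved"
    UNRESOLVED = "syntax matched, nothing identified"
    UNRECOGNISED = "no structure's syntax matched"


def check(fragment: str,
          structures: List[Tuple[re.Pattern, Set[str]]]) -> Outcome:
    """Classify a fragment against (syntax, resolvable targets) pairs."""
    matched = False
    for syntax, targets in structures:
        if syntax.fullmatch(fragment):
            matched = True
            if fragment in targets:
                return Outcome.RESOLVED
    return Outcome.UNRESOLVED if matched else Outcome.UNRECOGNISED


# Hypothetical media type with a single structure: ASCII plain names.
plain_names = (re.compile(r"[A-Za-z_][A-Za-z0-9._-]*"), {"harry", "ron"})

print(check("harry", [plain_names]).name)         # RESOLVED
print(check("voldemort", [plain_names]).name)     # UNRESOLVED
print(check("xywh=0,0,1,1", [plain_names]).name)  # UNRECOGNISED
```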

5. Best Practices for Structured Syntax Suffix Registrations

Structured syntax suffix registrations are designed to enable generic processing of media types that share a structured syntax such as XML (+xml) and JSON (+json). With respect to fragment identifiers, these registrations should describe the generic processing of fragment identifiers within documents that use the structured syntax.

The processing of fragment identifiers for media types that adopt a structured syntax suffix and the media type for the structured syntax itself should be identical. For example, fragment identifiers for +xml media types should be processed in the same way as fragment identifiers for the application/xml media type and fragment identifiers for +json should be processed in the same way as fragment identifiers for application/json. This ensures that generic processors designed for the generic media type can be used with the media types that adopt the structured syntax suffix.

Best Practice 6: Process Fragment Identifiers in the Same Way as for the Generic Media Type

Structured syntax suffix registrations should define processor behaviour for fragment identifiers that is consistent with the relevant associated generic media type.
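
A generic processor can apply this rule by mapping a media type to the generic type whose fragment identifier processing it shares. The Python below is a sketch under the assumption that only the two suffixes named in this section are known; the table and function name are hypothetical.

```python
from typing import Optional

# Generic media type for each structured syntax suffix named in the text.
GENERIC_TYPE = {"+xml": "application/xml", "+json": "application/json"}


def generic_type_for(media_type: str) -> Optional[str]:
    """Return the generic type whose fragment rules the suffix shares."""
    # Drop any parameters such as "; charset=utf-8" and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    subtype = essence.partition("/")[2]
    plus = subtype.rfind("+")
    if plus == -1:
        return None  # no structured syntax suffix
    return GENERIC_TYPE.get(subtype[plus:])


print(generic_type_for("image/svg+xml"))                        # application/xml
print(generic_type_for("application/ld+json; charset=utf-8"))   # application/json
print(generic_type_for("text/html"))                            # None
```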

As described in section 4. Best Practices for Media Type Registrations, individual media types are required by Media Type Specifications and Registration Procedures [RFCXXXX] to follow whatever fragment identifier rules are given in any structured syntax suffixes that they adopt. They may need to adopt several fragment identifier structures to support other generic processing or for consistency with other types with which they share semantics. For example, image/svg+xml needs to follow the generic fragment identifier processing specified by the +xml structured syntax suffix registration as well as the generic processing of fragment identifiers used to identify portions of images.

Fragment identifier rules in a structured syntax suffix registration should therefore be focused on the generic processing of fragment identifiers for the structured syntax. They should not specify the behaviour of fragment identifiers that fall outside those generic fragment identifier structures, because if they did it would be hard for media types to adopt fragment identifier structures aside from those specified by the structured syntax suffix registration.

Best Practice 7: Enable Additional Fragment Identifiers to be Processed According to Media Type

Structured syntax suffix registrations should not classify fragment identifiers that do not match the defined fragment identifier syntax for the structured syntax or that do not resolve to a fragment as errors or constrain what they identify; instead the registration should say that such fragment identifiers are resolved according to the individual media type registration and may identify anything.

As described in section 4. Best Practices for Media Type Registrations, plain name fragment identifiers are commonly used within individual media types to address named structures within a document. The ways in which these structures are named may be specified at the structured syntax level, or at the media type level, or both. For example, XML itself defines a mechanism for naming elements within a document (using xml:id attributes [XML-ID] and the ID attribute type), but an XML-based markup language such as RDF/XML may specify an alternative semantics for plain name fragment identifiers, such as their RDF semantics.

Following the two best practices above ensures that plain name fragment identifiers which identify fragments through the generic processing of the structured syntax have a consistent semantics based on that processing, while the semantics of those that do not identify a fragment according to that generic processing can be determined by the individual media type.

6. Best Practices for Fragment Identifier Structures

A fragment identifier structure is a defined set of fragment identifier syntax, semantics and processing requirements. A fragment identifier structure may be specific to a particular media type but is often shared across a range of media types. Examples of cross-media type fragment identifier structures are XPointer [XPTR-FRAMEWORK] and Media Fragment URIs [MEDIA-FRAGMENTS].

As described in section 4. Best Practices for Media Type Registrations, a media type may adopt multiple fragment identifier structures, and it can be a problem specifying processing if these have overlapping syntaxes with different semantics. Generally, fragment identifier structures fall into two categories:

syntax-based fragment identifier structures provide access to structures within a document based on the syntax used by the document; XPointer is an example
semantic fragment identifier structures provide access to semantic fragments of a document based on application-level understanding of its meaning, and may be used across multiple media types that use different syntaxes; media fragment URIs are an example

If the syntax of a syntax-based fragment identifier structure overlaps with that of a semantic fragment identifier structure, they cannot be used together, which constrains how they might be used. Similarly, if the syntaxes of two semantic fragment identifier structures that could both apply to a given media type overlap and identify different things, the media type will have to specify which takes precedence during fragment identifier processing. The designers of new fragment identifier structures should therefore be careful to avoid clashing with existing fragment identifier structures that could feasibly be used in combination. In particular, the developers of semantic fragment identifier structures should avoid conflicts with known syntax-based fragment identifier structures, such as XPointer.

Best Practice 8: Avoid Conflicting with Existing Fragment Identifier Structures

New fragment identifier structures should avoid overlapping with the syntax used by existing fragment identifier structures that could be used in combination with them.

Fragment identifier structures are most useful when they can be adopted by a range of media types. This enables them to be used by generic processors and helps authors to reference structures common across multiple representations of content negotiated resources. Fragment identifier structures should therefore be targeted towards common use across media types rather than being media type specific. For example, rather than designing a fragment identifier structure to be used to identify musical structures purely with MusicXML, designers should take into account other formats used to represent music with which the fragment identifier structure could be used.

Best Practice 9: Define Fragment Identifier Structures for General Use

New fragment identifier structures should be defined such that they can be used across media types that share the same syntax or semantics rather than being tuned specifically to a single media type.

7. Best Practices for Document Authors

There are two categories of usage of fragment identifiers by document authors: publishing documents that contain addressable content or use scripts that process fragment identifiers, and using fragment identifiers within URIs to address content.

7.1 Best Practices for Publishers

Fragment identifiers are not passed to servers for processing, but publishers can influence the interpretation of fragment identifiers, typically by naming structures within their documents so that they can be addressed through plain name fragment identifiers. As described in [WEBARCH], when publishers do this with content-negotiated resources, they should make sure that each plain name fragment identifier identifies a set of fragments with consistent semantics across representations.

Best Practice 10: Name Structures Consistently Across Content Negotiated Representations

Publishers should ensure that structures identifiable with the same fragment identifier in two content-negotiated representations have the same semantics. Equally, where two structures in content-negotiated representations have the same semantics, they should be addressable through the same fragment identifier.
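
Part of this check can be automated. The Python sketch below assumes the publisher has already extracted the set of plain names defined in each content-negotiated variant (the media types and names shown are invented) and reports names that some variants define and others lack. It cannot judge whether structures that share a name also share semantics; that still needs human review.

```python
from typing import Dict, Set


def missing_names(variants: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each media type to the plain names it lacks but another variant has."""
    all_names = set().union(*variants.values())
    return {media_type: all_names - names
            for media_type, names in variants.items()
            if all_names - names}


# Hypothetical variants of one resource and the names each one defines.
variants = {
    "text/html": {"intro", "results"},
    "application/pdf": {"intro"},
}
print(missing_names(variants))  # {'application/pdf': {'results'}}
```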

Publishers are also responsible for any scripts called from documents that they publish. Scripts can also enhance the display of fragments within documents, for example by smoothly scrolling to the relevant area of the document, highlighting it, or zooming in or out to the selected area. Scripts may also process and alter fragment identifiers as a way of managing application state, as described in Identifying Application State.

Scripts do not change what a fragment identifier identifies, but they can change what users see. For users viewing a document, it is helpful if scripts handle fragment identifiers in a way that is consistent with how they are resolved based on the media type. For example, given an HTML document, it would be confusing if a fragment identifier #example were interpreted by a script as meaning that all instances of the word example should be highlighted, rather than scrolling to the anchor named example. As described in section 4. Best Practices for Media Type Registrations, media type registrations may specify constraints on how fragment identifiers are handled by scripts, and may specify a reserved syntax for scripts that wish to use fragment identifiers to store application state.

Best Practice 11: Scripts Must Adhere to Constraints Specified within Media Type Registrations

Scripts must adhere to any constraints that are placed on their behaviour in the appropriate media type registration.

7.2 Best Practices for Referrers

Different fragment identifiers are understood by different processors and have different longevity and utility across content-negotiated representations of a resource:

plain name fragment identifiers are typically author-generated, and therefore are relevant for so long as the structure that is named can be found within the document; they can be useful when dealing with content-negotiated resources as publishers should ensure that they are consistently used across representations, but they generally require processors that are aware of the particular media type of the representation
semantic fragment identifier structures such as Media Fragment URIs are particularly useful when addressing content-negotiated resources or resources that do not support plain name fragment identifiers; they are also likely to be more robust than syntax-based fragment identifiers in the face of changes, but usually require processors to be aware of the specific media type of the representation
syntax-based fragment identifier structures such as XPointer are useful when targeting a representation that uses a particular known syntax, particularly as they can be used by generic processors without knowledge of the underlying media type; they can be fragile in the face of changes to the underlying document, however, and do not generally work across content-negotiated variants (and when they do, they are likely to identify different fragments)

In general, plain name and semantic fragment identifiers are much more useful than syntax-based fragment identifiers, the exception being when specifically targeting fragments of a document in a particular format for processing.

Best Practice 12: Do Not Use Syntax-Based Fragment Identifiers with Multiple Content-Negotiated Variants

Authors should not use URIs with syntax-based fragment identifiers unless they can ascertain (through server documentation, an HTTP OPTIONS request or other methods) that the base URI addresses a resource with a single representation format.
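
One of the "other methods" an author might use is to inspect the response headers returned for the base URI. The Python sketch below swaps in a heuristic that this document does not itself describe: it looks for an HTTP Vary header that lists Accept (or *), which signals that the server selects a representation based on the request. The function name and header values are illustrative. The absence of such a header does not prove that only one representation exists, so a False result is a hint rather than a guarantee.

```python
from typing import Mapping


def may_vary_by_media_type(headers: Mapping[str, str]) -> bool:
    """True if a Vary header lists Accept or the wildcard."""
    vary = ""
    for name, value in headers.items():
        if name.lower() == "vary":  # header names are case-insensitive
            vary = value
    fields = {field.strip().lower() for field in vary.split(",")}
    return "accept" in fields or "*" in fields


print(may_vary_by_media_type({"Vary": "Accept-Encoding, Accept"}))  # True
print(may_vary_by_media_type({"Vary": "Accept-Encoding"}))          # False
print(may_vary_by_media_type({"Content-Type": "image/svg+xml"}))    # False
```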

A. Acknowledgements

Many thanks to Chris Lilley, Ashok Malhotra, Larry Masinter, Noah Mendelsohn and Henry Thompson for their comments, and to Robin Berjon for ReSpec.js.

B. Analysis

This appendix looks at various fragment identifier structures that apply to SVG and uses as an example a simple bar chart at http://example.org/potter, which has an SVG representation:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225">
  <g stroke="grey" stroke-width="40">
    <line id="harry" x1="50" x2="50" y1="300" y2="50" />
    <line id="hermione" x1="100" x2="100" y1="300" y2="0" />
    <line id="ron" x1="150" x2="150" y1="300" y2="100" />
    <line id="hagrid" x1="200" x2="200" y1="300" y2="50" />
    <line id="dumbledore" x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>

which appears as a row of five vertical grey bars.

B.1 Media Fragment URIs

The Media Fragment URIs specification [MEDIA-FRAGMENTS] defines a fragment identifier structure for images, videos and audio. These fragment identifiers cover the identification of spatial areas, time segments, tracks or named segments.

Under that specification, the area covering the first two lines within the example bar chart can be addressed using a URI like:

http://example.org/potter#xywh=25,0,100,225

which would identify the rectangular area containing those two bars.

The syntax for fragment identifiers defined as part of that specification is:

namevalues = namevalue *( "&" namevalue )
namevalue  = name [ "=" value ]
name       = fragment - "&" - "="
value      = fragment - "&"

; defined in RFC 3986
fragment      = *( pchar / "/" / "?" )
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

; defined in RFC 5234
ALPHA         =  %x41-5A / %x61-7A   ; A-Z / a-z
DIGIT         =  %x30-39 ; 0-9
HEXDIG        =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

This syntax essentially allows anything within a fragment identifier, although applications that follow the specification will attempt to interpret any fragment identifier on an image, audio or video representation as a set of name[=value] pairs separated by ampersands.

Named segments under this specification are addressable with the syntax id=id. Thus, the URI:

http://example.org/potter#id=hermione

could (assuming an application that recognises id attributes in SVG as naming segments addressable through fragment identifier structures defined in the Media Fragment URI specifications) identify the second bar within the bar chart, which has been labelled as hermione.
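As a rough illustration of the name[=value] interpretation described above, the following sketch splits a fragment identifier on ampersands and then reads an xywh value. The function names parseNameValues and parseXywh are hypothetical, only the plain and pixel: forms of xywh are handled, and the code is not a conformant Media Fragment URI processor.

```javascript
// Split a fragment identifier into name[=value] pairs separated by "&".
// Note: decodeURIComponent throws a URIError on malformed percent-encoding.
function parseNameValues(fragment) {
  return fragment
    .split('&')
    .filter(function (part) { return part.length > 0; })
    .map(function (part) {
      var eq = part.indexOf('=');
      var name = eq === -1 ? part : part.substring(0, eq);
      var value = eq === -1 ? null : part.substring(eq + 1);
      return {
        name: decodeURIComponent(name),
        value: value === null ? null : decodeURIComponent(value)
      };
    });
}

// Read an xywh value of the form "x,y,w,h" or "pixel:x,y,w,h".
// Returns null for anything else (including the percent: form).
function parseXywh(value) {
  var match = /^(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(value);
  if (!match) { return null; }
  return {
    x: Number(match[1]),
    y: Number(match[2]),
    w: Number(match[3]),
    h: Number(match[4])
  };
}

var spatial = parseNameValues('xywh=25,0,100,225')[0];
console.log(spatial.name, parseXywh(spatial.value));

var named = parseNameValues('id=hermione')[0];
console.log(named.name, named.value);
```

A plain name such as hermione still parses under this sketch, as a single name with no value, which is one way of seeing how the syntax overlaps with other fragment identifier structures.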

B.2 XML Media Types

The XML Media Types draft defines (among other things) syntax and processing for fragment identifiers for */*+xml media types. It states (emphasis added):

A family of specifications define fragment identifiers for XML media types. A modular syntax and semantics of fragment identifiers for the XML media types is as defined by the [XPointerFramework] W3C Recommendation. It allows simple names, and more complex constructions based on named schemes. The syntax of a fragment identifier part of any URI or IRI with a retrieved media type governed by the specification must conform to the syntax specified in [XPointerFramework]. Conformant applications must interpret such fragment identifiers as designating that part of the retrieved representation specified by [XPointerFramework] and whatever other specifications define any XPointer schemes used. Conformant applications must support the 'element' scheme as defined in [XPointerElement].

A registry of XPointer schemes [XPtrReg] is maintained at the W3C. Unregistered schemes should not be used.

When an XML-based MIME media type follows the naming convention '+xml', the fragment identifier syntax for this media type may restrict the syntax to a specified subset of schemes, but must support barenames and 'element' scheme pointers. It may further allow other registered schemes such as the xmlns scheme and other schemes.

If [XPointerFramework] and [XPointerElement] are inappropriate for some XML-based media type, it should not follow the naming convention '+xml'.

The XML Media Types draft thus defers the interpretation of fragment identifiers for */*+xml media types to XPointer. XPointer specifies the syntax:

[1]   Pointer        ::=   Shorthand | SchemeBased
[2]   Shorthand      ::=   NCName
[3]   SchemeBased    ::=   PointerPart (S? PointerPart)*
[4]   PointerPart    ::=   SchemeName '(' SchemeData ')'
[5]   SchemeName     ::=   QName
[6]   SchemeData     ::=   EscapedData*
[7]   EscapedData    ::=   NormalChar | '^(' | '^)' | '^^' | '(' SchemeData ')'
[8]   NormalChar     ::=   UnicodeChar - [()^]
[9]   UnicodeChar    ::=   [#x0-#x10FFFF]

For example, because SVG is XML, the second of the line elements in the SVG bar chart can be addressed using:

http://example.org/potter#element(/1/1/2)

This is the line element with the id hermione in the following XML:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225">
  <g stroke="grey" stroke-width="40">
    <line id="harry" x1="50" x2="50" y1="300" y2="50" />
    <line id="hermione" x1="100" x2="100" y1="300" y2="0" /> 
    <line id="ron" x1="150" x2="150" y1="300" y2="100" />
    <line id="hagrid" x1="200" x2="200" y1="300" y2="50" />
    <line id="dumbledore" x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>
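The way a child sequence such as /1/1/2 walks the element tree can be sketched as follows. The chart object is a hypothetical in-memory stand-in for the parsed example document, resolveChildSequence is an illustrative name, and the sketch handles only element() pointers made up entirely of a child sequence (not those that begin with a name).

```javascript
// Hypothetical stand-in for the element tree of the example bar chart.
var chart = {
  name: 'svg',
  children: [{
    name: 'g',
    children: ['harry', 'hermione', 'ron', 'hagrid', 'dumbledore'].map(function (id) {
      return { name: 'line', id: id, children: [] };
    })
  }]
};

// Resolve the scheme data of an element() pointer that is a pure child
// sequence, for example "/1/1/2". Returns null if it does not resolve.
function resolveChildSequence(documentElement, schemeData) {
  if (!/^(\/[1-9][0-9]*)+$/.test(schemeData)) { return null; }
  var steps = schemeData.substring(1).split('/').map(Number);
  // The first step counts the element children of the document root,
  // of which there is exactly one: the document element.
  if (steps[0] !== 1) { return null; }
  var current = documentElement;
  for (var i = 1; i < steps.length; i++) {
    // Each later step selects the nth element child (counting from one).
    current = current.children[steps[i] - 1];
    if (!current) { return null; }
  }
  return current;
}

console.log(resolveChildSequence(chart, '/1/1/2').id); // hermione
```

Because each step is purely positional, inserting or removing an element earlier in the document changes which element the same pointer identifies.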

The scheme used within a scheme-based XPointer determines what it identifies; for example, the element() XPointer scheme used above identifies element nodes. The XPointer Framework specification also states:

A shorthand pointer, formerly known as a barename, consists of an NCName alone. It identifies at most one element in the resource's information set; specifically, the first one (if any) in document order that has a matching NCName as an identifier.

This defines the semantics of a simple fragment identifier, such that a URI such as:

http://example.org/potter#hermione

identifies an element within the XML information set, in this case the second line element node.
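To make the distinction between shorthand and scheme-based pointers concrete, the sketch below classifies a fragment identifier against the grammar quoted earlier. It is an approximation under stated assumptions: NCName and QName are reduced to ASCII-only patterns, the input is assumed to be already percent-decoded, and the names NCNAME, parsePointerParts and classifyFragment are hypothetical.

```javascript
// ASCII-only approximation of NCName; the real production allows many
// more characters.
var NCNAME = /^[A-Za-z_][A-Za-z0-9._-]*$/;

// Split a scheme-based pointer into its parts, honouring nested
// parentheses and circumflex escapes. Returns null if the input does
// not match the SchemeBased production.
function parsePointerParts(fragment) {
  var parts = [];
  var i = 0;
  while (i < fragment.length) {
    var nameMatch = /^[A-Za-z_][A-Za-z0-9._-]*(?::[A-Za-z_][A-Za-z0-9._-]*)?\(/
      .exec(fragment.substring(i));
    if (!nameMatch) { return null; }
    var scheme = nameMatch[0].slice(0, -1);
    i += nameMatch[0].length;
    var start = i;
    var depth = 1;
    while (i < fragment.length && depth > 0) {
      var ch = fragment.charAt(i);
      if (ch === '^') {
        // A circumflex must escape "(", ")" or "^".
        if (!/[()^]/.test(fragment.charAt(i + 1))) { return null; }
        i += 2;
        continue;
      }
      if (ch === '(') { depth++; }
      if (ch === ')') { depth--; }
      i++;
    }
    if (depth !== 0) { return null; }
    parts.push({ scheme: scheme, data: fragment.substring(start, i - 1) });
    // Skip optional white space between pointer parts.
    while (/\s/.test(fragment.charAt(i))) { i++; }
  }
  return parts.length > 0 ? parts : null;
}

function classifyFragment(fragment) {
  if (NCNAME.test(fragment)) { return 'shorthand'; }
  if (parsePointerParts(fragment)) { return 'scheme-based'; }
  return 'not an XPointer';
}

console.log(classifyFragment('hermione'));          // shorthand
console.log(classifyFragment('element(/1/1/2)'));   // scheme-based
console.log(classifyFragment('xywh=25,0,100,225')); // not an XPointer
```

Under this sketch the Media Fragment URI forms from the previous section, such as xywh=25,0,100,225 and id=hermione, fall outside the XPointer syntax because of the equals sign.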

B.3 SVG Fragment Identifiers

SVG itself describes how fragment identifiers can be used to identify views on SVG content. It says:

An SVG fragment identifier can come in two forms:

Shorthand bare name form of addressing (e.g., MyDrawing.svg#MyView). This form of addressing, which allows addressing an SVG element by its ID, is compatible with the fragment addressing mechanism for older versions of HTML.

SVG view specification (e.g., MyDrawing.svg#svgView(viewBox(0,200,1000,1000))). This form of addressing specifies the desired view of the document (e.g., the region of the document to view, the initial zoom level) completely within the SVG fragment specification. The contents of the SVG view specification are the five parameter specifications, viewBox(...), preserveAspectRatio(...), transform(...), zoomAndPan(...) and viewTarget(...), whose parameters have the same meaning as the corresponding attributes on a ‘view’ element, or, in the case of transform(...), the same meaning as the corresponding attribute has on a ‘g’ element).

SVG's fragment identifiers are conformant with XPointer: they follow the same syntax and are defined in the terms given in XPointer. For example, the URI:

http://example.org/potter#hermione

in this case will address the element with the ID hermione in the following XML:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225">
  <g stroke="grey" stroke-width="40">
    <line id="harry" x1="50" x2="50" y1="300" y2="50" />
    <line id="hermione" x1="100" x2="100" y1="300" y2="0" /> 
    <line id="ron" x1="150" x2="150" y1="300" y2="100" />
    <line id="hagrid" x1="200" x2="200" y1="300" y2="50" />
    <line id="dumbledore" x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>

When fragment identifiers of this kind are used, the identified element matches the CSS :target pseudo-class, which enables it to be highlighted. For example, the SVG:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225">
  <style type="text/css">
    line:target { stroke: red; }
  </style>
  <g stroke="grey" stroke-width="40">
    <line id="harry" x1="50" x2="50" y1="300" y2="50" />
    <line id="hermione" x1="100" x2="100" y1="300" y2="0" />
    <line id="ron" x1="150" x2="150" y1="300" y2="100" />
    <line id="hagrid" x1="200" x2="200" y1="300" y2="50" />
    <line id="dumbledore" x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>

means that the URI

http://example.org/potter#hermione

is displayed with the second line (identified as hermione) stroked in red.

SVG introduces an svgView() XPointer scheme that is used to describe views onto SVG images; one possible argument is viewBox(), which selects a particular area of an image in the same way as the xywh parameter defined for Media Fragment URIs described above. Thus the URI:

http://example.org/potter#svgView(viewBox(25,0,100,225))

identifies the area of the chart that covers the first two bars.
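A minimal sketch of reading such a fragment identifier follows. It assumes the simplest case only, an svgView() containing a single viewBox() with four comma-separated numbers, and the function name parseSvgViewBox is hypothetical; the other parameter specifications are not handled.

```javascript
// Extract the viewBox rectangle from a fragment identifier of the form
// svgView(viewBox(x,y,width,height)). Returns null for any other form.
function parseSvgViewBox(fragment) {
  var match = /^svgView\(viewBox\(([^()]*)\)\)$/.exec(fragment);
  if (!match) { return null; }
  var numbers = match[1].split(',');
  var isNumber = function (text) { return /^-?\d+(\.\d+)?$/.test(text); };
  if (numbers.length !== 4 || !numbers.every(isNumber)) { return null; }
  return {
    x: Number(numbers[0]),
    y: Number(numbers[1]),
    width: Number(numbers[2]),
    height: Number(numbers[3])
  };
}

var view = parseSvgViewBox('svgView(viewBox(25,0,100,225))');
console.log(view.x, view.y, view.width, view.height); // 25 0 100 225
```

The four numbers are the same as those in the xywh=25,0,100,225 example, which illustrates how two different fragment identifier structures can describe the same region.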

B.4 Active Content

SVG, like HTML, enables scripts to be embedded within documents and to respond to events such as clicks on particular parts of the content. Active content can read the document location and base the behaviour of the script on the fragment identifier.

For example, the following SVG parses the fragment identifier that's used to access the bar chart and uses it to highlight one of the bars:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225" onload="highlight()">
  <script type="application/ecmascript">
    function highlight () {
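      var id = document.location.hash.substring(1);
      // Accept only a bar number from 1 to 5.
      if (/^[1-5]$/.test(id)) {
        // The collection is zero-based, so bar 1 is at index 0.
        var element = document.getElementsByTagName('line')[Number(id) - 1];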
        if (element) {
          element.setAttribute('stroke', 'red');
        }
      }
    }
  </script>
  <g stroke="grey" stroke-width="40">
    <line id="harry" x1="50" x2="50" y1="300" y2="50" />
    <line id="hermione" x1="100" x2="100" y1="300" y2="0" />
    <line id="ron" x1="150" x2="150" y1="300" y2="100" />
    <line id="hagrid" x1="200" x2="200" y1="300" y2="50" />
    <line id="dumbledore" x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>

The URI:

http://example.org/potter#2

thus highlights the second bar within the bar chart.

In this case, the recognised syntax of the fragment identifier is determined by the script, which recognises any numeric fragment identifier between one and five. The fragment identifier has no declarative semantics -- there is no specification that says what it means -- but in effect this script supports the identification of a bar of the bar chart through a fragment identifier.

B.5 Semantic Content

SVG allows extensions; any element in a different namespace will be ignored by SVG processors. This facility can be used to embed semantic content through RDF/XML.

The following example contains some RDF/XML which makes some basic assertions about the resource

http://example.org/potter#hermione

This resource has been identified with a fragment identifier within an SVG document like this:

<svg xmlns="http://www.w3.org/2000/svg" width="150px" height="120px" viewBox="0 0 300 225">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
    <foaf:Person rdf:about="#hermione">
      <foaf:name>Hermione Granger</foaf:name>
    </foaf:Person>
  </rdf:RDF>
  <g stroke="grey" stroke-width="40">
    <line x1="50" x2="50" y1="300" y2="50" />
    <line x1="100" x2="100" y1="300" y2="0" />
    <line x1="150" x2="150" y1="300" y2="100" />
    <line x1="200" x2="200" y1="300" y2="50" />
    <line x1="250" x2="250" y1="300" y2="150" />
  </g>
</svg>

In semantic content, fragment identifiers can mean anything. In this particular example, we can tell from the RDF that the above URI means the person named Hermione Granger. It is common practice when using RDF for URIs that include fragment identifiers to be used to refer to things that are described by the document retrieved at the base URI, as this makes it easy to serve RDF content.
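The relationship between the rdf:about value, the URI of the thing described and the URI of the document that is retrieved can be sketched with the standard URL API, which is available in browsers and in Node.js. The helper name describeSubject is hypothetical, and the base URI is assumed to be the one from which the SVG document was retrieved.

```javascript
// Resolve an rdf:about value against the URI of the containing document,
// and work out which URI would be retrieved to find the description.
function describeSubject(about, baseUri) {
  var subject = new URL(about, baseUri);
  var documentUri = new URL(subject.href);
  documentUri.hash = ''; // the fragment is never sent to the server
  return { subject: subject.href, document: documentUri.href };
}

var result = describeSubject('#hermione', 'http://example.org/potter');
console.log(result.subject);  // http://example.org/potter#hermione
console.log(result.document); // http://example.org/potter
```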