Re: FragIds in semantic web (ACTION-543)

Hi,

To fulfil my ACTION-543, I'm going to take a second stab at pulling together some text for Larry's document on MIME and the Web [1].

Note that the aim here is not to recommend solutions but to describe the issues. It is intended to replace the current text of Section 4.6 (Fragment identifiers) [2].

---
The Web introduced the ability to address part of an entity, rather 
than its whole content, by adding a 'fragment identifier' to the 
URL that addressed the data. This made sense for the original Web, 
which had just HTML, but how would it apply to other content types?

The URL spec stated that "the definition of the fragment identifier 
meaning depends on the Internet Media Type", but unfortunately, few 
of the Internet Media Type definitions include this information, 
and practices diverge greatly.

Interpreting fragment identifiers based on media type leads to a 
first set of problems when it is combined with content negotiation, 
in which multiple representations with different media types might be 
provided at a single URI. For example:

  * A particular fragment may be supported by one media type but not
    another, such as the element() XPointer syntax for XHTML being
    unsupported for HTML.

  * Two media types may define completely different fragment syntaxes
    for addressing the same kind of part, such as an area of an image 
    in SVG [3] and other image types [4].

  * Two media types may have a different semantic interpretation of
    the same fragment, such as a portion of a document in HTML 
    compared to a non-information resource (NIR) in RDF/XML.
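
As a rough illustration of the last point, here is a minimal sketch of 
how one fragment can be read differently depending on which 
representation is negotiated. The handler table, the interpret() 
function and the example.org URI are hypothetical, made up for this 
sketch and not taken from any spec:

```python
from urllib.parse import urldefrag

# Hypothetical per-media-type readings of a fragment (illustration only).
INTERPRETERS = {
    "text/html": lambda frag: f"element with id or name '{frag}'",
    "application/rdf+xml": lambda frag: f"resource identified as '#{frag}'",
}

def interpret(uri, media_type):
    """Split off the fragment and read it according to the media type."""
    base, frag = urldefrag(uri)
    handler = INTERPRETERS.get(media_type)
    if handler is None:
        # No fragment semantics known for this media type.
        return base, None
    return base, handler(frag)

# Same URI, two negotiated representations, two different meanings.
print(interpret("http://example.org/doc#me", "text/html"))
print(interpret("http://example.org/doc#me", "application/rdf+xml"))
print(interpret("http://example.org/doc#me", "image/png"))
```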

These issues are addressed at least in part within the Architecture of
the World Wide Web Volume 1 [5].

Later versions of the URL spec also widened the scope of fragment
identifiers to any 'secondary resource', which includes not just
parts of entities but any other resource 'defined or described by
the representation'. This led to a second set of problems with the 
use of Hash URIs within the Semantic Web [6]. In these cases, fragment 
identifiers are used as a mechanism to easily get from a URI for a NIR 
to a document about that NIR. Standard practice in these cases is to 
use a bare name fragment identifier (usually used to address a portion 
of a document) without there being a document that contains a portion 
named in that way.
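
To make the mechanism concrete, here is a small sketch of how a client 
gets from a hash URI to the document it would actually request. The 
document_uri() helper and the example.org URI are assumptions for 
illustration only:

```python
from urllib.parse import urldefrag

def document_uri(hash_uri):
    """Return the URI left after the fragment is removed.

    The fragment is dropped before the request is made, so the server
    only ever sees this document URI.
    """
    return urldefrag(hash_uri).url

# The NIR is named with a hash URI; the document about it is what
# remains once the fragment is stripped.
print(document_uri("http://example.org/about#alice"))
```

Nothing in this step checks that the retrieved document contains a 
portion named "alice", which is the gap described above.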

The third set of problems comes particularly with HTML, where web 
applications use fragment identifiers to encode application state
rather than addressing part of the document, a technique that isn't
covered within the media type definition for text/html.

In summary, interpreting fragment identifiers based on the media type
of the representation returned by a URI does not make sense in all
cases, and the definitions for their interpretation within media type
definitions are regularly ignored in practice.
---

Does this summarise the issues correctly and comprehensively enough?

Thanks,

Jeni

[1]: http://tools.ietf.org/id/draft-masinter-mime-web-info-02.html
[2]: http://tools.ietf.org/id/draft-masinter-mime-web-info-02.html#anchor16
[3]: http://www.imc.org/ietf-xml-mime/mail-archive/msg01153.html
[4]: http://www.w3.org/TR/media-frags/
[5]: http://www.w3.org/TR/webarch/#frag-coneg
[6]: http://www.w3.org/TR/cooluris/#hashuri
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Tuesday, 3 May 2011 20:28:38 UTC