Fragment URIs versus Specific Resources

Two options for annotating parts of resources, listed below, were considered for the Open Annotation Data Model. We decided on the second option, which is more expressive but more verbose, for a variety of reasons that this page explains in more detail than the specification does.

Option (1) Fragment URI as target


@prefix oa: <http://www.w3.org/ns/oa#> .

_:anno a oa:Annotation ;
  oa:hasTarget <http://www.example.com/example.ogv#t=10,20> ;
  oa:hasBody <http://www.example.org/comment1> .

Option (2) Identity as target, description as a Selector


@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

_:anno a oa:Annotation ;
  oa:hasTarget <SpecificTarget1> ;
  oa:hasBody <http://www.example.org/comment1> .

<SpecificTarget1> a oa:SpecificResource ;
  oa:hasSelector <FragmentSelector1> ;
  oa:hasSource <http://www.example.com/example.ogv> .

<FragmentSelector1> a oa:FragmentSelector ;
  rdf:value "t=10,20" .

To unpack the reasoning in the specification:

  • You can’t search for http://www.example.com/example.ogv directly in
    the first model. Remember that URIs are opaque, non-decomposable
    strings, regardless of whether the string includes human-readable
    semantics. You can’t discover annotations of form (1) with ordinary
    SPARQL triple patterns, whereas you can discover annotations of
    form (2) by querying for the object of oa:hasSource. Form (1)
    annotations can be found using regular expressions in SPARQL, but
    that means matching on the text of the URI rather than on the graph.

  • While Style specifiers as a resource are going away, form (1) is not
    compatible with States. If you need to refer to example.ogv at a
    particular point in time, then you need a State, and a State cannot
    be attached to the Fragment URI, as that would break the global
    scope of statements in RDF. In other words, you would be saying that
    every use of that time range within the video is based on the video
    resource as it was at a particular point in time (say 2011-05-20),
    which would prevent other annotations from using a different point
    in time for that same time segment within the video.

  • URIs provide identity. A URI with a fragment provides both identity
    for the segment and a description of how to resolve that segment
    given a (particular) representation of the resource. This has several
    issues:

    • There may be many ways to describe the same segment, used by
      different communities.

    • IETF fragment specifications are tied to a specific MIME type.
      Plain text fragments, for example, work only for text/plain
      resources and no others. Thus the identity of the segment is tied
      to a specific representation in a specific format.

    • As Jeni Tennison points out, the same fragment can identify
      different things within the same resource. She gives the example of
      an SVG document with embedded RDF in Appendix B.

    So we consider it safer to create a new node in the graph to provide
    identity, and a separate node to provide the description. This
    safety, expressiveness and consistency come at the expense of some
    extra bytes, but that’s RDF for you.

  • Fragment URIs are not expressive enough to cover the use cases that
    drive the Open Annotation specification. It is very important to be
    able to identify and describe non-rectangular sections of images,
    including simple circles as well as arbitrary paths. The worst case
    is annotating a diagonal road in a map from top left to bottom right
    of the image, where a rectangular box would encompass the whole
    image’s content. Thus we need some other way to implement this, which
    results in form (2).

  • Fragment URIs do not cover all media types, nor could they possibly
    hope to. If you wanted to annotate a selection of text within an MS
    Word document, you would need Microsoft to register a fragment
    description for .doc and .docx. Given that this is a useful sort of
    thing to do, and it can’t be done with fragment URIs, we need a
    selector concept as in form (2).

  • Media Fragments are not extensible by third parties in the current
    specification. As soon as anyone needed something slightly richer or
    more expressive, such as a circular area rather than a rectangular
    one, we would be back in the situation where we needed a selector
    again.
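
The discovery problem in the first bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples are plain tuples rather than a real RDF store, the blank node labels and the helper names (`subjects`, `find_exact`, `find_by_regex`) are invented for this example, and the exact-match lookup stands in for an ordinary SPARQL triple pattern.

```python
import re

OA = "http://www.w3.org/ns/oa#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VIDEO = "http://www.example.com/example.ogv"
BODY = "http://www.example.org/comment1"

# Form (1): the fragment URI itself is the target.
FORM_1 = [
    ("_:anno1", OA + "hasTarget", VIDEO + "#t=10,20"),
    ("_:anno1", OA + "hasBody", BODY),
]

# Form (2): a Specific Resource with a separate source and selector.
FORM_2 = [
    ("_:anno2", OA + "hasTarget", "_:target1"),
    ("_:anno2", OA + "hasBody", BODY),
    ("_:target1", OA + "hasSource", VIDEO),
    ("_:target1", OA + "hasSelector", "_:selector1"),
    ("_:selector1", RDF + "value", "t=10,20"),
]


def subjects(triples, predicate, obj):
    """Exact-match lookup, comparable to the pattern `?s <predicate> <obj>`."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]


def find_exact(triples, resource):
    """Find annotations of `resource` using exact URI matches only."""
    found = set(subjects(triples, OA + "hasTarget", resource))
    for specific in subjects(triples, OA + "hasSource", resource):
        found.update(subjects(triples, OA + "hasTarget", specific))
    return sorted(found)


def find_by_regex(triples, resource):
    """Fallback for form (1): match on the text of the target URI."""
    pattern = re.compile(re.escape(resource) + r"(#.*)?")
    return sorted(
        s for (s, p, o) in triples
        if p == OA + "hasTarget" and pattern.fullmatch(o)
    )


print(find_exact(FORM_1, VIDEO))     # []
print(find_exact(FORM_2, VIDEO))     # ['_:anno2']
print(find_by_regex(FORM_1, VIDEO))  # ['_:anno1']
```

The exact lookup misses the form (1) annotation because the fragment URI is a different string from the video URI. The regular expression finds it only by treating the URI as text, which is the kind of string handling that form (2) avoids.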

Thus, there is a need for a Selector that describes the segment of a
resource separately from its identity. Given that this is required,
we felt it most consistent to always use a Selector, but to import the
fragment description semantics into it. This solves all of the issues
above at the expense of being somewhat more verbose.

  • You can always query oa:hasSource to find the URI of the target
    resource, without any segment information.

  • You, or a third party, can always attach a State to give the time
    for the representation. The failure to consider the dynamic nature of
    web resources has been the downfall of many annotation systems in the
    past, and we fully intend to learn from their mistakes.

  • Multiple descriptions are possible, and can work across MIME types.
    There is no confusion about what the Specific Resource identifies, as
    it does not also try to describe it.

  • We can be as expressive as we like using a Selector, and remain
    consistent with a single model.

  • We can have selectors for new and old media types, without the
    blessing of the IANA/IETF registries.

  • Selectors are infinitely extensible.

  • There is a single model, not two possibilities that everyone would
    need to implement in full or risk splitting their user base.

  • We import the semantics of the fragment definitions, so are not
    re-inventing those. We simply split the fragment away from the URI of
    the full resource to gain the benefits above.

  • And finally, we tried both ways, and the consensus of the group was
    that the single selector model was the better approach.
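
Splitting the fragment away from the URI of the full resource can be sketched with Python's standard library. This is a minimal illustration, not part of the specification: the function name `to_specific_resource` and the dictionary layout are invented here, and the `oa:hasState`, `oa:TimeState` and `oa:when` names are an assumption about the Open Annotation vocabulary that this page does not itself spell out.

```python
from urllib.parse import urldefrag


def to_specific_resource(fragment_uri, when=None):
    """Split a fragment URI into a form (2) style Specific Resource."""
    source, fragment = urldefrag(fragment_uri)
    if not fragment:
        raise ValueError("URI has no fragment to move into a selector")
    resource = {
        "a": "oa:SpecificResource",
        "oa:hasSource": source,
        "oa:hasSelector": {"a": "oa:FragmentSelector", "rdf:value": fragment},
    }
    if when is not None:
        # Assumed vocabulary: a time State attached to this node only.
        resource["oa:hasState"] = {"a": "oa:TimeState", "oa:when": when}
    return resource


uri = "http://www.example.com/example.ogv#t=10,20"

# Two annotations of the same segment, each with its own point in time.
first = to_specific_resource(uri, when="2011-05-20T00:00:00Z")
second = to_specific_resource(uri, when="2012-10-01T00:00:00Z")

print(first["oa:hasSource"])                 # http://www.example.com/example.ogv
print(first["oa:hasSelector"]["rdf:value"])  # t=10,20
```

Because each annotation gets its own Specific Resource node, the two different times do not conflict, while the source and the fragment description stay identical in both.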

After all of that, if you’re still not convinced, remember that using
this approach is only a recommendation. If you feel that some
additional bytes in an already extremely verbose format are too high a
cost for interoperability, expressiveness, consistency and the
understandability of your annotations, then it is not forbidden to
annotate a fragment URI directly. We hope, of course, that the
arguments above have convinced you otherwise :)

(This page is taken from: http://lists.w3.org/Archives/Public/public-openannotation/2012Oct/0008.html)

4 Responses to Fragment URIs versus Specific Resources

  1. This article says things which are clearly false! One of the mistakes is the claim that Media Fragments URIs are not extensible, which is plain wrong! The list of selectors is purposely left open to enable a future Media Fragments URI 2.0 specification with more complex selection that goes beyond a bounding box, for example.

  2. Robert Sanderson says:

    Corrected the wording to state the intended meaning: That the Media Fragments are not extensible by third parties which is what Open Annotation would need. Obviously anything is “extensible” by creating a new version that adds new capabilities.

    • Sorry, third parties can also extend it. A media fragment URI parser is conformant if it can also parse key=value pairs even when it does not recognize some keys (in which case it must ignore them). This means that anyone can extend it by specifying which additional keys they want to use and adding the corresponding processing instructions for them. I plan to describe this more thoroughly on the list later.

      • Robert Sanderson says:

        The spec says that all dimensions are initially undefined and that any name that is not one of t, xywh, track or id does not represent a media fragment thus must be ignored and should generate a warning.

        http://www.w3.org/TR/media-frags/#processing-name-value-lists

        Therefore #timestate=2008-12-13T10:30:00&xywh=1,1,5,5 would be processed without the timestate.

        And actually the specification is inconsistent. 5.1.1 says “must ignore” and 6.2.1 says “SHOULD ignore”. Neither section is listed as non-normative.

        What am I missing? :)
