Fragment URIs versus Specific Resources
Two options, listed below, for annotating parts of resources were considered for the Open Annotation Data Model. We decided on the more expressive but more verbose second option for a variety of reasons, which this page explains in more detail than the specification does.
Option (1) Fragment URI as target
_:anno a oa:Annotation ;
    oa:hasTarget <http://www.example.com/example.ogv#t=10,20> ;
    oa:hasBody <http://www.example.org/comment1> .
Option (2) Identity as target, description as a Selector
_:anno a oa:Annotation ;
    oa:hasTarget <SpecificTarget1> ;
    oa:hasBody <http://www.example.org/comment1> .

<SpecificTarget1> a oa:SpecificResource ;
    oa:hasSelector <FragmentSelector1> ;
    oa:hasSource <http://www.example.com/example.ogv> .

<FragmentSelector1> a oa:FragmentSelector ;
    rdf:value "t=10,20" .
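The relationship between the two forms can be sketched in a few lines of Python. This is only an illustration, assuming that splitting at the `#` with the standard library's `urllib.parse.urldefrag` is good enough for the example URI. The function name `to_specific_resource` and the dictionary keys are hypothetical; they merely mirror oa:hasSource and the rdf:value of the selector.

```python
from urllib.parse import urldefrag


def to_specific_resource(fragment_uri: str) -> dict:
    """Split a form (1) target URI into the pieces used by form (2).

    Hypothetical helper: the keys mirror oa:hasSource and the rdf:value of
    an oa:FragmentSelector, but this is not an Open Annotation API.
    """
    source, fragment = urldefrag(fragment_uri)
    return {"hasSource": source, "selectorValue": fragment}


target = to_specific_resource("http://www.example.com/example.ogv#t=10,20")
print(target["hasSource"])      # http://www.example.com/example.ogv
print(target["selectorValue"])  # t=10,20
```

Form (2) simply stores these two strings on separate nodes instead of concatenating them into one URI.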
To decompress the reasoning in the specification:
- You can’t search for http://www.example.com/example.ogv directly in
the first model. Remember that URIs are opaque, non-decomposable strings.
Regardless of whether the string includes human-readable semantics,
you can’t discover annotations of form (1) with regular SPARQL graph
patterns, whereas you can discover them in form (2) by querying the
object of oa:hasSource. It is possible to discover form (1)
annotations only by using regular expressions in SPARQL.
- While Style specifiers as a resource are going away, form (1) is not compatible
with States. If you need to refer to example.ogv at a particular
point in time then you need a State which cannot be attached to the
Fragment URI, as that would break the global scope of statements in
RDF. In other words, you would be saying that for all uses of that
time range within the video, it was always based on the video resource
as it was at a particular point in time (say at 2011-05-20), which
would prevent other annotations from having a different point in time
for that same time segment within the video.
- URIs provide identity. A URI with a fragment provides both identity
for the segment and a description of how to resolve that segment
given a (particular) representation of the resource. This has several
issues:
  - There may be many ways to describe the same segment, used by
  different communities.
  - IETF Fragment specifications are tied to a specific MIME type.
  Plain text fragments, for example, work only for text/plain
  resources and no others. Thus the identity of the segment is tied
  to a specific representation in a specific format.
  - As Jeni Tennison points out, the same fragment can identify
  different things within the same resource. She gives the example of
  an SVG document with embedded RDF in Appendix B.
So we consider it safer to create a new node in the graph to provide
identity, and a separate node to provide the description. This
safety, expressiveness and consistency come at the expense of some
extra bytes, but that’s RDF for you.
- Fragment URIs are not expressive enough to cover the use cases that
drive the Open Annotation specification. It is very important to be
able to identify and describe non-rectangular sections of images,
including simple circles as well as arbitrary paths. The worst case
is annotating a diagonal road in a map from top left to bottom right
of the image, where a rectangular box would encompass the whole
image’s content. Thus we need some other way to implement this, which
results in form (2).
- Fragment URIs do not cover all media types, nor could they possibly
hope to. If you wanted to annotate a selection of text within an MS
Word document, you would need Microsoft to register a fragment
description for .doc and .docx. Given that this is a useful sort of
thing to do, and it can’t be done with fragment URIs, we need a
selector concept as in form (2).
- Media Fragments are not extensible by third parties in the current specification.
As soon as anyone needed something slightly richer or more
expressive … like a circular area rather than rectangular … then
we would be back in the situation where we needed a selector again.
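The discovery problem from the first point in the list can be made concrete with a toy example. The sketch below assumes triples are held as plain Python tuples rather than in a real RDF store, and the subject names (`anno1`, `anno2`) and helper functions are hypothetical; the string test stands in for a SPARQL regular expression filter.

```python
# Toy triples as (subject, predicate, object) tuples; all names are illustrative.
FORM_1 = [
    ("anno1", "oa:hasTarget", "http://www.example.com/example.ogv#t=10,20"),
]
FORM_2 = [
    ("anno2", "oa:hasTarget", "SpecificTarget1"),
    ("SpecificTarget1", "oa:hasSource", "http://www.example.com/example.ogv"),
    ("SpecificTarget1", "oa:hasSelector", "FragmentSelector1"),
    ("FragmentSelector1", "rdf:value", "t=10,20"),
]


def exact_match(triples, predicate, obj):
    """Return subjects whose (predicate, object) pair matches exactly."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def annotations_of_source(triples, source):
    """Follow oa:hasSource back to the annotations that target it (form 2)."""
    specific = set(exact_match(triples, "oa:hasSource", source))
    return [s for s, p, o in triples if p == "oa:hasTarget" and o in specific]


video = "http://www.example.com/example.ogv"

# Form (1): an exact match on the video URI finds nothing, because the
# target is a different string that merely starts with it.
print(exact_match(FORM_1, "oa:hasTarget", video))  # []

# Form (1) needs string inspection, the analogue of a SPARQL regex filter.
print([s for s, p, o in FORM_1
       if p == "oa:hasTarget" and o.startswith(video + "#")])  # ['anno1']

# Form (2): a plain exact match on oa:hasSource is enough.
print(annotations_of_source(FORM_2, video))  # ['anno2']
```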
Thus, there is a need for a Selector that describes the segment of a
resource separately from its identity. Given that this is required,
we felt it most consistent to always use a Selector, but to import the
fragment description semantics into it. This solves all of the issues
above at the expense of being somewhat more verbose.
- You can always query oa:hasSource to find the URI of the target
resource, without any segment information.
- You, or a third party, can always attach a State to give the time
for the representation. The failure to consider the dynamic nature of
web resources has been the downfall of many annotation systems in the
past, and we fully intend to learn from their mistakes.
- Multiple descriptions are possible, and can work across MIME types.
There is no confusion about what the Specific Resource identifies as
it does not also try to describe it.
- We can be as expressive as we like using a Selector, and remain
consistent with a single model.
- We can have selectors for new and old media types, without the
blessing of the IANA/IETF registries.
- Selectors are infinitely extensible.
- There is a single model, not two possibilities, both of which
everyone would need to implement or else risk splitting their user base.
- We import the semantics of the fragment definitions, so we are not
re-inventing those. We simply split the fragment away from the URI of
the full resource to gain the benefits above.
- And finally, we tried both ways and the consensus of the group was
that the single selector model was the better approach.
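To illustrate the point about States, here is a minimal sketch, assuming a Specific Resource is modelled as a small Python dataclass. The class and its field names are hypothetical stand-ins rather than part of the data model, and the second date is an arbitrary value chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpecificResource:
    """Hypothetical stand-in for an oa:SpecificResource node."""
    source: str                   # plays the role of oa:hasSource
    selector: str                 # fragment text held by the Selector
    state: Optional[str] = None   # time of the representation, if any


video = "http://www.example.com/example.ogv"

# The same ten-second segment, pinned to two different points in time.
# Each annotation gets its own node, so neither statement constrains the other.
as_of_2011 = SpecificResource(video, "t=10,20", state="2011-05-20")
as_of_2012 = SpecificResource(video, "t=10,20", state="2012-10-01")  # arbitrary date

assert as_of_2011.source == as_of_2012.source
assert as_of_2011.selector == as_of_2012.selector
assert as_of_2011 != as_of_2012

# With form (1) both annotations would share the single URI printed below,
# leaving nowhere to record a per-annotation state.
print(f"{as_of_2011.source}#{as_of_2011.selector}")
```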
After all of that, if you’re still not convinced, remember that using this approach is only a recommendation. If you feel that
some additional bytes in an already extremely verbose format are too
high a cost for interoperability, expressiveness, consistency and the
understandability of your annotations, then it is not forbidden to
annotate a fragment URI directly. We hope, of course, that you’re
convinced otherwise by the arguments above 🙂
(This page is taken from: http://lists.w3.org/Archives/Public/public-openannotation/2012Oct/0008.html)
This article says things which are clearly false! One of the mistakes is the claim that Media Fragments URIs are not extensible, which is plain wrong! The list of selectors is purposely left open to enable a future Media Fragments URI 2.0 specification that allows more complex selection, going beyond a bounding box for example.
Corrected the wording to state the intended meaning: that Media Fragments are not extensible by third parties, which is what Open Annotation would need. Obviously anything is “extensible” by creating a new version that adds new capabilities.
Sorry, third parties can also extend it. A media fragment URI parser is conformant if it can also parse key=value pairs even when it does not recognize some keys (in which case it must ignore them). This means that anyone can extend it by specifying which additional keys they want to use and adding the corresponding processing instructions for them. I plan to describe this more thoroughly on the list later.
The spec says that all dimensions are initially undefined and that any name that is not one of t, xywh, track or id does not represent a media fragment, and thus must be ignored and should generate a warning.
http://www.w3.org/TR/media-frags/#processing-name-value-lists
Therefore #timestate=2008-12-13T10:30:00&xywh=1,1,5,5 would be processed without the timestate.
And actually the specification is inconsistent. 5.1.1 says “must ignore” and 6.2.1 says “SHOULD ignore”. Neither section is listed as non-normative.
What am I missing? 🙂