Multi-Body Annotations in Named Graphs

From Web Annotation Wiki

Intro for the non-RDF people: loosely speaking, named graphs mean that a set of triples (i.e., subject-predicate-object tuples) is collected in a separate entity called a graph that gets a name (in the form of a URI). Since RDF 1.1, named graphs are formally defined in RDF, although the formal term used is Datasets. In practice, RDF datasets are very often used to add a "context" to a triple. There are standard syntaxes to express datasets: JSON-LD has this out of the box; TriG is an extension to Turtle; N-Quads is an extension to N-Triples. SPARQL has had this concept from the start (actually, the term 'Dataset' was introduced in SPARQL).
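
To make the "triple plus a name" idea concrete, here is a minimal Python sketch that models a quad as a triple with an optional fourth entry, the graph name, and writes it out as an N-Quads line. The `Quad` type, the `to_nquads_line` helper and the annotation URI `http://example.org/a1` are assumptions made for illustration only; the sketch handles IRIs only (no literals or blank nodes).

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """A triple plus an optional graph name (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str
    graph: Optional[str] = None  # None means the default (unnamed) graph


def to_nquads_line(q: Quad) -> str:
    """Serialize one quad of IRIs as an N-Quads line (IRIs only)."""
    terms = [q.subject, q.predicate, q.obj]
    if q.graph is not None:
        terms.append(q.graph)  # the fourth term names the graph
    return " ".join(f"<{t}>" for t in terms) + " ."


OA = "http://www.w3.org/ns/oa#"

# http://example.org/a1 is a made-up annotation URI for this sketch.
quad = Quad(
    "http://example.org/a1",
    OA + "hasTarget",
    "http://tinyurl.com/Debate2AirThurs",
    graph="http://example.org/Anno3",
)
print(to_nquads_line(quad))
```

Dropping the `graph` argument yields an ordinary three-term N-Triples line, which is the sense in which N-Quads extends N-Triples.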

From a strictly theoretical point of view, named graphs have a problem: the RDF community could never fully agree on what Datasets mean in terms of semantics (I spare you the details). More exactly, there are several possible ways to interpret Datasets, and there is no consensus on which is the "best". There is a W3C Note on possible semantics; if we decided to use the dataset formalism, we could say that what we require from annotation environments is to work with the semantics whereby each named graph defines its own context.

Specific example

What I did below is to explore one of the examples on Tim's Wiki page, namely the Classification Example. I took a radical approach here, namely that every annotation becomes a separate named graph. We could do something less granular and allow implementations to segment the annotations into datasets as they wish, which means that they would then be 'responsible' for avoiding the semantic 'overload' that started this discussion.

The example below is the simplest on Tim's page, insofar as the targets are simply URI references and nothing else. That is where the issue of Semantic & Open World Assumption is the clearest (in all other examples the bodies are in a blank node anyway even without roles).

Here is the original example, without any role assignment:

JSON:
{ "id": "http://example.org/Anno3",
  "type": "Annotation",
  "motivation": "classifying",
  "body" : { "id" : "http://id.loc.gov/authorities/subjects/sh2009118189" },
  "target" : [ {"id" : "http://tinyurl.com/Debate2AirThurs", 
                "type" : "Text" },
               {"id" : "http://tinyurl.com/NoWay2PickDebate", 
                "type" : "Text" }
             ]  
}
Turtle:
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dctypes: <http://purl.org/dc/dcmitype/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<http://example.org/Anno3> 
    a oa:Annotation ;
    oa:motivatedBy oa:classifying ;
    oa:hasBody <http://id.loc.gov/authorities/subjects/sh2009118189>;
    oa:hasTarget 
       <http://tinyurl.com/Debate2AirThurs> ,
       <http://tinyurl.com/NoWay2PickDebate> .
       
<http://tinyurl.com/Debate2AirThurs> a dctypes:Text .
<http://tinyurl.com/NoWay2PickDebate> a dctypes:Text .

Here is the same example using Datasets, with roles now assigned to the targets. Note that the named graph itself has a URI while, in this example, the annotation does not have one: it is a blank node. It would be possible to assign a separate URI to the annotation; this makes sense if the approach chosen is to bundle several annotations into one named graph.

JSON:
{ "id": "http://example.org/Anno3",
  "@graph" : 
     {
       "type": "Annotation",
       "motivation": "classifying",
       "body" : { "id" : "http://id.loc.gov/authorities/subjects/sh2009118189" },
       "target" : [ {"id" : "http://tinyurl.com/Debate2AirThurs", 
                     "role" : "describing",
                     "type" : "Text" },
                    {"id" : "http://tinyurl.com/NoWay2PickDebate",
                     "role" : "questioning", 
                     "type" : "Text" }
                  ] 
    } 
}
Turtle (TriG):
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dctypes: <http://purl.org/dc/dcmitype/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<http://example.org/Anno3> {
    [
        a oa:Annotation ;
        oa:motivatedBy oa:classifying ;
        oa:hasBody <http://id.loc.gov/authorities/subjects/sh2009118189> ;
        oa:hasTarget 
           <http://tinyurl.com/Debate2AirThurs> ,
           <http://tinyurl.com/NoWay2PickDebate>
    ] .
    <http://tinyurl.com/Debate2AirThurs> 
      a dctypes:Text ; 
      oa:hasRole oa:describing .
      
    <http://tinyurl.com/NoWay2PickDebate> 
      a dctypes:Text ;
      oa:hasRole oa:questioning .
}
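
For readers who treat the JSON form as plain JSON, the following Python sketch shows how the graph name and the per-target roles could be read from the dataset version above. It assumes the compacted shape shown in the example (a single object under "@graph", targets as a list of objects) and does no real JSON-LD processing; the variable names are illustrative.

```python
import json

# The dataset version of the example, copied from above.
doc = json.loads("""
{ "id": "http://example.org/Anno3",
  "@graph": {
    "type": "Annotation",
    "motivation": "classifying",
    "body": { "id": "http://id.loc.gov/authorities/subjects/sh2009118189" },
    "target": [
      { "id": "http://tinyurl.com/Debate2AirThurs",
        "role": "describing", "type": "Text" },
      { "id": "http://tinyurl.com/NoWay2PickDebate",
        "role": "questioning", "type": "Text" }
    ]
  }
}
""")

graph_name = doc["id"]      # the top-level "id" names the graph
annotation = doc["@graph"]  # the annotation itself carries no "id" (blank node)

# Map each target URI to the role it has within this named graph.
roles = {t["id"]: t["role"] for t in annotation["target"]}

print(graph_name)
for target, role in roles.items():
    print(target, "->", role)
```

A consumer that bundles several annotations into one named graph would need "@graph" to be a list and each annotation to carry its own "id"; this sketch does not cover that case.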

There are major pros and cons to this solution.

  • Pro: the difference between the "plain" version and the version with roles is minimal, and the addition is fairly obvious for non-RDF users, too. There is no real issue with the Semantic & Open World Assumption, because all statements are confined within a specific context. This may be, at first glance, the simplest approach to handling that use case and, above all, the easiest for non-RDF people to understand: by setting the context we side-step the semantic issue.
  • Pro: almost all RDF environments in fact work internally with quads, or at least have an option for them. I.e., representing a triple with a fourth 'context' entry is, in practice, easy in those environments (see the JSON-LD playground, which has a separate tab for N-Quads).
  • Con: existing OA applications would need quite an overhaul...
  • Con: this is a fairly radical departure from the current model and would require careful design.
  • Con: due to the uncertainty about the semantics of Datasets, their usage in the Linked Data community is unclear.
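
The point about contexts in the first Pro can be made concrete with a small Python sketch. It models quads as plain tuples and assumes a hypothetical second annotation, http://example.org/Anno4, that assigns a different role to the same target; oa:hasRole and the role URIs follow the example above. The function names are made up for illustration.

```python
OA = "http://www.w3.org/ns/oa#"
HAS_ROLE = OA + "hasRole"
TARGET = "http://tinyurl.com/Debate2AirThurs"
ANNO3 = "http://example.org/Anno3"
ANNO4 = "http://example.org/Anno4"  # hypothetical second annotation graph

# Each entry is (subject, predicate, object, graph name).
quads = [
    (TARGET, HAS_ROLE, OA + "describing", ANNO3),
    (TARGET, HAS_ROLE, OA + "questioning", ANNO4),
]


def roles_ignoring_graphs(quads, target):
    """Roles of a target when all graphs are merged into one."""
    return {o for s, p, o, _g in quads if s == target and p == HAS_ROLE}


def roles_in_graph(quads, target, graph):
    """Roles of a target within a single named graph (one context)."""
    return {o for s, p, o, g in quads
            if s == target and p == HAS_ROLE and g == graph}


merged = roles_ignoring_graphs(quads, TARGET)
scoped = roles_in_graph(quads, TARGET, ANNO3)

print(sorted(merged))  # both roles, with no hint of which annotation meant which
print(sorted(scoped))  # only the role stated inside Anno3
```

Merged into a single graph, the target ends up with both roles and nothing says which annotation each belongs to; scoped by graph name, each annotation sees only its own role statement.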