Difference between revisions of "TF-Graphs-UC"

From RDF Working Group Wiki

Revision as of 16:01, 1 March 2011

Graph Use Cases

Storage Use Cases

Organizing Information

When storing RDF information in a graph store, we would like to organize related information into separate graphs. Each graph must be identified with a URI to facilitate retrieval.

Slicing datasets according to multiple dimensions

Within the BBC, we want to slice large RDF datasets according to multiple dimensions: statements about individual programmes, access control, 'ownership' of the data (what product owns/maintains what set of triples), versioning, etc. All those graphs are potentially overlapping or contained within each other. Those issues are very common in large organisations using a single, centralised, triple store.

Permissions

Another purpose in storing RDF content in different graphs is to enforce a permissions model so that sensitive information is not accessed by unauthorized users.
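
A minimal sketch in Python of how per-graph permissions might be enforced, assuming an in-memory store keyed by graph URI. The GraphStore class, the graph URIs, and the user names are hypothetical and serve only to illustrate the idea; a real graph store would have its own access-control mechanism.

```python
from collections import defaultdict


class GraphStore:
    """Toy graph store: triples grouped by graph URI, with per-graph readers."""

    def __init__(self):
        self.graphs = defaultdict(set)   # graph URI -> set of (s, p, o) triples
        self.readers = defaultdict(set)  # graph URI -> users allowed to read it

    def add(self, graph_uri, triple):
        self.graphs[graph_uri].add(triple)

    def grant(self, graph_uri, user):
        self.readers[graph_uri].add(user)

    def visible_triples(self, user):
        """Return only the triples held in graphs that the user may read."""
        result = set()
        for uri, triples in self.graphs.items():
            if user in self.readers[uri]:
                result |= triples
        return result


# Hypothetical data: one public graph and one restricted graph.
store = GraphStore()
store.add("http://example.org/g/public", (":Tom", "foaf:name", "Tom"))
store.add("http://example.org/g/hr", (":Tom", "ex:salary", "50000"))
store.grant("http://example.org/g/public", "alice")
store.grant("http://example.org/g/public", "bob")
store.grant("http://example.org/g/hr", "bob")
```

Here alice sees only the triple in the public graph, while bob sees both, because the access decision is made per graph rather than per triple.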

Graph Changes Over Time

When storing graph information retrieved from a URL external to an application, it becomes important to store snapshots of the location over time. When these graph snapshots are taken, it is useful to annotate each snapshot with information such as retrieval time, HTTP Headers used, HTTP Response returned, and other such items that may have affected the contents of the graph snapshot.

Here is a quick JSON-LD (assuming g-snap support) example showing two graph snapshots. The home page changes between the two snapshots:

G-SNAP #1:

{
   "@":
   {
      "a": "<foaf:Person>",
      "foaf:name": "Manu Sporny",
      "foaf:homepage": "<http://linkedin.com/in/manusporny>"
   },
   "dc:date": "2010-04-18T01:24Z"
}

G-SNAP #2:

{
   "@":
   {
      "a": "<foaf:Person>",
      "foaf:name": "Manu Sporny",
      "foaf:homepage": "<http://manu.sporny.org/>"
   },
   "dc:date": "2011-02-01T18:32Z"
}
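
As a rough illustration of why annotated snapshots are useful, the following Python sketch compares the two snapshots above and reports which properties changed between the retrieval dates. The function name changed_properties is an assumption made for this example, and the snapshots are modelled as plain dictionaries rather than parsed JSON-LD.

```python
def changed_properties(old, new):
    """Return {property: (old_value, new_value)} for properties that differ."""
    keys = set(old) | set(new)
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# The two snapshots from the example, each annotated with its retrieval date.
snap1 = {
    "@": {
        "a": "<foaf:Person>",
        "foaf:name": "Manu Sporny",
        "foaf:homepage": "<http://linkedin.com/in/manusporny>",
    },
    "dc:date": "2010-04-18T01:24Z",
}
snap2 = {
    "@": {
        "a": "<foaf:Person>",
        "foaf:name": "Manu Sporny",
        "foaf:homepage": "<http://manu.sporny.org/>",
    },
    "dc:date": "2011-02-01T18:32Z",
}

# Compare only the graph contents, not the annotations on the snapshots.
diff = changed_properties(snap1["@"], snap2["@"])
```

Only the graph contents under "@" are compared; the "dc:date" annotation belongs to the snapshot, not to the graph, which is the separation this use case asks for.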

A more complex example involves supporting decentralized product listings via PaySwarm. In PaySwarm, products for sale (access to a particular post in a blog, or a particular Web App) are expressed in a decentralized manner on a website. The expression of what is for sale is encapsulated in a graph of information about the asset for sale, together with the pricing and licensing information associated with the sale. The combination of this information is effectively an offer of sale:

{
   "@": "http://wordpress.payswarm.dev/?p=65#listing",
   "a": ["gr:Offering", "ps:Listing"],
   "com:payee": 
   [{
      "@": "http://wordpress.payswarm.dev/?p=65#listing-payee",
      "a": "com:Payee",
      "com:currency": "USD",
      "com:destination": "https://payswarm.com/i/johnsmith/accounts/1",
      "com:rate": "0.05",
      "com:rateType": "<com:FlatAmount>",
      "rdfs:comment": "Payment for Intro Blog Article by John Smith."
   }],
   "com:payeeRule": 
   [{
      "a": "com:PayeeRule",
      "com:destinationOwnerType": "<ps:Authority>",
      "com:maximumRate": "10",
      "com:rateType": "<com:InclusivePercentage>"
   }],
   "ps:assetHash": "905ab5980931053792fc63e40fb4afd0a2f55e02",
   "ps:forAsset": "http://wordpress.payswarm.dev/?p=65#asset",
   "ps:license": "http://payswarm.com/licenses/blogging",
   "ps:licenseHash": "0d8866836917f8ef58af44accb6efab9a10610ad",
   "ps:validFrom": "2011-02-26T00:00:00+0000^^<http://www.w3.org/2001/XMLSchema#dateTime>",
   "ps:validUntil": "2011-02-27T00:00:00+0000^^<http://www.w3.org/2001/XMLSchema#dateTime>",
   "ps:signature":
   {
      "a": "ps:JsonldSignature",
      "dc:created": "2011-02-26T00:00:00Z^^<http://www.w3.org/2001/XMLSchema#dateTime>",
      "dc:creator": "https://payswarm.com/i/johnsmith/keys/4",
      "ps:signatureValue": "hluj7gTcjGOhxAfTmr04DXZNYwErXKcNBWqwYnjZCxAPlkl7EUl6L7aS0xENmGe3n3VZebWq9mnPH/mv05tzxUYOi6/ssZG+WFNUXFWRA9u+2AdJL5b07U9s51j3tKG6CRB5wGN6w3MPvgM0TspM+VUGHwsR9ePAfpCuFql9zH4="
   }
}

Note the "ps:validFrom" and "ps:validUntil" dates: that information changes once a day. Since that information in the graph changes, the signatures on the graph change as well. Because of the daily changes, it is important to be able to track snapshots of this graph as it changes from day to day. Storing this data in a graph store is particularly challenging without the fundamental concept of a graph snapshot (Graph Literal).

Trace inferences and their results

By identifying the graphs that were consumed and produced by an inference, one can trace the inferences that enriched a triple store and undo some of the reasoning, for instance when the store is updated.

:G1 { :Tom ex:manage :ACompany }

:G2 { :Tom rdf:type ex:Manager  }

:G2 ex:deducedFrom :G1
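
The ex:deducedFrom links above can be followed to find which derived graphs become stale when a source graph is updated. The Python sketch below assumes the links are held in a simple dictionary; the function name graphs_to_retract and the extra graph :G3 are hypothetical additions for illustration.

```python
def graphs_to_retract(updated, deduced_from):
    """Return every graph that depends, directly or transitively, on `updated`.

    deduced_from maps a derived graph to the set of graphs it was deduced from.
    """
    stale = set()
    frontier = [updated]
    while frontier:
        source = frontier.pop()
        for derived, sources in deduced_from.items():
            if source in sources and derived not in stale:
                stale.add(derived)
                frontier.append(derived)
    return stale


# :G2 ex:deducedFrom :G1 (from the example); :G3 is a hypothetical second step.
links = {":G2": {":G1"}, ":G3": {":G2"}}
stale = graphs_to_retract(":G1", links)
```

When :G1 is updated, both :G2 and the hypothetical :G3 are reported, so their contents can be removed and the inferences rerun.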

Query Use Cases

While query services are not explicitly addressed in the RDF spec, SPARQL does make use of graph IRIs and we should ensure that the semantics of graph identifiers are compatible with the way in which RDF datasets are defined by SPARQL.

Find Information In a Graph

When a query service processes a query containing a graph identifier, it must resolve the graph identifier to some collection of materialized RDF content that will be returned in the result set.

Computed Graphs

Often, graphs exposed by a query service are not present in any sort of physical storage, but rather their contents are computed at query time. Examples include:

  • A federated query service may define a graph URI to be the union of graphs accessible through other query services.
  • A service that does RDB to RDF mapping via R2RML may dynamically compute RDF results based on SQL results at query time.

Graph URIs as Locations

When a query service is presented with a graph identifier that is not present in local storage, the query service may wish to treat the graph URI as a URL and make a request to that URL (possibly with content negotiation) for a document that serializes the content of that graph.
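
The resolution order described in this section (stored graphs, then computed graphs, then dereferencing the URI) might be sketched in Python as follows. This is only an illustration under stated assumptions: resolve_graph, the dictionaries, and the fake_fetch stand-in are all hypothetical, and a real service would issue an HTTP request and parse the response instead of calling a stub.

```python
def resolve_graph(graph_uri, local, computed, fetch):
    """Resolve a graph URI to a set of triples.

    local:    dict of graph URI -> set of stored triples
    computed: dict of graph URI -> list of member graph URIs (a union graph)
    fetch:    callable that dereferences a URL and returns a set of triples
    """
    if graph_uri in local:
        return local[graph_uri]
    if graph_uri in computed:
        # A computed graph is materialized at query time as a union.
        result = set()
        for member in computed[graph_uri]:
            result |= resolve_graph(member, local, computed, fetch)
        return result
    # Not known locally: treat the graph URI as a location.
    return fetch(graph_uri)


def fake_fetch(url):
    """Stand-in for an HTTP request; returns a fixed triple for any URL."""
    return {(url, "rdf:type", "ex:RemoteDocument")}


local = {"http://example.org/g/a": {(":Tom", "ex:manage", ":ACompany")}}
computed = {"http://example.org/g/all": ["http://example.org/g/a", "http://example.org/remote"]}
triples = resolve_graph("http://example.org/g/all", local, computed, fake_fetch)
```

Passing the fetch function in as a parameter keeps the sketch free of network access and makes the fallback to dereferencing explicit.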

Provenance Use Cases

Digital Signatures on Graphs

There are a number of ways to create digital signatures on RDF graphs. Often, you do not want to co-mingle the signature information and the graph. Co-mingling signature information in a graph requires the software to use an algorithm to clean the graph in order to generate the signature hash for verification purposes. It also means that it becomes very difficult to sign a graph containing a digital signature at the top-most level. In order to express a digital signature on a graph of information, the idea of a Graph Literal becomes useful. Take the following as an example of a JSON-LD graph that we would like to digitally sign:

{
   "a": "<foaf:Person>",
   "foaf:name": "Manu Sporny",
   "foaf:homepage": "<http://manu.sporny.org/>"
}

One could sign the graph above by adding a few triples to the graph:

{
    "a": "<foaf:Person>",
    "foaf:name": "Manu Sporny",
    "foaf:homepage": "<http://manu.sporny.org/>",
    "sig:signature":
    {
        "a": "<sig:JsonldSignature>",
        "sig:signer": "<http://manu.sporny.org/webid#key-5>",
        "sig:signatureValue": "OGQzNGVkMzVmMmQ3ODIyOWM32MzQzNmExMgoYzI4ZDY3NjI4NTIyZTk="
    }
}

However, nobody else could sign that graph without introducing ambiguity as to who signed the graph first. That is, the second signer couldn't sign the initial signer's signature. Therefore, having the concept of a graph snapshot which can be annotated in the same way that triples are annotated becomes very useful. The first signature could be performed like so:

{
   "@": 
   {
      "a": "<foaf:Person>",
      "foaf:name": "Manu Sporny",
      "foaf:homepage": "<http://manu.sporny.org/>"
   },
   "sig:signature":
   {
      "a": "<sig:JsonldSignature>",
      "sig:signer": "<http://manu.sporny.org/webid#key-5>",
      "sig:signatureValue": "OGQzNGVkMzVmMmQ3ODIyOWM32MzQzNmExMgoYzI4ZDY3NjI4NTIyZTk="
   }
}

The example above separates the signature from the data that is being signed, which is good design. The second signature could be performed like so:

{
   "@": 
   {
      "@": 
      {
         "a": "<foaf:Person>",
         "foaf:name": "Manu Sporny",
         "foaf:homepage": "<http://manu.sporny.org/>"
      },
      "sig:signature":
      {
         "a": "<sig:JsonldSignature>",
         "sig:signer": "<http://manu.sporny.org/webid#key-5>",
         "sig:signatureValue": "OGQzNGVkMzVmMmQ3ODIyOWM32MzQzNmExMgoYzI4ZDY3NjI4NTIyZTk="
      },
      "dc:date": "2011-02-26T22:18Z"
   },
   "sig:signature":
   {
      "a": "<sig:JsonldSignature>",
      "sig:signer": "<http://authority.payswarm.com/webid#key-873>",
      "sig:signatureValue": "kMzVmMVDIyOWM32MzI4ZDY3NjI4mQ3OOGQzNGNTIyZTkQzNmExMgoYz="
   }
}

Note that a "dc:date" has been associated with the initial signed graph. Using this technique, one could verify that:

  1. The initial graph was signed by a primary author.
  2. The initial graph with its signature was annotated and signed by a secondary author.

This is useful when dealing with web-of-trust issues such as trusting graphs which have been cached by third parties. This happens when product listings are cached by companies like Google and then proxied by third parties. You want to ensure that the initial product listing is valid per the asset owner, and that the state of the cache has been verified by Google. This prevents a nefarious proxy from meddling with the information that will be used to perform a financial transaction.
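
The nesting of signatures can be sketched in Python as follows. This is an illustration only, with several loud assumptions: HMAC-SHA256 with a shared secret stands in for a real public-key signature, sorted-key JSON serialization stands in for a proper graph normalization algorithm, and the function names and key values are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(obj):
    """Stand-in for graph normalization: JSON with sorted keys."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(graph, signer, key):
    """Wrap `graph` in an envelope whose signature covers the graph only."""
    value = hmac.new(key, canonical(graph), hashlib.sha256).hexdigest()
    return {
        "@": graph,
        "sig:signature": {"sig:signer": signer, "sig:signatureValue": value},
    }


def verify(envelope, key):
    """Recompute the signature over the wrapped graph and compare."""
    expected = hmac.new(key, canonical(envelope["@"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig:signature"]["sig:signatureValue"])


person = {
    "a": "<foaf:Person>",
    "foaf:name": "Manu Sporny",
    "foaf:homepage": "<http://manu.sporny.org/>",
}

# First signature: the author signs the graph itself.
first = sign(person, "<http://manu.sporny.org/webid#key-5>", b"author-secret")

# The signed graph is annotated, then the whole thing is signed again.
annotated = dict(first)
annotated["dc:date"] = "2011-02-26T22:18Z"
second = sign(annotated, "<http://authority.payswarm.com/webid#key-873>", b"authority-secret")

outer_ok = verify(second, b"authority-secret")     # secondary author
inner_ok = verify(second["@"], b"author-secret")   # primary author
```

Because each signature covers only the content under "@", the second signer's signature also covers the first signer's signature and the "dc:date" annotation, while the first signature remains independently verifiable.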


Separate Ontology Use Case

This use case is derived from a proposal to have OWL annotations that can be collected together into a separate ontology (and that might even be able to affect the main ontology). The proposal itself can be seen at http://www.w3.org/2007/OWL/wiki/Annotation_System; however, this "use case" is somewhat of a modification of the suggestions in the proposal.

The basic need is to be able to generate multiple ontologies from a single OWL document. One ontology is the ontology that corresponds to the main information in the document. The other ontology (or ontologies) would sit alongside the main ontology. These secondary ontologies might be used to store and reason about things like provenance or certainty.

Aside from the ability to generate multiple ontologies from a single document, there is a need for syntactic entities in the main document to show up as semantic entities in the secondary ontologies. Note that this does *not* directly require reflection, as the syntactic entities don't have their semantic import in the secondary ontologies. Any semantic relationship between the main ontology and secondary ontologies is mediated by relationships outside the semantics of the formalism, again so that there is no need for reflection or reification or ....

So far this is about (OWL) ontologies, not graphs, but it can be turned into a use case for referenceable graphs either by replacing OWL ontologies by RDF graphs or by considering RDF graph naming as the syntactic mechanism for separate ontologies in the RDF encoding of OWL.