Note/20100505

From XG Provenance Wiki

The Presentation

Brief Introduction to Named Graphs

Named Graphs are part of the official SPARQL specification, are supported by many RDF stores, and are used by many Linked Data applications to keep track of provenance, e.g. where they got the data. They are likely to become part of the upcoming RDF 2.0.

A Named Graph uses a URI to refer to a collection of RDF triples.

A Named Graph can be described within the graph itself or in another graph
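
As a minimal sketch of the two points above, a named graph can be modelled as a URI paired with a set of triples. The `example.org` URIs, the `author` and `publishedOn` properties, and the `dataset` and `descriptions_of` names are assumptions made for illustration; they do not come from the presentation.

```python
# A dataset maps a graph URI (the "name") to the set of triples in that graph.
# All URIs below are hypothetical.
EX = "http://example.org/"
G1 = EX + "graph/1"
META = EX + "graph/meta"

dataset = {
    G1: {
        (EX + "alice", EX + "knows", EX + "bob"),
        # A graph can describe itself: G1 is the subject of a triple inside G1.
        (G1, EX + "author", EX + "chris"),
    },
    META: {
        # Or another graph can describe it.
        (G1, EX + "publishedOn", "2010-05-05"),
    },
}


def descriptions_of(graph_uri, dataset):
    """Return (containing graph, predicate, object) for each triple about graph_uri."""
    return sorted(
        (name, p, o)
        for name, triples in dataset.items()
        for (s, p, o) in triples
        if s == graph_uri
    )
```

Calling `descriptions_of(G1, dataset)` returns one description stored in `G1` itself and one stored in the separate `META` graph.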

Alternative to RDF Reification

  • practically speaking, reification takes about three times the RDF storage
  • SPARQL queries for RDF reification are pretty ugly
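
The storage point can be illustrated with a rough count. This is a sketch under assumptions: it uses the standard reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`), a hypothetical `example.org` namespace, and a hypothetical `author` property. The exact overhead in a real store depends on whether the `rdf:type` triple and the original triple are also kept.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace

triples = [
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "knows", EX + "carol"),
    (EX + "bob", EX + "knows", EX + "carol"),
]


def reify(triple, stmt_id):
    """Describe one triple with the four standard reification triples."""
    s, p, o = triple
    return [
        (stmt_id, RDF + "type", RDF + "Statement"),
        (stmt_id, RDF + "subject", s),
        (stmt_id, RDF + "predicate", p),
        (stmt_id, RDF + "object", o),
    ]


def with_reification(triples, author):
    """Attach an author to every triple through a reified statement."""
    out = []
    for i, t in enumerate(triples):
        stmt = f"{EX}stmt/{i}"
        out.extend(reify(t, stmt))
        out.append((stmt, EX + "author", author))
    return out


def with_named_graph(triples, graph_uri, author):
    """Put the triples in one named graph and attach the author to the graph."""
    quads = [(graph_uri, s, p, o) for (s, p, o) in triples]
    quads.append((graph_uri, graph_uri, EX + "author", author))
    return quads


reified = with_reification(triples, EX + "chris")
named = with_named_graph(triples, EX + "graph/1", EX + "chris")
```

For the three sample triples, `len(reified)` is 15 statements while `len(named)` is 4, because the provenance is stated once for the whole graph.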

Syntax of NG: TriX, TriG, and RDF/XML

  • TriX lets you work with XML tools like XSLT or XQuery, but is difficult for humans to read
  • TriG is based on a subset of N3

Usage of NG

  • Allows some RDF triples to be hidden, for access control
  • Provides provenance for a collection of RDF triples, such as the author of these triples
  • Used in the Semantic Publishing Vocabulary to describe whether a statement is asserted or quoted
  • Used to express a trust policy: using NG together with the SP Vocab, you can ask for data from sources that you trust, or data published on a certain date
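
The trust-policy usage can be sketched as a filter over per-graph metadata. The `source` and `published` keys, the graph URIs, and the `select_graphs` function are hypothetical names chosen for this sketch; it does not use the actual vocabulary terms.

```python
from datetime import date

EX = "http://example.org/"  # hypothetical namespace

# Hypothetical metadata recorded about each named graph.
graph_metadata = {
    EX + "graph/a": {"source": EX + "agency", "published": date(2010, 5, 1)},
    EX + "graph/b": {"source": EX + "blog", "published": date(2010, 5, 5)},
    EX + "graph/c": {"source": EX + "agency", "published": date(2010, 5, 5)},
}


def select_graphs(metadata, trusted_sources=None, published_on=None):
    """Return the URIs of graphs that satisfy every condition that is given."""
    selected = []
    for uri, info in metadata.items():
        if trusted_sources is not None and info["source"] not in trusted_sources:
            continue
        if published_on is not None and info["published"] != published_on:
            continue
        selected.append(uri)
    return sorted(selected)


trusted = select_graphs(graph_metadata, trusted_sources={EX + "agency"})
on_date = select_graphs(graph_metadata, published_on=date(2010, 5, 5))
```

Only triples from the selected graphs would then be handed to the application, which is the sense in which the policy is expressed over graphs rather than over individual triples.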

NG tools

  • NG4J, NG API for Jena, supporting provenance-enabled Jena statements, such as finding out which graph an object comes from
  • All current SPARQL stores

NG is grounded in model theory. The SPARQL WG did not adopt these semantics and only includes NG as structural assertions.

NG is also used for versioning as well as for provenance tracking, and for differentiating RDF assertions resulting from different reasoning engines.

Q&A

Luc

  • Q: We want to track provenance at a very fine-grained level in some circumstances, such as the use cases from eGov. Is NG scalable to the point that I can create a NG for each individual RDF triple?
  • A: Yes, you can do that. Normally people create a NG for a collection of triples, but it doesn't matter if you create a NG for each RDF triple; it won't affect query performance.
  • Q: Is there a way to push down and describe the provenance of the S, P, or O?
  • A: You can use reification for such cases, but such a requirement has never come up.
  • Q (Luc): What about a use case from eGov that wants to express where an object in an RDF triple comes from?
  • P: Or you can create a construct for that object and create an NG to describe that construct.
  • Q: Can a set of triples belong to different NGs?
  • A: You can fill NG with the kind of semantics you want. Following the original NG paper, you can do that, but it would mean different things, such as two assertions by different authorities.
  • P: If you look at NGs as sets, then it doesn't make sense to put one RDF triple in more than one NG; it is like putting part of a file into two files. The NG semantics in the original paper really followed the RDF semantics, and this is not very useful for SPARQL.
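
Two of the situations discussed with Luc can be sketched in a few lines: one NG per triple, and the same triple appearing in graphs from two authorities. The graph URIs, the `agencyA` and `agencyB` names, and both helper functions are assumptions for illustration only. The sketch says nothing about performance in a real store.

```python
EX = "http://example.org/"  # hypothetical namespace


def one_graph_per_triple(triples, base=EX + "graph/"):
    """Fine-grained provenance: give every triple its own named graph."""
    return {f"{base}{i}": {t} for i, t in enumerate(triples)}


def graphs_containing(triple, dataset):
    """Return the names of all graphs in which the triple appears."""
    return sorted(name for name, ts in dataset.items() if triple in ts)


triples = [
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
]
fine_grained = one_graph_per_triple(triples)

# The same triple asserted by two hypothetical authorities.
shared = (EX + "alice", EX + "knows", EX + "bob")
by_authority = {
    EX + "graph/agencyA": {shared},
    EX + "graph/agencyB": {shared, (EX + "bob", EX + "knows", EX + "carol")},
}
asserted_by = graphs_containing(shared, by_authority)
```

Here `asserted_by` lists both authority graphs, which matches the reading in the answer above that the shared triple is two assertions by different authorities.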

Deborah

  • Deborah: Is there a reason the SPARQL working group did not adopt the semantics provided by Pat and Jeremy? Did the group think there were multiple options for the semantics and was thus unable to agree on one?
  • A: There are no problems with the semantics, but there are just different types of semantics and people would like to make their own choice.
  • Chris: People are using NG for versioning, which we didn't expect at all. We believe the SPARQL WG should also be open.

Irini

  • Irini: The provenance of a SPARQL query using NG?
  • Chris: Put ground facts into a NG and describe them. There are several papers on this.
  • Irini: where do the triples used to produce my query results come from?
  • Chris: It might below us your query result graph.
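
Irini's second question can be sketched as pattern matching over quads that keeps the graph name with each match, similar in spirit to using `GRAPH ?g { ... }` in a SPARQL query. The quad layout `(graph, s, p, o)`, the sample data, and both function names are assumptions for this sketch.

```python
EX = "http://example.org/"  # hypothetical namespace
G_A = EX + "graph/a"
G_B = EX + "graph/b"

quads = [
    (G_A, EX + "alice", EX + "knows", EX + "bob"),
    (G_B, EX + "alice", EX + "knows", EX + "carol"),
    (G_B, EX + "bob", EX + "age", "42"),
]


def match_with_provenance(quads, pattern):
    """Match an (s, p, o) pattern, with None as a wildcard, and keep each match's graph."""
    ps, pp, po = pattern
    results = []
    for g, s, p, o in quads:
        if ps in (None, s) and pp in (None, p) and po in (None, o):
            results.append(((s, p, o), g))
    return results


def contributing_graphs(results):
    """Return the names of the graphs that supplied at least one matching triple."""
    return sorted({g for _, g in results})


results = match_with_provenance(quads, (EX + "alice", EX + "knows", None))
sources = contributing_graphs(results)
```

For the pattern above, `sources` contains both `G_A` and `G_B`, since each graph contributes one matching triple.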

Jim Myers

  • Q: XML signature vs. RDF signature?
  • Chris: Representing signatures as NG was part of the original paper. Serialize the graph and create a signature for it.
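
The "serialize the graph and create a signature" step can be sketched as follows. This is not the method from the original paper: it assumes every term is a URI (no blank nodes or literals, which real RDF canonicalization must handle), and it uses an HMAC with a shared key as a stand-in for a public-key signature. The function names and the key are hypothetical.

```python
import hashlib
import hmac

EX = "http://example.org/"  # hypothetical namespace


def serialize(triples):
    """Serialize URI-only triples as sorted N-Triples-style lines, so order does not matter."""
    lines = sorted(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
    return "\n".join(lines).encode("utf-8")


def sign(triples, key):
    """Return a hex HMAC-SHA256 over the serialized graph."""
    return hmac.new(key, serialize(triples), hashlib.sha256).hexdigest()


def verify(triples, key, signature):
    """Check that the graph still matches the signature."""
    return hmac.compare_digest(sign(triples, key), signature)


graph = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "carol"),
}
key = b"demo-key"  # hypothetical shared secret
signature = sign(graph, key)
```

The resulting signature could then be stored as a statement about the graph's URI, so the signature travels with the named graph it covers.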

Paul Groth

  • Q: What about having multiple NGs within RDFa?
  • Chris: You will have multiple URIs for these multiple NGs. It might be tricky when dereferencing these multiple URIs.

Deborah

  • Q: What is the SPARQL WG going to do with the NG semantics?
  • Chris: The SPARQL WG will not work on this. The RDF 2.0 WG might work on this. But it's still hard to make everybody agree on a single semantics.
  • Pat: There is a bigger problem in RDF. I don't think this can be done within SPARQL. SPARQL is trying to avoid getting model theory involved, which is the right decision to make things move forward fast.

Paolo

  • Q: Can you ask whether a triple belongs to a particular graph?
  • A: Yes, you can do that

Irini

  • Q: Has anyone done any performance evaluation on NG?
  • Chris: We did it as part of the Berlin benchmark. Olaf did some evaluation of SPARQL queries with NG and noticed no major performance issues.
  • Pat: Someone has an industrial-scale quad store and has no performance problems.