RDF Next Steps Workshop, Breakout on Graph Metadata

27 Jun 2010

See also: IRC log


Ivan Herman, Axel Rauschmayer, Fabien Gandon, Elisa Kendall, Axel Polleres, Jun Zhao, Thomas Lörtsch, M. Scott Marshall, Jie Bao, Atanas Kiryakov, Mike Dean
Elisa Kendall


<ivan> scribenick: AxelPolleres

<scribe> scribe: AxelPolleres

attendees: MikeDean, ivan, elisa, fabien, thomas, scott, jun, atanas, axelR, AxelPolleres, jie

elisa: 2 main topics: 1) named graphs, 2) annotations
... issues: bnode scope, bnode as graph name, needed vocabulary

axelR: requirements? named graph as used now, e.g. in Sesame for authentication, can this be solved?

scott: what do you mean by authentication? signing graphs?

axelR: sorry, meant authorization

mikeDean: also an issue about whether or not more than one NG per doc

elisa: also, can one graph span multiple docs
... in earlier discussions we had consensus that both are fine/legal

axel: is syntax N-quads vs TriG style vs. Graphs as literals an issue?

elisa: that is connected to vocabulary

jun: relation to definitions of named graphs in SPARQL is important

elisa: let's have one session on named graphs, one session for annotations.

fabien: n-tuples are one model...

atanas: datasets for data integration, named graphs

axel: datasets vs named graphs...

fabien: what about the default graph/background graph?

thomas: general solution or specific use case.

jun: sets of named graphs for any triple are possible in practice...

ivan: ... and I can query about that whole set, what's the difference between RDF datasets and sets of named graphs?

axel: are we trying to solve the problem here, or draft a charter?

fabien: collect/understand problems

ivan: requirement is max compatibility between what we define for NG and how it is defined/used in SPARQL.

fabien: (SPARQL) Update has an effect on named graphs... if I modify a graph, what does that mean for its meta-data?

ivan: RDF semantics should be silent about that

atanas: more important is that Update also needs mechanisms to update the meta-data

ivan: I'd be surprised if a statement on the NG meant that this statement is valid "on all triples in the graph"

mikeDean: there is a perception that NG is a replacement for reification... your statement implies that it is NOT.
... for dc:creator etc. NG is a reasonable replacement, not sure for timestamps that apply to triples.

ivan: then we are back at quads, which can be seen as special syntax for NG

atanas: we want to be able to add metadata to sets of statements in a graph.
... it's a matter of adding metadata to subsets/subgraphs

fabien: from theoretical POV you have a notion of hypergraph that is equivalent to named graph, nested graph etc.
... one table per named graph could be non-efficient.

atanas: we need a model that allows us to distinguish between the different implementations
... best model at the moment seems to be multi-graphs (Naso, can you give a reference for multi-graphs?)

thomas: quad plus a fifth element for the triple identifier
... that's what franz did.
... four or five is a practical implementation, theoretically 4 is enough.

<FabGandon> at one point we need to start collecting use cases, examples and counter-examples

axel: one layer seems not to be enough to cover all UCs
... e.g. if we want to talk about parts of some statements/individual statements within a graph.

thomas: syntactically, practically identifying single triples would help a lot.

scott: we talk about solutions (quads), we should talk about requirements, UCs.

<webr3> can I ask, is the purpose of a named graph to name (/reference) a distinct set of triples (that never changes)?

scott: I have another UC, which I didn't present yesterday.

elisa: we all agree that capturing UCs is essential.

ivan: we need to know what it is that the community is using TODAY, what is implemented and how
... if we find out that e.g. 99% are just quad stores then maybe that is an indicator of what we should do

fabien: might not be good for all UCs

scott: not sure how multiple graph membership could be handled with quads

fabien: theoretically all can be mapped back to normal graphs, we should just pick one reasonable model on top.

axelR: we need abstraction in an intuitive way, maybe look at TopicMaps work?

jie: let's keep in mind semantic consequences on combinations of NGs.

atanas: currently NGs have not a lot of semantics.

elisa: let's break, after the break we look at Naso's slides and talk about UCs

scott: I can present a UC in 3 slides ...


<FabGandon> Graph-based Knowledge Representation http://www.springer.com/computer/database+management+&+information+retrieval/book/978-1-84800-285-2

scribe: jun

<ivan> scribenick: jun

presentation by atanas kiryakov (ontotext) "triplesets: tagging and grouping in RDF datasets"

atanas: slide1 ng is an rdf graph with a uri assigned as a name
... slide1 sparql spec also has the def of dataset
... slide2 in the data integration context, each dataset can be treated as a NG
... it's unclear what the formal consequences of adding or removing a statement from a NG should be
... I am not talking about entailment, even though I used the word consequences
... our solutions to the missing semantics. let a dataset be represented as RDF multi-graphs, a set of quadruples of type <S, P, O, G>
... if one statement is in multiple graphs, updating this statement in one graph doesn't mean it will be updated in all other graphs
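
As a rough illustration of the multi-graph reading atanas describes (a sketch only; the `Quad` type, the helper names and the `ex:` identifiers are assumptions made for illustration and are not taken from the slides), a dataset can be held as a set of `<S, P, O, G>` quadruples, so that removing a statement from one graph leaves its copies in other graphs untouched:

```python
from typing import NamedTuple, Set


class Quad(NamedTuple):
    """One statement together with the graph it is asserted in (hypothetical type)."""
    s: str
    p: str
    o: str
    g: str


def remove_statement(dataset: Set[Quad], s: str, p: str, o: str, g: str) -> Set[Quad]:
    """Return a new dataset without the statement (s, p, o) in graph g only."""
    return dataset - {Quad(s, p, o, g)}


def graphs_containing(dataset: Set[Quad], s: str, p: str, o: str) -> Set[str]:
    """Names of all graphs in which the triple (s, p, o) is asserted."""
    return {q.g for q in dataset if (q.s, q.p, q.o) == (s, p, o)}


dataset = {
    Quad("ex:alice", "ex:knows", "ex:bob", "ex:g1"),
    Quad("ex:alice", "ex:knows", "ex:bob", "ex:g2"),
    Quad("ex:bob", "ex:knows", "ex:carol", "ex:g1"),
}

# Removing the statement from ex:g1 does not touch its copy in ex:g2.
after = remove_statement(dataset, "ex:alice", "ex:knows", "ex:bob", "ex:g1")
```

Here `after` still holds the `ex:alice ex:knows ex:bob` statement in `ex:g2`, which matches the behaviour described for updates in one graph.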

mike: do you implement any relationship between a default graph and a NG?

atanas: no, we don't
... when you have the spec. of add/remove behavior at both graph and dataset levels, you can do both implementations
... management of parts of a dataset is also needed
... e.g. dealing with only part of a NG when you try to deal with a sub-dataset
... the model should allow easy statements with such groups (part of datasets), independent from the NG

ivan: you are kind of mixing up the abstract data model and the implementations

atanas: some statements are true in a specific context, but not in other contexts. I need metadata about the quads
... we have scenarios where we need to group quads and say things about them, and they are different from grouping triples.
... shows the diagram. one triple in multiple NGs. some triples are in no tripleset, some are in one tripleset, and some are in multiple triplesets.

AxelR: can you create one graph for each triple?

<AxelPolleres> so.... what if ngs can not only overlap/be nested, but also don't have to be disjoint... not sure whether I am clear.

atanas: that's too fine-grained for our case

<AxelPolleres> please put the link of the slides on the wikipage.

atanas: NG "owns" statements

Elisa's slides about NGs for her ontology modelling

ekendall: the odm diagram of NG and graphs

<AxelPolleres> I firmly believe that we need a simple model.

ekendall: a NG has 0..n triples, and a triple belongs to 0..n graphs
... a ng is part of another ng
... a triple in what context belongs to a graph? in uml, you create specialization to state such contextual information

<AxelPolleres> quads identifying each triple maybe enough even for naso's UC.... why wouldn't they?

<AxelPolleres> s p o id.

<AxelPolleres> id :intripleset t1, ... tn.

<AxelPolleres> id :inNG ng1.
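
One way to read the three lines above is as a statement identifier with two membership relations. A minimal sketch, assuming hypothetical names (`add_statement`, `tripleset_members`, the `ex:` terms) that do not come from the discussion:

```python
from collections import defaultdict

statements = {}                   # statement id -> (s, p, o)
in_graph = {}                     # statement id -> the one named graph holding it
in_tripleset = defaultdict(set)   # statement id -> names of triplesets tagging it


def add_statement(stmt_id, triple, graph, triplesets=()):
    """Record `s p o id`, `id :inNG graph` and `id :intripleset t1, ... tn`."""
    statements[stmt_id] = triple
    in_graph[stmt_id] = graph
    in_tripleset[stmt_id].update(triplesets)


def tripleset_members(tripleset):
    """All triples tagged with the given tripleset, regardless of their graph."""
    return {statements[i] for i, tags in in_tripleset.items() if tripleset in tags}


add_statement("id1", ("ex:s1", "ex:p", "ex:o1"), "ex:ng1", ["ex:t1", "ex:t2"])
add_statement("id2", ("ex:s2", "ex:p", "ex:o2"), "ex:ng1", ["ex:t2"])
add_statement("id3", ("ex:s3", "ex:p", "ex:o3"), "ex:ng2")
```

In this reading a statement sits in exactly one named graph but may carry any number of tripleset tags, so grouping by tripleset is independent of the NG.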

ivan: i can have a NG having one triple. for me a quad is just a name for another atomic singleton NG

<AxelPolleres> that's it

mdean: sparql doesn't have the notion of subgraphs

ivan: I just want to query one graph, I don't care anything else

<AxelPolleres> something like g1 rdfs:GraphIncludes g2 might be a useful extension of RDFS semantics.

ivan: for me, the abstract view presented by elisa works

<mscottm> mike: but in some cases, we must query a million singleton named graphs

AxelR: who creates a statement and when is different from who has the authority to access the statement

<mscottm> scott: that points to the notion of making statements about quads in some types of implementations, i.e. one case is actually quintuples

<mscottm> ..quintuples is where we are trying to manage metadata about named graphs such as many singleton named graphs

tomlurge: we need syntax sugar to ease the implementation issue

<AxelPolleres> quads and named graphs are entirely interchangeable for most UCs.

elisa's slides are at p45 of http://bit.ly/9PLxx8

FabGandon: we need to align the different notions of NGs.

ivan: it scares me to see the notion that we would need quadruples on top of rdf triples

mike: should NG be really a sub-class of RDF graph?

elisa: I might need to go back and revisit it after the discussions on this context

mike: it would also make sense to align the concept from sparql spec.

ivan: subclass in the UML sense. Just to be precise!!

elisa: we just took the NG paper and modelled in UML. that's all we did.

AxelR: we need to keep the model intuitive, otherwise it won't be useful

ivan: what we do with singletons; the semantics of the quads; or whether we stay with "set" kind of semantics
... we need to have all the discussions and identified issues well recorded on the wiki

<AxelPolleres> Axel's charter wishlist: 1) standardise SPARQL datasets plus a notion of graph inclusion and semantics for it, 2) extend to RDFS

ivan: there seems to be different interpretations about Jeremy's paper

AxelPolleres: what the graph notion means semantically? in terms of RDF/RDFS semantics? and the relationship with the sparql

ivan: we should get the documentation on the wiki done before lunch and discuss annotations after lunch

<AxelPolleres> 3) upwards compatibility with non-named graphs.

FabGandon: at one moment, we need to specify what we need to the "Syntax" group

<AxelPolleres> in the sense that a non-named graph is a dataset only consisting of a default graph?

<AxelPolleres> definition of dataset merge as a generalisation of graph merge?

<AxelPolleres> note that this latter would include talking about bnode scope in named graphs, likely...

<AxelPolleres> (if wanted, I can put that on the wiki)
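
To make the "dataset merge as a generalisation of graph merge" idea concrete, here is a deliberately naive sketch. The dataset layout (`{"default": ..., "named": ...}`) and the function name `merge_datasets` are assumptions for illustration. It simply unions default graphs and same-named graphs and does NOT rename blank nodes, so it is only adequate for ground data; the bnode scope question raised above is left open:

```python
def merge_datasets(a, b):
    """Merge two datasets of the form {"default": set, "named": {name: set}}.

    Naive union: no blank-node renaming is attempted (assumption: ground triples only).
    """
    named = {name: set(triples) for name, triples in a["named"].items()}
    for name, triples in b["named"].items():
        # Graphs with the same name are unioned; new names are added as-is.
        named.setdefault(name, set()).update(triples)
    return {"default": a["default"] | b["default"], "named": named}


ds1 = {
    "default": {("ex:g1", "ex:creator", "ex:alice")},
    "named": {"ex:g1": {("ex:a", "ex:p", "ex:b")}},
}
ds2 = {
    "default": set(),
    "named": {
        "ex:g1": {("ex:a", "ex:p", "ex:c")},
        "ex:g2": {("ex:x", "ex:q", "ex:y")},
    },
}
merged = merge_datasets(ds1, ds2)
```

A dataset with an empty `"named"` part behaves like a plain graph under this function, which is the sense of upwards compatibility with non-named graphs mentioned in item 3.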

atanas shows his use cases about tripleset

<AxelPolleres> 4) upwards compatibility with SPARQL datasets

atanas: each singleton triple belongs to a graph, and we have metadata for each of these singleton graphs

FabGandon: I see named graphs in the right example expressed in triplesets too

atanas: we want to be able to express the example in a much prettier way using NGs

mike: it might also help to show the use of reification in your example too

FabGandon: we also should try to express in N3

Use case by Scott

scott: w3c hcls had a KB years ago, as a kb warehouse
... we have multiple sparql endpoints. at each endpoint, we have a bunch of named graphs
... we would like to know which NGs are behind each endpoint and what is in them

<mscottm> http://www.freebase.com/view/base/politeuri/sparql_endpoint#

scott: who created the rdf, which version of database, etc

<mscottm> http://semanticweb.org/wiki/VoiD

<mscottm> http://sourceforge.net/projects/omv2/

scott: I don't know where to go and look for such information when going to a sparql endpoint

<AxelPolleres> Good additional issue: How do named graphs semantics relate to FYN?

FabGandon: you have follow-your-nose, take a uri of a NG, you get the property of that graph. is this what you want?

scott: how to create query federation by following the NGs. FYN could be a good way to go if it works

ivan: I don't see how your requirement is directly related to quads
... it's a requirement for sparql 1.1, to provide descriptions of named graphs

<AxelPolleres> scott, please check http://www.w3.org/TR/sparql11-service-description/

ivan: you should take a look at the sparql service description document

FabGandon: we should collect all the different names related to Named Graphs

ivan: does sparql use NG?

<AxelPolleres> fabien: nested graphs, named graphs, dataset.

ivan: does sparql use the term named graphs?

<AxelPolleres> http://www.w3.org/TR/sparql11-query/#rdfDataset

editing the wiki to track identified issues

<AxelPolleres> http://www.w3.org/2001/sw/wiki/RDF/NextStepWorkshop#Axel.27s_wishlist_on_Graph_metadata

<AxelPolleres> I put my wishlist on the wiki page

<FabGandon> we need to identify all the notions that are used here and compare them, a number of disagreements are on the terms more than the definitions: named graphs, typed graphs, typed nested graphs, typed nested named graphs, quadruples, triple sets, n-tuple, included graphs, etc.

<AxelPolleres> http://sw.deri.org/2008/07/n-quads/

<mdean> scribenick: mdean

back from lunch

review updates to http://www.w3.org/2001/sw/wiki/RDF_Core_Charter_2010#Graph_Metadata

Axel's wishlist http://www.w3.org/2001/sw/wiki/RDF/NextStepWorkshop#Axel.27s_wishlist_on_Graph_metadata

Naso: efficient addition/removal of statements in datasets?
... hypergraph or multi-graph

<tomlurge> http://www.w3.org/2001/sw/wiki/RDF_Core_Charter_2010#Graph_Metadata

moving on to annotations

<jun> scribenick: mdean

Axel: specific annotation domains and their semantics

<timbl> Darn, missed named graphs.

Axel: perhaps should be handled in other group

<AxelPolleres> http://www.w3.org/2001/sw/wiki/RDF_Core_Charter_2010#Graph_Metadata gives a summary of issues we discussed so far... now discussing whether concrete annotations are in scope or not

Fabien: practical/scope reasons not to work on specific vocabularies

<timbl> I wish people would not confuse graphs and documents. graphs can be named indirectly though documents' URIs, just as strings can and xml infosets can, but the URI does not identify the graph in the sense of the I in URI.

Fabien: lots of good progress in vocamps

<AxelPolleres> @Anchakor, the idea here was to define some mechanism that allows to define an inclusion relationship between them, e.g. rdf:subGraphOf

Jun: foundation for providing annotations should be in RDF Core

<AxelPolleres> +1 to that point of Jun

Fabien: named graphs important for annotations

<AxelPolleres> (foundations should be laid)

<timbl> AxelPolleres, log:includes seems to match that need

Jun: may require additional features depending upon how named graphs are implemented

DavidWood: also impacts bnodes, etc.

<AxelPolleres> timbl, fair enough, except that we might want - in a rubber-stamped standard - to give that a URI in the rdf: ns?

<timbl> Sure.

<AxelPolleres> we're on the same page...

<timbl> (If we need named graphs, how did we ever get by without named Strings, and Named Integers)?

Jun: need to agree on meanings of annotations

dwood: good to include metadata about graph in the graph itself
... helps, and commonly done now

<timbl> Well, commonly done now is metadata about a document in the document like: <> a IRCLog.

<AxelPolleres> (timbl, this is about making graphs/triples dereferenceable, resources are dereferenceable already ... except if you're talking about the literals-as-subject-issue, for which I'd refer rather to the syntax or semantics breakout groups)

dwood: naming graphs in documents vs databases - people now ready to move on

<AxelPolleres> I should have said "referable" rather than "dereferenceable", probably, didn't mean to imply HTTP

<timbl> No, I'm not talking about literals-as--subject, I assume that will be fixed.

Elisa: best practices or something more than that

Ivan: WG needs to specify named graphs - not sure about annotations (probably no)

<AxelPolleres> don't understand what "how did we ever get by without named Strings, and Named Integers" means then

<timbl> Strings and Integers are unnamed.

<timbl> That works fine.

<timbl> So are graphs in N3.

Ivan: already have more on list than envisaged - keep WG quick and small

<timbl> You can name things around them using relations like log:semantics

<timbl> But you don't actually name the graph or the string or the literal.

Ivan: should be handled somewhere (else) by community

<dwood> Tim, do you have concerns that literals-as-subjects could result in many RDF graphs without any URIs in subjects or objects? That is a usage scenario that has been discussed around here.

Scott: include examples of use in documents

Axel: SPARQL WG time-allowed features

<timbl> I am worried that the beauty of N3 which can solve so many problems is going to be messed up by a great asymmetry in a new language with named graphs.

Ivan: messy
... necessary for SPARQL 1.1, given no workshop like this one

<timbl> dwood, I am not at all concerned about graphs not having URIs in theory any more than I am about numbers not having URIs.

Ivan: re-chartered 3 times for IPR reasons

<timbl> I don't know why it is so difficult to explain the need for graphs to be literals in the language just like strings. The problem is it is too obvious to me.

Sandro: charter leaves some room for WG to prioritize

Elisa: requirement from provenance WG

<timbl> Yes, Anchakor, you can use sameAs I think to name any literal, graph or not.

Ivan: separate path for provenance

<FabGandon> Feedback, proposal, opinions, references gathered about named/nested/annotated graphs in the W3C Workshop on RDF next steps: http://www.w3.org/2001/sw/wiki/RDF_Core_Charter_2010#Named_Graphs

Ivan: WG has to specify what named graphs are - syntax, semantics, etc.

<AxelPolleres> timbl, graphs as literals is one way to tackle this issue, named graphs is another... no?

<Anchakor> timbl: great, if owl:sameAs was merged in rdf namespace it would wash away a lot of pain... and I agree with similarity of graphs and literals

Jun: use case gathering process for named graphs?

Ivan: probably

<AxelPolleres> my personal idea would be that all that can be done if we make RDF datasets as used in SPARQL a first-class citizen of RDF.

<timbl> AxelPolleres, no, it is not an equally good method.

Ivan: UC can be formal document or more implicit - to be decided by WG

<timbl> If you have graph literals you can do anything, including any form of named graphs.

Ivan: RDFa didn't have a separate UC document

<timbl> If you have named graphs then you cannot have graph literals and you can't just write a little N3 rule without sma nems.

Elisa: use cases for provenance should be considered for named graphs
... add e.g. provenance, annotations, etc.

<timbl> (BTW my login is not accepted for editing the wiki http://www.w3.org/2001/sw/wiki/RDF_Core_Charter_2010#Named_Graphs )

Elisa is actively editing the charter

<AxelPolleres> sure, and vice versa, "named graphs that aren't named" can simply be represented by bnodes ... that's just a dual solution in my opinion, isn't it?

<timbl> anchakor, in N3, owl:sameAs is just "=" so the namespace is not seen.

<AxelPolleres> one could provide syntactic sugar looking similar to N3 graph literals for that case in a Turtle/N-triples/N3 syntax standardisation, no?

Elisa: leave in document for now
... any proposals?

<timbl> AxelPolleres, write { ?x a Man } => { ?x a Human}. in named graphs, then.

Elisa: TBD based on named graphs

<AxelPolleres> timbl, RDF rules a la N3 rules is not in scope of an RDF WG, IMO, but a matter of defining a RIF dialect with an N3 surface syntax, rather.

<timbl> AxelPolleres, write { foo.html licence l:lgpl } chron:before "2007-06-7Z". in named graphs, then.

<timbl> That is a provenance use case I assume

Elisa: want to be able to reason over annotations

Axel: provide examples

<AxelPolleres> RDF Dataset: { foo.html licence l:lgpl }_:c . _:c chron:before "2007-06-7Z".

<AxelPolleres> this dataset consists of one named graph and the default graph contains the annotation for that graph... makes sense?
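
A minimal sketch of that dataset, assuming the reading in which the default graph is kept separate from the named graphs (as in the SPARQL dataset definition, where a pattern outside a GRAPH clause is matched against the default graph). The dictionary layout and the helper names `match`, `query_default` and `query_named` are hypothetical; stores that expose a union of all graphs as the default graph would behave differently:

```python
# Default graph holds the annotation; the named graph _:c holds the annotated triple.
dataset = {
    "default": {("_:c", "chron:before", "2007-06-7Z")},
    "named": {"_:c": {("foo.html", "licence", "l:lgpl")}},
}


def match(graph, pattern):
    """Triples in `graph` matching `pattern`; None in the pattern is a wildcard."""
    return {t for t in graph if all(p is None or p == v for p, v in zip(pattern, t))}


def query_default(ds, pattern):
    """Match against the default graph only."""
    return match(ds["default"], pattern)


def query_named(ds, name, pattern):
    """Match against one named graph only."""
    return match(ds["named"].get(name, set()), pattern)


in_default = query_default(dataset, ("foo.html", None, None))
in_named = query_named(dataset, "_:c", ("foo.html", None, None))
```

Under this separated reading `in_default` is empty and the licence triple is only found through the graph name `_:c`, so asserting the annotation does not by itself assert the annotated triple.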

<timbl> Well yes, but it is messy.

Axel: should provide extensibility mechanism for annotations

<FabGandon> messy ?

<AxelPolleres> Axel: We should provide an extensibility mechanism for annotations, exemplify it, in terms of how annotations can be given a semantics.

<timbl> well, having to introduce the bnode _:c is messy.

<FabGandon> it is likely that we will discourage bnodes for graph names

<timbl> re RDF Dataset: { foo.html licence l:lgpl }_:c . _:c chron:before "2007-06-7Z". --- does the default dataset contain the triple { foo.html licence l:lgpl } ?

<FabGandon> to me yes

Ivan: other 2 groups are finished - not sure about semantics

<timbl> In other words, will a simple SPARQL query for that triple return it?

<FabGandon> and I would say it also contains the triple _:c chron:before "2007-06-7Z"

<AxelPolleres> timbl, same as for lists ... :s :p (1 2 3)

<timbl> If so, we have a problem, in that it isn't true, as it was only true before 2007.

Sandro: avoid using term Named Graph - N3 has graph literals

<timbl> +1

Elisa: suggest RDF Graph Identification

<FabGandon> I agree: let's talk about "rdf graph identification"

<timbl> Do you want just to name it or to express its contents?

<AxelPolleres> I am fine with that, I just think we shouldn't preclude naming

<timbl> For a literal, its identity is only its contents.

<timbl> The identity of "chat" the string is just its contents.

<timbl> You can't have two strings "chat" and "chat" and maintain they are different.

<timbl> If you give graphs names, will you be able to have identical graphs which are though not equal because they don't have the same name? Ugh... remember you will have to do logic with these things.
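
The contrast timbl draws can be sketched with plain Python values. This is an illustration under simplifying assumptions: the `ex:` names are made up, graphs are ground (no bnodes, so equality is plain set equality rather than graph isomorphism), and a "named graph" is modelled as a (name, triples) pair:

```python
triples = {("ex:a", "ex:p", "ex:b")}

# Graph literals: identity is the content, like the string "chat".
literal_1 = frozenset(triples)
literal_2 = frozenset(triples)

# Named graphs modelled as (name, content) pairs: the name takes part in identity.
named_1 = ("ex:n1", frozenset(triples))
named_2 = ("ex:n2", frozenset(triples))

same_literal = literal_1 == literal_2      # content decides
same_named = named_1 == named_2            # names differ, so the pairs differ
same_content = named_1[1] == named_2[1]    # yet the triples are identical
```

So under this modelling two named graphs can have identical content while not being equal, which is exactly the situation the question above worries about.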

<Anchakor> what about graphs composed by applying a function on some other graphs? (ex: a graph representing an union of 2 different graphs)

<timbl> You will need to make things depend on subgraphs.

<FabGandon> my opinion: we need (1) a mechanism to identify a graph (2) a mechanism to say a triple is included in one or several identified graphs.

<AxelPolleres> timbl, we have a generic proposal for giving semantics to annotations (such as time in your example), cf. http://www.polleres.net/presentations/20100626W3C_RDF_NS_RDFneedsAnnotations.pdf ... that should work independent of the syntactic representation, be it graph literals or named graphs.

<timbl> You will need to derive a graph from a string by parsing it.

<Naso> http://www.w3.org/2001/sw/wiki/RDF/NextStepWorkshop#The_Semantics_of_RDF_Datasets_and_Dealing_with_Quad-sets

<jun> +1 to fabien

<AxelPolleres> What's wrong with subgraphs? if we give a semantics to log:includes/rdf:subgraphOf?

<AxelPolleres> Fabien, and log:includes/rdf:subgraphOf allows exactly that, or no?

all groups have finished

<Naso> Alex, see the final slide in the slides - you have the example why doing the same with sub-graphs is a bit cumbersome

<Naso> I meant Axel, sorry

<Anchakor> I wish this vocabulary was more propagated: http://www.w3.org/2004/03/trix/rdfg-1/

<FabGandon> if graphs are triple sets, and graph inclusion is set inclusion I think we can have a simple and clear graph inclusion mechanism
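
Read literally, that proposal is a one-liner. A sketch, assuming ground graphs represented as Python sets of triples (bnodes would need more care) and a hypothetical helper name `graph_includes`:

```python
def graph_includes(outer, inner):
    """True if every triple of `inner` is also in `outer` (plain set inclusion)."""
    return inner <= outer


g_all = {
    ("ex:axel", "ex:knows", "ex:ivan"),
    ("ex:axel", "ex:knows", "ex:fabien"),
}
g_part = {("ex:axel", "ex:knows", "ex:ivan")}
g_other = {("ex:jun", "ex:knows", "ex:elisa")}

part_in_all = graph_includes(g_all, g_part)
other_in_all = graph_includes(g_all, g_other)
```

With this definition inclusion is reflexive and transitive for free, since it inherits those properties from the subset relation.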

<timbl> AxelP, You didn't include N3 as alternatives in your slides even though we have been doing just what you are talking about for almost 10 years with it?

<AxelPolleres> naso, not if we provide syntactic sugar a la N3 for it

<AxelPolleres> ... as discussed with timbl further up in IRC

<FabGandon> but then again the question here is not to solve these questions but to decide on their inclusion in the charter

<FabGandon> my opinion is yes let's have this question of inclusion in the charter

<timbl> :axel f:knows :ivanherman true ‘‘in http://polleres.net/foaf.rdf’’ ?

<AxelPolleres> timbl, true, that is an omission in the syntactic representation part, but those slides were not meant to propose syntax, just to raise the issue that we need to fix/agree on a syntax, and - equally important - to discuss a generic semantics framework for annotations.

<timbl> We would say <http://polleres.net/foaf.rdf> log:semantics [ log:includes {axel f:knows :ivanherman }] without inventing any more sugar

<timbl> and then reason over that

<AxelPolleres> ... as mentioned above, I think that the N3 way of writing it fits with a named graph/quad way of writing it.

<timbl> In fact I think when you combine the annotations with the data in the graphs in your logic you will need the same generality as cwm has to write rules about what is inside a graph.

<timbl> Like { ?doc author Axel. ?doc says { ?axel knows ?y }} => { Axel knows ?y }.

<AxelPolleres> the syntax I have in the slides, e.g. " :axel f:knows ivanherman {polleres.net/foaf.rdf}" was meant as an abstract notation for an annotation, and there are several alternatives to write it.

<timbl> Anyway, late here, gtg

<AxelPolleres> thanks for the discussion, appreciated, timbl, anyways, I still think that the rules part of N3 is not a core RDF working groups concern.

back to #rdfn

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2010/07/03 10:05:36 $