This is one of the possible Use Cases.
1. Abstract
Rules referring to both a Computer Science publication server and a Computer Science taxonomy or ontology are used to determine in which areas the co-authors of a given author have published.
2. Status
One of the use cases proposed by REWERSE.
3. Links to Related Use Cases
The use case described here is similar to the use case Enterprise Information Integration, as both stress the integration of heterogeneous data.
4. Relationship to OWL/RDF Compatibility
5. Examples of Rule Platforms Supporting this Use Case
This use case is implemented in Xcerpt.
6. Benefits of Interchange
7. Requirements on the RIF
- Rule-based combined access to both XML and RDF data is needed.
8. Breakdown
8.1. Main Sequence
Typically, the publication server will have its data in XML. Indeed, publications are naturally grouped by the proceedings, journal issue, or book in which they appeared. In contrast, a taxonomy of research fields is naturally expressed in RDF.
The following rules are views accessing both the XML data of the Computer Science publication server DBLP and a (hypothetical) Computer Science taxonomy in the spirit of SKOS, the Simple Knowledge Organisation System.
8.1.1. Constructs
Like RDF data, XML documents are accessed by Datalog-like queries with the following constructs:
- a[ b[], c[] ] is a query retrieving XML documents of the form
<a> <b/> <c/> </a>
- a[[ b[] ]] is a query retrieving XML documents of the form
<a> ... <b/> ... </a>
where ... stands for any content, i.e. the query is an incomplete specification of XML documents to retrieve.
- a{ b, c } is a query retrieving XML documents of one of the forms
<a> <b/> <c/> </a>
<a> <c/> <b/> </a>
- a{{ b, c }} is a query retrieving XML documents of one of the forms
<a> ... <b/> ... <c/> ... </a>
<a> ... <c/> ... <b/> ... </a>
- a[ desc b ], a[[ desc b ]], a{ desc b }, and a{{ desc b }} are queries retrieving a-rooted XML documents that have a b element at any depth.
In other words, an XML document on the Web is seen as a fact.
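The four bracket forms and the desc construct can be illustrated with a minimal Python sketch. It is not Xcerpt and makes simplifying assumptions: it compares only the tag names of direct children, and it ignores variables, attributes, text content, and nested patterns. All function names are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET


def child_tags(elem):
    """Tag names of the direct children of elem, in document order."""
    return [child.tag for child in elem]


def match_total_ordered(elem, root, children):
    """a[ b[], c[] ]: exactly these children, in this order."""
    return elem.tag == root and child_tags(elem) == list(children)


def match_partial_ordered(elem, root, children):
    """a[[ b[], c[] ]]: these children in this order, other content allowed."""
    tags = iter(child_tags(elem))
    # 'tag in tags' consumes the iterator, so this is a subsequence test.
    return elem.tag == root and all(tag in tags for tag in children)


def match_total_unordered(elem, root, children):
    """a{ b, c }: exactly these children, in any order."""
    return elem.tag == root and sorted(child_tags(elem)) == sorted(children)


def match_partial_unordered(elem, root, children):
    """a{{ b, c }}: these children in any order, other content allowed."""
    remaining = child_tags(elem)
    for tag in children:
        if tag not in remaining:
            return False
        remaining.remove(tag)  # each pattern child needs its own element
    return elem.tag == root


def match_desc(elem, root, tag):
    """a[ desc b ]: a b element at any depth below the root a."""
    return elem.tag == root and any(
        d.tag == tag for d in elem.iter() if d is not elem
    )


doc = ET.fromstring("<a><b/><x/><c/></a>")
print(match_total_ordered(doc, "a", ["b", "c"]))      # extra <x/> present
print(match_partial_ordered(doc, "a", ["b", "c"]))    # extra content allowed
print(match_partial_unordered(doc, "a", ["c", "b"]))  # order is irrelevant
```

The sketch separates the two dimensions the notation encodes: single versus double brackets (complete versus incomplete in breadth) and square versus curly brackets (ordered versus unordered).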
Each of the following rules is given an intuitive reading in English. Each rule body is a conjunctive query to an XML document, or to both an XML document and an RDF specification. Each rule head is an RDF triple (the syntax of which can be rephrased as one wishes).
8.1.2. In which areas have my co-authors published?
This query can be separated into three tasks:
- Find the co-authors of a person.
- Find the papers published by these co-authors.
- Find the areas that these papers are associated with.
area-of-coauthors[var Person, var AreaLabel] :-
  in "http://dblp.uni-trier.de/xml/dblp.xml"
    dblp {{
      desc article {{
        desc author [ var Person ],
        desc author [ var Co-author ]
      }},
      desc article (id = var PaperID) {{
        desc author [ var Co-author ]
        without desc author [ var Person ]
      }}
    }}
  AND in "http://example.com/cs-ontology.rdf"
    (var PaperID, skos:related, var Area)
    AND (var Area, skos:prefLabel, var AreaLabel)
Salient features:
- The rule mixes access to XML and RDF.
- A co-author is defined by sibling author elements anywhere in the document. This definition easily covers all publications in DBLP and requires a search at indefinite depth (the pattern is incomplete in depth).
- The pattern matching the dblp root element is incomplete in breadth, allowing for many more than just the two specified sub-elements.
- Other papers authored by the co-author are also descendants of the dblp root element.
- The 'without' expression is a simple form of "scoped negation as failure" avoiding an author being considered a co-author of him/herself.
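The three tasks of this rule can be sketched in Python as follows. This is an illustration under stated assumptions, not an Xcerpt evaluation: the sample XML, the id attribute values, the triples, and the function name area_of_coauthors are all hypothetical, and the RDF data is reduced to a list of (subject, predicate, object) tuples.

```python
import xml.etree.ElementTree as ET

# Hypothetical DBLP-like sample; the real dblp.xml is far larger.
SAMPLE_DBLP = """
<dblp>
  <article id="p1"><author>Alice</author><author>Bob</author></article>
  <article id="p2"><author>Bob</author><author>Carol</author></article>
  <article id="p3"><author>Alice</author><author>Bob</author></article>
</dblp>
"""

# Hypothetical taxonomy data in the spirit of SKOS.
SAMPLE_TRIPLES = [
    ("p2", "skos:related", "area:db"),
    ("p3", "skos:related", "area:web"),
    ("area:db", "skos:prefLabel", "Databases"),
    ("area:web", "skos:prefLabel", "Web"),
]


def area_of_coauthors(person, dblp_root, triples):
    """Labels of areas in which co-authors of person published without person."""
    articles = list(dblp_root.iter("article"))  # articles at any depth

    # Task 1: co-authors are sibling authors of person in some article.
    coauthors = set()
    for article in articles:
        names = {a.text for a in article.iter("author")}
        if person in names:
            coauthors |= names - {person}

    # Task 2: papers with a co-author as author but 'without' person.
    paper_ids = set()
    for article in articles:
        names = {a.text for a in article.iter("author")}
        if person not in names and names & coauthors:
            paper_ids.add(article.get("id"))

    # Task 3: join with the taxonomy triples to obtain area labels.
    areas = {o for s, p, o in triples if p == "skos:related" and s in paper_ids}
    return {o for s, p, o in triples if p == "skos:prefLabel" and s in areas}


root = ET.fromstring(SAMPLE_DBLP)
print(area_of_coauthors("Alice", root, SAMPLE_TRIPLES))
```

In the sample, papers p1 and p3 are excluded for Alice because she authored them herself, which mirrors the role of the 'without' expression in the rule.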
8.1.3. For which areas do papers exist that cite one of my papers?
Again the query can be divided into three parts:
- Find the papers authored by a person.
- Find the papers citing these papers.
- Find the areas of computer science these papers are related to.
citing-area[var Person, var Area] :-
  author-of-paper[var Person, var Paper]
  AND citing-paper[var CitingPaper, var Paper]
  AND area-of-paper[var CitingPaper, var Area]

citing-paper[var Paper, var CitedPaper] :-
  in "http://dblp.uni-trier.de/xml/dblp.xml"
    dblp {{
      desc article (id = var Paper) {{
        desc cite(idref=var CitedPaper){{}}
      }}
    }}
    OR dblp {{
      desc article (id = var Paper) {{
        desc a(href=concat('#',var CitedPaper)){{}}
      }}
    }}
One might also consider indirect citation, i.e., a transitive closure over the citing-paper predicate.
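Indirect citation can be illustrated with a small Python sketch that computes the transitive closure of a citing-paper relation by naive fixpoint iteration. The relation is assumed to be given as a set of (citing, cited) pairs; the paper identifiers and the function name transitive_closure are hypothetical.

```python
def transitive_closure(pairs):
    """Return all (citing, cited) pairs reachable via chains of citations."""
    closure = set(pairs)
    while True:
        # Join the relation with itself: a cites b and b cites d gives (a, d).
        derived = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        new_pairs = derived - closure
        if not new_pairs:  # fixpoint reached, nothing more to derive
            return closure
        closure |= new_pairs


# Hypothetical direct citations: p3 cites p2, and p2 cites p1.
direct = {("p3", "p2"), ("p2", "p1")}
indirect = transitive_closure(direct)
print(("p3", "p1") in indirect)
```

The loop terminates because the set of possible pairs over a finite set of papers is finite and the closure only grows. A rule language would typically express the same idea with a recursive rule over citing-paper.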
8.1.4. In which areas have papers been published at my last conference?
I want to write the call for papers for the third iteration of a conference. I would like the list of solicited topics to reflect the topics of papers published at the conference in previous years.
area-of-conference[var ConferenceName, var Area] :-
  paper-at-conference[var Paper, var ConferenceName]
  AND area-of-paper[var Paper, var Area]

paper-at-conference[var Paper, var ConferenceName] :-
  in "http://dblp.uni-trier.de/xml/dblp.xml"
    dblp {{
      desc article (id = var Paper) {{
        in (idref=var ConferenceID){{ }}
      }},
      desc conference(id=var ConferenceID){{
        title{{ var ConferenceName }}
      }}
    }}
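The way the two rules compose can be sketched in Python as a join between two relations. The sketch assumes a hypothetical sample document that follows the element and attribute names used in the rule (article, in, idref, conference, title), and it assumes that area-of-paper is already available as a set of (paper, area) pairs. None of the sample values come from DBLP.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped after the pattern in the rule above.
SAMPLE = """
<dblp>
  <conference id="c1"><title>ExampleConf</title></conference>
  <article id="p1"><in idref="c1"/></article>
  <article id="p2"><in idref="c1"/></article>
  <article id="p3"/>
</dblp>
"""

# Assumed result of the area-of-paper view, as (paper, area) pairs.
AREA_OF_PAPER = {("p1", "Databases"), ("p2", "Web"), ("p3", "Logic")}


def paper_at_conference(root):
    """(paper id, conference name) pairs, joined on the conference id."""
    names = {
        conf.get("id"): conf.findtext("title")
        for conf in root.iter("conference")
    }
    return {
        (article.get("id"), names[ref.get("idref")])
        for article in root.iter("article")
        for ref in article.findall("in")  # direct children, as in the rule
        if ref.get("idref") in names
    }


def area_of_conference(root, area_of_paper):
    """(conference name, area) pairs, joined on the paper id."""
    return {
        (conference, area)
        for (paper, conference) in paper_at_conference(root)
        for (other_paper, area) in area_of_paper
        if other_paper == paper
    }


root = ET.fromstring(SAMPLE)
print(sorted(area_of_conference(root, AREA_OF_PAPER)))
```

Paper p3 contributes no area here because it has no reference to a conference, so it never satisfies the paper-at-conference view.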
9. Commentary
Combining a taxonomy of research fields, used as metadata, with the XML data of DBLP provides a foundation for applications such as community-based classification and analysis of bibliographic information using interrelations between researchers and research fields.