Overview

Part of Semantic Web Tutorial

These slides: http://www.w3.org/2003/Talks/0520-www-tf1-b3-rules/

This section:

  1. Using cwm to work with (RDF) data
  2. Writing and understanding "rules"

Using cwm usually means using rules, so the two are intertwined.

Detailed Overview

We'll talk about using cwm to:

  1. convert between data formats
  2. merge data
  3. "think" (use rules for inference)
  4. "filter" (query and transform)
  5. generate reports
  6. debug (with usage tips)
  7. use more builtins

Running Cwm (reminder)

Arguments are (mostly) processed left-to-right, changing state as we go

Filenames/URI arguments mean "read this and add it to the main information store"

By default, at the end, the contents of the store are output

Converting Data Formats

cwm --rdf foo.xml --n3 > foo.n3
  1. --rdf : change to using RDF/XML as the data format
  2. foo.xml : read the file foo.xml using the current data format and add its statements to the store
  3. --n3 : change to using n3 as the data format
  4. (end) : output contents of the store using the current data format
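
The left-to-right processing can be pictured as a small state machine. The Python sketch below is only an illustration, not cwm's code: run, load, and FORMAT_FLAGS are hypothetical names, only the three format flags shown on these slides are modelled, and N3 as the starting format is an assumption.

```python
# Illustrative model of left-to-right argument handling (not cwm's implementation).
FORMAT_FLAGS = {"--rdf": "rdf", "--n3": "n3", "--ntriples": "ntriples"}

def run(args, load):
    """Walk args in order. `load(name, fmt)` is a hypothetical parser callback
    that returns an iterable of triples."""
    fmt = "n3"       # assumed starting format
    store = set()    # the main information store
    for arg in args:
        if arg in FORMAT_FLAGS:
            fmt = FORMAT_FLAGS[arg]        # a flag only changes state
        else:
            store |= set(load(arg, fmt))   # a filename/URI is read in the current format
    return fmt, store   # at the end, the store is output in the current format

if __name__ == "__main__":
    def fake_load(name, fmt):
        # Stand-in parser: records which format each file was read in.
        return [(name, "ex:parsedAs", fmt)]
    print(run(["--rdf", "foo.xml", "--n3"], fake_load))
```

The point of the model: a format flag affects every file that follows it, and the last flag seen decides the output format.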

Converting and Merging

Merge different files, in different formats, output result in N-Triples:

cwm foo.n3 --rdf bar.xml bar2.rdf \
     --n3 baz.n3 --ntriples > bux.nt

Merging Nodes

cwm blue.n3 red.n3 green.n3 > white.n3

#blue arcs
:y m:homePage :z.
:x m:attending :y;
   p:GivenName :name;
   p:hasEmail :email.

[Figure: circles-and-arrows diagram of the blue arcs]

Red and Green arcs

#green arcs
:z p3p:policy :pp.

#red
:y m:chair :x;
   m:Location [ g:zip []; g:lat []; g:long [] ].

[Figure: the same diagram with the red and green arcs added]
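
Merging can be thought of as taking the union of the triples: a node that has the same name in two files becomes one node with arcs from both. A minimal Python sketch, assuming triples are plain tuples of strings (arcs_touching is a hypothetical helper, and the blank-node m:Location structure from the red file is left out for brevity):

```python
# Each graph is a set of (subject, predicate, object) triples.
blue = {
    (":y", "m:homePage", ":z"),
    (":x", "m:attending", ":y"),
    (":x", "p:GivenName", ":name"),
    (":x", "p:hasEmail", ":email"),
}
green = {(":z", "p3p:policy", ":pp")}
red = {(":y", "m:chair", ":x")}   # m:Location blank nodes omitted in this sketch

# Merging = set union; identical names denote the same node.
white = blue | red | green

def arcs_touching(graph, node):
    """All triples in which `node` appears as subject or object."""
    return {t for t in graph if node in (t[0], t[2])}

if __name__ == "__main__":
    # :y now carries arcs that came from two different files.
    for triple in sorted(arcs_touching(white, ":y")):
        print(triple)
```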

Generalizations

Sometimes our data has patterns and regularities:

:Mary :son :Frank, a :Male;
      :son :Bob, a :Male;
      :son :Sam, a :Male.

Can we just tell the computer that every son is male?

We could use something standard, or application code.

No standards yet; OWL is one approach, rules are another

A Simple Rule

{ ?x :son ?y } => { ?y a :Male }.

# simpler data
:Mary :son :Frank, :Bob, :Sam.

Given --think, cwm will treat these the same. It will infer that :Frank, :Bob, and :Sam are :Male.

Running --think

cwm mary-short.n3 son-rules.n3 --think

:Mary :son :Frank, a :Male;
      :son :Bob, a :Male;
      :son :Sam, a :Male.
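
What --think does with this rule can be approximated as forward chaining: keep applying the rules until nothing new appears. The Python below is a sketch of that idea only, not of cwm's internals; think and son_rule are hypothetical names, and the rule is hand-coded rather than parsed from N3.

```python
def think(facts, rules):
    """Apply every rule repeatedly until no new triples are produced."""
    facts = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts) - facts
        if not new:
            return facts
        facts |= new

def son_rule(facts):
    # { ?x :son ?y } => { ?y a :Male }.
    return {(y, "a", ":Male") for (x, p, y) in facts if p == ":son"}

mary = {
    (":Mary", ":son", ":Frank"),
    (":Mary", ":son", ":Bob"),
    (":Mary", ":son", ":Sam"),
}
result = think(mary, [son_rule])

if __name__ == "__main__":
    for triple in sorted(result):
        print(triple)
```

The result holds the three original :son triples plus one inferred :Male triple per son.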

Rules Are Just Statements

#   subject        verb        object
#=============  ==========    ==============
{ ?x :son ?y }      =>        { ?y a :Male }.
{ ?x :son ?y }  log:implies   { ?y a :Male }.

The terms in braces { } are formulas.

The rule statement relates two formulas.

Publishing Rules

Rules are statements of fact, and can be published just like other RDF data, but...

This area is not yet standardized.

  1. RDF/XML has nothing like { ... }
  2. there is no wide consensus on log:implies or something like it

Cwm uses N3 or a non-standard extended parsetype for RDF/XML.

More Complex Antecedent

{ ?x :son ?y.
  ?y.:age math:lessThan 15. }
 =>
{ ?y a :Boy. }.

More Complex Consequent

{ ?x :son ?y } 
  => 
{ ?y a :Male.  
  ?y :parent ?x. 
  ?x a :Parent. }.

Still More Complex Consequent

{ ?x a :Mammal. }
  => 
{ ?x :parent [ a :Mammal ]. }.

And more...

{ ?x a :Mule. }
  => 
{ ?x :parent [ a :Horse, :Female ],
             [ a :Donkey, :Male ] }.

Still just a basic statement of fact.

It's also Turing Complete.

Action Rules

This is a normal rule declaration:

{ ?temp math:greaterThan 75 }
=>
{ :coolingSystem :state :on }.

...but it can be used by a system (agent) which only looks for the cooling system state, and uses that to control the device.

That's not part of cwm. Do it in something which calls cwm.

Transformation Rules

This could be a normal rule declaration:

{ ?x :son ?y }
  => 
{ ?x :child ?y.
  ?y a :Male. }.

But let's use it as a transformation rule, with cwm's --filter option.

Using --filter

:Mary :son :Frank, :Bob, :Sam.

cwm mary.n3 --filter=trans.n3 gives us:

    :Bob     a :Male .
    :Frank   a :Male .
    :Mary    :child :Bob,
                :Frank,
                :Sam .
    :Sam     a :Male .
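
The difference from --think is that only the conclusions are kept; the input triples are not carried over. A rough Python sketch of that contrast, assuming tuples for triples and a hand-coded rule (transform is a hypothetical name, and one pass is enough here only because the rule's output never matches its own antecedent):

```python
def transform(facts):
    # { ?x :son ?y } => { ?x :child ?y. ?y a :Male. }.
    out = set()
    for (x, p, y) in facts:
        if p == ":son":
            out.add((x, ":child", y))
            out.add((y, "a", ":Male"))
    return out

mary = {
    (":Mary", ":son", ":Frank"),
    (":Mary", ":son", ":Bob"),
    (":Mary", ":son", ":Sam"),
}

filtered = transform(mary)          # like --filter: conclusions only
thought = mary | transform(mary)    # like --think: conclusions added to the input

if __name__ == "__main__":
    print(len(filtered), len(thought))
```

In filtered there are no :son triples left, which is what makes the rule act as a transformation.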

Variables

Universal ("for all ...") and Existential ("there exists ...")

We've been using a restricted shorthand for each.

log:forAll

{ ?x :son ?y } => { ?y a :Male }.

could be written

this log:forAll :x, :y.
{ :x :son :y } => { :y a :Male }.

? is just shorthand for "this log:forAll" in the parent formula

Sometimes you want explicit scopes.

log:forSome

{ ?x a :Mammal. }
  => 
{ ?x :parent [ a :Mammal ]. }.

could be written

{ ?x a :Mammal. }
  => 
{ this log:forSome :foo.
  ?x :parent :foo.
  :foo a :Mammal. }.

_:foo

[ a :Mammal ].

could be written

this log:forSome :foo.
:foo a :Mammal.

But this is not the rule we had:

{ ?x a :Mammal. }
  => 
{ ?x :parent _:foo.
  _:foo a :Mammal. }.

Here, there is only one parent for all things.
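
The two readings can be compared in a small Python sketch. It is an illustration under assumptions, not cwm behaviour: fresh_per_match, shared_node, and parents are hypothetical names, blank nodes are modelled as generated strings, and each rule is applied in a single pass (applied to a fixpoint, the per-match version would keep inventing new parents forever).

```python
import itertools

def fresh_per_match(mammals):
    """[ a :Mammal ] inside the consequent: a new anonymous parent per match."""
    counter = itertools.count(1)
    out = set()
    for x in mammals:
        parent = f"_:b{next(counter)}"
        out |= {(x, ":parent", parent), (parent, "a", ":Mammal")}
    return out

def shared_node(mammals):
    """_:foo scoped outside the rule: every match reuses the same node."""
    out = {("_:foo", "a", ":Mammal")} if mammals else set()
    for x in mammals:
        out.add((x, ":parent", "_:foo"))
    return out

def parents(triples):
    """Distinct objects of :parent arcs."""
    return {o for (_, p, o) in triples if p == ":parent"}

animals = [":Rex", ":Tom"]   # made-up example mammals

if __name__ == "__main__":
    print(len(parents(fresh_per_match(animals))))  # one parent per animal
    print(len(parents(shared_node(animals))))      # a single shared parent
```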

Report Generation

What if you want to look at some output other than N3 or RDF/XML?

  1. RDF/XML is XML; use XML tools (XSLT). Use --bySubject to make the RDF/XML more regular (less pretty-printed).
  2. Use RDF/XML tools (?)
  3. Write a program (in python using cwm libraries, or any RDF libraries)
  4. Use --strings

Using log:outputString

How do you "print" from a big pool of triples?

cwm ... --strings ... looks for all triples like

"aaa" log:outputString "Hello, World!"

then sorts them by subject ("aaa") and outputs the object strings ("Hello, World!") in that order.

Sometimes this works well.

Using log:outputString (example)

@prefix names: <foo:> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .
@prefix string: <http://www.w3.org/2000/10/swap/string#>.

{  ?x names:familyName ?k.
   (?x.names:givenName " " ?x.names:familyName " has been invited\n" )
         string:concatenation  ?s.
} => {
   ?k  log:outputString ?s.
}.

[ names:familyName "Hawke"; names:givenName "Sandro" ].
[ names:familyName "Connolly"; names:givenName "Dan" ].
[ names:familyName "Berners-Lee"; names:givenName "Tim" ].

cwm example.n3 --think --strings

Tim Berners-Lee has been invited
Dan Connolly has been invited
Sandro Hawke has been invited
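
The order of the lines above follows from the sort on the subject, which is the family name. A minimal Python sketch of the select, sort, and print steps, assuming triples are tuples of strings (strings is a hypothetical function name; breaking ties on the object string is this sketch's choice, not a documented cwm behaviour):

```python
def strings(triples):
    """Pick log:outputString triples, sort by subject, concatenate the objects."""
    hits = [(s, o) for (s, p, o) in triples if p == "log:outputString"]
    return "".join(o for (s, o) in sorted(hits))

# The triples the rule would produce, plus one unrelated triple that is ignored.
pool = {
    ("Hawke", "log:outputString", "Sandro Hawke has been invited\n"),
    ("Connolly", "log:outputString", "Dan Connolly has been invited\n"),
    ("Berners-Lee", "log:outputString", "Tim Berners-Lee has been invited\n"),
    (":x", ":other", "not printed"),
}

if __name__ == "__main__":
    print(strings(pool), end="")
```

Choosing the subject is therefore how you control output order; the subject itself is never printed.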

Debugging Hints

Check syntax often (perhaps with cwm --no)

If a rule isn't firing, try commenting out triples in the antecedent until it does

Divide complex rules into smaller ones; output intermediate results

Misspellings tend to cause silent errors; try an OWL validator or the cwm DAML/RDFS validator

Run cwm with --chatty=50 (or any value) to get diagnostic output.

Let cwm pretty-print your files and see what it thinks you said.

Naming Conventions

Using .n3 is nice, but...

Naming parallel files by role lets Makefile pattern rules invoke cwm

E.g.: .sen for sensor data, .ana for analyzed sensor data.

--with

All arguments after --with are passed in for rule use

@prefix log: <http://www.w3.org/2000/10/swap/log#> .
@prefix string: <http://www.w3.org/2000/10/swap/string#>.
@prefix os: <http://www.w3.org/2000/10/swap/os#>.

{ "2" os:argv ?x.
} => {
   ""  log:outputString ?x.
}.

cwm os-rule.n3 --think --strings --with foo bar baz outputs bar (with no final newline).
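
The lookup can be modelled in a few lines of Python. This is a sketch under assumptions: os_argv is a hypothetical name, the 1-based indexing is inferred only from the example above ("2" gives bar), and returning None for anything out of range is this sketch's choice.

```python
def os_argv(command_line, index):
    """Return the index-th argument after --with (1-based), or None if absent."""
    if "--with" not in command_line:
        return None
    rest = command_line[command_line.index("--with") + 1:]
    i = int(index)   # the rule passes the index as the string "2"
    return rest[i - 1] if 1 <= i <= len(rest) else None

if __name__ == "__main__":
    cmd = "cwm os-rule.n3 --think --strings --with foo bar baz".split()
    print(os_argv(cmd, "2"))
```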

Reconsidering Built-Ins

So what are os:argv, string:concatenation, ...?

They are predicates (properties) which cwm recognizes and handles specially. They are "built in" to cwm.

In rules terminology they are procedural attachments for sensing. They provide input to rules.

Mostly just Tim's designs, with some user feedback.

They only work in the antecedent of a rule. Cwm can figure out if they are true, and sometimes bind a variable in the process.

With functions, cwm can figure out the object (like os:argv)

With inverse functions, cwm can figure out the subject (@@ example?)

For others (math:lessThan) cwm simply figures out if it's true.

Some (math:product) just provide better (reasonable) performance. Others (os:argv) provide added power to the language.

Math Example

{   (?x.:tempInF "32") math:difference ?a.
    (?a "0.5555") math:product ?c.
} => {
     ?x  :tempInC ?c.
}.

Functions of more than one parameter take a list as their subject.
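
The same arithmetic, step by step, in Python. temp_in_c is a hypothetical name; the constant 0.5555 is taken from the rule and is a truncated 5/9, so results are slightly low.

```python
def temp_in_c(temp_in_f):
    """Mirror the two built-in calls in the rule's antecedent."""
    a = temp_in_f - 32   # (tempInF 32) math:difference ?a
    c = a * 0.5555       # (?a 0.5555) math:product ?c
    return c

if __name__ == "__main__":
    # Boiling point: comes out just under 100 because 0.5555 < 5/9.
    print(temp_in_c(212))
```

Each built-in takes its inputs as a list in subject position and binds its result to the object, which is why the conversion needs two antecedent lines.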

Lots more builtins

Built-ins

We'll talk about web ones next, then give full examples.