# Semantics for the Rest of Us (Keynote)

Sandro Hawke (sandro@w3.org), W3C / MIT
ISWC, 26 October 2009, SemRUS Workshop
http://www.w3.org/2009/Talks/1026-semrus/

# Outline

Four Parts:

1. What do you mean, "Semantics"?
• Practical, intuitive definition
2. Why do you want (Standard) Semantics?
• Three general use cases
3. Non-Standard Semantics
• Subsets and supersets
• Staying safe
• Key design principle
4. Ideas for the Future
• Web + Inference

Time for Discussion

# Semantic Web

We did "Semantic Web"

Let's revisit "Semantic"

# Semantic Web

• We communicate via RDF Graphs, published on the Web
• We use a common vocabulary of IRI terms; some terms will be recognized by a given program, others won't be.
• What clever things can we make computers do?
• Let's observe a few things about graphs.... (very simple)

# (In)Consistency

Some graphs could never be true...

• "I was born in both 1965 and in 1975"
• ```:y foaf:birthday "07-10".
:y foaf:birthday "07-11".
foaf:birthday rdf:type owl:FunctionalProperty.```
• `:x rdf:type "Hello World".`

• ```:z foaf:name "Sandro Hawke".
:z foaf:firstName "John".```
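
A minimal sketch of how a program might spot the `foaf:birthday` clash above. It assumes triples are plain Python tuples and that the values of a functional property are literals (two different IRIs would instead be inferred to denote the same thing); the function name is made up for illustration.

```python
from collections import defaultdict

# Triples as (subject, predicate, object) tuples; prefixed names kept as plain strings.
graph = {
    (":y", "foaf:birthday", "07-10"),
    (":y", "foaf:birthday", "07-11"),
    ("foaf:birthday", "rdf:type", "owl:FunctionalProperty"),
}

def functional_property_clashes(triples):
    """Return (subject, property, values) wherever a functional property has >1 literal value."""
    functional = {s for s, p, o in triples
                  if p == "rdf:type" and o == "owl:FunctionalProperty"}
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    return [(s, p, sorted(vs)) for (s, p), vs in values.items() if len(vs) > 1]

clashes = functional_property_clashes(graph)
print(clashes)  # one clash: :y has two birthdays
```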

# Equivalence

• Some graphs say the same thing:

| Graph A | Graph B |
|---|---|
| `:Aubrey :age "7"^^xs:int` | `:Aubrey :age "07"^^xs:int` |

• It does not matter how old Aubrey is; they still say the same thing.
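
One way to picture this, as a rough sketch: compare graphs after mapping each typed literal to its value. The sketch assumes every object is a `(lexical form, datatype)` pair and only knows about `xs:int`; the helper names are illustrative.

```python
def canonical(triple):
    """Map an xs:int literal to its integer value; leave other literals untouched."""
    s, p, (lexical, datatype) = triple
    if datatype == "xs:int":
        return (s, p, (int(lexical), datatype))
    return triple

def equivalent(g1, g2):
    """Graphs are equivalent (in this sketch) if they match after value normalization."""
    return {canonical(t) for t in g1} == {canonical(t) for t in g2}

graph_a = {(":Aubrey", ":age", ("7", "xs:int"))}
graph_b = {(":Aubrey", ":age", ("07", "xs:int"))}

print(graph_a == graph_b)            # False: the syntax differs
print(equivalent(graph_a, graph_b))  # True: the meaning is the same
```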

# Entailment

Sometimes "logic" tells us whenever A is true, B must also be:

| Graph A (Given) | Graph B (Entailed) |
|---|---|
| Everything which has an "age" is a Person. The "age" of Aubrey is 7. | Aubrey is a Person. |
| `:age rdfs:domain :Person.` `:Aubrey :age "7"^^xs:int` | `:Aubrey rdf:type :Person` |

Given Graph A, we can answer:

• Is Aubrey a Person?
• What is Aubrey?
• What things are of type Person?
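
The three questions can be answered with a few lines of code. This is a sketch of the RDFS domain rule (rdfs2 in the RDF Semantics spec) over tuple triples, not a full reasoner; the function name is an assumption.

```python
graph_a = {
    (":age", "rdfs:domain", ":Person"),
    (":Aubrey", ":age", '"7"^^xs:int'),
}

def apply_domain_rule(triples):
    """rdfs2: if (p rdfs:domain c) and (s p o), then (s rdf:type c)."""
    domains = [(s, o) for s, p, o in triples if p == "rdfs:domain"]
    inferred = {(s, "rdf:type", c)
                for prop, c in domains
                for s, p, o in triples if p == prop}
    return triples | inferred

closure = apply_domain_rule(graph_a)

# Is Aubrey a Person?
is_person = (":Aubrey", "rdf:type", ":Person") in closure
# What is Aubrey?
aubrey_types = {o for s, p, o in closure if s == ":Aubrey" and p == "rdf:type"}
# What things are of type Person?
persons = {s for s, p, o in closure if p == "rdf:type" and o == ":Person"}
```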

So what are these terms like rdfs:domain?

# RDF Logic Languages

Same Syntax (RDF), Different Semantics

Thus these questions are equivalent:

1. Which RDF logic language are you using?
2. Which semantics are you using?
3. Which entailment regime are you using?
4. Which inference rules are you using?
5. What must a conformant reasoner do?

# Reasoner = Data Source

A reasoner takes what you know, and tells you more

Entirely predictable, unchanging

All correct reasoners for the same semantics give the same answers

Intuition: using a semantics = using another data source, with just another URI

... except it's rules.

... or it's just triples but:

• It's Telepathic (it knows what you know)
• It never has an original thought (it knows ONLY what you know)
• It often tells you the obvious.

# Remind you of anyone?

Goofy Example. See what wisdom the counselor offers?

# Part 2: Why Use a (Standard) Reasoner?

What Reasoners Do:

1. Check Consistency
2. Check Entailments, Query Answering with Entailment

In other words:

1. Find inconsistencies (mistakes and disagreements)
2. Find entailments (more "true" facts)

They give us more useful knowledge! (Yay!)

# Usage Scenarios

• Private Use
• Single Source
• Multi Source

• When do we need a standard language?
• What else might be standardized?

# Private Use

One person / organization

Apple, managing their iPod product line...

```apple:iPodNanoGen1 rdfs:subClassOf pdx:MP3Player.
apple:iPodNanoGen1 rdfs:subClassOf [
rdf:type owl:Restriction;
owl:onProperty apple:memorySizeMeg;
owl:hasValue "1024";
].```

... but privately.

They decide, internally, which reasoner(s) to use

Standard language promotes healthy market

# Single Source

Alice published a graph:

```:Alice :selling :item22.
:item22 rdf:type apple:iPodNanoGen1.```

Apple (vocabulary provider) says:

`apple:iPodNanoGen1 rdfs:subClassOf pdx:MP3Player.`

This entails:

`:item22 rdf:type pdx:MP3Player.`
• Charlie is looking for an MP3 Player!
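
A sketch of that entailment as code, assuming tuple triples and a single pass of the RDFS subclass rule (rdfs9). Here it is run on Alice's side before publishing; the function name is hypothetical.

```python
alice = {
    (":Alice", ":selling", ":item22"),
    (":item22", "rdf:type", "apple:iPodNanoGen1"),
}
apple = {("apple:iPodNanoGen1", "rdfs:subClassOf", "pdx:MP3Player")}

def inherit_types(data, vocabulary):
    """rdfs9: if (c rdfs:subClassOf d) and (x rdf:type c), then (x rdf:type d)."""
    subclass_of = {(s, o) for s, p, o in vocabulary if p == "rdfs:subClassOf"}
    return {(x, "rdf:type", sup)
            for x, p, c in data if p == "rdf:type"
            for sub, sup in subclass_of if sub == c}

# Alice publishes her own triples plus the entailed ones: a bigger data set.
published = alice | inherit_types(alice, apple)
```

Charlie could run the same function himself on Alice's original graph, provided both agree on what `rdfs:subClassOf` means.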

# Who does the Reasoning?

• If Alice does it, no standard needed
• Bigger data set -
• More work -
• If Charlie does it, requires standard
• More Products ++
• More work -
• Unpredictable work -
• ONE of them has to do the reasoning
• Maybe negotiate?

# Multi Source

• Actually, Charlie is looking for an MP3 Player that can run RockBox.
• Bob runs a RockBox website....
```apple:iPodNanoGen1 rdfs:subClassOf rockbox:SupportedDevice
```
• Alice+Bob Entails:
```?x :selling [ a rockbox:SupportedDevice ]
```
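
Roughly what the merge step might look like in code. This sketch assumes tuple triples, merges the two graphs by set union, and applies only the RDFS subclass rules (rdfs9 and rdfs11) to a fixpoint; the function names are illustrative.

```python
alice = {
    (":Alice", ":selling", ":item22"),
    (":item22", "rdf:type", "apple:iPodNanoGen1"),
}
bob = {("apple:iPodNanoGen1", "rdfs:subClassOf", "rockbox:SupportedDevice")}

def rdfs_subclass_closure(triples):
    """Apply rdfs9 (type inheritance) and rdfs11 (subClassOf transitivity) until nothing new appears."""
    closure = set(triples)
    while True:
        subclass_of = [(s, o) for s, p, o in closure if p == "rdfs:subClassOf"]
        new = set()
        for c, d in subclass_of:
            for s, p, o in closure:
                if p == "rdf:type" and o == c:
                    new.add((s, "rdf:type", d))
                if p == "rdfs:subClassOf" and s == d:
                    new.add((c, "rdfs:subClassOf", o))
        if new <= closure:
            return closure
        closure |= new

def sellers_of(triples, cls):
    """Answer the pattern: ?x :selling [ a cls ]."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == cls}
    return {s for s, p, o in triples if p == ":selling" and o in typed}

merged = rdfs_subclass_closure(alice | bob)
print(sellers_of(merged, "rockbox:SupportedDevice"))
```

Neither graph answers the query alone; only the merged graph does.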

# Who does the Reasoning?

Alice doesn't know (or care) about Bob

Bob doesn't know (or care) about Alice

So, it has to be Charlie

... and they need a standard (RDFS here)

# Part 3: Non-Standard Reasoning

• Did they really need RDFS?
• What else might they need?
• What happens if Charlie uses a non-standard reasoner?

# Why Deviate

• Easier implementation
• Faster reasoning
• No elegant solution using standard
• Local conventions (firstname is givenName)

# Subsets

Why?

• Easier, Smaller, Faster, Simpler
• Charlie can use some fast bit of code that does only subClassOf

What?

• Fewer entailments and inconsistencies found
• Charlie will miss some entailments, won't notice some inconsistencies

Result:

• Charlie might miss Alice's item
• But his faster reasoner might find others!

It's like deciding whether to consider other search results
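
A sketch of the trade-off, under invented data: the seller `:Erin`, the item `:item7`, and the `rdfs:domain` statement on `apple:memorySizeMeg` are all hypothetical, added only so the two reasoners give different answers.

```python
data = {
    (":Alice", ":selling", ":item22"),
    (":item22", "apple:memorySizeMeg", "1024"),
    # Hypothetical vocabulary statement, not from the talk:
    ("apple:memorySizeMeg", "rdfs:domain", "pdx:MP3Player"),
    # Hypothetical second seller:
    (":Erin", ":selling", ":item7"),
    (":item7", "rdf:type", "apple:iPodNanoGen1"),
    ("apple:iPodNanoGen1", "rdfs:subClassOf", "pdx:MP3Player"),
}

def subclass_only(triples):
    """Subset reasoner: one pass of the subclass rule (rdfs9) and nothing else."""
    subclass_of = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    return triples | {(x, "rdf:type", sup)
                      for x, p, c in triples if p == "rdf:type"
                      for sub, sup in subclass_of if sub == c}

def subclass_and_domain(triples):
    """Fuller reasoner: the domain rule (rdfs2) first, then the subclass rule."""
    domains = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
    typed = triples | {(x, "rdf:type", c)
                       for x, p, _ in triples
                       for prop, c in domains if p == prop}
    return subclass_only(typed)

def mp3_players(closure):
    return {s for s, p, o in closure if p == "rdf:type" and o == "pdx:MP3Player"}

fast = mp3_players(subclass_only(data))        # misses Alice's item
full = mp3_players(subclass_and_domain(data))  # finds both items
```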

# When to Subset

Okay if data is already incomplete

Good if resources better used elsewhere

Not okay to present as complete

Not okay if negated

• If no one is selling an MP3Player, then...

# Supersets

What:

• More entailments and more graphs defined to be inconsistent

Only if it's right.

Beware unwarranted assumptions

# Superset Examples

Some possible supersets:

• Items described in Apple namespace are inferred to be of type apple:Product
• Because apple:memorySize has range apple:iPod, Alice's data is considered inconsistent for not providing a memorySize
• Because Alice lists between 100 and 10,000 items for sale, Alice is classified as MidSizeVendor
• Because Alice is selling an iPod, and she probably won't remember to erase her music from it, and copying music is illegal, Alice is classified as a Criminal.

So when are superset semantics okay?

# Safe Supersets

Did the source say that?

Did the source mean to imply that?

Or was it just something you inferred, on your own?

Safety is in how you present results:

• Alice is a Criminal.
• According to the Evil-Music-Consortium definitions, Alice is a Criminal.
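
One possible way to keep that distinction in code: carry provenance with every result. The `Finding` type and `present` function are assumptions made for this sketch.

```python
from typing import NamedTuple, Optional

class Finding(NamedTuple):
    triple: tuple
    stated_by: Optional[str]       # source that asserted the triple, if any
    inferred_under: Optional[str]  # semantics used to infer it, if any

def present(finding):
    """Render a result so the reader can tell who said it or which semantics produced it."""
    s, p, o = finding.triple
    claim = f"{s} {p} {o}."
    if finding.stated_by:
        return f"{finding.stated_by} states: {claim}"
    return f"According to the {finding.inferred_under} definitions: {claim}"

stated = Finding((":Alice", ":selling", ":item22"), "Alice", None)
inferred = Finding((":Alice", "rdf:type", ":Criminal"), None, "Evil-Music-Consortium")

print(present(stated))
print(present(inferred))
```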

# Sorry, I didn't mean to imply you were a criminal...

The hard part: what was or was not implied?

| Does this... | Imply this? |
|---|---|
| `:x rdf:type :A.` `:A rdfs:subClassOf :B` | `:x rdf:type :B.` |

| Does this... | Imply this? |
|---|---|
| `:z foaf:name "Sandro Hawke".` | `:z foaf:firstName "Sandro".` |

# If the shoe fits

PROPOSED: If you use the URI (in the right graph pattern), you're endorsing use of the published semantics.

Use "extensible semantics"; never make that ambiguous or contradictory.

• Any additional semantics must be triggered by use, by the author, of additional syntax

Example: OWL 2

• If DL graph pattern, then use either semantics (they're the same)
• If not DL graph, then use RDF-Based Semantics

Okay for W3C Recommendations, but what about for all semantics with a web page?

# Part 4: Some Ideas for Standards

class and property dereferencing

importing RIF

reasoner negotiation

# PCID

Property and Class Identifier Dereferencing (PCID)

• Recursively dereference every RDF class and property IRI, and merge the results.
• More limited than dereferencing all term IRIs; typically small
• Looks like another entailment regime to me
• TimBL's been suggesting this for years
• I think it'll work well....
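
A rough sketch of PCID. It assumes an in-memory dictionary standing in for the Web, and a simplified notion of "class" (objects of `rdf:type`, both ends of `rdfs:subClassOf`); `pdx:Device` and all function names are made up.

```python
# Hypothetical in-memory stand-in for the Web: IRI -> triples served there.
WEB = {
    "apple:iPodNanoGen1": {("apple:iPodNanoGen1", "rdfs:subClassOf", "pdx:MP3Player")},
    "pdx:MP3Player": {("pdx:MP3Player", "rdfs:subClassOf", "pdx:Device")},  # pdx:Device is invented
}
fetched = []  # records which IRIs were dereferenced

def fetch(iri):
    fetched.append(iri)
    return WEB.get(iri, set())

def class_and_property_terms(triples):
    """Collect property IRIs (all predicates) and class IRIs (simplified, see lead-in)."""
    terms = set()
    for s, p, o in triples:
        terms.add(p)
        if p == "rdf:type":
            terms.add(o)
        elif p == "rdfs:subClassOf":
            terms.update((s, o))
    return terms

def pcid(graph, fetch):
    """Dereference every class and property IRI, merge, and repeat until nothing new turns up."""
    merged, seen = set(graph), set()
    pending = class_and_property_terms(merged)
    while pending:
        term = pending.pop()
        seen.add(term)
        merged |= fetch(term)
        pending = class_and_property_terms(merged) - seen
    return merged

alice = {
    (":Alice", ":selling", ":item22"),
    (":item22", "rdf:type", "apple:iPodNanoGen1"),
}
result = pcid(alice, fetch)
```

Instance IRIs such as `:item22` are never dereferenced, which is what keeps the crawl small.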

# RIF

RIF is meant to be used with RDF, but how do we import it from RDF?

Use owl:imports? Use PCID?

Additional RIF dialects, FOPL, LP, ....

# Reasoner Negotiation

Charlie tells Alice the entailment regimes he'll be using, so she can skip doing those.

```Skip-Inference: OWL-DIRECT
```

Alice tells Charlie which entailment regimes she has (completely) used, so he can skip doing those

```Did-Inference: OWL-RL
```

... or put it in the graph, or in a metadata graph.
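
How each side might act on those headers, as a sketch. `Skip-Inference` and `Did-Inference` are the proposed headers from this slide, not existing HTTP headers; the comma-separated value format, the `RDFS` token, and the function names are assumptions. Regime names are matched exactly, with no attempt to work out that one regime covers another.

```python
def parse_regimes(header_value):
    """Split a comma-separated header value into a set of regime names."""
    return {name.strip() for name in header_value.split(",") if name.strip()}

def alice_should_run(alice_supports, request_headers):
    """Server side: skip any regime the client says it will run itself."""
    skip = parse_regimes(request_headers.get("Skip-Inference", ""))
    return [r for r in alice_supports if r not in skip]

def charlie_should_run(charlie_plans, response_headers):
    """Client side: skip any regime the server says it has completely applied."""
    done = parse_regimes(response_headers.get("Did-Inference", ""))
    return [r for r in charlie_plans if r not in done]

print(alice_should_run(["RDFS", "OWL-DIRECT"], {"Skip-Inference": "OWL-DIRECT"}))
print(charlie_should_run(["OWL-RL", "OWL-DIRECT"], {"Did-Inference": "OWL-RL"}))
```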

• Dave wants a lessThan predicate:
• { _:x owl:sameAs 7. _:x dave:lessThan 3. } is inconsistent, in Dave's semantics
• At the dave:lessThan IRI, Dave puts a RIF Core ruleset which "implements" his semantics
• That ruleset, combined with a graph, is inconsistent when the lessThan relation is violated, and entails all the graphs for which it holds.
• Reasoners doing PCID+RIF_Core will now automatically implement dave:lessThan

• "That's just using RIF Core as a programming language", you say.
• Okay, then: let's also do it for Java bytecode and JavaScript
• At dave:lessThan would be links to the appropriate code
• A small number of RDF reasoner plugin APIs would be needed
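
A toy version of such a plugin API, to make the idea concrete. Everything here is hypothetical: the `PLUGINS` registry, the check signature, and the shortcut of resolving `owl:sameAs` only to plain numbers. A real reasoner would fetch the code linked from the `dave:lessThan` IRI rather than hard-code it.

```python
# Hypothetical registry: predicate IRI -> check that must hold for the triple to be consistent.
PLUGINS = {"dave:lessThan": lambda a, b: a < b}

def inconsistent(triples, plugins):
    """Return True if any triple violates the check registered for its predicate."""
    same_as = {s: o for s, p, o in triples if p == "owl:sameAs"}  # node -> number

    def value(term):
        return same_as.get(term, term)

    for s, p, o in triples:
        check = plugins.get(p)
        if check is not None and not check(value(s), value(o)):
            return True
    return False

bad = {("_:x", "owl:sameAs", 7), ("_:x", "dave:lessThan", 3)}
good = {("_:x", "owl:sameAs", 2), ("_:x", "dave:lessThan", 3)}

print(inconsistent(bad, PLUGINS))   # 7 < 3 fails
print(inconsistent(good, PLUGINS))  # 2 < 3 holds
```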

Data Source Providers SHOULD include all useful entailments

Data Source Providers MUST publish only consistent data

Data Consumers SHOULD check for consistency across all included data sources.

Data Consumers SHOULD compute all entailments they'll find useful.

Reasoners MUST report the semantics they are using (including whether algorithm is incomplete/unsound)

Entailment Regimes MUST use Extensible Semantics.

# Summary

1. What do you mean, "Semantics"?
• A semantics (or a logic, or an ER) is a spec for a set of equivalent reasoners
• Intuitively: a (published) collection of inference rules
2. Why do you want (Standard) Semantics?
• Private Use: Better tools for ontologists
• Single Source: Save on bandwidth
• Multi Source: Emergent, synthesized knowledge
3. Non-Standard Semantics
• Subsets are okay if you're incomplete anyway
• Use supersets at your own risk
• Alice States, Alice Implies, Charlie Infers.
• Never define semantics which conflict with someone else's
4. Ideas for the Future
• Promote deployment of PCID