Outline
Four Parts:
- What do you mean, "Semantics"?
  - Practical, intuitive definition
- Why do you want (Standard) Semantics?
- Non-Standard Semantics
  - Subsets and supersets
  - Staying safe
  - Key design principle
- Ideas for the Future
Time for Discussion
Semantic
Web
We did "Semantic Web"
Linked Data
Let's revisit "Semantic"
Semantic
Web
- We communicate via RDF Graphs, published on the Web
- We use a common vocabulary of IRI terms; some terms will be recognized by a given program, others won't be.
- What clever things can we make computers do?
- Let's observe a few things about graphs.... (very simple)
(In)Consistency
Some graphs could never be true...
What about this:
Equivalence
- Some graphs say the same thing:
Graph A | Graph B
:Aubrey :age "7"^^xs:int | :Aubrey :age "07"^^xs:int
- It does not matter how old Aubrey is; they still say the same thing.
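A minimal sketch of why the two graphs are equivalent, assuming typed literals are modeled as hypothetical (lexical form, datatype) pairs and only xs:int is handled. Comparing the values rather than the strings makes "7" and "07" the same literal:

```python
def normalize(triple):
    """Replace an xs:int lexical form by its integer value; leave other terms alone."""
    s, p, o = triple
    if isinstance(o, tuple) and o[1] == "xs:int":
        lexical, datatype = o
        return (s, p, (int(lexical, 10), datatype))  # "07" and "7" both map to 7
    return triple


graph_a = {(":Aubrey", ":age", ("7", "xs:int"))}
graph_b = {(":Aubrey", ":age", ("07", "xs:int"))}

# The graphs differ as strings but agree once literals are compared by value.
same_syntax = graph_a == graph_b
same_meaning = {normalize(t) for t in graph_a} == {normalize(t) for t in graph_b}
print(same_syntax, same_meaning)
```

A program that compares only the lexical forms would treat these as two different statements about Aubrey.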
Entailment
Sometimes "logic" tells us whenever A is true, B must also be:
Graph A (Given) | Graph B (Entailed)
Everything which has an "age" is a Person. The "age" of Aubrey is 7. | Aubrey is a Person.
:age rdfs:domain :Person. :Aubrey :age "7"^^xs:int | :Aubrey rdf:type :Person
Given Graph A, we can answer:
- Is Aubrey a Person?
- What is Aubrey?
- What things are of type Person?
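The entailment above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a full RDFS reasoner: triples are plain tuples of strings, and only the rdfs:domain rule is applied. The function name `entail_domain` is made up for this example.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def entail_domain(graph):
    """Apply the rdfs:domain rule until no new triples appear."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        domains = [(s, o) for (s, p, o) in inferred if p == RDFS_DOMAIN]
        for prop, cls in domains:
            for (s, p, _o) in list(inferred):
                # s uses the property, so s is an instance of the domain class.
                if p == prop and (s, RDF_TYPE, cls) not in inferred:
                    inferred.add((s, RDF_TYPE, cls))
                    changed = True
    return inferred


graph_a = {
    (":age", RDFS_DOMAIN, ":Person"),
    (":Aubrey", ":age", '"7"^^xs:int'),
}
closure = entail_domain(graph_a)

# The three questions from the slide:
is_person = (":Aubrey", RDF_TYPE, ":Person") in closure
types_of_aubrey = {o for (s, p, o) in closure if s == ":Aubrey" and p == RDF_TYPE}
persons = {s for (s, p, o) in closure if p == RDF_TYPE and o == ":Person"}
print(is_person, types_of_aubrey, persons)
```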
So what are these terms like rdfs:domain?
RDF Logic Languages
Same Syntax (RDF), Different Semantics
Thus these are equivalent:
- Which RDF logic language are you using?
- Which semantics are you using?
- Which entailment regime are you using?
- Which inference rules are you using?
- What must a conformant reasoner do?
Reasoner = Data Source
A reasoner takes what you know, and tells you more
Entirely predictable, unchanging
All correct reasoners for the same semantics give the same answers
Intuition: using a semantics = using another data source, with just another URI
... except it's rules.
... or it's just triples but:
- It's Telepathic (it knows what you know)
- It never has an original thought (it knows ONLY what you know)
- It often tells you the obvious.
Part 2: Why Use a (Standard) Reasoner?
What Reasoners Do:
- Check Consistency
- Check Entailments, Query Answering with Entailment
In other words:
- Find inconsistencies (mistakes and disagreements)
- Find entailments (more "true" facts)
They give us more useful knowledge!
(Yay!)
Usage Scenarios
- Private Use
- Single Source
- Multi Source
Think about:
- When do we need a standard language?
- What else might be standardized?
Private Use
One person / organization
Apple, managing their iPod product line...
apple:iPodNanoGen1 rdfs:subClassOf pdx:MP3Player
apple:iPodNanoGen1 rdfs:subClassOf [
rdf:type owl:Restriction;
owl:onProperty apple:memorySizeMeg;
owl:hasValue "1024";
]
... but privately.
They decide, internally, which reasoner(s) to use
Standard language promotes healthy market
Single Source
Alice published a graph:
:Alice :selling :item22
:item22 rdf:type apple:iPodNanoGen1.
Apple (vocabulary provider) says:
apple:iPodNanoGen1 rdfs:subClassOf pdx:MP3Player
This entails:
:item22 rdf:type pdx:MP3Player.
- Charlie is looking for an MP3 Player!
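A minimal sketch of the single-source scenario, assuming triples are tuples of strings and only the rdfs:subClassOf rule for instances is applied (the helper names are hypothetical). Charlie's query for MP3 players finds nothing in the raw data and finds Alice's item once the entailment is computed:

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def entail_subclass(graph):
    """If x has type C and C is a subclass of D, then x has type D."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for (sub, p1, sup) in list(inferred):
            if p1 != SUBCLASS:
                continue
            for (x, p2, cls) in list(inferred):
                if p2 == RDF_TYPE and cls == sub and (x, RDF_TYPE, sup) not in inferred:
                    inferred.add((x, RDF_TYPE, sup))
                    changed = True
    return inferred


def mp3_players(graph):
    """Charlie's query: everything typed as pdx:MP3Player."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == "pdx:MP3Player"}


alice = {
    (":Alice", ":selling", ":item22"),
    (":item22", RDF_TYPE, "apple:iPodNanoGen1"),
}
apple = {("apple:iPodNanoGen1", SUBCLASS, "pdx:MP3Player")}

merged = alice | apple
before = mp3_players(merged)                   # no reasoning
after = mp3_players(entail_subclass(merged))   # with the subclass rule
print(before, after)
```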
Who does the Reasoning?
- If Alice does it, no standard needed
  - More buyers ++
  - Bigger data set -
  - More work -
- If Charlie does it, requires standard
  - More Products ++
  - More work -
  - Unpredictable work -
- ONE of them has to do the reasoning
- Maybe negotiate?
Who does the Reasoning?
Alice doesn't know (or care) about Bob
Bob doesn't know (or care) about Alice
So, it has to be Charlie
... and they need a standard (RDFS here)
Part 3: Non-Standard Reasoning
- Did they really need RDFS?
- What else might they need?
- What happens if Charlie uses a non-standard reasoner?
Why Deviate
- Easier implementation
- Faster reasoning
- No sufficiently expressive standard (more information!)
- No elegant solution using standard
- Local conventions (firstname is givenName)
Subsets
Why?
- Easier, Smaller, Faster, Simpler
- Charlie can use some fast bit of code that does only subClassOf
What?
- Fewer entailments and inconsistencies found
- Charlie will miss some entailments, won't notice some inconsistencies
Result:
- Charlie might miss Alice's item
- But his faster reasoner might find others!
It's like deciding whether to consider other search results
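A rough sketch of the trade-off, assuming a toy reasoner with just two rules (subClassOf and domain) that can be switched on and off. The second listing, `:item7`, and the property `pdx:bitrate` are hypothetical additions for illustration. Here the subset reasoner still finds Alice's item but misses the one that needs the domain rule:

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"


def closure(graph, use_subclass=True, use_domain=True):
    """Apply the enabled rules until no new triples appear."""
    inferred = set(graph)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if use_subclass and p == SUBCLASS:
                # x rdf:type s  =>  x rdf:type o
                new |= {(x, RDF_TYPE, o) for (x, q, c) in inferred
                        if q == RDF_TYPE and c == s}
            if use_domain and p == DOMAIN:
                # x s y  =>  x rdf:type o
                new |= {(x, RDF_TYPE, o) for (x, q, _y) in inferred if q == s}
        if new <= inferred:
            return inferred
        inferred |= new


def mp3_players(graph):
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == "pdx:MP3Player"}


data = {
    (":item22", RDF_TYPE, "apple:iPodNanoGen1"),
    ("apple:iPodNanoGen1", SUBCLASS, "pdx:MP3Player"),
    # Hypothetical second listing, reachable only through rdfs:domain:
    (":item7", "pdx:bitrate", '"320"'),
    ("pdx:bitrate", DOMAIN, "pdx:MP3Player"),
}

subset_hits = mp3_players(closure(data, use_domain=False))  # subclass-only reasoner
larger_hits = mp3_players(closure(data))                    # both rules
print(subset_hits, larger_hits)
```

Everything the subset reasoner reports is still correct; it just reports less, which is why its results should not be presented as complete.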
When to Subset
Okay if data is already incomplete
Good if resources better used elsewhere
Not okay to present as complete
Not okay if negated
- If no one is selling an MP3Player, then...
Supersets
What:
- More entailments and more graphs defined to be inconsistent
- Charlie will get more information, catch more "errors"
More information is good, right?
Only if it's right.
Beware unwarranted assumptions
Superset Examples
Some possible supersets:
- Items described in Apple namespace are inferred to be of type apple:Product
- Because apple:memorySize has range apple:iPod, Alice's data is considered inconsistent for not providing a memorySize
- Because Alice lists between 100 and 10,000 items for sale, Alice is classified as MidSizeVendor
- Because Alice is selling an iPod, and she probably won't remember to erase her music from it, and copying music is illegal, Alice is classified as a Criminal.
So when are superset semantics okay?
Safe Supersets
Did the source say that?
Did the source mean to imply that?
Or was it just something you inferred, on your own?
Safety is in how you present results:
- Alice is a Criminal.
- According to the Evil-Music-Consortium definitions, Alice is a Criminal.
Sorry, I didn't mean to imply you were a criminal...
The hard part: what was or was not implied?
Does this... | Imply this?
:x rdf:type :A. :A rdfs:subClassOf :B. | :x rdf:type :B.

Does this... | Imply this?
:z foaf:name "Sandro Hawke". | :z foaf:firstName "Sandro".
If the shoe fits
PROPOSED: If you use the URI (in the right graph pattern), you're endorsing use of the published semantics.
Use "extensible semantics"; never make that ambiguous or contradictory.
- Any additional semantics must be triggered by use, by the author, of additional syntax
Example: OWL 2
- If DL graph pattern, then use either semantics (they're the same)
- If not DL graph, then use RDF-Based Semantics
Okay for W3C Recommendations, but what about for all semantics with a web page?
Part 4: Some Ideas for Standards
class and property dereferencing
importing RIF
reasoner negotiation
downloading
PCID
Property and Class Identifier Dereferencing (PCID)
- Recursively dereference every RDF class and property IRI, and merge the results.
- More limited than dereferencing all term IRIs; typically small
- Looks like another entailment regime to me
- TimBL's been suggesting this for years
- I think it'll work well....
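A minimal sketch of the PCID idea, with several assumptions stated up front: the Web is replaced by a hypothetical in-memory dictionary `WEB` (IRI to triples), `pdx:Product` is invented for the example, and "class and property IRIs" is taken to mean every predicate, every object of rdf:type, both ends of rdfs:subClassOf, and objects of rdfs:domain and rdfs:range. The slides do not define that set precisely.

```python
from collections import deque

RDF_TYPE = "rdf:type"
CLASS_VALUED = {"rdfs:domain", "rdfs:range"}

# Hypothetical stand-in for the Web: what each IRI returns when dereferenced.
WEB = {
    "apple:iPodNanoGen1": {("apple:iPodNanoGen1", "rdfs:subClassOf", "pdx:MP3Player")},
    "pdx:MP3Player": {("pdx:MP3Player", "rdfs:subClassOf", "pdx:Product")},
}


def class_and_property_iris(graph):
    """Collect the IRIs used as properties or classes (assumed definition)."""
    iris = set()
    for (s, p, o) in graph:
        iris.add(p)                              # every predicate is a property
        if p == RDF_TYPE or p in CLASS_VALUED:
            iris.add(o)
        if p == "rdfs:subClassOf":
            iris.update((s, o))
    return iris


def pcid(graph, fetch=WEB.get):
    """Recursively dereference class and property IRIs and merge the results."""
    merged = set(graph)
    seen = set()
    queue = deque(class_and_property_iris(merged))
    while queue:
        iri = queue.popleft()
        if iri in seen:
            continue
        seen.add(iri)
        fetched = fetch(iri) or set()            # unknown IRIs contribute nothing
        new = fetched - merged
        merged |= fetched
        queue.extend(class_and_property_iris(new))
    return merged


alice = {(":item22", RDF_TYPE, "apple:iPodNanoGen1")}
result = pcid(alice)
print(sorted(result))
```

Instance IRIs such as `:item22` are never dereferenced, which is what keeps the fetched set small compared with dereferencing every term.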
RIF
RIF is meant to be used with RDF, but how is it imported from RDF?
Use owl:imports? Use PCID?
Additional RIF dialects, FOPL, LP, ....
Reasoner Negotiation
Charlie tells Alice the entailment regimes he'll be using, so she can skip doing those.
Skip-Inference: OWL-DIRECT
Alice tells Charlie which entailment regimes she has (completely) used, so he can skip doing those.
Did-Inference: OWL-RL
... or put it in the graph, or in a metadata graph.
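The bookkeeping behind this negotiation could look roughly like the sketch below. It assumes the header values are comma-separated lists of regime names and that regimes are opaque labels (no regime is treated as covering another); the function names are hypothetical.

```python
def parse_regimes(header_value):
    """Split a comma-separated header value into a set of regime names."""
    return {name.strip() for name in header_value.split(",") if name.strip()}


def left_for_publisher(supported, skip_inference):
    """Regimes Alice still applies after Charlie's Skip-Inference header."""
    return set(supported) - parse_regimes(skip_inference)


def left_for_consumer(wanted, did_inference):
    """Regimes Charlie still applies after Alice's Did-Inference header."""
    return set(wanted) - parse_regimes(did_inference)


# Charlie will do OWL-DIRECT himself, so Alice only needs to do OWL-RL.
alice_todo = left_for_publisher({"OWL-RL", "OWL-DIRECT"}, "OWL-DIRECT")

# Alice reports she has completely applied OWL-RL, so Charlie skips it.
charlie_todo = left_for_consumer({"OWL-RL", "OWL-DIRECT"}, "OWL-RL")
print(alice_todo, charlie_todo)
```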
Downloadable Semantics
- Dave wants a lessThan predicate:
  - { _:x owl:sameAs 7; _:x dave:lessThan 3. } is inconsistent, in Dave's semantics
- At the dave:lessThan IRI, Dave puts a RIF Core ruleset which "implements" his semantics
- That ruleset, combined with a graph, is inconsistent when the lessThan relation is violated, and entails all the graphs for which it holds.
- Reasoners doing PCID+RIF_Core will now automatically implement dave:lessThan
- Such reasoners get OWL-RL for free?
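To make the intended behavior concrete, here is a sketch of the inconsistency half of Dave's semantics, written in Python rather than RIF Core. It assumes numeric literals are modeled as Python numbers and that owl:sameAs is used only to bind a node to a number; the function name is hypothetical, and the entailment half (deriving the lessThan facts that do hold) is left out.

```python
def less_than_violations(graph):
    """Return the dave:lessThan triples that contradict known numeric values."""
    # Nodes that owl:sameAs binds directly to a number.
    values = {s: o for (s, p, o) in graph
              if p == "owl:sameAs" and isinstance(o, (int, float))}

    def value(term):
        if isinstance(term, (int, float)):
            return term
        return values.get(term)  # None when the value is unknown

    violations = []
    for (s, p, o) in graph:
        if p == "dave:lessThan":
            left, right = value(s), value(o)
            if left is not None and right is not None and not left < right:
                violations.append((s, p, o))
    return violations


bad = {("_:x", "owl:sameAs", 7), ("_:x", "dave:lessThan", 3)}
good = {("_:y", "owl:sameAs", 2), ("_:y", "dave:lessThan", 3)}

print(less_than_violations(bad))   # the graph from the slide is inconsistent
print(less_than_violations(good))  # no violation found
```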
Downloadable Plugins
- "That's just using RIF Core as a programming language",
you say.
- Okay, then: let's also do it for Java bytecode and JavaScript
- At dave:lessThan would be links to the appropriate
code
- A small number of RDF reasoner plugin APIs would be needed
- Maybe also try to download from your reasoner
supplier.
Advice to Users
Data Source Providers SHOULD include all useful entailments
Data Source Providers MUST publish only consistent data
Data Consumers SHOULD check for consistency across all included data sources.
Data Consumers SHOULD compute all entailments they'll find useful.
Reasoners MUST report the semantics they are using (including whether algorithm is incomplete/unsound)
Entailment Regimes MUST use Extensible Semantics.
Summary
- What do you mean, "Semantics"?
  - A semantics (or a logic, or an ER) is a spec for a set of equivalent reasoners
  - Intuitively: a (published) collection of inference rules
- Why do you want (Standard) Semantics?
  - Private Use: Better tools for ontologists
  - Single Source: Save on bandwidth
  - Multi Source: Emergent, synthesized knowledge
- Non-Standard Semantics
  - Subsets are okay if you're incomplete anyway
  - Use supersets at your own risk
  - Alice States, Alice Implies, Charlie Infers.
  - Never define semantics which conflict with someone else's
- Ideas for the Future
  - Promote deployment of PCID
  - Downloading Semantics: Standardize the Glue
  - Document Best Practices, Transition Strategies
  - More Workshops!