Putting the Web back in Semantic Web
Random reflections on ISWC,
and busting some myths
and a few challenges
ISWC Wrapup - RuleML kickoff
http://www.w3.org/2005/Talks/1110-iswc-tbl/
Tim Berners-Lee
CSAIL, MIT
Levels of semantics in publication
- Scraped unstructured text and multimedia content
- Scraped semi-structured XML
- Scraped XML
- XML (may be XHTML) with GRDDL (OK)
- Published RDF (Yes!)
With data, generate HTML from the RDF, not the other way around.
Sem Web architecture 101
- Give important concepts URIs.
- Each URI identifies one concept.
- Share these symbols between many languages
- Support URI lookup
Define symbols:
- Using natural language (bootstrap?)
- By reference to existing systems (eg GPS)
- By mathematical relation to others (raft)
Chris Welty/IBM: "In the Semantic Web, it is
not the Semantic which is new, it is the Web which is new".
Web architecture 101
- Things are denoted by URIs.
- Use them to denote things.
- Serve useful information at them.
- Dereference them.
Example
http://www.w3.org/People/Berners-Lee/card#i
http://www.w3.org/People/Berners-Lee/card (in N3, summarized):
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <#>.
:i a foaf:Person;
foaf:family_name "Berners-Lee";
foaf:givenname "Timothy";
foaf:homepage <http://www.w3.org/People/Berners-Lee>;
foaf:mbox <mailto:timbl@w3.org>.
Counterexample
FOAF: Follow the rdfs:seeAlso link, and look for someone whose foaf:mbox has
the given checksum.
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <#>.
:i a foaf:Person;
foaf:knows [
a foaf:Person;
rdfs:seeAlso <http://rdfweb.org/people/danbri/rdfweb/webwho.xrdf>;
foaf:mbox <mailto:danbri@w3.org>;
foaf:name "Dan Brickley" ].
Why not just use a URI?
URI + HTTP architecture 1
The hash is an operator which joins a local identifier to a document URI to
give a global identifier.
http://example.com/foo#bar
- Strip off #bar
- Look up http://example.com/foo using HTTP:
- Look up example.com giving 128.0.0.1
- Request foo from 128.0.0.1
- 200 OK is returned
- Parse the result according to the Internet content type
- This gives you information about bar
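The first step above, separating the document URI from the local identifier, can be sketched with the Python standard library. The function name is illustrative and not from the talk; the sketch performs no network access.

```python
from urllib.parse import urldefrag

def split_hash_uri(uri):
    """Split a hash URI into (document URI, local identifier)."""
    # urldefrag returns the URI without its fragment, plus the fragment
    # itself (an empty string when the URI has no '#').
    document, fragment = urldefrag(uri)
    return document, fragment

doc, local = split_hash_uri("http://example.com/foo#bar")
print(doc)    # http://example.com/foo
print(local)  # bar
```

Only the document part is sent to the server; the local identifier is resolved by the client against whatever the document says.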
URI + HTTP architecture 2
After the TAG resolution of HTTPRange-14, an
optional operation is:
given http://example.com/foo/bar
- No #fragment to strip off: use the whole URI
- Look up http://example.com/foo/bar using HTTP as amended by TAG:
- Look up example.com giving 128.0.0.1
- Request foo/bar from 128.0.0.1
- You get a redirection 303 See Other response, indicating that the
URI did not denote an information resource, but mentioning a new
resource http://example.com/foo-schema.rdf
- Request http://example.com/foo-schema.rdf
- Get a 200 OK response
- Parse the result according to the Internet content type
- This gives you information about <http://example.com/foo/bar>
Not recommended by me.
Other issues: Content negotiation between HTML and RDF. LSIDs
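The client-side decision in the two architectures above (200 OK versus 303 See Other) can be sketched as a small pure function. This is a minimal sketch under stated assumptions: the function name and the action labels are hypothetical, and the response is passed in as plain values rather than fetched over the network.

```python
def next_step(uri, status, location=None):
    """Decide what a client does with a response to a lookup of `uri`.

    Returns an (action, target) pair; the action names are hypothetical.
    """
    if status == 200:
        # The URI named a document: parse the body according to its
        # Internet content type.
        return ("parse", uri)
    if status == 303 and location:
        # The URI did not denote an information resource; the Location
        # header names another document that describes it.
        return ("fetch", location)
    # Anything else is outside the steps described on these slides.
    return ("give_up", uri)

print(next_step("http://example.com/foo", 200))
# ('parse', 'http://example.com/foo')
print(next_step("http://example.com/foo/bar", 303,
                "http://example.com/foo-schema.rdf"))
# ('fetch', 'http://example.com/foo-schema.rdf')
```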
HTTP Arch 3: other links
Various properties used to point to resources which may be interesting.
- rdfs:seeAlso
- owl:imports
- log:rules
- anything used as predicate or type -> "ontological closure"
- anything used as an object, subject?
[MIT work with CWM in this area
funded by various NSF and DARPA grants]
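Following such links can be sketched as a breadth-first walk. This is an illustration only, not the CWM implementation: the `documents` dictionary stands in for real HTTP lookups, triples are plain tuples of prefixed names and URIs, and only the three explicit link properties are followed (a full "ontological closure" would also dereference every predicate and type).

```python
from collections import deque

# Link properties named on the slide, written as prefixed names.
FOLLOW = {"rdfs:seeAlso", "owl:imports", "log:rules"}

def closure(start, documents, follow=FOLLOW):
    """Return the set of document URIs reachable from `start`.

    `documents` maps a document URI to a list of
    (subject, predicate, object) triples.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        # A URI with no known document simply contributes no links.
        for _subject, predicate, obj in documents.get(uri, []):
            if predicate in follow and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen

docs = {
    "http://example.com/a": [
        ("http://example.com/a#x", "rdfs:seeAlso", "http://example.com/b"),
    ],
    "http://example.com/b": [
        ("http://example.com/b", "owl:imports", "http://example.com/c"),
        ("http://example.com/b#y", "foaf:name", "Y"),
    ],
}
print(sorted(closure("http://example.com/a", docs)))
# ['http://example.com/a', 'http://example.com/b', 'http://example.com/c']
```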
Breadcrumbs ethos
- Leave information for others to follow
- a Breadcrumbs protocol = what to leave where + what links to follow
when
- Delegated query is a special case of breadcrumbs
- Challenge: For delegated query, how to describe the sort of query a
service or document can answer
Mythbusting
Myth: "The Semantic Web technology is
Description Logic"
No, OWL is one semantic web language.
It is important that applications which need different expressiveness can use
it.
But other languages must interoperate to the greatest extent possible.
They should use URIs
They should not reinvent functionality already provided by standards.
SW Arch: Same symbols, multiple languages
Mythbusting: Not just public data
- The SW is not just about public data.
- It is also about personal, group, agency and enterprise data.
- Historically, intranet servers preceded extranet servers.
For example, in BioPAX
When will the patterns all connect?
[Diagram: Joanne Luciano, Predictive
Medicine; Drug discovery demo using
RDF, Siderean Seamark and Oracle
10g]
Other myths
- "The semantic Web is metadata for
classifying documents"
- "The semantic web is about
hand-annotated web pages"
Such pages are interesting, but not the mainstay of semantic web: too
much trouble!
- "The semantic web is mainly about
content extracted from text"
No, it is primarily an interlingua for relational data and logic. Bridges
will always be important
- "The Semantic Web is about making
one big ontology"
The semantic web is about a fractal mess of interconnected
ontologies....
- "The semantic web ontologies must all
be consistent"
Only the parts I am using together
- Communities will be of many sizes.
- There will be very many small ones (6×10^9 of size 10^0) and a few
global ones (e.g. W3C Recommendations)
- Klensin shows that
fractal (1/f) distribution is optimal under some assumptions
- Swoogle results
for example (right)
- We have less experience when fractal is not constrained to a 2D
surface.
Total Cost of Ontologies (TCO)
Assume :-) ontologies evenly spread across orders of magnitude; committee
size as log(community), time as committee^2, cost shared across
community.
Scale | Eg | Committee size | Cost per ontology (weeks) | My share of cost
0 | Me | 1 | 1 | 1
10 | My team | 4 | 16 | 1.6
100 | Group | 7 | 49 | 0.49
1000 | | 10 | 100 | 0.10
10k | Enterprise | 13 | 169 | 0.017
100k | Business area | 16 | 256 | 0.0026
1M | | 19 | 361 | 0.00036
10M | | 22 | 484 | 0.000048
100M | National, State | 25 | 625 | 0.000006
1G | EU, US | 28 | 784 | 0.000001
10G | Planet | 31 | 961 | 0.000000
Total cost of 10 ontologies: 3.2 weeks. Serious project: 30 ontologies, TCO =
10 weeks.
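The arithmetic behind the table can be reproduced in a few lines. This is a sketch under stated assumptions: the rule committee = 1 + 3·log10(community) is inferred from the committee sizes in the table (1, 4, 7, ..., 31) and is not stated in the talk, and the first row (listed with scale 0) is treated as a community of one person.

```python
def tco_rows(max_exponent=10):
    """Rebuild the table rows for communities of size 10**0 .. 10**max_exponent."""
    rows = []
    for k in range(max_exponent + 1):
        community = 10 ** k
        committee = 1 + 3 * k            # assumed fit to the table's values
        cost_weeks = committee ** 2      # time grows as committee size squared
        my_share = cost_weeks / community  # cost shared across the community
        rows.append((community, committee, cost_weeks, my_share))
    return rows

rows = tco_rows()
total_share = sum(share for _community, _committee, _cost, share in rows)
print(round(total_share, 1))  # 3.2
```

Summing the "my share" column over one ontology at each scale gives the 3.2 weeks quoted above; three ontologies per scale gives roughly the 10 weeks quoted for a serious project.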
Lesson:
Do your bit. Others will do
theirs.
Thank those who do working groups!
User Interface challenges
User interfaces are blossoming at ISWC (conphoto, etc.), with more to do:
- More customization for specific application areas but...
- Generality: can browse any data anywhere
- Dynamically pick up from ontology: Lenses, style, forms
- Independent control of: style, provenance, domains (vocabulary
groups)
- Blow spreadsheet tools away
- Announced Galway, Monday November 7, 2005
- See: Press release
- Building on: Open source and academic work, RuleML
- Business Rules industry includes...
- Stemming from W3C Rules
Workshop 27-28 April, 2005
- Charter writing discussions
Goals for Rules
- Interoperability between Rule-based systems
- Extend the expressive power of shared knowledge
The next step in the Semantic Web roadmap
- New markets for rule
systems
- Mapping between ontologies
- Interlocking with other Semantic Web languages RDF, OWL, SPARQL
- Serendipitous re-use of knowledge in Rule
form
Web attitude
- Anyone can say anything about anything
- No one knows everything about anything
-> scoped negation as
failure
- My system is most valuable because of its interconnection to its
peers
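Scoped negation as failure, mentioned in the list above, can be illustrated as follows: because no one knows everything about anything, "not found" is only meaningful relative to a named document. This is a minimal sketch, not a rule-language implementation; the function names and the tuple representation of triples are assumptions made for illustration.

```python
def matches(triple, pattern):
    """True if `triple` fits `pattern`; None in the pattern is a wildcard."""
    return all(p is None or p == t for t, p in zip(triple, pattern))

def not_stated_in(scope, pattern, documents):
    """True if the document named `scope` contains no triple matching `pattern`.

    Raises KeyError when `scope` is unknown: nothing can be concluded
    about a document that has not been read.
    """
    return not any(matches(t, pattern) for t in documents[scope])

docs = {
    "http://example.com/a": [("ex:alice", "foaf:knows", "ex:bob")],
}
print(not_stated_in("http://example.com/a",
                    ("ex:alice", "foaf:knows", "ex:carol"), docs))  # True
print(not_stated_in("http://example.com/a",
                    ("ex:alice", "foaf:knows", None), docs))        # False
```

The answer says only that the given document does not state the fact, never that the fact is false on the Web as a whole.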
Thank you
The conference is over. Welcome to the conference!
slides:
http://www.w3.org/2005/Talks/1110-iswc-tbl/
Tim Berners-Lee
CSAIL, MIT
END
You have gone too far.
Clients of the RDF bus
New data applications can be built on top of the RDF bus, for example:
Components: Adapting random files
Keep your existing systems running - adapt them
Components: Triple store
Virtual servers actually figure stuff out as well as look up data
Adapting SQL Databases
Keep your existing systems running - adapt them
Adapting XML
Remember: RDF on an HTTP server can always be virtual
Adapting XML: GRDDL
Remember: RDF on an HTTP server can always be virtual
Components: Smart servers
Virtual servers actually figure stuff out as well as look up data