Weaving Meaning: The Semantic Web

Eric Miller, W3C Semantic Web Activity Lead

American Association of Law Libraries
2002-07-23
Orlando, FL USA

Slides available at:
http://www.w3.org/Talks/0723-aall-em/

metadata

Overview

The Semantic Web: What is it?

elephant

The Semantic Web

a bedtime story...

The Current Web

Resources:
identified by URIs
untyped
Links:
href, src, ...
limited, non-descriptive

User:
Exciting world - semantics of the resource, however, gleaned from content
Machine:
Very little information available - significance of the links only evident from the context around the anchor.

the current web


The Semantic Web - A Simple Extension to the Current Web

Resources:
Globally Identified by URIs
or Locally scoped (Blank)
Extensible
Relational
Links:
Identified by URIs
Extensible
Relational

User:
Even more exciting world, richer user experience
Machine:
More processable information is available
Computers and people:
Work, learn and exchange knowledge effectively

the semantic web

What is the Semantic Web?

The Semantic Web is an extension of the current web, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.

Information that has well-defined meaning is in a form that machines can understand, rather than simply display.

Machine-understandable documents do not imply some magical artificial intelligence that lets machines comprehend human speech; rather, they rely solely on the machine's ability to solve well-defined problems by performing well-defined operations on well-defined data.

or, another way to think about it...

The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale.

You can think of it as being an efficient way of representing data on the World Wide Web, or as a globally linked database.

Overview

1. Web resources

2. Non-Web resources

2.1 Physical objects

2.2 Abstract concepts

Unambiguously Identifying Web Resources

Solution (trivial): URLs

Unambiguously Identifying Physical Objects

Many human systems:

Problems:

Solution: Convert to URIs

Unambiguously Identifying Abstract Concepts

Solution: Use URIs

Problem: Which URIs?

Solution: Ontology

Ontology

In other words:

Dublin Core

One Global Ontology?

No.  Not realistic.

But:

Does an Ontology Really Define Meaning?

No

Example: RFC 2119

Ontologies and Legal Community

Example of Unambiguous Identification

To say: "Web page foo.html  was created by  John Smith"

Need to unambiguously identify 3 things:

  1. Web page:
    http://www.example.org/foo.html
  2. "was created by":
    http://purl.org/dc/elements/1.1/creator
  3. "John Smith":
    http://www.example.org/staffid/85740

Summary

Adding meaning to web, making the web more "semantic"

Problem 2: Complexity of Information Formats

Example: "Time flies like an arrow"

Need a common, machine-processible information format

Important Characteristics for a Machine-Processible Format

A solution: RDF

What Is RDF?

Why a Relational Data Model?

Adapting the Relational Model for the Web

URIs and Database Keys

Relational database mantra:

"The key, the whole key, and nothing but the key"

Web mantra:

"The URI, the whole URI, and nothing but the URI"

RDF Triples

<subject> <verb> <object>
<subject> <property> <value>

Example Triple

(Not RDF/XML syntax):

http://www.example.org/foo.html     (Subject)
  http://purl.org/dc/elements/1.1/creator     (Verb/Property)
    http://www.example.org/staffid/85740     (Object/Value)

Meaning: "Web page foo.html  was created by  John Smith"
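
As a minimal sketch (not part of the original slides), the triple above can be held in a plain Python tuple. The `Triple` name and its field names are illustrative choices; the three URIs are the ones shown on the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, property, value (all URIs here)."""
    subject: str
    prop: str  # "prop" avoids shadowing Python's built-in `property`
    value: str


# The three URIs from the example triple above.
created_by = Triple(
    subject="http://www.example.org/foo.html",
    prop="http://purl.org/dc/elements/1.1/creator",
    value="http://www.example.org/staffid/85740",
)

print(created_by.subject, created_by.prop, created_by.value)
```

Because every part is a URI, two programs that have never heard of each other can still agree on what the statement is about.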

Representing Relational Data as Triples

Table with individual row labeled "Subject", column labeled "Property" and cell value labeled "Value"

Representing Tables as Triples

Any relational data can be represented as triples

Arrow tail, body and head are subject, property and value

Table as Collection of Triples

Arrows can make a table, an arrow from each row to each value
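
The table-to-triples idea can be sketched in Python as follows. This is only an illustration: the staff table, its column names, and the `http://www.example.org/terms/` property namespace are invented here, and the function name is hypothetical.

```python
def table_to_triples(rows, key_column, property_base):
    """Turn each cell of a table into a (subject, property, value) triple.

    rows          -- list of dicts, one dict per table row
    key_column    -- column whose value is the row's URI (the subject)
    property_base -- URI prefix that turns a column name into a property URI
    """
    triples = []
    for row in rows:
        subject = row[key_column]
        for column, value in row.items():
            if column == key_column:
                continue  # the key identifies the row; it is not a cell value
            triples.append((subject, property_base + column, value))
    return triples


# Hypothetical staff table: one row per person, keyed by URI.
staff = [
    {"uri": "http://www.example.org/staffid/85740", "name": "John Smith", "office": "B-12"},
    {"uri": "http://www.example.org/staffid/85741", "name": "Jane Doe", "office": "C-3"},
]

for triple in table_to_triples(staff, "uri", "http://www.example.org/terms/"):
    print(triple)
```

Each row contributes one triple per non-key column, so a table with R rows and C value columns becomes R*C arrows.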

Joining Triples to Create a Graph

Nodes connected by blue labeled arcs.

Joining Data from Multiple Sources

Trivial: Same URI => same node.

Combination of blue, red and green networks (or subgraphs). The subgraphs are connected by the nodes that they have in common.
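
A small sketch of why joining is trivial, assuming triples are stored as Python tuples: merging two sources is just a set union, and the shared URI becomes the shared node. The two sources and the `terms/name` property are made up for illustration.

```python
from collections import defaultdict

# Two hypothetical sources that happen to mention the same staff URI.
SMITH = "http://www.example.org/staffid/85740"
source_a = {
    ("http://www.example.org/foo.html", "http://purl.org/dc/elements/1.1/creator", SMITH),
}
source_b = {
    (SMITH, "http://www.example.org/terms/name", "John Smith"),
}

# Joining is set union: identical URIs denote the same node, so no key mapping is needed.
merged = source_a | source_b


def build_graph(triples):
    """Index triples by subject: node -> list of (property, value) arcs."""
    graph = defaultdict(list)
    for subject, prop, value in triples:
        graph[subject].append((prop, value))
    return graph


graph = build_graph(merged)

# Follow the arcs across the two sources: page -> creator -> facts about the creator.
for _, creator in graph["http://www.example.org/foo.html"]:
    for prop, value in graph[creator]:
        print(creator, prop, value)
```

Neither source knew about the other, yet the walk from the page to the creator's name works because both used the same URI for the person.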

Application Integration: XML Versus RDF

XML: N things on the left connected to N things on the right, with N*N connections: a connection for every left/right pair of things.
  • N*N complexity

RDF: N things on the left and N things on the right, with an RDF node in the middle. N connections from the left go to the RDF node, and N connections from the right also go to the RDF node.
  • N*1 complexity
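
To make the difference concrete, here is a rough count of connections for the two layouts described above. The function names are illustrative; the counting follows the figure, where each application needs only one connection to the shared RDF model (the "N*1" on the slide).

```python
def pairwise_links(n_left, n_right):
    """Every left-side application is wired directly to every right-side one."""
    return n_left * n_right


def hub_links(n_left, n_right):
    """Every application is wired once, to the shared RDF model in the middle."""
    return n_left + n_right


# Columns: N, direct pairwise connections, connections via the shared model.
for n in (2, 5, 10, 50):
    print(n, pairwise_links(n, n), hub_links(n, n))
```

Adding one more application costs N new connections in the pairwise layout but only one in the hub layout.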

Summary

Concepts of RDF

SECTION 5: Conclusions, Example Applications and Demo 

 

Solutions to Key Problems

Goal: "Machine processible information"

Purpose: Find, share and combine information more easily

What information could be machine processible?

Ideally: All Web data.  (Not realistic)

Consumer   Producer   Solution
Machine    Machine    Easy.  Use RDF/mappable*
Machine    Human      Easy.  Use RDF/mappable*
Human      Machine    Easy.  Include RDF/mappable* with human format
Human      Human      Harder:  Must manually add RDF/mappable*, requiring:
                        • Expertise
                        • Tools (e.g., in HTML editors), or
                        • AI

*"RDF/mappable" = RDF or RDF-mappable

Where to put machine-processible information?

A: Several possibilities:

Example RDF / Semantic Web Applications

Demo of TAP Semantic Search

W3C's Current Search

W3C's Semantic Search Service

What I Hoped to Achieve

Creative Commons

RDF Legal Dictionaries

http://rdf.lexml.de/

Open source development of a multi-lingual and multi-jurisdictional RDF Dictionary for the legal world.

Legal Citations

Example...

Example

Caveat: Everything I learned about legal citations I learned from

http://philip.greenspun.com/politics/litigation/reading-cites.html

 
    -- the promise of the "semantic web" is automating
    (parts of) social protocols, and those social protocols
    are often grounded in law

    -- there are established conventions for legal citations

    -- more and more legal proceedings are published via the web
    all the time

    -- those legal proceedings are often copied in many places,
    and there's no recognized canonical URI for them, so
        -- caches don't help
        -- my browser doesn't tell me I've been there before
        -- etc.

So... some ideas...
    -- an RDF schema for legal citations
        (probably one schema per jurisdiction, with lots of
        sharing and subclassing)

    -- a corresponding HTML form for each jurisdiction that, in effect,
    allows you to compute the address of a document

To take the example from philg's tutorial:

    Ford Motor Co. v. Lonon, 217 Tenn 400, 398 S.W.2d 240 (1966)

Perhaps in RDF, I'd spell that:

    <rdf:Description rdf:about="uri">
      <plaintiff>Ford Motor Co.</plaintiff>
      <defendant>Lonon</defendant>
      <volume>217</volume>
      <jurisdiction>Tenn</jurisdiction>
      <page>400</page>
    </rdf:Description>
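
One way the "compute the address of a document" idea might look in code, as a sketch only. The base address `http://cites.example.org/`, the schema namespace, and both function names are hypothetical; nothing here implies that a canonical service for legal citations exists. The example uses the parallel reporter citation "398 S.W.2d 240 (1966)" from the case above.

```python
from urllib.parse import quote

# Hypothetical base address, used only to show the shape of a computed URI.
BASE = "http://cites.example.org/"


def citation_uri(reporter, volume, page):
    """Compute a stable address from the parts of a reporter citation."""
    # quote() percent-encodes anything that is not safe in a URI path segment.
    return BASE + "/".join(quote(str(part), safe="") for part in (reporter, volume, page))


def citation_triples(uri, fields, schema="http://cites.example.org/schema#"):
    """Describe the cited document as (subject, property, value) triples."""
    return [(uri, schema + name, value) for name, value in fields.items()]


# Reporter S.W.2d, volume 398, page 240.
uri = citation_uri("S.W.2d", 398, 240)
for triple in citation_triples(uri, {
    "plaintiff": "Ford Motor Co.",
    "defendant": "Lonon",
    "year": "1966",
}):
    print(triple)
```

If every copy of the opinion were described with the same computed URI, caches and browser history could recognize them as the same document.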