Web Principles, Hopes and Fears
Richard E. Snyder President's Lecture
March 28, 2006
Tufts University
http://www.w3.org/2006/Talks/0328-tufts-tbl/
Tim Berners-Lee
MIT Computer Science & Artificial Intelligence Laboratory (CSAIL)
Southampton University, School of Electronics and Computer Science (ECS)
Director, World Wide Web Consortium
This talk
- Philosophical engineering
- The Web and Semantic Web
- Universality
- Decentralization
- Scale-free structure
but not necessarily in that order.
Philosophical Engineering
- cf. Experimental Philosophy
- Microscopic rules
- Macroscopic behavior
- Connection between them
- Synthesis vs Analysis
- Rules are social as well as technical
"Web Science"
Example: eMail
- Technical Rules: store & forward, no trust infrastructure
- Social Rules: don't bother people
- Scales in academic environment
- Doesn't scale in commercial environment
Web rules
Technical:
- Use URIs for documents and anchors
- Ladder of authority to interpret
- Use standards (HTTP, XHTML, SVG, CSS, DOM, XSLT, ...)
Social:
- Serve useful stuff
- Make useful links
- Intellectual property, libel & fraud laws
Wiki
- Micro: A simple editor
- Micro: Citizen's responsibility
- Macro: Wikipedia
Blogs
- Micro: Trackback
- Macro: The blogosphere
Semantic Web
Technical:
- Use URIs for documents and concepts
- Same ladder of authority
- Standards: (RDF, OWL, SPARQL*, RIF**)
Social:
- Serve useful stuff
- Make useful links
- Share ontologies
- Agree on ontologies
Universality of the Web
Information independent of
- Hardware platform
- Software platform - OS
- Application Software
- Network access
- Public, Group, or Personal scope
- Quality - Scribbled idea to polished publication
- Language and culture
- Disability
- Data for machines or Documents for people
- Scale
- Time
Independence of Hardware
- Originally: Mainframes vs mini vs PC
- Later: VGA vs 800x600
- Now: Laptop vs handheld vs LCDTV
- Technique: Separation of content and form
- Example: Cascading Style Sheets
Independence of Operating system
- Was: VM-CMS vs VAX/VMS vs Unix
- Now: OSX vs Windows vs Linux vs Series 60 etc
- Technique: Standards, open standards
Independence of application software
- Was: WorldWideWeb, Erwise, ViolaWWW, Mosaic, Lynx etc
- Now: Firefox vs Internet Explorer vs Opera etc
- Technique: Open Standards
Independence of Network
- Was: Decnet vs Internet protocol vs CERNnet etc
- Now: IP on RCN vs Verizon vs ...
- Technique: Common Internet Protocol and API
- Technique: Internet "cloud" mimics single connection point
- Concern: IP carrier deals link specific producer-consumers
Public/Private scope
- Must be able to link public and private info
- Systems spread faster behind firewalls
- Intra and internets must be compatible
- Internal and external data must be linkable
Independence of Quality
Quality is subjective
- A universal system can't use one person's criterion
- Example: Nudity and violence ratings internationally
- Example: Religious material
- Technique: Metadata, and parent choice
- "Anyone can say Anything about Anything"
- Technique: Annotation, tagging, etc
- Google algorithm finds clusters
- Hope: Censorship gives way to openness
- Hope: Peer review systems and distributed democracy
Independence of Language and Culture
- Techniques: Unicode, W3C character model
- Techniques: xml:lang attributes
- A lot more than characters
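
As a minimal sketch of the xml:lang technique, the following Python reads language-tagged text using only the standard library. The element names and sample strings are made up for illustration and do not come from the talk.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: the same greeting tagged in three languages.
SAMPLE = (
    '<greetings>'
    '<p xml:lang="en">Hello</p>'
    '<p xml:lang="fr">Bonjour</p>'
    '<p xml:lang="ja">こんにちは</p>'
    '</greetings>'
)

def texts_by_language(xml_text):
    """Return a dict mapping each xml:lang value to the element text."""
    root = ET.fromstring(xml_text)
    return {p.get(XML_LANG): p.text for p in root.findall("p")}

greetings = texts_by_language(SAMPLE)
print(greetings["fr"])
```

Unicode lets all three strings live in one document; the xml:lang tags are what let software pick the right one for a reader.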
Independence of Disability
- 20% have some disability
- 50% make some adaptation
- Web Accessibility Initiative
- Guidelines for making web sites
- Guidelines for making software to make web pages
- International harmonization
- Not just visual impairment!
Data vs Documents
Data is currently pre-web
- Model your data --> ontology
- Map your data to ontology
- Publish it as [virtual] RDF
- Query using SPARQL Protocol And RDF Query Language
- Aim for serendipitous reuse
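
The steps above can be sketched in plain Python, under stated assumptions: the ontology namespace, the sample rows, and the `match` helper are all hypothetical, and `match` is only a stand-in for a real SPARQL query, not an implementation of it.

```python
# Hypothetical ontology namespace and local database rows (illustrative only).
ONT = "http://example.com/ontology#"
ROWS = [
    {"id": "w1", "name": "Widget", "colour": "red"},
    {"id": "w2", "name": "Gadget", "colour": "blue"},
]

# Map local column names to ontology property URIs.
COLUMN_TO_PROPERTY = {"name": ONT + "name", "colour": ONT + "colour"}

def rows_to_triples(rows, base="http://example.com/item/"):
    """Expose each row as (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = base + row["id"]
        for column, prop in COLUMN_TO_PROPERTY.items():
            triples.add((subject, prop, row[column]))
    return triples

def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

graph = rows_to_triples(ROWS)
red_items = [s for s, _, _ in match(graph, p=ONT + "colour", o="red")]
print(red_items)
```

Because the triples are generated on demand from the existing rows, this is "virtual" RDF in the sense used above: the original store stays untouched.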
Independence of Scale
- What shape should the WWW be?
Decentralization
- No central bottleneck (net)
- No conceptual bottleneck (censor)
- No cultural bottleneck (ratings)
- No single root of a tree (ontology)
- Concern: DNS
- Concern: 'net' is tree-like
Graph structure
- Allows many trees to be expressed
- Allows tables to be expressed
- Doesn't go back in the tree
- Needs some structure
Semantic web includes tables,...
...trees
... everything
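
A rough sketch of how both a table and a tree reduce to the same graph of triples. The sample data and the `hasPart` predicate are invented for illustration, and short names stand in for the URIs that real RDF would use.

```python
def table_to_triples(rows, key):
    """Each row becomes a subject; every other column becomes a predicate."""
    return {
        (row[key], column, value)
        for row in rows
        for column, value in row.items()
        if column != key
    }

def tree_to_triples(tree, parent, predicate="hasPart"):
    """Each parent-child edge of a nested dict becomes one triple."""
    triples = set()
    for child, subtree in tree.items():
        triples.add((parent, predicate, child))
        triples |= tree_to_triples(subtree, child, predicate)
    return triples

# Hypothetical sample data: one table row and one parts hierarchy.
table = [{"id": "car1", "colour": "red", "owner": "alice"}]
tree = {"engine": {"piston": {}}, "wheel": {}}

# Both shapes end up in the same set of triples.
graph = table_to_triples(table, "id") | tree_to_triples(tree, "car1")
print(sorted(graph))
```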
RDF data...
...merges just like that.
Subject and object node using same URIs
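
A minimal sketch of why merging is cheap when both sources name nodes with the same URIs: the merge is a set union. All URIs here are hypothetical, and the sketch ignores blank nodes, which need extra care in a real RDF merge.

```python
# Hypothetical URIs; each graph is a set of (subject, predicate, object) triples.
ALICE = "http://example.com/people#alice"
BOB = "http://example.com/people#bob"
ACME = "http://example.com/orgs#acme"
KNOWS = "http://example.com/vocab#knows"
WORKS_AT = "http://example.com/vocab#worksAt"

graph_a = {(ALICE, KNOWS, BOB)}
graph_b = {(BOB, WORKS_AT, ACME), (ALICE, KNOWS, BOB)}

# Merging is set union; shared URIs make the nodes coincide.
merged = graph_a | graph_b

def employers_of_acquaintances(graph, person):
    """Follow person --knows--> x --worksAt--> org within one graph."""
    friends = {o for s, p, o in graph if s == person and p == KNOWS}
    return {o for s, p, o in graph if s in friends and p == WORKS_AT}

print(employers_of_acquaintances(merged, ALICE))
```

Neither source graph alone can answer the question; the merged one can, because `BOB` is the same node in both.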
Communities and Vocabularies
Universal WWW must include communities on many scales
- Communities communicate with languages
- Languages form barriers
- Barriers are essential to the community
- Communicating with other communities is expensive
- Developing wider languages is expensive
- For data web, communities map to ontologies
Applications connected by concepts
Total Cost of Ontologies (TCO)
Assume :-) ontologies evenly spread across orders of magnitude;
committee size as log(community), time as committee^2,
cost shared across community.
Scale | Example | Committee size | Cost per ontology (weeks) | My share of cost
0 | Me | 1 | 1 | 1
10 | My team | 4 | 16 | 1.6
100 | Group | 7 | 49 | 0.49
1000 | | 10 | 100 | 0.10
10k | Enterprise | 13 | 169 | 0.017
100k | Business area | 16 | 256 | 0.0026
1M | | 19 | 361 | 0.00036
10M | | 22 | 484 | 0.000048
100M | National, State | 25 | 625 | 0.000006
1G | EU, US | 28 | 784 | 0.000001
10G | Planet | 31 | 961 | 0.000000
Total cost of 10 ontologies: 3.2 weeks. Serious project: 30 ontologies,
TCO = 10 weeks.
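
The arithmetic behind the table can be reproduced with a short sketch. Two assumptions are mine, not the slide's: the committee sizes shown fit 1 + 3*log10(community), and the first row is treated as a community of one person.

```python
import math

def committee_size(community):
    """Committee size, assumed to be 1 + 3*log10(community) to fit the table."""
    return 1 + 3 * round(math.log10(community))

def cost_weeks(community):
    """Cost of one ontology in weeks: committee size squared."""
    return committee_size(community) ** 2

def my_share(community):
    """Cost per member when shared evenly across the community."""
    return cost_weeks(community) / community

# One ontology at each order of magnitude, from 1 (me) to 10^10 (planet).
scales = [10 ** k for k in range(11)]
total = sum(my_share(n) for n in scales)
print(round(total, 1))
```

The first three rows contribute almost all of the total; the shares of the large-community ontologies are negligible, which is the point of the slide.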
Lesson:
Do your bit.
Others will do theirs.
Thank those who do working groups!
- Communities will be of many sizes.
- There will be very many small ones (6 x 10^9 of size 10^0) and a few global ones (e.g. W3C Recommendation)
- Kleinberg shows that a fractal (1/f) distribution is optimal under some assumptions
- Swoogle results for example (right)
- We have less experience when the fractal is not constrained to a 2D surface.
Goal: Wide deployment of data
- Personal information management
- Common Scientific data
- Lab notebook details
- Economic and Financial
- Regulatory, Tax etc
Impact of Semantic Web
- Much more data integration power
- much more policy awareness required
- systems with transparency ("Why? How do you know that?")
- Computer analysis of data much more powerful
- work needed on emergent properties, stability, and in general the relationship of µrules to Mphenomena
Independence of Time
- Pick URLs without parts which will change
- Imagine a time-invariant set of names
- Support them with a web server
- Make arrangements in case of the organization's demise
- Use standard formats
- Evolution of languages: language frameworks
- Demand this from web site designers.
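
One way to act on the first bullet is a rough checker for URL parts that tend to change. The specific heuristics (query strings, a `cgi-bin` directory, file extensions) are illustrative assumptions, not rules stated in the talk.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative assumption: path segments that expose implementation details.
FRAGILE_SEGMENTS = {"cgi-bin"}

def fragile_parts(url):
    """Return a list of reasons a URL might not survive a site reorganisation."""
    parts = urlsplit(url)
    reasons = []
    if parts.query:
        reasons.append("query string")
    segments = [s for s in parts.path.split("/") if s]
    for segment in segments:
        if segment in FRAGILE_SEGMENTS:
            reasons.append("implementation directory: " + segment)
    if segments:
        extension = posixpath.splitext(segments[-1])[1]
        if extension:
            reasons.append("file extension: " + extension)
    return reasons

print(fragile_parts("http://www.example.com/cgi-bin/report.pl?id=7"))
print(fragile_parts("http://www.w3.org/2006/Talks/0328-tufts-tbl/"))
```

The second URL is this talk's own address; under these heuristics it raises no flags.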
Conclusion
- Universal space -> minimum constraint
- Decentralization -> fractal tangle
- User responsibility.
Thank you
SW: Everything has a URI
Don't say "colour" say
<http://example.com/2002/std6#col>
The relational database
The element of the Semantic Web
- Can be encoded in XML
- Simplicity and mathematical consistency
- This is called Resource Description Framework (RDF)
RDF: Semantic links - "Joining the Web"
Verb/predicate/Property using same URIs
Fractal Web of concepts
- Across boundaries of scale -- personal, group, global
- Varying access levels
- Tension between local and global standards
- Society is a fractal tangle, so must SW be.
- Personal interactions on multiple scales
The semantic web is about allowing data systems to change by evolution, not revolution
Clients of the RDF bus
New data applications can be built on top of RDF bus, for
example:
Components: Adapting random files
Keep your existing systems running - adapt them
Components: Triple store
Virtual servers actually figure stuff out as well as look up data
Adapting SQL Databases
Keep your existing systems running - adapt them
Adapting XML
Remember: RDF on an HTTP server can always be virtual
Adapting XML: GRDDL
Remember: RDF on an HTTP server can always be virtual
Components: Smart servers
Virtual servers actually figure stuff out as well as look up data
Future: Stack of expressive power
The Semantic Web Wave
Why not instant?
- Paradigm shift all over again
- Data is trickier, esp. to design logic languages
- Need for smaller incubator like HEP
- Data is less exciting with no browser
- Fear of having to make ontologies
Good news
- Logic discussions are getting done (OWL, SPARQL, RIF,...)
- Life sciences is an incubator community
- TCO is finite
- Startups
- Major vendors are moving it into products
- We have some ideas about actually making a user interface!
When you get home
- Make a FOAF page + give yourself a URI
- Take your own data
- Export as RDF (or SPARQL)
- Include links to related other data
- Tabulate it