XML and WWW

C. M. Sperberg-McQueen

Österreichische Akademie der Wissenschaften

Wien

1 April 2004


What makes the Web special?

Why the World Wide Web, and not (e.g.) Guide or InterMedia or System-G?
  • first off the block
  • improved reliability
  • innovation
  • simplicity
The Web succeeded because it was simple.
Why was it made simple? In order to scale.

The World Wide Web Consortium

Our mission: to lead the Web to its full potential.
Our goals:
  • universal access
  • semantic Web
  • trust
  • interoperability
  • evolvability (through simplicity, modularity, compatibility, extensibility)
  • decentralization
  • cooler multimedia

Some consequences

Because of our goals, we care about

W3C organization

A world-wide organization.
  • host organizations
    • ERCIM (European Research Consortium for Informatics and Mathematics), Sophia-Antipolis
    • Keio University, Tokyo
    • Massachusetts Institute of Technology, Cambridge, Mass.
  • offices in many regions: Australia, Benelux, Finland, Germany/Austria, Greece, Hong Kong, Hungary, Israel, Italy, Korea, Morocco, Spain, Sweden, United Kingdom and Ireland.
  • members from Europe, North America, and Asia

We need your help

As customers, demand that your vendors support open standards!
As users, read and comment on our draft specifications!
As institutions, join the W3C to give users a stronger voice within the organization!

What makes XML special?

First, the heritage of SGML (Standard Generalized Markup Language, ISO 8879):
  • generic / generalized / descriptive markup (vs. application-specific)
    • for data reuse
    • for data longevity
  • (therefore) declarative markup (vs. imperative)
  • information has validatable structure
Also:
  • simplicity

XML

Some dates:
1967: early work on generic markup: logical markup, not appearance- or process-oriented
1973: Generalized Markup Language
1986: Standard Generalized Markup Language (SGML) becomes an ISO standard
1996: Work on ‘SGML on the Web’ starts in W3C
  • Make SGML Web-ready. Keep XML SGML-compatible.
  • Keep the power: nested structure, world view, reusability, generic markup, data ownership.
  • Lose the cruft: simplify, simplify!
1998: XML 1.0 becomes a W3C Recommendation
2004: over 20 XML-related specs at W3C — why so many?
Has XML lost its simplicity?

XML is a language

The syntax of XML is simple:
<doc>
  <rule number="1">Everything is delimited.
    <elucidation>
      <subrule>Elements are delimited by <term>start-</term> 
        and <term>end-tags</term>.</subrule>
      <subrule>Tags are delimited by angle brackets.</subrule>
      <subrule>Attribute values are delimited by 
        quotation marks.</subrule>
    </elucidation>
  </rule>
  <rule number="2">Everything nests.
    <elucidation>
      <subrule>Elements occur within elements.</subrule>
      <subrule>Attributes are specified within start-tags.</subrule>
    </elucidation>
  </rule>
</doc>
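
A minimal sketch of what the two rules buy a processor, assuming Python's standard xml.etree.ElementTree module as the parser. The document is a shortened copy of the one above; the helper names outline and is_well_formed are hypothetical, chosen only for illustration. Because everything is delimited and everything nests, a generic parser can recover the tree without knowing anything about the vocabulary, and it can reject text that breaks either rule.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's document (some subrules omitted).
DOC = """<doc>
  <rule number="1">Everything is delimited.
    <elucidation>
      <subrule>Tags are delimited by angle brackets.</subrule>
    </elucidation>
  </rule>
  <rule number="2">Everything nests.
    <elucidation>
      <subrule>Elements occur within elements.</subrule>
      <subrule>Attributes are specified within start-tags.</subrule>
    </elucidation>
  </rule>
</doc>"""


def outline(element, depth=0):
    """Return one (depth, tag, attributes) tuple per element, in document order."""
    rows = [(depth, element.tag, dict(element.attrib))]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


def is_well_formed(text):
    """Report whether the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


rows = outline(ET.fromstring(DOC))
for depth, tag, attrs in rows:
    print("  " * depth + tag, attrs)

# Breaking either rule makes the text ill-formed.
print(is_well_formed("<doc><rule></doc></rule>"))  # tags do not nest
print(is_well_formed("<doc number=1/>"))           # attribute value not quoted
```

The parser here checks well-formedness only; it does not validate against a DTD.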

XML is a metalanguage

Three layers of rules governing data.

DTDs do two different things

Distinguish logical and physical structure.

Four layers, four or more specs

To every thing there is a season. And a spec.

What we foresaw (and a bit more)

The original plan:
  • XML (a lightweight SGML) 1998
    • generic markup
    • validation
  • XLink / XPointer (lightweight HyTime) 2001, 2003
    • stand-off links
    • n-way linking
    • structure-based addressing
  • XSL (lightweight DSSSL) 1999, 2001
    • structure-based formatting and rendering
    • tree transformations
    • flow objects
  • XML applications / XML-based languages (SMIL, XHTML, MathML, ...)
An early addition: XML Namespaces 1.0.

Conceptual layers

A conceptual stack for markup (or data representation generally).

Implications of the stack

XML Information Set: what is information, what is insignificant variation
XML 1.1: alignment with Unicode 3
XML Namespaces 1.1: improve processing on abstract model, re-serialization
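
A rough illustration of the distinction between information and insignificant variation, assuming Python's standard xml.etree.ElementTree module. The summary helper and the sample strings are hypothetical, and the comparison is only an approximation of the idea, not an implementation of the XML Information Set. Attribute order, the choice of quotation mark, white space inside tags, and the empty-element shorthand all disappear on parsing; a changed attribute value does not.

```python
import xml.etree.ElementTree as ET


def summary(element):
    """Reduce an element to (tag, sorted attributes, text, tail, children)."""
    return (
        element.tag,
        sorted(element.attrib.items()),
        element.text or "",
        element.tail or "",
        [summary(child) for child in element],
    )


# A and B differ only in serialization details.
A = '<rule number="1" kind="syntax"><term/></rule>'
B = "<rule kind='syntax'   number='1'><term></term></rule>"
# C differs in an attribute value, which is real information.
C = '<rule number="2" kind="syntax"><term/></rule>'

same = summary(ET.fromstring(A)) == summary(ET.fromstring(B))
different = summary(ET.fromstring(A)) != summary(ET.fromstring(C))
print(same, different)
```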

World domination through markup

A typical data flow diagram.

What will data processing be?

When you have a good hammer ...

Further implications

If XML is a good way to represent complex data, then we'll also need:
  • APIs for XML
    • Document Object Model (DOM)
    • Simple API for XML (SAX)
  • XML querying
    • XPath
    • XQuery
    • XSLT
  • XML-to-XML transformations
    • XQuery
    • XSLT
  • XML messages for distributed applications
    • SOAP
    • Web Services description, choreography
    • Digital signatures, encryption
as well as definitions for specific XML applications (e.g. for corpora, for historical philological editions, for commerce of various kinds, ...)
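
To make the first two items concrete, here is a small sketch that reads the same data three ways, assuming the Python standard library: xml.dom.minidom for a DOM-style tree, xml.sax for SAX-style events, and the limited XPath subset built into xml.etree.ElementTree. The sample document and the RuleCollector class are hypothetical. XQuery, XSLT, and SOAP have no standard-library support in Python and are not shown.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax

DOC = (
    '<doc>'
    '<rule number="1">Everything is delimited.</rule>'
    '<rule number="2">Everything nests.</rule>'
    '</doc>'
)

# DOM: the whole document is built as a tree in memory, then navigated.
dom = xml.dom.minidom.parseString(DOC)
dom_numbers = [rule.getAttribute("number")
               for rule in dom.getElementsByTagName("rule")]


# SAX: the parser reports events; the application keeps only what it needs.
class RuleCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.numbers = []

    def startElement(self, name, attrs):
        if name == "rule":
            self.numbers.append(attrs.getValue("number"))


collector = RuleCollector()
xml.sax.parseString(DOC.encode("utf-8"), collector)

# XPath (subset): select elements by a path expression with a predicate.
second = [rule.text
          for rule in ET.fromstring(DOC).findall("./rule[@number='2']")]

print(dom_numbers, collector.numbers, second)
```

The tree-based approach is convenient for random access; the event-based approach avoids holding the whole document in memory, which matters for large inputs.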

What makes XML special?

  • ownership of information
    • freedom from vendor lock-in
    • freedom from pre-defined semantics
    • application independence
  • responsibility for information
    • You must think about the processing you need.
    • You must decide what information you wish to capture, and what information you wish to discard.
    Sometimes painful, but worthwhile.

Answer to the question “What is enlightenment?”


Enlightenment is man's emergence from his self-incurred immaturity. Immaturity is the inability to use one's own understanding without the guidance of another.

-Kant

Thank you

C. M. Sperberg-McQueen
Member, Technical staff
World Wide Web Consortium
cmsmcq@w3.org
For more information: http://www.w3.org/XML