Architecture of the World Wide Web

Ian Jacobs, TAG Editor
World Wide Web Consortium

Meeting of xml.gov XML Working Group
21 January 2004, Washington, D.C.

These slides online at:
http://www.w3.org/2004/Talks/0121-ij-xmlgov/
(zip file).

Contents

  1. W3C's Technical Architecture Group (TAG)
  2. Overview of Architecture of the World Wide Web
  3. Architecture Review of US Patent Office Web Site

W3C's Technical Architecture Group

W3C's Technical Architecture Group was chartered in 2001:

"to document and build consensus around principles of Web architecture and to interpret and clarify these principles when necessary."

Roles: write, coordinate, mediate

TAG Participants

TAG participants elected and appointed:

TAG, posing in front of same motorbike as in Sep 2002, in Vancouver

Why an Architecture Document?

Community Brings Issues to TAG

Teleconference and mailing list discussions ensue.

TAG Explores Problem Space

What makes HTTP GET important?

8 Apr 2002: Dan Connolly receives assignment to write strawman proposal. This evolves into a draft finding.

TAG Coordinates to Build Consensus

Groups Document Consensus

Ongoing: Marking safe operations in WSDL

What type of information is in the Architecture Document, Findings?

Example related to previous issue:

Principles, Constraints, Good Practice Notes
Rationale
Stories and examples

Architecture Tripod

  1. Identification
  2. Interaction
  3. Representation

Identification I: Why URIs?

Value of common syntax for global identifiers:

"Great multiplicative power of reuse derives from the fact that all languages use URIs as identifiers: This allows things written in one language to refer to things defined in another language. The use of URIs allows a language to leverage the many forms of persistence, identity, and various forms of equivalence." -- URIs, Addressability, and the use of HTTP GET and POST

Identification II: URI Usage

Due to global scope, URIs also used outside of Web protocols (e.g., as database keys).

Interaction I: Dereferencing a URI

Interaction II: Dereferencing a URI (illustration)

A resource (Oaxaca Weather Info) is identified by a particular URI and is represented by pseudo-HTML content

Interaction III: Managing Representations

Interaction IV: Issues Raised by Interaction

Representation: Data formats

Architecture Review of US Patent Office Web Site

Review of the United States Patent and Trademark Office Web site revealed:

  1. HTTP GET used for database lookup (good)
  2. HTTP GET used for unsafe interactions (not good)
  3. URI for patent is actually URI for search (not optimal)
  4. POST used to protect sensitive login data (design choice)

HTTP GET used for database lookup

HTML "GET" form used for database lookup:

   <form action="/netacgi/nph-Parser" method="GET">

Use GET for queries, searches, database lookups.
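
A minimal sketch of why GET suits lookups: the query travels in the URI, so the lookup itself can be bookmarked, linked, and re-run. It uses Python's standard urllib.parse. The host example.org and the parameter name q are assumptions made for illustration; only the path comes from the form above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def lookup_uri(base, params):
    """Build the URI that a GET form submission would request."""
    return base + "?" + urlencode(params)

# Hypothetical host and parameter name; only the path is from the slide.
uri = lookup_uri("http://example.org/netacgi/nph-Parser", {"q": "hypertext"})
print(uri)

# The whole query can be recovered from the URI alone.
query = parse_qs(urlsplit(uri).query)
print(query)
```

Because nothing about the lookup lives outside the URI, anyone holding the URI can repeat the same query.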

HTTP GET used for unsafe interactions (not good)

Modifying state of shopping cart is unsafe since it produces a side effect:

"Add to Cart" an HTML link:

   <a href=".../AddToShoppingCart?docNumber=6,678,889...">...

I cannot link to shopping cart from this slide; a search engine or pre-fetching agent might increment counter (cf. SVG 1.2, section 11.8).

In HTML, use "POST" form for unsafe operations.
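
The rule can be sketched server-side as follows. This is an illustration under stated assumptions, not the Patent Office's implementation: the handle function, its status codes, and the in-memory cart are all hypothetical. The point it shows is that a GET issued by a crawler or pre-fetcher leaves state untouched, while only a POST changes the cart.

```python
def handle(method, action, cart, doc_number):
    """Return an HTTP-style status code; only POST may change the cart.

    Hypothetical dispatcher written for this sketch.
    """
    if action != "AddToShoppingCart":
        return 404  # Not Found
    if method == "POST":
        # Unsafe operation: allowed only on an explicit form submission.
        cart.append(doc_number)
        return 200  # OK
    # GET (and anything else) must not produce a side effect here.
    return 405  # Method Not Allowed

cart = []
# A robot following an "Add to Cart" link issues GET: nothing changes.
crawler_status = handle("GET", "AddToShoppingCart", cart, "6678889")
# A person submitting a POST form changes the cart once.
user_status = handle("POST", "AddToShoppingCart", cart, "6678889")
print(crawler_status, user_status, cart)
```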

URI for patent is actually URI for search (not optimal)

What might a URI for a patent look like?

   http://www.uspto.gov/patents/p6678889

Note that this is globally unambiguous; better than "6678889"
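
A minimal sketch of the idea, assuming the URI pattern above (which is itself only a "what might it look like" example, not an actual Patent Office scheme). The helper name patent_uri is hypothetical. Whatever way the patent number is written, it maps to a single URI, so two references can be compared as plain strings.

```python
def patent_uri(number):
    """Map a patent number, written with or without commas, to one URI."""
    digits = "".join(ch for ch in str(number) if ch.isdigit())
    if not digits:
        raise ValueError("no digits in patent number")
    # URI pattern copied from the slide's hypothetical example.
    return "http://www.uspto.gov/patents/p" + digits

a = patent_uri("6,678,889")
b = patent_uri(6678889)
print(a)
print(a == b)
```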

Search produced this URI for search on "hypertext":

   http://patft.uspto.gov/...s1=hypertext&OS=hypertext...

Search produced this URI for search by patent number 6,678,889:

   http://patft.uspto.gov/...s1=6,678,889.WKU....

Why are these URIs different if this is the same patent?

Cost of Arbitrarily Different URIs

At first, I thought these were arbitrarily different URIs for the same resource. If so, machines cannot compare them reliably, so:

Identify Results of Search, not Search

Resource only indirectly identified as query result.

Related in Architecture Document:

POST used to protect sensitive login data (design choice)

Think about these architecture issues and tradeoffs during design! See "URIs, Addressability, and the use of HTTP GET and POST".

Future work

Questions

Review period for Architecture of the World Wide Web open until 5 March 2004; see Call for Review.

