MWI Team Blog
All webs in one, One Web for all (part 2) — 23 March 2009
This post is the second post of the One Web series, and tours the two principles and five good practices of the architecture of the Web that yield the core of One Web. It does stay on the theory side, but note it comes with small and handy figures :) Part 3 is now available.
A Web of URIs
The Architecture of the World Wide Web specification defines the Web as an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URI). Hey, where are the links? Links are but a direct consequence of the use of global identifiers: any resource of the Web can link to any other resource, because each resource is precisely identified by its global identifier.
Principle: The Web is based on global identifiers
The global identifiers could have taken different forms. They happen to be URIs in practice. All linking mechanisms on the Web use URIs. URIs are identifiers, and should be regarded as such. In particular, they do not carry any information on the resource itself.
Principle: URIs are opaque
Example
Looking at your logs, you notice that some users try to access your Web site through mobile devices that only support WML. You decide to use content adaptation and create a WML version of your Web site for them. The problem is that the URI they use is http://example.com/index.html. Well, it is not a problem! URIs are opaque, you can simply go ahead and serve WML content for that URI when it is requested by WML-only mobile devices!
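The WML example can be sketched as a small server-side selection function. This is only an illustration, not how any particular server does it: the function names, the file names and the simplified `Accept` parsing (which ignores q-values) are assumptions made for this sketch. The media types are the registered ones for HTML and WML.

```python
# Hypothetical sketch: one URI, several representations.
# File names and the simplified Accept parsing are illustrative assumptions.

REPRESENTATIONS = {
    "text/html": "index.html",
    "text/vnd.wap.wml": "index.wml",
}


def accepted_types(accept_header):
    """Return the media types listed in an Accept header, ignoring parameters such as q-values."""
    return [part.split(";")[0].strip().lower()
            for part in accept_header.split(",") if part.strip()]


def select_representation(accept_header):
    """Pick the representation to serve for http://example.com/index.html."""
    types = accepted_types(accept_header)
    # A device that accepts WML but not HTML gets the WML version.
    if "text/vnd.wap.wml" in types and "text/html" not in types:
        return "text/vnd.wap.wml", REPRESENTATIONS["text/vnd.wap.wml"]
    # Everyone else gets the HTML version, under the very same URI.
    return "text/html", REPRESENTATIONS["text/html"]
```

The point is that the choice depends on what the device says it accepts, never on the shape of the URI. A real deployment would typically also send a `Vary: Accept` response header so that caches keep the two versions apart.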
Links equal value
The value of the Web lies in its ability to link similar resources together, either through explicit links between the resources, or through implicit links (e.g. two resources targeting the same resource, two resources that match the same keyword search in a search engine). The value of the Web could be defined as the sum of the values of its resources.
Example
Let's imagine that the Web is composed of three resources: A, B and C. Resources B and C link to resource A. The situation is represented in the diagram below.
Resources B and C get their value from the fact that they both link to resource A. For the sake of simplicity, let's say that their value is 1. Resource A gets its value from the fact that it is the target of two different links, and from the fact that resources B and C, at the origin of the links, have a value. Summing up, let's say that the value of resource A is 4. The total value of the Web in this case is 6.
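For readers who prefer to see the arithmetic spelled out, here is a toy model of that example. The formula is an assumption chosen to reproduce the figures above (a linking resource is worth 1, a target is worth one point per inbound link plus the value of each resource linking to it). It is not a real metric, and it only works for link graphs without cycles.

```python
# Toy value model. The formula is an assumption chosen to match the example,
# not an established metric. It assumes the link graph has no cycles.

links = {"B": ["A"], "C": ["A"]}   # source -> list of targets
resources = ["A", "B", "C"]


def value(resource):
    """Value of a resource under the toy model."""
    inbound = [src for src, targets in links.items() if resource in targets]
    if inbound:
        # A target earns 1 per inbound link plus the value of each linking resource.
        return len(inbound) + sum(value(src) for src in inbound)
    # A resource that only links out is given a value of 1, as in the text.
    return 1 if links.get(resource) else 0


values = {r: value(r) for r in resources}
total = sum(values.values())
print(values, total)
```

With the three resources above, A comes out at 2 links + 1 + 1 = 4, and the total is 4 + 1 + 1 = 6.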
The value of the Web is reduced each time a resource escapes or divides the Web. This happens when:
- Resources are identified with global identifiers that are not URIs, which de facto creates another Web and divides the information space into separate spaces that cannot communicate with each other.
Good practice: Identify resources with URIs
- A single URI identifies more than one resource. Resources identified by the URI are hidden from the rest of the world that can only link to the visible URI.
Good practice: Use distinct URIs to identify distinct resources
Example
If we add two resources hidden behind resource A, we cannot tell whether resources B and C link to A or to one of the other resources. The value of A is reduced, say, to 2. The total value of the Web is reduced to 4.
- The URI of a resource changes. When a URI changes, links that used the previous URI are broken, reducing the value of the resources that contain the broken links. Cool URIs don't change!
Good practice: URIs should be persistent
Example
When the URI of resource A changes, resources B and C now link to nothing. The value of resource A is reduced to 0. The total value of the Web is reduced to 2.
- The type of information carried by a resource changes. Links to the resource are still valid, but their value is reduced because they now link to a different resource.
This good practice does not mean that the content in itself cannot change. The actual content of a resource that returns "breaking news information about strikes in France" is likely to change pretty often, but it should always return "breaking news information about strikes in France".
Good practice: Resources should return predictable information
Example
If the URI that used to identify resource A now identifies resource D, resources B and C now link to something by mistake. The value of resource D is simply 0. The total value of the Web is reduced to 2.
- Different URIs are used to identify the same resource. In this case, the problem arises because different resources that link to the resource will use different URIs, creating artificial copies of the resource.
Good practice: Do not use different URIs to identify the same resource
Example
If resource A is identified by two different URIs, and resource B links to resource A using the first URI, while resource C uses the second URI, resources B and C now apparently link to two different copies of resource A and are not connected anymore. Their respective value is 0.
Resource A is duplicated. One copy is the target of one link originating from resource B, the other from resource C. Each copy has thus a value of 1. The total value of the Web was reduced from 6 to 2!
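Two of the situations above, a URI that changes and a resource reachable under several URIs, are commonly softened with permanent redirects (HTTP status 301) pointing at a single canonical URI. The sketch below is a minimal illustration under stated assumptions: the paths, the `REDIRECTS` table and the `resolve` function are all invented for the example.

```python
# Hypothetical redirect table: old or alias path -> canonical path.
# All paths are invented for illustration.
REDIRECTS = {
    "/old/a.html": "/a",      # the URI changed: keep the old one alive
    "/a/index.html": "/a",    # an alias: point it at the canonical URI
}


def resolve(path):
    """Return (status, path): 301 and the canonical path for known old or alias paths, else 200."""
    seen = set()
    status = 200
    # Follow the table until a canonical path is reached; 'seen' guards against loops.
    while path in REDIRECTS and path not in seen:
        seen.add(path)
        path = REDIRECTS[path]
        status = 301
    return status, path
```

Here `resolve("/old/a.html")` and `resolve("/a/index.html")` both lead to `/a`, so resources B and C end up connected to the same resource A again, whichever URI they used in their links.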
One Web is NOT One Version
There is one important message that the One Web vision does not carry: a hypothetical need to return the same version for each and every device. A Web site may use content adaptation, i.e. return different versions of a resource depending on the capabilities of the device and/or the context of the user. In fact, content adaptation is explicitly encouraged to improve the user experience on devices. "Exploit device capabilities" is one of the mobile web best practices, and it does fit One Web!
How many versions should there be? 1, 10, 100, 1000? Content authors will certainly want to reduce the number of versions they have to maintain to a bare minimum, to one whenever possible. With a little bit of (hard) work and the possible help of content adaptation tools, it is usually possible to generate all versions from one single source. Could this single source be served to everyone so that we could simply drop the need to adapt the content? In some cases, yes. The Web site of the W3C Mobile Web Initiative, for instance, does not use content adaptation (except for one or two exceptions that are out of scope of this post) and was designed to work on a majority of devices. One may argue that the content of this Web site is "simple". As of today, it is true that specific versions are required to take advantage of the specificities of various devices. I'll get back to that in the last part of this series. The one thing that I would like to stress here is that One Web is neutral when it comes to specifying how many versions there should be, provided that the good practices mentioned above are applied, of course!
Fine. Now you know One Web. How does that connect to the real world?
[On to part 3...]
Comments, Pingbacks:
Principle: The Web is based on global identifiers
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-global-id / French translation: http://opikanoba.org/tr/w3c/webarch/#pr-global-id
Principle: URIs are opaque
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-uri-opacity / French translation: http://opikanoba.org/tr/w3c/webarch/#pr-uri-opacity
Good practice: Identify resources with URIs
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-use-uris / French translation: http://opikanoba.org/tr/w3c/webarch/#pr-use-uris
Good practice: Use distinct URIs to identify distinct resources
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-uri-collision / French translation: http://opikanoba.org/tr/w3c/webarch/#pr-uri-collision (contains a translation error for "Single Resource")
Good practice: URIs should be persistent
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#URI-persistence / French translation: http://opikanoba.org/tr/w3c/webarch/#URI-persistence
Good practice: Do not use different URIs to identify the same resource
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#avoid-uri-aliases / French translation: http://opikanoba.org/tr/w3c/webarch/#avoid-uri-aliases
A Web site may use content adaptation, i.e. return different versions of a resource depending on the capabilities of the device and/or the context of the user.
See also http://www.w3.org/TR/2004/REC-webarch-20041215/#def-coneg / French translation: http://opikanoba.org/tr/w3c/webarch/#def-coneg