This document contains examples in another language or script.
This talk was delivered by Richard Ishida at the meeting to celebrate the 10th anniversary of the World Wide Web Consortium in Europe, in June 2005.
Additional commentary will be added shortly to convey the messages on the slides.
This material is organized around a set of presentation slides which can be viewed in several ways. Each view is identified by an icon as described below.
All in one: A single page containing all explanatory text, followed by small accompanying slides.
Slide by slide: One page per slide. This is particularly useful if you need to see the detail on a slide.
Slide text: This page-by-page version of the slides is provided mainly for those who want to cut and paste the text on the slides. (You will need appropriate fonts and rendering software to see the text correctly.)
Overview: A list of headings to help you navigate around the presentation quickly.
Please send any comments to ishida@w3.org.
The W3C has always placed great emphasis on the importance of 'The Web for Everyone'. The Internationalization Activity, within the W3C, has the mission of making the World Wide Web world wide. The slide shows that phrase in 15 different writing systems.
One of the things to note on this slide is that English is just another language.
Another is that only a few years ago it would have been extremely unusual to see all these scripts correctly rendered on the same page. Initially this was a problem related to character encoding.
Not so long ago, the most widely used character encoding was ASCII. ASCII stores one character per byte but uses only 7 bits of each byte. Seven bits allow 128 possible values, each of which can represent a different character. Some values were assigned to control codes, and the graphical characters in the set comprised the English alphabet and a number of symbols.
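The arithmetic behind the 7-bit limit can be checked in a few lines. This is only an illustrative sketch: the helper name `is_ascii` and the sample strings are assumptions made for the example, not part of the talk.

```python
# 7 bits give 2**7 = 128 possible values (0 to 127).
ASCII_SIZE = 2 ** 7


def is_ascii(text: str) -> bool:
    """Return True if every character falls in the 7-bit ASCII range."""
    return all(ord(ch) < ASCII_SIZE for ch in text)


# Control codes occupy values 0-31 and 127; the rest are graphical characters.
control = [n for n in range(ASCII_SIZE) if n < 32 or n == 127]

print(ASCII_SIZE)        # 128
print(len(control))      # 33
print(is_ascii("Web"))   # True
print(is_ascii("café"))  # False: 'é' lies outside the 128 ASCII values
```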
For Western Europe additional accented characters were needed. When 8-bit bytes became more prevalent, one byte could express 256 different characters. This was better, but still not good enough.
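As a rough illustration of what the eighth bit bought, the sketch below uses ISO-8859-1 (Latin-1) as one example of an 8-bit encoding for Western Europe. The choice of Latin-1 and the helper `can_encode` are assumptions for the example; the talk does not name a specific encoding here.

```python
# 8 bits give 2**8 = 256 possible values per byte.
BYTE_VALUES = 2 ** 8


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(BYTE_VALUES)                  # 256
print(can_encode("é", "ascii"))     # False: not among the 128 ASCII values
print(can_encode("é", "latin-1"))   # True: fits in the upper half of the byte
print("é".encode("latin-1"))        # b'\xe9'
print(can_encode("α", "latin-1"))   # False: 256 values still leave no room for Greek
```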
To support scripts such as Greek, we resorted to multiple code pages, each assigning a different set of characters to the same byte values. Code pages had to be swapped in and out to deal with text in multiple scripts, and they could not be combined on a single Web page.
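The reason code pages could not be mixed is that the same byte value means different things in each one. A minimal sketch, assuming ISO-8859-1 and ISO-8859-7 as stand-ins for a Western European and a Greek code page (the talk does not name particular code pages):

```python
# One byte, two interpretations, depending on which code page is active.
raw = bytes([0xE1])

as_western = raw.decode("iso-8859-1")  # Western European code page
as_greek = raw.decode("iso-8859-7")    # Greek code page

print(as_western)  # á
print(as_greek)    # α

# A page is decoded with a single code page, so it cannot show both readings:
# whichever one is chosen, the other script's text turns into the wrong letters.
greek_text = "αβγ".encode("iso-8859-7")
print(greek_text.decode("iso-8859-1"))  # áâã
```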
All of this was academic to people in the Far East. This slide shows the size of some of the character sets that had to be encoded for this region. Each country needs thousands of characters to be available at the same time, far more than the 256 values a single byte can provide. This led to the creation of 'double-byte' encodings. Even so, you could not usually represent more than one of the Far Eastern languages per encoding.
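A small sketch of the double-byte approach and its limits, assuming Shift_JIS (Japanese) and EUC-KR (Korean) as representative encodings. These two, the sample characters and the helper `can_encode` are chosen for illustration and are not taken from the slides.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Two bytes per character allow tens of thousands of code points.
print(2 ** 16)                         # 65536 values available in two full bytes
print(len("あ".encode("shift_jis")))   # 2: one hiragana character takes two bytes
print(len("한".encode("euc_kr")))      # 2: one Hangul syllable takes two bytes

# Each encoding is still built around one language's character set.
print(can_encode("한", "euc_kr"))      # True
print(can_encode("한", "shift_jis"))   # False: Hangul is absent from the Japanese set
```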
Author: Richard Ishida.