Web for Everyone?

This presentation was given as a lightning talk at the Technical Plenary in Boston, March 2005, by Richard Ishida.

Title slide

The tag line on the home page of the W3C Internationalization (i18n) Activity is "Making the World Wide Web truly World Wide".

Tifinagh revisited

The ancient Berber script, Tifinagh, was developed around 500 BC, from Pheonician. The top line of the slide means 'Hello' in Berber. The lower line shows some additional characters from the Berber alphabet.

Although after around a thousand years the Berber script began to be less widely used, it has still continued to exist here and there.

In November of 2004, however, there was a workshop in Rabat, Morocco, organised by the Royal Institute for Berber Culture (IRCAM). IRCAM has been working, with the support of the Moroccan King, to reintroduce the widespread use of Tifinagh in Morocco to represent the Berber language. It is a fascinating idea, with few parallels in the modern world outside the revival of Hebrew and Welsh languages.

The Berber script is closely associated, however, with the Berber identity. It's symbols reappear in Berber art and patterns. Reintroducing its use is seen as an affirmation of this ancient culture. Children are already being taught the Berber script in Moroccan schools.

Of course, in today's world it is essential to enable use of the script with computerised technology. This is in progress. IRCAM has been working on ISO standards for keyboards and sorting. There are free fonts available (eg. Hapax Berbère). But most importantly, Unicode 4.1, which is soon to be released, will contain codepoints for all the necessary Tifinagh characters.

Tifinagh web page

Since the proposed Unicode code points are already known, you can already create Web pages using Tifinagh. This picker allows you to type in Tifinagh characters and then cut and paste them into documents. Characters are roughly arranged as per the Tifinagh keyboard proposal.

Problems with XML

You could also create XML content in Tifinagh. However, there is an issue with XML.

Let's suppose, for example, that you had a mission critical application that required you to supply the names of days of the week, with no spelling errors. You might create an enumerated list of the days. This would ensure that incorrect spellings are spotted during validation, and may also assist an author by allowing them to select the appropriate day from a pull-down list, rather than possibly making typos that need to be corrected.

If you tried to do the same thing with Tifinagh, however, it wouldn't work if you were using XML 1.0. This is because XML 1.0 is based on and only supports characters in version 2 of the Unicode Standard. Tifinagh characters do not appear before version 4.1 of the Unicode Standard.

Some people may argue that Tifinagh is not very important, so this is not a major defect of XML 1.0. Note, however, that you would encounter the same problems with an application using Ethiopic script. This too was added after version 2 of the Unicode standard.

Excluded scripts

In fact, there are a number of scripts that have been added since Unicode 2.0. This slide lists additional scripts that are all in everyday use in the modern world, but that cannot be used in XML 1.0 for element or attribute names or for purposes such as the enumerated list shown earlier:

Ethiopic (Ethiopia, Eritrea, Somalia)
Sinhala (Sri Lanka)
Myanmar (Myanmar)
Khmer (Cambodia)
Canadian Syllabics (Canada)
Thaana (Maldives)
Mongolian (Mongolia)

For the roughly 150 million people using these scripts around the world, the 'X' in XML is likely to look more like the 'X' in 'exclusion' than the 'X' in 'extensible'.

Indeed, many loose characters have also been added to scripts that were already included in Unicode 2.0. This can present an even more problematic situation than a missing script. In some cases the additions are commonly used by people writing text in Unicode, and it will not be clear to them what versions of Unicode support what characters. If a person using, say, an Indic script includes one of these characters in their XML, it will be extremely difficult for them to work out why the XML doesn't validate.

User opinion

As we look to extending the Web to all corners of the globe, this leads to some problems. Users don't typically want to hear about technical difficulties. They just want things to work for them in their cultural environment.

In a recent article by some people introducing XML to the Ethiopic community we read the following:

The notion of XML not being able to support complete native markup for certain scripts ... seems to carry a negative stigma of unfairness and lack of equal representation of writing systems around the world.

There is a solution

There is a solution to this problem. XML 1.1 decouples the link between XML and Unicode versions. If you use XML 1.1, all these characters we have discussed are available. And yet the take-up of XML 1.1 by technologies within the W3C has been slow.

The slow up-take is down to differing views and arguments in the technical community. During this Technical Plenary we have already discussed XML 1.1 adoption, and the naysayers have said things like "XML 1.1 is disruptive", "XML 1.1 is not a success", and "Don't mess with the foundation". The problem is that we may have built a beautiful building with fine foundations, but there are still many people out there sitting on the street.

These people don't care about the technical issues; they just want the Web to work for them. Whether we package this as "Web for everyone", "Overcoming the digital divide", or "Customer First!", I believe we should use the outstanding technical expertise available in our Working Groups to address the technical barriers to making the World Wide Web worldwide, and finish the job.

Content created 2 March, 2005. Last update 2005-03-12 17:36 GMT

Copyright © 2004 W3C^® (MIT, ERCIM, Keio), All RightsReserved. W3C liability, trademark, document useand software licensing rules apply. Your interactions with this site are inaccordance with our public and Member privacy statements.

Related links

Alternative views