Tutorial that provides you with an understanding of key requirements for implementing writing systems in information technology. It does this by examining real examples of a wide range of modern scripts to discover features that a computerized implementation must support. It also makes special reference, where appropriate, to how the Unicode Standard points the way forward for meeting these requirements.
Pages illustrating the features of numerous non-Latin writing systems. These pages also allow you to experiment with various CSS3 styling techniques using dynamic HTML.
Describes some of the basic principles underlying how the Unicode Bidirectional Algorithm works, and some scenarios where inline markup or control codes are needed to correctly render web content written in a right-to-left script. Still a draft.
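As a rough illustration of the starting point for the Bidirectional Algorithm, the sketch below prints the bidi class that Unicode assigns to each character of a mixed-direction string. It uses Python's standard `unicodedata` module; the helper name `bidi_classes` and the sample string are made up for this example, and the sketch only looks up character properties, it does not reorder text.

```python
import unicodedata


def bidi_classes(text):
    """Return (code point label, bidi class) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical sample: Latin letter, space, Hebrew alef, Arabic alef, European digit.
sample = "a \u05d0\u0627" + "1"

for label, bidi_class in bidi_classes(sample):
    # L = left-to-right, R = right-to-left, AL = Arabic letter,
    # EN = European number, WS = whitespace
    print(label, bidi_class)
```

Strong classes (L, R, AL) set the direction of a run, while weak and neutral classes (such as EN and WS) take their direction from context, which is why markup or control codes are sometimes needed.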
Paper delivered at the Unicode Conference in Sept 2002 and Mar 2003. Most people will need to read the PDF version to see the Indic text.
In-progress draft of notes that list the symbols used to represent Bengali, describe their use, and relate them to appropriate characters for representation in Unicode. There is an index of shapes you can use to look up Bengali glyphs and track them down to their constituent Unicode codepoints.
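To give a feel for what tracking a glyph down to its constituent code points involves, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` and the choice of sample are assumptions made for illustration; they are not taken from the notes themselves.

```python
import unicodedata


def describe(text):
    """List the code point and Unicode character name of each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Hypothetical sample: a sequence that Bengali fonts commonly draw as one
# conjunct shape, although it is stored as three separate characters.
conjunct = "\u0995\u09cd\u09b7"

for code_point, name in describe(conjunct):
    print(code_point, name)
```

The point of the sketch is only that a single visible shape can correspond to several stored characters, which is why a shape index is useful.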
In-progress draft of notes that list the symbols used to represent Urdu with the Arabic script, describe their use, and relate them to appropriate characters for representation in Unicode. I still need to address which Unicode characters are most appropriate when there is a choice, but there is already a lot of information about the use of letters and symbols when writing Urdu. Most people will need to read the PDF version to see the nasta'liq font, but there is also an HTML version.
Is it correct that simplified and traditional Chinese are not completely separate sets of code entries in Unicode? If so, are they simply like two different fonts for the same Unicode point? Would I have to have a simplified and a traditional font installed? One traditional character may correspond to several simplified ones, right?
People who want to point to pages in other languages or for other countries keep asking me where to find information about how to write a country or language name in the native language and script, so I thought I'd try to put a list together myself. This is a draft listing language names - corrections and additions are welcome!
Following the same rationale as the previous item, this draft lists names of countries in their own script. Note that I am currently working on the official list of countries as used by the UN, which is likely to supersede this list.
Designers must be very careful about how they split up and reuse text on-screen, since the linguistic differences between languages can lead to real headaches for localizers and may in some cases make a reasonable translation impossible to achieve. (Article in Multilingual Computing magazine).
The HTML specification suggests that the link element can be used by search engines to find alternate translations of the current page. Some browsers expose the link information on the user interface. Andrew Cunningham and I wrote a test for this. Here is a summary of the results of some brief testing of mainstream browsers on Windows XP.
Some browsers apply the fonts listed in the user font preferences to the display of HTML Unicode text in Traditional Chinese, Simplified Chinese, Japanese and Korean, depending on the setting of the lang/xml:lang attribute. Here is a summary of the results of some brief testing of mainstream browsers on Windows XP. I may update this as additional information becomes available.
Describes a hack that allows you to do language negotiation across files that are not necessarily in the same directory. It is a method described by Dominique Hazaël-Massieux with a couple of refinements I added relating to handling default files and language extensions appearing before the .html extension. Note: I’m not convinced that it’s a good idea.
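The general idea behind language negotiation can be sketched in a few lines of Python. This is not the method described in the article; it is a generic illustration that assumes the client sends an `Accept-Language` header and that files are named with the language code before the `.html` extension (for example `index.fr.html`). The function names, the file naming pattern and the default language are all hypothetical.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((tag.strip().lower(), quality))
    # Stable sort: ties keep the order in which the client listed them.
    prefs.sort(key=lambda pref: -pref[1])
    return [tag for tag, quality in prefs if quality > 0]


def pick_file(base, available, header, default="en"):
    """Choose a file such as 'index.fr.html' for the best matching language."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        # Try the full tag first, then fall back to its primary language.
        for candidate in (tag, primary):
            if candidate in available:
                return f"{base}.{candidate}.html"
    return f"{base}.{default}.html"


print(pick_file("index", {"en", "fr", "de"}, "da, fr-CA;q=0.8, en;q=0.7"))
```

Falling back to a default file when nothing matches mirrors the concern about default files mentioned above; a real server-side setup would also need to handle caching and let readers override the choice.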
Should I declare the language of my XHTML document using a language attribute, the Content-Language HTTP header, or a meta element?
How do I use .htaccess directives on an Apache server to serve files with a specific encoding?
How do I use the MultiViews approach on an Apache Web server to automatically serve resources in the language requested by an HTTP header?
What is 'ruby'?
What are the trade-offs between international sites that are monolingual vs. multilingual?
How do I check or change the language settings of my browser?
What is the 'Document Character Set' for XML and HTML, and how does it relate to the encodings I use for my documents?
Can I write HTML and XML element and attribute tag names in languages and scripts other than English?
Why does my browser collapse spaces between Latin and Arabic/Hebrew text?
To correctly format bidi text in XHTML/HTML pages, should I use Unicode control codes or markup?
Should I use CSS or markup to correctly format Unicode-based bidi text in HTML and XML-based markup languages?
Effective localization of XML documents begins with the development of an internationalized document structure. (Article in Multilingual Computing magazine).
Requirements to guide DTD or Schema developers, or for an internationalised tag set or namespace that can be included in DTDs.
When I was learning to use FO I needed a table that showed at a glance what each formatting object's children were and which properties it supported.
Implementor's cheat sheet - gives fast access to WAI's recommendations for those implementing HTML (still not quite finished, and still very badly styled, but works for most common things).
I use XMetal for all my XHTML editing because it ensures validity and I find it very easy to add, move and change tags and attributes. This article describes how I set up my environment to handle XHTML, in the hope that others might find bits of it useful to get started quickly.
Tips I've picked up and want to remember to help me use XMetal. (Currently covers using it as an XHTML editor.)
Latest delivery: IUC24, Atlanta, Sep 03. A short tutorial explaining how to go about creating XHTML and HTML pages containing text written in the Arabic or Hebrew scripts. It examines how best to achieve the correct effect for these bidirectional scripts using appropriate markup, CSS properties and Unicode code points or entities. It covers the basics, and goes beyond to provide recommended techniques for some of the tricky situations that even native speakers can struggle with. It assumes a basic familiarity with the bidirectional characteristics of Arabic and Hebrew, as well as HTML and CSS.
Last delivered at LISA Forum, London, July 03. Describes the W3C I18N Activity and mentions key highlights of recent and planned work.