$Id: html-authoring.html,v 1.1 2003/02/18 15:29:05 rishida Exp $

Authoring Techniques for XHTML & HTML Internationalization 1.0

W3C Working Draft dd mmmm 2003

This version:
Latest version:
Previous version:
Richard Ishida, W3C <ishida@w3.org>


This document provides HTML authors with techniques for developing internationalized HTML using XHTML 1.0 or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3. The term "author" is used in the sense described by the HTML 4.01 specification, i.e. a person or program that writes or generates HTML documents.

Status of this Document

This document is an editors' copy that has no official standing.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this series of documents is maintained at the W3C.

This is a very early working draft. It is undergoing constant and frequent modification.

This document is published as part of the W3C Internationalization Activity by the Internationalization Working Group, with the help of the Internationalization Interest Group. The Internationalization Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release. Publication as a Working Draft does not imply endorsement by the W3C Membership. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Table of Contents

1 Document structure & metadata
    1.1 Creating an internationalised page header
    1.2 Using link elements
    1.3 International layout considerations
2 Navigation
    2.1 Navigating to the right localised web site
    2.2 Implementing international contact pages
3 Character sets, character encodings and entities
    3.1 Choosing an encoding
    3.2 Specifying the character encoding
    3.3 Referring to specific characters
    3.4 Dealing with undisplayable fonts
4 Specifying the language of content
    4.1 Identifying the primary language
    4.2 Identifying language change
    4.3 Specifying the language of a link destination
    4.4 Specifying language codes
5 Text direction
    5.1 Setting directionality for an entire document in a bidirectional script
    5.2 Changing the directional properties of a part of the text
    5.3 Overriding the Unicode bidirectional algorithm
6 Text markup
    6.1 Emphasis
    6.2 Acronyms & abbreviations
    6.3 Quotations
    6.4 Ruby
7 Lists
    7.1 Implementing language-specific list markers
8 Tables
    8.1 Mirroring tables in bidirectional text
9 Links
    9.1 Including encoding and language information in links
    9.2 Keyboard access to links
10 Objects
    10.1 Determining the runtime locale for an object
    10.2 Dealing with embedded objects with different encodings
11 Images
    11.1 Creating culturally appropriate graphics
    11.2 Using text in graphics
    11.3 Using color
    11.4 Dealing with directional bias in graphics
    11.5 Supplying graphics to the localisation group
12 Multimedia
    12.1 Animation
    12.2 Voice
    12.3 Music
    12.4 Creating culturally appropriate multimedia objects
13 Forms
    13.1 Keyboard access to forms
    13.2 Creating culturally appropriate forms
    13.3 Graphical buttons
    13.4 Dealing with character sets & encodings
14 Keyboard shortcuts
15 Writing source text
    15.1 Text fragmentation and re-use
    15.2 Ordering text
    15.3 Writing clear, understandable text
    15.4 Using metaphors, examples and humour
    15.5 Using abbreviations & acronyms
    15.6 Applying visual style conventions
    15.7 Use of pre text
16 Handling elements that vary by locale
    16.1 Date & time
    16.2 Numbers, currency, measurements, addresses, telephone numbers, personal names, paper sizes...
17 Supplying data for localisation
18 Client-side scripting


A References
    A.1 References

1 Document structure & metadata

2 Navigation

3 Character sets, character encodings and entities

4 Specifying the language of content

5 Text direction

5.1 Setting directionality for an entire document in a bidirectional script

Browser: IE; Version: 5+

In Internet Explorer, adding dir="rtl" to the html tag also moves the scroll bar to the left of the browser window.
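
As an illustration, a minimal sketch of a right-to-left XHTML 1.0 page might look like the following. The Hebrew language tag, title and paragraph text are assumptions chosen for the example, not part of this draft.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- dir="rtl" on the root element sets the base direction of the whole page. -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="he" xml:lang="he" dir="rtl">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>דוגמה</title>
</head>
<body>
  <!-- No dir attribute is needed here: the paragraph inherits rtl from html. -->
  <p>שלום עולם</p>
</body>
</html>
```

Placing the attribute on the html element rather than on body keeps the direction information in one place and lets every element in the page inherit it.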

5.2 Changing the directional properties of a part of the text

If the dir attribute is added to a block element, all subordinate elements inherit the directionality (unless their directionality is changed explicitly using a different value for dir). With dir="rtl", elements and their contents flow from the right of the displayed page towards the left.
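
The inheritance behaviour can be sketched with a fragment such as the one below, which assumes a page that is otherwise left-to-right. The Hebrew and English text and the choice of elements are illustrative only.

```html
<!-- Only this block is right-to-left; the rest of the page is unaffected. -->
<div dir="rtl" lang="he" xml:lang="he">
  <!-- This paragraph inherits rtl from the enclosing div. -->
  <p>פסקה ראשונה</p>
  <!-- The list and its items also inherit rtl. -->
  <ul>
    <li>פריט</li>
  </ul>
  <!-- An explicit dir value overrides the inherited direction. -->
  <p dir="ltr" lang="en" xml:lang="en">An English paragraph.</p>
</div>
```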

At a simple level, the Unicode bidirectional algorithm takes care of the reordering of inline text, but where text of one direction is nested inside text of another, the dir attribute needs to be used.
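
A small sketch of such nesting, assuming an English sentence that quotes a Hebrew phrase which itself ends with the Latin-script word "W3C" (the sentence is an illustration, not taken from this draft):

```html
<!-- The span marks the whole quotation as one right-to-left unit. -->
<p>The title says "<span dir="rtl" lang="he" xml:lang="he">פעילות הבינאום, W3C</span>" in Hebrew.</p>
```

Without the span, the bidirectional algorithm treats "W3C" as part of the surrounding English text and displays it to the right of the Hebrew words. With dir="rtl" on the span, the quotation is laid out as a single right-to-left run, so "W3C" appears at its left end, where a Hebrew reader would expect it.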

5.3 Overriding the Unicode bidirectional algorithm

6 Text markup

7 Lists

8 Tables

9 Links

10 Objects

11 Images

12 Multimedia

13 Forms

14 Keyboard shortcuts

15 Writing source text

16 Handling elements that vary by locale

17 Supplying data for localisation

18 Client-side scripting

A References

A.1 References

Martin J. Dürst, Requirements for String Identity Matching and String Indexing, W3C Working Draft. (See http://www.w3.org/TR/WD-charreq.)
D. Connolly, Character Set Considered Harmful, W3C Note. (See http://www.w3.org/MarkUp/html-spec/charset-harmful.)
Bert Bos, Håkon Wium Lie, Chris Lilley, Ian Jacobs, Eds., Cascading Style Sheets, level 2 (CSS2 Specification), W3C Recommendation. (See http://www.w3.org/TR/REC-CSS2.)
DOM Level 1
Vidur Apparao et al., Document Object Model (DOM) Level 1 Specification, W3C Recommendation. (See http://www.w3.org/TR/REC-DOM-Level-1.)
Ben Chang, Jeroen van Rotterdam, Johnny Stenback, Andy Heninger, Joe Kesselman, Rezaur Rahman Eds., Document Object Model (DOM) Level 3 Abstract Schemas and Load and Save Specification, W3C Working Draft. (See http://www.w3.org/TR/DOM-Level-3-ASLS.)
HTML 4.0
Dave Raggett, Arnaud Le Hors, Ian Jacobs, Eds., HTML 4.0 Specification, W3C Recommendation, 18-Dec-1997. (See http://www.w3.org/TR/REC-html40-971218.)
HTML 4.01
Dave Raggett, Arnaud Le Hors, Ian Jacobs, Eds., HTML 4.01 Specification, W3C Recommendation. (See http://www.w3.org/TR/html401.)
Martin Dürst, Michel Suignard, Internationalized Resource Identifiers (IRIs), Internet-Draft, April 2002. (See http://www.w3.org/International/2002/draft-duerst-iri-00.txt.)
Info URI-I18N
Internationalization: URIs and other identifiers. (See http://www.w3.org/International/O-URL-and-ident.)
ISO/IEC 14651
ISO/IEC 14651:2000, Information technology -- International string ordering and comparison -- Method for comparing character strings and description of the common template tailorable ordering as, from time to time, amended, replaced by a new edition or expanded by the addition of new parts. (See http://www.iso.ch for the latest version.)
ISO/IEC 9541-1
ISO/IEC 9541-1:1991, Information technology -- Font information interchange -- Part 1: Architecture. (See http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=17277 for the latest version.)
David Carlisle, Patrick Ion, Robert Miner, Nico Poppelier, Eds., Mathematical Markup Language (MathML) Version 2.0, W3C Recommendation. (See http://www.w3.org/TR/MathML2.)
Gavin Nicol, The Multilingual World Wide Web, Chapter 2: The WWW As A Multilingual Application. (See http://www.mind-to-mind.com/library/papers/multilingual/multilingual-www.html.)
RFC 2070
F. Yergeau, G. Nicol, G. Adams, M. Dürst, Internationalization of the Hypertext Markup Language, IETF RFC 2070, January 1997. (See http://www.ietf.org/rfc/rfc2070.txt.)
RFC 2277
H. Alvestrand, IETF Policy on Character Sets and Languages, IETF RFC 2277, BCP 18, January 1998. (See http://www.ietf.org/rfc/rfc2277.txt.)
RFC 2279
F. Yergeau, UTF-8, a transformation format of ISO 10646, IETF RFC 2279, January 1998. (See http://www.ietf.org/rfc/rfc2279.txt.)
RFC 2718
L. Masinter, H. Alvestrand, D. Zigmond, R. Petke, Guidelines for new URL Schemes, IETF RFC 2718, November 1999. (See http://www.ietf.org/rfc/rfc2718.txt.)
RFC 2781
P. Hoffman, F. Yergeau, UTF-16, an encoding of ISO 10646, IETF RFC 2781, February 2000. (See http://www.ietf.org/rfc/rfc2781.txt.)
SPREAD - Standardization Project for East Asian Documents Universal Public Entity Set. (See http://www.ascc.net/xml/resource/entities/index.html.)
Jon Ferraiolo, Ed., Scalable Vector Graphics (SVG) 1.0 Specification, W3C Recommendation. (See http://www.w3.org/TR/SVG.)
UTR #10
Mark Davis, Ken Whistler, Unicode Collation Algorithm, Unicode Technical Report #10. (See http://www.unicode.org/unicode/reports/tr10.)
UTR #17
Ken Whistler, Mark Davis, Character Encoding Model, Unicode Technical Report #17. (See http://www.unicode.org/unicode/reports/tr17.)
Martin Dürst and Asmus Freytag, Unicode in XML and other Markup Languages, Unicode Technical Report #20 and W3C Note. (See http://www.w3.org/TR/unicode-xml.)
Steve DeRose, Eve Maler, David Orchard, Eds, XML Linking Language (XLink) Version 1.0, W3C Recommendation. (See http://www.w3.org/TR/xlink.)
XML 1.0
Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, Eds., Extensible Markup Language (XML) 1.0, W3C Recommendation. (See http://www.w3.org/TR/REC-xml.)
XML Schema-2
Paul V. Biron, Ashok Malhotra, Eds., XML Schema Part 2: Datatypes, W3C Recommendation. (See http://www.w3.org/TR/xmlschema-2.)
XML Japanese Profile
MURATA Makoto Ed., XML Japanese Profile, W3C Note. (See http://www.w3.org/TR/japanese-xml.)
James Clark, Steve DeRose, Eds, XML Path Language (XPath) Version 1.0, W3C Recommendation. (See http://www.w3.org/TR/xpath.)
XQuery Operators
Ashok Malhotra, Jim Melton, Jonathan Robie, Norman Walsh, Eds, XQuery 1.0 and XPath 2.0 Functions and Operators, W3C Working Draft. (See http://www.w3.org/TR/xquery-operators.)
James Clark Ed., XSL Transformations (XSLT), W3C Recommendation. (See http://www.w3.org/TR/xslt.)