Authoring Techniques for XHTML & HTML Internationalization 1.0 -- (Editors' copy)

1 Introduction

1.1 Who should use this document

This document is intended for all HTML content authors working with XHTML 1.0, HTML 4.01, CSS1, CSS2 and CSS3.

The term "author" is used in the sense described by the HTML 4.01 specification, i.e. a person or program that writes or generates HTML documents.

This document provides guidance on developing HTML so that it supports international usage. This is the responsibility of all content authors, not just the localization group, and is relevant from the very start of development. Ignoring the advice in this document, or relegating it to a later phase of development, will only add unnecessary cost and resource problems later.

1.2 How to use this document

To improve usability, the table of contents of this document represents tasks that a developer of XHTML/HTML content may want to perform.

It is expected that this document will normally be used for reference, with the reader dipping into a particular section to find out how to perform a specific task with internationalization in mind. If you are new to this topic, however, you may wish to read this document from end to end.

To further assist usability as a reference, an outline version of the document is available. There is also a version that contains only resource links. The reader can switch between outline, resource, and detailed versions by clicking on icons alongside section headings.

Note that, to support its use as a quick reference, the same material will occasionally be repeated in more than one section.

Cross references and further resources are summarized at the end of each section.

Editorial notes have been left in this version of the document. These are marked [Ed. note: like this].

It is assumed that readers of this document are proficient in developing HTML and XHTML pages; this document limits itself to providing internationalization advice.

1.3 Standards addressed

This document provides techniques for developing pages using HTML 4.01 or XHTML 1.0 with CSS1, CSS2 and CSS3.

Note that XHTML source can be served as XML (using MIME types application/xhtml+xml, application/xml or text/xml) or HTML (using the MIME type text/html).

It is very common for XHTML to be served as HTML, following the compatibility guidelines in Appendix C of the XHTML 1.0 specification. This allows authors with the right editing tools to produce valid XML code, which therefore lends itself to processing with such things as scripting or XSLT, but is also well supported for display by most mainstream browsers. (XHTML served as application/xhtml+xml is not well supported for browser display at the moment.) In this document we wish to reflect practical reality for content authors, so we cover XHTML served as text/html in the techniques.

Indeed we encourage the use of XHTML, and all the examples (unless trying to make a specific point about HTML 4.01) are written in XHTML.

For XHTML served as XML, this document limits its advice to documents served as application/xhtml+xml. Note that user agent support for XHTML served as XML is still patchy.

1.4 User agents addressed

To make this information more useful to the reader, we try to ground each technique in information about its applicability to particular user agents.

In this version of the document, "user agents" means a number of mainstream browsers. (The scope may grow as resources and test results become available for other user agents.)

To make the task of tracking browser applicability manageable, we have chosen a 'base version' for each of the user agents we are tracking. This base version represents a fairly recent, standards-compliant version of the browser. Where a browser operates in both standards mode and quirks mode, standards mode is assumed (i.e. you should use a DOCTYPE statement).
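
As an illustration of the DOCTYPE statement mentioned above, the following is a minimal sketch of an HTML 4.01 page that begins with a full DOCTYPE, including the system identifier (the DTD URL). The choice of the Strict DTD, the page title and the body text are assumptions made for the example; this document does not prescribe a particular DTD. Nothing is placed before the DOCTYPE, since some browsers treat preceding content as a reason to fall back to quirks mode.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<!-- Hypothetical example page; only the DOCTYPE above is the point here. -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Example page</title>
</head>
<body>
<p>Example content.</p>
</body>
</html>
```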

The base versions considered for this version of the document include:

Internet Explorer 6 (Windows)
Netscape Navigator 7
Opera 7

If the technique is applicable to a base version of a user agent, the name of that user agent will appear immediately below the summary of the technique. If the technique is not applicable, the name will appear crossed out. If the name does not appear at all, further investigation is needed. If the technique is applicable only to a version later than the chosen base version, this will be indicated by adding the version number to the name.

Plans exist to provide information relating to the following additional user agents as work on the document progresses:

Internet Explorer 5 (Mac)
Safari
Mozilla

Detailed information may also be provided from time to time about behavior of a user agent in an earlier version than the base version, or about some particular aspect of the behavior of a base version or later user agent. This is provided in a special boxed section within the body of the text.

2 Document structure & metadata

2.1 Internationalizing the page header

Creating an internationalized page header principally consists of declaring the encoding and language of the document.

For HTML documents and XHTML documents served as text/html, always use the meta element to explicitly declare the document's character encoding.
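
A minimal sketch of such a page header for XHTML 1.0 served as text/html is shown here. The UTF-8 encoding, the language value "en", the Strict DTD and the page title are all assumptions chosen for illustration; substitute the encoding and language your document actually uses. The language is declared with both the lang and xml:lang attributes, following the compatibility guidelines in Appendix C of the XHTML 1.0 specification.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<!-- Declare the character encoding as early as possible in the head,
     before any non-ASCII text such as the title. -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Example page</title>
</head>
<body>
<p>Example content.</p>
</body>
</html>
```

The encoding named in the meta element should match the encoding in which the file is actually saved and, where one is sent, the charset given in the HTTP Content-Type header.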
