Notes on use of the i18n techniques DTD

This document provides a description of the extensions to xmlspec developed specifically to support the creation of i18n techniques. NOTE: this is all subject to constant change.

Since this incorporates nearly all the changes made for the Character Model for the World Wide Web 1.0, for the sake of completeness I have also added a description of elements used in the Character Model but not for the techniques documents.

General points

The current approach to document development in the i18n GEO task force uses an XML file referred to as a 'repository' to house a collection of 'techniques'. Each technique is a tersely-stated directive similar in level to the WAI techniques, plus accompanying descriptions, explanations, references, etc. A repository is likely to be focused on a specific W3C technology, eg. there are different repositories for HTML and CSS. Documents to be read by the user are created as 'views'. These views provide headings and structure to meet the needs of the user, and at an appropriate level pull in content from a repository. A view may repeat and reorder information in a repository, and may pull in information from several repositories, eg. a view aimed at HTML authors may pull content from HTML, CSS and Core repositories.

Both repository and view files currently use the same DTD, but some choices regarding structure are different between the two.

A repository file should normally just have 2 levels of heading in the body of the text. Sections should only contain geo-technique elements. Do not use the resource-list element in a repository file. Resource information should be attached to each technique - not provided at the section level.

A view may contain whatever is necessary to assist the user. This may include, for instance, introductory paragraphs at the beginning of sections that are authored in the XML file for the view itself, rather than a repository. Do not use the resources element in a view file. Resource information is gathered from the techniques in a given section by the XSLT, and grouped (without redundancy) at the section level under the resource-list element.

As of 10 June 2003, there is one (HTML) repository, and one view (Authoring International HTML). Using XSLT on the view file, it is also currently possible to generate two alternate views containing only outline and resource information. No authoring is involved in this process.

As of 9 July 2003 there is a single DTD in use to support techniques documents for both GEO and WCAG groups. Since the constituents of a technique contain different required elements for these two groups, the former technique element was split into a geo-technique and a wai-technique element. Most of the remaining structural and phrasal elements are shared, although there are some elements that are used only by GEO, and vice versa. These should be identified in the DTD code. This document describes only the elements needed for GEO techniques.

Additional elements and attributes

The following tables describe elements and attributes that have been added to xmlspec and that are expected to be of use for techniques development. Bolded attributes are required. Assume that attribute values are CDATA unless otherwise indicated.

Structural elements

element   description   used by
geo-technique

A grouping element for information related to a technique

(short-name, checklist-item+, description, ua-issues*, resources*, test*, admin*, scripts?, ua-applicability)

  • @id: a unique id to identify the technique (I currently generate these through scripting as <authors-initials><date yyyymmdd><time to milliseconds>)

TBD: remove the * from ua-issues and resources (and maybe also from test and admin?)

Used by: %div.mix;
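
As a rough illustration of the content model above, a minimal geo-technique might look like the following sketch. The id value and all of the text are made up, and the use of p inside checklist-item and description is an assumption (the exact contents of %hdr.mix; and %obj.mix; are defined in the DTD, not in these notes). The ua-issues, resources, test and admin children are left out, which the current content model allows.

```xml
<!-- Hypothetical technique; the id follows the initials + yyyymmdd + time pattern described above -->
<geo-technique id="ri20030610143025123">
  <short-name>Declare the character encoding</short-name>
  <checklist-item>
    <p>Always declare the character encoding of the page.</p>
  </checklist-item>
  <description>
    <p>Explanatory paragraphs and examples for the checklist item go here.</p>
  </description>
  <scripts><all/></scripts>
  <ua-applicability>
    <ie version="Y"/>
    <nn version="Y"/>
    <op version="Y"/>
  </ua-applicability>
</geo-technique>
```
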
short-name

A brief summary of the content of a technique. It is visible in the HTML version of the repository for quick scanning. It is not visible in the view, but it is used as the content of include elements to clarify what they point to.

(%head.pcd.mix;)*

  • %common.att;
Used by: geo-technique
checklist-item

The do or don't text that summarises a technique. This should normally be one paragraph (or even one sentence if possible), but it is also possible to use a list to combine two inseparable statements in a single checklist-item.

(%hdr.mix;)*

  • %common.att;
Used by: geo-technique
description

Groups paragraphs, examples, and other material describing the checklist-item associated with a particular technique.

(%obj.mix;)*

  • %common.att;
Used by: geo-technique
ua-issues

groups a number of ua-issue elements

(ua-issue)+

  • %common.att;
Used by: geo-technique
ua-issue

Describes a feature with respect to user agent support for a particular technique and a particular browser.

(%obj.mix;)*

  • @name: the name of the browser
  • @version: the version(s) of the browser that exhibit(s) this feature
Used by: ua-issues
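
A sketch of how ua-issues and ua-issue might be used inside a technique is shown below. The browser names, version strings and text are illustrative only; these notes do not prescribe a format for the name and version values, and the p children assume that p is part of %obj.mix;.

```xml
<ua-issues>
  <!-- name and version values are illustrative only -->
  <ua-issue name="Internet Explorer" version="5.5 and earlier">
    <p>Describe here how this browser departs from what the technique recommends.</p>
  </ua-issue>
  <ua-issue name="Opera" version="6">
    <p>A second issue, for a different browser.</p>
  </ua-issue>
</ua-issues>
```
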
resources

This element should be used only in repository files.

Groups a number of pointers to additional resources associated with a given technique.

(resource)*

  • %common.att;

TBD: Note that the above is not totally truthful. The real content model is (resource*, (see-also | tech-source)*), but I have phased out the see-also and tech-source elements in favour of resource elements with appropriate resource-type attributes.

Used by: geo-technique
resource

A single pointer to other useful information supporting the description of the technique. The type of resource is indicated by the resource-type attribute. Different types of resource can be specified in any order (they will be grouped and ordered later by the XSLT).

The bibref element must point to one of the entries in the refs.xml file. If this is a simple web page that doesn't merit a unique entry in the refs.xml file, point to the URL entry in refs.xml. If you have not yet added the entry you can temporarily point to the TBD entry in the refs.xml file.

The content of the loc element (not described here because it is a standard xmlspec element) should specify the location of the resource as precisely as possible to help the reader. When pointing to large resources, such information may include the chapter heading and number, the page number (in a book), an indication of which paragraph to look for, etc. For small web pages, this can be the title of the page.

(bibref, loc, resource-descn)

  • @resource-type (source | background | implementation | other): identifies the type of resource. This information will be used for grouping and ordering resources in the XSLT-generated view data.
      • source: sources for the information included in a technique.
      • background: useful background reading related to this technique.
      • implementation: information, such as instructions, that helps you take the next step and implement the recommendation in the technique.
      • other: a catch-all, which may be split into other categories in the future.
  • @id: a unique id for this resource pointer

TBD: Consider whether other should be split into additional categories. For example, should the current xref element become <resource resource-type="xref">?

Used by: resources, resource-list
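
The following sketch shows what a resources block in a repository might look like under the content models above. The ref and href attributes are those of the standard xmlspec bibref and loc elements. The ids, URLs and text are hypothetical, and it is an assumption that the generic entries in refs.xml are identified literally as URL and TBD.

```xml
<resources>
  <!-- "example-spec" is a made-up refs.xml id; URL stands for the generic entry described above -->
  <resource resource-type="source" id="res-001">
    <bibref ref="example-spec"/>
    <loc href="http://www.example.org/spec/chapter5.html">Chapter 5, second paragraph</loc>
    <resource-descn>The specification text on which this technique is based.</resource-descn>
  </resource>
  <resource resource-type="background" id="res-002">
    <bibref ref="URL"/>
    <loc href="http://www.example.org/encodings">Choosing an encoding</loc>
    <resource-descn>An introductory article, useful as background reading.</resource-descn>
  </resource>
</resources>
```

Because the XSLT groups and orders resources by resource-type, the order in which the resource elements are written here should not matter.
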
resource-list

For use in the view only. Used at the end of a section.

Its presence indicates that you wish to create a list of resources at the end of the section. It may also contain cross-references to other sections in the view. Do not include any resources or resource elements in this element.

(xref-list?, resource*, (%obj.mix;)*)

  • no attributes

TBD: Remove everything but the xref-list.

TBD: Ensure that an empty one of these triggers the XSLT to include the resources in the repository.

Used by: %local.div.mix;
xref-list

Used only in views.

Groups a number of cross-references.

(xref+)

  • no attributes
xref

A cross-reference to another section in a view. These are currently expressed typically as "What if such-and-such is the case?" or "How do I do such-and-such?".

(resource-descn, specref+)

  • no attributes
Used by: xref-list
resource-descn

Describes the target of a link.

In an xref element, these are currently expressed typically as "What if such-and-such is the case?" or "How do I do such-and-such?".

(%p.pcd.mix;)*

  • no attributes

TBD: Should probably have common attributes.

TBD: I can't remember whether different resource-descn content across techniques will create multiple pointers after XSLT processing. Need to check it out.

Used by: xref, resource
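
To show how resource-list, xref-list, xref and resource-descn fit together in a view, here is a small sketch. specref is the standard xmlspec section reference; the section ids and the question text are made up for illustration.

```xml
<resource-list>
  <xref-list>
    <xref>
      <resource-descn>How do I declare the language of a page?</resource-descn>
      <!-- "sec-language" is a hypothetical id of another section in the view -->
      <specref ref="sec-language"/>
    </xref>
    <xref>
      <resource-descn>What if the page contains right-to-left text?</resource-descn>
      <specref ref="sec-bidi"/>
    </xref>
  </xref-list>
</resource-list>
```
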
scripts

Indicates scripts for which this technique is relevant.

Choose either [a] 'all' or [b] 'na' or [c] one or more script types.

( all | na | (latin | alpha | arabic | hebrew | fareast | seasia | indic )+)

  • no attributes
Used by: geo-technique
all / na / latin / alpha / arabic / hebrew / fareast / seasia / indic

Values for the scripts element. These are elements rather than attribute values so that it is easy to ensure that only valid alternatives are being used. The element alpha represents Latin plus other alphabetic scripts, such as Greek, Cyrillic, Armenian, Cherokee, etc. fareast represents Chinese, Japanese and Korean scripts. seasia refers to Thai and similar scripts.

EMPTY

  • no attributes
Used by: scripts
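
The three alternatives allowed by the scripts content model could be written as in the sketch below. A technique would contain only one of these; reading na as 'not applicable' is an assumption.

```xml
<!-- [a] relevant to all scripts -->
<scripts><all/></scripts>

<!-- [b] na: assumed here to mean that scripts are not a relevant consideration -->
<scripts><na/></scripts>

<!-- [c] relevant to one or more specific groups of scripts -->
<scripts><arabic/><hebrew/></scripts>
```
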
ua-applicability

Provides information about which user agents, and which versions of those, support the recommendation in the technique.

This element should contain only one instance of any particular subelement. If there is no information about a particular user agent, simply leave out the subelement for that user agent.

(ie|nn|op)*

  • no attributes
Used by: geo-technique
ie / nn / op

Each element represents a user agent, and if present indicates that the technique is applicable to that user agent, or a specific version of that user agent.

ie stands for Internet Explorer (Windows). nn stands for Netscape Navigator. op stands for Opera. Other elements may be added as information becomes available (eg. ie-mac, safari, etc.).

EMPTY

  • @version: one of the following values.
      • 'Y' if the earliest version of that browser we are concerned with supports the recommendation of the technique, or if the technique is not browser specific.
      • '-' if the browser does not support this recommendation.
      • A number if only a later version of the browser (above the earliest versions we are concerned with) supports the recommendation (eg. '6' as the version for an ie element indicates that this is supported only for IE version 6 and above).
Used by: ua-applicability
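
The three kinds of version value can be combined as in this sketch, which describes a hypothetical technique and is not a statement about real browser support.

```xml
<ua-applicability>
  <!-- supported from the earliest IE version under consideration -->
  <ie version="Y"/>
  <!-- supported only from Netscape Navigator 6 onwards -->
  <nn version="6"/>
  <!-- not supported by Opera -->
  <op version="-"/>
</ua-applicability>
```
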
test

Copied from WCAG but currently not used by GEO.

Used by: resources
admin

Copied from WCAG but currently not used by GEO.

Used by: resources

Grouping and block elements

element   description   used by
figure

Associates the table or image with a caption such that it can be manipulated as a unit by the stylesheet. Even without a caption, this is useful for recognition as a block with figure-specific positioning rules.

((table | image) , caption?)

  • %common.att;
Used by: %illus.class
image

This definition of image uses an element for the alt information. That allows the alt text to be tagged for bidi, language, etc. and makes translation easier. Note that this is intended to be used in-line.

(img, alt)

  • no attributes
Used by: %illus.class
img

Displayed graphic. Replacement for the graphic element - removes the alt attribute (see image).

EMPTY

  • %common.att;
  • %simple-xlink.att;
  • %auto-embed.att;
  • @source: URI for the image file
  • @height: the height of the image
  • @width: the width of the image
Used by: image
alt

Text describing an image. Having the text in a separate element rather than an attribute a la HTML means that it can be given meta information such as lang, bidi, etc.

(%obj.mix;)*

  • %common.att;
Used by: image
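
A sketch of figure, image, img and alt used together follows. The file name, dimensions and text are placeholders. The p inside alt assumes that p is part of %obj.mix;, and the xml:lang attribute on alt assumes that a language attribute is available there, as the description of alt implies. The optional caption is omitted because its content model is not described in these notes.

```xml
<figure>
  <image>
    <!-- file name and dimensions are placeholders -->
    <img source="images/bidi-example.png" height="120" width="300"/>
    <!-- xml:lang on alt is an assumption, see above -->
    <alt xml:lang="en">
      <p>A line of Hebrew text containing an English phrase.</p>
    </alt>
  </image>
</figure>
```
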
include

Used to point to text in the repository that will be included at this point in the view.

(%head.pcd.mix;)*

  • %common.att;
  • @doc: the document containing the text to be included
  • @idref: the id of the element containing the text to be included
  • @mode: (technique|rule|link) There may be additional alternatives here in time. The mode setting tells the XSLT what type of information is required. The specific items of information extracted, and the structure given to them, depend on the XSLT template that corresponds to a match such as include[@mode='technique']. Currently, the HTML Authoring view will not extract short-name or resource information for inclusion in the view at the point where the include element appears with mode set to technique. The rule mode would extract just the checklist-item; it is not yet used, but may soon be used to create a checklist-type view. The link mode is currently used for resources in a resource list, although many such resources are still just cut and pasted at the moment.
Used by: %local.div.mix; %local.obj.mix;
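
The sketch below shows how a section of a view might combine locally authored text, an include element and a resource-list. div1, head and p are standard xmlspec elements. The file name, ids and text are hypothetical, and whether an empty resource-list is enough to trigger the gathering of resources is still listed as a TBD above.

```xml
<div1 id="sec-encoding">
  <head>Character encoding</head>
  <!-- introductory text authored in the view itself -->
  <p>This section explains how to declare the encoding of a page.</p>
  <!-- pulls a technique from the repository; the doc and idref values are made up -->
  <include doc="html-techniques.xml" idref="ri20030610143025123" mode="technique">Declare the character encoding</include>
  <!-- asks for the resources of the techniques in this section to be listed here -->
  <resource-list/>
</div1>
```

The text inside include mirrors the short-name of the target technique, so that someone editing the view can see what the idref points to.
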

Phrase elements

element   description   used by
bdo

Used to override the Unicode bidi algorithm for bidirectional scripts. Very useful for conversion of Hebrew and Arabic text to X/HTML.

(%p.pcd.mix;)*

  • @xml:lang
  • @dir: (ltr|rtl) The intended directionality.
Used by: %local.emph.class
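
As an illustration, bdo could be used to show Hebrew letters in the order in which they are stored rather than the order in which they are normally displayed. The sentence is made up; dir="ltr" forces the letters to be laid out from left to right.

```xml
<p>In memory, the letters of the word are stored in the order
<bdo dir="ltr" xml:lang="he">שלום</bdo>.</p>
```
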
rfc2119

Not used yet, but we may use it to surround words like MUST, SHOULD, etc. used for conformance requirements. Permits various alternatives for styling. (Used widely in CharMod)

(%tech.pcd.mix;)*

  • %common.att;
Used by: %tech.class
qterm

Words or short phrases being referred to / quoted, eg. "The word 'character' is used...". Do not use quote marks! These are added, if required, by the stylesheet.

(%tech.pcd.mix;)*

  • %common.att;
Used by: %tech.class
qchar

One or a short sequence of characters/letters being referred to / quoted, eg. "The letter 'c' is the third...". Also serves for keyboard keys. Do not use quote marks! These are added, if required, by the stylesheet.

(%tech.pcd.mix;)*

  • %common.att;
Used by: image
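
A short sketch of qterm and qchar in running text, based on the examples quoted in the descriptions above. Note that no quote marks are typed; the sentences themselves are only illustrative.

```xml
<p>The word <qterm>character</qterm> is used here in its Unicode sense.
The letter <qchar>c</qchar> is the third letter of the Latin alphabet.
Press <qchar>Enter</qchar> to continue.</p>
```
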
abbr

Surrounds abbreviations and acronyms and stores the expansion of the abbreviation (which could be shown as say a tooltip, or pronounced by a voice synthesiser).

(%tech.pcd.mix;)*

  • %common.att;
  • @expansion: the expanded form of the abbreviation.
Used by: %termdef.class
uname

Surrounds Unicode character names to allow styling, eg. COMBINING LONG SOLIDUS OVERLAY.

(%tech.pcd.mix;)*

  • %common.att;
Used by: %tech.class
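
The abbr and uname elements might appear in a paragraph as in this sketch. The sentences are made up for illustration.

```xml
<p>Support varies from one
<abbr expansion="user agent">UA</abbr> to another.
The overlay is encoded as U+0338 <uname>COMBINING LONG SOLIDUS OVERLAY</uname>.</p>
```
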
ins

ins and del elements are for marking changes to the document. They were added here rather than using the @diff scheme provided by xmlspec because they were easier (quicker) to use for marking up phrase-level text, given the process in use for editing the Character Model (if using XMetal, just highlight the appropriate text and double-click the element in the element list). Using XSLT, the elements were mapped very simply to elements of the same name in XHTML.

(%p.pcd.mix;)*

  • %common.att;
Used by: %local.emph.class
del

See the description for ins above.

(%p.pcd.mix;)*

  • %common.att;
Used by: %local.emph.class
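
A phrase-level change marked with del and ins might look like this; the wording of the edit is invented for illustration.

```xml
<p>A repository is likely to be focused on a
<del>single</del><ins>specific</ins> W3C technology.</p>
```
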

Additional changes made for CharMod

element   description   used by
req

The req and req-list elements are used to highlight the actual requirements embedded in the (flowing) text. Each requirement is associated with one or more of S, I, or C (specifications, implementations, or content) to indicate relevance. This is currently rendered with a yellow background, with relevance indicated by the appropriate letters (S, I, or C) enclosed in square brackets at the beginning.

(req-type+ , req-text )

  • No attributes.
Used by: %tech.class
reqlist

See the description of req above.

(req-type+ , req-text , (%list.class;)?)

  • No attributes.
Used by: %div.mix
req-text

The text of a requirement.

(%p.pcd.mix;)*

  • %common.att;
Used by: req, req-list
req-type

One of the letters 'S', 'I' or 'C' - standing for specifications, implementations and content, respectively.

(%tech.pcd.mix;)*

  • %common.att;
Used by: req, req-list
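
An inline requirement aimed at specifications and implementations could be marked up roughly as follows. The requirement text is invented, and the use of rfc2119 inside req-text assumes that %p.pcd.mix; includes %tech.class, as it does in standard xmlspec.

```xml
<p>Text may arrive in several encodings.
  <req>
    <req-type>S</req-type>
    <req-type>I</req-type>
    <req-text>Specifications and implementations <rfc2119>MUST</rfc2119> state which character encodings they support.</req-text>
  </req>
</p>
```
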

Other changes to xmlspec


Things still to be done



Richard Ishida, 7 feb 2003