Notes on use of the xmlspec-i18n DTD

This document describes the extensions to xmlspec developed specifically to support the creation of i18n documents. The DTD developed for techniques document development (xmlspec-tech.dtd) has been revised to incorporate version 2.9 of xmlspec and additional markup from the Internationalization Tag Set (ITS) 1.0 specification. It is now available for use with new documents by all working groups in the i18n Activity.

NOTE: this is all subject to constant change.

The xmlspec-i18n DTD contains some extensions that are specific to techniques documents as developed by the GEO WG, and others that are specific to Character Model documents developed by the Core WG. An attempt has been made below to clarify which is which. Note that the choice of elements used for a particular document rests with the author. There are other elements and additions that are valid for use in all types of document.

General points

This DTD uses markup declarations defined within the Internationalization Tag Set (ITS) 1.0 specification. The usage of this markup is described in the Best Practices for XML Internationalization document.

For further information about the usage of this DTD, see also the document Styleguide for i18n specifications.

Setting language

The language of W3C specifications is en-US. This should be set on the spec element using xml:lang, as well as in the langusage/language element. The latter is not used by the XSLT.
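As a minimal sketch, the language settings described above might look like this in the document source. The header content is a placeholder, and the id value on the language element is an assumption about how the language is identified there.

```xml
<spec xml:lang="en-US">
  <header>
    <!-- title, dates, authors, abstract, etc. go here -->
    <langusage>
      <!-- Not used by the XSLT, but set for completeness -->
      <language id="en-US">English</language>
    </langusage>
  </header>
  <body>
    <!-- div1 sections go here -->
  </body>
</spec>
```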

Editors-copy setting

You should set spec/@role to editors-copy while working on a document. This introduces, in various places, text and JavaScript appropriate to a document that is being edited. Remove this value before publication and the added text and scripts will disappear.
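The following fragment sketches the two states of the spec start tag. The xml:lang value is carried over from the language setting above; everything else is a placeholder.

```xml
<!-- While editing: the role value triggers the editor's copy text and scripts -->
<spec xml:lang="en-US" role="editors-copy">
  <!-- header and body as usual -->
</spec>

<!-- Before publication: the role value has been removed -->
<spec xml:lang="en-US">
  <!-- header and body as usual -->
</spec>
```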

Components in a set of techniques documents

Techniques documents of the i18n GEO Working Group are divided into subject areas, and can be read from end to end if desired (this is recommended for newcomers). The techniques document contains a number of techniques that have the same structural components. Each technique contains a tersely-stated directive similar in level to the WAI techniques, plus accompanying descriptions, explanations, references, etc.

In addition, there is an outline document covering techniques for a given set of technologies that lists the terse directives and summary information about user agent support, grouped into sections that are related to tasks that the user is likely to want to undertake. These outline documents may repeat the same directives in more than one section. The user can select a directive to link to more detailed information in the relevant techniques document.

Finally, there is a document with the same structure as each outline document that contains just resource information related to particular techniques. It is possible to link between the outline and the resources documents.

The outline and resources documents are generated using XSLT from a template and the XML form of the techniques document.

As of 9 July 2003 there was a single DTD in use to support techniques documents for both GEO and WCAG groups. Since the constituents of a technique contain different required elements for these two groups, the former technique element was split into a geo-technique and a wai-technique element. Most of the remaining structural and phrasal elements are shared, although there are some elements that are used only by GEO, and vice versa. These should be identified in the DTD code. This document describes only the elements needed for GEO techniques.

Element list

The following are elements in xmlspec that it is ok to use inside a div1. Although a validating editor will make other choices available to you, those choices may need special setup in the XSLT or CSS style sheet to work well, or may be deprecated for use with i18n activity documents for other reasons. If you wish to use something that is not on this list, please place a request with Richard Ishida.

Elements that are native to xmlspec are shown on the left. Note that some special combinations of element plus attribute are also included (such as emph[@role="strong"]). Elements introduced to the DTD by the i18n activity are shown to the right. These are described further down this page.

Structural elements

Type | xmlspec | i18n dtd
Sections | div1, div2, div3, div1[@role="unfiltered"] *, head |
Techniques | | geo-technique, short-name, checklist-item, description, ua-issues, ua-issue, resources, resource, xref, resource-descn, ua-applicability, ua
Requirements | | req, reqlist, req-text, req-type

Notes:

div1[@role="unfiltered"]
Used when the bibliography contains more entries than are used in the actual document. This is particularly useful when bibliographies are centralised in external entity references. The XSLT searches the document for bibrefs that point to bibliographic entries and discards the rest.
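A sketch of how such a bibliography section might look follows. The ids, keys, URLs and entry text are hypothetical, and the use of blist and bibl with key and href attributes assumes standard xmlspec markup.

```xml
<!-- Somewhere in the body: the only bibref in the document -->
<p>See <bibref ref="example-cited"/> for details.</p>

<!-- In the back matter: a bibliography that holds more than is cited -->
<div1 id="sec-references" role="unfiltered">
  <head>References</head>
  <blist>
    <!-- Kept: a bibref points to this entry -->
    <bibl id="example-cited" key="Example Cited"
          href="http://www.example.org/cited">Placeholder entry that the text cites.</bibl>
    <!-- Discarded by the XSLT: no bibref points to this entry -->
    <bibl id="example-unused" key="Example Unused"
          href="http://www.example.org/unused">Placeholder entry that the text never cites.</bibl>
  </blist>
</div1>
```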

Grouping and block elements

Type | xmlspec | i18n dtd
General blocks | p, note, example, eg *, caption | figure
Lists | olist, ulist, item |
Tables | table *, tbody, th, td |
Images | | image *, img, alt
Bibliography | bibl |

Notes:

eg
Should only be used within an example element when 'pre'-like behaviour is absolutely necessary. Otherwise use the code element, with breaks or within paragraphs.

table
Consider using tables within figure elements. This makes it possible to autonumber the table and to refer or link to it from the text in a standard and easy way.

image
Beware: don't use the graphic element! Use image (with img and alt) instead.
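For illustration, a table wrapped in a figure might be marked up as below. This is a sketch based on the content model given for figure later in this document; the id and all cell text are placeholders.

```xml
<figure id="fig-example-table">
  <table>
    <tbody>
      <tr><th>Column one</th><th>Column two</th></tr>
      <tr><td>Placeholder cell</td><td>Placeholder cell</td></tr>
    </tbody>
  </table>
  <!-- The caption follows the table and is optional -->
  <caption>Placeholder caption for the table.</caption>
</figure>
```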

Inline elements

Type | xmlspec | i18n dtd
All-purpose | phrase |
Terms | termdef, term |
Emphasis | emph, emph[@role="strong"] * |
Quoting & demarcating | quote, kw, code | qterm, qchar, abbr, uname
Linking | loc, bibref, specref, termref, titleref, xspecref, xtermref |
Special | sub, sup |
Editorial | ednote, edtext | ins, del

Notes:

emph[@role="strong"] *
This allows you to specify stronger emphasis. It will result in use of the strong tag in the generated XHTML.
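A short sketch of the two levels of emphasis, with placeholder sentence text:

```xml
<p>Use <emph>ordinary emphasis</emph> for most cases, and
<emph role="strong">stronger emphasis</emph> only where it is needed.</p>
<!-- The second emph is the one rendered with the strong tag in the XHTML -->
```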

I18n-specific elements and attributes

The following tables describe elements and attributes that have been added to xmlspec.

Strongly emphasised attributes are required. Assume that attribute values are CDATA unless otherwise indicated. After the description the content model of the element is shown, followed by descriptions of any attributes.

General-use grouping and block elements

figure
Description: Associates the table or image with a caption such that it can be manipulated as a unit by the stylesheet. Even without a caption, this is useful for recognition as a block with figure-specific positioning rules.
Content model: ((table | image) , caption?)
Attributes:
  • %common.att;
Used by: %local.illus.class

image
Description: This definition of image uses an element for the alt information. That allows the alt text to be tagged for bidi, language, etc. and makes translation easier. Note that this is intended to be used in-line.
Content model: (img, alt)
Attributes: none
Used by: %local.illus.class

img
Description: Displayed graphic. Replacement for the graphic element - removes the alt attribute (see image).
Content model: EMPTY
Attributes:
  • %common.att;
  • %simple-xlink.att;
  • %auto-embed.att;
  • @source: uri for the image file
  • @height: the height of the image
  • @width: the width of the image
Used by: image

alt
Description: Text describing an image. Having the text in a separate element rather than an attribute a la HTML means that it can be given meta information such as lang, bidi, etc.
Content model: (%obj.mix;)*
Attributes:
  • %common.att;
Used by: image
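The following sketch puts figure, image, img and alt together. The file name, dimensions and text are hypothetical. Wrapping the alt text in a p is an assumption based on the content model shown for alt, (%obj.mix;)*, so check it against the DTD before relying on it.

```xml
<figure id="fig-example-image">
  <image>
    <!-- img is empty; the alt text lives in its own element -->
    <img source="images/example.png" height="120" width="400"/>
    <alt xml:lang="en-US">
      <p>Placeholder description of what the picture shows.</p>
    </alt>
  </image>
  <caption>Placeholder caption for the image.</caption>
</figure>
```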

General-use inline elements

qterm
Description: Words or short phrases being referred to or quoted, e.g. "The word 'character' is used...". Do not use quote marks! These are added, if required, by the stylesheet.
Content model: (%tech.pcd.mix;)*
Attributes:
  • %common.att;
Used by: %local.tech.class

qchar
Description: One character/letter, or a short sequence of them, being referred to or quoted, e.g. "The letter 'c' is the third...". Also serves for keyboard keys. Do not use quote marks! These are added, if required, by the stylesheet.
Content model: (%tech.pcd.mix;)*
Attributes:
  • %common.att;
Used by: %local.tech.class

abbr
Description: Surrounds abbreviations and acronyms and stores the expansion of the abbreviation (which could be shown as, say, a tooltip, or pronounced by a voice synthesiser).
Content model: (%tech.pcd.mix;)*
Attributes:
  • %common.att;
  • @expansion: the expanded form of the abbreviation.
Used by: %local.termdef.class

uname
Description: Surrounds Unicode character names to allow styling, e.g. COMBINING LONG SOLIDUS OVERLAY.
Content model: (%tech.pcd.mix;)*
Attributes:
  • %common.att;
Used by: %local.tech.class

ins
Description: The ins and del elements are for marking changes to the document. They were added here, rather than using the @diff scheme provided by xmlspec, because they were easier (quicker) to use for marking up phrase-level text given the process in use for editing the Character Model (if using XMetal, just highlight the appropriate text and double-click the element in the element list). Using XSLT, the elements were mapped very simply to elements of the same name in XHTML.
Content model: (%p.pcd.mix;)*
Attributes:
  • %common.att;
Used by: %local.emph.class

del
Description: See the description for ins above.
Content model: (%p.pcd.mix;)*
Attributes:
  • %common.att;
Used by: %local.emph.class
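As a rough illustration, the inline elements above might be used as follows. The sentences are placeholders; note that no quote marks are typed around the qterm and qchar content.

```xml
<p>The word <qterm>character</qterm> is used in more than one sense here.
The letter <qchar>c</qchar> is the third letter of the alphabet.</p>

<p>Markup in <abbr expansion="HyperText Markup Language">HTML</abbr> is
not discussed. The character name <uname>COMBINING LONG SOLIDUS OVERLAY</uname>
is styled as a Unicode name.</p>

<!-- Phrase-level change marking -->
<p>This sentence was <del>originally</del><ins>later</ins> revised.</p>
```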

Techniques-specific structural elements

This section describes the geo-technique element and its children. The geo-technique element is only used in techniques documents, and is what distinguishes them from other document types.

There is also a wai-technique element. This is used to support the slightly different needs of the WAI domain, whilst allowing them to use the same basic DTD.

geo-technique
Description: A grouping element for information related to a technique. Only used for documents related to GEO techniques.
Content model: (short-name, checklist-item+, description, ua-issues*, resources*, test*, admin*, scripts?, ua-applicability)
Attributes:
  • @id: a unique id to identify the technique (I currently generate these through scripting as <authors-initials><date yyyymmdd><time to milliseconds>)
TBD: remove the * from ua-issues and resources (and maybe also from test and admin?)
Used by: %local.div.mix;

short-name
Description: A brief summary of the content of a technique. It is visible in the HTML version of the techniques documents and not visible in the outline and resources documents, but it is used as the content of include elements to clarify what they point to.
Content model: (%head.pcd.mix;)*
Attributes:
  • %common.att;
Used by: geo-technique

checklist-item
Description: The do or don't text that summarises a technique. This should normally be one paragraph (or even one sentence if possible), but it is also possible to use a list to combine two inseparable statements in a single checklist-item.
Content model: (%hdr.mix;)*
Attributes:
  • %common.att;
Used by: geo-technique

description
Description: Groups paragraphs, examples, and other material describing the checklist-item associated with a particular technique.
Content model: (%obj.mix;)*
Attributes:
  • %common.att;
Used by: geo-technique

ua-issues
Need to figure out the best usage for this element.
Description: Groups a number of ua-issue elements.
Content model: (ua-issue)+
Attributes:
  • %common.att;
Used by: geo-technique

ua-issue
Need to figure out the best usage for this element.
Description: Describes a feature with respect to UA support for a particular technique and a particular browser.
Content model: (%obj.mix;)*
Attributes:
  • @name: the name of the browser
  • @version: the version(s) of the browser that exhibit(s) this feature
Used by: ua-issues

resources
Description: Groups a number of pointers to additional resources associated with a given technique.
Content model: (resource)*
Attributes:
  • %common.att;
Used by: geo-technique

resource
Description: A single pointer to other useful information supporting the description of the technique. The type of resource is indicated by the resource-type attribute. Different types of resource can be specified in any order (they will be grouped and ordered later by the XSLT).
The bibref element must point to one of the entries in the references. The bibliography may be entered directly into the document or included into the document via the use of an external entity.
The content of the loc element (not described here because it is a standard xmlspec element) should specify the location of the resource as precisely as possible to help the reader. When pointing to large resources, such information may include the chapter heading and number, the page number (in a book), an indication of which paragraph to look for, etc. For small web pages, this can be the title of the page.
Content model: (bibref, loc, resource-descn)
Attributes:
  • @resource-type (source | background | implementation | test | other): identifies the type of resource. This information will be used for grouping and ordering resources in the XSLT-generated view data. source refers to sources for the information included in a technique; background refers to useful background reading related to this technique; implementation refers to information, such as instructions, that helps you take the next step and implement the recommendation in the technique; test refers to test documents; other is a catch-all, and may be split into other categories in the future.
  • @id: a unique id for this resource pointer
TBD: Consider whether other should be split into additional categories.
Used by: resources, resource-list

xref
Need to look into the usage here.
Description: A cross-reference to another section in a view. These are currently expressed typically as "What if such-and-such is the case?" or "How do I do such-and-such?".
Content model: (resource-descn, specref+)
Attributes: none
Used by: xref-list

resource-descn
Description: Describes the target of a link.
Try to 'front-load' descriptions as much as possible, i.e. express the key distinguishing information about this resource as near as possible to the beginning of the description, and keep descriptions succinct.
In an xref element, these are currently expressed typically as "What if such-and-such is the case?" or "How do I do such-and-such?".
Content model: (%p.pcd.mix;)*
Attributes: none
TBD: Should probably have common attributes.
TBD: I can't remember whether different resource-descn content across techniques will create multiple pointers after XSLT processing. Need to check it out.
Used by: xref, resource

scripts
Not currently in use.
Description: Indicates scripts for which this technique is relevant.
Groups a number of script descriptions. Several script descriptions can be made by having several script elements as children of "scripts". There must be one "all" or one "na" element.
Content model: (all | na | script+)
Attributes: none
Used by: geo-technique

script
Description: Content of the scripts element.
Content model: EMPTY
Attributes:
  • @scriptname: The script is indicated by the attribute "scriptname". Possible values of "@scriptname" are:
    (latin | alpha | arabic | hebrew | fareast | seasia | indic)
    The value "alpha" represents Latin plus other alphabetic scripts, such as Greek, Cyrillic, Armenian, Cherokee, etc. fareast represents Chinese, Japanese and Korean scripts. seasia refers to Thai and similar scripts.
Used by: scripts

ua-applicability
Description: Provides information about which user agents, and which versions of those, present issues when it comes to implementing the recommendation in the technique.
This element should not contain more than one instance of "ua" with the same value for the "ua-name" attribute. If there is no information about a particular user agent, simply leave out the subelement for that ua.
Content model: (ua)*
Attributes: none
Used by: geo-technique

ua
Description: Each "ua" element represents a user agent and, if present, indicates whether or not there are issues relating to the implementation of the current technique for that user agent in the chosen base version and/or in the latest version available at the time of publication of the document.
Content model: EMPTY
Attributes:
  • @ua-name: Denotes the name of the user agent. The following values are currently available as an enumerated list:
    • ie, Internet Explorer (Windows)
    • ff, Firefox
    • moz, Mozilla
    • opera, Opera
    • safari, Safari
    • iem, Internet Explorer (Mac)
  • @version: Possible values are:
    • nn, there are no issues with support for this technique
    • yn, there were issues surrounding implementation on the base version of the user agent, but not the latest version
    • yy, there continue to be issues
    • ny, issues have been introduced into later versions of this browser that were not present in the base version (not very common)
Used by: ua-applicability
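Putting the pieces together, a single technique might be marked up roughly as follows. This is a sketch derived from the content models above, not a copy of a real technique: the ids, URL, bibliography target and all text are hypothetical, and the optional ua-issues, test, admin and scripts children are left out.

```xml
<geo-technique id="ri20050717162105123">
  <short-name>Placeholder summary of the technique</short-name>
  <checklist-item>
    <p>Placeholder directive, stated as tersely as possible.</p>
  </checklist-item>
  <description>
    <p>Placeholder explanation of why and how to follow the directive.</p>
  </description>
  <resources>
    <!-- bibref must point to an entry in the references -->
    <resource resource-type="background" id="res-example-1">
      <bibref ref="example-cited"/>
      <loc href="http://www.example.org/cited#section-2">Section 2, Placeholder heading</loc>
      <resource-descn>Placeholder description, with the key information first.</resource-descn>
    </resource>
  </resources>
  <ua-applicability>
    <!-- yn: issues in the base version, none in the latest version -->
    <ua ua-name="ie" version="yn"/>
    <!-- nn: no issues with support for this technique -->
    <ua ua-name="ff" version="nn"/>
  </ua-applicability>
</geo-technique>
```

User agents with no known information (here moz, opera, safari and iem) are simply omitted, as described for ua-applicability.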

Character Model specific structural elements

req
Description: The req and reqlist elements are used to highlight the actual requirements embedded in the (flowing) text. Each requirement is associated with one or more of S, I, or C (specifications, implementations, or content) to indicate relevance. This is currently rendered with a yellow background, with relevance indicated by the appropriate letters (S, I, or C) enclosed in square brackets at the beginning.
Content model: (req-type+ , req-text )
Attributes: none
Used by: Top level element

reqlist
Description: See the description of req above.
Content model: (req-type+ , req-text , (%list.class;)?)
Attributes: none
Used by: %local.div.mix

req-text
Description: The text of a requirement.
Content model: (%p.pcd.mix;)*
Attributes:
  • %common.att;
Used by: req, reqlist

req-type
Description: One of the letters 'S', 'I' or 'C', standing for specifications, implementations and content, respectively.
Content model: (%tech.pcd.mix;)*
Attributes:
  • %common.att;
Used by: req, reqlist
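A sketch of both requirement elements follows, based on the content models above. The requirement text is a placeholder, and placing req directly inside a div2 is an assumption; check the DTD for where req is actually allowed.

```xml
<div2 id="sec-example-requirements">
  <head>Placeholder section</head>
  <p>Explanatory text leading up to the requirement.</p>

  <!-- A single requirement relevant to specifications and implementations -->
  <req>
    <req-type>S</req-type>
    <req-type>I</req-type>
    <req-text>Placeholder requirement text.</req-text>
  </req>

  <!-- A requirement relevant to content, followed by an optional list -->
  <reqlist>
    <req-type>C</req-type>
    <req-text>Placeholder requirement with several parts:</req-text>
    <ulist>
      <item><p>First placeholder part.</p></item>
      <item><p>Second placeholder part.</p></item>
    </ulist>
  </reqlist>
</div2>
```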

Elements and attributes from the ITS 1.0 specification

The following elements and attributes from the ITS 1.0 specification have been added to the xmlspec DTD.

As mentioned above, the usage of this markup is described in the Best Practices for XML Internationalization document.

Entities added to xmlspec

The following entities are expected to be of use for any document.

Technical information about the modifications to xmlspec

Here is an overview of changes made to the xmlspec DTD, for those working on the DTD:

  1. created %i18n.att (xml:lang, dir) and added it to local.common.att and local.common-idreq.att. These have values identical to those in XHTML 2.0; see the respective definitions of Bi-directional Text and xml:lang.
  2. created %l10n.att (locn-note, locn-alert, translate) and added this to %local.common.att and %local.common.idreq.att.
  3. added general entities: lrm, rlm, zwj, zwnj
  4. added %ubiquitous.phrase.class containing "|phrase|ins|del|emph". Added the entity to the definitions of code and loc elements.
  5. added an attribute "version", attached to the "xmlspec" element. It has the fixed value "1.1". Future versions of this schema must use this attribute for versioning and should not rely on the name of the schema file.
  6. created an external module "i18n-extensions.mod", which is called by the parameter entity "i18n-extensions". So far, it contains only a parameter entity "useragents" with a list of values for the "ua-name" attribute of the "ua" element.
  7. moved all i18n documentation and modifications of parameter entities from xmlspec-i18n.dtd to i18n-extensions.mod, and all i18n-specific element declarations to xmlspec-element.mod. You will need xmlspec-i18n.dtd and these two files to create or validate documents. The change leads to a new version, 002; hence the value of the version attribute is now "002".
  8. Made modifications to the DTD to allow for using ITS 1.0 markup, and changed the value of the version attribute to "003". See Elements and attributes from the ITS 1.0 specification for details.
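To illustrate items 1 and 3 from an author's point of view, the fragment below uses the xml:lang and dir attributes and the lrm and rlm entities. It is a sketch with placeholder text; the entity references only resolve when the document is processed with the DTD, and the dir value assumes the XHTML 2.0 value set mentioned in item 1.

```xml
<!-- xml:lang and dir come from %i18n.att, available via the common attributes -->
<p>The Hebrew word <qterm xml:lang="he" dir="rtl">עברית</qterm> is
right-to-left text inside a left-to-right paragraph.</p>

<!-- lrm and rlm are the general entities added for directional marks -->
<p>A right-to-left mark can follow trailing punctuation:
<phrase xml:lang="he" dir="rtl">עברית!&rlm;</phrase> and a left-to-right
mark can be written as &lrm; where needed.</p>
```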

Things still to be done


Richard Ishida, 2005-07-17 16:21

Version: $Id: xmlspec-i18n-dtd.html,v 1.6 2008/02/13 12:33:16 fsasaki Exp $