Extreme Markup Languages 2005 - http://www.w3.org/People/fsasaki/its-eml05


Schema Languages & Internationalization Issues: A survey

Felix Sasaki (W3C)

Christian Lieske (SAP)

Andreas Witt (Bielefeld University, Department of Texttechnology)

Slides are available at http://www.w3.org/People/fsasaki/its-eml05

The Late Breaking Paper can be found at http://www.w3.org/People/fsasaki/EML2005sasa0411.html

Topics / Goals of this Presentation

Overview

Scope: Different types of XML Documents

  1. center on text, e.g. OpenOffice
  2. focus on code, e.g. XUL (Mozilla)
  3. mix prose and code, e.g. DocBook
  4. include presentational aspects, e.g. XHTML

Requirement: Bidirectional Text Support

The title is "مفتاح معايير الويب!" in Arabic.

Markup to specify directionality

<p>The title is
"<span xml:lang="ar"
dir="rtl" lang="ar">مفتاح معايير الويب!</span>"
in Arabic.</p>

Requirement: Support for Ruby Markup

Ruby Display
<p>これは
<ruby><rb>紙芝居</rb><rt>かみしばい</rt></ruby>です。</p>

Requirement: Span-like Element

<code>System.out.println("
<span xml:lang="ja" translate="no">
W3C国際活動</span>");</code>

Requirement: Retrieving External Information

<para>If you create a typing error like "strs(s)",
you will get the message
 <xref id="resfile.resx">
 <subst>
  <search>{0}</search>
   <replace>&lt;Filename&gt;</replace>
 </subst>
 </xref>.</para>
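
As a rough illustration of the substitution step in the markup above, the following Python sketch applies the `<subst>` rule to a message string. The message text, the `RESOURCES` lookup table and the `resolve_xref` helper are assumptions made for this example; the slide does not say how `resfile.resx` is actually read.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the external resource file; the real
# content of resfile.resx is not given on the slide.
RESOURCES = {"resfile.resx": "Cannot open file {0}."}

XREF = """<xref id="resfile.resx">
 <subst>
  <search>{0}</search>
  <replace>&lt;Filename&gt;</replace>
 </subst>
</xref>"""


def resolve_xref(xref_xml, resources):
    """Fetch the referenced message and apply each <subst> rule to it."""
    xref = ET.fromstring(xref_xml)
    message = resources[xref.get("id")]
    for subst in xref.findall("subst"):
        search = subst.findtext("search")
        replace = subst.findtext("replace")
        message = message.replace(search, replace)
    return message


resolved = resolve_xref(XREF, RESOURCES)
print(resolved)
```

Keeping the placeholder rule next to the reference means the documentation can show a readable stand-in without copying the message text itself.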

Overview of Further Requirements

Indicator of Constraints            Indicator for metrics
CDATA sections                      Limited impact
Handling entities                   Attributes and translatable text
Locale / Language identification    Localization notes
Term identification                 Handling of white-spaces
Purpose specification / mapping     Multilingual documents
Cultural aspects of the content     Annotation markup

Overview

The Need to go Beyond Schemas (I)

<window>
 <box align="center">
  <button label="hello xFly"
   onclick="alert('Hello World');"/>
 </box>
</window>

The Need to go Beyond Schemas (II)

<code>System.out.println("
<span xml:lang="ja" translate="no">
W3C国際活動</span>");</code>
code/text()
System.out.println("");
System.out.println("W3C国際活動");
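
One reading of the lines above is that `code/text()` selects only the text nodes directly under `code` and so loses the string inside `span`. The Python sketch below reproduces that effect. It uses the standard-library ElementTree API as a stand-in for an XPath processor, which is an assumption of this example, and it drops the line breaks that the slide layout introduces into the sample.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<code>System.out.println("'
    '<span xml:lang="ja" translate="no">W3C国際活動</span>'
    '");</code>'
)

code = ET.fromstring(SAMPLE)

# Direct text nodes of <code>, comparable to the XPath code/text():
# the text before the first child plus the tail text after each child.
direct_text = (code.text or "") + "".join(child.tail or "" for child in code)

# Full string value of <code>, including the text inside <span>.
full_text = "".join(code.itertext())

print(direct_text)
print(full_text)
```

A tool that extracts code this way needs to know that `span` is transparent for this purpose, which is information a schema alone does not carry.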

The Need to go Beyond Schemas (III)

Deployment options for ITS

→ A single ITS schema in one schema language is insufficient

Status Quo:

ITS Approaches

A Word on Technologies

Overview

Schema languages: ITS relevant Characteristics

Namespaces

Pattern-Based Descriptions

Usefulness of Modularization and Typing Mechanisms for ITS

We show how existing mechanisms can be used to realize the ITS locinfo data category.

XML DTDs: Marked Sections for ITS

<!ENTITY % para "INCLUDE">
<![%para;[
   <!ELEMENT p (#PCDATA)>
   <!ATTLIST p id ID #IMPLIED>
   <!ENTITY % its.para "IGNORE">]]>
<![%its.para;[
   <!ELEMENT p (#PCDATA)>
   <!ATTLIST p id ID #IMPLIED
               locinfo CDATA #IMPLIED>]]>

XML Schema: Typing for ITS

<xs:complexType name="paraContent" mixed="true">
 <xs:attribute name="id" type="xs:ID"/>
</xs:complexType>
<xs:complexType name="itsParaContent">
 <xs:complexContent>
 <xs:extension base="paraContent">
  <xs:attribute name="locinfo" type="xs:string"/>
 </xs:extension>
 </xs:complexContent>
</xs:complexType>
<xs:element name="p" type="itsParaContent"/>
doc("mydoc.xml")//element(*, itsParaContent?)

RELAX NG: Ambiguity for ITS

p-html =
  element p { xhtml-p.content }
p-its =
  element p { xhtml-p.content,
  attribute locinfo { text } }
p = p-html | p-its
div = element div { p+ }

Assessing the Alternatives

Overview

Namespace Sectioning (I)

<h:html xmlns:h="http://www.w3.org/1999/xhtml"
 xmlns="http://www.example.org/its"
translate="yes"> 
 <h:head>...<h:meta translate="no">...
  <span>...</span>
</h:html>
default namespace = "http://www.example.org/its"
attribute translate { "yes" | "no" }?
element span { text }
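
A processor can use the namespace alone to tell the ITS section apart from the host vocabulary. The Python sketch below shows the idea with the standard-library ElementTree API. The sample document is a completed, well-formed variant of the fragment above with invented content, and `split_by_namespace` is a hypothetical helper, not part of ITS.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.example.org/its"

# Well-formed variant of the slide's fragment; the title and body
# content are invented for this example.
DOC = """<h:html xmlns:h="http://www.w3.org/1999/xhtml"
 xmlns="http://www.example.org/its" translate="yes">
 <h:head><h:title translate="no">Demo</h:title></h:head>
 <h:body><h:p>Hello <span>world</span></h:p></h:body>
</h:html>"""


def split_by_namespace(root):
    """Return local names of ITS elements and of all other (host) elements."""
    its, host = [], []
    for el in root.iter():
        # ElementTree writes namespaced tags as "{namespace}local".
        if el.tag.startswith("{"):
            ns, _, local = el.tag[1:].partition("}")
        else:
            ns, local = "", el.tag
        (its if ns == ITS_NS else host).append(local)
    return its, host


its_elements, host_elements = split_by_namespace(ET.fromstring(DOC))
print(its_elements)
print(host_elements)
```

Note that the default namespace declaration applies to unprefixed elements such as `span` but not to unprefixed attributes, so `translate` is in no namespace here.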

Namespace Sectioning (II)

Overview

ITS as Schema Annotation

<!DOCTYPE html
 SYSTEM "xhtml-plus-its.dtd">
<html> ...
<its-span
 locinfo="...">...</its-span>
...
</html>

ITS as Schema Annotation:
Architectural Forms (Annex of HyTime)

<?IS10744:arch name="its-arch"
  bridge-form="archbridge"
  renamer-att="its-arch.atts"
  dtd-system-id="xhtml1-transitional.dtd"?>
<!ELEMENT html (head, body)>
<!ATTLIST html its-arch NAME #FIXED "archbridge">
...
<!ELEMENT its-span (#PCDATA)>
<!ATTLIST its-span
its-arch NAME #FIXED "span"
its-arch.atts CDATA #FIXED "locinfo title">

ITS as Schema Annotation: Further approaches (I)

http://example.com/its#span realizedAs
http://www.mySchema2.xsd#(/~element::span).
http://example.com/its#span realizedAs
"<!ELEMENT span (...)>".
http://example.com/its#span realizedAs
"element span { ... }".

ITS as Schema Annotation: Further approaches (II)

<purposeSpec>
 <servesPurpose origVoc="span" its="its-span"/>
</purposeSpec>
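
A mapping of this kind could be consumed roughly as follows. This Python sketch reads the `purposeSpec` fragment above into a lookup table; the helper names `load_purpose_map` and `its_role` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

PURPOSE_SPEC = """<purposeSpec>
 <servesPurpose origVoc="span" its="its-span"/>
</purposeSpec>"""


def load_purpose_map(spec_xml):
    """Map each original-vocabulary element name to its ITS counterpart."""
    spec = ET.fromstring(spec_xml)
    return {
        entry.get("origVoc"): entry.get("its")
        for entry in spec.findall("servesPurpose")
    }


def its_role(element_name, purpose_map):
    """Return the ITS name an element serves as, or None if unmapped."""
    return purpose_map.get(element_name)


purpose_map = load_purpose_map(PURPOSE_SPEC)
print(its_role("span", purpose_map))
print(its_role("p", purpose_map))
```

Because the mapping lives outside the schema, the original vocabulary stays unchanged and only tools that care about ITS need to read it.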

A Word on Schema Annotation for ITS in General

Overview

Processing Model for ITS

ITS will presumably be implemented as several separate processes rather than a single one

Sample Processing related to ITS

A processing model for ITS

Technologies for Processing Models

Overview

Instead of a Summary - Questions:

Please read
http://www.w3.org/TR/itsreq/
and send comments to www-international@w3.org