W3C

Internationalization of XML Current Status

This page summarizes the relationships among specifications, whether they are finished standards or drafts. Below, each title links to the most recent version of a document.

Completed Work

W3C Recommendations have been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and are endorsed by the Director as Web Standards. Learn more about the W3C Recommendation Track.

Group Notes are not standards and do not have the same level of W3C endorsement.

Standards

2013-10-29

Internationalization Tag Set (ITS) Version 2.0

This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS) 2.0. ITS 2.0 is the successor to ITS 1.0; it is designed to foster the creation of multilingual Web content, focusing on HTML5 and XML-based formats in general, and to leverage localization workflows based on the XML Localization Interchange File Format (XLIFF). Algorithms to convert ITS attributes to RDFa and NIF are also provided.
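As a rough illustration of what a locally applied ITS data category looks like in XML, the sketch below marks one element as not translatable with the its:translate attribute and reads it back using Python's standard library. The sample document and the local_translate_flags helper are assumptions made for this illustration; only the ITS namespace URI and the translate attribute come from the specification. Inheritance and global rules are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by ITS 1.0 and 2.0 markup.
ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <p>Press the <code its:translate="no">Reset</code> button.</p>
  <p>Then restart the device.</p>
</doc>"""


def local_translate_flags(xml_text):
    """Return (tag, value) for each element that carries a local its:translate.

    Only attributes written directly on an element are reported; ITS
    inheritance and global rules are not evaluated here.
    """
    root = ET.fromstring(xml_text)
    attr = f"{{{ITS_NS}}}translate"
    return [(el.tag, el.get(attr)) for el in root.iter() if el.get(attr) is not None]


print(local_translate_flags(SAMPLE))
```

A conforming ITS processor would also resolve defaults, inheritance, and global rules, which this sketch does not attempt.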

2007-04-03

Internationalization Tag Set (ITS) Version 1.0

This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS). ITS is designed to be used with schemas to support the internationalization and localization of schemas and documents. An implementation is provided for three schema languages: XML DTD, XML Schema and RELAX NG.

2005-02-15

Character Model for the World Wide Web 1.0: Fundamentals

This Architectural Specification provides authors of specifications, software developers, and content developers with a common reference for interoperable text manipulation on the World Wide Web, building on the Universal Character Set, defined jointly by the Unicode Standard and ISO/IEC 10646. Topics addressed include use of the terms 'character', 'encoding' and 'string', a reference processing model, choice and identification of character encodings, character escaping, and string indexing.

For normalization and string identity matching, see the companion document Character Model for the World Wide Web 1.0: Normalization [CharNorm]. For resource identifiers, see the companion document Character Model for the World Wide Web 1.0: Resource Identifiers [CharIRI].
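The string indexing topic mentioned above can be illustrated with a small sketch: the same string has different lengths depending on whether it is counted in code points, UTF-8 bytes, or UTF-16 code units. The string_lengths helper is a hypothetical name used only for this example, and the sketch is not taken from the specification itself.

```python
def string_lengths(s):
    """Count one string in three units that an index or offset might refer to."""
    return {
        "code_points": len(s),                                # Python str indexes by code point
        "utf8_bytes": len(s.encode("utf-8")),                 # size in UTF-8
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,  # two bytes per code unit
    }


# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane,
# so it needs four UTF-8 bytes and a surrogate pair in UTF-16.
print(string_lengths("\U0001D11E"))
```

An offset such as "character 1" therefore points to different places depending on the unit, which is why a shared definition of string indexing matters for interoperability.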

Group Notes

2013-01-24

Unicode in XML and other Markup Languages

Provides guidelines on the use of the Unicode Standard in conjunction with markup languages such as XML.

2011-07-05

Working with Time Zones

Discusses some of the problems encountered when working with the date, time, and dateTime values from XML Schema when those values include (or omit) time zone offsets. Many W3C technologies rely on date and time types.
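One way to see the kind of problem described here is to compare a value that carries an offset with one that omits it. The sketch below uses Python's datetime.fromisoformat as a stand-in parser; it accepts only a subset of the XML Schema dateTime lexical forms, and the helper names has_offset and comparable are assumptions made for this example.

```python
from datetime import datetime, timezone


def has_offset(dt):
    """True when the value carries an explicit time zone offset."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


def comparable(a, b):
    """Two values can be ordered safely only if both have, or both lack, an offset."""
    return has_offset(a) == has_offset(b)


floating = datetime.fromisoformat("2011-07-05T12:00:00")      # offset omitted
zoned = datetime.fromisoformat("2011-07-05T12:00:00+02:00")   # offset included

print(comparable(floating, zoned))

try:
    floating < zoned
except TypeError as exc:
    # Python refuses to order a value without an offset against one with an offset.
    print("cannot compare:", exc)

# A value with an offset can be moved onto a common timeline (UTC).
print(zoned.astimezone(timezone.utc).isoformat())
```

Whether a value without an offset should be rejected, treated as local time, or assumed to be UTC is an application decision; the sketch only detects the mismatch.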

2009-09-15

Requirements for String Identity Matching and String Indexing

This document was written as the first step towards a character model for W3C specifications, to make sure that the requirements of other W3C Working Groups (and of other interested parties) are understood and can be addressed.

2008-11-03

Legacy extended IRIs for XML resource identification

For historic reasons, some formats have allowed variants of IRIs that are somewhat less restricted in syntax, for example XML system identifiers and W3C XML Schema anyURIs. This document provides a definition and a name (Legacy Extended IRI or LEIRI) for these variants for easy reference.

2008-02-13

Best Practices for XML Internationalization

Provides a set of guidelines for developing XML documents and schemas that are properly internationalized, aimed at both developers of XML applications and authors of XML content.

Drafts

Below are draft documents: Candidate Recommendations and other Working Drafts. Some of these may become Web Standards through the W3C Recommendation Track process. Others may be published as Group Notes or become obsolete specifications.

Candidate Recommendations

2004-11-22

Character Model for the World Wide Web 1.0: Resource Identifiers

Architectural Specification providing authors of specifications, software developers, and content developers with a common reference for the use of resource identifiers building on Unicode.

Other Working Drafts

2014-07-15

Character Model for the World Wide Web: String Matching and Searching

Architectural Specification providing authors of specifications, software developers, and content developers with a common reference for normalization and string identity matching to improve interoperable text handling on the World Wide Web.
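To make the idea of normalization before string identity matching concrete, the following sketch compares two encodings of the same visible text, first code point by code point and then after Unicode normalization. The choice of NFC and the nfc_equal helper are assumptions for illustration; the draft itself discusses when, and whether, normalization is appropriate.

```python
import unicodedata


def nfc_equal(a, b):
    """Compare two strings after converting both to Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)           # raw code point comparison
print(nfc_equal(precomposed, decomposed))  # comparison after normalization
```

The two strings render identically but differ as code point sequences, so a naive comparison treats them as distinct identifiers.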

2013-03-07

Metadata for the Multilingual Web - Usage Scenarios and Implementations

An overview of usage scenarios and implementations demonstrating applications of the Internationalization Tag Set (ITS) 2.0. The usage scenarios range from simple machine translation or human translation quality checks to training for machine translation systems or automatic text analysis.

2006-06-12

Language Tags and Locale Identifiers for the World Wide Web

Describes mechanisms based on BCP 47 for identifying or selecting the language of content or locale preferences used to process information using Web technologies.
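As a sketch of what selecting content by language can look like, the code below follows the general shape of the lookup scheme in RFC 4647 (part of BCP 47): the requested tag is truncated from the right until it matches an available tag. This is a simplified illustration, not the draft's own algorithm; the lookup function name, its parameters, and the sample tags are assumptions.

```python
def lookup(requested, available, default=None):
    """Pick the best available language tag by progressive truncation.

    Simplified: handles a single requested tag, with no wildcards or
    priority lists. Matching is case-insensitive.
    """
    supported = {tag.lower(): tag for tag in available}
    subtags = requested.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in supported:
            return supported[candidate]
        subtags.pop()
        # A single-character subtag (such as "x") cannot end a tag, so drop it too.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


available = ["en", "de", "fr-CA"]
print(lookup("de-CH-1996", available))        # falls back through de-CH to de
print(lookup("ja", available, default="en"))  # nothing matches, so the default is used
```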

Obsolete Specifications

These specifications have either been superseded by others, or have been abandoned. They remain available for archival purposes, but are not intended to be used.

Retired

2006-05-18

Internationalization and Localization Markup Requirements

When creating schemas (XML Schema, DTD, etc.), it is important to include constructs that meet the needs of content authors dealing with international audiences, and address the needs of the localization community. This document provides a list of key requirements to achieve such a goal.