Category: New resource
These tests check whether a user agent displays IDNs (Internationalized Domain Names) as Unicode, Punycode or otherwise in the status bar. User agents that try to detect possible homograph attacks do so in different ways. These tests explore some of those approaches.
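As a rough illustration of the Unicode versus Punycode choice, the sketch below uses Python's built-in `idna` codec (which implements the older IDNA 2003 rules) to convert between the two forms. The helper names and the mixed-script check are assumptions made for this example only; the check is a deliberately crude heuristic and is not the method used by any particular user agent.

```python
import unicodedata

def to_ascii(domain):
    """Return the Punycode (ASCII-compatible) form of a domain name."""
    return domain.encode("idna").decode("ascii")

def to_unicode(domain):
    """Return the Unicode form of an ASCII-compatible domain name."""
    return domain.encode("ascii").decode("idna")

def scripts_used(label):
    """Crude script guess: first word of each letter's Unicode character name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}

def display_form(domain):
    """Hypothetical policy: show Unicode unless a label mixes scripts."""
    ascii_form = to_ascii(domain)
    unicode_form = to_unicode(ascii_form)
    for label in unicode_form.split("."):
        if len(scripts_used(label)) > 1:
            # Possible homograph: fall back to the Punycode form.
            return ascii_form
    return unicode_form
```

With this sketch, `display_form("bücher.example")` keeps the Unicode form, while a label that mixes Latin letters with a Cyrillic `а` (U+0430) falls back to its `xn--` form.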
Getting Started material: This is the second in a proposed series of pages that will introduce you to key internationalization topics and tasks, and direct you towards articles or resources on the W3C Internationalization subsite that will take you to the next level of understanding.
This document introduces topics related to declaring the human language of your content, and related topics, such as language-based styling, content negotiation, and user navigation.
By Richard Ishida, W3C.
Information about the language in use on a page is important for accessibility, styling, searching, and other reasons. In addition, language information that is typically transmitted between the user agent and server can be used to help improve navigation for users and the localizability of your site. This tutorial will help you take advantage of the opportunities that are available now and in the near future by declaring language information appropriately.
By following this tutorial you should be able to:
- recognize the available alternatives for declaring language, and how they differ,
- understand the difference between metadata about the expected language of the audience and the text-processing language,
- choose the best way of declaring language for your content, and
- locate information about how to specify language attribute values.
This series of tests checks whether a user agent automatically recognizes that a file declared as US-ASCII is really UTF-8 encoded, and displays the text as UTF-8, even if the encoding declarations say otherwise.
Initial test results are also provided.
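The kind of recognition being tested can be sketched as follows. This is a minimal illustration, not a description of how any user agent actually behaves: the function name `sniff_encoding` and its fallback policy are assumptions made for this example.

```python
def sniff_encoding(raw, declared="us-ascii"):
    """Guess whether bytes declared as US-ASCII are really UTF-8.

    Hypothetical policy: pure ASCII keeps the declared label; non-ASCII
    bytes that form valid UTF-8 are treated as UTF-8; anything else
    falls back to the declared label.
    """
    try:
        raw.decode("ascii")
        return declared          # nothing outside ASCII, declaration holds
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"           # non-ASCII bytes are well-formed UTF-8
    except UnicodeDecodeError:
        return declared          # not UTF-8 either, keep the declaration
```

For example, the UTF-8 bytes of "café" would be reported as `utf-8`, whereas a lone `0xE9` byte (not valid UTF-8 on its own) would leave the declared label unchanged.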
The CSS3 modules currently in development will introduce a large number of properties designed to support non-Latin text, from vertical script support to kashida justification, from ruby positioning to list numbering. This article will give you a glimpse of some of the properties that lie in store, and discuss how you can help to make these improvements a reality.
Getting bidirectional text to display correctly can sometimes appear baffling and frustrating, but it need not be so. If you have struggled with this or have yet to start, this tutorial should help you adopt the best approach to marking up your content, and explain enough of how the bidirectional algorithm works that you will understand much better the root causes of most of your problems. It also addresses some common misconceptions about ways to deal with markup for bidirectional content.
At the end of this tutorial you should be able to:
- create effective XHTML and HTML pages containing text written in the Arabic or Hebrew (or other right-to-left) scripts,
- understand the basics of how the Unicode bidirectional algorithm works, so that you can understand why bidirectional text behaves the way it does, and how to work around problems, and
- take decisions about the appropriateness of alternatives to markup.
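The bidirectional algorithm starts from a directional class assigned to each character. As a small illustration (not part of the tutorial itself), the sketch below uses Python's standard `unicodedata.bidirectional` to list those classes; the helper name `bidi_classes` is an assumption made for this example.

```python
import unicodedata

def bidi_classes(text):
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Latin letter, Hebrew alef, Arabic beh, digit, space, punctuation.
sample = "a\u05D0\u06281 !"
classes = [cls for _, cls in bidi_classes(sample)]
```

Strongly typed characters (`L`, `R`, `AL`) set the direction of a run, while weak and neutral ones such as digits (`EN`), spaces (`WS`) and punctuation (`ON`) take their direction from context, which is the source of many surprises in mixed-direction text.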
The new version of BCP 47 is expected to replace RFC 3066 shortly. The tags defined by ‘RFC 3066bis’ address a number of long-standing problems with language identification, leading, hopefully, to richer language-aware features in our software and better support for language in our documents.
This article provides an overview of the changes in store for language tags. It describes the structure of future language tags, the current status of the work, and remaining work to be done. Author: Addison Phillips, Yahoo!
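To give a feel for the subtag structure, here is a deliberately simplified sketch that splits a tag into language, script, region and variant subtags. The function name `parse_language_tag` is hypothetical, and the sketch ignores extended language subtags, extensions, private use and grandfathered tags, and does not check subtags against the registry.

```python
import re

def parse_language_tag(tag):
    """Split a tag of the form language[-script][-region][-variant...]."""
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None,
              "region": None, "variants": []}
    for sub in subtags[1:]:
        before_region = result["region"] is None and not result["variants"]
        if (before_region and result["script"] is None
                and re.fullmatch(r"[A-Za-z]{4}", sub)):
            result["script"] = sub.title()       # four letters, e.g. Hans
        elif before_region and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", sub):
            result["region"] = sub.upper()       # e.g. CN or 419
        elif re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", sub):
            result["variants"].append(sub.lower())  # e.g. rozaj or 1901
        else:
            raise ValueError(f"unsupported subtag in this sketch: {sub!r}")
    return result
```

For instance, `parse_language_tag("zh-Hans-CN")` separates the language `zh`, the script `Hans` and the region `CN`, and `parse_language_tag("de-CH-1901")` reports `1901` as a variant.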
This is a set of pages that examine how right-to-left and bidirectional text affects user agents outside of the main content area.
The article looks at design and development practices that can cause major problems for translation. Designers must be very careful about how they split up and reuse text on-screen because the linguistic differences between languages can lead to real headaches for localizers and may in some cases make a reasonable translation impossible to achieve.
The article looks at a particular design and development practice that can cause major problems for translation of content. Many programmers and designers decide that if a particular string is used in many places, they will reuse a single string rather than implement many identical strings. String reuse is not necessarily a bad thing. The trick is to know what constitutes a good candidate for reuse and what does not. If you get it wrong, you can create an insuperable obstacle to good localization.
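A minimal sketch of how reuse can go wrong, using a Spanish example chosen for this illustration (it is not taken from the article). The catalogs, message identifiers and helper names are all hypothetical.

```python
# Reused fragment: one shared translation of "new" for every context.
FRAGMENTS_ES = {"new": "nuevo", "file": "Archivo", "folder": "Carpeta"}

def compose_new(noun_id):
    """Build a label by gluing the shared 'new' fragment onto a noun."""
    return f"{FRAGMENTS_ES[noun_id]} {FRAGMENTS_ES['new']}"

# Separate strings: each complete label is translated in its own context.
CATALOG_ES = {
    "file.new": "Archivo nuevo",    # masculine noun, masculine adjective
    "folder.new": "Carpeta nueva",  # feminine noun, feminine adjective
}

def translate(message_id):
    """Look up a whole message; fall back to the identifier if missing."""
    return CATALOG_ES.get(message_id, message_id)
```

Here `compose_new("folder")` yields "Carpeta nuevo", which is ungrammatical because the adjective must agree with the feminine noun, while `translate("folder.new")` returns the correct "Carpeta nueva". The shared fragment is a poor candidate for reuse because its translation depends on the words around it.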