Using HTML's translate attribute

Intended audience: users, XHTML/HTML coders (using editors or scripting), script developers (PHP, JSP, etc.), schema developers (DTDs, XML Schema, RelaxNG, etc.), and anyone who needs guidance on how to use the HTML translate attribute.

Question

What is the translate attribute for, and how should I use it?

Quick answer

The translate attribute in HTML5 indicates that the content of the element should or should not be translated. There is no effect on the rendered page (although you could, of course, style it if you found a good reason for doing so).

The attribute can appear on any element, and it takes just two values: yes or no. If the value is no, translation tools should protect the text of the element from translation. The translation tool in question could be an automated translation engine, like those used in the online services offered by Google, Microsoft and Yandex. Or it could be a human translator's 'workbench' tool, which would prevent the translator inadvertently changing the text.

Setting this translate flag on an element applies the value to all contained element content. HTML5 has a list of attributes that are to be translated by default, but these attributes should not be translated if they are on an element where translate is set to no. Attributes that are not on that list should not be translated at all.

If a page has no translate attribute, a translation system or translator should assume that all the text is to be translated. The yes value is therefore likely to see little use, though it could be very useful if you need to override a translate flag on a parent element and indicate some bits of text that should be translated. You may want to translate the natural language text in examples of source code, for example, but leave the code untranslated.
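As a minimal sketch of how these rules combine, the fragment below uses hypothetical content (the product name and sentences are invented for illustration). The comments describe what a conforming translation tool would be expected to do, not what any particular service is known to do.

```html
<!-- No translate attribute applies here: translated by default. -->
<p>Our newest printer ships next month.</p>

<!-- translate="no" applies to everything inside the div. -->
<div translate="no">
  <!-- Inherits "no": the product name is protected. -->
  <p>Acme PrintMaster 3000</p>

  <!-- translate="yes" overrides the inherited value for this paragraph only. -->
  <p translate="yes">Available in three colours.</p>
</div>
```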

Longer answer

Why is it needed?

Adding the translate attribute to your page can help readers better understand your content when they run it through automatic translation systems, and can save a significant amount of cost and hassle for translation vendors with large throughput in many languages.

The need for this arises quite frequently. There is an example in the HTML5 spec about the Bee Game. Here is a similar, but real, example, in which the documentation being translated referred to a machine whose hardware panel text wasn't translated.

<p>Click the Resume button on the Status Display or the
<span class="panelmsg" translate="no">CONTINUE</span> button
on the printer panel.</p>

Here are a couple more real-life examples of content that could benefit from the translate attribute. The first is from a book, quoting a title of a work.

<p>The question in the title <cite translate="no">How Far Can You Go?</cite> applies to both the undermining of traditional religious belief by radical theology and the undermining of literary convention by the device of "breaking frame"...</p>

The next example is from a page about French bread – the French for bread is 'pain'.

<p>Welcome to <strong translate="no">french pain</strong> on Facebook. Join now to write reviews and connect with <strong translate="no">french pain</strong>. Help your friends discover great places to visit by recommending <strong translate="no">french pain</strong>.</p>

You may also want to use it to protect keywords, code samples or examples from being translated.

<p>Here is an example of the <span class="kw" translate="no">label</span> element using the <span class="kw" translate="no">for</span> attribute:</p>

<code translate="no">&lt;label for="postcode"&gt;Enter your postcode to find the nearest store:&lt;/label&gt; &lt;input id="postcode" type="text"&gt;</code>

When to use translate="yes"

The yes value of the translate attribute is mostly used to override the effect of setting translate to no. For example, we may want to allow the natural language text of the above source code to be translated, while protecting the code itself (i.e. the keywords such as label, for, postcode, input, etc.). We could do that by surrounding the natural language text with an element that has translate set to yes.

<p>Here is an example of the <span class="kw" translate="no">label</span> element using the <span class="kw" translate="no">for</span> attribute:</p>

<code translate="no">&lt;label for="postcode"&gt;<span translate="yes">Enter your postcode to find the nearest store</span>:&lt;/label&gt; &lt;input id="postcode" type="text"&gt;</code>

Working with attributes

It can be problematic to deal with attribute values in translation. Generally speaking, attribute values are part of the syntax of the page, and should therefore not be translated. If they are, the page will break. In some cases, however, the values contain human-readable text (e.g. title, alt, and placeholder attribute values in HTML), although putting human-readable text in attribute values is not recommended. [1]

[1] For example, it is impossible to use markup inside attribute values to manage bidirectional text in languages such as Arabic and Hebrew, or to mark up such things as language changes. And of course, it can be difficult to determine which attribute values should be translated and which should not. It is also difficult to identify a part of an attribute value that should be left untranslated, or an attribute value that should be left untranslated although the element content is translated.

The HTML specification lists attributes that should be treated as translatable. Attribute values not in this list are not to be translated.

If a 'translatable' attribute value appears on an element which has translate set to no, then the expectation is that the attribute value will also remain untranslated.
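The fragment below sketches how these two rules would be expected to play out. The file name and text are hypothetical. The comments rely on the HTML specification's list of translatable attributes, which includes title and alt but not class or src.

```html
<!-- title and alt are on the list, so their values are translatable;
     class and src are not, so they are left alone. -->
<p class="note" title="Safety warning">Unplug the printer first.</p>
<img src="panel.png" alt="Printer control panel">

<!-- translate="no" protects the element content and also the
     otherwise translatable title value. -->
<span translate="no" title="Panel message">CONTINUE</span>
```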

This can, of course, cause problems in cases where you do want the attribute values to be translated but not the element content, or vice versa. In some cases those situations can be mitigated by nesting the markup concerned. For example, you could have an outer span element with translate set to yes that carries the title attribute you want to translate. Inside that span you could put another span with translate set to no and containing the element content. This is how articles in this series handle links to translated versions of a page – the title attribute of the outer element carries the name of the language pointed to, and the inner element carries the name of that language in the language itself (which should not be changed). This also helps when labelling the language using the lang attribute.

The following example shows how you could protect the word 'English', when it is a link to the English version of the document, when translating a page in German to another language. The informative title attribute would be translated. Without the translate flag, online services currently tend to translate the word 'English' to the equivalent in the target language or to 'Deutsch'.

<span title="Englisch"><a href="article.en.html" translate="no" lang="en">English</a></span>

Because these are attribute values, however, it is still impossible to indicate whether parts of the text in the attribute value should be protected from translation.

(Bear in mind that the HTML5 specification, at the time of writing, is not yet stable, and implementations may not yet follow the specification.)

This approach is different from the general approach recommended by the ITS specification for XML-based languages. ITS (see below) recommends that attribute values be left untranslated by default, but it also provides a way of indicating specific attributes that should be translated, independent of their context.

Adding translate flags to a page

The translate attribute can, of course, be added to a page by a content author who is mindful of how they want the page to appear after translation. This is particularly useful for protecting content when a reader runs a page through an automatic translation service, such as those offered by Google, Microsoft and Yandex.

In industrial translation scenarios, localizers may add attributes during the translation preparation stage, as a way of avoiding the multiplicative effects of dealing with mistranslations in a large number of languages. This may be done via automated processes, such as entity recognition tools, that automatically recognize proper nouns.

It is also possible to use external files to (among other things) point to markup that should not be translated. For example, you may want to indicate that all span elements with a given class name should not be translated. A way of doing this is described by the Internationalization Tag Set (ITS) specification. A set of such rules can be valid for one page or many pages at the same time. Content developers and localizers may work closely together in setting up these rules to achieve a faster and better localization process.
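As a rough sketch of what such a rule could look like, the fragment below follows the ITS 2.0 conventions for HTML: an external rules file is linked with rel="its-rules", and inline rules sit in a script element of type application/its+xml. The file name translate-rules.xml and the page title are assumptions made for illustration, and support for ITS rules varies between tools.

```html
<head>
  <meta charset="utf-8">
  <title>Printer guide</title>

  <!-- Option 1: rules kept in a (hypothetical) external file that
       many pages can share. -->
  <link href="translate-rules.xml" rel="its-rules">

  <!-- Option 2: the same kind of rule placed inline in one page.
       It protects every span whose class is exactly "panelmsg". -->
  <script type="application/its+xml">
    <its:rules xmlns:its="http://www.w3.org/2005/11/its"
               xmlns:h="http://www.w3.org/1999/xhtml"
               version="2.0">
      <its:translateRule selector="//h:span[@class='panelmsg']"
                         translate="no"/>
    </its:rules>
  </script>
</head>
```

In practice a page would normally use one of the two options rather than both. The external file would contain the same its:rules element as the inline version.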

Implementation support for the translate flag

The code translate="no" is supported by Google, Microsoft and Yandex online translation services at the time of writing. The use of translate="yes" to allow translation within a part of the document where translation is disallowed is currently less widely supported.

See the latest test results.

The MultilingualWeb-LT Working Group, which has been working on the Internationalization Tag Set specification, has compiled a document, Metadata for the Multilingual Web – Usage Scenarios and Implementations, that describes other applications and usage scenarios where the translate flag is recognized.

Legacy approaches for online translation services

Before the translate attribute was defined, both Google and Microsoft online translation services supported a number of other, non-standard ways to express similar ideas.

Both Google and Microsoft support class="notranslate", but replacing a class attribute value with an attribute that is a formal part of the language makes this feature much more reliable, especially in wider contexts. For example, a translation prep tool would be able to rely on the meaning of the HTML5 translate attribute always being what is expected. Also it becomes easier to port the concept to other scenarios, such as other translation APIs or localization standards such as XLIFF.
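During a transition, an author might choose to carry both signals on the same element, so that services that only know the class convention and tools that read the standard attribute are both covered. The fragment below is a sketch of that assumption, reusing the printer panel example from earlier in this article. Whether a given service honours either signal should be checked against its current documentation.

```html
<p>Click the Resume button on the Status Display or the
<span class="panelmsg notranslate" translate="no">CONTINUE</span> button
on the printer panel.</p>
```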

Microsoft apparently supports style="notranslate". This is not one of the options Google lists for their online service, but on the other hand Google offers options that are not available via Microsoft's service.

For example, if you have an entire page that should not be translated, you can add <meta name="google" value="notranslate"> inside the head element of your page and Google won't translate any of the content on that page. (They also support <meta name="google" content="notranslate">.) This shouldn't be Google-specific, and a single way of doing this, i.e. translate="no" on the html element, is far cleaner.
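A minimal sketch of the standard approach is shown below, with hypothetical page content. Because the value is inherited, a single attribute on the html element covers the whole page, and translate="yes" can in principle still re-enable translation for a specific part (bearing in mind that support for yes is less widespread).

```html
<!DOCTYPE html>
<html lang="en" translate="no">
<head>
  <meta charset="utf-8">
  <title>Spare part codes</title>
</head>
<body>
  <!-- Inherits translate="no" from the html element. -->
  <p>PX-4410, PX-4411, PX-4420</p>

  <!-- Explicitly opted back in to translation. -->
  <p translate="yes">Quote these codes when ordering.</p>
</body>
</html>
```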

Microsoft and Google's translation engines also don't translate content within code elements. Note, however, that you don't seem to have any choice about this – there don't seem to be instructions about how to override this if you do want your code element content translated.

As already mentioned, the new HTML5 translate attribute provides a simple and standard feature of HTML that can replace and simplify all these different approaches, and will help authors develop content that will work with other systems too.

Thanks are due to those who contributed helpful suggestions during the review of this document, especially Felix Sasaki, Gunnar Bittersmann, and members of the W3C Internationalization Working Group.