Internationalization Checker reports

Intended audience: this article is mainly intended for people developing the internationalization checker, but can also be used by those interested in tracking source references quickly for a particular report.

This page lists all the report messages used by the W3C Internationalization Checker. As well as the text of the report and the severity of the report (with variants), it lists the conditions which trigger that report. It also lists references to articles or specifications that provide authoritative sources for the report.

The conditions that trigger a report are often dependent on the format or mime-type of the page being considered. The checker currently supports only a subset of format/mime-type combinations. The following keywords are used to indicate the various possibilities that are currently tracked:

If there are no keywords, the conditions apply to all formats and mime-types.

The checker has not yet been tailored to deal with XHTML5 or Polyglot documents.

This page will be updated from time to time, as new features are added to the checker or existing features are refined.

Character encoding: HTTP header

Encoding declared only in HTTP header

Conditions and severity

[rep_charset_no_in_doc]
  • Warning: The only encoding information is in the HTTP header.

Explanation

A character encoding is specified in the HTTP header (%1), but there was no matching encoding declaration in the page. This may lead to problems later if there is a chance that the document will be read from or saved to disk, CD, etc.

In addition, the W3C Internationalization (i18n) Group recommends that you always include a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
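To check this condition by hand, you first need the encoding named in the HTTP header. The following is a minimal sketch using only the Python standard library; the function name `charset_from_content_type` is hypothetical and is not part of the checker.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type header value, or None.

    Hypothetical helper for illustration; the checker's own logic may differ.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

If this returns an encoding but the page itself contains no declaration, the situation described in this report applies.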

What to do

Add information to indicate the character encoding of the page inside the page itself.

Further reading

Character encodings explained

Declaring the character encoding in an X/HTML document

Sources

Character encoding: BOM

UTF-8 BOM found at start of file

Conditions and severity

[rep_charset_bom_found]
  • Warning: The page has a UTF-8 BOM at the top.

Explanation

The UTF-8 Byte Order Mark (BOM) was found at the beginning of the page. It can sometimes introduce blank spaces or short sequences of strange-looking characters (such as ï»¿).

What to do

Using an editor or an appropriate tool, remove the byte order mark from the beginning of the file. This can often be achieved by saving the document with the appropriate settings in the editor. On the other hand, some editors (such as Notepad on Windows) do not give you a choice, and always add the byte order mark. In this case you may need to use a different editor.
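Where no suitable editor is available, the byte order mark can also be removed with a short script. This is a sketch only, assuming the file content has already been read as bytes; `strip_utf8_bom` is a hypothetical name.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):]
    return data


cleaned = strip_utf8_bom(b"\xef\xbb\xbf<!DOCTYPE html>")
print(cleaned)
```

Content without a BOM is returned unchanged, so the function is safe to apply to every file in a batch.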

Further reading

Handling the byte-order mark

Sources

BOM found in content

Conditions and severity

[rep_charset_bom_in_content]
  • Warning: The page has a UTF-8 BOM below the top of the page.

Explanation

The UTF-8 Byte Order Mark (BOM) was found below the top of the page. This is often caused when the BOM is at the top of a file or chunk of content that is included into a page. It can sometimes introduce blank spaces or short sequences of strange-looking characters (such as ï»¿).

What to do

Using an editor or an appropriate tool, remove the byte order mark from the beginning of the file or chunk of content where it appears.

If the problem does arise from a BOM at the top of an included file, this can often be achieved by saving the content with appropriate settings in the editor. On the other hand, some editors (such as Notepad on Windows) do not give you a choice, and always add the byte order mark. In this case you may need to use a different editor.
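To find out which included file is responsible, it can help to locate each stray BOM in the assembled page. The sketch below assumes the page is available as bytes; `find_embedded_boms` is a hypothetical helper, not the checker's implementation.

```python
import codecs
from typing import List


def find_embedded_boms(data: bytes) -> List[int]:
    """Return the byte offsets of UTF-8 BOMs that occur after the start of data."""
    offsets = []
    # Start searching at offset 1 so that a BOM at the very top is not reported.
    pos = data.find(codecs.BOM_UTF8, 1)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(codecs.BOM_UTF8, pos + 1)
    return offsets


page = b"<header>..</header>" + codecs.BOM_UTF8 + b"<p>included chunk</p>"
print(find_embedded_boms(page))
```

The markup just after each reported offset usually identifies the include that carries the BOM.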

Further reading

Handling the byte-order mark

Sources

No visible in-document encoding declared

Conditions and severity

[rep_no_visible_charset]
  • Warning: The page has no xml declaration and no meta declaration, and a utf-8 BOM has been detected.
  • NOTE: For non-HTML5 pages, utf-16 bom could also be part of the condition, but this was dropped to align advice for legacy formats with HTML5.

Explanation

The character encoding of this page is indicated using a byte-order mark.

Although this is usually sufficient to tell a browser what the encoding of the page is, the W3C Internationalization (i18n) Group recommends that you always include a visible encoding declaration in a document as well, because it helps developers, testers, or translation production managers to check the encoding of a document visually.

What to do

Add a meta tag or XML declaration, as appropriate, to your page to indicate the character encoding used.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Character encoding: XML declaration

XML declaration used

Conditions and severity

[rep_charset_xml_decl_used]
  • Error: html,html5 The page has an XML declaration.
  • Warning: xhtml The page has an XML declaration.

Explanation

html

This page currently uses the following XML declaration:

%1

XML declarations are used by XML processors, and are not appropriate for pages that are parsed as HTML.

html5

This page currently uses the following XML declaration:

%1

HTML5 only allows comments before the Doctype, so this prevents the use of the XML declaration.

xhtml

This page currently uses the following XML declaration:

%1

XML declarations are sometimes used with XHTML pages that are served as text/html, so that when those files are read by an XML parser, rather than an HTML parser, the encoding information is recognized. However, an XML declaration in an HTML document can cause Internet Explorer to render in quirks mode rather than standards mode, so it is generally recommended that you avoid its use for such hybrid documents. If you use UTF-8 you don't need an XML declaration for a conforming XML parser.

What to do

html,html5

Remove the XML declaration from your page. Use a meta element instead to declare the character encoding of the page.

xhtml

Since you are using XHTML 1.x but serving it as text/html, use UTF-8 for your page and remove the XML declaration.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

No effective character encoding information

Conditions and severity

[rep_no_effective_charset]
  • Warning: html,html5,xhtml The page has an XML declaration and no other character encoding declaration.

Explanation

This page only declares a character encoding in the following XML declaration.

%1

An HTML parser does not recognize encoding declarations in the XML declaration, so effectively no encoding has been specified for this page.

What to do

Add a meta element to indicate the character encoding of the page. You could also declare the encoding in the HTTP header, but it is recommended that you always use a meta element too.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Character encoding: meta declaration

meta character encoding declaration uses http-equiv

Conditions and severity

[rep_charset_pragma]
  • Comment: html5 The page contains a meta element with an http-equiv attribute in the Encoding declaration state.

Explanation

This page uses the following character encoding declaration with an http-equiv attribute:

%1

This is acceptable for HTML5; however, you may want to consider using the meta element with a charset attribute instead. For example:

<meta charset="%2">

What to do

Replace the http-equiv and content attributes in your meta tag with a charset attribute.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

meta encoding declarations don't work with XML

Conditions and severity

[rep_meta_ineffective]
  • Comment: xhtml10x,xhtml11x The page contains a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state.

Explanation

This page is being served as XML and there is a character encoding declaration in the following meta tag:

%1

Encoding declarations in meta tags are not recognised by XML processors, so this declaration has no actual effect.

In the absence of another declaration, the XML processor will recognize the encoding as UTF-8 or UTF-16 by sniffing the start of the file. If you sometimes serve this page as HTML, it will be useful then, but otherwise it would be better to use an XML declaration than a meta tag to identify the encoding of the page. It is useful, by the way, to have a visible in-document declaration because it helps developers, testers, or translation production managers to check the encoding of a document visually.

What to do

Unless you sometimes serve this page as text/html, remove the meta tag and ensure you have an XML declaration with encoding information.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

A meta tag with a charset attribute will cause validation to fail

Conditions and severity

[rep_meta_charset_invalid]
  • Warning: html,xhtml,xhtml10x,xhtml11x The page contains a meta element with a charset attribute.

Explanation

This page is not HTML5 but uses the following meta element to specify the character encoding:

%1

Although all major browsers now recognize this character encoding declaration, and act appropriately, the charset attribute on a meta tag is only referred to by the HTML5 specification. This means that, although things will generally work as you expect in modern browsers, if you try to validate this page you will receive an error message.

What to do

If you want this page to be valid HTML, replace the charset attribute with http-equiv and content attributes, eg. <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Incorrect use of meta encoding declarations

Conditions and severity

[rep_incorrect_use_meta]
  • Warning: xhtml The page contains only a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state, and the encoding specified is not utf8/utf16.
  • Warning: xhtml10x,xhtml11x The page contains only a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state, and the encoding specified is not utf8/utf16.

Explanation

The only character encoding declaration for this page is in the following meta element, which specifies an encoding that is neither UTF-8 nor UTF-16:

%1

If a document is treated as XML and is encoded as neither UTF-8 nor UTF-16, you must declare the encoding in the XML declaration. The meta tag declaration is not recognized by an XML processor. When parsed as XML, this document will be treated as UTF-8.

What to do

Add an XML declaration with encoding information, or change the character encoding for this page to UTF-8. If this page is never parsed as HTML, you can remove the meta tag.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Multiple encoding declarations using the meta tag

Conditions and severity

[rep_charset_multiple_meta]
  • Error: The page contains more than one meta element used to declare character encoding.

Explanation

A document can have only one character encoding, so you need only one meta element to declare it. This page has the following list of meta elements containing character encoding declarations:

%1

What to do

Edit the markup to remove all but one meta element.
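The condition for this report can be approximated by collecting every meta element that declares an encoding. The following sketch uses Python's standard `html.parser`; the class and function names are hypothetical, and the matching rules (a charset attribute, or http-equiv set to Content-Type with a charset in the content attribute) are an assumption about what counts as a declaration.

```python
from html.parser import HTMLParser


class MetaEncodingCollector(HTMLParser):
    """Collect the source text of meta elements that declare an encoding."""

    def __init__(self):
        super().__init__()
        self.declarations = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        is_charset = "charset" in attr
        is_pragma = (
            attr.get("http-equiv", "").lower() == "content-type"
            and "charset=" in attr.get("content", "").lower()
        )
        if is_charset or is_pragma:
            self.declarations.append(self.get_starttag_text())


def meta_encoding_declarations(markup: str):
    collector = MetaEncodingCollector()
    collector.feed(markup)
    collector.close()
    return collector.declarations


sample = (
    '<meta charset="utf-8">'
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
)
print(len(meta_encoding_declarations(sample)))
```

More than one entry in the returned list corresponds to the error described here.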

Further reading

Character encodings explained

Declaring the character encoding in an X/HTML document

Sources

Meta character encoding declaration used in UTF-16 page

Conditions and severity

[rep_charset_utf16_meta]
  • Error: html5 The page is encoded as UTF16 and a meta encoding declaration is used.

Explanation

This HTML5 page has a character encoding declaration in a meta element:

%1

The HTML5 specification disallows the use of meta character encoding declarations with UTF-16 encoded documents. A UTF-16 byte-order mark (BOM) is the only in-document encoding allowed.

What to do

Remove the meta encoding declaration.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

UTF-16 encoding declaration in a non-UTF-16 document

Conditions and severity

[rep_charset_bogus_utf16]
  • Error: The page is not encoded as UTF16 and a meta encoding declaration specifying UTF-16 is used.

Explanation

The meta character encoding declaration for this page says that the page is encoded as UTF-16:

%1

The character encoding declaration is incorrect: this is not a UTF-16 encoded file. In HTML4.01 the page will be parsed as the default encoding for the browser. In XHTML 1.x and HTML5 it will be treated as UTF-8. If your document uses a character encoding other than these, you will likely see corruption of the non-ASCII text on your page.

What to do

Change the encoding declaration to reflect the actual encoding of the page.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

UTF-16LE or UTF-16BE found in a character encoding declaration

Conditions and severity

[rep_charset_utf16lebe]
  • Error: A utf-16be or utf-16le encoding declaration is used in a meta tag, an xml declaration, or a http header.

Explanation

The following encoding declaration(s) specify that the page is either big-endian (UTF-16BE) or little-endian (UTF-16LE):

%1

You should not use the UTF-16BE and UTF-16LE charset names in character encoding declarations for markup. Instead, use just "UTF-16". All UTF-16 pages should start with a byte-order mark, and this will indicate whether the character encoding used is big- or little-endian.

What to do

Ensure that the page starts with a byte-order mark (BOM) and change the encoding declaration(s) to "UTF-16".
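Whether a file already starts with a UTF-16 byte-order mark, and which byte order it signals, can be checked with a few lines of code. This is an illustrative sketch; `utf16_byte_order` is a hypothetical name.

```python
import codecs


def utf16_byte_order(data: bytes):
    """Return the byte order signalled by a leading UTF-16 BOM, or None.

    Caveat: a UTF-32 little-endian BOM (FF FE 00 00) begins with the same
    two bytes as the UTF-16 little-endian BOM and is not distinguished here.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "little-endian"
    return None


print(utf16_byte_order("\ufeff<html>".encode("utf-16-le")))
```

A result of None for a page that is meant to be UTF-16 means the byte-order mark still needs to be added.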

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Character encoding declaration in a meta tag not within 1024 bytes of the file start

Conditions and severity

[rep_charset_1024_limit]
  • Error: html5 The first meta element with encoding declaration doesn't fit entirely within the first 1024 bytes of the file.

Explanation

The following character encoding declaration did not fit completely within the first 1,024 bytes of the start of the page:

%1

In HTML5 this will mean that the encoding declaration is not recognized.

What to do

Move the character encoding declaration nearer to the top of the page. Usually it is best to make it the first thing in the head element.
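A rough way to test the 1,024 byte rule is to look for the first meta element that mentions a charset and see where it ends. The sketch below uses a simple regular expression, which is only an approximation of how an HTML5 parser scans for the declaration; the names `META_CHARSET` and `first_meta_within_limit` are hypothetical.

```python
import re

# Approximate pattern: a meta tag containing "charset=" somewhere in its attributes.
META_CHARSET = re.compile(rb"<meta\b[^>]*\bcharset\s*=[^>]*>", re.IGNORECASE)


def first_meta_within_limit(data: bytes, limit: int = 1024):
    """Return True if the first meta encoding declaration ends within `limit` bytes.

    Returns None if no such declaration is found.
    """
    match = META_CHARSET.search(data)
    if match is None:
        return None
    # match.end() is the offset just past the closing ">" of the element.
    return match.end() <= limit


late = b"<!-- " + b"x" * 2000 + b" --><meta charset='utf-8'>"
print(first_meta_within_limit(late))
```

Long comments, inline scripts, or style blocks placed before the declaration are typical reasons for a False result.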

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Character encoding: charset

charset attribute used on a or link elements

Conditions and severity

[rep_charset_charset_attr]
  • Error: html5 A charset attribute is used on a link or an a element.
  • Warning: html,xhtml A charset attribute is used on a link or an a element.

Explanation

The following a and/or link elements contained a charset attribute:

%1

The charset attribute has been deprecated on these elements in HTML5, so it is recommended that you avoid using it in future for any format.

What to do

Remove the charset attribute. If pointing to a page that is under your control, ensure that any appropriate character encoding information is provided for that page.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Character encoding: other

No character encoding information

Conditions and severity

[rep_charset_none]
  • Warning: html,xhtml No encoding information found at all.
  • Error: html5 No encoding information found at all.

Explanation

There is no declaration or byte-order mark to indicate the character encoding of the page. You should always specify the encoding used for an HTML page. If you don't, you risk that characters in your content will be incorrectly interpreted. This is not just an issue of human readability: increasingly, machines need to understand your data too.

HTML5 requires a meta encoding declaration if the character encoding is not declared in the HTTP header or a byte-order mark.

The W3C Internationalization (i18n) Group recommends that you always include a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.

What to do

Add information to indicate the character encoding of the page.

Further reading

Character encodings explained

Declaring the character encoding in an X/HTML document

Sources

No in-document encoding declaration found

Conditions and severity

[rep_no_encoding_xml]
  • Warning: xhtml10x,xhtml11x No encoding information found at all.

Explanation

No character encoding is declared for this page. Since it is being served as XML, the browser and any XML processor will assume that the encoding is UTF-8. If you are not saving your document as UTF-8, you will find that characters are being corrupted.

Even if you intend the document to be read as UTF-8, the W3C Internationalization (i18n) Group recommends that you always include a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
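Because an undeclared XML page is read as UTF-8, a quick sanity check is to confirm that the saved bytes really are valid UTF-8. The following is a minimal sketch; `decodes_as_utf8` is a hypothetical helper.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "é" saved as ISO-8859-1 is the single byte E9, which is not valid UTF-8.
print(decodes_as_utf8("é".encode("iso-8859-1")))
```

A False result indicates that the file was saved in some other encoding and that characters will be corrupted when it is parsed as XML. A True result does not prove the file was intended as UTF-8, since pure ASCII content is valid in many encodings.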

What to do

Add information to indicate the character encoding of the page inside the page itself.

Further reading

Character encodings explained

Declaring the character encoding for HTML

Sources

Non-UTF-8 character encoding declared

Conditions and severity

[rep_charset_no_utf8]
  • Warning: Any type of character encoding declaration found that doesn't declare the encoding to be UTF-8.

Explanation

The page currently uses the following non-UTF-8 character encoding declaration(s):

%1

UTF-8 is based on Unicode. A Unicode character encoding makes it easier to use a wide range of characters, from the registered trademark symbol to characters in multiple languages. It also simplifies the use of scripts and databases for multilingual sites, and allows you to more easily expand your site to cover new languages, when needed. Using non-UTF-8 encodings can also have unexpected results on form submission and URL encodings, which use the document's character encoding by default. It is not a requirement to use UTF-8, but the HTML5 specification recommends its use, and you should consider it.

UTF-16 is also a character encoding based on Unicode, but is little used on the Web, and generally best avoided.

What to do

Set your authoring tool to save your content as UTF-8, and change the encoding declarations.

Further reading

Character encodings explained

Choosing a character encoding

Changing the encoding of a document

Sources

Language: attributes

A tag uses a lang attribute without an associated xml:lang attribute

Conditions and severity

[rep_lang_missing_xml_attr]
  • Warning: xhtml,xhtml10x,xhtml11x In any tag there is a lang attribute but no xml:lang attribute.

Explanation

xhtml

In the following tag or tags the lang attribute is not accompanied by an xml:lang attribute.

%1

This may cause problems if you try to process this XHTML page as XML, since XML processors recognise xml:lang but don't recognise lang. For XHTML you should normally use both.

xhtml10x,xhtml11x

In the following tag or tags the lang attribute is not accompanied by an xml:lang attribute.

%1

XML processors recognise xml:lang but don't recognise lang. When serving a page as XML, you should have an xml:lang attribute wherever there is a lang attribute. (You only need to have the lang attribute if you plan to serve the page as text/html also.)

What to do

Add an xml:lang attribute to each of the above tags, with the same value as the lang attribute.

Further reading

Language declarations explained

Using attributes to declare language

Sources

A tag uses an xml:lang attribute without an associated lang attribute

Conditions and severity

[rep_lang_missing_html_attr]
  • Error: html5 In any tag there is an xml:lang attribute but no lang attribute.
  • Warning: xhtml In any tag there is an xml:lang attribute but no lang attribute.

Explanation

In the following tag or tags the xml:lang attribute is not accompanied by a lang attribute.

%1

This causes a problem if you try to display an XHTML page as HTML, since HTML parsers recognise only the lang attribute, not xml:lang.

HTML5 and XHTML5 require you to use a lang attribute if you use an xml:lang attribute (and the values must be the same).

What to do

Add a lang attribute to each of the above tags, with the same value as the xml:lang attribute.

Further reading

Language declarations explained

Using attributes to declare language

Sources

A lang attribute value did not match an xml:lang value when they appeared together on the same tag.

Conditions and severity

[rep_lang_conflict]
  • Error: In any tag the lang and xml:lang attributes don't match.

Explanation

In each of the following tag or tags the language values of the lang and xml:lang attributes don't match:

%1

What to do

Change one of the values in each tag by editing the markup.
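Tags with conflicting values can be listed with a small parser. This sketch uses Python's standard `html.parser`; the class and function names are hypothetical, and comparing the two values without regard to case is an assumption rather than documented checker behaviour.

```python
from html.parser import HTMLParser


class LangConflictFinder(HTMLParser):
    """Record tags whose lang and xml:lang attribute values differ."""

    def __init__(self):
        super().__init__()
        self.conflicts = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        lang, xml_lang = attr.get("lang"), attr.get("xml:lang")
        if lang is None or xml_lang is None:
            return  # only tags carrying both attributes are compared
        # Assumption: language tags are compared case-insensitively.
        if lang.lower() != xml_lang.lower():
            self.conflicts.append((tag, lang, xml_lang))


def find_lang_conflicts(markup: str):
    finder = LangConflictFinder()
    finder.feed(markup)
    finder.close()
    return finder.conflicts


sample = '<html lang="en" xml:lang="fr"><p lang="de" xml:lang="de">x</p></html>'
print(find_lang_conflicts(sample))
```

Each tuple gives the tag name followed by the lang and xml:lang values that need to be reconciled.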

Further reading

Language declarations explained

Using attributes to declare language

Sources

The html tag has no language attribute

Conditions and severity

[rep_lang_no_lang_attr]
  • Warning: The html tag has no xml:lang attribute and no lang attribute.

Explanation

There is no language attribute in the html tag.

%1

A language attribute on the html tag sets the default natural language for the page. This information can be used for processing the content in various ways, including such things as spell-checking, accessibility, data formatting, and choice of styles for rendering the page. Every page should have the correct default language specified.

For HTML files, this should be a lang attribute. For XHTML served as HTML you should use both the lang and xml:lang attributes. For files served as XML only, you should have xml:lang, but you don't need to have the lang attribute.

What to do

html,html5

Add a lang attribute that indicates the default language of your page.

Example: lang='de'

xhtml

Since this is an XHTML page served as HTML, add both a lang attribute and an xml:lang attribute to the html tag to indicate the default language of your page. The lang attribute is understood by HTML processors, but not by XML processors, and vice versa.

Example: lang="de" xml:lang="de"

xhtml10x,xhtml11x

Add an xml:lang attribute that indicates the default language of your page.

Example: xml:lang='de'

Further reading

Language declarations explained

Using attributes to declare language

Choosing language values

Sources

The language declaration in the html tag will have no effect

Conditions and severity

[rep_html_no_effective_lang]
  • Warning: html,html5,xhtml The html tag has no lang attribute, but has an xml:lang attribute.
  • Warning: xhtml10x,xhtml11x The html tag has no xml:lang attribute, but has a lang attribute.

Explanation

This is the html tag in this document.

%1

A language attribute on the html tag sets the default natural language for the page. This information can be used for processing the content in various ways, including such things as spell-checking, accessibility, data formatting, and choice of styles for rendering the page. Every page should have the correct default language specified.

HTML parsers only recognize the lang attribute. XML parsers only recognize the xml:lang attribute. On this page the wrong attribute is being used, and so the default language of the page is not being recognized.

What to do

html,html5

Since this page is served as HTML, use the lang attribute.

xhtml

Since this page is served as HTML, use the lang attribute. If there is a chance that the same page will also be processed by an XML parser, use both the lang attribute and the xml:lang attribute.

xhtml10x,xhtml11x

Since this page is served as XML, use the xml:lang attribute instead of a lang attribute. If there is a chance that this page will also be served as text/html in some circumstances, use both.

Further reading

Language declarations explained

Using attributes to declare language

Choosing language values

Sources

This HTML file contains xml:lang attributes

Conditions and severity

[rep_lang_xml_attr_in_html]
  • Error: html In any tag there is an xml:lang attribute.

Explanation

The page contains xml:lang attributes in the following places:

%1

The xml:lang attribute is not valid unless you are using XHTML.

What to do

Remove the xml:lang attributes from the markup, replacing them, where appropriate, with lang attributes.

Further reading

Language declarations explained

Choosing language values

Sources

A language attribute value was incorrectly formed

Conditions and severity

[rep_lang_malformed_attr]
  • Error: Any tag has an xml:lang or lang attribute with a value that is not just a-zA-Z0-9 plus hyphen.

Explanation

In the following tag or tags the language values of the lang and xml:lang attributes are not well-formed according to BCP47. Attribute values must contain a maximum of one language tag, and a language tag is composed of one or more subtags taken from the IANA Language Subtag Registry, separated by hyphens (eg. zh-Hans-SG).

%1

What to do

Change the attribute values to conform to BCP47 syntax rules.
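The trigger condition for this report (a value containing anything other than a-z, A-Z, 0-9, and hyphen) can be expressed as a single regular expression. The sketch below mirrors only that character check, not full BCP47 well-formedness or registry lookup; the names are hypothetical.

```python
import re

# Only ASCII letters, digits, and hyphens, and at least one character.
ALLOWED_LANG_CHARS = re.compile(r"[A-Za-z0-9-]+")


def has_only_allowed_lang_chars(value: str) -> bool:
    """Return True if value consists solely of a-z, A-Z, 0-9, and hyphen."""
    return ALLOWED_LANG_CHARS.fullmatch(value) is not None


for candidate in ("zh-Hans-SG", "en_US", "en, fr"):
    print(candidate, has_only_allowed_lang_chars(candidate))
```

Underscores and comma-separated lists are the kinds of values this check rejects.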

Further reading

Language declarations explained

Choosing language values

Sources

Language: Content-Language meta

Content-Language meta element used to set the default document language

Conditions and severity

[rep_content_lang_meta]
  • Error: html5 The page contains a meta element with the http-equiv attribute set to Content-Language.
  • Warning: html,xhtml,xhtml10x,xhtml11x The page contains a meta element with the http-equiv attribute set to Content-Language.

Explanation

This page uses a meta element with the http-equiv attribute value set to Content-Language.

%1

The HTML5 specification has made this type of meta element obsolete in HTML, so you should not use it for pages written in HTML5. This is due to the widespread confusion surrounding the use of this construct. In addition, browsers are inconsistent in the way they handle this information.

Given this, it is strongly recommended that you not use this Content-Language meta element in any HTML format.

What to do

Remove the Content-Language meta element, and ensure that you have used an attribute on the html tag to specify the default language of the page.

Further reading

Language declarations explained

Using attributes to declare language

Choosing language values

Sources

Non-Latin attribute values

Class or id names found that are not in Unicode Normalization Form C

Conditions and severity

[rep_misc_non_nfc]
  • Warning: Non-NFC text found in a class or id attribute.

Explanation

Unicode allows you to represent certain letters using different combinations of bytes. For example é can be represented as LATIN SMALL LETTER E WITH ACUTE or as LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT. To avoid problems when trying to match class or id names against CSS selectors, or for JavaScript lookup, all your markup tags and CSS and JavaScript code should use the same byte combinations for the same text, ie. be normalised.

Total number of non-NFC names: %1.

%2

What to do

It is recommended to save all content as Unicode Normalization Form C (NFC).
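Whether a class or id name is already in NFC can be tested with Python's standard `unicodedata` module. This is a minimal sketch; `is_nfc` is a hypothetical helper.

```python
import unicodedata


def is_nfc(name: str) -> bool:
    """Return True if name is unchanged by normalization to NFC."""
    return unicodedata.normalize("NFC", name) == name


composed = "caf\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

print(is_nfc(composed), is_nfc(decomposed))
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Applying `unicodedata.normalize("NFC", ...)` to the markup, CSS, and JavaScript sources is one way to bring them into the same form.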

Further reading

Unicode normalization forms

Sources

Markup: general

%1 tags found with no class attribute

Conditions and severity

[rep_misc_tags_no_class]
  • Comment: A b or an i tag is found without a class attribute.

Explanation

One or more %1 tags that don't use a class attribute were found in the source code for this page. These tags may cause problems for localization if the content for which they are used has more than one semantic value.

Total number of %1 tags: %2.

Number of %1 tags without a class attribute: %3.

What to do

You should not use %1 tags if there is a more descriptive and relevant tag available. If you do use them, it is usually better to add class attributes that describe the intended meaning of the markup, so that you can distinguish one use from another.

Further reading

Using <b> and <i> tags

Sources

Markup: direction

Incorrect values used for dir attribute

Conditions and severity

[rep_dir_incorrect]
  • Error: html,xhtml,xhtml10x,xhtml11x A dir attribute contains values that are not rtl or ltr.
  • Error: html5 A dir attribute contains values that are not rtl, ltr or auto.

Explanation

html,xhtml,xhtml10x,xhtml11x

In the following tag or tags the value should be one of rtl or ltr:

%1

html5

In the following tag or tags the value should be one of rtl, ltr, or auto:

%1

What to do

Correct the attribute values.
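The two conditions differ only in whether auto is accepted, which a small lookup makes explicit. In this sketch the names `ALLOWED_DIR_VALUES` and `is_valid_dir` are hypothetical, and the case-insensitive comparison is an assumption, not documented checker behaviour.

```python
ALLOWED_DIR_VALUES = {
    "html5": {"rtl", "ltr", "auto"},
    "legacy": {"rtl", "ltr"},  # html, xhtml, xhtml10x, xhtml11x
}


def is_valid_dir(value: str, html5: bool = False) -> bool:
    """Return True if value is an accepted dir attribute value for the format."""
    allowed = ALLOWED_DIR_VALUES["html5" if html5 else "legacy"]
    # Assumption: values are compared without regard to case.
    return value.lower() in allowed


print(is_valid_dir("auto"), is_valid_dir("auto", html5=True))
```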

Further reading

Markup for text direction explained

Setting up a right-to-left page

Changing the direction of a block element

Mixing text direction inline

Sources

bdo tags found with no dir attribute

Conditions and severity

[rep_markup_bdo_no_dir]
  • Error: A bdo tag exists with no dir attribute.

Explanation

One or more bdo tags that don't use a dir attribute were found in the source code for this page. Without a dir attribute, the bidirectional override will not be applied.

Total number of bdo tags: %2.

Number of bdo tags without a dir attribute: %3.

What to do

Add a dir attribute to each bdo tag.
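The two counts in this report can be reproduced with a short parser. The sketch below uses Python's standard `html.parser`; `BdoCounter` and `count_bdo` are hypothetical names and not the checker's implementation.

```python
from html.parser import HTMLParser


class BdoCounter(HTMLParser):
    """Count bdo start tags, and those that lack a dir attribute."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.without_dir = 0

    def handle_starttag(self, tag, attrs):
        if tag != "bdo":
            return
        self.total += 1
        if "dir" not in dict(attrs):
            self.without_dir += 1


def count_bdo(markup: str):
    """Return (total bdo tags, bdo tags without dir)."""
    counter = BdoCounter()
    counter.feed(markup)
    counter.close()
    return counter.total, counter.without_dir


print(count_bdo('<p><bdo dir="rtl">abc</bdo> <bdo>def</bdo></p>'))
```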

Further reading

Directional markup explained

Overriding the Unicode bidirectional algorithm

Sources

By: Richard Ishida, W3C.

Content first published 2011-07-08 18:08. Last substantive update 2011-07-08 18:08 GMT. This version 2011-07-08 18:08 GMT

For the history of document changes, search for article-checker in the i18n blog.