Checking HTTP Headers

Intended audience: users, XHTML/HTML coders (using editors or scripting), script developers (PHP, JSP, etc.), Web project managers, and anyone who needs to know how to check the character encoding declared in the HTTP header.

Updated 2010-08-28 7:29

Question

How can I check the character encoding information sent in the HTTP header of a web document?

Background

It is important to clearly indicate the character encoding (charset) of a document served on the Web. Otherwise, a receiver may not correctly interpret the document. A Web browser, for example, may show random characters instead of readable text. One way of indicating the character encoding of a Web document is to put this information into the charset parameter of the Content-Type header.

In particular, it is important to note that the encoding declared in the HTTP header overrides all in-document encoding declarations in HTML and CSS files.
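
The precedence rule above can be sketched in a few lines of Python. The function name `effective_encoding` and its two arguments are hypothetical. The sketch covers only the two sources named above (the HTTP header and an in-document declaration); it deliberately ignores byte-order marks and any guessing a browser may do.

```python
from typing import Optional


def effective_encoding(http_charset: Optional[str],
                       in_document_charset: Optional[str]) -> Optional[str]:
    """Return the encoding a receiver would use under the rule above.

    Both arguments are hypothetical inputs: the charset parameter of the
    Content-Type header, and the encoding declared inside the document
    (for example in a meta element or an @charset rule).
    """
    if http_charset:
        # The HTTP header wins, even when the document disagrees with it.
        return http_charset.strip().lower()
    if in_document_charset:
        return in_document_charset.strip().lower()
    return None  # Nothing declared; the receiver has to guess.
```

For example, `effective_encoding("ISO-8859-1", "utf-8")` returns `"iso-8859-1"`: a UTF-8 document served with that header would be decoded incorrectly.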

Answer

There are several ways to check the actual Web document served, including the headers:

The i18n Checker

The Internationalization Checker tool, developed by the W3C, checks web pages for various internationalization issues. It also has an information section that summarizes key internationalization-related information about a page, such as character encoding and language declarations. That section tells you whether an encoding declaration is used in the HTTP header and, if so, what the encoding is.

The i18n checker tool is particularly useful, since it also shows you other encoding declarations used in the document, and raises a flag if there are differences.
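
To illustrate the kind of comparison described above, here is a minimal sketch that collects encoding declarations from several sources and reports them when they disagree. It is not how the i18n checker is implemented. The names `canonical` and `find_conflicts` and the source keys (`"http"`, `"meta"`) are hypothetical, and labels are normalized with Python's codec registry, whose aliases are an assumption that may not match a browser's label table exactly.

```python
import codecs
from typing import Dict, Optional


def canonical(label: str) -> str:
    """Map an encoding label to Python's codec name, e.g. 'UTF8' -> 'utf-8'."""
    try:
        return codecs.lookup(label.strip()).name
    except LookupError:
        # Unknown to Python: fall back to a case-insensitive comparison.
        return label.strip().lower()


def find_conflicts(declarations: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return the declared encodings if they disagree, else an empty dict.

    Sources whose value is None (no declaration present) are ignored.
    """
    found = {source: canonical(label)
             for source, label in declarations.items() if label}
    if len(set(found.values())) > 1:
        return found
    return {}
```

Calling `find_conflicts({"http": "iso-8859-1", "meta": "utf-8"})` returns both entries, which is the situation worth flagging; `find_conflicts({"http": "UTF-8", "meta": "utf8"})` returns an empty dict because both labels name the same encoding.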

Use a Web-based service

Several services show you all the HTTP headers and the (HTML/XHTML) source of a document, as returned from the server, once you enter the address of the document you are interested in.

Note: W3C has no relationship to any of these services.

In the HTTP headers, look for the Content-Type header, and in particular for the charset parameter, e.g.

Content-Type: text/html; charset=utf-8

Note: The charset parameter may not be present. This is okay if your document itself indicates its character encoding.
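
If you have the header value as a string, the charset parameter can also be extracted programmatically. The following is a minimal sketch using Python's standard `email.message` module; the function name `charset_from_content_type` is hypothetical.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None.

    The result is lower-cased and unquoted by the standard library.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=utf-8"))  # utf-8
print(charset_from_content_type("text/html"))                 # None
```

A result of `None` corresponds to the case in the note above: the header carries no charset parameter, so the document itself has to indicate its encoding.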

Use the W3C Markup Validation Service

To check the markup, the Markup Validation Service has to make sure it correctly decodes the document it checks. It will show an error message if it cannot find information about the encoding, or if it finds conflicting information, or if it cannot decode the document according to the information it found.

To find out which encoding the validator detected, you can use the extended interface. In this interface, you can also select the show source option, and then visually check that the source is correctly interpreted. This is useful for checking that you actually use the right encoding. It is not always possible to check mechanically whether, for example, a document claiming to be iso-8859-1 is actually encoded using iso-8859-2 or some other encoding.
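
The limit on mechanical checking can be demonstrated with a short Python sketch. The sample word is an arbitrary choice for illustration. Every byte value is assigned a character in iso-8859-1 as Python implements it, so decoding never fails, and only a human reading the result can tell that it is wrong.

```python
# Bytes as they might be stored on a server: Polish text in iso-8859-2.
data = "żółć".encode("iso-8859-2")

as_latin2 = data.decode("iso-8859-2")  # the encoding actually used
as_latin1 = data.decode("iso-8859-1")  # the wrong label; no error is raised

print(as_latin2)  # żółć
print(as_latin1)  # ¿ó³æ
```

Both decodes succeed, which is why visually checking the source is the practical test.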

Use telnet or another command-line tool

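This requires a bit more expertise, but may be easier to automate. With telnet, you connect to the server's HTTP port and type the request by hand. Another command-line tool is wget: its -S (--server-response) option prints the headers sent by the server, and --save-headers (abbreviated -s in older releases) writes them to the output file ahead of the document.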
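
As one way to automate this, the sketch below does in Python what a telnet session does by hand: it opens a plain TCP connection, sends a HEAD request, and reads the Content-Type header from the raw response. All function names are hypothetical. It assumes an unencrypted HTTP server on port 80 and does not follow redirects.

```python
import socket
from typing import Optional


def build_head_request(host: str, path: str = "/") -> bytes:
    """Build the request one would otherwise type into a telnet session."""
    lines = [
        f"HEAD {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def content_type_from_response(raw: bytes) -> Optional[str]:
    """Return the Content-Type value from a raw HTTP response, if any."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            return value.strip()
    return None


def fetch_content_type(host: str, path: str = "/",
                       port: int = 80) -> Optional[str]:
    """Send the HEAD request over a plain TCP socket (no TLS)."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_head_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    return content_type_from_response(b"".join(chunks))
```

A call such as `fetch_content_type("example.org")` (a placeholder host) returns the header value, or `None` if the server sent no Content-Type. For servers that only answer over HTTPS, the standard library's `http.client.HTTPSConnection` is a more suitable starting point than a raw socket.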

By the way

Some servers transcode the Web documents they serve into different character encodings for different clients. This happens, for example, with some servers in Russia. This requires special care, because your browser, running e.g. on a Mac or on a Windows system, may report a different character encoding from the one reported by a Web-based service or by the W3C Markup Validation Service (both of which mostly run on UNIX systems).
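
One way to look for such transcoding is to request the same address with different request headers and compare the declared charsets. The sketch below is an illustration only. Which request headers a transcoding server actually reacts to is not stated here, so varying User-Agent is an assumption, and the profile names, header values, and function names are all hypothetical.

```python
import urllib.request
from typing import Callable, Dict, Optional


def declared_charset(url: str, extra_headers: Dict[str, str]) -> Optional[str]:
    """Return the charset from the Content-Type header of a HEAD response."""
    request = urllib.request.Request(url, headers=extra_headers, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get_content_charset()


def charsets_by_profile(
    url: str,
    profiles: Dict[str, Dict[str, str]],
    fetch: Callable[[str, Dict[str, str]], Optional[str]] = declared_charset,
) -> Dict[str, Optional[str]]:
    """Fetch the same URL once per client profile and collect the charsets."""
    return {name: fetch(url, headers) for name, headers in profiles.items()}


# Hypothetical client profiles; real browsers send much longer values.
PROFILES = {
    "windows": {"User-Agent": "ExampleBrowser/1.0 (Windows)"},
    "unix": {"User-Agent": "ExampleBrowser/1.0 (X11; Linux)"},
}
```

If `charsets_by_profile(url, PROFILES)` returns different values for different profiles, the server is probably transcoding, and the result from a Web-based service may not match what your own browser receives.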

By: Martin Dürst, W3C. Changed by: Richard Ishida, W3C.

Content first published 2003-06-16. Last substantive update 2010-08-28 7:29 GMT. This version 2010-08-28 7:29 GMT.

For the history of document changes, search for qa-headers-charset in the i18n blog.