Intended audience: newcomers to internationalization who want to change the encoding of their (X)HTML pages.
How do I change the encoding of my (X)HTML pages to UTF-8?
So you've heard that it's useful to encode your pages in UTF-8 rather than a legacy encoding such as Windows 1252 or ISO 8859-1, and you've heard that others are doing it, but you're not sure how to do it. This page will help.
This article summarises the information you need. Follow the embedded links to other articles on the site if you need detailed information about any step.
It is not sufficient to just change the declarations inside your pages to say that the page is encoded in UTF-8. You must ensure that your data is actually encoded, i.e. saved, in UTF-8. If you are working with hand-edited files, use your editor to save each file in UTF-8 rather than the encoding you were using. If you are building files from scripts and databases, ensure that the data is converted as necessary and that the correct parameters are set in your scripting environment.
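For files that are not hand-edited, the conversion described above can be scripted. The following is a minimal sketch in Python, assuming the existing data is in Windows 1252 (Python codec name `cp1252`); the function names `reencode_to_utf8` and `convert_file` are hypothetical and chosen only for illustration. If your pages use a different legacy encoding, pass its codec name instead.

```python
from pathlib import Path


def reencode_to_utf8(data, source_encoding="cp1252"):
    """Decode bytes from a legacy encoding and re-encode them as UTF-8.

    Decoding is strict: a byte that is not valid in source_encoding
    raises UnicodeDecodeError instead of being silently replaced.
    """
    text = data.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src, dst, source_encoding="cp1252"):
    """Read src as source_encoding and write the same text to dst as UTF-8."""
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(reencode_to_utf8(raw, source_encoding))
```

Writing to a separate destination file keeps the original available in case the assumed source encoding turns out to be wrong.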
Note that you may have to ensure that the data does not include a UTF-8 signature, also known as a byte-order mark (BOM).
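If you do need to remove the signature, one possible approach is sketched below in Python. It assumes the page is available as raw bytes; the function name `strip_utf8_bom` is hypothetical.

```python
import codecs


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 signature (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Data that does not start with the signature is returned unchanged, so the function is safe to apply to every file.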
You should change the character encoding declaration in your page (or add one if you don't already declare it).
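To find out what a page currently declares, a rough check like the following Python sketch may help. It is an illustration under stated assumptions, not a full HTML parser: it looks only at the first 1024 bytes, and it recognises only an XML declaration, a `<meta charset="...">` element, or a `<meta http-equiv="Content-Type" ...>` element. The function name `declared_encoding` is hypothetical.

```python
import re

# Matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)

# Matches <?xml version="1.0" encoding="..."?> at the very start of the data.
_XML_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z0-9._:-]+)["']"""
)


def declared_encoding(page):
    """Return the encoding name declared inside the page, lowercased, or None."""
    head = page[:1024]  # assumption: the declaration appears near the start
    for pattern in (_XML_DECL, _META_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return None
```

A result of `None` means no declaration was found by this simple check, which suggests one should be added.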
Although your data is in UTF-8 and you have declared it in the page, your server may still be serving the page with an accompanying HTTP header that says it is something else. The declaration in the HTTP header will override information inside the page.
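To see whether the server is sending a conflicting declaration, you can inspect the Content-Type header of the response. The Python sketch below uses only the standard library; the function names are hypothetical, and `effective_encoding` models only the precedence rule stated above (the HTTP header wins over the in-page declaration), not every detail of how browsers detect encodings.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(header_value):
    """Extract the charset parameter from a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def served_charset(url):
    """Fetch url and return the charset named in its HTTP Content-Type header, or None."""
    with urlopen(url) as response:
        return response.headers.get_content_charset()


def effective_encoding(http_charset, in_page_charset):
    """Apply the precedence rule: the HTTP header, if it names a charset, overrides the page."""
    return http_charset or in_page_charset
```

If `served_charset` returns something other than `utf-8` for a page you have converted, the server configuration needs to change as well.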
You normally need server administrator privileges to change the encoding sent in the HTTP header, though you may be able to do so yourself even if you are serving files via an ISP. If in doubt, consult your server administrator. See the explanation of one way to do this for an Apache server.
Content first published 2005-08-26. Last substantive update 2005-08-26 12:51 GMT. This version 2010-08-27 20:50 GMT.
For the history of document changes, search for qa-changing-encoding in the i18n blog.
Copyright © 2005-2010 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.