Access to a Web for All has been a fundamental concern and goal of the World Wide Web Consortium since the beginning, and is a natural requirement for Web-based applications, given that they can be accessed by people around the world. Unfortunately, it is easy to overlook the needs of people in cultures different to your own, or who use different languages or writing systems. If you do, you will build applications and content that, in fact, present barriers for the use of your technology or content by many people around the world.
What is Internationalization?
Translation and localization are NOT what we mean by 'internationalization'. Surprised? Let me explain.
If you internationalize, you design or develop your content, application, specification, and so on, in a way that ensures it will work well for, or can be easily adapted for, users from any culture, region, or language. This is where you address the first set of barriers: not the fact that your user can't read or relate to your product, but the barriers that make it difficult to adapt your product so that they can.
It's essentially a Quality approach: one that sees you taking action early in the development cycle so that you avoid costly and sometimes prohibitive obstacles when it comes to rolling out your product to new marketplaces.
A universal code base
Fundamental to internationalization is ensuring that your product supports text in any writing system of the world. You should ensure that your product is built on the universal character set, Unicode. This means not only the HTML page that you serve to your user, but all the backend databases, content management systems, scripts, and so forth. There are plenty of examples of beautiful user interfaces that handle deftly any language that you need, but that return gobbledegook after the data has been processed behind the scenes.
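To make the "gobbledegook" concrete, here is a minimal Python sketch of what can happen when one backend step decodes UTF-8 bytes with a legacy encoding. The sample string and the choice of Latin-1 as the mismatched encoding are assumptions for illustration only.

```python
# Text entered by a user in a Unicode-aware front end (illustrative sample).
original = "café"

# The front end serializes the text as UTF-8 bytes.
wire_bytes = original.encode("utf-8")

# A backend step that wrongly assumes a legacy single-byte encoding
# (Latin-1 here, as an assumption) silently produces garbage.
mangled = wire_bytes.decode("latin-1")

# Decoding with the encoding that was actually used round-trips cleanly.
restored = wire_bytes.decode("utf-8")

print(mangled)   # cafÃ©
print(restored)  # café
```

No error is raised in the mangled case, which is why this kind of problem often goes unnoticed until real multilingual data flows through the whole pipeline.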
You'll also want to ensure that it's possible to easily swap in translations of any natural language text that will be read by humans (including error messages, JSON strings, etc.), and that such text carries metadata about its language and direction. The language metadata is important to get the fonts right, and to allow for support of the different typographic styles used around the world (for things such as line-breaking, text justification, emphasis or other text decorations, text selection and units, etc.).
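One way to picture this is a message catalog in which every string travels with its language and direction. The sketch below is only an illustration: the `LocalizedText` type, the `CATALOG` structure, the message id, and the JSON field names are all hypothetical, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalizedText:
    """A user-visible string together with its language and base direction."""
    text: str
    lang: str  # BCP 47 language tag, e.g. "en" or "ar"
    dir: str   # "ltr" or "rtl"


# Hypothetical catalog keyed by message id, then by language tag.
CATALOG = {
    "error.not_found": {
        "en": LocalizedText("Page not found", "en", "ltr"),
        "ar": LocalizedText("الصفحة غير موجودة", "ar", "rtl"),
    },
}


def lookup(message_id: str, lang: str, fallback: str = "en") -> LocalizedText:
    """Return the message in the requested language, else in the fallback."""
    translations = CATALOG[message_id]
    return translations.get(lang, translations[fallback])


def to_json_payload(message: LocalizedText) -> dict:
    """Keep the metadata next to the string when it leaves the server."""
    return {"value": message.text, "lang": message.lang, "dir": message.dir}


print(to_json_payload(lookup("error.not_found", "ar")))
```

Because the fallback string keeps its own `lang` value, a consumer that receives English text in place of a missing translation can still label it correctly.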
It's advisable to clearly separate semantics (markup) from styling (CSS), and to avoid hard-coding content that assumes a particular order of text, or a particular set of punctuation marks, etc.
Did you know that one of the most widely used writing systems in the world, after the Latin script, is Arabic? The script is used for many languages, often with variations in the way that vowels are represented, or with slightly different repertoires. But what all these languages have in common is that they are read predominantly from right to left. This also has implications for layout: things such as table columns, spreadsheets, graphs, cascading menus, and even web page layout, are normally mirror-images of content produced in English. So instead of using values like 'left' and 'right' in your style sheet, you should use logical values such as 'start' and 'end': that way, when the direction of a page changes, the mirroring happens automatically and without the need for the translator to mess with your code.
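The idea behind logical values can be sketched in a few lines of Python. The `resolve_side` helper below is hypothetical; it simply shows how 'start' and 'end' map to a physical side once the direction of the page is known, which is the resolution a layout engine performs for you.

```python
def resolve_side(logical: str, direction: str) -> str:
    """Map a logical side ('start' or 'end') to a physical side for a direction."""
    if logical not in ("start", "end"):
        raise ValueError(f"unknown logical side: {logical!r}")
    if direction not in ("ltr", "rtl"):
        raise ValueError(f"unknown direction: {direction!r}")
    # In left-to-right text the line starts on the left; in right-to-left
    # text it starts on the right, so the two sides swap.
    leading, trailing = ("left", "right") if direction == "ltr" else ("right", "left")
    return leading if logical == "start" else trailing


print(resolve_side("start", "ltr"))  # left
print(resolve_side("start", "rtl"))  # right
```

The content author only ever writes 'start' or 'end'; switching the direction flips the result without touching the authored value.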
Actually, it's even more complicated than that. Arabic mixes right-to-left and left-to-right text on the same line, and it is important to be able to control the direction of the surrounding context for that to work properly. It's also important to handle data strings in a way that preserves information about their base direction, so that when they are used on the user interface they don't look mangled.
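As a rough illustration, the Python sketch below guesses a string's base direction from its first strongly directional character and wraps the string in Unicode isolate controls before it is dropped into surrounding text. The function names are hypothetical, and first-strong detection is only a heuristic: it can guess wrongly, for example when a right-to-left sentence begins with a Latin-script word, which is why carrying explicit direction metadata with the data is the more reliable option.

```python
import unicodedata

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Guess base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # Only digits, spaces, punctuation, or nothing at all: use the default.
    return default


def isolate(text: str, direction: str) -> str:
    """Wrap text so its direction does not interfere with the surrounding line."""
    opener = RLI if direction == "rtl" else LRI
    return f"{opener}{text}{PDI}"


print(first_strong_direction("Hello"))     # ltr
print(first_strong_direction("مرحبا"))      # rtl
print(first_strong_direction("123 שלום"))  # rtl
```

When the data already carries a stored direction, pass that value to `isolate` directly and keep the heuristic as a fallback.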
And it's not just Arabic. Right-to-left writing is used for Hebrew, for South Asian languages such as Dhivehi (Maldivian) and Rohingya, and for fast-growing African scripts such as Adlam and N'Ko.
Names, addresses, and such
If you are dealing with HTML forms or creating databases for information such as people's names and addresses, you will need to consider how to handle the many different approaches to formatting data that exist around the world.
In some countries people only use a single name, or write their name using the family then given order. They may have single letter names, or very long names. Street addresses in Japanese go from the general (country or prefecture) to the specific (house location) from top to bottom, and there are plenty of variations on that theme. (In fact, Japanese homes typically don't have house numbers at all.)
You'll need to consider how you'll cope with acquiring and storing this kind of data (and many others, with region-specific approaches). The more you can make your system flexible up-front, the easier a time you'll have when you want to support people in a new locale.
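One flexible approach, sketched below in Python, is to store the name as a single free-form field exactly as the person writes it, with optional extra fields for sorting and for addressing the person. The `PersonName` type and its field names are hypothetical, and the sample names are placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    """A name record that does not assume given/family parts or their order."""
    full_name: str                         # exactly as the person writes it
    sort_key: Optional[str] = None         # optional form used for sorting
    preferred_short: Optional[str] = None  # optional form used in greetings


def greeting_name(name: PersonName) -> str:
    """Use the short form when the person supplied one, else the full name."""
    return name.preferred_short or name.full_name


single = PersonName(full_name="Dewi")  # a person with a single name
family_first = PersonName(
    full_name="山田太郎",        # family name written before given name
    sort_key="やまだたろう",      # phonetic reading supplied for sorting
    preferred_short="山田さん",
)

print(greeting_name(single))        # Dewi
print(greeting_name(family_first))  # 山田さん
```

Nothing here imposes a minimum or maximum length, or requires a "first" and "last" name, so single names, single-letter names, and very long names all fit the same record.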
Oh, and by the way, these people don't speak or write in English, and they tend to sort their data in very different ways, so you'll also need to figure out whether that's going to cause a problem for your backstore or back office, and put plans in place to address it as you localize.
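To see why sorting deserves attention, the Python sketch below compares the default code point ordering with a rough key that ignores case and strips combining accents. The `rough_sort_key` function is a hypothetical approximation only. Real collation rules differ from language to language, so a production system would normally rely on locale-aware collation data rather than a helper like this.

```python
import unicodedata


def rough_sort_key(text: str) -> str:
    """Approximate key: decompose, drop combining marks, and ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["Zebra", "apple", "Äpfel"]

# Default sorting compares code points: uppercase ASCII sorts before
# lowercase ASCII, and accented letters sort after both.
by_code_point = sorted(words)

# The rough key places "Äpfel" next to "apple", which is closer to what
# many readers expect, but it is not correct for every language.
by_rough_key = sorted(words, key=rough_sort_key)

print(by_code_point)  # ['Zebra', 'apple', 'Äpfel']
print(by_rough_key)   # ['Äpfel', 'apple', 'Zebra']
```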
Time zones, currencies, dates, etc.
You will usually want to store data internally in one standard form, but display it in ways that look natural to local users. As well as the names and addresses already mentioned, does the person working with your app or content expect to see periods or commas as decimal separators? How about the order of day, month, and year, or even which day begins the week in a calendar?
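The split between stored form and displayed form might look like the following Python sketch. The `SEPARATORS` table is a tiny hand-written assumption covering three locales for illustration; a real application would take these conventions from a maintained locale data source instead.

```python
from decimal import Decimal

# Illustrative subset only: (grouping separator, decimal separator).
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": ("\u202f", ","),  # narrow no-break space for grouping
}


def format_decimal(value: Decimal, locale_tag: str) -> str:
    """Render a stored Decimal with two fraction digits for display."""
    group_sep, decimal_sep = SEPARATORS[locale_tag]
    # Format with fixed ASCII separators first, then swap in the local ones.
    ascii_form = f"{value:,.2f}"
    integer_part, fraction_part = ascii_form.split(".")
    return integer_part.replace(",", group_sep) + decimal_sep + fraction_part


stored = Decimal("1234567.891")  # one internal representation

print(format_decimal(stored, "en-US"))  # 1,234,567.89
print(format_decimal(stored, "de-DE"))  # 1.234.567,89
```

The stored value never changes; only the rendering does, so adding a locale later means adding display data rather than migrating records.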
You may also need to support alternative calendars, time zones, and daylight saving time, in both native and transliterated forms, etc. Did you know that numerous countries around the world have local calendars, and use them on a regular basis? Birth dates are typically recorded in the Imperial calendar in Japan, and newspapers in Thailand usually carry the date in the Buddhist calendar (the Western year 2021 is 2564 in Thailand). Any app you create needs to be able to adjust information for the appropriate time zone.
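The Python sketch below stores one instant in UTC and derives a display date for a user in Thailand. It rests on stated assumptions: a fixed UTC+7 offset is used because Thailand does not observe daylight saving time (zones that do need time zone database rules rather than a fixed offset), and the Buddhist year is taken as the Gregorian year plus 543, which matches the 2021 and 2564 pairing above for modern dates. The `buddhist_year` helper and the day/month/year display order are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

# Stored form: one unambiguous instant in UTC (illustrative value).
stored = datetime(2021, 12, 31, 20, 30, tzinfo=timezone.utc)

# Assumption: fixed offset for Thailand, which has no daylight saving time.
BANGKOK = timezone(timedelta(hours=7), "ICT")


def buddhist_year(gregorian_year: int) -> int:
    """Convert a Gregorian year to the Thai Buddhist era for modern dates."""
    return gregorian_year + 543


# Display form: convert to local time first, then change the calendar year.
local = stored.astimezone(BANGKOK)
display = f"{local.day}/{local.month}/{buddhist_year(local.year)}"

print(local.isoformat())  # 2022-01-01T03:30:00+07:00
print(display)            # 1/1/2565
```

Note that the time zone shift moves this instant into the next day and the next year for the local user, so the calendar conversion has to be applied after the time zone conversion, not before.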
Equally, you'll need to consider how to handle users who work with a range of currencies. In addition to deciding how to format and represent monetary data when displayed to the user, you also should consider how to put in place mechanisms to manage diverse currency systems. How will you develop pricing models for different countries, which may have large variations in standard of living? How will you convert subscriptions and payments from one currency to another?
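One small piece of such a mechanism is sketched below in Python: converting an amount and rounding it to the number of minor units the target currency uses. The exchange rates are made-up values, the `MINOR_UNITS` table is a three-entry excerpt for illustration, and the half-up rounding mode is an assumption; your own business or regulatory rules may call for something different.

```python
from decimal import Decimal, ROUND_HALF_UP

# Number of digits after the decimal separator for a few currencies.
MINOR_UNITS = {"USD": 2, "JPY": 0, "KWD": 3}


def convert(amount: Decimal, rate: Decimal, target: str) -> Decimal:
    """Convert an amount at the given rate, rounded to the target's minor units."""
    # Decimal(1).scaleb(-2) is 0.01, scaleb(0) is 1, and so on.
    quantum = Decimal(1).scaleb(-MINOR_UNITS[target])
    return (amount * rate).quantize(quantum, rounding=ROUND_HALF_UP)


# Rates below are invented for the example.
print(convert(Decimal("19.99"), Decimal("150.25"), "JPY"))  # 3003
print(convert(Decimal("1000"), Decimal("0.0067"), "USD"))   # 6.70
print(convert(Decimal("10"), Decimal("0.3071"), "KWD"))     # 3.071
```

Using `Decimal` rather than binary floating point keeps the arithmetic exact, and keeping the minor-unit count in data avoids hard-coding the assumption that every currency has two decimal places.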
Cultural norms & expectations
You'll also want to do some homework in advance about the cultural preferences and habits of the marketplaces where you want your application to be used, and choose flexible content design technologies and processes so that you can later support others.
For example, symbolism can be culture-specific. The check mark means correct or OK in many countries, but in some countries, such as Japan, it can be used to mean that something is incorrect. Japanese localizers may need to convert check marks to circles (their symbol for 'correct') as part of the localization process.
If you want your product to appeal to users, you'll need to be using content management systems that give you the ability to flex colours, layout, and information structures, as well as introducing local colour. But you'll also need to ensure that you are not hard-coding graphics or images that offend or alienate users in another region.
And then there are quite fundamental questions for monetization applications. Is the community one that is familiar with credit card transactions? Does the population you want to reach have access to sufficient bandwidth (or even to the internet at all) when they need to use your application? Do the banking or other systems that your application interacts with support the language of the user? And remember that a large majority of users these days interface with the Web via mobile devices.
And have you taken into account local regulatory and legal considerations in the various territories your application will reach?
And then localize
The things we have discussed so far all need some attention and preparation while you are planning and building your application. Otherwise, you could be, instead, building barriers for yourself when it comes to the exciting phase where you translate and adapt your product for various local languages and markets.
The localization phase is where you actually adapt for different users. You change the language via translation; you change the graphics and colours, where appropriate; you flick that text direction switch; you make available alternative data collection forms and processes; you write locally-relevant content, and so on.
Internationalization means foreseeing and planning for that phase from the earliest possible moment, so that not only are you ready when the time comes, but you can avoid digging yourself into holes that may be costly to get out of later on down the line.