World Wide Localization - slide "Character Encoding"

Originally a huge roadblock to networking
Part of traditional locale model
Really important for world wide localization
Largely solved due to Unicode (and XML)
Indication of encoding in protocol headers or formats:
- Content-Type: text/html;charset=iso-8859-6
- <?xml version='1.0' encoding='iso-8859-6'?>
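
A minimal sketch of how a receiver might read these two kinds of labels, assuming Python and its standard library. The helper names charset_from_content_type and encoding_from_xml_declaration are hypothetical, and the regular expression only covers an ASCII-compatible XML declaration at the very start of the data.

```python
import re
from email.message import Message
from typing import Optional

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
# Simplification: assumes the declaration itself is in an ASCII-compatible encoding.
_XML_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def encoding_from_xml_declaration(data: bytes) -> Optional[str]:
    """Return the lowercased encoding named in the XML declaration, or None."""
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii").lower() if match else None


# Usage: decode a body according to its own label.
arabic = "\u0645\u0631\u062d\u0628\u0627"  # an Arabic greeting, for illustration
body = (
    "<?xml version='1.0' encoding='iso-8859-6'?><p>" + arabic + "</p>"
).encode("iso-8859-6")
label = encoding_from_xml_declaration(body)
text = body.decode(label)
```

When both a protocol header and an in-document declaration are present, which one wins depends on the format and protocol in use; the sketch does not try to settle that.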
Unicode as a reference for conversion and processing ("Think Unicode"):
- Visible e.g. in numeric character references (&#x89AB; for 覫)
- Some remaining inaccuracies due to vendor-specific differences in conversion tables (see e.g. XML Japanese Profile)
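
Both points can be illustrated with a short sketch, assuming Python's standard library. A numeric character reference names a Unicode code point regardless of the document's encoding. Python's "shift_jis" and "cp932" codecs stand in here for two vendor conversion tables; the byte pair 0x81 0x60 is an example chosen for this sketch, not one taken from the slide.

```python
import html

ch = "\u89ab"  # the character 覫

# A numeric character reference is just the Unicode code point, written in ASCII.
decimal_ref = ch.encode("ascii", "xmlcharrefreplace").decode("ascii")
hex_ref = f"&#x{ord(ch):X};"

# Resolving either form gives back the same character.
from_decimal = html.unescape(decimal_ref)
from_hex = html.unescape(hex_ref)

# The same Shift_JIS-family byte pair maps to different Unicode characters
# depending on which conversion table is applied.
wave = b"\x81\x60"
as_jis = wave.decode("shift_jis")  # JIS-based table
as_cp932 = wave.decode("cp932")    # Microsoft code page 932 table

print(decimal_ref, hex_ref)
print(f"U+{ord(as_jis):04X}", f"U+{ord(as_cp932):04X}")
```

A round trip through two different tables can therefore change a character silently, which is the kind of inaccuracy the second bullet refers to.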
Even better: Unicode-based encodings (e.g. UTF-8) for transfer:
- UTF-8 (and UTF-16, with a BOM) are the defaults for XML when no encoding is declared
- XML processors are required to accept UTF-8 (and UTF-16)
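
A minimal sketch of the default rule, assuming Python and data that carries no encoding declaration. The function name sniff_xml_default_encoding is hypothetical, and the check is simplified: it looks only at the byte order mark and ignores the other auto-detection cases (for example UTF-32) that the XML specification describes.

```python
import codecs


def sniff_xml_default_encoding(data: bytes) -> str:
    """Pick a Python codec name for XML data that has no encoding declaration."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # UTF-8 with a BOM; this codec drops the BOM on decode
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to choose the byte order
    return "utf-8"          # no BOM: fall back to UTF-8


# Usage: the same document survives a round trip in either default encoding.
doc = "<p>\u0645\u0631\u062d\u0628\u0627</p>"
for codec in ("utf-8", "utf-8-sig", "utf-16"):
    raw = doc.encode(codec)
    assert raw.decode(sniff_xml_default_encoding(raw)) == doc
```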