Jean DOUGNAC
TELIS Télécom
8, rue Sainte Barbe BP2089
F-13203 MARSEILLE CEDEX 01
FRANCE

Tel : +33.91.13.95.11
Fax : +33.91.91.63.43



POSITION PAPER FOR WWW INTERNATIONALIZATION WORKSHOP


PARIS, MAY 6-10, 1996


INTRODUCTION

WHAT A MULTILINGUAL APPLICATION IS

CONSULTING MULTILINGUAL INFORMATION

EXTENDED NOTION OF CHARACTER SET

EXISTING SOFTWARE CHARACTERISTICS


INTRODUCTION

Written for the WWW internationalization workshop, this paper draws on the experience of designing multilingual software, originally intended to support telematics applications on a worldwide basis.

Similar technical questions arise when developing multilingual applications on the World-Wide Web. This paper introduces the main aspects to consider for that purpose.


WHAT A MULTILINGUAL APPLICATION IS

Typically, a multilingual application is one that offers the end user a list of languages from which to select a preferred working language. In the context introduced above, we may imagine a multilingual service that first presents a banner for choosing a working language among several scripts: English, French, Japanese, Arabic, Russian, Hindi or Hebrew in the following screen example.



CONSULTING MULTILINGUAL INFORMATION

Multilingual information is consulted by means of a terminal station, whose functional diagram could be as follows.

On the left of the block diagram, the network interface links the terminal to the server that provides the information to be consulted.

In a multilingual approach, this server must specify to the terminal:

The selection and switching of character sets are triggered by ISO 2022 escape sequences.

Unless character sets are identified as private sets (a method that is not recommended), their identification is standardized by ISO 2375. Keep in mind that much of the confusion in character set identification is due to the fact:


EXTENDED NOTION OF CHARACTER SET

Character set standards specify the codes assigned to characters, respecting the following principles as far as possible:

In addition to assigning codes to characters, managing a character set also involves processing:

These points lead to a representation of a character set by means of four planes dealing with:

Because the management of the last three planes is proprietary, the software introduced at the beginning of this paper faced the problem of managing a wide range of script rules while keeping the code volume below the 640 KB memory limit of MS-DOS.

The workable implementation was to distinguish:

In addition to these architectural characteristics, font files also describe the keyboard layout.

Reconfiguring this layout is a local function of the station, transparent to the management of remote applications.

Since these mechanisms are proprietary rather than standardized, using such fonts, whether installed on the station or downloaded, implies delivering to the station at the same time the portion of code capable of interpreting these proprietary data.

It is strongly suggested that this portion of code be implemented as an applet.


EXISTING SOFTWARE CHARACTERISTICS

The software is registered internationally under the ORIENT GATEWAY(R) trademark.


Publication

"Terminal multilingue : une approche universelle" (Multilingual terminal: a universal approach)

J. DOUGNAC, Télécom Review, Paris (France), Dec 1995