Input Method Editor API

Abstract

This specification defines an “IME API” that provides Web applications with scripted access to an IME (input-method editor) associated with a hosting user agent. This IME API includes:

An InputMethodContext interface, which provides methods to retrieve detailed data from an in-progress IME composition, and to update that data.
A Composition interface, which represents read-only attributes about the current composition, such as the actual text being input, its length, and its target clause.
need to define what a clause is

This API is designed to be used in conjunction with DOM events and elements on the Web platform, notably: composition events and the Canvas 2D Context API [CANVAS-2D].

1. Introduction

This section is non-normative.

Even though existing Web-platform APIs allow developers to implement very complicated Web applications, such as visual chat applications, using technologies such as SVG or the <canvas> element and API, developers still have difficulty implementing Web applications that control input-method editors. To let hosting user agents expose the text being composed in an associated IME to Web applications, the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS] introduces composition events. Using composition events, Web applications can retrieve composition text from an IME.

However, Web applications can still run into difficulties when they manipulate IMEs on non-editable elements such as the <canvas> element; those difficulties include the fact that a Web application cannot do the following:

indicate to the user agent whether the Web application renders composition text by itself, or needs to ask the user agent to render it
determine the place where user agents render composition text
detect whether the user agent renders candidate windows by itself
determine the place where user agents render candidate windows

The Web platform has a number of existing APIs useful for implementing a custom IME in JavaScript. For example, the Web Storage API can store an IME dictionary, and the WebSocket and XMLHttpRequest APIs allow sending a server request that performs a lookup in an IME dictionary, and so on. In fact, Web-application developers are already developing and deploying JavaScript-based IMEs that use these APIs. However, it is currently difficult to make those JavaScript-based IMEs work on all user agents, because they often rely on APIs specific to the hosting user agent, such as browser extension APIs.

To solve these IME-related problems, this specification introduces an IME API that allows Web applications to read and write composition data made available by user agents. Moreover, this specification introduces interfaces for compositions, so Web applications can read detailed composition data and update it. A Composition object provides a reference to an ongoing IME composition, so Web applications can retrieve the composition text and attributes.

These APIs allow Web applications to set the position of a composition window and to retrieve the text and attributes of the ongoing composition.

Consider the following examples. The first example shows the source for a Web application that renders composition text by itself and uses the candidate window provided by an IME.

<!DOCTYPE html>
<html>
<head>
<script language="javascript" type="text/javascript">
function init() {
    window.inputmethodmanager.setOpenState(true);
    var node = document.getElementById('canvas0');
    node.getInputContext().setEnabled(true);
    node.addEventListener('compositionstart', onCompositionStart, false);
    node.addEventListener('compositionupdate', onCompositionUpdate, false);
    node.addEventListener('compositionend', onCompositionEnd, false);
}

function onCompositionStart(event) {
}

function onCompositionUpdate(event) {
    var x = 0;
    var y = 0;
    var canvas = document.getElementById('canvas0');
    var context = canvas.getContext('2d');
    var inputContext = canvas.getInputContext();
    var composition = inputContext.composition;

    // Render a caret.
    // NOTE: this just renders a caret rectangle in black for
    // simplicity.
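    if (composition.caret.start >= 0) {
        var caretEnd = composition.caret.start + composition.caret.length;
        var start = context.measureText(
                composition.text.substring(0, composition.caret.start));
        var end = context.measureText(
                composition.text.substring(0, caretEnd));
        context.fillStyle = 'black';
        // fillRect() takes (x, y, width, height); draw at least a
        // 1-pixel-wide bar when the caret length is 0.
        context.fillRect(start.width, y,
                Math.max(end.width - start.width, 1), 10);
    }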

    // Render the clauses in the composition.
    for (var i = 0; i < composition.clauses.length; ++i) {
        var text = composition.clauses[i].text;
        var metrics = context.measureText(text);
        // Draw the text of this clause.
        context.fillStyle = composition.clauses[i].textColor;
        context.fillText(text, x, y);
        // Draw an underline under the text. For simplicity, this code
        // draws a bold underline for selected clauses or a thin
        // underline for non-selected ones.
        // fillRect() takes (x, y, width, height).
        if (composition.clauses[i].selected) {
            context.fillRect(x, y, metrics.width, 2);
        } else {
            context.fillRect(x, y, metrics.width, 1);
        }
        x += metrics.width;
    }

    // Move the candidate window outside of the composition text.
    window.inputmethodmanager.moveCandidateWindow(0, y, x, y + 10);
}

function onCompositionEnd(event) {
}
</script>
</head>
<body onload="init();">
<canvas id="canvas0" width="640" height="480"></canvas>
</body>
</html>

The next example shows the source for a simple IME that composes Japanese Hiragana characters from key strokes.

This is just a sample and not suitable for real use.

<!DOCTYPE html>
<html>
<head>
<title></title>
<script language="javascript" type="text/javascript">
var imeActivated = false;
var imeRomajiInput = '';
var imeRomajiTable = {
    'A': '\u3042', 'I': '\u3044', 'U': '\u3046', 'E': '\u3048', 'O': '\u304A',
    /* suppressed */
};

function init() {
    // Disable the system IME associated with this window.
    window.inputmethodmanager.setOpenState(false);

    // Listen for keyboard events.
    var node = document.getElementById('input0');
    node.addEventListener('keydown', onKeyDown, false);
    node.addEventListener('keyup', onKeyUp, false);
}

function onKeyDown(event) {
    // Toggle the input mode when pressing a shift key.
    if (event.key == 'Shift') {
        imeActivated = !imeActivated;
        imeRomajiInput = '';
    }

    // Exit if this IME is not activated.
    if (!imeActivated)
        return true;

    var imeComposition = new Composition;
    var imeConfirm = false;

    if (event.keyCode < 0x20) {
        event.preventDefault();
        return true;
    }

    // Convert the input key strokes to a Japanese character.
    imeRomajiInput += String.fromCharCode(event.keyCode);
    if (imeRomajiTable[imeRomajiInput]) {
        imeComposition.text = imeRomajiTable[imeRomajiInput];
        imeConfirm = true;
        imeRomajiInput = '';
    } else {
        imeComposition.text = imeRomajiInput;
    }

    // Fill the Composition object.
    imeComposition.caret.start = imeComposition.text.length;
    imeComposition.caret.length = 1;
    imeComposition.clauses[0] = new CompositionClause;
    imeComposition.clauses[0].text = imeComposition.text;
    imeComposition.clauses[0].start = 0;
    imeComposition.clauses[0].selected = true;
    imeComposition.clauses[0].textColor = 'currentColor';
    imeComposition.clauses[0].backgroundColor = 'transparent';
    imeComposition.clauses[0].lineStyle = 'solid';
    imeComposition.clauses[0].lineColor = 'black';

    // Send the Composition object to the user agent.
    var context = event.target.getInputContext();
    context.setComposition(imeComposition);
    if (imeConfirm)
        context.confirmComposition();

    // Disable the default action to prevent this key from being
    // inserted.
    event.preventDefault();
    return false;
}

function onKeyUp(event) {
}
</script>
</head>
<body onload="init();">
<textarea id="input0" cols="80" rows="10"></textarea>
</body>
</html>

2. Background: What’s an Input Method Editor?

This section is non-normative.

An IME (input-method editor) is an application that allows a standard keyboard (such as a US-101 keyboard) to be used to type characters and symbols that are not directly represented on the keyboard itself. In China, Japan, and Korea, IMEs are used ubiquitously to enable standard keyboards to be employed to type the very large number of characters required for writing in Chinese, Japanese, and Korean.

An IME consists of two modules: a composer and a converter.

A composer is a context-free parser that composes non-ASCII characters (including phonetic characters) from keystrokes.

A converter is a context-sensitive parser that looks up a dictionary to convert phonetic characters to a set of ideographic characters.

An IME composition is an instance of text produced in an IME.

When an IME receives keystrokes, it sends the keystrokes to a composer and receives phonetic characters matching the keystrokes. When an IME receives phonetic characters from a composer, it sends the phonetic characters to a converter and receives the list of ideographic characters matching the phonetic characters. The following figure shows the basic structure of an IME.

2.1 Composer

A composer consists of two types of composers: a phonetic composer and a radical composer.

A phonetic composer composes a phonetic character from its ASCII representation.

A radical composer composes a phonetic character from phonetic radicals.

An IME usually shows the text being composed by a composer with its own style to distinguish it from the existing text. Even though most composers output phonetic characters, some composers (such as Bopomofo composers) output a placeholder character instead of phonetic characters while composing text.

need to define composition window

probably should define radical

probably should define clause here too

2.1.1 Phonetic composer

Phonetic composers are used not only for typing Simplified Chinese and Japanese, but also for typing non-ASCII characters (such as mathematical symbols, Yi, Amharic, etc.) with a US-101 keyboard. Each of these languages has a mapping table from its characters to sequences of ASCII characters representing their pronunciation: e.g., ‘か’ to ‘ka’ in Japanese, and ‘卡’ to ‘ka’ in Simplified Chinese. This mapping table is called Romaji for Japanese and Pinyin for Simplified Chinese. A phonetic composer uses these mapping tables to compose a phonetic character from a sequence of ASCII characters produced by a US keyboard.

A phonetic composer for Simplified Chinese outputs the input ASCII characters as its composition text.

On the other hand, a phonetic composer for Japanese outputs phonetic characters when the input ASCII characters have matching phonetic characters.

A phonetic composer for mathematical symbols outputs a composed mathematical symbol and shows the source keystrokes in its own window, which is an example of a composition window.
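The behavior described above for a Japanese phonetic composer can be sketched as follows. This is only an illustration under stated assumptions: the table ROMAJI_TABLE and the function composeRomaji() are hypothetical names that are not part of this API, and the table is heavily truncated.

```javascript
// Hypothetical, heavily truncated Romaji table (not part of this API).
const ROMAJI_TABLE = {
  a: 'あ', i: 'い', u: 'う', e: 'え', o: 'お',
  ka: 'か', ki: 'き', ku: 'く', ke: 'け', ko: 'こ',
};

// Composes phonetic characters from ASCII keystrokes. Keystrokes that do
// not (yet) match an entry in the table stay in the output as ASCII, so
// the result can be used directly as composition text.
function composeRomaji(keystrokes) {
  let output = '';
  let pending = '';
  for (const ch of keystrokes.toLowerCase()) {
    pending += ch;
    if (Object.prototype.hasOwnProperty.call(ROMAJI_TABLE, pending)) {
      output += ROMAJI_TABLE[pending];
      pending = '';
    }
  }
  return output + pending;
}

console.log(composeRomaji('kaki'));
console.log(composeRomaji('ak'));
```

This sketch never discards keystrokes that cannot match any entry; a real composer would need a rule for such input.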

2.1.2 Radical composer

Radical composers are mainly used for typing Traditional Chinese and Korean with phonetic keyboards. Each phonetic keyboard of these languages can produce phonetic radicals: e.g., typing ‘r’ produces ‘ㄱ’ on a Korean keyboard; typing ‘o’ produces ‘人’ on a Traditional-Chinese (or Bopomofo) keyboard, etc. A radical composer composes a phonetic character from phonetic radicals given by these keyboards: e.g., typing ‘ㄱ’ (r) and ‘ㅏ’ (k) produces ‘가’ on a Korean keyboard; typing ‘人’ (o), ‘弓’ (n), and ‘火’ (f) produces ‘你’ on a Traditional-Chinese keyboard, etc.

A radical composer for Korean outputs the phonetic radicals as its composition text.

A radical composer for Traditional Chinese outputs a placeholder character (U+3000) and shows the phonetic radicals being composed in its own window. This window is an example of a composition window.

Some platforms (such as Mac and Linux) use radical composers for typing accented characters used in European countries. For example, typing ‘ ̈ ’ (option+u) and ‘a’ (a) produces ‘ä’ on a Mac with a US keyboard.
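As an illustration of how a radical composer can build a phonetic character, the following sketch combines a Korean leading consonant and a vowel into a precomposed Hangul syllable using the arithmetic defined by the Unicode Standard (syllable = U+AC00 + (lead × 21 + vowel) × 28 + tail). The names LEADS, VOWELS, and composeHangul() are hypothetical, and trailing consonants are left out for brevity.

```javascript
// Leading consonants and vowels in Unicode Hangul-syllable order.
const LEADS = 'ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ';
const VOWELS = 'ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ';

// Composes one syllable from a leading consonant and a vowel.
// When either radical is unknown, the radicals are returned unchanged,
// which matches a composer that shows radicals as composition text.
function composeHangul(lead, vowel) {
  const l = lead.length === 1 ? LEADS.indexOf(lead) : -1;
  const v = vowel.length === 1 ? VOWELS.indexOf(vowel) : -1;
  if (l < 0 || v < 0) {
    return lead + vowel;
  }
  // 21 vowels per lead and 28 tail slots per (lead, vowel) pair.
  return String.fromCharCode(0xAC00 + (l * 21 + v) * 28);
}

console.log(composeHangul('ㄱ', 'ㅏ'));
```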

2.2 Converter

A converter is a context-sensitive parser used for replacing the output of a composer with ideographic characters in Chinese, Japanese, and Korean.

Korean uses ideographic characters less often.

Because Chinese, Japanese, and Korean have many homonyms, each sequence of phonetic characters usually matches many ideographic characters: e.g., the Japanese phonetic character ‘か’ matches the Japanese ideographic characters ‘化’, ‘科’, ‘課’, etc.; the Pinyin characters ‘ka’ match the Simplified-Chinese ideographic characters ‘卡’, ‘喀’, ‘咯’, etc.; the Bopomofo characters ‘人弓’ match the Traditional-Chinese ideographic characters ‘乞’, ‘亿’, ‘亇’, etc.

A converter looks up a dictionary and shows a list of candidates of possible ideographic characters so a user can choose one. This list is known as a candidate list. A candidate list is known as a candidate window when it has its own window.
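The dictionary lookup can be sketched as follows. The dictionary contents are taken from the examples in this section; the names DICTIONARY and lookupCandidates() are hypothetical, and a real converter dictionary is far larger and context-sensitive.

```javascript
// Hypothetical dictionary built from the examples in this section.
const DICTIONARY = new Map([
  ['か', ['化', '科', '課']],
  ['ka', ['卡', '喀', '咯']],
]);

// Returns the candidate list for a sequence of phonetic characters.
// When nothing matches, the phonetic text itself is the only candidate.
function lookupCandidates(phonetic) {
  const candidates = DICTIONARY.get(phonetic);
  return candidates ? candidates.slice() : [phonetic];
}

console.log(lookupCandidates('か'));
```

Returning a copy keeps callers from modifying the dictionary when they reorder or filter the list.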

Some Japanese IMEs show annotations in their candidate windows for characters that are not easy to distinguish from other characters (such as full-width alphabets, full-width Katakana, and half-width Katakana), as shown in the following figure.

The next figure shows a candidate window of a Simplified-Chinese IME.

And the next figure shows a candidate window of a Traditional-Chinese IME.

A converter often integrates an MRU (Most-Recently Used) list. Even though there are many ideographic characters for each phonetic character (or phonetic radical), a user does not usually use all these ideographic characters. A converter uses an MRU list to filter rarely used ideographic characters out of a candidate list.

A converter sometimes integrates a grammar parser. A converter that integrates a grammar parser splits the given phonetic characters into grammatical clauses and converts only one clause at a time. When a sequence of phonetic characters consists of n clauses and the i-th clause has m_i candidates, the total number of candidates for the input characters becomes (m_1 * m_2 * … * m_n). To reduce the number of candidates it has to present, a converter usually processes one clause at a time. This clause is called the selected clause.
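The effect of converting one clause at a time can be illustrated with a small calculation. The function names below are hypothetical and only restate the arithmetic of this section.

```javascript
// Number of candidates when all clauses are converted together:
// the product m_1 * m_2 * ... * m_n.
function totalCandidates(candidatesPerClause) {
  return candidatesPerClause.reduce((product, m) => product * m, 1);
}

// Number of candidates shown when only the selected clause is converted.
function candidatesForSelectedClause(candidatesPerClause, selectedIndex) {
  return candidatesPerClause[selectedIndex];
}

// Three clauses with 3, 4, and 5 candidates respectively.
const counts = [3, 4, 5];
console.log(totalCandidates(counts));                // 60
console.log(candidatesForSelectedClause(counts, 1)); // 4
```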

An IME usually renders a selected clause with a special style to distinguish it from other clauses, as shown in the following figure.

When a converter converts two or more clauses, it chooses candidates for the selected clause so it becomes grammatically consistent with the surrounding clauses: e.g., Japanese converters usually output ‘危機一髪’ (not ‘危機一発’) for Japanese phonetic characters ‘ききいっぱつ’ because ‘危機一発’ is grammatically incorrect.

3. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].

More to be written.

4. Terminology and algorithms

To be written.

5. The getInputContext() method

For each element, a user agent can choose an IME for the element. To control the IME attached to an element, it is a good idea to add a method to the HTMLElement interface.

If the getInputContext() method cannot be added to the HTMLElement interface, it should be moved to the InputMethodContext interface.

interface HTMLElement {
    …
    Object getInputContext ();
};

5.1 Methods

getInputContext: Returns an InputMethodContext interface associated with this element. By default, a user agent returns an InputMethodContext interface representing the system IME. To change the behavior of the IME associated with an element, authors must first obtain an InputMethodContext interface by calling the getInputContext() method of the HTMLElement interface.
No parameters.
No exceptions.
Return type: Object

6. The Composition Interface

This interface represents an ongoing IME composition. It provides an attribute representing the text being composed by an IME. It also provides attributes representing the caret and the clauses of the composition text.

interface Composition {
    readonly attribute DOMString             text;
    readonly attribute CompositionCaret      caret;
    readonly attribute CompositionClauseList clauses;
};

6.1 Attributes

caret of type CompositionCaret, readonly: Represents the caret in this composition text.
No exceptions.
clauses of type CompositionClauseList, readonly: Returns the clauses in the composition text.
No exceptions.
text of type DOMString, readonly: Represents the text being composed by an IME. This string is equal to the text attribute of a compositionupdate event.
No exceptions.

7. The CompositionCaret Interface

This interface represents the caret in the composition text. When the value of its "length" attribute is 0, an IME uses a vertical bar as its cursor. Otherwise, an IME uses a block cursor. When an IME does not show a caret, both values must be -1.

interface CompositionCaret {
    readonly attribute int start;
    readonly attribute int length;
};

7.1 Attributes

length of type int, readonly: Represents the length of the caret, in characters. The default value is -1.
No exceptions.
start of type int, readonly: Represents the beginning of the caret, in characters. This value is less than the length of the composition text, or -1 if an IME does not show a caret. The default value is -1.
No exceptions.
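A Web application that renders the caret itself might interpret these two attributes as sketched below. The function caretShape() and its return values are hypothetical; the argument is a plain object with the same start and length attributes as a CompositionCaret.

```javascript
// Maps CompositionCaret-like values to the kind of cursor to draw.
function caretShape(caret) {
  if (caret.start === -1 && caret.length === -1) {
    return 'hidden'; // The IME does not show a caret.
  }
  // A zero-length caret is a vertical bar; anything longer is a block.
  return caret.length === 0 ? 'bar' : 'block';
}

console.log(caretShape({ start: 2, length: 0 }));
console.log(caretShape({ start: 2, length: 1 }));
console.log(caretShape({ start: -1, length: -1 }));
```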

8. The CompositionClause Interface

This interface represents a clause of the composition text. This interface also represents attributes of the clause retrieved from a user agent so Web applications can render the clauses as a user agent does.

Retrieving attributes is not supported by all operating systems or all IMEs.

When a user agent cannot retrieve attributes from an operating system, it sets default values for them.

interface CompositionClause {
    readonly attribute DOMString text;
    readonly attribute int       start;
    readonly attribute boolean   selected;
    readonly attribute DOMString textColor;
    readonly attribute DOMString backgroundColor;
    readonly attribute DOMString lineStyle;
    readonly attribute DOMString lineColor;
};

8.1 Attributes

backgroundColor of type DOMString, readonly: Represents the background color used by a user agent to render this clause. This string must be parsed as a CSS color value. The default value is "transparent".
No exceptions.
lineColor of type DOMString, readonly: Represents the color of an underline used by a user agent. If lineStyle is "none", this value is undefined. The default value is currentColor.
No exceptions.
lineStyle of type DOMString, readonly: Represents the style of an underline used by a user agent to render under this clause. This value must be one defined in the ‘text-underline-style’ of CSS 3; i.e., "none", "solid", "double", "dotted", "dashed", and "wave". The default value is "solid".
No exceptions.
selected of type boolean, readonly: Represents whether this clause is a selected clause.
No exceptions.
start of type int, readonly: Represents the offset of this clause from the beginning of the composition text, in characters.
No exceptions.
text of type DOMString, readonly: Represents the text of this clause.
No exceptions.
textColor of type DOMString, readonly: Represents the text color used by a user agent to render this clause. This string must be parsed as a CSS color value. The default value is currentColor.
No exceptions.
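The default values listed above can be summarized in a short sketch. The helper withClauseDefaults() is hypothetical: it takes a plain object with some CompositionClause attributes and fills in the rest, and falling back to "solid" for an unrecognized lineStyle is an assumption of this sketch, not a requirement of this specification.

```javascript
// Underline styles permitted for lineStyle.
const LINE_STYLES = ['none', 'solid', 'double', 'dotted', 'dashed', 'wave'];

// Default values of the rendering attributes of a clause.
const CLAUSE_DEFAULTS = {
  textColor: 'currentColor',
  backgroundColor: 'transparent',
  lineStyle: 'solid',
  lineColor: 'currentColor',
};

// Returns a copy of the clause with missing attributes set to defaults.
function withClauseDefaults(clause) {
  const filled = Object.assign({}, CLAUSE_DEFAULTS, clause);
  if (!LINE_STYLES.includes(filled.lineStyle)) {
    // Assumption: treat an unrecognized style like a missing one.
    filled.lineStyle = CLAUSE_DEFAULTS.lineStyle;
  }
  return filled;
}

console.log(withClauseDefaults({ text: 'か', start: 0, selected: true }));
```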

9. The CompositionClauseList sequence

typedef sequence<CompositionClause> CompositionClauseList;

Throughout this specification, the identifier CompositionClauseList is used to refer to the sequence<CompositionClause> type.

10. The InputMethodContext Interface

interface InputMethodContext {
    readonly attribute DOMString   source;
    readonly attribute Composition composition;
    boolean setEnabled (in boolean enabled);
    boolean isEnabled ();
    boolean setInputMode (in DOMString script, in DOMString modifier);
    boolean hasComposition ();
    void    setComposition (in Composition composition);
    void    confirmComposition ();
    void    setCaretRectangle (in int x, in int y, in int w, in int h);
    boolean setOpenState (in boolean open);
};

10.1 Attributes

composition of type Composition, readonly: Represents the detailed information of the ongoing IME composition. When an IME is not composing text, this value must be null.
No exceptions.
source of type DOMString, readonly: Represents the name of the IME associated with this context.
No exceptions.

10.2 Methods

confirmComposition

Finishes the ongoing composition of the hosting user agent.

When a Web application calls this function, a user agent sends a compositionend event and a textInput event, as if the user had pressed an ‘Accept’ key, as described in the “Input Method Editors” section of the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS].

This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.

No parameters.

No exceptions.

Return type: void

hasComposition

Returns true when the hosting user agent is composing text.

This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.

No parameters.

No exceptions.

Return type: boolean

isEnabled

Returns the state of the IME associated with this context.

No parameters.

No exceptions.

Return type: boolean

setCaretRectangle

Notifies a user agent of the rectangle occupied by the composition text. When a user agent renders a candidate window or a composition window, it uses this rectangle to avoid rendering these windows over it.

On Windows, this rectangle is used as a parameter for ImmSetCandidateWindow(). On Mac, this rectangle is sent when it calls [firstRectForCharacterRange:]. On Linux (GTK), this rectangle is used as a parameter for gtk_im_context_set_cursor_location().

The x, y, w, and h parameters represent the local coordinates of a composition-text rectangle. A user agent may need to convert these coordinates to the screen coordinates when it shows a candidate window.

No exceptions.

Return type: void
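The coordinate conversion mentioned above might look like the following sketch. It assumes the element is neither scaled, rotated, nor scrolled, so the conversion is a plain translation; the function names and the elementOrigin parameter (the element's top-left corner in screen coordinates) are hypothetical.

```javascript
// Converts a composition-text rectangle from local to screen coordinates.
function toScreenRectangle(local, elementOrigin) {
  return {
    x: elementOrigin.x + local.x,
    y: elementOrigin.y + local.y,
    w: local.w,
    h: local.h,
  };
}

// Picks a position just below the rectangle so that a candidate window
// placed there does not cover the composition text.
function placeCandidateWindowBelow(screenRect, gap = 2) {
  return { x: screenRect.x, y: screenRect.y + screenRect.h + gap };
}

const screenRect = toScreenRectangle(
  { x: 10, y: 20, w: 50, h: 16 }, { x: 100, y: 200 });
console.log(screenRect, placeCandidateWindowBelow(screenRect));
```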

setComposition

Updates the composition information of the hosting user agent.

When a JavaScript-based IME starts a composition, it must call this function with the appropriate composition information. When a JavaScript-based IME cancels an ongoing composition, it must call this function with a composition object whose text is empty. A user agent sends a compositionstart event when this function is called while hasComposition() returns false. On the other hand, a user agent sends a compositionupdate event when a Web application calls this function while hasComposition() returns true.

This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.

The composition parameter represents the information of the new composition.

No exceptions.

Return type: void
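The event sequence described for setComposition() and confirmComposition() can be modeled with a small mock. The class MockInputMethodContext is hypothetical and records event names instead of dispatching DOM events. Sending compositionend when a composition is cancelled with empty text is an assumption of this sketch; this specification does not say which event is sent in that case.

```javascript
// A mock that records the events a user agent would send.
class MockInputMethodContext {
  constructor() {
    this.composition = null;
    this.events = [];
  }

  hasComposition() {
    return this.composition !== null;
  }

  setComposition(composition) {
    if (composition.text === '') {
      // Empty text cancels the ongoing composition (event is assumed).
      if (this.hasComposition()) {
        this.composition = null;
        this.events.push('compositionend');
      }
      return;
    }
    this.events.push(
      this.hasComposition() ? 'compositionupdate' : 'compositionstart');
    this.composition = composition;
  }

  confirmComposition() {
    if (!this.hasComposition()) {
      return;
    }
    this.composition = null;
    this.events.push('compositionend', 'textInput');
  }
}

const mockContext = new MockInputMethodContext();
mockContext.setComposition({ text: 'k' });
mockContext.setComposition({ text: 'か' });
mockContext.confirmComposition();
console.log(mockContext.events.join(', '));
```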

setEnabled

Controls the state of the IME associated with this context.

The enabled parameter represents whether a user agent activates this IME when the given node gains the input focus. When this value is true, a user agent activates an IME when this node gains the input focus and sends composition events to the given node even though the node is not an editable one, such as a <canvas> element.

No exceptions.

Return type: boolean

setInputMode

Provides a hint to the user agent so it can select the appropriate input mode of its associated IME. This function returns true when a user agent can change the input mode of its associated IME. Otherwise it returns false.

The parameters for this function are copied from Annex E of the “XForms 1.0” specification [XFORMS10], for consistency with that specification.

The script parameter represents a Unicode script name.
The modifier parameter represents a string added to the script name in order to more closely specify the kind of characters: e.g., "digits", "halfWidth", "kotei", etc.

No exceptions.

Return type: boolean

setOpenState

Controls the state of the IME currently associated with the hosting user agent. This function returns true if a user agent can activate or deactivate its associated IME.

The open parameter represents whether a user agent enables the IME or disables it.

Do we need to notify JavaScript IMEs of this event? If so, what is the best option?

No exceptions.

Return type: boolean

11. Best practices

This section is non-normative.

This specification provides two types of interfaces:

Interfaces for developing IMEs in JavaScript (JavaScript IMEs)
Interfaces for developing Web applications that are aware of IMEs (IME-aware Web applications).

Moreover, this API depends on several existing specifications to minimize the change for existing JavaScript IMEs.

These dependencies make it harder for developers to use this API in their JavaScript IMEs or IME-aware Web applications. This section describes practices for some use cases.

11.1 JavaScript IMEs

more info to come later…

11.1.1 Composing text

Existing JavaScript IMEs use DOM events (e.g., "keydown", "keyup", "focus", "blur", etc.) to compose text. To avoid forcing developers to change their JavaScript IMEs too much, this API does not provide any callbacks; i.e., this API allows them to use their existing handlers for DOM events. On the other hand, when a JavaScript IME updates its composition text, it needs to call setComposition() instead of inserting text by itself. When a JavaScript IME calls setComposition(), a user agent sends a composition event and renders the composition text as it does for system IMEs. The following figure illustrates a sequence that composes text with a JavaScript IME which emulates the first example in the “Input Method Editors” section of the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS].

11.1.2 Consuming events

When a JavaScript IME calls setComposition(), it must also call preventDefault() on the keyboard event being handled, to prevent user agents from inserting the typed character into the element.

11.1.3 Enabling or Disabling JavaScript IMEs

A JavaScript IME should not consume keyboard events when the hosting Web application disables it. The JavaScript IME should call getInputContext().isEnabled() when it receives a keyboard event, and should not consume the event when the call returns false.

A JavaScript IME disabled by Web applications
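A keyboard handler that follows this practice might be structured as in the sketch below. The function handleKeyDown() and its composeKey parameter are hypothetical. To keep the sketch runnable outside a user agent, the input context and the event are passed in as arguments and replaced by stub objects in the usage lines.

```javascript
// Returns true when the IME consumed the event, false otherwise.
function handleKeyDown(event, inputContext, composeKey) {
  // Leave the event alone when the Web application disabled this IME.
  if (!inputContext.isEnabled()) {
    return false;
  }
  inputContext.setComposition({ text: composeKey(event.key) });
  // Keep the user agent from inserting the typed character itself.
  event.preventDefault();
  return true;
}

// Stub objects standing in for an InputMethodContext and a keyboard event.
const stubContext = {
  enabled: false,
  last: null,
  isEnabled() { return this.enabled; },
  setComposition(composition) { this.last = composition; },
};
const stubEvent = {
  key: 'a',
  defaultPrevented: false,
  preventDefault() { this.defaultPrevented = true; },
};

console.log(handleKeyDown(stubEvent, stubContext, (key) => key.toUpperCase()));
```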

11.1.4 Candidate window

Existing JavaScript IMEs usually use so-called CSS layers to render their candidate windows. Some JavaScript IMEs use absolute coordinates to render their candidate windows (i.e., <div style="position: absolute">…</div>), while others use relative coordinates (i.e., <div style="position: relative">…</div>). To satisfy both needs, this API provides two methods that retrieve the caret rectangle: window.inputmethodmanager.getCaretRectangle() for JavaScript IMEs that need the absolute position of the caret rectangle of the ongoing composition text, and getInputContext().getCaretRectangle() for JavaScript IMEs that need its relative position.

11.2 IME-aware Web applications

When developers develop an IME-aware Web application, they need to decide which IME to use in their Web application: JavaScript IMEs, system IMEs, or none. The following sections describe practices for these three cases.

11.2.1 Using JavaScript IMEs

A Web application that uses only JavaScript IMEs must disable the system IMEs associated with the hosting user agent to prevent keyboard events from being consumed by system IMEs. To disable the system IMEs associated with a user agent, the Web application must call getInputContext().setEnabled(true) and window.inputmethodmanager.setOpenState(false) when initializing itself and when it gains the focus.

Disable the system IME and enable a JavaScript IME

JavaScript IMEs may not dispatch keyboard events consumed by them to Web applications. Therefore, developers should not depend on such keyboard events to develop their Web applications when using JavaScript IMEs.

11.2.2 Using system IMEs

On the other hand, a Web application that uses the system IME must enable the system IME and disable the JavaScript IMEs when it becomes active. JavaScript IMEs may consume keyboard events even though the Web application calls getInputContext().setEnabled(false). To prevent such JavaScript IMEs from consuming keyboard events, the Web application should add event handlers to keyboard events.

Enable the system IME and disable JavaScript IMEs

11.2.3 Does not use IMEs

When a Web application does not use any IMEs, it…

Input Method Editor API

W3C Working Draft 24 May 2012
