Copyright © 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification defines an “IME API” that provides Web applications with scripted access to an IME (input-method editor) associated with a hosting user agent. This IME API includes:
An InputMethodContext interface, which provides methods to retrieve detailed data from an in-progress IME composition and to update that data.
A Composition interface, which represents read-only attributes about the current composition, such as the actual text being input, its length, and its target clause.
need to define what a clause is
This API is designed to be used in conjunction with DOM events and elements on the Web platform, notably: composition events and the Canvas 2D Context API [CANVAS-2D].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is a proposal that is being made available for public review in order to solicit feedback, particularly from implementors, with a goal of potential cross-browser implementation and standardization.
This document was published by the Web Applications Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-webapps@w3.org (subscribe, archives). All feedback is welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This section is non-normative.
Even though existing Web-platform APIs allow developers to implement very complicated Web applications, such as visual chat applications, using technologies such as SVG or the <canvas> element and API, developers have difficulties when implementing Web applications that control input-method editors. To let hosting user agents expose to Web applications the composition text being composed in an associated IME, the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS] introduces composition events. Using composition events, Web applications can retrieve composition text from an IME.
However, Web applications can still run into difficulties when
they manipulate IMEs on non-editable elements such as the
<canvas>
element; those difficulties include the fact
that a Web application cannot do the following:
The Web platform has a number of existing APIs useful for implementing a custom IME in JavaScript. For example, the Web Storage API can store an IME dictionary, and the WebSocket and XMLHttpRequest APIs allow sending a server request that performs a lookup in an IME dictionary, and so on. In fact, Web-application developers are already developing and deploying JavaScript-based IMEs that use these APIs. However, it is currently difficult to make those JavaScript-based IMEs work on all user agents, because they often rely on APIs specific to the hosting user agent, such as browser extension APIs.
To solve these IME-related problems, this specification introduces
an IME API that allows Web applications to read and write composition
data made available by user agents. Moreover, this specification introduces
interfaces for
compositions,
so Web applications can read detailed composition
data and update it. A Composition
object provides a reference
to an ongoing IME
composition,
so Web applications can retrieve the composition text and attributes.
Using these APIs, Web applications can set the position of a composition window and retrieve the text and attributes of the ongoing composition.
Consider the following examples. The first example shows the source for a Web application that renders composition text by itself and uses the candidate window provided by an IME.
<!DOCTYPE html>
<html>
<head>
<script language="javascript" type="text/javascript">
function init() {
  window.inputmethodmanager.setOpenState(true);
  var node = document.getElementById('canvas0');
  node.getInputContext().setEnabled(true);
  node.addEventListener('compositionstart', onCompositionStart, false);
  node.addEventListener('compositionupdate', onCompositionUpdate, false);
  node.addEventListener('compositionend', onCompositionEnd, false);
}

function onCompositionStart(event) {
}

function onCompositionUpdate(event) {
  var x = 0;
  var y = 0;
  var canvas = document.getElementById('canvas0');
  var context = canvas.getContext('2d');
  var inputContext = canvas.getInputContext();
  var composition = inputContext.composition;

  // Render a caret.
  // NOTE: this just renders a caret rectangle in black for simplicity.
  if (composition.caret.start >= 0) {
    // CompositionCaret exposes start and length, so derive the end offset.
    var caretEnd = composition.caret.start + composition.caret.length;
    var start = context.measureText(
        composition.text.substring(0, composition.caret.start));
    var end = context.measureText(
        composition.text.substring(0, caretEnd));
    context.fillStyle = 'black';
    // fillRect() takes (x, y, width, height).
    context.fillRect(start.width, y, end.width - start.width, 10);
  }

  // Render the clauses in the composition.
  for (var i = 0; i < composition.clauses.length; ++i) {
    var text = composition.clauses[i].text;
    var metrics = context.measureText(text);

    // Draw the text of this clause.
    context.fillStyle = composition.clauses[i].textColor;
    context.fillText(text, x, y);

    // Draw an underline under the text. For simplicity, this code
    // draws a bold underline for selected clauses or a thin
    // underline for non-selected ones.
    if (composition.clauses[i].selected) {
      context.fillRect(x, y, metrics.width, 2);
    } else {
      context.fillRect(x, y, metrics.width, 1);
    }
    x += metrics.width;
  }

  // Move the candidate window outside of the composition text.
  window.inputmethodmanager.moveCandidateWindow(0, y, x, y + 10);
}

function onCompositionEnd(event) {
}
</script>
</head>
<body onload="init();">
<canvas id="canvas0" width="640" height="480"></canvas>
</body>
</html>
The next example shows the source for a simple IME that composes Japanese Hiragana characters from key strokes.
This is just a sample and not suitable for real use.
<!DOCTYPE html>
<html>
<head>
<title></title>
<script language="javascript" type="text/javascript">
var imeActivated = false;
var imeRomajiInput = '';
var imeRomajiTable = {
  'A': '\u3042',
  'I': '\u3044',
  'U': '\u3046',
  'E': '\u3048',
  'O': '\u304A',
  /* suppressed */
};

function init() {
  // Disable the system IME associated with this window.
  window.inputmethodmanager.setOpenState(false);

  // Listen for keyboard events.
  var node = document.getElementById('input0');
  node.addEventListener('keydown', onKeyDown, false);
  node.addEventListener('keyup', onKeyUp, false);
}

function onKeyDown(event) {
  // Toggle the input mode when pressing a shift key.
  if (event.key == 'Shift') {
    imeActivated = !imeActivated;
    imeRomajiInput = '';
  }

  // Exit if this IME is not activated.
  if (!imeActivated)
    return true;

  var imeComposition = new Composition();
  var imeConfirm = false;
  if (event.keyCode < 0x20) {
    event.preventDefault();
    return true;
  }

  // Convert the input key strokes to a Japanese character.
  imeRomajiInput += String.fromCharCode(event.keyCode);
  if (imeRomajiTable[imeRomajiInput]) {
    imeComposition.text = imeRomajiTable[imeRomajiInput];
    imeConfirm = true;
    imeRomajiInput = '';
  } else {
    imeComposition.text = imeRomajiInput;
  }

  // Fill the Composition object.
  imeComposition.caret.start = imeComposition.text.length;
  imeComposition.caret.length = 1;
  imeComposition.clauses[0] = new CompositionClause();
  imeComposition.clauses[0].text = imeComposition.text;
  imeComposition.clauses[0].start = 0;
  imeComposition.clauses[0].selected = true;
  imeComposition.clauses[0].textColor = 'currentColor';
  imeComposition.clauses[0].backgroundColor = 'transparent';
  imeComposition.clauses[0].lineStyle = 'solid';
  imeComposition.clauses[0].lineColor = 'black';

  // Send the Composition object to the user agent.
  var context = event.target.getInputContext();
  context.setComposition(imeComposition);
  if (imeConfirm)
    context.confirmComposition();

  // Disable the default action to prevent this key from being
  // inserted.
  event.preventDefault();
  return false;
}

function onKeyUp(event) {
}
</script>
</head>
<body onload="init();">
<textarea id="input0" cols="80" rows="10"></textarea>
</body>
</html>
This section is non-normative.
An IME (input-method editor) is an application that allows a standard keyboard (such as a US-101 keyboard) to be used to type characters and symbols that are not directly represented on the keyboard itself. In China, Japan, and Korea, IMEs are used ubiquitously to enable standard keyboards to be employed to type the very large number of characters required for writing in Chinese, Japanese, and Korean.
An IME consists of two modules: a composer and a converter.
A composer is a context-free parser that composes non-ASCII characters (including phonetic characters) from keystrokes.
A converter is a context-sensitive parser that looks up a dictionary to convert phonetic characters to a set of ideographic characters.
An IME composition is an instance of text produced in an IME.
When an IME receives keystrokes, it sends the keystrokes to a composer and receives the phonetic characters matching the keystrokes. When an IME receives phonetic characters from a composer, it sends the phonetic characters to a converter and receives the list of ideographic characters matching the phonetic characters. The following figure shows the basic structure of an IME.
A composer consists of two types of composers: a phonetic composer and a radical composer.
A phonetic composer composes a phonetic character from its ASCII representation.
A radical composer composes a phonetic character from phonetic radicals.
An IME usually shows the text being composed by a composer in its own style to distinguish it from the existing text. Even though most composers output phonetic characters, some composers (such as Bopomofo composers) output a placeholder character instead of phonetic characters while composing text.
need to define composition window
probably should define radical
probably should define clause here too
Phonetic composers are used not only for typing Simplified Chinese and Japanese, but also for typing other non-ASCII characters (such as mathematical symbols, Yi, Amharic, etc.) with a US-101 keyboard. Each of these languages has a mapping table from each of its characters to a sequence of ASCII characters representing its pronunciation: e.g., ‘か’ to ‘ka’ in Japanese, and ‘卡’ to ‘ka’ in Simplified Chinese. This mapping table is called Romaji for Japanese and Pinyin for Simplified Chinese. A phonetic composer uses these mapping tables to compose a phonetic character from a sequence of ASCII characters produced by a US keyboard.
A phonetic composer for Simplified Chinese outputs the input ASCII characters as its composition text.
On the other hand, a phonetic composer for Japanese outputs phonetic characters when the input ASCII characters have matching phonetic characters.
A phonetic composer for mathematical symbols outputs a composed mathematical symbol and shows the source keystrokes in its own window, which is an example of a composition window.
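The behaviour of a Japanese phonetic composer described above can be sketched as follows. This is only an illustrative sketch: the table is a small, hypothetical excerpt of a Romaji table, and the function name composeKeystroke and its return shape are assumptions made for this example, not part of this API.

```javascript
// Hypothetical, heavily truncated Romaji table (lower-case ASCII to Hiragana).
const ROMAJI_TABLE = {
  a: '\u3042',  // あ
  i: '\u3044',  // い
  u: '\u3046',  // う
  e: '\u3048',  // え
  o: '\u304A',  // お
  ka: '\u304B', // か
  ki: '\u304D', // き
};

// Feeds one ASCII keystroke to the composer.
// `pending` holds the ASCII characters that have not matched yet.
// Returns the new pending string and the composition text to display.
function composeKeystroke(pending, key) {
  const input = pending + key.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(ROMAJI_TABLE, input)) {
    // A phonetic character matched: output it and clear the pending input.
    return { pending: '', text: ROMAJI_TABLE[input] };
  }
  // No match yet: show the raw ASCII input as the composition text.
  return { pending: input, text: input };
}
```

Typing "k" leaves the ASCII character as the composition text; typing "a" next replaces it with the matching phonetic character.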
Radical composers are mainly used for typing Traditional Chinese and Korean with phonetic keyboards. Each phonetic keyboard of these languages can produce phonetic radicals: e.g., typing ‘r’ produces ‘ㄱ’ on a Korean keyboard; typing ‘o’ produces ‘人’ on a Traditional-Chinese (or Bopomofo) keyboard, etc. A radical composer composes a phonetic character from phonetic radicals given by these keyboards: e.g., typing 'ㄱ' (r) and 'ㅏ' (k) produces '가' on a Korean keyboard; typing ‘人’ (o), ‘弓’ (n), and ‘火’ (f) produces ‘你’ on a Traditional-Chinese keyboard, etc.
A radical composer for Korean outputs the phonetic radicals as its composition text.
A radical composer for Traditional Chinese outputs a placeholder character (U+3000) and shows the phonetic radicals being composed in its own window. This window is an example of a composition window.
Some platforms (such as Mac and Linux) use radical composers for typing accented characters used in European countries. For example, typing ‘ ̈ ’ (option+u) and ‘a’ (a) produces ‘ä’ on a US keyboard on Mac.
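As an illustration of radical composition for Korean, the following sketch combines an initial consonant and a vowel into a precomposed Hangul syllable using the arithmetic layout of the Unicode Hangul Syllables block (starting at U+AC00, with 21 vowels and 28 final positions per initial consonant). The function name composeHangul is hypothetical, and final consonants are deliberately left out to keep the sketch short.

```javascript
// Initial consonants and vowels in Unicode Hangul-syllable order,
// written as compatibility jamo (the form produced by a Korean keyboard).
const INITIALS = 'ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ';
const VOWELS = 'ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ';

// Composes a syllable from an initial consonant and a vowel.
// Returns null when either radical is not recognized.
function composeHangul(initial, vowel) {
  const l = INITIALS.indexOf(initial);
  const v = VOWELS.indexOf(vowel);
  if (initial.length !== 1 || vowel.length !== 1 || l < 0 || v < 0) {
    return null;
  }
  // 0xAC00 is the first syllable; each initial spans 21 * 28 code points
  // and each vowel spans 28 code points (one per possible final consonant).
  return String.fromCharCode(0xAC00 + (l * 21 + v) * 28);
}
```

With this sketch, composing 'ㄱ' and 'ㅏ' yields '가', matching the Korean example given earlier in this section.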
A converter is a context-sensitive parser used for replacing the output of a composer with ideographic characters in Chinese, Japanese, and Korean.
Korean uses ideographic characters less often than Chinese and Japanese do.
Because Chinese, Japanese, and Korean have many homonyms, each sequence of phonetic characters usually matches many ideographic characters: e.g., the Japanese phonetic character ‘か’ matches the Japanese ideographic characters ‘化’, ‘科’, ‘課’, etc.; the Pinyin characters ‘ka’ match the Simplified-Chinese ideographic characters ‘卡’, ‘喀’, ‘咯’, etc.; the Bopomofo characters ‘人弓’ match the Traditional-Chinese ideographic characters ‘乞’, ‘亿’, ‘亇’, etc.
A converter looks up a dictionary and shows a list of candidates of possible ideographic characters so a user can choose one. This list is known as a candidate list. A candidate list is known as a candidate window when it has its own window.
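A minimal sketch of the dictionary lookup performed by a converter is shown below. The dictionary contains only the Japanese and Pinyin entries mentioned in this section, and both the function name lookupCandidates and the fallback of returning the input unchanged are assumptions made for illustration.

```javascript
// Hypothetical dictionary: phonetic characters to ideographic candidates.
const DICTIONARY = new Map([
  ['か', ['化', '科', '課']],
  ['ka', ['卡', '喀', '咯']],
]);

// Returns the candidate list for the given phonetic characters.
// When the dictionary has no entry, the input itself is the only candidate
// (an assumption of this sketch, not a requirement of any IME).
function lookupCandidates(phonetic) {
  const candidates = DICTIONARY.get(phonetic);
  return candidates ? candidates.slice() : [phonetic];
}
```

A user interface would then present the returned array as the candidate list, either inline or in a separate candidate window.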
Some Japanese IMEs show annotations in their candidate windows for characters that are not easy to distinguish from other characters (such as full-width alphabets, full-width Katakana, and half-width Katakana), as shown in the following figure.
The next figure shows a candidate window of a Simplified-Chinese IME.
And the next figure shows a candidate window of a Traditional-Chinese IME.
A converter often integrates an MRU (Most-Recently Used) list. Even though there are many ideographic characters for each phonetic character (or phonetic radical), a user does not usually use all these ideographic characters. A converter uses an MRU list to filter rarely used ideographic characters out of a candidate list. A converter sometimes integrates a grammar parser. A converter that integrates a grammar parser splits the given phonetic characters into grammatical clauses and converts only one clause at a time. When a sequence of phonetic characters consists of n clauses and the i-th clause has m_i candidates, the total number of candidates for the input characters becomes (m_1 * m_2 * … * m_n). To reduce the number of candidates it has to handle, a converter usually processes one clause at a time. This clause is called the selected clause.
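To make the arithmetic above concrete, the following sketch compares the number of candidates for a whole sequence with the largest number of candidates a converter has to present when it handles one clause at a time. The function name countCandidates and the sample counts are hypothetical.

```javascript
// candidatesPerClause[i] is m_i, the number of candidates of the i-th clause.
function countCandidates(candidatesPerClause) {
  // Converting all clauses together: m_1 * m_2 * ... * m_n combinations.
  const total = candidatesPerClause.reduce((product, m) => product * m, 1);
  // Converting only the selected clause: at most max(m_i) candidates at once.
  const largestClause = candidatesPerClause.reduce(
    (max, m) => Math.max(max, m), 0);
  return { total, largestClause };
}
```

For three clauses with 3, 4, and 5 candidates, converting everything together yields 60 combinations, while converting the selected clause alone never requires showing more than 5 candidates.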
An IME usually renders a selected clause with a special style to distinguish it from other clauses, as shown in the following figure.
When a converter converts two or more clauses, it chooses candidates for the selected clause so it becomes grammatically consistent with the surrounding clauses: e.g., Japanese converters usually output ‘危機一髪’ (not ‘危機一発’) for Japanese phonetic characters ‘ききいっぱつ’ because ‘危機一発’ is grammatically incorrect.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].
More to be written.
To be written.
For each element, a user agent can choose an IME for the element. To control the IME attached to an element, it is a good idea to add a method to the HTMLElement interface.
If the getInputContext() method cannot be added to the HTMLElement interface, it should be moved to the InputMethodContext interface.
interface HTMLElement {
…
Object getInputContext ();
};
getInputContext
Object
This interface represents an ongoing IME composition. It provides an attribute representing the text being composed by an IME. It also provides a method to retrieve attributes of the specified character in the composition text.
interface Composition {
readonly attribute DOMString text;
readonly attribute CompositionCaret caret;
readonly attribute CompositionClauseList clauses;
};
caret of type CompositionCaret, readonly
clauses of type CompositionClauseList, readonly
text of type DOMString, readonly
This interface represents the caret in the composition text. When the value of its "length" attribute is 0, an IME uses a vertical bar as its cursor. Otherwise, an IME uses a block cursor. When an IME does not show a caret, both values must be -1.
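The caret rules above can be summarized in a small helper. This is a sketch only: the function name caretStyle and the returned strings are hypothetical, and the argument is assumed to be any object exposing the start and length attributes of a CompositionCaret.

```javascript
// Maps a caret object to the cursor shape a Web application could draw.
function caretStyle(caret) {
  // Both values are -1 when the IME does not show a caret.
  if (caret.start === -1 && caret.length === -1) {
    return 'none';
  }
  // A zero-length caret is drawn as a vertical bar, anything else as a block.
  return caret.length === 0 ? 'bar' : 'block';
}
```

A canvas-based application could call such a helper from its compositionupdate handler before deciding how to paint the caret.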
interface CompositionCaret {
readonly attribute int start;
readonly attribute int length;
};
length of type int, readonly
-1.
start of type int, readonly
-1 if an IME does not show a caret. The default value is -1.
This interface represents a clause of the composition text. It also represents the attributes of the clause retrieved from a user agent, so Web applications can render the clauses as a user agent does.
Retrieving attributes is not supported by all operating systems or all IMEs.
When a user agent cannot retrieve attributes from an operating system, it sets default values for them.
interface CompositionClause {
readonly attribute DOMString text;
readonly attribute int start;
readonly attribute boolean selected;
readonly attribute DOMString textColor;
readonly attribute DOMString backgroundColor;
readonly attribute DOMString lineStyle;
readonly attribute DOMString lineColor;
};
backgroundColor of type DOMString, readonly
"transparent".
lineColor of type DOMString, readonly
currentColor.
lineStyle of type DOMString, readonly
"none", "solid", "double", "dotted", "dashed", and "wave". The default value is "solid".
selected of type boolean, readonly
start of type int, readonly
text of type DOMString, readonly
textColor of type DOMString, readonly
currentColor.
typedef sequence<CompositionClause> CompositionClauseList;
CompositionClause> type.
interface InputMethodContext {
readonly attribute DOMString source;
readonly attribute Composition composition;
boolean setEnabled (in boolean enabled);
boolean isEnabled ();
boolean setInputMode (in DOMString script, in DOMString modifier);
boolean hasComposition ();
void setComposition (in Composition composition);
void confirmComposition ();
void setCaretRectangle (in int x, in int y, in int w, in int h);
boolean setOpenState (in boolean open);
};
composition of type Composition, readonly
source of type DOMString, readonly
confirmComposition
Finishes the ongoing composition of the hosting user agent.
When a Web application calls this function, a user agent sends a compositionend event and a textInput event, as it does when a user types an ‘Accept’ key as described in the “Input Method Editors” section of the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS].
This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.
void
hasComposition
Returns true when the hosting user agent is composing text.
This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.
boolean
isEnabled
boolean
setCaretRectangle
Notifies a user agent of the rectangle occupied by the composition text. When a user agent renders a candidate window or a composition window, it uses this rectangle to prevent these windows from being rendered over the composition text.
On Windows, this rectangle is used as a parameter for ImmSetCandidateWindow(). On Mac, this rectangle is sent when it calls [firstRectForCharacterRange:]. On Linux (GTK), this rectangle is used as a parameter for gtk_im_context_set_cursor_location().
void
setComposition
Updates the composition information of the hosting user agent.
When a JavaScript-based IME starts a composition, it must call this function with the appropriate composition information. When a JavaScript-based IME cancels an ongoing composition, it must call this function with a composition object whose text is empty. A user agent sends a compositionstart event when this function is called while hasComposition() returns false. On the other hand, a user agent sends a compositionupdate event when a Web application calls this function while hasComposition() returns true.
This function is just copied from WebKit, to solicit opinions from developers of JavaScript-based IMEs.
void
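The event sequence described for setComposition() and confirmComposition() can be illustrated with a mock context. This sketch does not implement the API: the class MockInputMethodContext is hypothetical, it only records the names of the events a user agent would send, and treating an empty text as the end of the composition is an assumption of this sketch.

```javascript
// A hypothetical stand-in for InputMethodContext that records event names.
class MockInputMethodContext {
  constructor() {
    this.composition = null;
    this.events = [];
  }

  hasComposition() {
    return this.composition !== null;
  }

  setComposition(composition) {
    // compositionstart when nothing is being composed, otherwise
    // compositionupdate.
    this.events.push(
      this.hasComposition() ? 'compositionupdate' : 'compositionstart');
    // Assumption: an empty text cancels the ongoing composition.
    this.composition = composition.text === '' ? null : composition;
  }

  confirmComposition() {
    // Finishing a composition sends compositionend and textInput.
    this.events.push('compositionend', 'textInput');
    this.composition = null;
  }
}
```

Calling setComposition() twice and then confirmComposition() on a fresh mock records compositionstart, compositionupdate, compositionend, and textInput, in that order.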
setEnabled
Controls the state of the IME associated with this context.
<canvas> element.
boolean
setInputMode
Provides a hint to the user agent so it can select the appropriate input mode of its associated IME. This function returns true when a user agent can change the input mode of its associated IME. Otherwise it returns false.
The parameters for this function are copied from Annex E of the “XForms 1.0” specification [XFORMS10], for consistency with that specification.
"digits", "halfWidth", "kotei", etc.
boolean
setOpenState
Controls the state of the IME currently associated with the hosting user agent. This function returns true if a user agent can activate or deactivate its associated IME.
Do we need to notify JavaScript IMEs of this event? If so, what is the best option?
boolean
This section is non-normative.
This specification provides two types of interfaces:
Moreover, this API depends on several existing specifications to minimize the change for existing JavaScript IMEs.
These dependencies make it harder for developers to use this API in their JavaScript IMEs or IME-aware Web applications. This section describes practices for some use cases.
more info to come later…
Existing JavaScript IMEs use DOM events (e.g., "keydown", "keyup", "focus", "blur", etc.) to compose text. To avoid forcing developers to change their JavaScript IMEs too much, this API does not provide any callbacks; i.e., this API allows them to use their existing handlers for DOM events. On the other hand, when a JavaScript IME updates its composition text, it needs to call setComposition() instead of inserting text by itself. When a JavaScript IME calls setComposition(), a user agent sends a composition event and renders the composition text as it does for system IMEs. The following figure illustrates a sequence that composes text with a JavaScript IME which emulates the first example in the “Input Method Editors” section of the DOM Level 3 Events specification [DOM-LEVEL-3-EVENTS].
When a JavaScript IME calls setComposition(), it must also call preventDefault() to prevent user agents from inserting the typed character into an element.
A JavaScript IME should not consume keyboard events when the hosting Web application has disabled it. When it receives a keyboard event, the JavaScript IME should call getInputContext().isEnabled() and consume the event only when that call returns true.
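The practices above can be sketched as a keyboard handler. The helper below is illustrative only: handleKeyDown and the composeText callback are hypothetical names, the input context is passed in explicitly so the sketch does not depend on getInputContext(), and passing a plain object to setComposition() is an assumption of this sketch.

```javascript
// Handles one keydown event on behalf of a JavaScript IME.
// Returns true when the IME consumed the event, false otherwise.
function handleKeyDown(event, inputContext, composeText) {
  // Do not consume keyboard events while the Web application has
  // disabled the IME.
  if (!inputContext.isEnabled()) {
    return false;
  }
  // Update the composition through the API instead of inserting text.
  inputContext.setComposition({ text: composeText(event.key) });
  // Prevent the user agent from inserting the typed character itself.
  event.preventDefault();
  return true;
}
```

An existing keydown listener could delegate to such a helper and leave the event untouched whenever it returns false.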
Existing JavaScript IMEs usually use so-called CSS layers to render their candidate windows. However, some JavaScript IMEs use absolute coordinates to render their candidate windows (i.e., <div style="position: absolute">…</div>), while others use relative coordinates to render theirs (i.e., <div style="position: relative">…</div>). To satisfy both needs, this API provides two methods that retrieve the caret rectangle. This API provides window.inputmethodmanager.getCaretRectangle() for JavaScript IMEs that need the absolute position of the caret rectangle of the ongoing composition text. On the other hand, this API provides getInputContext().getCaretRectangle() for JavaScript IMEs that need its relative position.
When developers develop an IME-aware Web application, they need to decide which IME to use in their Web application: JavaScript IMEs, system IMEs, or none. The following sections describe practices of these three cases.
JavaScript IMEs may not dispatch keyboard events consumed by them to Web applications. Therefore, developers should not depend on such keyboard events to develop their Web applications when using JavaScript IMEs.
On the other hand, a Web application that uses the system IME must enable the system IME and disable JavaScript IMEs when it becomes active. JavaScript IMEs may consume keyboard events even though the Web application calls getInputContext().setEnabled(false). To prevent such JavaScript IMEs from consuming keyboard events, the Web application should add its own event handlers for keyboard events.
When a Web application does not use any IMEs, it…