Re: proposal: add input/keyboard locale to text and keyboard events [ISSUE-119]

I encountered a tremendous amount of push-back from the WHATWG over locale/system fingerprinting in response to my proposal that a getSpellingRanges method be implemented. They worried that such a feature would allow the locale to be detected by injecting locale-specific words (colour vs. color). I remarked that information such as the OS is generally available through the navigator object anyway. No success. Just letting you all know there may be massive pushback on any IME-related DOM work. I am seeing it from Roc at Mozilla and Oliver at Apple.
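
To make the fingerprinting concern concrete, here is a minimal sketch of the kind of probe they were worried about. Everything in it is an assumption for illustration: the isFlagged callback stands in for whatever a getSpellingRanges-style method would report, and the probe word list and the names guessSpellingLocale and PROBE_PAIRS are hypothetical.

```typescript
// Hypothetical probe pairs: each entry is [US spelling, British spelling].
const PROBE_PAIRS: ReadonlyArray<readonly [string, string]> = [
  ["color", "colour"],
  ["center", "centre"],
  ["analyze", "analyse"],
];

type SpellingGuess = "en-US" | "en-GB" | "unknown";

// `isFlagged` is an assumed stand-in for a spell-check query; it is not an
// existing DOM method. It returns true when the checker marks the word as
// misspelled.
function guessSpellingLocale(isFlagged: (word: string) => boolean): SpellingGuess {
  let us = 0;
  let gb = 0;
  for (const [american, british] of PROBE_PAIRS) {
    const americanFlagged = isFlagged(american);
    const britishFlagged = isFlagged(british);
    // A pair is informative only when exactly one of the spellings is flagged.
    if (britishFlagged && !americanFlagged) us++;
    if (americanFlagged && !britishFlagged) gb++;
  }
  if (us > gb) return "en-US";
  if (gb > us) return "en-GB";
  return "unknown";
}

// Simulated spell checker backed by a tiny US English dictionary.
const usDictionary = new Set(["color", "center", "analyze"]);
const usChecker = (word: string): boolean => !usDictionary.has(word);
const usGuess = guessSpellingLocale(usChecker);
```

The point is only that a page needs no user interaction to run such a probe, which is the difference from a value that is surfaced on input events.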

-Charles



On Dec 1, 2010, at 11:57 AM, Jacob Rossi <jrossi@microsoft.com> wrote:

> We have looked at this further and we’re willing to hold off on specifying the global object for this and instead just put it on event interfaces. Placing such a property on the KeyboardEvent interface seems appropriate. However, I don’t believe putting it on TextEvent will work out well. The issue is that we can determine locale information when the inputMethod is keyboard or IME, but it is difficult or impossible to do so for the other input methods.
>  
> For example, platforms don’t expose locale information for paste/drop data in most cases. Furthermore, pasted/dropped content could even contain data with multiple locales in it. Additionally, locale isn’t well defined for multimodal or script sources. I don’t think it’s likely we’ll be able to get two useful and interoperable implementations with locale defined for more than just IME and keyboard input methods.
>  
> In light of this, I propose that instead of TextEvent we place this on CompositionEvent (in addition to KeyboardEvent).
>  
> So to reiterate what this looks like:
>  
> On the KeyboardEvent and CompositionEvent interfaces, add a readonly DOMString attribute called inputLocale, which is the BCP-47 language tag for the locale of the input character data.
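
A minimal sketch of how a page might consume the proposed attribute, for example to choose a spell-check dictionary. The InputLocaleEvent interface below is a stand-in written for this example, not the real DOM KeyboardEvent or CompositionEvent, and pickDictionary, primaryLanguage, and the empty-string "unknown" value are all assumptions.

```typescript
// Stand-in for the proposed shape; not the real DOM event interfaces.
interface InputLocaleEvent {
  readonly type: string;        // e.g. "keydown" or "compositionend"
  readonly inputLocale: string; // BCP-47 tag such as "en-US"; "" if unknown (assumption)
}

// Returns the primary language subtag of a BCP-47 tag, lowercased.
function primaryLanguage(tag: string): string {
  const [primary = ""] = tag.split("-");
  return primary.toLowerCase();
}

// Hypothetical helper: choose a dictionary key for the event's locale.
// `available` is expected to hold lowercased keys such as "en-us" or "ja".
function pickDictionary(
  event: InputLocaleEvent,
  available: ReadonlySet<string>,
  fallback: string,
): string {
  const exact = event.inputLocale.toLowerCase();
  if (exact !== "" && available.has(exact)) return exact;
  const language = primaryLanguage(event.inputLocale);
  if (language !== "" && available.has(language)) return language;
  return fallback;
}

const dictionaries = new Set(["en-us", "ja"]);
const chosen = pickDictionary(
  { type: "compositionend", inputLocale: "ja-JP" },
  dictionaries,
  "en-us",
);
```

Matching first on the full tag and then on the primary language subtag is one simple fallback strategy; a real implementation would more likely use a full BCP-47 lookup algorithm.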
>  
> Useful resources:
> [1]  List of BCP-47 language/region subtags:   http://www.iana.org/assignments/language-subtag-registry
> [2]  BCP-47: http://www.rfc-editor.org/rfc/rfc4646.txt
> [3]  MSDN - LCIDs to BCP-47 codes: http://msdn.microsoft.com/en-us/library/ff531705(office.12).aspx 
>  
> -Jacob
>  
> On Wed, Oct 6, 2010 at 2:16 AM, Jacob Rossi <jrossi@microsoft.com> wrote:
>  
> >  > 2010/9/14 Aharon (Vladimir) Lanin <aharon@google.com>:
> > 
> > >> Perhaps it should then be redefined as .lastInputLocale, indicating the
> > >> locale of the last text input device to have generated input. But then if
> > >> the last text event was due to a paste operation, would it become null? If
> > >> yes, then the input locale is no longer available outside the scope of
> > >> events until the next time there is input. And if no, then during the
> > >> paste's text event its value would be misleading (it has nothing to do with
> > >> the pasted text). I guess it could become null during the paste text event,
> > >> and then go back to the last non-null value after the event is over, but
> > >> this is getting a little complicated.
> > >>
> > >> What's wrong with putting the value right in the event, where there is no
> > >> possible ambiguity?
> > >>
> > >> Mind you, I agree that having the input locale available outside the scope
> > >> of events would indeed be useful too. Perhaps (some global
> > >> object).lastInputLocale should be made available in addition to inputLocale
> > >> in the events. It would get updated on every text and keyboard event with a
> > >> non-null inputLocale.
> > >
> > > Actually, there's a good argument for not exposing inputLocale outside
> > > of text/keyboard events. If it is required that the user interact with
> > > the page before exposing locale, then this reduces the ability to
> > > fingerprint the user.
> > >
> > > / Jonas
> > 
> > 
> > 
> > An additional use case would be spell-checker or auto-complete language
> > detection heuristics. We’ve done a lot of work to better support
> > international keyboards and IMEs. This API would give better context to such
> > input and likely has a variety of use cases we haven’t even identified.
> > 
> > 
> > 
> > I prefer the implementation of an always-available API (though I think
> > document.inputLocale makes more sense). It seems to me that the user is
> > likely to change his/her language settings at a much slower rate than that
> > of the firing of textInput or the keyboard events. For many uses that I can
> > think of, if the locale is only surfaced through these events, then you will
> > end up having to cache the value and check for changes at each input event.
> > This seems like a lot of extra work when you could just be notified of
> > changes (which probably aren’t that frequent).
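
The cache-and-compare pattern described above might look roughly like this. It is a sketch under assumptions: InputLocaleTracker and its observe method are names invented for the example, and the locale strings are fed in by hand rather than read from real events.

```typescript
type LocaleChangeListener = (next: string, previous: string | null) => void;

// Caches the last locale seen and reports only the transitions.
class InputLocaleTracker {
  private last: string | null = null;
  private readonly onChange: LocaleChangeListener;

  constructor(onChange: LocaleChangeListener) {
    this.onChange = onChange;
  }

  // Call from every keyboard/composition handler with that event's locale.
  observe(inputLocale: string): void {
    if (inputLocale === this.last) return; // unchanged: nothing to report
    const previous = this.last;
    this.last = inputLocale;
    this.onChange(inputLocale, previous);
  }
}

// Simulated stream of per-event locales; only the changes are recorded.
const changes: string[] = [];
const tracker = new InputLocaleTracker((next) => {
  changes.push(next);
});
for (const locale of ["en-US", "en-US", "ja-JP", "ja-JP", "en-US"]) {
  tracker.observe(locale);
}
```

Every handler has to route through something like observe, and the page still learns of a switch only once the next input event arrives.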
> > 
> > 
> > 
> > Further, some uses of locale are more likely to react to *changes* in the
> > current setting rather than to its value at the time of input.
> > Consider my example of auto-complete. If a user switches his/her input
> > language, the site would have to wait for input from the user in order to
> > react to the change. If instead the page were notified at the moment of
> > language change, then the page might have extra time before the user’s input
> > to fetch data that might be useful for that specific language (e.g.,
> > language-specific auto-completions).
> > 
> > 
> > 
> > I also think we should not treat pasted clipboard data as having an unknown
> > locale. Outside of the web browser, pasted data is treated as being in
> > whatever the user’s current input locale is. I don’t believe
> > there’s a reliable method for determining the locale of the data in the
> > clipboard (correct me, if I’m wrong). Not changing inputLocale to null (or
> > undefined) when data is pasted allows for the locale of pasted data to be
> > handled by the user via changing their input locale. Restated:  only the
> > user’s explicit changes to input locale (via keyboard shortcuts, IME,
> > speech/handwriting recognition tools, etc.) result in a change to
> > inputLocale.
> > 
> > 
> > 
> > I’d imagine it somewhat like this:
> > 
> > 
> > 
> > On the document (or arguably, the window):
> > 
> >      inputLocale  of type DOMString, readonly
> > 
> >           A BCP-47 language tag [2] representing the input locale as last
> > set by the user. When the underlying platform does not expose this
> > information, the value should be “Unidentified”.
> > 
> > 
> > 
> > As a new event:
> > 
> >           Type: inputLocaleChange
> > 
> >           Interface: Event
> > 
> >           Sync/Async:   Sync
> > 
> >           Bubbles: Yes
> > 
> >           Target: Element
> > 
> >           Cancelable: No
> > 
> >           Default action: none
> > 
> > 
> > 
> > --Jacob
> > 
> > 
> > 
> > [1] http://lists.w3.org/Archives/Public/www-dom/2010JulSep/0119.html
> > 
> > [2] http://en.wikipedia.org/wiki/BCP_47
> > 
>  

Received on Wednesday, 1 December 2010 20:14:35 UTC