EditContext API

Facilitator: Alex Keng

Demo and discussion of the EditContext API

Minutes (including discussions that were not audio-recorded)


Transcript

All right, hello everyone.

I'm Alex Keng from the Microsoft Edge Team.

Thanks for joining us today to discuss and learn more about EditContext.

The goal of today's meeting is to bring awareness, provide an update on our work, and get general feedback from the (indistinct) framework or browser implementers.

Today, I will give a quick intro on why we need EditContext and then how EditContext works.

Then I'll give some demos to show the current status.

So the total presentation will probably last 15 to 20 minutes.

And then at the end, we will have an open discussion.

Okay, so the background.

So there are various advanced text input services, like shape writing, handwriting recognition, IME composition, and dictation, provided by the OS.

So let me show you some examples.

You can find these at the end of our EditContext explainer.

So this is shape writing, where you can input a word with just one stroke.

And this is handwriting recognition, where you can use a pen as your text input method, and then the emoji picker.

And IME composition, where, if you are trying to type languages that have some special characters, you need to use this to compose your text input.

So our demo today mostly will involve this IME composition.

So I want to show you this IME composition in more detail.

So here I can change my input method from English to Traditional Chinese.

So as I'm typing, I'm not typing ABC letters.

These are some phonetic symbols for Traditional Chinese.

And these phonetic symbols can compose into one word, like this.

And while I'm doing this composition, I can press left, left to move the caret.

And then I can press down to choose a different word, because all these words share the same pronunciation.

So I can even choose a phrase that can update two words at once, or three at once, like this.

So after I finish the composition, I press Enter and you can see this underline decoration will disappear.

And at this point, if I press left, left and press down, I can no longer change the word I was trying to compose, because this word, as we call it, is committed.

And also I want to show you another advanced text input.

Let me switch to English first. This one is dictation.

So if I press the Windows key and H, so this is a demo for dictation.

So you can see that we briefly see this underline decoration as well.

So that means this sentence at the time was being composed by the dictation engine.

Okay.

So now I'll go back to the slide.

So all these advanced text input services are available for web apps.

However, the only way a web app can access them is to place an editable element in the DOM and focus it.

So the editable elements include input, textarea, or a contenteditable div.

And these three elements have their own capabilities; they are all different.

However, in general, these are not sufficient.

They are not enough for web-based editors.

So, this is the current framework.

So editors will need to have their own models and they will need to convert their model to HTML and use it to update the DOM.

So this DOM will also, I mean this editable element will also be responsible to take the text input.

For example, it takes English typing like "hello" from a physical keyboard, or Chinese typing like this through the IME, or touch or voice text input. It will also be taking the input for text selection, to update the caret and change the range of the text selection.

So with this IME here, as you saw before, we can press left, left, right, right, and then up and down to choose different words. So the IME will also need to access the view and update it, to show the user what's going on. So this is where the problem is: the text input is tied directly to the DOM view, and the IME needs to access the view. And at the same time, the editor also needs to update the view. So these two will sometimes conflict, which can make it very tricky for editors to implement their editing experiences.

So currently there are three approaches that editors use to work with the current architecture. The first is to directly use the editable element and incorporate it into the view. The second is to completely hide it, so in the view you will not see that editable element. The third one is the hybrid model, where you sometimes show the editable element and sometimes hide it. So basically the editors are forced to choose between these two: they either let the editable element update the view or let the model update the view. You have to choose one of them.

So today I'm going to show you three examples, one for each approach, and all of them have some side effects. The first one is Word Online. Word Online uses the editable element in the view. So here in Word Online this whole canvas is a contenteditable div. So the caret and the underline decoration you see here are all native: native caret, native decoration.
And the problem with this is that while the composition is updating the view, Word Online is not able to update the view, which causes a problem for concurrent composition. So let's see some video. Okay. All right.

So this is Word Online, and these are two users updating the document at the same time on two different devices. You can see that when the user on the right is typing, the documents are in sync. But now the user on the left is trying to type Chinese, so there is an active composition. So when B is typing, the sync is no longer working. Right. The reason is that while a composition is active, the composition needs the view to remain unchanged. If at this point Word Online tries to update the view with the new text input from user B, the internal state will be messed up and the user will be very confused.

So this is an example of what would happen if we update the view while we have an active composition. We have an active composition, and now user B is typing. Notice that the underline decoration is gone, which means that some state is messed up. And now if user A tries to continue the composition, we see that we have this duplicated text. (indistinct) Let me, yeah, we have this duplicated text. So this is the side effect of approach one.

Now, let's see the second approach. The second approach is to totally hide the editable element, which is how Google Docs works with the current framework. So in Google Docs, you see that we have this iframe, a very tiny iframe, one pixel tall. Inside this iframe we have an editable element, which I think is a contenteditable div. If you are typing English, it is one pixel tall. If you are typing something like Chinese, it becomes bigger, but you are still not able to see the editable element.
And the problem with this is, first of all, because it completely hides the editable element, they have to render this on their own. So this will be custom rendering. As we can see in Japanese, sometimes you can have different styles for the underline decoration: you can have a solid line and a dotted line. Users of Google Docs cannot see this difference. And you will also have some position issues and caret issues, which is what I'm going to demo now.

So if I type some Chinese. First of all, when I press left, left, left, I have already pressed left three times, but you can see the caret is not moving, and this is a pretty big issue. And now if I press down, you see, I'm trying to change the second word. But right now my (indistinct) window is not close to the caret, where it's supposed to be. So this (indistinct) window is supposed to be here. And see if I update it to this word. Now I'm updating this, and I can update this and press Enter. So this is an issue with approach two: because they completely hide the element, sometimes you will see some unexpected bugs. And that would be very confusing to users.

And the third approach is the hybrid one. This is for Visual Studio Code, or the Monaco Editor. In the hybrid approach, when you type English, it uses a one-by-one-pixel textarea. But if you are typing Chinese, this textarea will expand to just contain the word that you are typing. So you can see here that this is a native decoration, and there will be a native caret inside. So the caret manipulation will be kind of tricky.

So let's see. This is the Monaco Editor. They have this special feature where you can have multiple caret positions. Like, I press Alt+click, Alt+click, and I can type English in three places at the same time. And then I can move the caret, I can delete, and it all works fine.
However, if I type Chinese and press left, now I'm seeing a fourth caret, because this one is the native one, and these three are custom rendered by the Monaco Editor. And this is the issue with approach three. They also have an accessibility issue, but I will not demo it right now. If you are interested, you can see the explainer for more details.

Okay. So we have all these issues; how do we address them? Right now we are proposing a new architecture, because the main issue is that the text input is tightly coupled with the DOM view. In this new architecture we decouple the text input from the DOM. So we have this new component. You can call it the text input view, or the text-input-facing view. Basically, your model will now have two views. One is the user-facing view, where the user can see what's going on. The second view is for IME, (indistinct) input, or text input. So when the user has input through the IME, this new component will hold the state for the IME, and the editors will now be free to update the user-facing view however and whenever they want.

And here are some details. When the input goes to the EditContext, the EditContext will use the text update event to send the new text to the model. It will also use the text format update event to provide the style information for the underline decoration, so the editor model will need to know how to render that decoration.

Now let's move on to the demos. Okay. So we'll have four demos, and the first one will have some code samples. The first one is the trivial model. Here we have this div with contenteditable false. So basically it's just a read-only div, as you can see here. It's false, and then we have some style.
So right now, if you click it and press A, B, C, nothing happens, because it's read-only. But now, if you create an EditContext and use this API, attachEditContext, to associate the EditContext with the div, and then save, and let's refresh it. Now you can put the caret in the div; it becomes an editable element. And if you press A, you can also see the events received by the EditContext or by this div. You can see that the div is receiving keyup, keydown, and keypress events, and the EditContext is receiving this text update event. This information is maintained in the EditContext and the view remains unchanged, because, remember, in the architecture I showed earlier, the text input now goes through the text-input-facing view, not through the user-facing view. So right now the user-facing view will not be updated. I can continue to type ABC, ABC, or I can continue to type Chinese.

Okay. So now this EditContext buffer is available to the model, and the model can use it however and wherever it wants. In most cases, the model will use the EditContext buffer to update its view. So let's see some examples. Let's uncomment this update view. Okay, save. So this update view is what I mean by a trivial model. In this example we are just taking the text from the EditContext buffer and assigning it to innerHTML. Basically, we just use the text (indistinct) buffer as our model.

So now that we've uncommented it, let's refresh. Now when I press A... oh, let's switch back to English. When I press A, B, C, you see that our EditContext buffer has ABC and our view also has ABC. I can also type Chinese, and I can press left, left to move.
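As a rough illustration of the trivial model described above, the sketch below applies a text update to a plain string buffer and derives the view from it. It is not the code from the demo: the `TextUpdate` shape is an assumption modeled on the fields of the `textupdate` event in the current EditContext specification, and `TrivialModel`, `applyTextUpdate`, and `renderView` are hypothetical names.

```typescript
// Hypothetical shape of one text update. Field names are modeled on the
// `textupdate` event in the current EditContext specification (an
// assumption; the demo in the talk may have used different names).
interface TextUpdate {
  updateRangeStart: number; // start of the replaced range in the buffer
  updateRangeEnd: number;   // end (exclusive) of the replaced range
  text: string;             // text that replaces the range
  selectionStart: number;   // selection after the update
  selectionEnd: number;
}

// The "trivial model": the editor's model is just the text plus a selection.
interface TrivialModel {
  text: string;
  selectionStart: number;
  selectionEnd: number;
}

// Replace the updated range in the buffer and adopt the new selection.
function applyTextUpdate(model: TrivialModel, update: TextUpdate): TrivialModel {
  const start = Math.min(update.updateRangeStart, update.updateRangeEnd);
  const end = Math.max(update.updateRangeStart, update.updateRangeEnd);
  return {
    text: model.text.slice(0, start) + update.text + model.text.slice(end),
    selectionStart: update.selectionStart,
    selectionEnd: update.selectionEnd,
  };
}

// Stand-in for "update view": returns the string that would be assigned to
// the div's content. The editor decides when to call it.
function renderView(model: TrivialModel): string {
  return model.text;
}

// In a browser this might be wired up roughly as follows (not run here):
//   editContext.addEventListener("textupdate", (e) => {
//     model = applyTextUpdate(model, e);
//     div.textContent = renderView(model);
//   });

let demoModel: TrivialModel = { text: "", selectionStart: 0, selectionEnd: 0 };
demoModel = applyTextUpdate(demoModel, {
  updateRangeStart: 0, updateRangeEnd: 0, text: "ABC",
  selectionStart: 3, selectionEnd: 3,
});
console.log(renderView(demoModel)); // prints "ABC"
```

Because the buffer and the view are separate values here, the editor is free to skip or delay `renderView` without disturbing the state that the text input side depends on.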
And I can press down to select a different word. Notice that when I press left and right, this curly bracket is also updated. The EditContext has information for the text buffer and also for the selection. So you see that when I press left, this moves to the corresponding position, and it also moves to the corresponding (indistinct). That means the EditContext and the view are in sync. This is very important: between the two views we have, the user-facing view and the text-input-facing view, the information needs to stay in sync. Some of that the browser will do for you, and some of it the editors will need to help with.

Okay, let's show more examples. So now I'm going to show you what happens if I continue to type so we have a new line. If I now press up to move the caret, we can do that with no problem. This was one of the blocking issues mentioned by developers, who said that caret navigation is tricky for EditContext, because EditContext doesn't have a way to access the layout; it's basically just a character buffer. So how does EditContext know where to move the caret? It's because here we are using the native selection. When the user interacts with the user-facing view, we also get the selectionchange event. The selectionchange event gives the model the information about the caret in the current user-facing view. So the (indistinct) model needs to find a mapping between the user-facing view and the text input view. And because right now I'm using the trivial model, I can pretty easily get the offsets from the anchor and the focus. Then I call this EditContext updateSelection to update the caret position. So this is how we can achieve caret navigation with EditContext.

And there's another demo for this new behavior. Let's see here. This is another thing I want to demo. So this is the beforeinput event.
Now I can press Backspace, Backspace to delete characters, but if I uncomment this and call preventDefault, now when I press Backspace I cannot delete anymore, because the event is prevented.

You can also see there's a squiggle here. This is for spellchecking, which is another one of the blocking issues mentioned by developers. Because here we are using the user-facing view, when I right-click this and press this uppercase ABC, I get this event, insertReplacementText, and I can get the information from this event and use it to call updateText on the EditContext. And now you can see that our EditContext buffer is in sync with our (indistinct). Okay, so this is the first demo, the trivial model.

Let's see the second one, multi-cursor editing. As you remember, in Monaco, when you type Chinese in this multi-cursor scenario, you cannot move left, left, right, right. But now we can do that. The carets can move at the same time, and I can also choose different words and commit.

All right. So the third one is cross-boundary composition. Before I show that, I want to show you this. This is native Word. Native Word has this two-page layout, so I can do dictation here, and you can see that the dictation can go across the boundary to the second page. So this is a demo of cross-boundary dictation. Yeah. So you can see that, let me, you can see that the underline is crossing the boundary. But you cannot see this in Word Online. This is Word Online, where they don't provide the two-page view. One of the reasons is that they just cannot do it, because two pages would mean two views, and if you want to do composition across the boundary, it will be pretty tricky.

Okay. So now I'll go back to my demo. Now I press the Windows key and H. So this is a demo for cross-boundary dictation.
So now you can see that, oh, you can see that the decoration is crossing the boundary.

All right. So now let's see the last demo. The last demo is the collaboration demo. So you see user A is typing Chinese, and now, while user B is typing, we can still see that it is syncing to the other document.

All right. So, summary. EditContext can decouple text input from the DOM view, which can unblock a lot of sophisticated editing scenarios. Our progress since last year is that we support English typing, and we introduced this attachEditContext API so that we can create the association between the DOM and the EditContext, or between the user-facing view and the text-input-facing view. That enables a lot of things like beforeinput events, caret navigation, and spellcheck, like I just demoed before.

All right. So that's a wrap for my presentation. Now we are open for questions or any topics you want to discuss.
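The caret navigation mentioned in the summary depends on mapping a position in the user-facing view back to an offset in the EditContext buffer. In the trivial model that offset comes straight from the selection's anchor and focus. The sketch below assumes a slightly richer, hypothetical view that renders each line of the buffer as its own element, so a view position is a line index plus an offset within that line. `ViewPosition`, `viewToBufferOffset`, and `bufferOffsetToView` are illustrative names, not part of the API.

```typescript
// Hypothetical position in a user-facing view that renders each buffer line
// as a separate element (an assumption made for this sketch).
interface ViewPosition {
  line: number;   // index of the line element
  offset: number; // offset within that line's text
}

// Map a view position to a flat offset in the text buffer.
function viewToBufferOffset(bufferText: string, pos: ViewPosition): number {
  const lines = bufferText.split("\n");
  if (pos.line < 0 || pos.line >= lines.length) {
    throw new RangeError("line " + pos.line + " is outside the view");
  }
  let offset = 0;
  for (let i = 0; i < pos.line; i++) {
    offset += lines[i].length + 1; // +1 for the newline that ends line i
  }
  // Clamp the in-line offset so a stale selection cannot overshoot the line.
  const inLine = Math.max(0, Math.min(pos.offset, lines[pos.line].length));
  return offset + inLine;
}

// Map a flat buffer offset back to a view position, for example to draw a
// custom caret at the place the text input side reports.
function bufferOffsetToView(bufferText: string, offset: number): ViewPosition {
  const lines = bufferText.split("\n");
  let remaining = Math.max(0, Math.min(offset, bufferText.length));
  for (let i = 0; i < lines.length; i++) {
    if (remaining <= lines[i].length) {
      return { line: i, offset: remaining };
    }
    remaining -= lines[i].length + 1; // skip this line and its newline
  }
  // Not reached for clamped offsets; kept so every path returns a value.
  const last = lines.length - 1;
  return { line: last, offset: lines[last].length };
}
```

In a selectionchange handler, the anchor and focus positions would each go through `viewToBufferOffset`, and the two resulting offsets would then be passed to the EditContext's updateSelection, as in the demo.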