23:07:10 RRSAgent has joined #annotation
23:07:10 logging to http://www.w3.org/2015/10/25-annotation-irc
23:07:32 Meeting: Annotation WG F2F, Sapporo, 1st day
23:07:48 rrsagent, this meeting spans midnight
23:08:01 Chair: Rob
23:10:17 Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
23:10:45 ivan has changed the topic to: Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
23:11:00 Present+ Ivan_Herman
23:40:46 azaroth has joined #annotation
23:51:47 clapierre has joined #annotation
23:51:57 clapierre has left #annotation
23:52:56 clapierre has joined #annotation
23:53:03 Present+ Rob_Sanderson, Charles_LaPierre, Doug_Schepers
23:53:23 takeshi has joined #annotation
23:54:21 kurosawa has joined #annotation
23:54:48 shepazu has joined #annotation
23:56:45 clapierre has left #annotation
23:56:51 clapierre has joined #annotation
00:01:07 Jeff_Xu has joined #annotation
00:01:39 csarven has joined #annotation
00:04:48 present+ rhiaro
00:06:08 ScribeNick: rhiaro_
00:06:09 scribenick: rhiaro_
00:06:22 Present+ Takeshi_Kanai
00:07:33 present+ csarven
00:07:47 ivan: one of two staff contacts in the group. Also leading DPUB. One of the reasons I'm in this group is because the digital publishing community has major use for annotations. Not at w3c, but ietf, already using first draft of annotation model, big use case.
00:07:51 ... Also part of DPUB IG
00:08:11 ... Other interest, Semantic Web Activity lead for 7 years.
00:08:48 clapierre: ... With Benotech(?), also in DPUB and co-chair of accessibility task force of that group. Interest in how to use annotations, for disabled community
00:09:24 s/Benotech(?)/Benetech
00:09:42 kurosawa has joined #annotation
00:11:16 azaroth: at (?) University, one of the two chairs. In 2009 there were two projects: web annotations and humanities, and one focussed on science annotation, found out about each other, merged goals to start CG.
In 2013 completed CG work and last year with Ivan and Doug's assistance we started WG. My interest is from that history, but also my academic background is in humanities, PhD in medieval French. Imagine a non-scholar trying to read a French
00:11:16 manuscript, having annotations to describe what's going on is important
00:11:27 s/(?)/Stanford/
00:11:48 Jeff(?): from (?). Annotations important to share annotations with users, and between user and publisher
00:12:09 shepazu: We need to put in our use cases the idea of sharing annotations between users and publishers
00:12:24 ... Doesn't immediately occur to people that annotations can be shared with publishers
00:12:29 ivan: Isn't it in DPUB use cases?
00:12:48 shepazu: two kinds of publishers. Of a blog/website. But also publisher of site where you're reading an ebook
00:12:57 ... Need to make sure people understand two kinds of use cases
00:13:14 miyazaki has joined #annotation
00:14:29 csarven: visiting student at MIT, may join W3C as well. PhD student at Bonn. Why I'm here relates to my research on scholarly publications and how to keep annotations around both from authors, reviewers, and any commenter on the web
00:15:02 ... Try not to make major distinctions between them other than roles. Been keeping an eye on this WG on mailing list and github. Overlapping interests with DPUB that I'm also trying to follow. Also in SocialWG. All these things overlapping.
00:15:13 ... Sarven Capadisli ^
00:15:48 Eric(?): From iMinds. Semantics group there. Been with W3C since 2006
00:15:54 ... Primarily involved in semantics and media fragments
00:15:58 s/(?)/ Mannens/
00:16:14 ... Afterwards in Prov WG, and now in annotations and publishing because we have two big projects on that
00:16:24 ... One Flemish one with publishers, there, and a European one called (?)
00:16:31 ... with Felix who will join later
00:16:53 ... Guys from my team in this group. Open source framework to publish ebooks, completely based on (?)
and HTML5
00:17:04 s/(?)/EPUB3/
00:17:05 ... Definitely going to implement your spec as one of the reference implementations
00:17:15 ... I always have to look for new stuff for my team, so will be in and out today
00:17:55 rhiaro_: Amy Guy, University of Edinburgh PhD student and visiting student at MIT, in SocialWG
00:18:35 miyazaki: from Japan Broadcasting Corporation. First time at TPAC, observer in this meeting. Research Engineer, in charge of constructing RDF database of TV programmes
00:18:41 ... Very interested in semantic web technology and social media
00:18:53 ... Interested in how to handle people's reviews of TV programmes
00:18:59 ... How to structure and so on
00:19:30 takeshi: From Sony, Japan. Sony has released a device named digital paper, so you can take notes on the device like you are writing on paper
00:19:56 erikmannens has joined #annotation
00:19:59 ... It is based on PDF. My motivation is to replace that with web
00:20:11 ... I have implemented a prototype into the device
00:20:26 ... Can't bring the device
00:20:52 ... Also spent many years in ebook industry. Also contributed to epub format. And printing industry background.
00:21:10 shepazu: Rob failed to mention that he's the editor of two of the specs from this WG
00:21:31 ... Web Annotation data model spec, and also Web Annotation protocol spec, which is based on LDP
00:21:43 ... Basically the notions of how to publish annotations to different servers, write API for the web
00:21:53 ... Might be useful for us to go through individual items in our charter.
00:22:23 ... I'm Doug Schepers, staff contact for this group. Instigator of larger idea of web annotations beyond data model stuff in CG. Bring together group that solves lots of different parts of the problem.
00:22:53 ... Also staff contact for SVG WG and Accessibility. Also Web Audio API WG. Touch Events WG. Web Payments WG.
00:23:14 ... Appreciate everyone showing up, we should have more people later.
00:23:36 azaroth: Thanks everyone. Until we are more familiar, just say who you are and/or your handle on IRC
00:23:44 ivan: we are expecting one more person in half an hour
00:24:19 ivan: I meant to mention, I'm also the editor of one of the specs - Find Text API that was just published as FPWD
00:24:43 s/ivan/shepazu/
00:25:14 shepazu: Any questions or comments, feel free to ask
00:25:47 azaroth: One note about the WG - we are public, all communication is done publicly
00:26:00 ... Try to be somewhat less formal than other WGs and just roll with it and see how things go
00:26:12 ... Try to make most appropriate use of time and conversations
00:26:33 ... Might be slower, but we expect deliverables will be better
00:27:20 Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
00:27:28 TOPIC: Agenda review
00:28:10 azaroth: Today the majority of the agenda is focussed around client APIs. First FindText work that Doug has been working on
00:28:28 ... We have a joint meeting with Web Platform about that at 1130
00:28:36 ... Before that discussion particularly about i18n
00:28:50 ... After lunch, a meeting with Felix around translation
00:29:27 ... Then after that, we have to have tests in place for all of our work. Testing an abstract data model is somewhat complex. We have an IE, Chris Berg, who is going to be leading testing.
00:29:45 ... But if we could discuss how we want to go about testing for all of the different APIs and models and so on, we could make some good progress
00:30:46 ... A note from the programme, the break is actually between 3 and 4, but that's when we're meeting with DPUB
00:31:16 ... So not sure of exact time, but we have joint meeting with DPUB, particularly around use cases
00:31:44 ... I'm editor of DPUB note on annotation use cases, which can feed into this group
00:32:06 ... Some blank time, probably that will get taken up with discussions
00:32:18 ...
Towards the end of the day we want to work on next steps for the client APIs
00:32:28 ... The charter has a broad pool of clientside API deliverables
00:32:40 ... Essentially says create some specifications that help browsers to create and deliver annotations
00:32:57 ... The FindText API is one of those, but there may be others that can be worked on within that scope
00:33:14 ... We do have the beginnings of a second one called DOM Annotations, but after some initial work things stalled a little bit
00:33:32 ... Would be good to discuss what would be useful, and who we might be able to reach out to help us
00:33:47 ... Wrap up at the end of the day. Any questions or thoughts about today's agenda?
00:34:04 ivan: Want to talk about URIs sometime?
00:34:14 shepazu: While we're talking about items in charter
00:34:45 ... Also could during FindText
00:34:57 ivan: We should put it on the table as something we plan to do, important for DPUB
00:35:09 azaroth: at least discuss it before lunch
00:35:39 ivan: the agenda now reads as if the FindText session goes straight into i18n, but maybe it's worth having a 10 minute intro to FindText for people who are not familiar
00:36:03 bigbluehat: Benjamin Young with Hypothes.is
00:36:55 ... Hypothes.is is a nonprofit working to bring web annotation back to the web. Offer a browser extension and bookmarklet, and embed for publishers. BSD licence, Python, angularjs
00:37:01 ... I'm coeditor of data model spec now
00:37:20 Present+ Benjamin_Young
00:37:47 azaroth: just discussing what else needs to be on the agenda for today
00:38:21 ... Agenda for tomorrow. Focussed around data model and protocol
00:39:34 ... Starting with protocol because that's more important to make progress on. CG gave us a headstart with the model, and because we have a bunch of people aligned with SocialWG
00:40:37 ... 3 parts to protocol. REST (CRUD), is built on top of LDP.
Also for the SocialWG we want to use AS2 Collections and Pages to be able to break up the response
00:41:00 ... Two areas where we have made less progress, notifications from one system to another that an annotation has been created, modified or deleted
00:41:05 ... We hope work in SocialWG will help us there
00:41:13 ... last TPAC we had a good conversation around using AS2
00:41:18 shepazu: Seems like a natural mechanism
00:41:25 azaroth: After lunch, third part of protocol is search
00:41:37 ... If you have millions of annotations across all resources on the web, how do you find those you're interested in
00:41:45 ... I have some ideas around that which I was writing up on the plane
00:41:58 ... The model, alignment with SocialWG
00:42:25 ... Less around here are some new features that are annotations, more here's what is settling on social side, and what we're settling on, and how they can work together
00:42:41 ... After the break, continuing to work on further features if we still have energy
00:43:18 ... Next steps, how far along with deliverables we are, who we need help from to get there
00:43:38 ivan: Also decide whether we need another f2f, let's not leave that to the last minute
00:44:00 azaroth: unscheduled meeting that we should try to schedule is to talk with TAG about protocol issues
00:44:10 ... Erik Wilde brought up some concerns around how protocol works, most of which are derived from LDP
00:44:26 ... So given that LDP is a full recommendation, it's a little bit problematic to say we find issues with it
00:44:36 ...
But we want to be as valuable as possible
00:44:53 ivan: I wouldn't think you want to go there, if Erik has problems with LDP we shouldn't be the ones playing for Erik, it's not our role
00:45:22 shepazu: He had one thing that was not specific to LDP, which was we are saying that something is an annotation server, as opposed to a generic server, and he thought that was not a good design choice
00:45:38 bigbluehat: also our use of server singular instead of servers, as LDP can be spread across multiple machines
00:45:59 ... Editorial tweak on our part. To say URLs can be all over on the web. As long as your credentialing will let you move across machines,
00:46:13 ... Initially saying annotation client and annotation server makes it sound like a two piece thing. Just some clarification
00:46:21 shepazu: You can say that about any LDP application
00:46:55 bigbluehat: we can clarify what distinguishes an annotations one
00:46:55 ... Just Link Headers
00:48:01 Ralph arrives
00:48:19 shepazu: Ralph is domain lead of domain under which this group operates, information and knowledge domain
00:48:33 TOPIC: Overview of work to date
00:48:38 azaroth: WG has six deliverables
00:48:48 ... first being a data model for annotations, which we now have a second draft of
00:49:03 ... Derived from CG and then has been discussed thoroughly within WG
00:49:23 q+
00:49:40 ... Some of the areas that have changed are around how it interacts with other specs, eg. some of the specs that the CG used were not full recs, so we can't refer to them normatively from a rec, so we needed to remove them
00:49:57 ... Over the last few months we've talked about how to have multiple roles, one for each resource used within the annotation
00:50:07 ... Tied to the data model is a vocabulary for describing the data model
00:50:11 ... Vocabulary is in RDF
00:50:38 ...
Important to note that we rely heavily on JSON-LD as a way to have the RDF graph model be something that is understandable and implementable by people who do not have a full RDF stack
00:51:13 ... One of our main driving principles is that the results should be useable without relying on RDF specific technology. You should be able to write JS in browser and work with JSON that comes back from the server. Time will tell how successful we are with that
00:51:18 ... Something we're trying to keep in mind
00:51:37 ... The model is a graph based model, using RDF, but the way we expect most people to interact with it is via a specific JSON-LD serialization
00:51:49 ivan: is it the intention that the vocabulary will be published as a separate document?
00:52:09 azaroth: at the moment the vocabulary, the serialization and the data model are all rolled together into the data model spec, Annotation-Model
00:52:23 ... There has been some limited discussion about having multiple documents, one for model, vocab, and serialization
00:52:44 ... Tradeoffs have been that having multiple documents means you need to read multiple documents, with lots of references between them that gets complex
00:52:54 ... But if it's all in one, it's more complicated for people who just want to see certain examples
00:53:01 ... Bit of a pedagogical issue, rather than a technical one
00:53:17 shepazu: My intention is that the serialization would not be a single spec, but rather a set of specs
00:53:42 ... Eg. as HTML, or as exif data in an image. Different ways of portraying the same data that would map back to the same terminology
00:53:58 azaroth: at the moment we've been focussing on JSON-LD, but there likely will be other serializations
00:54:13 ... First three, still work, but reasonably well
00:54:23 ... Fourth is protocol, how do you transfer annotations from client to server or server to server
00:54:36 ... based around LDP, also hopefully Collections from AS2
00:54:48 ...
Five and six are closely related. Clientside API and robust link anchoring
00:55:01 ... Clientside API helps browser or user agent create and consume annotations once they have them via the protocol or some event
00:55:22 ... So the current work is around the FindText API (previously rangefinder) which allows you to do find in page with a bunch of additional cool features
00:55:25 shepazu: fuzzy matching
00:55:34 ... defines a set of parameters around which you can do fuzzy matching.
00:55:53 ... Robust Link Anchoring is a more complex topic. The first spec as Rob said is the FindText API and that deals simply with text
00:56:16 ... But if you were annotating eg. an image, there should be a way of getting at a particular part of an image, FindText does not deal with that but robust link anchoring does
00:56:32 ... Once you have the FindText API, that opens up the door to having a URL scheme that using fragment ids you can say...
00:56:39 ... Say that you have a selection of text that you want to search for
00:56:49 ... Say it's repetitive, song lyrics for example
00:56:59 Jeff_Xu has joined #annotation
00:56:59 ... So you want to say, even though this particular text appears three times, I want the third instance specifically
00:57:14 ... So in addition to saying this specific string, you also say these are the 32 characters before and after, the prefix and the suffix
00:57:31 ... Given those three things, prefix, suffix and selection, you can have a URL that says # something
00:57:53 ... haven't decided how to do it yet. Browser takes parameters and finds the instance you're looking for
00:58:12 ... If you wanted to select a passage and send a link to a friend, you can send a URL and your friend's browser takes them to the exact place
00:58:22 ... It's not the most elegant but we can't think of a more elegant way
00:58:29 ...
Obviously once you have those things, you can use those for annotations
00:58:47 azaroth: at a slightly higher level, the robust link anchoring topic is, given a resource
00:58:52 ... how do I get the representation that I want
00:58:59 ... and how do I get the bit that I'm talking about
00:59:24 ... Issue around dynamic pages. Eg. js app, makes dynamic changes to page. You annotate something, what information does the client need to reconstruct the state of the page to make the annotation make sense
00:59:41 ivan: at some point we should look at these six, and plan what is realistic and what is not in the coming year
00:59:57 ... FindText is great, but personally I don't believe that we will have the time and energy to do anything else under the 6th point
01:00:09 shepazu: not sure I agree, but okay
01:00:27 ivan: that's my opinion. Same for serializations. I don't see us doing everything needed for rec - spec, testing, etc - within less than 1 year
01:00:33 ... so we have to be realistic about what we can achieve
01:00:58 ... maybe we should say that for certain entries here, we propose an extension or a new WG, but we have to be realistic. We should try to find some time to discuss that.
01:01:24 shepazu: one last thing about robust anchoring
01:01:42 ... We talked about it largely in terms of text and images, but this also applies to media resources. Using media fragments for example to get a particular point in a video
01:01:50 ... You can also include a particular location in a video
01:01:58 ... All things that can and should be annotatable using the data model
01:02:11 ... How the robust anchoring links with the data model, it stores all the individual things as parameters
01:02:28 ... For example for text, the selection, prefix and suffix, maybe some other bits, those can be recomposed into a URL, but in the annotation they're stored as individual pieces
01:02:34 ivan: may require some adjustment between the two
01:02:46 ...
the current selectors we have in the document may not cover all the things the FindText API can do
01:02:52 ... we may need to push additional terms into the data model
01:03:03 bigbluehat: is robust anchoring the ability to re-anchor across media types?
01:03:27 ivan: I think the idea is that if you get an annotation with a target uri, and somebody changes the text, you could still find the text. Robust against change of the media.
01:04:11 azaroth: the exact change is not well defined. For example if you have a resource that does conneg for plain text, html, pdf, the URI would be the same and the text is there, but with the content negotiable representations, should one annotation be able to be re-anchored across all of those representations? Or is it for specific representations
01:04:16 bigbluehat: that definitely needs clarifying
01:04:29 ... The scenario that hypothes.is have, is publishing as html, epub, pdf
01:04:33 ... want annotations across all of them
01:04:44 ... Textually the ranges are the same, but scenarios of anchoring them are pretty different
01:05:08 BREAK
01:05:18 Back at quarter past
01:08:38 kurosawa has joined #annotation
01:16:00 azaroth has joined #annotation
01:22:57 shevski has joined #annotation
01:23:44 erikmannens has joined #annotation
01:23:47 azaroth has joined #annotation
01:29:16 takeshi has joined #annotation
01:31:11 miyazaki has joined #annotation
01:31:16 Meeting recommences
01:31:18 proposed RESOLUTION: Minutes of last call are approved: http://www.w3.org/2015/10/21-annotation-minutes.html
01:31:22 TOPIC: Resolve minutes of last meeting
01:31:34 azaroth: any objections?
01:31:44 RESOLUTION: Minutes of last call are approved: http://www.w3.org/2015/10/21-annotation-minutes.html
01:31:54 TOPIC: FindText API
01:31:56 http://w3c.github.io/findtext/
01:32:08 shepazu: Editor's Draft ^
01:32:18 ... Has latest changes since publication
01:32:22 ... Published as FPWD last week
01:32:43 ...
A little unusual in that we have a liaison in our charter with the WebApps WG to publish this document together
01:33:03 ... In the time I was working on it, there were plans to merge WebApps with HTML WG to form Web Platform WG
01:33:12 ... We put out a cfc for the FPWD of FindText
01:33:37 ... And from the time the cfc started to the time it ended the new WG launched, so through some quirk of fate this is now published by WebAnnotations and is the first spec published by Web Platform WG
01:33:48 ... Web Platform is working on all of the big clientside APIs, plus HTML
01:34:23 ... So it's good that we have their attention. I talked informally to somebody from apple who works on safari, and today I bumped into somebody from MS who works on Edge (replacement for IE)
01:34:43 ... Both of them said that so far as they could tell without having looked at it too deeply they thought that FindText seemed like a good idea and they're interested in implementing it
01:34:54 ... That would be fabulous, and get the WG the attention for the use case that we're trying to do
01:35:14 ... While not diminishing the other things, the anchoring that is enabled by FindText, along with the data model, those parts are the core of annotations
01:35:32 ... The publishing stuff is all useful, but those two pieces are the core, if we can get attention for those two pieces we are in very good shape
01:35:42 ... We also got the attention of the i18n WG
01:35:52 ... Any time you're working with text you need to make sure it's internationalized
01:36:04 ... about a year ago the i18n WG started working on a spec called charmodnorm
01:36:09 ... character model for the web normalization
01:36:20 ... worked on by Addison Philips(?) Amazon
01:36:37 ... solves so many of the problems we would have run into, we don't have to translate unicode stuff, already a spec for this, timing really fortunate
01:36:44 ...
and the fact we were working on FindText got them interested
01:36:54 ivan: that document is a note or rec to be? Timing?
01:37:08 shepazu: Rec-to-be. Don't know about timing. Probably hand in hand with FindText
01:37:16 ivan: otherwise we run into stupid administrative issues
01:37:22 shepazu: they raised several issues on github
01:37:39 ... those issues I've started resolving, some are easy some more tricky, all of them are about my own ignorance about i18n
01:37:46 ... Just educating myself about the right way to approach a problem
01:37:50 Github Issues link: https://github.com/w3c/findtext/issues
01:38:05 ... There will be a process of negotiation between us and i18n about which parts of defining text search go in FindText and which in CharModNorm
01:38:18 ... CharModNorm applies search to broader set of resources
01:38:39 ... FindText specifically developer API, and beyond i18n because there are things around edit distance
01:38:52 ... That's the background of this thing. Seems like it's going to get some momentum. Might change dramatically, but the barebones are here.
01:39:02 ... Is anybody interested in hearing how this api works?
01:39:14 ... I'll briefly tell you
01:40:04 ... Three ways you can provide feedback on the spec
01:40:10 ... Either send an email to the mailing list
01:40:13 ... public-annotation
01:40:17 ... file a bug on github
01:40:34 ... Or leave an annotation directly on the spec
01:40:48 ... Make an account, select some text and leave annotation
01:40:52 ... They're sent to mailing list
01:41:19 ... API has several parameters. Pass them in as a JSON object
01:41:47 ... Example. Here's a poem, selected because it has the words 'rage rage' several times. So how would you find the fourth instance of 'rage rage'
01:42:05 ... EXAMPLE 1 ... pass in string to FindText
01:42:15 ... call searchAll()
01:42:27 ...
find third match if you're looking for third instance
01:43:20 New arrivals: Richard Ishida, r12a, Dave Clark, Felix Sasaki, fsasaki
01:43:36 s/Dave Clark/Dave Clarke
01:44:03 shepazu: Here's another example of a search that will find that string
01:44:14 ... Initialize FindText object with this JSON object, with text and prefix
01:44:14 kurosawa has joined #annotation
01:44:27 ... This is the specific selection that we're looking for
01:44:38 ... So, the kind of parameters you can have
01:44:49 fsasaki has joined #annotation
01:45:00 present+ fsasaki
01:45:15 ... text and textDistance
01:45:47 ... Edit distance is an algorithmic way of saying how two words are related mathematically
01:46:00 ... eg. dog -> fog, edit distance is one, have to change one letter
01:46:21 ... fog -> frog, have to add a character, so edit distance is one
01:46:34 ... edit distance dog -> frog, change and add, = 2
01:46:57 ... when you're talking about typos.. on a string this small this is significant. When you're talking about longer strings it becomes more useful
01:47:17 ... when you're talking about typos and they miss one letter
01:47:22 ... still robust
01:47:34 ... you can still match, especially when you have prefix and suffix
01:47:47 ... turns out to be a very efficient way of searching for differences
01:47:58 ... edit distance is absolute number of changes
01:48:14 ... if I say I want an edit distance of one, that means I will allow one change, doesn't matter length of string
01:48:40 ... Quite likely if you didn't find match on first pass with FindText API you might increase edit distance until you find a match
01:48:52 ... once you get to an edit distance of 15-20% it's very likely this thing doesn't exist in this document any more
01:48:54 davidclarke has joined #annotation
01:48:57 ... but first you try to have robust anchoring
01:49:06 ... selection is the target text you're looking for
01:49:11 ... textDistance is edit distance
01:49:16 ...
prefix and suffix, both of which have edit distance
01:49:24 Present+ David_Clarke, Richard_Ishida, Felix_Sasaki
01:49:28 ... scope is an element that says the content I'm looking for must be within this element
01:49:31 ivan: DOM element?
01:49:35 shepazu: yes
01:49:38 ... DOM API, to operate on webpages
01:49:47 ... So let's say that I make a webapp and my webapp is a text editing app
01:50:03 ... So I have an editing area and a bunch of words in there. I have the word 'file'
01:50:11 ... and in the UI for my app I have a bunch of menu options, and one of them is 'file'
01:50:29 ... so if I want to search this document, if I want my users to be able to search this document, I don't want them to find the UI instance of the word file
01:50:35 ... the thing within the content area
01:50:44 ... eg. google docs gives you its own find dialog
01:50:56 ... another use case is that you might say I know it's in this chapter, which is represented by this element
01:51:01 ... can be used to make more efficient searches
01:51:28 ?: multiple elements?
01:51:34 shepazu: no, you should set parent
01:51:50 ... range says where should I start this search
01:51:55 s/?:/takeshi/
01:51:56 ... similar to scope but different use case
01:52:17 ... caseFolding, unicodeNormalization, set of choices
01:52:29 ... wrap, do you want to wrap around the document
01:52:40 ... so if you start from a start position do you want to go all the way around. Maybe not necessary.
01:52:47 ... The other stuff is all related to the search operation itself
01:53:11 ... The way it works, turn the entire document into a string, normalize it, collapse the white space, search on this long string that is the text of the document
01:53:20 ... Once you find a candidate match, you return it as a range, where is it in the DOM
01:53:25 ... Not simply where is it in the text
01:53:37 ... Allows you to treat element boundaries.. you ignore element boundaries when you're doing a FindText API search
01:53:43 ...
DOM API that operates on text, and returns DOM range
01:53:49 ... A range may span multiple elements
01:53:54 ... That's basically how the API works
01:54:06 ... an algorithm of how it operates, not for implementation, just explanation of results
01:54:20 ... Finally, the notion that you would have this URL syntax, each of these parameters is something you would set in this URL syntax
01:54:29 ... Each URL is effectively a findText operation
01:54:42 Jeff_Xu: based on text structure?
01:54:44 shepazu: yes
01:55:01 Jeff_Xu: in html structure, if there's some element in front of keyword, but moved to somewhere else with CSS..?
01:55:06 shepazu: doesn't account for that
01:55:12 ivan: if you generate content by CSS, you don't find it
01:55:24 shepazu: there is discussion now that generated content in CSS should also be accessible
01:55:40 ... generated content should be treated as some part of the object model, whether DOM or some higher level, should be serialized as part of it
01:55:47 ... however we solve it for accessibility we should solve it for this case
01:55:53 azaroth: will make github issue to track that
01:56:03 shepazu: need to transfer issues from spec to github
01:56:10 ... Some of the issues on spec are me thinking out loud
01:56:15 ivan: css stuff is a good question
01:56:21 shepazu: occurred to me before, but not resolved yet
01:56:35 ... there's generated content, but also the css has moved the text to appear to be in another part of the document. Nothing to be done for that.
01:56:49 ... the nice thing, even if the text is not there, the rendering of the text is different than the DOM order of the text, it will always be that way
01:57:01 erikmannens has joined #annotation
01:57:01 ... so when you come back to the document, it will still consider that to be part of the document in that order
01:57:12 ivan: one thing to make note of, we may want to talk to CSS people
01:57:19 ... (?)
project that has started, to try to open what rendering engine does
01:57:30 ... may be that we have another version of findText that works on the CSS object model that they produce
01:57:35 ... which takes care of these rearrangements
01:57:44 shepazu: needs to be dealt with. Not the only thing we need to talk about with CSS
01:57:51 ... also, once you have that range, how can you style it
01:58:00 ... once you find the result, how you highlight it once you have the result
01:58:13 ... one thing you'd do today is take the range, surround with span with class
01:58:16 ivan: doesn't always work
01:58:20 ... the range spans over two paragraphs
01:58:31 shepazu: have to chunk it. It's ugly. If you have a hundred annotations you don't want to do that
01:58:38 ... need to be able to style ranges arbitrarily
01:58:47 ... outside scope of this, inside scope of robust anchoring discussion
01:58:56 clapierre: do we have to worry about aria hidden role?
01:59:01 shepazu: could ask same question about visibility
01:59:03 ... same class of question
01:59:12 clapierre: css visibility is not visible in the DOM
01:59:15 annbass has joined #annotation
01:59:17 shepazu: Display none is not
01:59:21 ... we need to deal with all of that
01:59:54 takeshi has joined #annotation
01:59:58 ... there was an issue that was raised that said.. when I talk about that I say when you serialize HTML DOM into text, I suggest using the serialization in the DOM4 spec, somebody gave feedback that said no you should use one of these other serialization methods
02:00:07 ... maybe one of those others will deal with it. All good points.
02:00:27 ivan: we discussed with webform guys, small issue about promises
02:01:09 TOPIC: Internationalization
02:01:16 azaroth: welcome to folks in i18n WG
02:01:21 ... how would you like to go through issues?
02:01:49 *New people arrive*
02:02:00 Irinia Bolchevsky, w3c
02:02:04 Akira M(?)
02:02:07 Ann Bassetti
02:03:12 shepazu: before we get to individual issues, meta question to mailing list
02:03:31 ... You guys decided to file github issues, which is fine. In addition to that, PRs also welcome
02:03:47 ... You can just fix my spec and send an email or describe it in PR
02:04:01 ... chances are unless there's a fundamental disagreement I'll simply take a PR
02:04:47 Richard E: we prefer github, much easier to handle conversations
02:04:58 ... what we'd also like to do is get a tag that says i18n that we can attach to issues so we can track and get notified
02:05:05 ivan: adding a label to the issues? easy to do
02:05:12 s/Richard E/Richard Ishida/
02:05:45 Richard Ishida also = r12a
02:06:08 shepazu: issue 4, one of the more complex ones
02:06:13 ... didn't know how to handle it in the spec
02:06:27 ... the issue is, in the character counts of ranges should they be unicode code points or graphemes or whatever
02:06:39 ... I think that you guys were suggesting unicode code points as your preference?
02:06:59 r12a: In javascript, if you have a.. you know supplementary characters?
02:07:13 ... unicode can encode around a million code points, there are 65536 slots in the basic multilingual plane
02:07:19 ... utf16 you only need 2 bytes for each of those
02:07:31 ... if you go above that for some of the newer characters then you need 4 bytes to encode them in utf16
02:07:37 ... that is two code units
02:07:45 ... javascript doesn't know how to handle the higher level characters very well
02:07:53 ... so you end up with two things that are not actually code points
02:07:58 ... it shouldn't be like that, it should be a single code point unit
02:08:05 ... that leads to this question about whether we should do code units or code points
02:08:11