23:07:10 RRSAgent has joined #annotation
23:07:10 logging to http://www.w3.org/2015/10/25-annotation-irc
23:07:32 Meeting: Annotation WG F2F, Sapporo, 1st day
23:07:48 rrsagent, this meeting spans midnight
23:08:01 Chair: Rob
23:10:17 Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
23:10:45 ivan has changed the topic to: Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
23:11:00 Present+ Ivan_Herman
23:40:46 azaroth has joined #annotation
23:51:47 clapierre has joined #annotation
23:51:57 clapierre has left #annotation
23:52:56 clapierre has joined #annotation
23:53:03 Present+ Rob_Sanderson, Charles_LaPierre, Doug_Schepers
23:53:23 takeshi has joined #annotation
23:54:21 kurosawa has joined #annotation
23:54:48 shepazu has joined #annotation
23:56:45 clapierre has left #annotation
23:56:51 clapierre has joined #annotation
00:01:07 Jeff_Xu has joined #annotation
00:01:39 csarven has joined #annotation
00:04:48 present+ rhiaro
00:06:08 ScribeNick: rhiaro_
00:06:09 scribenick: rhiaro_
00:06:22 Present+ Takeshi_Kanai
00:07:33 present+ csarven
00:07:47 ivan: one of two staff contacts in the group. Also leading DPUB. One of the reasons I'm in this group is because the digital publishing community has major use for annotations. Not at w3c, but ietf, already using first draft of annotation model, big use case.
00:07:51 ... Also part of DPUB IG
00:08:11 ... Other interest, Semantic Web Activity lead for 7 years.
00:08:48 clapierre: ... With Benotech(?), also in DPUB and co-chair of accessibility task force of that group. Interest in how to use annotations, for disabled community
00:09:24 s/Benotech(?)/Benetech
00:09:42 kurosawa has joined #annotation
00:11:16 azaroth: at (?) University, one of the two chairs. In 2009 there were two projects: web annotations and humanities, and one focussed on science annotation, found out about each other, merged goals to start CG.
In 2013 completed CG work and last year with Ivan and Doug's assistance we started WG. My interest is from that history, but also my academic background is in humanities, PhD in medieval French. Imagine a non-scholar trying to read a French
00:11:16 manuscript, having annotations to describe what's going on is important
00:11:27 s/(?)/Stanford/
00:11:48 Jeff(?): from (?). Annotations important to share annotations with users, and between user and publisher
00:12:09 shepazu: We need to put in our use cases the idea of sharing annotations between users and publishers
00:12:24 ... Doesn't immediately occur to people that annotations can be shared with publishers
00:12:29 ivan: Isn't it in DPUB use cases?
00:12:48 shepazu: two kinds of publishers. Of a blog/website. But also publisher of site where you're reading an ebook
00:12:57 ... Need to make sure people understand two kinds of use cases
00:13:14 miyazaki has joined #annotation
00:14:29 csarven: visiting student at MIT, may join W3C as well. PhD student at Bonn. Why I'm here relates to my research on scholarly publications and how to keep annotations around both from authors, reviewers, and any commenter on the web
00:15:02 ... Try not to make major distinctions between them other than roles. Been keeping an eye on this WG on mailing list and github. Overlapping interests with DPUB that I'm also trying to follow. Also in SocialWG. All these things overlapping.
00:15:13 ... Sarven Capadisli ^
00:15:48 Eric(?): From iMinds. Semantics group there. Been with W3C since 2006
00:15:54 ... Primarily involved in semantics and media fragments
00:15:58 s/(?)/ Mannens/
00:16:14 ... Afterwards in Prov WG, and now in annotations and publishing because we have two big projects on that
00:16:24 ... One Flemish one with publishers, there, and a European one called (?)
00:16:31 ... with Felix who will join later
00:16:53 ... Guys from my team in this group. Open source framework to publish ebooks, completely based on (?)
and HTML5
00:17:04 s/(?)/EPUB3/
00:17:05 ... Definitely going to implement your spec as one of the reference implementations
00:17:15 ... I always have to look for new stuff for my team, so will be in and out today
00:17:55 rhiaro_: Amy Guy, University of Edinburgh PhD student and visiting student at MIT, in SocialWG
00:18:35 miyazaki: from Japan Broadcasting Corporation. First time at TPAC, observer in this meeting. Research Engineer, in charge of constructing RDF database of TV programmes
00:18:41 ... Very interested in semantic web technology and social media
00:18:53 ... Interested in how to handle people's reviews about TV programmes
00:18:59 ... How to structure and so on
00:19:30 takeshi: From Sony, Japan. Sony has released a device named digital paper, so you can take notes like you are writing on paper on the device
00:19:56 erikmannens has joined #annotation
00:19:59 ... It is based on PDF. My motivation is to replace that with web
00:20:11 ... I have implemented a prototype into the device
00:20:26 ... Can't bring the device
00:20:52 ... Also spent many years in ebook industry. Also contributed to epub format. And printing industry background.
00:21:10 shepazu: Rob failed to mention that he's the editor of two of the specs from this WG
00:21:31 ... Web Annotation data model spec, and also Web Annotation protocol spec, which is based on LDP
00:21:43 ... Basically the notions of how to publish annotations to different servers, write API for the web
00:21:53 ... Might be useful for us to go through individual items in our charter.
00:22:23 ... I'm Doug Schepers, staff contact for this group. Instigator of larger idea of web annotations beyond data model stuff in CG. Bring together group that solves lots of different parts of the problem.
00:22:53 ... Also staff contact for SVG WG and Accessibility. Also Web Audio API WG. Touch Events WG. Web Payments WG.
00:23:14 ... Appreciate everyone showing up, we should have more people later.
00:23:36 azaroth: Thanks everyone. Until we are more familiar, just say who you are and/or your handle on IRC
00:23:44 ivan: we are expecting one more person in half an hour
00:24:19 ivan: I meant to mention, I'm also the editor of one of the specs - Find Text API that was just published as FPWD
00:24:43 s/ivan/shepazu/
00:25:14 shepazu: Any questions or comments, feel free to ask
00:25:47 azaroth: One note about the WG - we are public, all communication is done publicly
00:26:00 ... Try to be somewhat less formal than other WGs and just roll with it and see how things go
00:26:12 ... Try to make most appropriate use of time and conversations
00:26:33 ... Might be slower, but we expect deliverables will be better
00:27:20 Agenda: https://www.w3.org/annotation/wiki/Meetings#Monday_26_October
00:27:28 TOPIC: Agenda review
00:28:10 azaroth: Today the majority of the agenda is focussed around client APIs. First FindText work that Doug has been working on
00:28:28 ... We have a joint meeting with Web Platform about that at 1130
00:28:36 ... Before that discussion particularly about i18n
00:28:50 ... After lunch, a meeting with Felix around translation
00:29:27 ... Then after that, we have to have tests in place for all of our work. Testing an abstract data model is somewhat complex. We have an IE, Chris Berg, who is going to be leading testing.
00:29:45 ... But if we could discuss how we want to go about testing for all of the different APIs and models and so on, we could make some good progress
00:30:46 ... A note from the programme, the break is actually between 3 and 4, but that's when we're meeting with DPUB
00:31:16 ... So not sure of exact time, but we have joint meeting with DPUB, particularly around use cases
00:31:44 ... I'm editor of DPUB note on annotation use cases, which can feed into this group
00:32:06 ... Some blank time, probably that will get taken up with discussions
00:32:18 ...
Towards the end of the day we want to work on next steps for the client APIs
00:32:28 ... The charter has a broad pool of clientside API deliverables
00:32:40 ... Essentially says create some specifications that help browsers to create and deliver annotations
00:32:57 ... The FindText API is one of those, but there may be others that can be worked on within that scope
00:33:14 ... We do have the beginnings of a second one called DOM Annotations, but after some initial work things stalled a little bit
00:33:32 ... Would be good to discuss what would be useful, and who we might be able to reach out to help us
00:33:47 ... Wrap up at the end of the day. Any questions or thoughts about today's agenda?
00:34:04 ivan: Want to talk about URIs sometime?
00:34:14 shepazu: While we're talking about items in charter
00:34:45 ... Also could during FindText
00:34:57 ivan: We should put it on the table as something we plan to do, important for DPUB
00:35:09 azaroth: at least discuss it before lunch
00:35:39 ivan: the agenda now says as if the FindText goes into i18n, but maybe it's worth for people who are not familiar to have 10 minutes intro to FindText
00:36:03 bigbluehat: Benjamin Young with Hypothes.is
00:36:55 ... Hypothes.is is a nonprofit working to bring web annotation back to the web. Offer a browser extension and bookmarklet, and embed for publishers. BSD licence, Python, angularjs
00:37:01 ... I'm coeditor of data model spec now
00:37:20 Present+ Benjamin_Young
00:37:47 azaroth: just discussing what else needs to be on the agenda for today
00:38:21 ... Agenda for tomorrow. Focussed around data model and protocol
00:39:34 ... Starting with protocol because that's more important to make progress on. CG gave us a headstart with the model, and because we have a bunch of people aligned with SocialWG
00:40:37 ... 3 parts to protocol. REST (CRUD), is built on top of LDP.
Also for the SocialWG we want to use AS2 Collections and Pages to be able to break up the response
00:41:00 ... Two areas we have made less progress, notifications from one system to another that an annotation has been created, modified or deleted
00:41:05 ... We hope work in SocialWG will help us there
00:41:13 ... last TPAC we had a good conversation around using AS2
00:41:18 shepazu: Seems like a natural mechanism
00:41:25 azaroth: After lunch, third part of protocol is search
00:41:37 ... If you have millions of annotations across all resources on the web, how do you find those you're interested in
00:41:45 ... I have some ideas around that which I was writing up on the plane
00:41:58 ... The model, alignment with SocialWG
00:42:25 ... Less around here are some new features that are annotations, more here's what is settling on social side, and what we're settling on, and how they can work together
00:42:41 ... After the break, continuing to work on further features if we still have energy
00:43:18 ... Next steps, how far along with deliverables we are, who we need help from to get there
00:43:38 ivan: Also decide whether we need another f2f, let's not leave that to the last minute
00:44:00 azaroth: unscheduled meeting that we should try to schedule is to talk with TAG about protocol issues
00:44:10 ... Erik Wilde brought up some concerns around how protocol works, most of which are derived from LDP
00:44:26 ... So given that LDP is a full recommendation, it's a little bit problematic to say we find issues with it
00:44:36 ...
But we want to be as valuable as possible
00:44:53 ivan: I wouldn't think you want to go there, if Erik has problems with LDP we shouldn't be the ones playing for Erik, it's not our role
00:45:22 shepazu: He had one thing that was not specific to LDP, which was we are saying that something is an annotation server, as opposed to a generic server, and he thought that was not a good design choice
00:45:38 bigbluehat: also our use of server singular instead of servers, as LDP can be spread across multiple machines
00:45:59 ... Editorial tweak on our part. To say URLs can be all over on the web. As long as your credentialing will let you move across machines,
00:46:13 ... Initially saying annotation client and annotation server makes it sound like a two piece thing. Just some clarification
00:46:21 shepazu: You can say that about any LDP application
00:46:55 bigbluehat: we can clarify what distinguishes an annotations one
00:46:55 ... Just Link Headers
00:48:01 Ralph arrives
00:48:19 shepazu: Ralph is domain lead of domain under which this group operates, information and knowledge domain
00:48:33 TOPIC: Overview of work to date
00:48:38 azaroth: WG has six deliverables
00:48:48 ... first being a data model for annotations, which we now have a second draft of
00:49:03 ... Derived from CG and then has been discussed thoroughly within WG
00:49:23 q+
00:49:40 ... Some of the areas that have changed around how it interacts with other specs, eg. some of the specs that the CG used were not full recs, so we can't refer to them normatively from rec, so we needed to remove it
00:49:57 ... Over the last few months we've talked about how to have multiple roles, one for each resource used within the annotation
00:50:07 ... Tied to the data model is a vocabulary for describing the data model
00:50:11 ... Vocabulary is in RDF
00:50:38 ...
Important to note that we rely heavily on JSON-LD as a way to have the RDF graph model be something that is understandable and implementable by people who do not have a full RDF stack
00:51:13 ... One of our main driving principles is that the results should be useable without relying on RDF specific technology. You should be able to write JS in browser and work with JSON that comes back from the server. Time will tell how successful we are with that
00:51:18 ... Something we're trying to keep in mind
00:51:37 ... The model is a graph based model, using RDF, but the way we expect most people interact with it is via a specific JSON-LD serialization
00:51:49 ivan: is it the intention that the vocabulary will be published as a separate document?
00:52:09 azaroth: at the moment the vocabulary, the serialization and the data model are all rolled together into the data model spec, Annotation-Model
00:52:23 ... There has been some limited discussion about having multiple documents, one for model, vocab, and serialization
00:52:44 ... Tradeoffs have been that having multiple documents means you need to read multiple documents, with lots of references between them that gets complex
00:52:54 ... But if it's all in one, it's more complicated for people who just want to see certain examples
00:53:01 ... Bit of a pedagogical issue, rather than a technical one
00:53:17 shepazu: My intention is that the serialization would not be a single spec, but rather a set of specs
00:53:42 ... Eg. as HTML, or as exif data in an image. Different ways of portraying the same data that would map back to the same terminology
00:53:58 azaroth: at the moment we've been focussing on JSON-LD, but there likely will be other serializations
00:54:13 ... First three, still work, but reasonably well
00:54:23 ... Fourth is protocol, how do you transfer annotations from client to server or server to server
00:54:36 ... based around LDP, also hopefully Collections from AS2
00:54:48 ...
Five and six are closely related. Clientside API and robust linking anchoring
00:55:01 ... Clientside API helps browser or user agent create and consume annotations once they have them via the protocol or some event
00:55:22 ... So the current work is around the FindText API (previously rangefinder) which allows you to do find in page with a bunch of additional cool features
00:55:25 shepazu: fuzzy matching
00:55:34 ... defines a set of parameters around which you can do fuzzy matching.
00:55:53 ... Robust Link Anchoring is a more complex topic. The first spec as Rob said is the FindText API and that deals simply with text
00:56:16 ... But if you were annotating eg. an image, there should be a way of getting at a particular part of an image, FindText does not deal with that but robust link anchoring does
00:56:32 ... Once you have the FindText API, that opens up the door to having a URL scheme that using fragment ids you can say...
00:56:39 ... Say that you have a selection of text that you want to search for
00:56:49 ... Say it's repetitive, song lyrics for example
00:56:59 Jeff_Xu has joined #annotation
00:56:59 ... So you want to say, even though this particular text appears three times, I want the third instance specifically
00:57:14 ... So in addition to saying this specific string, you also say these are the 32 characters before and after, the prefix and the suffix
00:57:31 ... Given those three things, prefix, suffix and selection, you can have a URL that says # something
00:57:53 ... haven't decided how to do it yet. Browser takes parameters and finds the instance you're looking for
00:58:12 ... If you wanted to select a passage and send a link to a friend, you can send a URL and your friend's browser takes them to the exact place
00:58:22 ... It's not the most elegant but we can't think of a more elegant way
00:58:29 ...
Obviously once you have those things, you can use those for annotations
00:58:47 azaroth: at a slightly higher level, the robust link anchoring topic is, given a resource
00:58:52 ... how do I get the representation that I want
00:58:59 ... and how do I get the bit that I'm talking about
00:59:24 ... Issue around dynamic pages. Eg. js app, makes dynamic changes to page. You annotate something, what information does the client need to reconstruct the state of the page to make the annotation make sense
00:59:41 ivan: at some point we should look at these six, and plan what is realistic and what is not in the coming year
00:59:57 ... FindText is great, but personally I don't believe that we will have the time and energy to do anything else under the 6th point
01:00:09 shepazu: not sure I agree, but okay
01:00:27 ivan: that's my opinion. Same for serializations. I don't see us doing everything needed for rec - spec, testing, etc - within less than 1 year
01:00:33 ... so we have to be realistic about what we can achieve
01:00:58 ... maybe we should say that for certain entries here, we propose an extension or a new WG, but we have to be realistic. We should try to find some time to discuss that.
01:01:24 shepazu: one last thing about robust anchoring
01:01:42 ... We talked about it largely in terms of text and images, but this also applies to media resources. Using media fragments for example to get a particular point at a video
01:01:50 ... You can also include a particular location in a video
01:01:58 ... All things that can and should be annotatable using the data model
01:02:11 ... How the robust anchoring links with the data model, it stores all the individual things as parameters
01:02:28 ... For example for text, the selection, prefix and suffix, maybe some other bits, those can be recomposed into a URL, but in the annotation they're stored as individual pieces
01:02:34 ivan: may require some adjustment between the two
01:02:46 ...
the current selectors we have in the document may not cover all the things the FindText API can do
01:02:52 ... we may need to push additional terms into the data model
01:03:03 bigbluehat: is robust anchoring the ability to re-anchor across media types?
01:03:27 ivan: I think the idea is that if you get an annotation with a target uri, and somebody changes the text, you could still find the text. Robust against change of the media.
01:04:11 azaroth: the exact change is not well defined. For example if you have a resource that does conneg for plain text, html, pdf, the URI would be the same and the text is there, but with the content negotiable representations, should one annotation be able to be re-anchored across all of those representations? Or is it for specific representations
01:04:16 bigbluehat: that definitely needs clarifying
01:04:29 ... The scenario that hypothes.is have, is publishing as html, epub, pdf
01:04:33 ... want annotations across all of them
01:04:44 ... Textually the ranges are the same, but scenarios of anchoring them are pretty different
01:05:08 BREAK
01:05:18 Back at quarter past
01:08:38 kurosawa has joined #annotation
01:16:00 azaroth has joined #annotation
01:22:57 shevski has joined #annotation
01:23:44 erikmannens has joined #annotation
01:23:47 azaroth has joined #annotation
01:29:16 takeshi has joined #annotation
01:31:11 miyazaki has joined #annotation
01:31:16 Meeting recommences
01:31:18 proposed RESOLUTION: Minutes of last call are approved: http://www.w3.org/2015/10/21-annotation-minutes.html
01:31:22 TOPIC: Resolve minutes of last meeting
01:31:34 azaroth: any objections?
01:31:44 RESOLUTION: Minutes of last call are approved: http://www.w3.org/2015/10/21-annotation-minutes.html
01:31:54 TOPIC: FindText API
01:31:56 http://w3c.github.io/findtext/
01:32:08 shepazu: Editor's Draft ^
01:32:18 ... Has latest changes since publication
01:32:22 ... Published as FPWD last week
01:32:43 ...
A little unusual in that we have a liaison in our charter with the WebApps WG to publish this document together
01:33:03 ... In the time I was working on it, there were plans to merge WebApps with HTML WG to form Web Platform WG
01:33:12 ... We put out a cfc for the FPWD of FindText
01:33:37 ... And from the time the cfc started to the time it ended the new WG launched, so through some quirk of fate this is now published by WebAnnotations and is the first spec published by Web Platform WG
01:33:48 ... Web Platform is working on all of the big clientside APIs, plus HTML
01:34:23 ... So it's good that we have their attention. I talked informally to somebody from apple who works on safari, and today I bumped into somebody from MS who works on Edge (replacement for IE)
01:34:43 ... Both of them said that so far as they could tell without having looked at it too deeply they thought that FindText seemed like a good idea and they're interested in implementing it
01:34:54 ... That would be fabulous, and get the WG the attention of the use case that we're trying to do
01:35:14 ... While not diminishing the other things, the anchoring that is enabled by FindText, along with the data model, those parts are the core of annotations
01:35:32 ... The publishing stuff is all useful, but those two pieces are the core, if we can get attention for those two pieces we are in very good shape
01:35:42 ... We also got the attention of the i18n WG
01:35:52 ... Any time you're working with text you need to make sure it's internationalized
01:36:04 ... about a year ago the i18n WG started working on a spec called charmodnorm
01:36:09 ... character model for the web normalization
01:36:20 ... worked on by Addison Philips(?) Amazon
01:36:37 ... solves so many of the problems we should have run into, we don't have to translate unicode stuff, already a spec for this, timing really fortunate
01:36:44 ...
and the fact we were working on FindText got them interested
01:36:54 ivan: that document is a note or rec to be? Timing?
01:37:08 shepazu: Rec-to-be. Don't know about timing. Probably hand in hand with FindText
01:37:16 ivan: otherwise we run into stupid administrative issues
01:37:22 shepazu: they raised several issues on github
01:37:39 ... those issues I've started resolving them, some are easy some more tricky, all of them are about my own ignorance about i18n
01:37:46 ... Just educating myself about the right way to approach a problem
01:37:50 Github Issues link: https://github.com/w3c/findtext/issues
01:38:05 ... There will be a process of negotiation between us and i18n about which parts of defining text search belong in FindText and which in CharModNorm
01:38:18 ... CharModNorm applies search to broader set of resources
01:38:39 ... FindText specifically developer API, and beyond i18n because there are things around edit distance
01:38:52 ... That's the background of this thing. Seems like it's going to get some momentum. Might change dramatically, but the barebones are here.
01:39:02 ... Is anybody interested in hearing how this api works?
01:39:14 ... I'll briefly tell you
01:40:04 ... Three ways you can provide feedback on the spec
01:40:10 ... Either send an email to the mailing list
01:40:13 ... public-annotation
01:40:17 ... file a bug on github
01:40:34 ... Or leave an annotation directly on the spec
01:40:48 ... Make an account, select some text and leave annotation
01:40:52 ... They're sent to mailing list
01:41:19 ... API has several parameters. Pass them in as a JSON object
01:41:47 ... Example. Here's a poem, selected because it has the words 'rage rage' several times. So how would you find the fourth instance of 'rage rage'
01:42:05 ... EXAMPLE 1 ... pass in string to FindText
01:42:15 ... call searchAll()
01:42:27 ...
find third match if you're looking for third instance
01:43:20 New arrivals: Richard Ishida, r12a, Dave Clark, Felix Sasaki, fsasaki
01:43:36 s/Dave Clark/Dave Clarke
01:44:03 shepazu: Here's another example of a search that will find that string
01:44:14 ... Initialize FindText object with this JSON object, with text and prefix
01:44:14 kurosawa has joined #annotation
01:44:27 ... This is the specific selection that we're looking for
01:44:38 ... So, the kind of parameters you can have
01:44:49 fsasaki has joined #annotation
01:45:00 present+ fsasaki
01:45:15 ... text and textDistance
01:45:47 ... Edit distance is an algorithmic way of saying how two words are related mathematically
01:46:00 ... eg. dog -> fog, edit distance is one, have to change one letter
01:46:21 ... fog -> frog, have to add a character, so edit distance is one
01:46:34 ... edit distance dog -> frog, change and add, = 2
01:46:57 ... when you're talking about typos.. on a string this small this is significant. When you're talking about longer strings it becomes more useful
01:47:17 ... when you're talking about typos and they miss one letter
01:47:22 ... still robust
01:47:34 ... you can still match, especially when you have prefix and suffix
01:47:47 ... turns out to be a very efficient way of searching for differences
01:47:58 ... edit distance is absolute number of changes
01:48:14 ... if I say I want an edit distance of one, that means I will allow one change, doesn't matter length of string
01:48:40 ... Quite likely if you didn't find match on first pass with FindText API you might increase edit distance until you find a match
01:48:52 ... once you get to an edit distance of 15-20% it's very likely this thing doesn't exist in this document any more
01:48:54 davidclarke has joined #annotation
01:48:57 ... but first you try to have robust anchoring
01:49:06 ... selection is the target text you're looking for
01:49:11 ... textDistance is edit distance
01:49:16 ...
prefix and suffix, both of which have edit distance
01:49:24 Present+ David_Clarke, Richard_Ishida, Felix_Sasaki
01:49:28 ... scope is an element that says the content I'm looking for must be within this element
01:49:31 ivan: DOM element?
01:49:35 shepazu: yes
01:49:38 ... DOM API, to operate on webpages
01:49:47 ... So let's say that I make a webapp and my webapp is a text editing app
01:50:03 ... So I have an editing area and a bunch of words in there. I have the word 'file'
01:50:11 ... and in the UI for my app I have a bunch of menu options, and one of them is 'file'
01:50:29 ... so if I want to search this document, if I want my users to be able to search this document, I don't want them to find the UI instance of the word file
01:50:35 ... the thing within the content area
01:50:44 ... eg. google docs gives you its own find dialog
01:50:56 ... another use case is that you might say I know it's in this chapter, which is represented by this element
01:51:01 ... can be used to make more efficient searches
01:51:28 ?: multiple elements?
01:51:34 shepazu: no, you should set parent
01:51:50 ... range says where should I start this search
01:51:55 s/?:/takeshi/
01:51:56 ... similar to scope but different use case
01:52:17 ... caseFolding, unicodeNormalization, set of choices
01:52:29 ... wrap, do you want to wrap around the document
01:52:40 ... so if you start from a start position do you want to go all the way around. Maybe not necessary.
01:52:47 ... The other stuff is all related to the search operation itself
01:53:11 ... The way it works, turn the entire document into a string, normalize it, collapse the white space, search on this long string that is the text of the document
01:53:20 ... Once you find a candidate match, you return it as a range, where is it in the DOM
01:53:25 ... Not simply where is it in the text
01:53:37 ... Allows you to treat element boundaries.. you ignore element boundaries when you're doing a FindText API search
01:53:43 ...
DOM API that operates on text, and returns DOM range
01:53:49 ... A range may span multiple elements
01:53:54 ... That's basically how the API works
01:54:06 ... an algorithm of how it operates, not for implementation, just explanation of results
01:54:20 ... Finally, the notion that you would have this URL syntax, each of these parameters is something you would set in this URL syntax
01:54:29 ... Each URL is effectively a findText operation
01:54:42 Jeff_Xu: based on text structure?
01:54:44 shepazu: yes
01:55:01 Jeff_Xu: in html structure, if there's some element in front of keyword, but moved to somewhere else with CSS..?
01:55:06 shepazu: doesn't account for that
01:55:12 ivan: if you generate content by CSS, you don't find it
01:55:24 shepazu: there is discussion now that generated content in CSS should also be accessible
01:55:40 ... generating content should be treated as some part of the object model, whether DOM or some higher level, should be serialized as part of it
01:55:47 ... however we solve it for accessibility we should solve it as that case
01:55:53 azaroth: will make github issue to track that
01:56:03 shepazu: need to transfer issues from spec to github
01:56:10 ... Some of the issues on spec are me thinking out loud
01:56:15 ivan: css stuff is a good question
01:56:21 shepazu: occurred to me before, but not resolved yet
01:56:35 ... there's generated content, but also the css has moved the text to appear to be in another part of the document. Nothing to be done for that.
01:56:49 ... the nice thing, even if the text is not there, the rendering of the text is different than the DOM order of the text, it will always be that way
01:57:01 erikmannens has joined #annotation
01:57:01 ... so when you come back to the document, it will still consider that to be part of the document in that order
01:57:12 ivan: one thing to make note of, we may want to talk to CSS people
01:57:19 ... (?)
project that has started, to try to open what rendering engine does
01:57:30 ... may be that we have another version of findText that works on the CSS object model that they produce
01:57:35 ... which takes care of these rearrangements
01:57:44 shepazu: needs to be dealt with. Not the only thing we need to talk about with CSS
01:57:51 ... also, once you have that range, how can you style it
01:58:00 ... once you find the result, how you highlight it once you have the result
01:58:13 ... one thing you'd do today is take the range, surround with span with class
01:58:16 ivan: doesn't always work
01:58:20 ... the range spans over two paragraphs
01:58:31 shepazu: have to chunk it. It's ugly. If you have a hundred annotations you don't want to do that
01:58:38 ... need to be able to style ranges arbitrarily
01:58:47 ... outside scope of this, inside scope of robust anchoring discussion
01:58:56 clapierre: do we have to worry about aria hidden role?
01:59:01 shepazu: could ask same question about visibility
01:59:03 ... same class of question
01:59:12 clapierre: css visibility is not visible in the DOM
01:59:15 annbass has joined #annotation
01:59:17 shepazu: Display none is not
01:59:21 ... we need to deal with all of that
01:59:54 takeshi has joined #annotation
01:59:58 ... there was an issue that was raised that said.. when I talk about that I say when you serialize HTML DOM into text, I suggest using the serialization in the DOM4 spec, somebody gave feedback that said no you should use one of these other serialization methods
02:00:07 ... maybe one of those others will deal with it. All good points.
02:00:27 ivan: we discussed with webform guys, small issue about promises
02:01:09 TOPIC: Internationalization
02:01:16 azaroth: welcome to folks in i18n WG
02:01:21 ... how would you like to go through issues?
02:01:49 *New people arrive*
02:02:00 Irinia Bolchevsky, w3c
02:02:04 Akira M(?)
02:02:07 Ann Bassetti 02:03:12 shepazu: before we get to individual issues, meta question to mailing list 02:03:31 ... You guys decided to file github issues, which is fine. In addition to that, PRs also welcome 02:03:47 ... You can just fix my spec and send an email or describe it in PR 02:04:01 ... chances are unless there's a fundamental disagreement I'll simply take a PR 02:04:47 Richard E: we prefer github, much easier to handle conversations 02:04:58 ... what we'd also like to do is get a tag that says i18n that we can attach to issues so we can track and get notified 02:05:05 ivan: adding a label to the issues? easy to do 02:05:12 s/Richard E/Richard Ishida/ 02:05:45 Richard Ishida also = r12a 02:06:08 shepazu: issue 4, one of the more complex ones 02:06:13 ... didn't know how to handle it in the spec 02:06:27 ... the issue is, in the character counts of ranges should they be unicode code points or graphemes or whatever 02:06:39 ... I think that you guys were suggesting unicode code points as your preference? 02:06:59 r12a: In javascript, if you have a.. you know supplementary characters? 02:07:13 ... unicode can encode around a million code points, there are 65536 slots in the basic multilingual plane 02:07:19 ... utf16 you only need 2 bytes for each of those 02:07:31 ... if you go above that for some of the newer characters then you need 4 bytes to encode them in utf16 02:07:37 ... that is two code units 02:07:45 ... javascript doesn't know how to handle the higher level characters very well 02:07:53 ... so you end up with two things that are not actually code points 02:07:58 ... it shouldn't be like that, it should be a single code point unit 02:08:05 ... that leads to this question about whether we should do code units or code points 02:08:11 ... I think that you should do code points 02:08:18 ivan: is it the same in ECMAscript6? 02:08:28 shepazu: same question 02:08:37 ...
I believe they have a way of dealing with this in ECMAscript6 02:08:44 r12a: I believe, but not up to date 02:09:01 ivan: we may want to say we rely on ECMAscript6 02:09:14 takeshi: current model 02:09:19 shepazu: I think they add capability to deal with code points 02:09:31 takeshi: add capability for additional characters (..?..) 02:09:38 shepazu: I think you can deal with it in other ways not just regex 02:09:48 ... I think this is an i18n issue, so we should do unicode code points - that's your recommendation? 02:09:54 r12a: that's mine, not necessarily... 02:10:00 shepazu: you guys should tell us how to do it 02:10:13 r12a: the other question was about grapheme clusters 02:10:25 ... in unicode you can encode e with acute accent as a single character, or individually 02:10:32 ... supposed to be equivalent, but perceived by user to be a single character 02:10:43 ... what's perceived to be a single unit of text is potentially much more complicated 02:10:47 ... could be two or three characters 02:11:04 ... but there's this concept of grapheme cluster which is used for editing process, maybe a delete would delete 3 characters instead of one 02:11:12 ... grapheme cluster boundaries instead of code point boundaries 02:11:26 s/graphene/grapheme/ 02:11:37 ... I don't think you should specify this in terms of grapheme clusters, partly because they don't solve all the problems at the moment. May be ambiguous to users what's happening 02:11:50 shepazu: serialization and normalization should take care of that? 02:12:03 r12a: no, a grapheme cluster is how a human perceives clusters 02:12:12 ivan: UX.. when they select on screen, what do they select 02:12:15 r12a: it varies 02:12:22 ivan: if you select something in web browsers, what do they select? 02:12:26 r12a: i think it still varies 02:12:28 ...
I don't know the details 02:12:32 shepazu: we should test that 02:12:40 ivan: what we should do is whatever the web browsers do 02:12:53 bigbluehat: we should spec it based on the dominant case in browsers and say this is what we want going forwards 02:13:06 r12a: if all the browsers agree but are all doing it the wrong way... 02:13:08 Georg has joined #annotation 02:13:34 shepazu: I'm sympathetic to both of those positions, we should do what's possible, we have the issue, we should test it and find out if they're doing the right thing or if there is a right thing. We should move on. 02:13:49 ... issue 5, avoid listing whitespace characters 02:13:54 ... we can change definition of whitespace character 02:14:17 ... issue 6, let people say case folding or not 02:14:30 ... params none, ascii, unicode, language-sensitive 02:14:34 ... they say we should say what language 02:14:41 ... I think it should be the language of the document 02:14:50 ... the api doesn't need to say that, the document already says that 02:14:57 ... the algorithm should include that information, but not necessarily a parameter 02:15:03 ... I know there can be mixed language documents 02:15:08 ... I think that should be dealt with in the algorithm 02:15:13 ... Not necessarily as a separate parameter 02:15:22 ... I'm okay with the spec dealing with it, not the API 02:15:48 r12a: the question is, if you want to search for a particular word in a document, you may be able to get the computed language of the document text, but how do you get the language of the search text? 02:16:11 shepazu: the document says what language it's in, and when it's serializing it knows when it should be doing case folding, so when it serializes into string it knows how to perform the operation 02:16:21 ... not just document based 02:16:35 r12a: that's for the text in the document, but if a user types their search text in a field..
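[Illustration, not part of the meeting record] The distinctions raised under issues 4 and 6 (code units vs code points, precomposed vs combining sequences, Unicode case folding) can be made concrete with a minimal Python sketch. It is not the FindText algorithm; the function names are hypothetical, and the matching key follows the Unicode "canonical caseless matching" recipe (normalize, fold, normalize), which is language-insensitive and so does not cover cases such as Turkish dotted/dotless i.

```python
import unicodedata


def length_in_code_points(text: str) -> int:
    """Python strings are sequences of code points, so len() counts them."""
    return len(text)


def length_in_utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's String.length reports."""
    return len(text.encode("utf-16-le")) // 2


def canonical_caseless_key(text: str) -> str:
    """Hypothetical helper: NFD, then Unicode case folding, then NFD again."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def loosely_equal(a: str, b: str) -> bool:
    """True when two strings match after normalization and case folding."""
    return canonical_caseless_key(a) == canonical_caseless_key(b)


# U+1D11E MUSICAL SYMBOL G CLEF is a supplementary character:
# one code point, but two UTF-16 code units (a surrogate pair).
clef = "\U0001D11E"
clef_lengths = (length_in_code_points(clef), length_in_utf16_code_units(clef))

# "É" as one precomposed code point vs "e" followed by a combining acute accent.
accent_match = loosely_equal("\u00c9", "e\u0301")
```

An offset counted in code units and one counted in code points diverge after the first supplementary character, which is why the choice has to be fixed in the spec rather than left to each implementation.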
02:16:39 shepazu: that's a UI decision 02:16:50 bigbluehat: the api needs to know what it's being given 02:16:58 r12a: are you looking for a turkish word? 02:17:10 shepazu: maybe we should be looking at it... afraid to make the api larger and more complex. Need to make sure we need it. 02:17:38 ivan: There's one.. not necessarily only capitalisation, but in some cases when I search French, I can type in the search term without the accent and it will find the relevant French term with the accents 02:17:47 ... so the found term might differ from the search term only by the missing accents 02:17:52 ... which is not necessarily same as editing distance 02:17:59 r12a: that's in another issue 02:18:16 (this is how I figure out where the appropriate accents go, on French words!) 02:18:34 shepazu: issue 7, option to have ascii case folding is not useful. I'm fine with that. 02:19:16 s/7/10 02:19:40 ... issue 7 is you'd like us to use charmodnorm as normative reference for ... I'll ask for clarification in the issue 02:19:45 ... I'm fine with referencing charmodnorm 02:20:19 ... issue 8, four parameters, including canonical, compatibility and all 02:20:33 ... I included all because of my ignorance... I think we need back and forth so I understand the issue and we'll resolve it 02:20:44 ... This is for unicode normalization 02:20:47 ... there are multiple kinds 02:21:09 ... Before I didn't have anything about that, I just picked one. These guys think that I should. 02:21:19 ... Still don't know if this is something browsers are willing to do 02:21:28 ... I included it cos i18n said I should, we'll have back and forth 02:21:35 ... issue 10, ascii case folding option is not useful 02:21:38 ... that's fine 02:21:49 ... when I was reading charmodnorm it seems like it would be useful 02:22:06 r12a: charmodnorm addresses two different scenarios, element names and markup, and the other is natural language content, the stuff people actually read 02:22:13 ...
ascii matching is useful for things like css identifiers 02:22:28 ... typically css doesn't concern itself with ascii case, but you can't extend that through other languages than english 02:22:42 ... so it's only in the case where you're talking about syntactic content that ascii casefolding is really useful 02:22:47 ... never useful in natural language processing 02:22:49 shepazu: I'm fine with that 02:23:00 ivan: we may want to search in html content that includes a pre with javascript content 02:23:03 ... for those you fall back to the ascii 02:23:09 ... you're not in natural language 02:23:14 shepazu: I don't think you will have a conflict 02:23:26 azaroth: a high level question, what would the default be? just unicode? 02:23:33 ... what would be specified? 02:23:53 ... if the text was ascii and you wanted to have casefolding, for example searching all lowercase, you would say casefolding is unicode rather than (?) 02:24:02 ... it's not that all casefolding would go away, just that that option would go away 02:24:14 ... unicode is a superset of ascii therefore we don't need ascii 02:24:28 shepazu: we can drill into issues later 02:24:37 ... issue 11, unicode equivalent type not all clear 02:24:52 ... I added all, you guys informed me there's already a way of doing most promiscuous match 02:24:57 r12a: there may be additional issues 02:25:02 shepazu: raise as other issues 02:25:15 ... issue 12, order of case fold and normalization in algorithm, should reverse, I'll change that 02:25:22 ... the algorithm is going to change dramatically anyway 02:25:34 ... the last one is issue 13, several different ones, I'd rather you broke these out into multiple issues 02:25:41 [just for the minutes as background for the case folding discussion: unicode provides a case property http://unicode.org/faq/casemap_charprop.html and that is for all characters not just the ascii set] 02:25:47 ... One of them is the oe for german transliteration 02:26:00 ...
We need to break down into several things I can understand 02:26:16 r12a: there are all sorts of things you can do to match text. Ignoring accents is a common one. There are other things depending on language 02:26:24 ... Some might be syntactic stuff like fights vs fight 02:26:32 ... Recognising grammatical language specific differences 02:26:40 ... I thought we had that in charmodnorm already but we don't 02:26:48 ... we don't know all the answers yet, but know that there are a lot more 02:26:56 clapierre: includes punctuation? 02:26:59 http://www.w3.org/TR/xpath-full-text-10/#ftmatchoptions 02:27:02 shepazu: edit distance would handle a lot of things like that 02:27:12 fsasaki: maybe you're aware of xpath full text specification 02:27:22 ... various options related to language, stemming, useful to look at as background 02:27:38 shepazu: I do have to say that as a design decision we want to keep this as simple as possible but no simpler 02:27:54 ... Not sure how much the browser vendors will be willing to implement, two different normalization algorithms, we should talk to them to find out 02:28:03 r12a: what's important is some of these things like dia(?) stripping(?) 02:28:10 ... important for users 02:28:15 ... normalization stuff people don't normally know about 02:28:23 ... some things people know that they want to do this type of search 02:28:35 shepazu: there should be an option to strip out all dia(?)tics 02:28:50 azaroth: now we relocate 02:29:01 ...
1f middle hall a 02:29:23 WebPlatform meeting, then lunch 02:29:27 azaroth has joined #annotation 02:30:17 erikmannens has joined #annotation 02:32:57 Jeff_Xu has joined #annotation 02:33:20 clapierre has joined #annotation 02:39:13 shepazu has joined #annotation 02:43:15 erikmannens has joined #annotation 02:46:37 shepazu has joined #annotation 02:48:07 shepazu has joined #annotation 02:48:21 erikmannens has joined #annotation 02:48:28 azaroth has joined #annotation 02:48:31 /j #tpac 02:49:16 Jeff_Xu has joined #annotation 02:50:08 takeshi has joined #annotation 02:50:30 ivan has joined #annotation 02:50:36 clapierre has joined #annotation 02:51:10 /j #webapps 03:01:35 ivan has joined #annotation 04:07:48 Zakim has left #annotation 04:08:45 Jeff_Xu has joined #annotation 04:09:45 clapierre has joined #annotation 04:10:50 David_clarke has joined #annotation 04:10:51 azaroth has joined #annotation 04:12:20 fsasaki has joined #annotation 04:14:20 csarven has joined #annotation 04:15:41 ivan has joined #annotation 04:17:40 olivier has joined #annotation 04:19:07 erikmannens has joined #annotation 04:19:07 Zakim has joined #annotation 04:19:09 rrsagent, draft minutes 04:19:09 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html ivan 04:19:14 zakim, pick a victim 04:19:14 I don't who is present, azaroth 04:19:17 :( 04:19:19 present+ David_clarke 04:19:43 rrsagent, set draft public 04:19:43 I'm logging. I don't understand 'set draft public', ivan. Try /msg RRSAgent help 04:19:53 rrsagent, set minutes public 04:19:53 I'm logging. I don't understand 'set minutes public', ivan. 
Try /msg RRSAgent help 04:19:57 scribenick: bigbluehat 04:20:11 http://www.w3.org/2015/10/its2-and-web-annotation/?full#1 04:20:56 Topic: ITS 2.0 - Internationalization Tag Set 04:21:00 rrsagent, set log public 04:21:30 rrsagent, draft minutes 04:21:30 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html ivan 04:22:02 tag set often used to state something should be translated or not 04:22:37 Translate, Text Analysis, also includes system for annotating parts of text as language components 04:22:53 showing examples of HTML and XML usage 04:23:13 kurosawa has joined #annotation 04:23:26 W3C 04:23:42 translate is from HTML itself 04:23:49 its-ta-* are from ITS 04:23:53 takeshi has joined #annotation 04:24:17 growing interest in using these types of annotations in other formats 04:24:28 RDF, JSON, etc 04:24:40 ITS 2.0 ontology exists 04:25:03 translate attribute becomes an RDF property (for example) 04:25:48 for JSON there is an `annotations` key which contains the ITS annotations 04:26:18 JSON ITS interest is from the localization community primarily 04:26:41 what role could the Web Annotation model have to fulfill these requirements. 04:26:50 what things can Web Annotation use from ITS? 04:27:04 s/requirements./requirements? 04:27:30 could Web Annotation benefit from the data types in ITS? 
04:27:54 it is not the purpose of todays discussion to file issues for Web Annotation, just to inform the group of the ITS efforts with regards to annotations 04:28:08 it is possible via ITS to have the annotations separate from the original content 04:28:27 via Natural Language Processing Interchange Format (NIF) 04:28:56 NIF developed by various EU projects 04:29:28 there are related APIs by these projects--for example FREME API 04:30:11 @types examples include `nif:String` `nif:Context` 04:30:54 methods for text selection (`anchorOf`, `beginIndex`, `endIndex`) 04:31:05 also has a "confidence" rating generated by the tool 04:31:41 @id is a URI which includes a #char=27,30 (for example) fragment selector 04:32:03 perhaps there are general mechanisms that can be shared 04:32:35 interest in feedback on the ITS work from this WG 04:32:49 ivan: I would like to go back to the markup examples 04:33:04 ...from the annotation model point of view these examples show one key discrepency 04:33:26 ...in the Web Annotation model there must be an identifier of the separate content 04:33:44 ...in the examples, the annotations are inline, they do not have identifiers, they do not have targets 04:33:56 ...that being said this markup approach has many usages 04:34:08 ...so do we want to deal with scenarios where there is no target? 04:34:33 ...should we have annotations with no target identifiers (i.e. the target is a blank node) 04:34:47 ...there do seem to be scenarios where this would be useful 04:35:33 ...in the CSV working group, the targeting is useful 04:35:39 ...but I can be forced into a redundancy 04:35:52 ...I would have to make an identifier that is circular 04:36:07 ...that may be an aspect worth considering 04:36:49 fsasaki: the markup example can not be easily expressed in Web Annotation? 
04:36:59 ivan: it can be anchored via our selectors 04:37:14 ...but when you want to use JSON (etc), then you force the developer to use redundant data 04:37:41 azaroth: the higher level point would be what use cases are there where the information is being provided by someone other than the content provider? 04:37:54 ...of course if you own the content, you can put in whatever sort of markup you want 04:38:01 ...but that scenario is less about what we're building 04:38:09 ...which is more about content the person annotating does not control 04:38:18 ...does ITS have these use cases? 04:38:32 fsasaki: yes. that's also why exploring Web Annotation is interesting 04:38:43 annbass has joined #annotation 04:38:52 ...someone produces HTML content, and someone else wants to enrich it with separate language related content 04:39:09 ...if I were to put all that content into the HTML, then it would be overload 04:39:35 ...there can also be overlaps with what's created 04:39:59 ...there's no demand for hierarchies 04:40:16 ...it also works if you do not own the content, but only export it for translation 04:40:23 ...scenarios where lots of people need to look into the content 04:40:35 azaroth: so in the translate case... 04:40:50 ...is this the translation? or is this just related to the act of translation? 04:40:56 ...here is something that could be translated? 04:41:03 ...vs. something that points to the translation? 04:41:40 fsasaki: this could also relate to company policy and process---this should be translated, this should be translated 04:42:07 azaroth: so the target would be the W3C in this case? 04:42:25 ivan: not necessarily, it's the specific found instance of the word 04:42:32 ...which is where FindText API could come in 04:42:55 shepazu has joined #annotation 04:42:57 azaroth: the body of the annotation would state what is and is not annotatable 04:43:02 ivan: yes. 
that sounds right 04:43:37 fsasaki: it would be provided with ITS as additional RDF triples 04:43:53 azaroth: in the current working graph, it would need to be separate from the Annotation expression, but it is possible 04:44:32 ivan: minting a URI for the annotation itself seems unnecessary in the ITS use cases 04:45:10 azaroth: a URI for the annotation is a SHOULD 04:45:44 ...the more interest point here is the NIF URI pattern 04:45:52 ...it tries to encode quite a bit of data 04:46:13 fsasaki: it's actually changed. it's also separate as `beginIndex` and `endIndex` 04:46:42 ...there's also a `wasConvertedFrom` which explains where the content came from--HTML, etc. 04:47:05 azaroth: from my perspective, it looks like it can all be expressed in Web Annotation currently 04:47:21 ...what we don't currently have is confidence score 04:47:46 fsasaki: this `taConfidence` is ITS specific 04:48:07 ...there's also a place to put a URI for the actual tool that produced the ITS annotations 04:48:42 shepazu: there used to be a confidence score in an earlier version of the FindText API 04:48:49 ...I took it out because it was application specific 04:49:08 ...by which I mean use case specific 04:49:30 fsasaki: `taConfidence` is about "text analysis confidence" 04:49:43 ...there's also provenance information which can be stored 04:50:08 ...originally we'd wanted confidence statements everywhere, but they can prove pointless without provenance info 04:50:32 azaroth: it seems like a useful exercise to pick some selected examples 04:50:42 ...and see what's missing in Web Annotation to express ITS content 04:51:01 ...and then for us (WG & ITS folks) to see if that expression is useful outside of the NIF context 04:51:32 ...if so, then we may want to change the model---otherwise, they could be presented as available extensions for those unique use cases 04:52:07 fsasaki: there's an upcoming meeting in November--a hackathon--where these would be useful 04:52:11 azaroth: yeah, that 
would be great 04:52:21 ...canonical examples of all the features you'd like to discuss 04:52:25 ...we can iterate those from there 04:52:50 fsasaki: adding these items to the body seems to make sense 04:52:56 ...ITS can be used without NIF 04:53:13 ...and it may be that those make more sense in Web Annotation--perhaps as an extension 04:53:24 ...NIF is just one use case for ITS annotations 04:53:25 annbass has joined #annotation 04:53:44 azaroth: these examples may also prove useful for analyzing our "motivations" / "roles" list 04:54:05 http://www.w3.org/TR/its20/#basic-concepts-datacategories 04:54:09 ...are there examples of ITS and NIF expressions that you could link us too? 04:54:42 azaroth: for example. we have an "identifying" motivation, but not a "translating" one 04:54:47 ...some alignment there would be helpful 04:55:17 ...from the DPUB side, we also have some areas where this would be useful 04:55:42 ...if you had one example per category and what they'd be used for 04:55:52 ...then we can see which make sense as motivations/roles 04:56:05 ivan: i'm a little skeptical that these may not be general enough 04:56:11 ...but perhaps as extensions 04:56:32 azaroth: motivations are skos concepts 04:56:49 ...if your data categories could be defined as those, then we'd have a very easy time mapping them 04:57:00 fsasaki: we're currently working on the vocabulary, so this is good timing 04:57:20 ...some of these are text analysis specific 04:57:36 ...but some may be broadly applicable 04:57:59 ...we've also had a discussion about directionality 04:58:14 ...here's a string, here's a substring within it that has a directionality from right-to-left 04:58:31 ...it could be used to provide helpful information for using to solve for these scenarios 04:59:07 ...we're not sure what the mechanism could be, but it is something we are exploring 04:59:12 ...and also JSON annotation aspect 04:59:31 ...sometimes people put HTML into JSON to provide language 
information---which looks really bad 04:59:45 scribenick: azaroth 05:00:00 bigbluehat: We don't currently have a json selector. Some hacks, like json pointer, but not a finished spec. 05:00:02 ivan: who? 05:00:28 bigbluehat: IETF. Also in json schema, which has also expired. If we can't point into it, then we can't annotate json objects. Need to look at this 05:00:41 ivan: Make it clear in the model how it can be extended 05:00:52 bigbluehat: Not sure if it's in our charter to stir up the specs? 05:01:01 Ivan: I think we can just point to them. Especially at IETF. 05:01:20 scribenick: bigbluehat 05:02:05 azaroth: it's 2 pm. DPUB is showing up after the break--because coffee 05:02:16 ...the topic next on the agenda is testing 05:02:44 ...perhaps we stay with i18n topics instead since we have these folks here 05:03:36 shepazu: here's my reaction. let's storm the castle! 05:03:42 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html fsasaki 05:03:47 ...I was somewhat encouraged, and somewhat discouraged 05:04:09 ...we had affirmative, encouraging feedback from Travis at Microsoft 05:04:23 Topic: FindText API discussion with Web Platform group 05:04:45 ...we got feedback from someone at Apple, which was less overtly encouraging 05:04:55 ...but he didn't say we're not going to expose FindText 05:05:02 ...more, I don't like the shape of the API 05:05:06 clapierre has joined #annotation 05:05:15 ...he said he wanted fewer options, but then he also wanted to add a feature 05:05:31 ...it seems encouraging that they gave us feedback at all 05:05:45 ...we should hold fast on the most important part--which is the edit distance, imo 05:06:08 ivan: so, we had a small discussion at lunch, and I was wondering if it makes sense to have different API entries 05:06:19 ...one that would say programmatically give us what you do already 05:06:32 ...a find or search API which exposes what browsers already implement
05:06:38 Ralph has joined #annotation 05:06:40 ...but which you cannot currently access from JS 05:06:52 annbass has joined #annotation 05:06:54 shepazu: I think that would tactically be a mistake 05:07:24 ...the current implementation does not solve our usecase 05:07:37 ...and without exposing the edit distance to implementation, then we don't get what we need for our use case 05:08:31 azaroth: it solves giving the client api content about what is selected 05:08:37 shepazu: those aren't related in my view 05:09:10 azaroth: we could reasonably get a find api that exposes finding a range of text in the browser 05:09:19 ...the edit distance is a robustness question 05:09:20 clapierre has joined #annotation 05:09:42 ivan: if I have an ebook, for example, the content is frozen 05:10:10 shepazu: you mean, the edition, on that ebook reader, but when you exchange it with someone else, then they won't re-anchor 05:10:49 ivan: sure, but there are also use cases where content is fixed, and for those, the lack of robustness is not a problem 05:10:53 ...text books for example 05:11:47 dsinger has joined #annotation 05:11:47 bigbluehat: robustness can go on top of find api 05:12:14 ivan: the edit distance was the only thing a new or unknown concept for browsers 05:12:25 ...the rest, sure, but could probably be done by regex 05:12:30 shepazu: maybe I had a different take 05:12:57 ...leaving out this edit distance thing--which is different, etc--the rest could be done with regex 05:13:07 ...I believe his point was about regex, not about edit distance 05:13:42 ivan: separating the two things may make sense, it drives them to expose search--which is currently not done for developers 05:13:51 ...maybe first they will do the simple thing without edit distance 05:13:58 ...and then later do the robustness thing 05:14:20 shepazu: ok. 
if that's what we can get---the find text api--then....well....every browser does it difference 05:14:37 ...I think we should not so easily give up robustness 05:14:47 ...we need to push the issue of robust anchoring 05:15:06 takeshi has joined #annotation 05:15:41 David_clarke: i'm not sure the web apps audience fully understood the value of robust anchoring 05:16:02 ...sell them the value proposition first, rather than selling the solution first 05:16:42 shepazu: Travis from Microsoft asked where our use cases were--which we didn't really present 05:16:58 ...and while we provide some in the spec, the interface was mostly skimmed 05:16:59 clapierre has joined #annotation 05:17:05 shevski has joined #annotation 05:17:20 s/Travis/paul cotton/ 05:18:30 ivan: there's a URI issue--the whole mechanism actually serves two purposes 05:18:34 ...1. finding the text 05:18:43 ...2. serializing the identification of that text 05:18:50 ...the identifier aspect is also very important 05:19:02 ...and it's not in the document 05:19:34 shepazu: we should annotate the spec to point to use cases 05:20:16 ivan: he wasn't really listening--skimming the spec while we were talking--and immediately reacted negatively 05:20:44 s/ivan: he wasn't really listening--skimming the spec while we were talking--and immediately reacted negatively// 05:20:58 q+ 05:21:56 shepazu: I think Paul Cotton has a really good point about having use cases 05:21:58 ack bigbluehat 05:22:13 scribenick: azaroth 05:22:48 bigbluehat: An opportunity to split the doc into sections to give info on use cases here, and outside. Such that browser vendors could see which parts are native, and can give developers access to 05:23:13 ... we can start them off in that direction. There won't be consensus around that, even though there's similarity from the user perspective 05:23:30 ... 
the robustness part shouldn't get caught up in that, and could have its own set of use cases on top of that 05:23:51 shepazu: I would like to not separate them so they can be considered as a whole, rather than what they do today. 05:24:05 ... The solutions from the whole problem set will be different than if you start with the narrow set 05:24:39 ... Would like the conversation to happen. 05:24:58 ... in the context of the specification so we have actionable things to do. No one else has volunteered to work on it, and we have their attention 05:25:02 ... let's engage directly 05:25:27 bigbluehat: If we mix in robustness, which they see as out of scope, it'll slow down exposing the find stuff which would be a huge value add for any implementer 05:25:38 ... the people to win over for robustness would be content editable people 05:25:43 ... they deal with it all the time 05:26:04 ... translation, internationalization, etc. Find text with static text is different 05:26:41 shepazu: Would rather we didn't throw the major use case out before we try to engage 05:26:47 ... they haven't said no yet 05:27:31 ivan: Don't know all the use cases off the top of my head, we have two documents, one from DPUB one here. I'd like to see what % of use cases are based on dynamic content 05:27:34 ... and what on static 05:27:48 ... Something that would help to decide which way to go 05:28:23 Antonio: Can you explain why the dichotomy matters? 05:28:33 shepazu: FindText now serves static case 05:28:53 Ivan: I can use it for annotations, but if I annotate content that can change, then we need the robustness 05:29:01 ... that's where edit distance comes in 05:29:37 shepazu: If you're going to design something for a limited use case that just does find in page, you might not have the extra params, and then if the API isn't extensible, you'd never get edit distance 05:29:52 bigbluehat: Can design concurrently, to make sure that the extensibility is there 05:30:08 ...
core of what Im saying is that vendors don't have a motivation 05:30:19 ... find text to me seems obvious 05:30:31 ... if we can ship that sooner, and it makes lives easier... 05:30:41 olivier has joined #annotation 05:30:45 shepazu: How things are serialized could be a bigger, longer term problem 05:31:13 ... I think there's disagreement on how you serialize out the bit ... convert DOM to text 05:34:29 olivier has joined #annotation 05:35:31 Topic: Testing 05:35:36 ivan: what do we need to test? 05:35:49 ... more than one answer ... model, protocol and findtext are all very different 05:35:56 ... what do we test for the model? 05:38:20 scribenick: bigbluehat 05:38:32 fsasaki has joined #annotation 05:39:02 ivan: theoretically, what I want to see, is that if two implementations create two graphs, what do they have to do in order to test? 05:39:20 azaroth: so in the LDP group, we provided a set of tests that you could run internally against your implementation 05:39:32 ...there was no centralized validation service 05:39:40 ...each test was implemented in Java 05:40:00 ....it would check to see if each direct container had hasMemberOf relation (etc) 05:40:08 ...and it would tell you if you had it properly 05:40:21 ivan: we went through the whole RDFa testing, that was an easy one, because... 05:40:32 ...this is the way HTML looks like, this is the graph I have to produce 05:40:47 ...we created the markup, and then checked the triples it produced 05:40:59 ...the starting point involved a bunch of HTML files 05:41:06 ...what is the starting point for Web Annotation? 05:41:20 azaroth: client or server? 05:41:35 ivan: I don't care. however the annotation is made, it produces a graph, and I have to see if it's correct 05:41:42 ...I can't properly say what the process is. 05:42:14 azaroth: brain storming. 
we could provide a human readable set of annotation scenarios 05:42:37 ...create a comment on this URL, and then check their annotation with the testing tool--there would be at least one right answer 05:42:46 ...however, there will be more than one right way to do that 05:42:55 ...unless we are very very clear about which one we want 05:43:23 ...annotate this URL with some text--does it have a language? does it have a string body? a remote body URI? 05:44:06 bigbluehat: we're going to need lots of tests.... 05:44:13 ...very specific to the very scenarios 05:44:34 ivan: if we make it too detailed, we lose its value by being over specific 05:44:40 ...but if it's too general, then only a human can validate it 05:45:00 azaroth: the distinction is perhaps the difference between syntactic and semantic 05:46:13 bigbluehat: using json-ld playground is a good testing ground. 05:46:22 ... would be nice to produce along those lines 05:46:33 ... may not count as good validation but help decide how much to implement that 05:46:44 ... taking the json-ld playground and adding text graph output. 05:46:55 ... only compacted or only in quads 05:47:36 ivan: combining automatic comparison of the graphs was not something they used because ... complicated to set up a processor / install.. practical issues 05:47:57 ... in the RDFa case it worked really well b/c you gave the graph to the processor and at the end it produced an output 05:48:10 ... playground leaves it back to the human in the end. 05:48:41 bigbluehat: I normally copy/paste around.. and add things to the @context if they don't exist ... in order to validate to check the basic cases at least 05:49:07 ivan: In the case of RDFa we had various versions (b/c of HTML5..) between 200-300 tests per category 05:49:14 ... a full RDFa processor is probably too complex 05:49:29 ... if we come up with a reasonable set of things e.g., 60-80 tests roughly. 05:49:47 ...
looking at tests by humans is not unreasonable in that case 05:49:49 shevski has joined #annotation 05:49:58 ... I think it is doable 05:50:53 bigbluehat: If I put the hypothesis tests.. 2/3 tests.. at least for a first test is helpful to understand what is close to an implementation 05:51:04 ... there is also a visualizer to help create 05:51:31 ivan: If you need help, there are people. 05:51:39 ... Erik, how do you test implementations? 05:51:48 erikmannens: I have to check with the guys. 05:51:55 ... it is at the protocol level 05:52:03 ivan: we will come back to the protocol level :) 05:52:14 erikmannens: will come back to you 05:52:53 takeshi: it is not necessary to implement the selector for instance for us (Sony) 05:53:14 ... rdfdiff helps us with this 05:53:14 shevski has joined #annotation 05:53:47 ... as of today, basic processor to test / how to validate the output... not sure if the output from the JS is correct or not, but at least what's stored in the DB is correct. 05:53:57 ivan: when you say check.. how is the check done? 05:54:46 azaroth: IEEE (?) ... they use a tool to give a URL.. essentially creating an internal object structure.. it has its own way of going at it to check 05:55:22 ... here is an example of an annotation of this feature and that it. If all those working great, and here is what's malformed then ... 05:55:41 ivan: but you check as a human? 05:55:52 azaroth: No. 05:56:06 ... tries to generate the structure internally. 05:56:23 ivan: conceptually speaking it is a dedicated ..(?) 05:56:50 azaroth: The takeaway from earlier work is .. you've got something missing from your implementation, and here is what is required 05:57:22 ivan: we would put an informal annex for framing. we then run the framing algo for the playground. can the playground actually compare that? 05:57:56 bigbluehat: ours just shows it to you..
matching at the graph level is not clear 05:58:11 ivan: if we use framing it is easy for the human 05:58:26 bigbluehat: .. we don't have to give it to the graph if we use keys. 05:59:35 bigbluehat: Doug we want to make sure the triples are right (i.e., prior discussion) 05:59:54 ivan: generated structures are relatively small for the tests so humans can make the comparison fairly easily 06:00:37 ivan: formally speaking we have two approaches. 06:00:48 bigbluehat: what we (hypothesis?) don't have is prebuilt code 06:01:14 ... I'd like to see something like JSON playground. 06:02:00 shepazu: aren't quads in turtle? 06:02:03 bigbluehat: no 06:02:17 shepazu: what about json-ld (re: N3.js)? 06:02:24 erikmannens: will check 06:02:46 ivan: check if N3.js understands nquads and JSON-LD 06:02:52 shepazu: what is its output? 06:02:59 bigbluehat: It has an internal model 06:03:15 shepazu: you can write a parsing engine and... test against that. 06:03:22 bigbluehat: we don't want to write into other people's code necessarily 06:03:33 ivan: with N3 we can check the Turtle serialization.. 06:03:54 takeshi has joined #annotation 06:15:14 present+ csarven 06:16:06 azaroth has joined #annotation 06:16:27 ivan has joined #annotation 06:16:34 dsinger has joined #annotation 06:16:48 tzviya has joined #annotation 06:16:55 clapierre has joined #annotation 06:19:31 olivier has joined #annotation 06:19:56 Topic: Meeting with DPUB 06:20:19 ivan: Annotation, DPUB 06:20:37 clapierre: Accessibility of DPUB 06:20:42 Karen: also on DPUB 06:20:47 Present+ Karen 06:20:48 azaroth: Annotation 06:20:53 Present+ Tzviya 06:21:03 Jeff: DPUB IG 06:21:15 bigbluehat: hypothes.is project.. 06:21:21 .. read ebooks occasionally 06:21:36 Present+ Ann_Basetti 06:21:42 Present+ Brady_Duga 06:21:42 rhiaro: U of Edinburgh, social Web WG 06:21:56 Ann: free agent / Social 06:22:03 takeshi has joined #annotation 06:22:15 erikmannens .. open source ..
to publish ebooks 06:22:24 Present+ Antonio 06:22:29 Antonio: W3C JP 06:22:32 Present+ Olivier 06:22:38 Present+ Ralph 06:22:38 Olivier: BBC 06:22:46 annbass has joined #annotation 06:22:49 Ralph: W3C 06:22:57 shepazu W3C 06:23:09 takeshi Sony 06:23:19 tripu has joined #annotation 06:23:45 azaroth: We brainstorm for Annotation/DPUB 06:24:19 ... DPUB produced a set of use cases (sometime last year) which we looked at in the Annotation WG. Those use-cases are somewhat transformed into deliverables. 06:24:40 ... recently, some UC for accessibility 06:24:43 DPUB Annotations Use Cases: https://www.w3.org/dpub/IG/wiki/UseCase_Directory#Social_Reading_and_Annotations 06:24:47 ... we could go into more detail on those 06:24:50 -> http://www.w3.org/TR/dpub-annotation-uc/ "Digital Publishing Annotation Use Cases" ]W3C IG Note 2014-12-04] 06:25:11 ... we don't have to walk through Annotation UC as published 06:25:13 s/]W3C/[W3C 06:25:59 Brady: DPUB 06:26:05 ... ... since we are not familiar with what you published.. 06:26:16 ... we would like to accessibility UC 06:26:25 s/to/to talk about 06:26:41 azaroth: we have 3 publications. data model, protocol, client API 06:26:59 ... protocol, we've been discussing but haven't produced a new draft. 06:27:06 ... tomorrow we can go over 06:27:19 ... the model changes 06:27:29 ... we can go over what each section wants to do 06:27:35 ... trying to produce normative requirements 06:27:45 ... now we have each section with short intro and use-case 06:27:50 ... rather than prose 06:28:03 ... with JSON-LD/Turtle/Diagram 06:28:15 ... technical change to the model is the role to be associated with the "bodies" 06:28:19 ... changes the way tags are produced 06:28:31 ... here is a resource and it is "tagging" ... as opposed to a tag 06:28:44 ... also for editing. 06:29:44 ... the protocol is based around essentially REST and HTTP. Update and Delete annotations. We have notions for a Container 06:30:10 ...
and yet to be published, but hopefully after tomorrow.. an integrated way to say a list of annotations or a collection of annotations that can be broken into pages 06:30:20 ... annotations or lists are of great interest to (?) 06:30:38 s/?/idpf 06:30:39 ... At the moment we are trying to align that UC with the protocol 06:30:55 shepazu FindText API is basically exposing 'find in page' 06:31:04 ... functionality as a DOM API 06:31:06 annbass has joined #annotation 06:31:21 ... the current idea is to include an 'edit distance', a fuzzy matching. 06:31:28 ... there was some feedback to handle this with regez 06:31:34 s/regez/regex 06:31:56 ... the basic idea is to let the API pass in the params and so it ends up with a text result from the document and that is returned to you as a range 06:32:08 ... one of the features in a page API is a prefix and suffix. 06:32:16 ... in case you have multiple instances of a word 06:32:24 ... which instance is of the selection that you are trying to find 06:32:53 ... and given that, you have a URL syntax, a fragment identifier, this string to be set of criteria to search in the page. 06:33:10 ... so you could use it as a primary identifier 06:33:18 ivan: the identifier represents the search 06:33:24 shepazu: terms of the search 06:33:29 ... certainly a long string 06:33:34 ivan: well we have CSI 06:34:17 shepazu: CFI 06:34:18 s/CSI/CFI 06:34:56 shepazu: let's say we are working in a browser, I select something, and store the selection, and the prefix.. and some other things, and store them as annotation, and in this case you might not have a body.. 06:35:17 ... so, then you might share that or keep it for yourself. for the latter, the next time you go to the next version of the book, the annotation still stands 06:35:38 ivan: to be very precise, the reference to the selection, the identifier of the selection, is the spark of the annotation 06:35:54 ...
annotation means I give you the id to what I annotate and what it is annotating 06:36:13 Unknown: how much of the source document is part of the selection 06:36:28 s/unknown/brady 06:36:30 ... I ask from digital publication and ebooks, there are significant restrictions as to how much you can copy 06:36:45 ... at some point you have to stop what you can select and annotate 06:37:09 azaroth: ... There are two types: here is an exact match, and the second is a char offset 06:37:24 Brady: at least for CFI, for char offsets, there are a number of implementation issues 06:37:31 ivan: one of the reasons why this is coming up 06:37:35 Brady: difficult problem 06:37:40 ... still painful 06:37:52 shepazu: let's talk about it "off camera" 06:37:55 .. or "in camera" 06:38:24 ... This was tried. how do you trust case with storage and fair use 06:38:33 ... that doesn't remove you from contractual cases, but copyright 06:38:40 s/how do you trust/hathitrust 06:38:40 ... let's not go into that here 06:39:14 bigbluehat we don't have the perfect solution but not telling you about it :P 06:39:20 Brady: I guess it just takes money :P 06:40:14 ivan: for DPUB people is important.. the model document, the Annotation POV... and epub world there should be a change reference.. (csarven: okay I lost all the references here) 06:40:33 takeshi has joined #annotation 06:40:33 shepazu: Rather than storing strings, we store hashes 06:40:41 ... we have thought about the problem but no solution 06:40:58 Range finder was discussed.. which I assume is just text 06:41:14 ^^ tzviya 06:41:35 shepazu: maybe range finder or.. to be this text or non-text to be part of content 06:42:00 ... I might have a picture or a face to be highlighted. I think that's a larger thing. different things for different media types. as long as covered by media annotations 06:42:14 ... we decided to start smaller, and solve the already smaller difficult task of text search 06:42:30 Brady: can I select 498 image?
06:42:41 azaroth: in current state or..? 06:42:48 Brady: or resource 06:43:00 ... I have this collection of images and I want to make sure that I go to the right page 06:43:04 azaroth Which scope? 06:43:09 bigbluehat Within the scope of page 5 06:43:26 Brady: In my case, it will be page media 06:43:36 ... an image might appear on different pages so I want to know which 06:44:06 bigbluehat We call it scope currently. 06:44:40 shepazu: There could be a sophisticated URI, selecting part of the image being stored so that it can be matched against other things on the page 06:44:57 ... an app to compare this fragment on the page to compare with other images to see if it is included 06:45:10 Brady: I want to bookmark page 50, and create an image only for that 06:45:28 ... I want to be able to find that on a device now 06:45:35 bigbluehat We are talking about an XPath selector 06:45:40 ... any fragment selector 06:45:51 ... also exploring CSS/XPath for canonical selectors 06:46:03 ivan: the model is such that for generic term "selector", and can be extended 06:46:16 ... it is not there yet, but XPath can be such a selector 06:46:20 bigbluehat CFI already is 06:46:39 ivan: CSS can be one of the selectors 06:47:00 bigbluehat: you can encode a complicated selector and never put it in the URI 06:47:13 ivan: I think the CSS selector is underutilized. 06:47:27 ... reselector is extremely powerful 06:47:37 ... using that to locate the target of an annotation 06:48:47 azaroth: Continue on for 5 more minutes? 06:50:16 tzviya Talking about.. our vision for what we'd like to see .. portable web pub. offline being not first-class 06:50:26 ... a lot of work going on in CSS WG 06:50:28 ... and ARIA 06:50:44 ... issues that this raises.. identifiers are prominent. 06:50:55 ... we really focused on heading in that direction 06:51:08 ... a lot of the work is involved in international publication 06:51:19 ...
updating EPUB and what W3C is coming from 06:51:50 bigbluehat From the W3C side, multiple things expressible in a single package. 06:51:55 ivan: That's a question of the packaging 06:52:07 bigbluehat I mention it in relation to the protocol. 06:52:16 ivan: Don't open all the worms 06:52:25 tzviya we explored a lot of packaging options 06:52:43 ... definitely open 06:52:49 ... talking about it for a long time 06:52:54 ... right now, to read publications online 06:53:11 ... publication object model is a proposal 06:53:15 ... welcome to join 06:53:31 ... we need to have some sort of an object model, what it should be based on is open 06:53:36 ... and conversation with service workers 06:53:55 ivan: to update what's happening is that. there is a packaging format i.e., EPUB, zip based. 06:54:07 ... we know there is some work in W3C, mainly in TAG to create a packaging format. 06:54:17 ... the info I got so far is still a bit distorted. 06:54:27 ... unclear what the outcome of that work will be 06:54:38 web packaging: http://www.w3.org/TR/web-packaging/ 06:54:46 kurosawa has joined #annotation 06:55:11 ... from our POV, publishing community can say give us a packaging format.. 06:55:38 Brady: from my perspective, we proposed multitype mime. 06:55:57 ... every time we proposed it people were confused and asked why didn't you use zip? So, we used zip. 06:56:06 ... now we have toolchains, all sorts of code around ZIP. 06:56:17 ... if we can turn back time, multitype mime. 06:56:29 ... and reality on the web, I prefer service workers 06:56:30 s/multitype/multipart/ 06:56:34 :) 06:56:56 ... at least for the service workers, and not having a package makes more sense 06:57:16 ... especially for delivery, and in that case, we don't care about streamability 06:57:46 tzviya: I try doing with multipart mime, but then when you try making it work, there is not much out there.. only some old stuff from stackexchange... 06:58:02 bigbluehat when not doing W3C stuff, ..
I do CouchDB stuff to stream JSON 06:58:28 ... for CouchDB that's super useful b/c we don't hav eto read the whole disk 06:58:50 Brady: It sounds really interesting, but not interesting enough to do anything about it (so say some people) 06:59:19 azaroth: This has some impact on this WG, b/c it has offline reading modes 06:59:30 ... we should be able to accommodate 06:59:40 DPUB TPAC Agenda https://www.w3.org/dpub/IG/wiki/Oct_2015_F2F_Logistics_and_Details#Schedule 06:59:53 ... at the moment we are looking at ActivityStreams 07:00:05 ... to be discussed tomorrow with Social WG people 07:01:00 ivan: In a sense, the way we ... where we are getting ... the usage of service workers, it can fool the main reading system into believing that everything is accessible through Web (on/offline) 07:01:13 ... b/c the service worker will catch the HTTP request and deal with it (cached or not) 07:01:38 ... for the time being, the service worker is... - I don't want to say scifi - only one browser implements it 07:01:41 Web Packaging format (based on MIME Multipart) http://www.w3.org/TR/web-packaging/ 07:01:44 ... so, some fuzziness there 07:02:00 ... it may be that annotation work may be relieved of this issue 07:02:02 s/hav eto/have to/ 07:03:22 Jeff: .. [csarven: sorry, couldn't track this] 07:03:49 ivan: each annotation can have a role 07:04:22 ... We want to annotate @alt 07:04:34 tzviya So it is not visible content 07:04:41 shepazu I'm not sure how we can do that 07:04:41 elf-pavlik has joined #annotation 07:04:51 ...
I can't think of how a UX would expose that 07:05:09 bigbluehat The browser can switch off [stuff] 07:05:11 ben_thatmustbeme has joined #annotation 07:05:19 David_clarke has joined #annotation 07:05:19 annbass has joined #annotation 07:05:26 shepazu From a UX perspective it makes sense 07:05:34 bigbluehat If you can turn the images off, it is possible 07:05:40 present+ David 07:05:57 tzviya This is beyond the scope of DPUB 07:06:05 shepazu I'm skeptical 07:06:12 ... having assumed a web resource 07:06:24 bigbluehat DOM expression to the user. 07:06:36 ... maybe @alt is in that content stream 07:06:52 ... the visual representation is not the only representation the browser can make, e.g., it can be audible 07:07:08 shepazu I think tzviya was talking about.. what's exposde to the API 07:07:14 ..., not necessarily what's in a web page 07:07:34 ... there are things that can be done for @alt e.g., screen reader, but hard for a browser to do it 07:08:07 Jeff: can I blind user create it and a visual user can find it? 07:08:15 bigbluehat It really depends on what the visual text is given 07:08:26 shepazu There is a visual aspect to text 07:09:00 ... ultimately all text can be considered part of range, so not sure if you can't the text. It is possible, but probably has some issues which need to be figured out 07:09:02 s/can I blind/can a blind/ 07:09:19 ... what you are selecting as a range is the main issue 07:09:32 s/exposde/exposed/ 07:09:39 ivan: The
in HTML5, .. 07:09:54 Was it
or @details ? I only know of
07:10:51 Ralph: Without talking about interfaces, ... which is then available to the user not using that system 07:12:17 Jeff: so we can annotate anything 07:12:23 shepazu Anything. Even scene descriptions 07:13:11 ivan: We should have a way to incorporate this 07:13:19 .. and not throw away 07:13:57 azaroth also check the model so that we can... make an example of @alt. and how to represent that. 07:14:32 ... is there anything further from DPUB side to discuss? 07:14:49 ... welcome to hang out with us.. next topic is.. what we can accomplish before next TPAC for client side 07:15:35 Kerren (?) what would be the top implementations to expect? 07:16:05 shepazu Realistic for us to say... a data model - finish that with multiple representations/serializations 07:16:13 s/Kerren/Karen 07:16:35 ... Some find text, polyfill thing? possibly protocol as well 07:16:50 ... browser extensions 07:17:00 ... supporting annotation data model, findtext as well 07:17:10 ... we can hope that a browser supports some parts of this 07:17:13 ... that's not yet clear 07:17:35 ... within a year we could at least get a findtext implemented in a browser - even if it is a stripped down version 07:17:52 ... b/c webannotation is a bunch of moving parts, so I don't think we are going to have the full annotation system in a year 07:18:11 ... if you want the implementation, hypothes.is will probably have it in a year.. but that's a browser extension 07:18:21 bigbluehat the selectors we output match spec. 07:18:31 ... if we get XPath in, and define how we do lists in.. 07:19:03 ... I have written some translation code with WebAnnotation, and give it to Hypothes.is group... 07:19:12 ... the annotations are in public and public domain 07:19:30 ... the hope is that, we get up to the protocol as well. least likely to get done, probably because not done as a draft yet either. 07:19:40 .. that part is moving already internally due to other forces 07:20:05 ... there are libs that we have ...
if we do CSS/XPath, there is plenty of code already 07:20:21 ... if we ship nothing in a year, we can confidently deliver the data model at least 07:20:43 ... RadiumJS and (?) JS 07:21:05 ivan: essentially implementations may appear in readers 07:21:21 bigbluehat these are usually last mile problems, but somebody has to tie them together 07:21:51 Karen: Academic publishers/Journals ... 07:22:39 ... I will take what you have discussed and go over it.. headline would be: academic journals will soon be able to do such and such from W3C's stuff 07:22:52 shepazu Are you afraid that they'll talk down to you? ;P 07:23:29 dsinger has joined #annotation 07:23:30 shepazu The way to store and share annotations - that's the data model - the way to write annotations - that's the protocol - I'm not saying teach them these terms... 07:23:41 ... the way to write and store.. in the cloud. 07:23:59 ... open up annotations .. there are certainly ways to talk about these things like real people talk 07:24:06 azaroth Anything else? Overlapping with DPUB? 07:24:26 ... Thanks for coming 07:24:31 ... Sooooo 07:24:50 Topic: what should be done before the end of the charter? 07:24:55 ... remainder of the day, we can talk about what to do with the remainder of the charter 07:24:55 rrsagent, draft minutes 07:24:55 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html clapierre 07:30:04 ... 07:31:20 DOM related functions by tilgovi: 07:31:21 https://github.com/tilgovi/dom-seek 07:31:23 azaroth The description of client-side API in the charter.. is slightly different 07:31:25 https://github.com/tilgovi/dom-node-iterator 07:31:30 https://github.com/tilgovi/dom-anchor-text-quote 07:31:34 https://github.com/tilgovi/dom-anchor-fragment 07:31:36 tripu has joined #annotation 07:31:38 https://github.com/tilgovi/dom-anchor-text-position 07:31:53 ... So, do we think that .. a Python implementation of that would not be very useful.
07:32:09 shepazu If we can get one browser implementation, and one polyfill 07:32:13 ivan: that would be ideal 07:32:17 ... process wise 07:32:28 azaroth A server-side findtext in any language... 07:32:37 shepazu No, it is a client-side API 07:32:46 azaroth Unless under robustness. 07:32:56 bigbluehat Things being conflated in findtext 07:33:29 shepazu: If you are talking about the fragment identifier, but I was talking about the findtext API 07:33:45 bigbluehat: I agree, but I don't think that's how it was presented earlier. If we can separate that.. 07:34:00 shepazu Fragment identifier being completed in a year is not clear. 07:34:09 ivan: I have no doubt that we can do that in a year 07:34:19 bigbluehat: That's the easy part 07:34:40 shepazu: it took the media fragment group 3 years to do that.. so, sure we can do that in a year, but I don't think it is trivial 07:34:59 http://www.w3.org/annotation/charter/ 07:36:15 ... Can we get it widespread instead of the implementations? 07:36:29 bigbluehat: I don't think we can put it on our shipping b/c of that 07:36:40 erikmannens has joined #annotation 07:36:58 ... hopefully changing towards more implementable 07:37:15 shepazu: I think Charles brought up an interesting point. I don't know how to represent a column in HTML. 07:37:28 azaroth: How to annotate a column 07:37:39 bigbluehat: You can do that in a browser right.. clicking random cells 07:37:48 shepazu: Multiple discontinuous selections 07:38:03 shepazu: Actually you could use findtext 07:38:10 takeshi: ARIA and CSS are worlds apart 07:38:25 ... Between CSS and ARIA roles you could solve that.. It is a very long conversation. 07:38:44 ... So CSS selectors.. you can select whatever, and use ARIA roles to say this has roles... 07:38:50 ... col doesn't exist, but colgroup exists 07:39:05 ... the table models are most robustly defined, b/c it is really hard to navigate a table 07:39:17 ... so sit down and talk to ARIA people. 07:39:39 ...
ARIA/Annotations could be really accessible. Doesn't say much now, but it can be extremely valuable for a11y 07:40:03 shepazu: Sure.. let's talk about how that world fits in 07:40:17 ... the ability to serialize out what you selected or having been given a selection and recalling that in the document 07:40:29 ... I don't think ARIA is in the right place. It is a way to express something.. but I could be wrong 07:40:36 azaroth: Back to high level topic 07:40:39 ... we can finish the model 07:40:52 ... there are some outstanding issues we can deal with tomorrow 07:41:13 ivan: I think we should wait for the others, but sending CR by Dec. 07:41:19 bigbluehat: Discussing before sending to the list 07:41:44 azaroth: For the serializations however, JSON-LD... but HTML something comes up 07:42:06 shepazu: I'll show you later, but I have a prototype.. HTML addresses some of the things brought up 07:42:19 ... for any out of band content 07:42:23 ... comment, footnote 07:42:43 ... if for example it were to be a footnote, a single fn can be referenced in multiple places in the doc 07:43:04 ... if we had a native note element, I'll have a mapping between a note element. 07:43:12 ... something I've been tinkering with 07:43:25 shepazu: Two way mapping 07:44:05 ivan: Either using RDFa or using even extra attributes like ITS did 07:44:12 ... there is an existing mechanism to go through 07:44:20 ... I am concerned by adding new elements 07:44:24 ... we don't have an extension model in HTML 07:44:44 ... web components is for the time being up in the air - only one company doing? 07:45:05 shepazu: This WG won't necessarily accomplish everything in a year.. so let's concentrate our energy. 07:45:22 ... I propose, in parallel, we can start on work that could be done. 07:45:41 ... we have a list of areas, but not a list of specs. 07:45:55 ... so a set of specs in a year-two would be great 07:46:21 ...
we might not re-charter, unless we have a concrete list of stuff (which can map to the spec) 07:46:34 ... they might say that if something doesn't match a spec, it might get dropped 07:46:39 ... at least the ground for the next charter 07:47:55 ... Does anyone object to that approach? 07:48:07 bigbluehat: Don't mind if we have clear deliverables, but some things will fall through the cracks 07:48:22 shepazu: I want to make sure that... this work might take longer. 07:48:32 ivan: let's separate these discussions 07:48:39 ... for tomorrow, what to do for the coming year 07:49:03 ... provided that this will get done, how much energy will the other stuff take .. considering that most active are probably 10 people 07:49:56 shepazu: Do we have a general agreement on what we can deliver .. 07:50:05 bigbluehat: We need to repost the 6 points in the charter and map it to what we can deliver 07:50:12 ... some stuff is super big 07:51:39 olivier has joined #annotation 07:52:05 shepazu: So, the way that we struck upon is 'selection' is a pseudo-element in CSS. So, the range we get from the API is ... will provide as a range, once we have it, it will register in the document. and name it. 07:52:20 ... a pseudo-element ... [csarven: I lost it] 07:53:30 azaroth: Data model, protocol, vocabulary (part of data model), JSON-LD (part of protocol), not yet decided, search or notifications as part of the protocol, plain text API 07:53:48 ivan: what about URL? 07:54:14 bigbluehat: if anyone wanted to do it go to IETF and then we can incorporate it 07:54:32 shepazu: We can certainly try, erikmannens you can test how long it took to do media fragments 07:54:34 seems DPUB really wants it...so...we should talk! :) 07:55:25 ivan: XPointer like framework... 07:55:29 shepazu: hash something 07:55:51 azaroth: The point is IETF media fragments, .. and existing defs how those specific... 07:56:56 tzviya: Can we have a meeting with DPUB? Sounds like a large deliverable.
07:57:08 shepazu: Probably small with a lot of fight :) 07:57:38 bigbluehat: the call time is important/limited.. and fast track that to get the specs out. That's primary .. 07:57:46 azaroth: Thanks all 07:59:08 rrs agent, please generate minutes 07:59:09 RRSAgent: draft minutes 07:59:09 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html csarven 08:00:04 erikmannens has joined #annotation 08:00:53 kurosawa has joined #annotation 08:01:24 clapierre has joined #annotation 08:03:32 tzviya has joined #annotation 08:08:09 ivan has joined #annotation 08:08:26 rrsagent, draft minutes 08:08:26 I have made the request to generate http://www.w3.org/2015/10/25-annotation-minutes.html ivan 08:08:41 olivier has left #annotation 08:12:34 kurosawa_ has joined #annotation 08:18:25 shevski has joined #annotation 08:19:39 Jeff_Xu has joined #annotation 08:21:56 Ralph has joined #annotation 08:24:40 erikmannens has joined #annotation 08:35:24 zakim, bye 08:35:24 leaving. As of this point the attendees have been David_clarke, csarven, Karen, Tzviya, Ann_Basetti, Brady_Duga, Antonio, Olivier, Ralph 08:35:25 Zakim has left #annotation 08:35:42 rrsagent, bye 08:35:42 I see no action items