14:51:40 RRSAgent has joined #epub-locators
14:51:40 logging to https://www.w3.org/2021/06/09-epub-locators-irc
14:51:43 RRSAgent, make logs Public
14:51:44 please title this meeting ("meeting: ..."), dauwhe
14:52:02 meeting: EPUB 3 Locators Task Force
14:52:04 chair: dauwhe
14:52:28 regrets+
14:53:29 dauwhe has changed the topic to: https://docs.google.com/document/d/1y5JKcwlq1rvGYJ1HoLg0GF-NRiBsBNNZopGuQ0mgHUw/edit
14:59:50 avneeshsingh has joined #epub-locators
15:01:23 pilarw2000 has joined #epub-locators
15:01:23 present+
15:01:27 present+
15:01:31 present+
15:01:53 present+ dlazin
15:02:50 present+ Laurent
15:03:29 dlazin has joined #epub-locators
15:03:33 present+
15:03:34 scribe+ dauwhe
15:03:59 laurent__ has joined #epub-locators
15:04:09 present+
15:04:53 dlazin: we're working on what might be reasonable to solve
15:05:21 ... we appear to be heading towards a defined fall-back for page numbers
15:05:35 ... a way to ensure pagelist is always present, even if not defined by the author
15:05:45 ... we do want to encourage a pagelist in all EPUBs
15:05:56 ... but have something to fall back to that is easy to implement
15:06:22 ... Laurent and Hadrien had good comments about page numbers via email
15:06:39 ... we don't think CFI is a bad idea, but we think it's a solution for a different set of problems
15:06:48 ... these problems are around traditional page numbers
15:07:05 avneeshsingh: CFI is also a problem for future for HTML EPUB
15:07:13 ... these are quite broad use cases
15:07:25 ... did you decide to put some of them out of scope
15:07:41 dlazin: there's a spreadsheet for the use cases
15:07:46 ... the gdoc is out of date
15:07:48 https://docs.google.com/spreadsheets/d/1KO-HyLGUUw36F-ruAARHNiPO1aUJCCNeTv3zxGtjuHw/edit#gid=0
15:07:51 q+
15:08:08 ... in column I is whether things are in scope
15:08:13 ... those are the page number things
15:08:23 pilarw2000: dan, were you there last week?
15:08:26 ... I missed it
15:08:43 ...
I did not see in the minutes: Mary and I had been in an indexing conf
15:08:57 ... we asked about terms about what to call page numbers in ebooks that don't have page numbers
15:09:03 ... one suggestion was "pointer"
15:09:16 ... the thing a reader would click on to get to content
15:09:34 ... the thing in the index that might look like a section/page/para number that you click
15:09:45 ... "pointer" is a good word for that
15:10:02 ... we'd talked about translating a CFI into something readable; that would be a pointer
15:10:08 dlazin: don't forget that word :)
15:10:24 ... anything that we name has to be worldwide
15:10:35 ... users are familiar with pages
15:10:41 ... the concept is functionally a page
15:11:00 ... if we were to give it some other name, that might lead to irrelevancy
15:11:15 ... the email said, I'm in favor of calling these things pages
15:11:31 ... and encouraging reading systems to call their auto-numbering "screens"
15:11:46 ... pages don't change as you change font size, but the screens do move
15:11:54 pilarw2000: would we call them screen numbers?
15:12:03 dlazin: we would do page numbers, RSs would do screen
15:12:05 ack lau
15:12:24 laurent__: First, we cannot ignore the Kindle
15:12:29 ... which has locations
15:12:42 ... but it's not simple
15:12:50 ... in Readium we call it position
15:12:55 ... we didn't want to call it page
15:13:05 ... I would never speak about screens
15:13:14 q+
15:13:16 ... screens is what you see, it's what's in the viewport
15:13:22 ... no two experiences are the same
15:13:33 ... to find something transportable, screens is not good
15:13:43 ... we could keep the name page but there is confusion with pagelist
15:14:15 ... we have to choose a name that is adequate
15:14:27 dlazin: we don't bother with the queues here :)
15:14:45 ... to clarify, what we're trying to introduce is not dependent on viewport or screen size
15:14:56 ...
the problem is that reading systems already have a variable thing, and they won't drop those
15:14:57 present+
15:15:10 ... we do want to call the fixed thing pages
15:15:20 ... for one thing, if there's a pagelist we'll use that
15:15:25 ... and we want the same functionality
15:15:38 ... and we're recognizing there's an existing variable thing
15:15:48 ... I do believe screen is a bad name
15:15:54 ... because a11y
15:16:06 laurent__: RMSDK has been using position with a basic calculation
15:16:17 ... size of zipped resource in bytes, divided by 1024
15:16:29 ... because they are using zipped html resource
15:16:36 ... it has no real semantic meaning
15:16:56 ... but it is easy to calculate
15:17:18 ... we are wondering about using zipped vs unzipped HTML resource
15:17:34 ... if we want to keep with RMSDK and the reading toolkits
15:17:48 ... this is simple and adequate--zip divided by 1024
15:18:09 dlazin: brady said Google does something similar, unicode code points divided by 1000
15:18:24 ... once we have a recommendation, we should ask a RS to prototype a few things
15:18:50 laurent__: we have tried with uncompressed by 2??? but is about the same as compressed / 1024
15:18:58 ... you won't have interop until we agree
15:19:16 ... if we try we can take everything
15:19:23 dlazin: I think we agree it's arbitrary
15:19:28 ... and we're trying to set a standard
15:19:56 s/2???/2500
15:20:00 ivan: Laurent, could your team and google team write down their algo.
15:20:09 ... so we could compare
15:20:44 ... I don't know the result of this task force
15:20:52 ... to have something documented as a first step is good
15:21:13 ... I don't know what other reading systems do
15:21:41 laurent__: apple recalculates every time the font size or viewport changes
15:21:52 ivan: it depends on screen size, font size, etc
15:21:57 ...
that's not what we're looking for
15:22:00 laurent__: exactly
15:22:24 dlazin: if you provide a pagelist, Apple will use your pagelist and the screen number; you used to be able to toggle
15:22:41 ... it will have both variable and fixed
15:23:18 ivan: who else can we contact? Kobo?
15:23:44 dauwhe: it may even depend on which Kobo implementation
15:24:00 dlazin: I can't get info from Google for a bit, as Brady and Garth are out
15:24:16 laurent__: yes, we can write down our heuristic algo
15:24:22 ivan: perfect
15:25:09 ivan: what about bluefire
15:25:24 laurent__: old bluefire is RMSDK, new bluefire is Readium
15:25:48 ivan: can someone ask about Japanese reading systems on Thursday night?
15:26:07 laurent__: I have to leave now for a readium call
15:26:29 dlazin: the time for this is arbitrary, we're open to moving it
15:27:08 ivan: the epub call is 4pm CET / 10AM ET
15:27:19 ... it's simpler for me if that slot is taken by EPUB
15:27:58 dlazin: we are proposing to move to Friday 10AM ET
15:28:07 ivan: the first would be June 25
15:28:13 dlazin: we'll propose it
15:28:27 pilarw2000: are we still doing wednesday nights?
15:28:40 ivan: yes, but laurent and I can't participate
15:28:56 dlazin: we won't move those
15:29:11 ivan: we'll try and see what others say
15:29:47 dlazin: with dave and ivan here it's a good time to talk about strategy
15:29:51 ... what should we be producing
15:29:58 ... and how do we get adopted by reading systems?
15:30:12 ... let me lead the discussion
15:30:24 ... maybe it's an experimental thing where we need adoption
15:30:32 ... it's an attempt not a spec
15:30:37 ivan: there is an intermediate thing
15:30:44 ... if I become administrative
15:30:51 ... a WG can publish WG notes
15:31:01 ... about any subject
15:31:18 ... sometimes it could be a design that is not final
15:31:42 ... in this case my option would be that we have a doc published as w3c note on locator issue in general
15:31:48 ...
describing the problems and solutions
15:32:00 ... and talk about the role of EPUB CFI
15:32:05 ... and republish CFI with a note
15:32:14 ... and the same note would document this algo
15:32:29 ... this would not be a w3c recommendation
15:32:41 ... would raise more interest; more people would look at it than a CG doc
15:32:46 pilarw2000: that's what Wendy has said
15:35:31 dauwhe: this sounds like incubation, and we could sell this
15:35:41 avneeshsingh: is the ultimate objective the algo?
15:35:48 pilarw2000: are we?
15:36:00 dlazin: I think so? It's a part of what we're providing. It's like an appendix.
15:36:16 ... we are striving for an agreement that all RSs should have a fallback
15:36:33 ... we need to decide about backward compatibility
15:36:44 ... should we address all the existing epubs in the world
15:36:52 ... I don't think we need to
15:37:02 ... I think it's acceptable to be an initiative for the future
15:37:11 ... it would be OK for us to handle backwards compat
15:37:21 ivan: what do you mean by backward compatibility
15:37:32 ... today, if author provides a page list, it's clear what the RS should do
15:38:05 ... if an EPUB does not have a pagelist, then the RS may do something but there is no incompatibility
15:38:22 dlazin: do we need to support all existing EPUBs?
15:38:38 ... is this in authoring, ingestion, or reading system?
15:38:47 ivan: no author will calculate the bytes in the zip
15:39:02 ... from my perspective, ingestion and RS is identical
15:39:09 dlazin: I think RS makes the most sense
15:39:18 ... you could do it in authoring
15:39:28 ... it could be every 1k character
15:39:34 ... and indesign could implement it
15:39:40 avneeshsingh: as far as algo is concerned
15:39:48 ... why do we worry about this?
15:39:58 ... if there is an algo, it could be used anywhere
15:40:09 dlazin: some are only practical in RS
15:40:21 avneeshsingh: there are complexities in vertical writing
15:40:31 ... how would that work?
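
Editor's sketch of the two fallback heuristics mentioned in the discussion so far: size of the zipped resource in bytes divided by 1024, and Unicode code points divided by 1000. The log gives only the divisors. The function names, rounding up, 1-based numbering, and what exactly gets counted (markup or text only) are assumptions for illustration, not a description of what RMSDK, Readium, or Google actually implement.

```python
import io
import zipfile


def positions_from_compressed_size(epub_bytes: bytes, name: str, unit: int = 1024) -> int:
    """Count positions in one resource from its compressed size in the zip.

    Assumption: round up, and give even an empty resource one position.
    """
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        size = zf.getinfo(name).compress_size  # bytes as stored in the archive
    return max(1, -(-size // unit))  # ceiling division


def positions_from_code_points(text: str, unit: int = 1000) -> int:
    """Count positions from the number of Unicode code points in `text`.

    Python's len() on a str counts code points, so this does not depend on
    compression or on the byte encoding of the resource.
    """
    return max(1, -(-len(text) // unit))


def position_of_offset(offset: int, unit: int) -> int:
    """Map a 0-based offset (bytes or code points) to a 1-based position."""
    return offset // unit + 1
```

Only the code-point variant gives the same result whether it runs in an authoring tool or a reading system; the compressed-size variant depends on how the archive happened to be compressed.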
15:40:45 pilarw2000: indexing is done and embedding in doc. that's part of authoring
15:40:55 ... I'm not sure I'd agree it's RS-dependent
15:41:17 dlazin: if you have an index in a book, and the index header is "carrot"
15:41:25 ... 12, 38, 44
15:41:32 ... indexer needs to put links in
15:41:39 pilarw2000: it's flipped around
15:41:49 ... I've embedded the index entry in the text
15:42:03 ... and then some process results in the pointer
15:42:08 ... the page number comes after
15:42:21 ivan: this means that whatever algo we come up with
15:42:40 ... should be oblivious to whether it is done in RS or in authoring tool
15:42:50 ... so shouldn't be length of zip file
15:43:02 ... could be number of unicode characters
15:43:08 dlazin: there is another alternative
15:43:19 ... that's hard, though, for reasons avneesh explains
15:43:30 ivan: I don't think avneeshsingh is right here
15:43:40 ... unicode is there; how they are displayed is a different question
15:43:45 avneeshsingh: it is different
15:44:08 ... may be a small number of chars between pages
15:44:36 dlazin: you need page numbers for xref or index
15:44:39 ... so we could say
15:44:53 ... the pagelist is primary, algo is fallback
15:45:01 ... you can only do indexing and xrefs with a pagelist
15:45:12 ... that lets algo be deferred to the RS
15:45:22 pilarw2000: you're also talking about TOC, endnotes
15:45:29 dlazin: TOC doesn't need page numbers
15:45:37 ... more important for index
15:45:42 ivan: I'm not sure I follow
15:45:53 ... the term authoring is too broad
15:46:40 pilarw2000: I'm working with the book right before it's final
15:46:49 ivan: the content is already there
15:47:34 dlazin: if Pilar finds a book without a pagelist, I'd expect her to request a new version with a pagelist or make one herself
15:47:41 ...
there could be a fallback at authoring time
15:47:58 ivan: that's the algo doable for both author and RS
15:48:14 dlazin: if Pilar is inserting page numbers, she can use arbitrary numbers
15:48:27 ivan: if indexing is not there, and RS still does it
15:48:41 ... then you solve the use cases around shared location
15:48:52 ... if the algo covers several use cases that's important
15:49:01 pilarw2000: I don't generate the page numbers
15:49:13 ... the publisher creates the page numbers for all the use cases
15:49:23 ... InDesign is great for that
15:49:27 dlazin: I still don't agree
15:49:40 ... there's no problem if we use the same algo but I don't think it's necessary
15:49:55 ... one case is a compiled ebook we don't touch, reading system generates pagelist
15:50:07 s/pagelist/positions
15:50:23 ... the other case is where the page numbers are written into the files
15:50:50 ivan: is it better if it's the same algo
15:51:29 dlazin: the algo is an implementation of the goals. It is an appendix.
15:51:37 ... we derive the algo from our goals and requirements
15:51:48 ... we want it to be useable at both authoring time and RS time
15:51:59 ... and needs to work for all languages
15:52:12 ... but doesn't need to work identically
15:52:20 avneeshsingh: should be consistent within a language
15:52:39 pilarw2000: as we go from spanish to english
15:52:53 avneeshsingh: if you have ten books in that language it should work consistently
15:53:14 dlazin: from strategy perspective
15:53:30 ... we need to remember to reach out to adobe and other authoring systems
15:53:35 ... this will be vastly more useful
15:53:48 ... the best case is for every epub to have a pagelist
15:53:59 ... and for that we need ID and scrivener and Word
15:54:03 ivan: yeah
15:54:14 ... I have a script that produces EPUBs of W3C docs
15:54:20 ... I start with HTML
15:54:30 ... for me to generate a pagelist doesn't work
15:54:34 ...
I'd need a script
15:54:42 dlazin: that's not the primary case
15:55:12 pilarw2000: when I'm indexing, instead of using page numbers I have distinct text for every hit under a concept
15:55:26 ... every instance has a unique chunk of text
15:55:33 ivan: yes
15:56:16 pilarw2000: we do need to tell people that's what you should do
15:56:42 dlazin: next time, we should start working on the note
15:56:46 ... we have things to write down
15:56:56 ivan: is that all?
15:57:04 avneeshsingh: thanks for helping me catch up
15:57:09 dlazin: thanks for coming!
15:57:43 rrsagent, draft minutes
15:57:43 I have made the request to generate https://www.w3.org/2021/06/09-epub-locators-minutes.html ivan
16:04:25 Zakim, end meeting
16:04:25 As of this point the attendees have been dauwhe, avneeshsingh, pilarw, dlazin, Laurent, laurent__, ivan
16:04:28 RRSAgent, please draft minutes
16:04:28 I have made the request to generate https://www.w3.org/2021/06/09-epub-locators-minutes.html Zakim