13:29:55 RRSAgent has joined #epub
13:29:55 logging to https://www.w3.org/2022/10/07-epub-irc
13:29:57 RRSAgent, make logs Public
13:29:59 please title this meeting ("meeting: ..."), ivan
13:30:13 ivan has changed the topic to: Meeting Agenda 2022-10-07: https://lists.w3.org/Archives/Public/public-epub-wg/2022Oct/0000.html
13:30:14 Chair: dauwhe
13:30:14 Date: 2022-10-07
13:30:14 Agenda: https://lists.w3.org/Archives/Public/public-epub-wg/2022Oct/0000.html
13:30:14 Meeting: EPUB 3 Working Group Telco
13:30:14 Regrets+ matthew, tzviya, toshiakitoike
13:40:30 ivan_ has joined #epub
13:52:33 dauwhe has joined #epub
13:56:29 Zakim, who is here?
13:56:29 Present: (no one)
13:56:31 On IRC I see dauwhe, ivan_, RRSAgent, Zakim, hober, github-bot, Github, npd, jcraig
13:58:03 present+ rickj
13:58:07 present+
13:59:06 MasakazuKitahara has joined #epub
13:59:18 present+
13:59:22 present+
14:00:02 present+ dhall
14:00:09 present+ wendy
14:00:24 romain has joined #epub
14:00:40 rickj has joined #epub
14:00:44 wendyreid has joined #epub
14:00:47 present+
14:00:54 present+
14:00:58 present+
14:01:39 present+ bduga
14:01:51 CharlesL has joined #epub
14:02:02 present+
14:02:26 dhall_ has joined #epub
14:02:35 present+
14:03:02 duga has joined #epub
14:03:06 present+
14:03:10 present+ billk
14:03:12 scribenick: dauwhe
14:03:18 Bill_Kasdorf has joined #epub
14:03:27 gpellegrino has joined #epub
14:03:40 present+
14:03:43 wendyreid: first agenda item is viewport metadata
14:03:45 chair: wendy
14:03:53 https://github.com/w3c/epub-specs/issues/2442
14:04:00 present+
14:04:06 topic: https://github.com/w3c/epub-specs/issues/2442
14:04:09 romain: I had a late review of new viewport spec
14:04:19 gpellegrino_ has joined #epub
14:04:19 ... wondering if it was too strict
14:04:24 q+
14:04:24 ... is it enough for reading systems?
14:04:32 ... talking about viewports meta tag for FXL
14:04:48 ... for reflowable we are covered (RS must ignore viewport)
14:04:58 ... in FXL this is important because it sets ICB
14:05:05 ... there's an EBNF grammar to define it
14:05:23 ... it's more constrained than what's in the CSS draft it comes from
14:05:49 ... it's more constrained than what browsers/reading systems can extract info from
14:05:55 ... so should we relax the grammar a bit
14:06:08 ... or further specify how reading systems should extract useful values from it
14:06:26 ack iv
14:06:36 MURATA has joined #epub
14:06:57 ivan_: to pick up on what romain said
14:07:12 wendyreid_ has joined #epub
14:07:14 github-bot has joined #epub
14:07:25 ... the spec does say somewhere that if the viewport spec is wrong, it should use device height/width
14:07:29 q?
14:07:36 ... it's not only strict, it
14:07:53 q+
14:07:58 https://github.com/w3c/epub-specs/issues/2442
14:07:58 q?
14:07:58 hober has joined #epub
14:08:05 https://github.com/w3c/epub-specs/issues/2442#issuecomment-1271300202
14:08:24 ... for example Apple can make sense of an invalid viewport tag
14:08:48 ... what I propose is to relax the requirement to use device sizes when there is a problem
14:08:57 q+
14:08:59 ... and instead say the RS should warn the user/author
14:09:09 ... and then it can do a best attempt to do something sensible
14:09:19 ... it can try to extract a window size
14:09:38 ... the only place where there's a different problem
14:09:51 ... if you have a number of content files and forget to set the viewport for one of them
14:09:59 ... today there is nothing there so you use device sizes
14:10:21 ... the other possibility is to [1] warn user and [2] go back to previous spine element and use that viewport size
14:10:28 q?
14:10:31 ack romain
14:10:53 romain: I agree that the spec already set that device height/width should be used when faced with an invalid value
14:11:03 ... so maybe relaxing RS spec is the way to go
14:11:12 ... I'm not convinced RSs should warn users
14:11:19 +1
14:11:21 q+
14:11:39 ... if the reading system can do its best to render something, or make up a value that's good enough
14:11:47 ... as transparently as possible
14:12:05 present+ makoto
14:12:07 ... there's a case with valid viewport but specifies the width several times
14:12:21 ... this is valid per grammar but doesn't say if it should pick first/last/any
14:12:48 ... do we need to say something in RS spec? If there's dupes then it becomes invalid?
14:12:58 ack dhall_
14:13:07 dhall_: I like your rec there
14:13:18 ... I would lean toward should report errors to user
14:13:27 ... some could be unrecoverable, like xml parsing errors
14:13:42 ... or there could be dev mode and remotely inspect
14:13:50 ... but most readers won't be interested
14:13:57 ... so "should" warn the users
14:14:14 ... on multiple width values, add something to spec to prefer first value found
14:14:20 q+
14:14:23 or even MAY warn the users
14:14:26 ... separately, what to do when there isn't anything specced
14:14:37 ... typically in FXL books you have the same page size throughout
14:14:54 ... different page sizes present challenges to the RS
14:15:17 ... I like the idea of falling back to the most recent valid viewport
14:15:38 ... if nothing is defined, do you fall back to device dimensions or something else?
14:15:51 ... because device dimensions change with orientation
14:16:06 ack wendyreid
14:16:09 ... and what happens when dimensions change?
14:16:30 wendyreid: as much as I like the idea of warning the user
14:16:56 ... in the real world of RSs, telling the user that something is wrong with their viewport isn't helpful. There's nothing they can do.
14:17:14 q+
14:17:19 ... this puts an undue burden on the user
14:17:38 ... I think this should be an epubcheck thing
14:17:48 ... I like idea of preferring first value
14:18:08 ... RSes do lots of things to optimize for speed when they render FXL
14:18:22 ... we use first viewport value and don't reparse
14:18:33 ... so we don't support changing viewport sizes
14:18:58 q+
14:19:10 ack duga
14:19:19 duga: agree with David and Wendy
14:19:28 ... I would not say should warn the user, it's a bad idea
14:19:35 ... and these are often kids' books :)
14:19:54 ... we have end users and publishers
14:20:03 ... might be helpful to talk about these two groups
14:20:10 q+
14:20:13 ... I want my publishers to know but not my users
14:20:23 ... you can reject something with a bad viewport
14:21:08 ... I'm agnostic on which to pick of multiple values
14:21:15 ack ivan_
14:21:17 ... I think we do support multiple page sizes
14:21:34 ivan_: we have multiple viewports in the spec
14:21:57 +1 to Brady (et al.) and differentiating messaging between readers and publishers
14:22:05 ... (different page sizes for different files)
14:22:09 ... we can't change that
14:22:14 ... if they don't implement that's ok
14:22:30 ... for reporting, I understand that I don't want reports going to kids
14:22:38 ... I like the safari analogy of David
14:22:57 ... I wonder if we should say something in the spec
14:23:00 q+
14:23:21 q-
14:23:22 ... from standards point of view it's ok to say reading systems should issue a warning
14:23:47 ... matt and I can come up with a larger PR
14:23:53 ack dhall_
14:24:21 dhall_: when I think of users, do you think of author and publisher as the same user
14:24:26 ivan_: we don't specify that
14:24:43 dhall_: I can see a publisher being interested in results from epubcheck
14:25:09 ... I can see that authors that write their own epubs would benefit from warnings from RS
14:25:15 ... do we use term "May"?
14:25:17 ivan_: yes
14:25:24 ack dauwhe
14:25:26 scribe+
14:25:44 dauwhe: We're talking a lot about the situation where the author messed up the viewport
14:25:50 ... what choices the RS has in that case
14:25:57 ... this sounds like EPUBCheck's job
14:26:04 ... is the viewport parse-able
14:26:13 ... compatibility issues, maybe it's a warning
14:26:25 ... seems wrong to put this on the end user
14:26:30 q?
14:26:49 wendyreid: if we reach any sort of consensus we don't want to put anything on the end user/reader
14:26:57 q+
14:27:14 ... we do want to warn the content creator (publisher/author)
14:27:41 ... maybe we want to increase robustness of the section
14:27:41 I would recommend "EPUB creator
14:27:55 ... but we want requirements around validity of viewport
14:28:04 ... RSes rely on epubcheck
14:28:07 q?
14:28:09 ack duga
14:28:19 duga: +1 to using may instead of should
14:28:35 s/"EPUB creator/"EPUB creator" vs. "content creator" because the person writing the book created the content but not the EPUB/
14:28:45 ... this section isn't a real world problem except there was no spec to reference
14:28:55 ... we're not actually seeing real problems in FXL books
14:30:40 CharlesL: Q about checking viewport on mobile devices
14:30:56 dauwhe: viewport is property of EPUB not device, spec says how to adapt
14:31:34 Resolution: Ivan to adjust language to reflect content creator/EPUBCheck responsibility for viewport
14:31:57 Topic: https://github.com/w3c/epub-specs/issues/2447
14:32:08 Topic: XML Security and internal parsed entities
14:32:22 MURATA: old issue; more than 20 years ago
14:32:39 https://github.com/w3c/epub-specs/issues/2433
14:32:44 ... related to external parsed entities issue 2433
14:32:59 ... this issue is about INTERNAL parsed entities
14:33:09 ... doesn't use external identifiers or URLs
14:33:20 ... these are defined in internal DTD subsets
14:33:31 https://en.wikipedia.org/wiki/Billion_laughs_attack
14:34:00 ... if an internal reference references a different internal reference, things can get out of hand quickly
14:34:28 q+
14:34:30 ... some EPUB reading systems ignore rec of XML and ignore internal parsed entities. This is non-conformant but avoids security issue
14:34:43 ... what should we do? Prohibit internal DTD subsets?
14:34:55 ... which gets rid of internal parsed entities.
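Editorial sketch, not part of the meeting record: the "out of hand quickly" growth MURATA describes can be measured without performing any expansion. The Python below is a minimal illustration, assuming entity values are double-quoted literals in the internal DTD subset. The names `expanded_lengths`, `exceeds_cap` and the `MAX_EXPANDED_CHARS` cap are hypothetical and come from neither the XML Recommendation, the EPUB specifications, nor PR 2451.

```python
import re

# Hypothetical cap, chosen for illustration only; no specification defines it.
MAX_EXPANDED_CHARS = 1_000_000

# Simplified patterns: double-quoted entity values only, no parameter entities.
ENTITY_DECL = re.compile(r'<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'&([\w.-]+);')


def expanded_lengths(dtd_subset: str) -> dict[str, int]:
    """Compute the fully expanded length of each internal entity
    arithmetically, so nothing is ever expanded in memory."""
    decls: dict[str, str] = {}
    for name, value in ENTITY_DECL.findall(dtd_subset):
        decls.setdefault(name, value)  # XML binds the first declaration of a name

    cache: dict[str, int] = {}

    def length(name: str, active: frozenset[str]) -> int:
        if name in cache:
            return cache[name]
        if name in active:
            raise ValueError(f"recursive entity reference: {name}")
        value = decls[name]
        # Characters that are not part of any named entity reference.
        total = len(ENTITY_REF.sub("", value))
        for ref in ENTITY_REF.findall(value):
            # Undeclared names (e.g. predefined &lt;) are counted as one character.
            total += length(ref, active | {name}) if ref in decls else 1
        cache[name] = total
        return total

    return {name: length(name, frozenset()) for name in decls}


def exceeds_cap(dtd_subset: str, cap: int = MAX_EXPANDED_CHARS) -> bool:
    """True when any declared entity would expand beyond `cap` characters."""
    return any(size > cap for size in expanded_lengths(dtd_subset).values())
```

With ten references per level, each extra nesting level multiplies the expanded size by ten, which is why a few short declarations are enough to exhaust memory in a parser that expands eagerly.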
14:35:11 ack ivan_
14:35:21 ... can existing RSes be destroyed by malicious content?
14:35:29 ivan_: thanks Makoto for raising the issue
14:35:58 ... the problem is that removing the possibility to use internal entities is something we can't do because of existing epub docs that might use this
14:36:07 ... we cannot invalidate existing content per our charter
14:36:28 https://github.com/w3c/epub-specs/pull/2451
14:36:46 ... as part of a pull request that makoto and I did together is a separate section
14:36:56 ... in the security section of RS spec
14:36:57 https://raw.githack.com/w3c/epub-specs/makoto-xml-conformance-change/epub33/rs/index.html#security-privacy-recommendations
14:37:16 ... which says that RS should be aware of this problem and should deal with it.
14:37:26 q+
14:37:38 ... so we should draw attention to the issue
14:37:39 q+ to ask about known solutions
14:37:41 ack dauwhe
14:37:58 dauwhe: I don't think we can or should forbid internal entities
14:38:10 ... the problem has been around for 20 years and EPUB is still functional
14:38:26 ... it's a security vulnerability, we may need to warn people in our security section
14:38:42 ... ordering RSs to follow a specific algorithm might be overkill
14:38:46 ... let's warn people
14:38:58 MURATA: Basically, not our problem
14:39:01 MURATA: this is not our problem :)
14:39:07 ack duga
14:39:07 duga, you wanted to ask about known solutions
14:39:08 q?
14:39:35 q+
14:39:37 duga: this is a known problem in XML. Are there known solutions?
14:39:48 ... do they have mitigations? Like in libxml?
14:40:23 +q
14:40:24 ack ivan_
14:40:31 "Defenses against this kind of attack include capping the memory allocated in an individual parser if loss of the document is acceptable, or treating entities symbolically and expanding them lazily only when (and to the extent) their content is to be used."
14:40:38 ivan_: makoto may know
14:40:58 ... makoto made some sample epubs, and we tested them
14:41:05 ... I tested on Thorium and iBooks
14:41:13 ... both reacted by rejecting the EPUB
14:41:22 ... without identifying the problem
14:41:41 ... we suspect they don't accept internal entities
14:41:44 q?
14:41:49 ack MURATA
14:41:58 MURATA: I'm not aware of general solutions
14:42:10 ... XML WG was aware of this issue, but had no good solutions
14:42:43 ... Thorium doesn't reject publication, it ignores the entity but not the entire publication
14:42:50 ... this behavior is non-conformant
14:43:28 dauwhe: I don't think it's entirely fair to say that this implementation is non-conformant because it's not deploying my vulnerability
14:43:37 ... not fair to RS
14:43:39 q+
14:43:44 ack wendyreid
14:43:45 q+
14:43:58 wendyreid: it's like the viewport convo
14:44:05 ... we want to avoid putting burdens on readers
14:44:32 q+
14:44:45 ... does the fail quietly solution work?
14:44:50 ... is that a bad thing?
14:44:58 ack iv
14:45:00 ack ivan_
14:45:07 ... I think because we are running out of time
14:45:13 ... people should look at the PR
14:45:18 ... there are other things in the PR
14:45:26 ... then we move on
14:45:28 https://github.com/w3c/epub-specs/pull/2451
14:45:28 q+
14:45:32 ack dauwhe
14:45:43 dauwhe: I think we add something to the security section
14:46:06 ... it hasn't been a problem in the past, internal entities aren't usually a problem unless malicious
14:46:20 ... what are the consequences of a malicious use
14:46:26 ... the consequences seem small
14:46:41 ... good enough to say "XML has a known vulnerability"
14:46:42 ack dug
14:46:44 ack duga
14:47:04 duga: I agree with putting this in the security section, maybe mentioning other xml vulnerabilityes
14:47:15 s/vulnerabilityes/vulnerabilities/
14:47:38 duga: any attack like this can jeopardize personal information
14:47:59 ... although no one has bothered to do this
14:48:54 dauwhe: Some of this is also outside the scope of the WG, we can't ask for people to patch the OS they are running dev machines one
14:48:59 s/one/on
14:49:53 wendyreid: consensus: let's let people look at Ivan's PR
14:49:56 ... comment on it
14:50:25 ... ten minutes remaining
14:50:32 ... two more topics
14:50:39 Topic: satellite specs
14:50:49 ... some of these are in our w3c repo
14:50:54 ... multiple renditions, CFI
14:50:57 ... some aren't
14:51:05 ... which might be a good thing
14:51:16 ... we should pull in region-based navigation, which has implementations
14:51:27 q+
14:51:29 ... are there any others that need attention?
14:51:46 ... they will remain in idpf space if we don't move them
14:51:46 q+ to ask about CFI
14:51:50 ack ivan_
14:51:54 ivan_: we have to be precise
14:52:06 ... should any of these be published as w3c note?
14:52:17 q+
14:52:29 ... CFI is in our space, I pulled in the existing idpf doc, turned into a format for w3c
14:52:35 ... but it hasn't been published
14:52:43 ... and we did have some discussion with brady etc
14:52:56 ... publishing as a note would require a new section on the processing model
14:53:02 ... it's more than editorial work
14:53:04 ack duga
14:53:04 duga, you wanted to ask about CFI
14:53:27 duga: we should pull over CFI, unless I have to do more work
14:53:41 ... it is the one that always comes up
14:53:41 q+
14:53:50 ack rick
14:53:56 Has the Locators group dealt with CFI?
14:54:12 rickj: ignoring extra work, we're five years into w3c owning epub
14:54:20 ... I still see articles linking to IDPF
14:54:39 ... is there an argument for moving them to the w3c domain?
14:54:40 ack dhall_
14:54:44 q+
14:54:53 dhall_: to comment on CFI, what's the remaining work?
14:55:05 ... we could maybe get that work done
14:55:05 q+
14:55:07 ack dauwhe
14:55:25 dauwhe: Just to Rick's question, I still think there's no intention of shutting down IDPF web links
14:55:29 ... bad web practice
14:55:35 ... we have lots of pointers to W3C-land
14:55:41 q?
14:55:42 ... I certainly understand the problem
14:55:45 ack iv
14:56:02 ivan_: the epub cfi doc today is a spec of syntax
14:56:23 ... what's missing is a processing model, of what the machine should do
14:56:32 ... I cannot write that because I don't know enough
14:56:46 ... if you want to write it I can bring it into the w3c world
14:56:53 ... then it would be worth publishing
14:57:01 duga: that's what's needed
14:57:12 ... I started to write that once and was told not to waste my time
14:57:26 q+
14:57:33 ack dhall_
14:57:34 ack dh
14:57:51 dhall_: if there are examples of a parallel algo description that would help me
14:57:57 ivan_: do you have that, brady?
14:58:03 Who uses EPUBCFI now?
14:58:11 duga: I modelled it on something in html
14:58:14 q?
14:58:44 ivan_: we did processing for publication manifest
14:59:22 https://www.w3.org/TR/pub-manifest/#manifest-processing
14:59:31 zakim, end meeting
14:59:31 As of this point the attendees have been rickj, ivan_, MasakazuKitahara, dauwhe, dhall, wendy, wendyreid, romain, bduga, CharlesL, dhall_, duga, billk, Bill_Kasdorf, gpellegrino,
14:59:34 ... makoto
14:59:34 RRSAgent, please draft minutes
14:59:34 I have made the request to generate https://www.w3.org/2022/10/07-epub-minutes.html Zakim
14:59:36 I am happy to have been of service, ivan_; please remember to excuse RRSAgent. Goodbye
14:59:38 CharlesL has left #epub
14:59:41 Zakim has left #epub
15:00:35 rrsagent, bye
15:00:35 I see no action items
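Editorial sketch, not part of the meeting record: two options floated during the viewport discussion (issue 2442) were to prefer the first value when `width` or `height` is repeated, and to fall back to the most recent valid viewport before falling back to device dimensions. The Python below shows one way those two ideas might combine. It is deliberately more lenient than the EBNF grammar mentioned in the discussion, the group did not resolve on this behavior, and the names `parse_viewport` and `resolve_viewports` are hypothetical.

```python
import re
from typing import Optional

Size = tuple[int, int]  # (width, height) in CSS pixels


def parse_viewport(content: str) -> Optional[Size]:
    """Leniently extract (width, height) from a viewport meta `content` value.
    Returns None when either dimension is missing or not a positive integer."""
    found: dict[str, str] = {}
    for part in re.split(r"[,;]", content):
        key, sep, value = part.partition("=")
        key = key.strip().lower()
        if sep and key in ("width", "height"):
            found.setdefault(key, value.strip())  # first occurrence wins
    try:
        width, height = int(found["width"]), int(found["height"])
    except (KeyError, ValueError):
        return None
    return (width, height) if width > 0 and height > 0 else None


def resolve_viewports(contents: list[Optional[str]], device: Size) -> list[Size]:
    """Resolve one size per spine item, in spine order. `None` stands for a
    content document with no viewport meta tag at all."""
    resolved: list[Size] = []
    last_valid: Optional[Size] = None
    for content in contents:
        size = parse_viewport(content) if content is not None else None
        if size is not None:
            last_valid = size
        # Assumed fallback order: own value, previous valid value, device size.
        resolved.append(size or last_valid or device)
    return resolved
```

For example, `resolve_viewports(["width=1200, height=600, width=50", None], (320, 480))` yields `(1200, 600)` for both items: the repeated `width` is ignored and the second document inherits the first document's size.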