EPUB 3 Working Group Telco — Minutes

Date: 2021-03-18

Attendees

Present: Shinya Takami (高見真也), Masakazu Kitahara, Toshiaki Koike, Matthew Chan, Wendy Reid, Brady Duga, Ben Schroeter, Matt Garrish

Regrets:

Guests: Dave Cramer, Dan Lazin, Rob Smith

Chair: Dave Cramer

Scribe(s): Matthew Chan

Dave Cramer: we have several new people here today

Dan Lazin: i’m a technical writer, documentation expert
… i could also maybe help with testing

Rob Smith: i’m from ebsco, we’re an educational content aggregator
… here representing our ebooks group
… particularly around a11y, and George’s gh issue about virtual page numbers

1. Virtual Page Numbers

See github issue #1542.

Dave Cramer: we discussed this issue on the last call without reaching a resolution
… we recognize the utility of page numbers, esp. in educational environments
… but different RS all do their own thing when faced with epub without pagelist
… but that’s not really something our WG can standardize
… any further comments to bring us up to speed?

Wendy Reid: we’ve had a lot of feedback from RS side, i.e. from Hadrien
… but does the utility of standardizing outweigh the difficulty?

Rob Smith: we have our own RS as part of our web platform, so we understand the difficulty
… in our market, page nums and addressable locations are critical for academic citations
… our buyers prefer PDF because of the predictable page numbers
… if you’re using epub without page numbers, the risk of plagiarism is higher because its harder to precisely locate citations

Matthew Chan: shiestyle addresses our Japanese members in Japanese

Shinya Takami (高見真也): toshiakikoike from Voyager (a jp RS) says that it is difficult to implement these features
… TOC serves a similar use case

Rob Smith: I don’t think a toc is suitable substitution
… the toc is too sparse for a book of, say, 300 pages
… the technical limitations are clear, but maybe we could have some sort of a non-normative recommendation?

Brady Duga: we already have pagelists so the publishers can implement that
… the question is what the RS should do in the absence of pagelist
… to provide consistent virtual page numbers

Dave Cramer: yes that’s it, but what steps should we take to get there?

Wendy Reid: i think one of the struggles is we technically have pagelist as an established method of doing this, but obviously its either not enough or there is something about it not meeting the need
… maybe we could use something else as a kind of standardized, periodic locator?
… this may fall under best practices, but it seems there is a clear desire for a standardized practice

Dan Lazin: it seems that page numbers specifically are an artifact of printed books
… in a lot of cases we want to preserve that correspondence to print
… but there are other ways to locate inside a book
… but also RSes have implemented their own solutions
… so, as an alternative, we could use, for example, paragraph numbering (i.e. what legal texts do)
… this wouldn’t interfere with the existing page number system

Matt Garrish: what George was looking at was an algorithm for inserting page numbers automatically
… but maybe this is a task for a separate task force, with the goal of creating a note or something
… rather than using our call time to do this

Dave Cramer: it doesn’t feel realistic to me to expect RS to change their existing way of doing things
… e.g. ibooks will count the number of screens and repaginate for font size
… and that goes against what a lot of other people want
… it seems possible that we could come up with a tool to do a thing to epubs and produce a pagelist from that in the absence of a pre-prepared one
… maybe a TF or the CG would like to take a shot at this
… changing RS behaviour is kind of outside spec
… but we could experiment with that tool as something to help out content creators

Wendy Reid: i think a TF is a good idea
… agreed that we probably won’t change behaviour for major RSes
… its an industry thing
… trade publishers probably won’t be as interested in this as educational side (VitalSource, Ebsco, etc.)
… we could well get implementors of this from the right crowd of publishers

Dave Cramer: does anyone want to try to put together a little script?
… okay, we’ll put together a task for to do these experiments

2. Normative Schemas

See github issue #1566.

See github pull request #1573.

Dave Cramer: the quick summary is that epub has historically had some normative schemas
… i think it would be a good idea to change to say that the spec is normative, and that any references to spec are informative only

Matt Garrish: that’s part of it
… Ivan particularly didn’t like that we were referencing across to epubcheck repo
… basically the PR that we have open now makes that schema section informative, and adds some description boxes to things we hadn’t previously defined

Proposed resolution: Accept PR 1573 (Wendy Reid)

Ben Schroeter: +1

Matt Garrish: +1

Toshiaki Koike: +1

Matthew Chan: +1

Masakazu Kitahara: +1

Wendy Reid: +1

Brady Duga: +1

Shinya Takami (高見真也): +1

Resolution #1: Accept PR 1573

Dave Cramer: this makes the distinction between spec and epubcheck clearer, which seems good for everyone

3. Data URLs

See github issue #1564.

Dave Cramer: where should data URLs be allowed in epub?
… seems to make sense that they shouldn’t be allowed in <a> that are not spine elements
… there are some interesting uses of data urls in CSS (e.g. for backgrounds)
… so where are these okay, and where shouldn’t they be okay?

Matt Garrish: if we allow them to be embedded, basically, we still can check if they are core media types
… so we still have a sense of what they are
… if they aren’t core media types, you can’t really provide a fallback to data URL, so they would be disallowed that way
… i have no particular issue with the embedding part
… but the <a> part opens security issues

Dave Cramer: so is it good enough to forbid the <a> usage, but to allow as embedded resource? Would that break epub check?

Matt Garrish: that would be possible

Brady Duga: i don’t think we spec away the security issues here
… i agree that not putting data urls in your <a> is good, and that RS should ignore them and not do anything
… if we do spec something, what do we tell RS to do?
… this seems more like a comment to authors that they shouldn’t do this thing

Dave Cramer: the problem is that i want this in epub check, and to do that we need the spec to say something
… by putting it in epub check many fewer RSes will have to deal with this at all
… can we just say that “RS must not perform navigation”?

Matt Garrish: do we have to spec out what the RS does?

Brady Duga: i’m okay putting this in just for epub check purposes…

Dave Cramer: and i’m okay writing a test for it, e.g. an RS fails if it tries to follow the <a>
… can we resolve on this?

Proposed resolution: embedded Data URLs are OK, but data URLs in <a> are forbidden; reading systems must not navigate to such URLs. (Dave Cramer)

Wendy Reid: +1

Toshiaki Koike: +1

Brady Duga: +1

Masakazu Kitahara: +1

Matthew Chan: +1

Dave Cramer: is the link element an issue here?

Matt Garrish: i wouldn’t think that it would matter, because there is no requirement for the RS to do anything

Ben Schroeter: 0

Brady Duga: what about in SVGs?

Wendy Reid: we’ll do some research

Matt Garrish: i can take a look at it, yes

Resolution #2: embedded Data URLs are OK, but data URLs in <a> are forbidden; reading systems must not navigate to such URLs.

4. TAG Explainer

See github pull request #1521.

Dave Cramer: I haven’t really worked on this in a while, but there are a bunch of edits still to be made based on Ivan’s comments
… if anybody else here has anything to add, that would be okay
… otherwise, let’s just defer this?

Wendy Reid: this PR is for an explainer that we submit to TAG alongside spec

Dave Cramer: its mostly about security and privacy issues
… after I make the edits for Ivan’s comments, we’re going to need input from everyone

Brady Duga: yes, I can take a look

5. FXL Accessibility

Wendy Reid: we’ve got significantly fewer open issues!
… there’s been a little bit of discussion about the fxl a11y stuff

Wendy Reid: See Wiki page on our repo.

Dave Cramer: See Ken Jones’ doc on FXL A11Y.

Dave Cramer: Ken wrote a document about that

Wendy Reid: this doc is a collection of some of the proposals we’ve gathered about FXL a11y
… won’t be part of spec
… falls more under best practice
… will probably end up as a note of some kind
… encourage anyone with interest to comment
… will probably have a meeting to discuss this, or a separate TF
… and we should start talking about a virtual face-to-face soon

Dave Cramer: George and Ben have posted about FXL a11y to email thread

6. AOB

Wendy Reid: we’re also thinking of doing a special session about scripting in epub
… hash out scripting-specific issues
… this would be more open, so that we could have RS and production people who aren’t in the WG participate

Dave Cramer: okay thanks everyone, it feels like we’ve made some progress today
… and remember this call’s time is pinned to JP time, but other call time is pinned to MIT campus
… so this call time has shifted for daylight savings, but other one has not

Wendy Reid: and everyone should have received calendar invites via email
… with the correct times

Dave Cramer: we’ll talk to you soon!

7. Resolutions

Resolution #1: Accept PR 1573
Resolution #2: embedded Data URLs are OK, but data URLs in <a> are forbidden; reading systems must not navigate to such URLs.