EPUB 3 Working Group vF2F, 1st Day — Minutes
Date: 2021-05-27
See also the Agenda and the IRC Log
Attendees
Present: Dave Cramer, Wendy Reid, Masakazu Kitahara, Ben Schroeter, Matthew Chan, Zheng Xu (徐征), Brady Duga, Shinya Takami (高見真也), Dan Lazin, Marisa DeMeglio, Ryo Kuroda
Regrets:
Guests:
Chair: Dave Cramer, Wendy Reid
Scribe(s): Brady Duga, Matthew Chan
Content:
- 1. satellite specs
- 2. What does it mean to “support” a foreign resource? (issue epub-specs#1464)
- 3. HTML Serialization
- 4. iframes and external content
- 5. Task Forces Update
1. satellite specs
Dave Cramer: https://w3c.github.io/publ-cg/
Wendy Reid: We have a long tail of specs
… Some we have discussed, eg multiple renditions, a11y, cfi
… Some we have not discussed
… We should figure out the fate of these docs
… Currently they are in the archives, and will always be there regardless of what we decide
… Some are now moved to notes because we are discussing them
… What do we do with the ones that we have yet to pull in?
… Could be leave them alone, could be to pull them all in
… personally, think we should only pull in the ones we plan to change/discuss, otherwise leave them alone
… If we leave them, they continue to exist but are not w3c docs
Dave Cramer: We have an obligation to inform people which of these are viable
… Most of them are not
… If you make an epub against one of these specs, you will waste time since they are not implemented
Dan Lazin: Question about what we have pulled in. Looks like about 8
… Not clear what we are trying to do with them from a w3c specs perspective
Wendy Reid: All docs that are currently notes are not on the rec track
… They never have to become recs. The ones on the rec track are going through the rec process
… Notes can be elevated, eg CFI could move to rec track
Brady Duga: is it easy to make something a note?
… so it doesn’t hurt us to leave something where it is, right?
Wendy Reid: yes
Brady Duga: does leaving them in the archive achieve the end of signifying that these are dead?
Dave Cramer: i was thinking of something more than that
… especially with stuff like EDUPUB
Dan Lazin: I do want consistency and clarity
… If something is in an archive, it is unclear; people have to muddle through the history to figure out if it is supported
… Fragmentation is bad, we should consider the user and pull in everything we think might be alive and add a note to the clearly dead ones
Wendy Reid: +1
Dan Lazin: Aim for being as tidy and helpful to new users as possible
Dave Cramer: Strongly agree
… It is easier to change things in w3c web space, as opposed to IDPF web space
… Feel more comfortable if we made the w3c stewardship more obvious
… Might be good to point to w3c from idpf page, then pull in all specs as notes, with a specific comment about status
Wendy Reid: A single change that says “all docs are moved to www.xxx.yyy” is easier than lots of changes
Dave Cramer: Even the IDPF home page isn’t that clear that it is over
… Would like to get a feel from the WG what the general principles are
… then on chairs calls we can decide on specifics, since there is a lot of w3c admin stuff
Wendy Reid: Question for the group, what do we think the best direction is?
… Migrate all to w3c space with updates with disclaimer at IDPF, OR do we leave them in IDPF space?
… Currently epub search goes to w3c
… Of the current list, migrated are 3.3, 3.2, a11y, multiple renditions, and CFI
… all are note or rec track
… not moved are legacy (3.1, etc) and a whole bunch of docs we all forgot about
Dave Cramer: Scriptable components never really implemented, Adaptive stuff was too CSSy
Brady Duga: Softpress(?) implemented a bunch of these satellite specs
Wendy Reid: Criteria is, will we use it?
… we could pull them all in and add a deprecation note
Dave Cramer: Pulling a spec over only if we need it makes sense, and it is less work overall
… but batch change might make it clearer in the idpf site
Brady Duga: i’m leaning towards pulling everything in, and adding deprecation notice
… because right now it’s not clear who owns these specs
… on the other hand, this could clutter up W3C notes
Dave Cramer: I agree
Wendy Reid: No matter what, there will be a note saying it is a dead spec
… Real question is do we want to do it at w3c or idpf
… logistical question is where it is easiest/best to do it?
… takeaway is go to Ivan and see what is easiest
Dave Cramer: Makes sense, we have an end in mind, need to talk to Ivan and Matt about the means
2. What does it mean to “support” a foreign resource? (issue epub-specs#1464)
See github issue #1464.
Dave Cramer: When writing some spec tests, one of the foundational aspects is a core media type
… ie something that does not need a fallback
… so manifest fallbacks are also foundational
… Wrote some fanciful tests with docbook, binary files, etc
… No reading system implemented manifest fallbacks
… Some of the behavior was not ideal for the end user
… for instance a .dmg file was downloaded to the local system [Scribe note: !!!]
… Looks like everyone just throws it at a webview and lets the webview handle it
… Even Readium doesn’t handle them
… What are the implications of a core RS feature not working in the real world?
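For reference, the manifest fallback mechanism under discussion can be sketched as follows. This is a minimal illustration, not something presented at the meeting: the sample package document, the item ids, and the helper names `resolve_fallback` and `spine_resolution` are hypothetical, and `SPINE_SAFE_TYPES` is only a two-entry stand-in for the spec's list of types that need no fallback.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Assumption: illustrative subset only; the EPUB spec defines the real list.
SPINE_SAFE_TYPES = {"application/xhtml+xml", "image/svg+xml"}

# Hypothetical package document, loosely modelled on the "fanciful" tests
# mentioned above (a DocBook chapter and a binary file in the spine).
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="ch1-docbook" href="ch1.dbk"
          media-type="application/docbook+xml" fallback="ch1-xhtml"/>
    <item id="ch1-xhtml" href="ch1.xhtml" media-type="application/xhtml+xml"/>
    <item id="blob" href="installer.dmg" media-type="application/octet-stream"/>
  </manifest>
  <spine>
    <itemref idref="ch1-docbook"/>
    <itemref idref="blob"/>
  </spine>
</package>"""


def resolve_fallback(items, item_id):
    """Follow the fallback chain starting at item_id.

    Returns the href of the first item whose media type is in
    SPINE_SAFE_TYPES, or None if the chain ends or loops first.
    """
    seen = set()
    current = item_id
    while current is not None and current in items and current not in seen:
        seen.add(current)
        item = items[current]
        if item.get("media-type") in SPINE_SAFE_TYPES:
            return item.get("href")
        current = item.get("fallback")
    return None


def spine_resolution(opf_text):
    """Map each spine idref to the href it resolves to (or None)."""
    root = ET.fromstring(opf_text)
    items = {item.get("id"): item for item in root.iter(OPF_NS + "item")}
    return {
        ref.get("idref"): resolve_fallback(items, ref.get("idref"))
        for ref in root.iter(OPF_NS + "itemref")
    }


RESULT = spine_resolution(SAMPLE_OPF)
print(RESULT)
```

In this sketch the DocBook item resolves to its XHTML fallback, while the binary item has no chain to follow and resolves to nothing, which is the case a reading system would have to handle some other way.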
Shinya Takami (高見真也): In Japan, in some cases manifest fallback is implemented, but may be domain specific
… Would like to discuss with Voyager people in Japanese
… [Japanese]
Masakazu Kitahara: [via shiestyle] Voyager’s RS does not support this, but in some places in Japan we do use the feature
Brady Duga: we have two pipelines at Google, 1 for publishers, and 1 for people sideloading
… the one for publishers is better for support for this type of thing
Wendy Reid: Don’t think it is supported by Kobo
Brady Duga: dauwhe can you make your sample epubs available?
Dave Cramer: yes, i’ll let you know where
Dan Lazin: I have some tests that are not checked in, because I don’t know what the proper behavior is supposed to be
… Do we need a graveyard for these sorts of things? Since I can’t set the “does it pass” field
Dave Cramer: The tests are great, as it points to where we should be investigating
… Seems like a case where I would like more semi-official information on what RSes support/claim to support with regard to manifest fallbacks
… If no one implements it, we need to have hard conversations
… If there aren’t implementations we need to remove it
… But need to keep concept of core media type
Wendy Reid: Since we now have several tests without clear passes
… Does it make sense to make a list and send the tests out to the community pre-CR
… Typically this is done during CR phase, but would be good to know where we are in trouble
Dan Lazin: To pursue, in my test sheet, I have some blank cells so we can add notes there
… Can use that as a way to corral a list
Brady Duga: this sounds like a problem, but if nobody has implemented manifest fallbacks, then maybe we just say that you can only use core media types in the spine, period
Dave Cramer: Part of this is how epub has changed over time
… used to have the idea of general epub container, but that is not what really happened
… Have several next steps to gather info and look into hard to test tests
… Have enough useful takeaways
Ben Schroeter: Where did we leave the first conversation?
Dave Cramer: We have the goal of communicating the status, but where to do that is still an open question
… Will check with Ivan and Matt to figure out the best path forward
… Break now?
Wendy Reid: Yes!
Dave Cramer: Brady can eat dinner
Wendy Reid: Will reconvene at the upcoming whole hour
3. HTML Serialization
See github issue #636.
Dave Cramer: i’ve made epubs that have used HTML that was not well-formed XHTML
… sometimes it works
… but we hear that RS is sometimes built using XML toolchains, and would have to be reworked if they can’t expect well-formed XHTML
… there are arguments in favor, but I have not felt large support from authors, publishers, or RS
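The XML toolchain concern can be illustrated with a short sketch. It assumes nothing about any particular reading system; it only shows that a stock XML parser accepts well-formed XHTML and rejects markup that relies on HTML tag omission. The sample documents and the helper name `parses_as_xml` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Well-formed XHTML: every element is explicitly closed and properly nested.
XHTML_DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>one<br/></p><p>two</p>"
    "</body></html>"
)

# Acceptable to an HTML parser (void <br>, implied </p>), but not
# well-formed XML.
HTML_DOC = "<html><body><p>one<br><p>two</body></html>"


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


XHTML_OK = parses_as_xml(XHTML_DOC)
HTML_OK = parses_as_xml(HTML_DOC)
print(XHTML_OK, HTML_OK)  # True False
```

A pipeline built on a parser like this would need a separate HTML parser, plus a way to tell the two kinds of content apart, before it could ingest the second document.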
Brady Duga: sounds like a lot of work, so maybe we should wait until there is need for it
Dave Cramer: in my mind epub 4 doesn’t have to worry about backwards compatibility with epub 3, chief among which is support for XHTML
Shinya Takami (高見真也): i have no objection, but we have to consider compatibility with existing epubs
… so we must differentiate between HTML5 epub and XHTML5
… now might be a good time to start talking about how spec would have to change
Dave Cramer: we would never mandate a change to HTML5
… compatibility issues would arise where people begin to try to open new HTML5 epubs in older RSes
… has Kobo looked into this?
Wendy Reid: when I looked into it I specifically talked to ingestion side (where most of the problems would pop up)
… they said it probably wouldn’t be too bad
… we’d have to add in new libraries for parsing HTML, and maybe do some additional validation
… not impossible, but work would be involved
… agree that we might have to identify HTML5 epubs separately
… speaking for Kobo, we have a long tail for device support
Dave Cramer: comment from Daniel Weck was that Readium would experience several issues if we did this
… also issue with epub:type, given that it is namespaced, it won’t carry over to HTML serialization
… another issue was about CFI, which uses a very XPath-like syntax to point to places in XML files
… concern that that might break in HTML serialization
… but it might break in XHTML serialization too (e.g. parser inserting tbody element into the DOM)
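The tbody concern can be made concrete with a simplified sketch. It is not a CFI implementation: it handles only even-numbered element steps (step N selects the N/2-th child element) and leaves out text steps, ID assertions, and character offsets. The helper name `resolve_element_steps` and both sample tables are hypothetical, and the tree with the extra tbody is built by hand to stand in for what a parser might produce.

```python
import xml.etree.ElementTree as ET


def resolve_element_steps(root, steps):
    """Walk even-numbered, CFI-style child steps starting from root.

    Simplification: only element steps are supported; real CFIs also
    carry text steps, ID assertions and character offsets.
    """
    node = root
    for step in steps:
        if step % 2 != 0:
            raise ValueError("only even (element) steps are supported here")
        node = list(node)[step // 2 - 1]
    return node


# The table as the author serialized it.
AS_AUTHORED = ET.fromstring("<table><tr><td>cell</td></tr></table>")

# The same table with an implied tbody added (hand-built stand-in).
WITH_TBODY = ET.fromstring(
    "<table><tbody><tr><td>cell</td></tr></tbody></table>"
)

# table -> first child element -> that element's first child element
STEPS = [2, 2]

AUTHORED_TARGET = resolve_element_steps(AS_AUTHORED, STEPS).tag
TBODY_TARGET = resolve_element_steps(WITH_TBODY, STEPS).tag
print(AUTHORED_TARGET, TBODY_TARGET)  # td tr
```

The same step sequence lands on the cell in one tree and on the row in the other, so a path computed against the authored markup no longer points at the same place once an element is inserted.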
Brady Duga: CFI was intended to work with the text, not with the DOM
… so yes, that issue could arise
… this is a hard topic because of how much work would be involved, and the lack of a clear reason to do it, especially vs all the other features we could be working on
Dave Cramer: the other argument that has been put forward for HTML is that HTML tools could be used, but i’ve never seen an example of one that works on HTML and breaks on XHTML
Proposed resolution: Defer HTML serialization to EPUB 4, close issue 636 (Wendy Reid)
Brady Duga: if we’re deferring, should it be closed?
Brady Duga: +1
Shinya Takami (高見真也): +1
Wendy Reid: +1
Masakazu Kitahara: +1
Ben Schroeter: +1
Wendy Reid: i think the idea for epub 4 is that we start with clean slate
Matthew Chan: +1
Marisa DeMeglio: +1
Wendy Reid: not resolved yet. Will return to this with tomorrow’s group.
4. iframes and external content
See github issue #1061.
Dave Cramer: example use case would be embedding youtube video in iframe
… right now epub doesn’t like this
… there are security issues here too
… this issue was raised almost 3 years ago, but I haven’t felt strong push for this from publishers, and I’m not sure why
Ben Schroeter: I think publishers may be doing it already
Dave Cramer: I think there is a fairly high number of epubs with video content in higher learning books
Brady Duga: but are those videos embedded in the epub, or a link to a youtube video that is supposed to open in a player?
… from a security perspective, we disable networking in our webviews (at least on Android)
… so this would be a fairly major change
Shinya Takami (高見真也): the epub spec should allow this external content, and RS should take care of making associated alerts to users
… if we allow this sort of foreign resource, then RS should check this for users
… the ability to have this sort of external content could be valuable to users, but for security reasons, RS should take care of the related risks
Dave Cramer: that puts a lot of burden on RS to evaluate every URL of this nature and try to figure out if it is safe or not
Shinya Takami (高見真也): the alert could just tell users that they are about to access a foreign resource, and that it may be dangerous
Brady Duga: users might be scared away by such alerts
… and enabling network access means more than dealing with the linked URL
… it could also present privacy issues in addition to the security issues
… more so if we also enable scripting
… authors could be trying to report all sorts of user behaviour
Wendy Reid: agreed
… we’d then have to address all of these issues in the security and privacy review
… in the issue Ken presented use cases, e.g. RS sending quiz results to a server somewhere
… a valid use case, but also one which suggests an untold number of nefarious uses
… within something like the VitalSource system, the user knows they can trust foreign resources also in the system
… but the more generalized use cases are risky
Shinya Takami (高見真也): how about adding features to RS to allow user to toggle whether to permit or deny things like foreign resources, scripting?
Wendy Reid: i think that’s outside of scope. We can’t tell RS how to handle privacy.
… and sure, users could then choose to provide informed consent, but it presents an uncomfortable situation for both the user, and probably the publisher
Brady Duga: it also assumes that the user is legally able to give consent to whatever might happen
Dave Cramer: in some circles i’ve heard concern over how web handles this sort of issue today (e.g. ubiquitous cookie consent pop-ups)
… but we will continue this discussion tomorrow. It doesn’t feel like we are close to consensus right now. But we’ve raised good desires and concerns
5. Task Forces Update
Wendy Reid: https://w3c.github.io/epub-specs/epub33/fxl-a11y/
Wendy Reid: https://w3c.github.io/epub-specs/epub33/locators/
Wendy Reid: the TFs are both developing their own WG notes
… the FXL a11y group is writing documentation to provide concrete guidance on how to produce more accessible FXL content
… the link above will give you idea of structure and topics
… the other TF is debating re-using CFI as a locating method
… after a lot of discussion we’ve decided to pause, and put together a list of use cases to understand the problem space of locations in epub
… and we’re working towards potentially coming up with some sort of algorithm that creates segments within an epub that resemble pages but are not pages
… i.e., a fallback location scheme should a pagelist not be provided (because epub is digital first and has no physical version, or they just left out pagelist to physical correspondence)
… document will also have solutions for those use cases
… might result in us recommending that we add something to spec, but we’re not there yet
Dave Cramer: about that algorithm, would it be a script that you use on an epub? Or is it something that RS implements?
Wendy Reid: it would be script run by author, or maybe at distribution level
… we want the locations to be as platform agnostic as possible
… and we’d want the locations to be roughly identical to each other if the same book is distributed through various different platforms
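As a purely hypothetical sketch of what such an author-side script might look like: the TF had not settled on an algorithm, so the choice of fixed character counts, the `segment_size` default, and the function name `segment_locations` are all assumptions made for illustration. The point is only that a deterministic rule applied to the content itself gives the same locations on every platform.

```python
def segment_locations(chapters, segment_size=1000):
    """Return (chapter_index, char_offset) markers, one per segment.

    Characters are counted across chapters in reading order, and a marker
    is placed every `segment_size` characters. Assumption: plain character
    counts; a real scheme might count words or bytes, or skip markup.
    """
    locations = []
    consumed = 0        # characters in chapters already processed
    next_boundary = 0   # absolute position of the next marker
    for index, text in enumerate(chapters):
        while next_boundary < consumed + len(text):
            locations.append((index, next_boundary - consumed))
            next_boundary += segment_size
        consumed += len(text)
    return locations


# Two made-up chapters: 2500 and 1000 characters long.
SAMPLE_CHAPTERS = ["a" * 2500, "b" * 1000]
LOCATIONS = segment_locations(SAMPLE_CHAPTERS)
print(LOCATIONS)  # [(0, 0), (0, 1000), (0, 2000), (1, 500)]
```

Because the markers depend only on the text and the segment size, running the script at authoring time or at distribution time would yield the same list.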
Dave Cramer: is the FXL TF still working on the idea of alternate style sheets?
Wendy Reid: yes, and we have an explainer doc for that
… but it is in very early stage of draft
Wendy Reid: See Visual to Textual Explainer
Wendy Reid: there are lots of a11y solutions for people who use AT to parse DOM, but not many solutions for low vision users
… this proposal creates a method by which content creators can provide secondary style sheet for their FXL, that would allow compatible RS to switch reading mode from FXL to reflow
… then user could change font size, font face, etc.
… while preserving the reading order
… but there is still a lot to hash out
Dave Cramer:
    @media (-epub-prefers-reflow: reflow) {
      /* styles for reflow */
      @viewport { size: auto; }
      body { width: auto; height: auto; }
      p, div, h1, h2 { position: relative; }
    }
Wendy Reid: one of the challenges we have is EUAA, which comes into force in 2025, and specifically states that content should be alterable by user
… under those requirements FXL would not pass as it is today
… lends additional importance to something like this
Dave Cramer: interesting, might be able to do link rel=alternate stylesheet
… also media queries for user preferences (prefers dark scheme, prefers low motion)
… so a lot of this might be able to be achieved using existing web technologies
Brady Duga: this is not legal advice from me nor my employer, but I think there is carveout in EUAA for content that cannot be reasonably modified to comply
… e.g. a 1967 Spiderman comic book
Wendy Reid: right, like some kind of undue burden clause
… one of the challenges is just the breadth of content that becomes FXL
… e.g. novels that are FXLs
… use cases like that will probably have a hard time bringing themselves within that carveout
… this method will not work for all types of content, but for things like cookbook or textbook, where there is a lot of text alongside pictures, it could work
Brady Duga: it could work really well for some books
… e.g. where publisher feels that layout is critical to the book, but reflow just makes it easier to read for the end user
Dave Cramer: is there an issue where a spread extends across both pages?
Wendy Reid: i wonder if we’re going to have to provide some sort of stitching hint
Dave Cramer: i see implementation issues with this
Wendy Reid: the other question was what to do with the images
… in a particularly image heavy book, there might be a background image, with some images overlaid, and then a block of text too
… so what happens if the background image is semantically relevant?
… or what happens if a single image runs across the gutter in a spread?
… might lose some of the meaning
Dave Cramer: we’ll continue tomorrow