TPAC Summary: DPUB F2F Day 1 (Overview, Anno, EPUB 3.1, CSS, POM, ServiceWorkers)

Full minutes are available

What a turnout! We had about 10 of the regular IG participants in Sapporo. At all points, we had at least 20 people present, sometimes closer to 30. This shows the growth and impact of the DPUB IG. Ivan commented that in Shenzhen (just 2 years ago), few had heard of us. Dave pointed out that of all things happening at TPAC (and there are so many things happening at TPAC at once), several people considered DPUB to be the most interesting. Maybe it was the cream puffs and Pocky! Thanks to all who contributed, scribed, memed, called in, and provided Japanese sugar.
After three days of meetings with others and two days of DPUB meeting, my biggest takeaway for DPUB is that we don’t yet have a clear idea of what a PWP manifest must/may/should include. Without understanding this, it is difficult to move forward with several of the topics discussed. So we have a lot of work ahead of us, though we have accomplished a lot already. Here’s a summary of a great few days.

Annotations
Rob Sanderson provided an overview of the Annotations WG model. We discussed the TextFinder API (formerly the Rangefinder API), which accomplishes both search and locate in the URL. Doug Schepers explained that this stores hashes, not strings. Character offsets are possible. The group is also exploring other selectors, including XPath and CSS Selectors. DPUB and Anno should remain in contact, especially if we know of real-world implementations.

Summary of current work:
Take a look at the minutes to see how much we have already accomplished and how much is in progress. Here is a quick list:

  • PWP: we outlined a vision. Now we must work toward functionality
  • CSS: published modules, priorities list, keep the highly-informed input coming
  • DPUB-ARIA: module exists. People are eager to use it in EPUB world as well as in scholarly publishing
  • A11y TF: working with ARIA WG to get extended descriptions right and then point out other issues specific to publishing or where publishing can contribute
  • STEM TF: a lot of exploratory work, next steps will probably be around the future of math on the web. Major outcome is that there is a need for those with understanding of Math/MathJax/polyfills to talk to Houdini
  • Metadata: published interviews, learned that publishers use metadata heavily, need some rights expression/management, and maybe make metadata more aligned w OWP

The STEM item became a fascinating discussion about the intersection of Math and CSS and the need for communication between those who will implement Houdini and those working on Houdini. End result: MathJax has done a great deal of research regarding MathML and polyfills. Houdini wants to know and wants to talk to you.

EPUB 3.1 overview
Dave and Tzviya offered an overview of the ongoing work at IDPF on the first major revision of EPUB (see slides). Some suggestions from this discussion:

  • Reconsider schedule (note: Ivan pointed out that IDPF and W3C have different modes of working, and this was not really up for discussion)
  • Bring in libraries, especially wrt metadata (Heather and Lars offer to help)
  • Do not deprecate elements. Kill them. Deprecation will cause problems
  • Assess what is the interoperable core of EPUB 3.0.1 to determine the best way to move forward with EPUB 3.1.
  • CSS Profile: snapshot may not be best option because it includes CSS specs that are rec level or almost rec level. It would be unwise to require all UAs to support all of snapshot. Good starting point though.

Meeting with CSS WG
Dave Cramer led a discussion of CSS priorities. He chose to skip the topic of pagination, because it’s too big. The group covered several topics, and the CSS WG wants more detailed examples from DPUB for all of these items. It is important for DPUB to file bugs as well. Need help with samples or filing bugs? We have several members of CSS WG among us, and they are friendly. Don’t hesitate to ask.

  • Table alignment: CSS WG asks DPUB what is missing? Send your sample tables to Dave and Florian. (note that David Baron filed issues on https://drafts.csswg.org/css-text-4/#character-alignment while we spoke)
  • a11y of generated content: There is concern that generated content is not accessible. CSS WG concludes this is an implementation bug, and DPUB should file implementation bugs.
  • Hyphenation control: There was much discussion about parameters that control hyphenation, line breaking, line balance, and how this affects performance. Discussion pointed to this being an issue with line breaking, not hyphenation, which means that it would not affect performance and is an issue for Houdini.
  • Keeping image and caption together in paged view: This pains the publishing industry. Fantasai wrote some CSS using flexbox. Dave is testing it out.

Actions: DPUB should not hesitate to file bugs. If you need help, ask members of CSS WG. If they don’t know about issues, they can’t fix them. Provide specific examples, not just complaints. Explain reasoning, not just requests. Communicate often. (These are friendly people who also want beautiful typography.)

POM
Daniel Glazman presented his proposal for implementing POM, a Publication Object Model for EPUB. It is a framework for resources, packaging, metadata, authoring, and reading on the Web with an abstraction layer to hide the “publication manager interface”. This layer can reach the individual resources as needed. Individual resources can be anything: HTML, PDF, futureFormat. The only thing common to all publications is the manager. The specific file type is a plugin to POM. There will be an open source framework that implements POM in C++, JavaScript, and potentially other languages, such as Swift and Python.
Next steps: Assess different publication formats, what is common to all of them, how the components are connected. Daniel created POM CG.
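The plugin idea described above can be illustrated with a minimal sketch. This is not Daniel's API: every name here (FormatPlugin, PublicationManager, canOpen, listResources) is hypothetical and invented only to show a manager that is common to all publications while each file type is handled by a plugin.

```typescript
// All names below are hypothetical; the POM proposal summarized here does not define them.

/** A format plugin knows how to recognize and list resources for one publication format. */
interface FormatPlugin {
  format: string; // e.g. "epub" or "pdf"
  canOpen(fileName: string): boolean;
  listResources(fileName: string): string[];
}

/** The manager is the one piece shared by every publication format. */
class PublicationManager {
  private plugins: FormatPlugin[] = [];

  register(plugin: FormatPlugin): void {
    this.plugins.push(plugin);
  }

  /** Uses the first registered plugin that accepts the file; returns null if none does. */
  open(fileName: string): { format: string; resources: string[] } | null {
    for (const plugin of this.plugins) {
      if (plugin.canOpen(fileName)) {
        return { format: plugin.format, resources: plugin.listResources(fileName) };
      }
    }
    return null;
  }
}

// Toy plugin with a fixed resource list; a real one would parse the package itself.
const epubPlugin: FormatPlugin = {
  format: "epub",
  canOpen: (name) => /\.epub$/i.test(name),
  listResources: () => ["META-INF/container.xml"],
};
```

Callers only ever talk to the manager, so adding a new format would mean registering another plugin rather than changing the reading code.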

Service Workers
Guest: Jake Archibald, picking up where we left off after IG call on 19 Oct.
The group asked Jake many questions about Service Workers, including whether our thought experiment in PWP makes sense. One hot topic is that SW requires https, not a local file. The exception is localhost, which can use http. SW cannot use the file: protocol. There is a same-origin policy for the Worker script, and the requests intercepted by the SW are limited to a scope defined by the script location (although as Jake pointed out, this can be configured via the “Service-Worker-Allowed” HTTP header).
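The rules in this paragraph can be written down as a few small checks. This is a rough sketch, not browser code: the function names are hypothetical, the local-host list is an assumption (browsers apply a fuller "secure context" test), and the scope check is a simplified origin-plus-path-prefix comparison.

```typescript
// Hypothetical helpers for illustration; none of these are part of the Service Worker API.

/** True if a page at this URL could register a Service Worker under the rules above. */
function canUseServiceWorker(pageUrl: string): boolean {
  const url = new URL(pageUrl);
  if (url.protocol === "https:") return true;
  // Assumption: plain http is accepted only for these local development hosts.
  const isLocal = url.hostname === "localhost" || url.hostname === "127.0.0.1";
  return url.protocol === "http:" && isLocal; // file: and remote http are rejected
}

/** Widest scope allowed by default: the directory that contains the worker script. */
function defaultMaxScope(scriptUrl: string): string {
  return new URL("./", scriptUrl).href;
}

/** Simplified scope test: same origin, and the path starts with the scope path. */
function isInScope(targetUrl: string, scope: string): boolean {
  const target = new URL(targetUrl);
  const scopeUrl = new URL(scope);
  return target.origin === scopeUrl.origin &&
    target.pathname.indexOf(scopeUrl.pathname) === 0;
}
```

For a script served from https://example.org/books/sw.js (an invented URL), the default scope is https://example.org/books/, so a URL under /other/ on the same site falls outside it unless the server widens the scope with the “Service-Worker-Allowed” header.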
Daniel Weck explained how a prototype implementation of the Readium web reader uses SW to intercept HTTP requests for files located inside EPUB archives (using a URL path convention like http://domain.com/ebook.epub/META-INF/container.xml). HTTP responses are created by extracting / inflating file contents on-the-fly (using a JavaScript zip library and HTTP byte range requests). As a result, the web browser can consume EPUB-bundled resources transparently via “normal” URLs, with no reliance on the “Blob URI trick” currently used by Readium for accessing files inside EPUB archives (which requires complex prefetching and pre-processing, thereby interfering with webview features such as caching and streaming). See the Readium repo for details.
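To make the URL path convention concrete, here is a minimal sketch of how an intercepted request URL might be split into the archive location and the path of the file inside it. This is not Readium's code: splitDeepUrl and DeepUrlParts are hypothetical names, and treating ".epub/" as the boundary is an assumption based only on the example URL above.

```typescript
// Hypothetical helper; the actual Readium prototype may parse URLs differently.

interface DeepUrlParts {
  archiveUrl: string; // where the .epub file itself lives
  innerPath: string;  // path of the requested file inside the archive
}

/** Splits a "deep" URL at the first ".epub/" segment; returns null for ordinary URLs. */
function splitDeepUrl(requestUrl: string): DeepUrlParts | null {
  const url = new URL(requestUrl);
  const marker = ".epub/";
  const idx = url.pathname.indexOf(marker);
  if (idx === -1) return null; // not a request into an archive

  // Keep ".epub" in the archive path but drop the trailing slash.
  const archivePath = url.pathname.slice(0, idx + marker.length - 1);
  const innerPath = url.pathname.slice(idx + marker.length);
  if (innerPath === "") return null; // nothing requested inside the archive

  return { archiveUrl: url.origin + archivePath, innerPath: innerPath };
}
```

Applied to http://domain.com/ebook.epub/META-INF/container.xml, this yields the archive URL http://domain.com/ebook.epub and the inner path META-INF/container.xml. A Service Worker fetch handler could then fetch byte ranges of the archive and build a response from the inflated entry.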
Readium’s experiment with SW aims at transparently extracting + serving packaged content (i.e. from a zip archive), based on a “deep” URL syntax that references bundled resources. By contrast, the proposed use of SW in PWP focuses on enabling a seamless online-offline reading experience for regular content URLs. The PWP use-case requires some sort of offline local storage. There is a cache API accessible within SW, so this can be used to manage offline copies of remote resources.
There was also much discussion of benefits of SW for publications wrt security considerations and the ultimate goals of the industry. Jake requests use cases, specifically what is hard on web vs native + web view? As hybrid native apps do not normally need a built-in HTTP server to feed content to the webview, and as native URL protocol handlers can be implemented to manage access to bundled or filesystem resources, what is the role of SW in this context?
After the session, Jake created an offline-enabled publication available at https://github.com/jakearchibald/ebook-demo. Test it and send your feedback.
