TPAC Summary: DPUB F2F Day 1 (Overview, Anno, EPUB 3.1, CSS, POM, ServiceWorkers)

Full minutes are available

What a turnout! We had about 10 of the regular IG participants in Sapporo. At all points, we had at least 20 people present, sometimes closer to 30. This shows the growth and impact of the DPUB IG. Ivan commented that in Shenzhen (just 2 years ago), few had heard of us. Dave pointed out that of all things happening at TPAC (and there are so many things happening at TPAC at once), several people considered DPUB to be the most interesting. Maybe it was the cream puffs and Pocky! Thanks to all who contributed, scribed, memed, called in, and provided Japanese sugar.
After three days of meetings with others and two days of DPUB meetings, my biggest takeaway for DPUB is that we don’t yet have a clear idea of what a PWP manifest must/may/should include. Without understanding this, it is difficult to move forward with several of the topics discussed. So, while we have accomplished a lot already, we have a lot of work ahead of us. Here’s a summary of a great few days.

Rob Sanderson provided an overview of the Annotations WG model. We discussed the FindText API (formerly RangeFinder API), which accomplishes both search and locate in the URL. Doug Schepers explained that this stores hashes, not strings. Character offsets are possible. The group is also exploring other selectors, including XPath and CSS Selectors. DPUB and Anno should remain in contact, especially if we know of real-world implementations.

Summary of current work:
Take a look at the minutes to see how much we have already accomplished and how much is in progress. Here is a quick list:

  • PWP: we outlined a vision. Now we must work toward functionality
  • CSS: published modules, priorities list, keep the highly-informed input coming
  • DPUB-ARIA: module exists. People are eager to use it in EPUB world as well as in scholarly publishing
  • A11y TF: working with ARIA WG to get extended descriptions right and then point out other issues specific to publishing or where publishing can contribute
  • STEM TF: a lot of exploratory work, next steps will probably be around the future of math on the web. Major outcome is that there is a need for those with understanding of Math/MathJax/polyfills to talk to Houdini
  • Metadata: published interviews, learned that publishers use metadata heavily, need some rights expression/management, and may want to align metadata more closely with OWP

The STEM item became a fascinating discussion about the intersection of math and CSS and the need for communication between those who will implement Houdini and those working on Houdini. End result: MathJax has done a great deal of research regarding MathML and polyfills. Houdini wants to know and wants to talk to you.

EPUB 3.1 overview
Dave and Tzviya offered an overview of the ongoing work at IDPF on the first major revision of EPUB (see slides). Some suggestions from this discussion:

  • Reconsider schedule (note: Ivan pointed out that IDPF and W3C have different modes of working, and this was not really up for discussion)
  • Bring in libraries, especially wrt metadata (Heather and Lars offer to help)
  • Do not deprecate elements. Kill them. Deprecation will cause problems
  • Assess what is the interoperable core of EPUB 3.0.1 to determine the best way to move forward with EPUB 3.1.
  • CSS Profile: snapshot may not be best option because it includes CSS specs that are rec level or almost rec level. It would be unwise to require all UAs to support all of snapshot. Good starting point though.

Meeting with CSS WG
Dave Cramer led a discussion of CSS priorities. He chose to skip the topic of pagination, because it’s too big. The group covered several topics, and the CSS WG wants more detailed examples from DPUB for all of these items. It is important for DPUB to file bugs as well. Need help with samples or filing bugs? We have several members of CSS WG among us, and they are friendly. Don’t hesitate to ask.

  • Table alignment: CSS WG asks DPUB: what is missing? Send your sample tables to Dave and Florian. (Note that David Baron filed issues while we spoke.)
  • a11y of generated content: There is concern that generated content is not accessible. CSS WG concludes this is an implementation bug, and DPUB should file implementation bugs.
  • Hyphenation control: There was much discussion about parameters that control hyphenation, line breaking, line balance, and how this affects performance. Discussion pointed to this being an issue with line breaking, not hyphenation, which means that it would not affect performance and is an issue for Houdini.
  • Keeping image and caption together in paged view: This pains the publishing industry. Fantasai wrote some CSS using flexbox. Dave is testing it out.

Actions: DPUB should not hesitate to file bugs. If you need help, ask members of CSS WG. If they don’t know about issues, they can’t fix them. Provide specific examples, not just complaints. Explain reasoning, not just requests. Communicate often. (These are friendly people who also want beautiful typography.)

Daniel Glazman presented his proposal for implementing POM, a Publication Object Model for EPUB. It is a framework for resources, packaging, metadata, authoring, and reading on the Web with an abstraction layer to hide the “publication manager interface”. This layer can reach the individual resources as needed. Individual resources can be anything: HTML, PDF, futureFormat. The only thing common to all publications is the manager. The specific file type is a plugin to POM. There will be an open source framework that implements POM in C++, JavaScript, and potentially other languages, such as Swift and Python.
Next steps: Assess different publication formats, what is common to all of them, and how the components are connected. Daniel created a POM CG.

Service Workers
Guest: Jake Archibald, picking up where we left off after IG call on 19 Oct.
The group asked Jake many questions about Service Workers, including whether our thought experiment in PWP makes sense. One hot topic is that SW requires https and does not work with local files. On localhost, plain http is allowed, but SW cannot use the file: protocol. There is a same-origin policy for the Worker script, and the requests intercepted by the SW are limited to a scope defined by the script location (although as Jake pointed out, this can be configured via the “Service-Worker-Allowed” HTTP header).
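The origin and scope rules above can be sketched in a few lines of TypeScript. This is only an illustration of the matching logic under simplifying assumptions, not what a browser actually runs: the function names (isAllowedOrigin, defaultScope, controls) and the example URLs are hypothetical, and the scope test is reduced to a same-origin path-prefix check.

```typescript
// Hypothetical helpers illustrating the rules discussed at the session.

// A worker script is eligible when served over https, or over http from
// localhost. file: URLs are never eligible.
function isAllowedOrigin(scriptUrl: string): boolean {
  const url = new URL(scriptUrl);
  if (url.protocol === "https:") return true;
  return (
    url.protocol === "http:" &&
    (url.hostname === "localhost" || url.hostname === "127.0.0.1")
  );
}

// By default, the scope is the directory that contains the worker script.
function defaultScope(scriptUrl: string): string {
  return new URL("./", scriptUrl).href;
}

// Simplified check: a URL falls under the worker's control only if it is
// same-origin with the scope and its path starts with the scope path.
function controls(scope: string, targetUrl: string): boolean {
  const s = new URL(scope);
  const t = new URL(targetUrl);
  return s.origin === t.origin && t.pathname.startsWith(s.pathname);
}
```

With a script at https://example.org/books/sw.js, the default scope would be https://example.org/books/, so a URL under /other/ would not be covered unless the server widens the scope with the “Service-Worker-Allowed” header.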
Daniel Weck explained how a prototype implementation of the Readium web reader uses SW to intercept HTTP requests for files located inside EPUB archives (using a URL path convention). HTTP responses are created by extracting / inflating file contents on-the-fly (using a JavaScript zip library and HTTP byte range requests). As a result, the web browser can consume EPUB-bundled resources transparently via “normal” URLs, with no reliance on the “Blob URI trick” currently used by Readium for accessing files inside EPUB archives (which requires complex prefetching and pre-processing, thereby interfering with webview features such as caching and streaming). See the Readium repo for details.
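To make the idea of a “deep” URL concrete, here is a minimal TypeScript sketch of how an intercepted request path might be split into an archive location and an entry inside the archive. The convention Readium actually uses is not given in this summary; the one assumed here (everything up to and including “.epub” names the archive) and the names splitDeepUrl and byteRange are purely hypothetical.

```typescript
// Hypothetical convention: everything up to and including ".epub" names the
// archive; the remainder is the path of the entry inside it.
interface DeepUrlParts {
  archivePath: string; // e.g. "/library/moby-dick.epub"
  entryPath: string; // e.g. "OEBPS/chapter1.xhtml"
}

function splitDeepUrl(pathname: string): DeepUrlParts | null {
  const marker = ".epub/";
  const i = pathname.indexOf(marker);
  if (i === -1) return null; // not a request for bundled content
  const archivePath = pathname.slice(0, i + marker.length - 1);
  const entryPath = pathname.slice(i + marker.length);
  return entryPath === "" ? null : { archivePath, entryPath };
}

// Value of an HTTP Range header for reading `length` bytes at `offset` in
// the archive. Byte positions in a Range header are inclusive.
function byteRange(offset: number, length: number): string {
  return `bytes=${offset}-${offset + length - 1}`;
}
```

A fetch handler could call splitDeepUrl on each request, let ordinary requests pass through when it returns null, and otherwise use range requests against the archive to read and inflate only the entry it needs.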
Readium’s experiment with SW aims at transparently extracting + serving packaged content (i.e. from a zip archive), based on a “deep” URL syntax that references bundled resources. By contrast, the proposed use of SW in PWP focuses on enabling a seamless online-offline reading experience for regular content URLs. The PWP use-case requires some sort of offline local storage. There is a cache API accessible within SW, so this can be used to manage offline copies of remote resources.
There was also much discussion of benefits of SW for publications wrt security considerations and the ultimate goals of the industry. Jake requests use cases, specifically what is hard on web vs native + web view? As hybrid native apps do not normally need a built-in HTTP server to feed content to the webview, and as native URL protocol handlers can be implemented to manage access to bundled or filesystem resources, what is the role of SW in this context?
After the session, Jake created an offline-enabled publication. Test it and send your feedback.

DPUB IG Telco, 2015-10-19: Service Workers, extended description analysis

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Overview of Service Workers

Jake Archibald, one of the editors of the Service Workers Draft, was a guest of the meeting, and gave a short overview of Service Workers.

(The reason for this discussion is that Service Workers have been identified as one of the possible means to create a PWP Architecture.)

SW itself is just a JavaScript runtime that can operate on a separate thread. The key difference in life cycle is that it can spin up without the existence of pages on the origin. The starting point for the web becomes the SW rather than the page. One of the key features is offline support: because you get a fetch event for every request the page makes, you can create an offline experience by listening for fetch events. You get to choose what to do, the default being nothing: you can create a response (a string or blob) and send that back, or you can fetch things from named caches, from IndexedDB, etc. The gold standard of this type of app development is “offline first”, i.e., create the offline experience before even attempting to go to the network, seeing the network as progressive enhancement. The aim is to ship stuff from the caches as quickly as possible, and then go to the network. There is a lot of user experience we are trying to figure out now, but the SW specification doesn’t make any of these decisions.
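One possible reading of “offline first” is sketched below in TypeScript. To keep the example runnable outside a browser, a plain Map stands in for a named cache and an injected function stands in for the network; both stand-ins, and the name offlineFirst, are assumptions made for illustration. In a real worker these roles would be played by the Cache API and fetch() inside a fetch event handler.

```typescript
// Minimal stand-ins so the sketch runs outside a browser.
type Body = string;
type Network = (url: string) => Promise<Body>;

async function offlineFirst(
  url: string,
  cache: Map<string, Body>,
  network: Network,
): Promise<Body> {
  const cached = cache.get(url);
  if (cached !== undefined) {
    // Serve the cached copy at once and refresh it in the background.
    // A failed refresh (e.g. no connectivity) is silently ignored.
    network(url)
      .then((fresh) => cache.set(url, fresh))
      .catch(() => undefined);
    return cached;
  }
  // Nothing cached yet: go to the network and remember the result.
  const fresh = await network(url);
  cache.set(url, fresh);
  return fresh;
}
```

Whether to refresh in the background, how long to keep entries, and what to show when both cache and network fail are exactly the kind of decisions the specification leaves to the author.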

Service workers can also be used to unpack content on the fly, whether the packed content is in some ZIP format or other packaging format (the issue around the streaming ability of ZIP and other formats came up during the discussion).

There was some discussion about the availability of SW in browsers (it is available in Chrome and Firefox, is in development for Microsoft Edge, and nothing is known about Safari). There are also plans to create more tutorials and introductory texts for the specification.

Subsequent discussions at the meeting concentrated on the experience of using Service Workers. Dave Cramer, from Hachette, has already played with this and has a good first impression. The Readium consortium has also tried using it, and Daniel Weck, from the consortium, shared his experiences and questions. The Readium proof-of-concept implementation can handle EPUB content that is exploded on the web server, and can also do some ZIP unpacking (the current implementation does not do caching, only fetch intercepts). There were issues around same-origin constraints and the use of HTTPS as opposed to HTTP.

Extended Description Analysis

The PF Working Group has created a document analyzing the various proposed content description techniques for accessibility purposes (i.e., usage of longdesc, aria-describedat, <details>, etc.). That document aimed at describing the various needs of the publishing industry; it has now been reviewed, primarily by Deborah and Mia, with comments from others. The document is now ‘back’ with the PF Working Group.


Two W3C drafts on annotations published

  • New Working Draft of Web Annotation Data Model. Annotations are typically used to convey information about a resource or associations between resources. Simple examples include a comment or tag on a single web page or image, or a blog post about a news article.
  • First Public Working Draft of FindText API. The FindText API specification describes an API for finding ranges of text in a document or part of a document, using a variety of selection criteria.

CSS Snapshot 2015

The Cascading Style Sheets (CSS) Working Group has published a Group Note of CSS Snapshot 2015. This document collects together into one definition all the specs that together form the current state of Cascading Style Sheets (CSS) as of 2015. The primary audience is CSS implementers, not CSS authors, as this definition includes modules by specification stability, not Web browser adoption rate.

DPUB IG Telco, 2015-10-12: Portable Web Publication publication and outreach, F2F planning

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Portable Web Publication publication and outreach

Following the discussion last week, the PWP Editor’s draft has been updated. Ivan Herman gave a status report; the participants unanimously decided to publish the document as a First Public Working Draft. The plan is to publish on the 15th of October.

There was a discussion on the outreach following the publication. It has been agreed that the outreach activities should not be huge (e.g., press release) and that a set of blogs and home page news at W3C and at IDPF should suffice. Some points of necessary emphasis in the outreach message came to the fore:

  • explaining the process, i.e., what is a WD, how to provide input, etc
  • what the relationship of the work around PWP is with the EPUB3.1 work at IDPF
  • background both for publishers (who may not know about W3C) and the Web community (who may not know about publishing)
  • position of W3C relative to the rest of the publishing community

Because this is the Frankfurt Book Fair week, it is probably better to set the dates for the outreach sometime next week. The volunteers will regroup on this by the end of the week.

F2F planning

The agenda of the F2F meeting at the W3C TPAC meeting is shaping up. The group spent some time working out details, planning some more meetings (if possible), and determining session chairs.

DPUB IG Telco, 2015-10-05: Portable Web Publication Draft, CSS Inline, Extended Description Analysis

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Portable Web Publication Draft

The PWP Editor’s draft has been updated since last week. Ivan Herman gave a status report, whereby:

  • The term ‘Publication’ has replaced ‘Document’ overall
  • The terminology for states has been incorporated into the document
  • The EPUB dependencies have been removed; instead, a separate section has been added on the relationships to EPUB

There were some discussions on the last issue during the call. It was agreed that

  • The section on EPUB relationships should be a full-blown section and not an appendix
  • That section includes a paragraph making it clear that the content of a PWP is (just like EPUB) primarily Open Web Platform resources. That paragraph stays where it is, but the section on terminology should also include a very clear statement to the same effect

It has been agreed that these changes will be made within 1-2 days, and that the rest of the group can review the draft and comment by email between now and next Monday. The goal is to make a formal decision next Monday to publish a First Public Working Draft.

CSS Inline

Dave Cramer reported that the Initial Letter features have been implemented in the latest version of Safari, which ships in iOS 9 as well as OS X “El Capitan”. Although there are some bugs, this is still a great step forward. Unfortunately, tests are still missing, and continuing work on internationalization is necessary (what are the Initial Letter features in, say, Arabic?).

Extended Description Analysis with PF

Michael Cooper published an Extended Description Analysis that outlines the proposed approaches to providing extended descriptions (@longdesc, aria-describedby, etc.), with a view to making a final decision on which direction the ARIA work should take in this respect. The Accessibility task force has already looked at the table, and will produce comments for the PF Working Group before TPAC.


The group discussed the schedules and joint meeting plans with other groups ahead of the W3C Technical Plenary in a few weeks. The group also decided to hold its meeting next week, although that day is Columbus Day in the US.


DPUB IG Telco, 2015-09-28: Glossary, Portable Web Publications FPWD

See minutes online for a more detailed record of the discussions.


Development of the Digital Publishing Glossary has been ongoing for a while now, and the IG discussed the remaining outstanding issues. There was full agreement that we need to be very clear that the terms included should be used and understood only within the scope of DPUB IG publications and discourse, since there is a considerable overlap between terms used here and the same terms in other domains (e.g. “document”, “resource”). To reduce the risk of cross-domain confusion, it was agreed as a first step to rename one of the core terms, “portable web document”, to “portable web publication” (PWP). It was also agreed to highlight the need for single-URI identifiability of the set of resources that make up a publication as one of the differentiating factors between a PWP and an ordinary website. Regarding states of portable web publications, it was agreed to use protocol vs file system API access instead of the online/offline distinction.

Some further work on the glossary remains, but in general the IG believes we can soon call it done (as a first version) and consequently move on with our lives and other IG activities.

Towards FPWD of Portable Web Publications (formerly EPUB-WEB)

The EPUB-WEB whitepaper has been renamed and revised with the intent of using the terminology of the glossary and through that achieve a lesser dependence on EPUB-specific terminology and technology. The new version is currently being reviewed by the IG, and after another week or two of additional edits we hope to publish it as a FPWD. The IG will discuss the document’s status and outstanding issues on the coming week’s call.

New CSS Grid Layout, Inline Layout and Page Floats Drafts

The W3C Cascading Style Sheets (CSS) Working Group has published three Working Drafts:

CSS Grid Layout Module Level 1: This CSS module defines a two-dimensional grid-based layout system, optimized for user interface design. In the grid layout model, the children of a grid container can be positioned into arbitrary slots in a flexible or fixed predefined layout grid.

CSS Inline Layout Module Level 3: The CSS formatting model provides for a flow of elements and text inside of a container to be wrapped into lines. The formatting of elements and text within a line, its positioning in the inline progression direction, and the breaking of lines are described in CSS3TEXT. This module describes the positioning in the block progression direction both of elements and text within lines and of the lines themselves. This positioning is often relative to a baseline. It also describes special features for formatting of first lines and drop caps. It extends on the model in CSS2.

CSS Page Floats: This document describes floats that move to the top or bottom of content passages. This feature has traditionally been used in print publications in which figures and photos are moved to the top or bottom of columns or pages, along with their captions. This draft describes how to achieve this effect for floats within pages, columns, regions and elements.


DPUB IG Telco, 2015-09-21: Publication Object Model, MathML

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)


Daniel Glazman presented his ideas for an API to access the contents of publishing packages (like EPUB). Daniel is implementing an EPUB3 editor (BlueGriffon). When you create a tool like that and, for example, you want to add metadata, you have to follow a chain of IDs and IDREFs repeatedly. The same goes for adding a file, deleting a file, etc. The code you have to write is huge and repetitive. This may have hindered the adoption of EPUB3. If we had an EPUB object model, with an open source implementation and polyfill, this would help deployment; right now everyone has to implement everything from scratch.

Although the current development is focused on EPUB, the idea is more general. It would be a two-layered approach: a general layer to access essential constituents within a publication package, and then a lower layer mapping this to EPUB, generic ZIP, MOBI, or whatever else. The API would also connect to other APIs underneath: i.e., if the API gets to an HTML resource within the package, it would return a DOM tree of the HTML content. The API should not be bound to JavaScript; it is currently drafted in Web IDL.
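The two-layered idea can be illustrated with a small TypeScript sketch. None of the names below come from Daniel’s draft (which is written in Web IDL); the Publication interface, the simplified EpubPackage shape and the EpubPublication class are hypothetical and only show how a generic layer could hide EPUB’s id/idref indirection from calling code.

```typescript
// Hypothetical upper layer: what any publication exposes, whatever its format.
interface Publication {
  title(): string;
  readingOrder(): string[]; // resource paths, in reading order
}

// Hypothetical, heavily simplified view of an EPUB package document.
interface EpubPackage {
  metadata: { title: string };
  manifest: { id: string; href: string }[];
  spine: string[]; // idrefs pointing at manifest items
}

// Lower layer: maps the generic interface onto EPUB's id/idref chain.
class EpubPublication implements Publication {
  private readonly pkg: EpubPackage;
  private readonly hrefById = new Map<string, string>();

  constructor(pkg: EpubPackage) {
    this.pkg = pkg;
    for (const item of pkg.manifest) this.hrefById.set(item.id, item.href);
  }

  title(): string {
    return this.pkg.metadata.title;
  }

  readingOrder(): string[] {
    // Resolve each spine idref to the href of the matching manifest item.
    return this.pkg.spine.map((idref) => {
      const href = this.hrefById.get(idref);
      if (href === undefined) throw new Error(`Unknown idref: ${idref}`);
      return href;
    });
  }
}

// Calling code depends only on the generic layer.
function describe(pub: Publication): string {
  return `${pub.title()}: ${pub.readingOrder().join(", ")}`;
}
```

A second class implementing Publication over a generic ZIP or another format could be passed to describe without any change to the calling code, which is the point of keeping the format-specific mapping in the lower layer.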

The discussions led to the agreement that an initial draft specification could be published as an IG Note by this Interest Group and, later, we can see how to gather enough interest and find the right format for a final specification, if there is an agreement to do so.

(Subsequent discussions converged towards the usage of “Publishing Object Model” or “Publication Object Model”, i.e., POM, as a term to be used.)

MathML situation II.

Peter Krautzberger, from MathJax, continued the discussion started last week.

Based on the current situation around deployment, a better direction for implementation and deployment is to emphasize the server side: an implementation would turn MathML into clever HTML+CSS, meaning that the deployment effort is more closely based on the renderers as developed by the browsers. Client-side implementations would not disappear, but the bulk of the development would be on the server. It is not unlike the usage of markdown: although there are ways to render markdown on the client on-the-fly, many HTML pages are originally written in markdown, and then converted into HTML before publishing.

The issue is then accessibility. At present, what often happens in practice is that MathML is put into the HTML page with display turned off, alongside, e.g., an image displaying the mathematical equation. This is done because assistive technologies do understand MathML. The alternative is to rely on ARIA: the conversion of mathematics into HTML+CSS would also add the semantics to the ARIA framework and, therefore, again be based on technologies that browsers already deploy.

The issue of polyfilling came up, i.e., whether that approach would work. It may be possible, but it makes for a complicated custom element structure, and it seems to be a difficult process for performance. For a polyfill, you would expect developers to manipulate the DOM as if it were a normal MathML implementation, but no one is doing that. MathJax does not do polyfill because Web Components aren’t yet really available. Besides, the real problem is rendering in HTML and CSS, i.e., it is better to make the rendering easier and rely on native capabilities.