DPUB IG Telco, 2015-03-09: Some Task Force Updates, discussion on Web Packaging

See the minutes online for a more detailed record of the discussions.

Task Force Updates

STEM Task force

The questionnaire has been sent out to a number of people in several rounds. With the last round (last week), around 90 people have now been contacted, with a deadline for responses set at the end of the month. At the moment, there are 15 respondents plus some others who, though not replying themselves, have forwarded the questionnaire to colleagues.

Accessibility Task Force

The task force has surveyed the W3C Accessibility Guidelines to see which techniques are relevant for Digital Publishing. Some of them (a dozen or so) are not really required, and some others are not clear (e.g., PDF-related techniques). However, most of what is in the current guidelines is very much relevant for the Digital Publishing industry.

The more complicated question, which has not been addressed yet, is whether there are issues in Digital Publishing that are not addressed by the guidelines. Such issues may be related to page numbers, drop caps, etc., although some of these things could be addressed via other specs (like CSS).

The issue is that the task force is a little bit low on resources at the moment…

Content & Markup Task Force (Update to the role module)

The Task Force has been working with the W3C PF Group on a draft for a role module. This is now an early editors’ draft; some terms that could be addressed elsewhere have been cut. There is also a need to remove some ambiguity from the term definitions and to make sure the terms have context outside of Digital Publishing as well.

The challenge is to determine the scope of publishing and the definition of the terms. There are a large number of potential terms (almost a thousand), and a good balance must be found for a module that is neither too large nor too small.

Discussion on Web Packaging

The discussion included Yves Lafon, the W3C staff contact in the Web Application Working Group. Yves gave an update:

The Web Packaging format started as a way to identify packaging with a URL. It then drifted toward use cases explaining why a package format or document was needed. One of the main drivers was the need for JavaScript libraries. There is also a strong relationship to service workers: a package is a kind of portable cache format that needs no configuration. We wanted to know whether the work we have done will actually be useful to people. We started to gather input from others, and we got some security input (signatures, part of the document from inside the package). And, since we know this IG is interested in this type of packaging, it would be good for us to learn whether our approach works for you and what we would need to improve. The current goal is to figure out who would be the ideal customer of the specification.

Subsequent discussions concentrated on technical as well as organizational issues. In general, it was agreed that better Digital Publishing use cases, and the resulting requirements, should be collected and forwarded to the Web Application Working Group to represent Digital Publishing and what it requires from a packaging format. E.g., publishers are looking at ZIP alternatives that work well on mobile and that could also efficiently include large data sets.

One of the main technical issues for the Digital Publishing community is the pros and cons of abandoning ZIP in favor of a new packaging format. One of the arguments against ZIP is that it is not properly streamable. However, it may be possible to add some restrictions to ZIP so that the result is actually streamable. If so, there is a legitimate question whether abandoning ZIP, which is widely deployed through EPUB3 publications, is an acceptable alternative. On the other hand, it is in the interest of the publishing community to use a packaging format that can be, and is, natively implemented by browsers.
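
To make the streaming concern concrete, here is a minimal Python sketch (an illustration only, not something presented on the call). It builds a small ZIP in memory and reads its end-of-central-directory record. It assumes an archive with no trailing comment and no ZIP64 extensions, and the file names are made up. The point it shows is that the archive's index is written after all of the file data, which is one common reason given for ZIP not being streamable.

```python
import io
import struct
import zipfile

# Layout of the end-of-central-directory (EOCD) record. It is 22 bytes
# long when the archive has no trailing comment (assumed here).
EOCD_FORMAT = "<4sHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)


def build_sample_zip() -> bytes:
    """Build a tiny in-memory ZIP with two hypothetical publication files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("chapter1.html", "<p>Call me Ishmael.</p>")
        archive.writestr("style.css", "p { margin: 0 }")
    return buffer.getvalue()


def read_eocd(data: bytes) -> dict:
    """Parse the EOCD record, assumed to be the last 22 bytes of the data."""
    fields = struct.unpack(EOCD_FORMAT, data[-EOCD_SIZE:])
    signature, _, _, _, total_entries, cd_size, cd_offset, _ = fields
    if signature != b"PK\x05\x06":
        raise ValueError("EOCD record not found at the end of the data")
    return {"entries": total_entries, "cd_size": cd_size, "cd_offset": cd_offset}


data = build_sample_zip()
eocd = read_eocd(data)
print(f"archive size: {len(data)} bytes")
print(f"central directory starts at byte {eocd['cd_offset']}")
```

Each entry is also preceded by its own local header, so a restricted ZIP profile might allow sequential reading; whether such restrictions are sufficient is exactly the open question noted above.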

It was noted that IETF also plans to look at packaging issues (see the IETF WG charter) and is currently considering the W3C Web Packaging Work, too.

The plan for this Interest Group is to (1) find a definite answer on whether ZIP files can be made streamable and (2) collect use cases to be submitted to the Web Packaging work.


DPUB IG Telco, 2015-03-02: Houdini project, EPUB 3.1 workplans

See the minutes online for a more detailed record of the discussions.

CSS WG’s Houdini project

The Houdini Project of the CSS Working Group had its first meeting a few weeks ago in Sydney, and some participants gave a short overview for the IG. The goal of the Houdini Project is to extend CSS. At present, CSS is a big black box where stuff goes in and formatted display comes out; if the magic isn’t what you want, it is difficult to make changes. The goal of the Houdini project is to “open up” that box so that scripts might get additional information and control over the layout process and possibly modify how browsers lay things out.

The sense of the Houdini meeting was that there was great enthusiasm, but also some skepticism about how all this can be done. Behind the big picture, however, there are a number of smaller things (plumbing, etc.) that will be done and that are useful. E.g., for the Digital Publishing community, control over pagination may be the biggest win: putting pagination on top of existing browsers using scripting requires some low-level elements in order to provide a good reading experience in the browser. Such work will be accelerated if the lower-level work gets going.

It was agreed that it is important that the use cases and possible implementation experiences of e-readers, i.e., of the Digital Publishing Community, should be communicated to the Houdini project.

There is a good introduction and report by Simon Sapin, as well as a summary of Vivliostyle, and a report of the project by Peter Linss to the TAG.

IDPF’s workplans on EPUB 3.1

EPUB 3.0.1 was released several months ago; it included bug fixes and ISO wording, plus backwards compatibility. IDPF is now thinking about the next version, which Markus Gylling presented at a recent EDUPUB Symposium in Phoenix. Various features to be added were mentioned, like a 3D format, or migration of epub:type to the role attribute or to HTML5. Some features may also be deprecated, like switch. However, at present all of these are just discussion items, with no formal decisions or timeline yet; instead, a discussion among IDPF members should take place.

HTML Image Description Extension (longdesc) is a W3C Recommendation

The HTML5 Image Description Extension (longdesc) was published today as a Recommendation by the HTML Working Group, with the approval of the Protocols and Formats Working Group. This extension for HTML5 adds a longdesc attribute that is used to provide links to detailed descriptions of images, and is part of W3C’s work to ensure that the Open Web Platform is accessible to people with disabilities.
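
As a small illustration of how the attribute is used, the following Python sketch collects longdesc links from a fragment of markup using the standard library's HTML parser. The sample markup and file names are invented for the example; this shows one way a tool might find the description links, not anything mandated by the Recommendation.

```python
from html.parser import HTMLParser


class LongdescCollector(HTMLParser):
    """Collect (src, longdesc) pairs from <img> elements."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "longdesc" in attributes:
            self.descriptions.append((attributes.get("src"), attributes["longdesc"]))


# Hypothetical markup: the image links to a separate, detailed description.
SAMPLE = '<p><img src="chart.png" alt="Sales chart" longdesc="chart-description.html"></p>'

collector = LongdescCollector()
collector.feed(SAMPLE)
print(collector.descriptions)
```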


W3C Pointer Events is a Recommendation

The W3C Pointer Events Working Group has published a W3C Recommendation of Pointer Events. The Pointer Events specification defines a unified set of events and interfaces for device-neutral pointer input, such as a mouse, touchscreen, and pen-tablet, including capabilities for handling pointer pressure, contact geometry, and tilt; it also defines a mapping to traditional mouse events. This specification provides additional functionality not available in the related Touch Events specification; for more information on the relationship between these two specifications, see the Touch Events Community Group.


DPUB IG Telco, 2015-02-23: Identifiers, packaging, & manifests

(Meta comment: the W3C Digital Publishing IG has weekly teleconferences. The minutes of the meetings, as well as short summaries, are available online. However, to give them greater visibility, from now on these summaries will be published on this blog rather than only on the wiki.)

The meeting mostly concentrated on some technical issues around the EPUB-WEB vision. See the minutes online for a more detailed record of the discussions.

Metadata Task force and identifiers

Some of the crucial issues related to EPUB-WEB are around identifiers, fragments, etc. It was suggested that the former Metadata Task Force would concentrate on these, identifying use cases and requirements primarily in the area of fragment identifiers. While the problem area around fragments is relatively clear, the issues around identifiers, and how they would affect EPUB-WEB, are more complex. Indeed, many identifiers in use are based on registries and are only loosely coupled with HTTP URIs; also, many discussions in that space are happening outside this group. The way forward is probably to “reset” the Metadata Task Force, essentially by creating a new task force to make the intentions clear.

(There are some very initial thoughts on identifiers and EPUB-WEB on the epubweb wiki.)

Overview of the Web Packaging draft

The W3C Web Packaging draft was discussed to see how it would fit in the EPUB-WEB vision (as a possible alternative to ZIP). Ivan Herman has prepared some notes on the document on a wiki page.

Three main areas of attention in the draft are:

  1. Packaging itself, based on (essentially) a multipart MIME approach. The important point is that, conceptually, a package is a concatenation of HTTP responses, including HTTP headers, for specific resources into one package resource; the package itself may also have its own HTTP header. This approach brings the package very close to current Web technologies, and provides a rich possibility of metadata on each resource as defined in the HTTP standard. (E.g., an EPUB “spine” can be implemented through these headers.)
  2. Fragment identifier, as defined in the document, is based on the idea of:
    1. define a set of “candidate” parts within the package (listing a set of possible URLs, for example)
    2. choose among the candidates using some filters (essentially content negotiation based on type or language).
    3. use a fragment as defined for that specific media type; i.e., EPUB-WEB can rely on existing and evolving fragment identifications for different media without having to reinvent its own.
  3. “Link relations”, either in form of an HTTP Link header or an HTML <link> element. These provide a suitable entry point to an EPUB-WEB document: e.g., a landing page refers to the package (i.e., the possibly offline document).
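
The first point, a package as a concatenation of per-resource headers and bodies, can be sketched with Python's standard email library, which implements multipart MIME. This is only a rough illustration: the use of multipart/mixed, the Content-Location header as the part name, and the sample resources are assumptions made for the example, not the syntax defined in the Web Packaging draft.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package(resources):
    """Concatenate text resources into one multipart document.

    `resources` is a list of (url, text_subtype, body) tuples; the list
    order stands in for a reading order (a "spine").
    """
    package = MIMEMultipart("mixed")  # assumption: not the draft's media type
    for url, subtype, body in resources:
        part = MIMEText(body, subtype, "utf-8")
        part["Content-Location"] = url  # per-part header naming the resource
        package.attach(part)
    return package.as_string()


def list_parts(package_text):
    """Return (url, media type) for every part, in package order."""
    message = message_from_string(package_text)
    return [(part["Content-Location"], part.get_content_type())
            for part in message.get_payload()]


def read_part(package_text, url):
    """Return the decoded body of the part whose Content-Location is `url`."""
    for part in message_from_string(package_text).get_payload():
        if part["Content-Location"] == url:
            return part.get_payload(decode=True).decode("utf-8")
    raise KeyError(url)


package_text = build_package([
    ("chapter1.html", "html", "<p>Call me Ishmael.</p>"),
    ("style.css", "css", "p { margin: 0 }"),
])
print(list_parts(package_text))
```

Because every part carries its own headers, the resulting package is plain text that can be inspected in an editor, and a filter on media type (as in the fragment identifier steps above) is a simple test on each part's Content-Type.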

Subsequent discussions looked at the question of where such packaging would be advantageous compared to ZIP. The document mentions streaming, tooling support, and richer per-part metadata; the feeling on the call was that the last argument is the strongest in favor of Web Packaging (although the availability of HTTP-related tooling when handling the content of a package was also deemed to be important).

It is worth mentioning that Dave Cramer made a test of what the (ubiquitous) Moby Dick could look like in a package. The package can be downloaded from the Web (note that the fact that it is a “ZIP” file is just a means to make the file smaller in an email; the package itself can be looked at in a text editor).

It was emphasized that the Digital Publishing community is in a unique position to strongly influence the evolution of Web Packaging, because the work is in its starting phase; there is a window of opportunity right now for joining the relevant Working Group, possibly as an editor.

Overview of the Manifest draft

The W3C Manifest draft was also discussed to see its relations to EPUB-WEB. Tzviya Siegman has prepared some notes on the document on a wiki page.

The question, from the EPUB-WEB point of view, is whether that manifest format can be used as a manifest for EPUB-WEB documents.

The manifest is a JSON file that can be associated with a resource via a specific <link> element. It has a number of metadata terms that are currently aimed at web applications (icons with their sizes, display formats, etc.). Three specific issues were brought forward:

  1. The manifest has a notion of “scope”: a URL that represents the scope of URLs that can be navigated within context (note that web packaging also has the notion of a “scope”). It is not clear whether that functionality is enough for EPUB-WEB to help in identification.
  2. Display mode: this is one of the terms defined by the manifest and may be very important for personalization.
  3. Openness (or closedness) of the manifest terms: is it possible to add or define additional terms that are more important to the publishing community? It was felt that some sort of extension structure, whereby various communities could add their own terms, would be a way forward, rather than casting a specific set of terms in concrete.
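
The three issues can be made concrete with a small Python sketch. The manifest below is hypothetical: name, display, scope, and icons are members of the Manifest draft, while "x-publishing" is an invented extension member used only to illustrate the openness question. The scope test is a deliberately simplified prefix check, not the draft's own algorithm.

```python
import json
from urllib.parse import urljoin, urlsplit

# Hypothetical manifest for a publication. "x-publishing" is an invented
# extension member, not part of the Manifest draft.
MANIFEST_TEXT = """
{
  "name": "Moby Dick",
  "display": "fullscreen",
  "scope": "/books/moby-dick/",
  "icons": [{"src": "cover.png", "sizes": "192x192", "type": "image/png"}],
  "x-publishing": {"author": "Herman Melville"}
}
"""


def within_scope(manifest, manifest_url, target_url):
    """Simplified check: same origin, and path under the scope path."""
    scope = urlsplit(urljoin(manifest_url, manifest["scope"]))
    target = urlsplit(urljoin(manifest_url, target_url))
    same_origin = (scope.scheme, scope.netloc) == (target.scheme, target.netloc)
    return same_origin and target.path.startswith(scope.path)


def extension_terms(manifest, prefix="x-"):
    """Return members using the (assumed) community extension prefix."""
    return {key: value for key, value in manifest.items() if key.startswith(prefix)}


manifest = json.loads(MANIFEST_TEXT)
MANIFEST_URL = "https://example.org/books/moby-dick/manifest.json"

print(within_scope(manifest, MANIFEST_URL, "chapter1.html"))
print(within_scope(manifest, MANIFEST_URL, "https://example.org/other/"))
print(extension_terms(manifest))
```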

An opinionated guide to digital publishing specifications (guest blog)

(This is a reproduction, with permission, of a blog post by Liza Daly, published on Safari’s blog on the 22nd of January.)

The World Wide Web Consortium (W3C) is a standards organization serving the “open web” — the set of freely available specifications that underpin most of the visible internet. In the years since the W3C was founded, all modern businesses have become “web” businesses, with their own industry-specific processes, jargon, and priorities. To that end, the W3C has formed interest groups for those industries which are adjacent to the web, with a goal to promote web technologies and ensure that the web is meeting common commercial needs.

I was co-chair for the Digital Publishing Interest Group for a time, and I have first-hand exposure to their work in interviewing publishers, documenting best practices, and writing recommendations for future specifications.

Screen shot of the first table of the DPUB specification review

One of those deliverables is an intimidating table of W3C specifications and standards that were considered relevant to digital publishing. There’s a lot to digest there, and it’s unlikely that any single human is deeply familiar with all of it. I’ve provided an opinionated gloss of the most relevant or active standards, and feel free to comment if I’ve disparaged or ignored your favorite specification.


The audience

I’m assuming that the reader is one of the following:

  • A developer who is working in digital publishing
  • A curious non-developer who isn’t afraid of the word “normative” and acronyms that begin with ‘X’
  • A standards wonk who wants to be more familiar with publishing activity

These are the “bread and butter” of digital publishing — whether it’s commercial ebooks, academic publishing, or journals:


HTML5 is a monster of a spec, but at least it’s reflective of current browser support. You should be familiar with the basics of markup, as well as the sections on browsers and common APIs.


There’s the workhorse CSS 2.1 specification which has been around for a decade. Unfortunately for the curious but lazy, all the cool new stuff is in CSS3, and that spec is broken out into many modules. Here’s a drive-by of the most interesting or publishing-relevant ones:

  • Start with Dave Cramer’s highly readable Requirements for Latin Text Layout and Pagination (“Latin” here means Western languages, not veni, vidi, vici). Note that this is a requirements document, not a spec, which means much of what Dave recommends won’t actually work anywhere yet. Welcome to standards!
  • CSS Text Module Level 3 is the “real world” equivalent to the above. Though it’s technically a spec in progress, most everything in here is available in modern browsers and reading systems.
  • CSS Regions Module Level 1 is a good read when you want to be angry about something. Regions can do some amazing things for advanced layout, but there’s a long and sordid history behind their implementation and deployment. There’s a lot of momentum behind getting Regions or an equivalent standard moving again, so there’s hope.

Extra credit assignments: CSS Media Queries and CSS Fonts Module Level 3. And while it’s unlikely that you’d need to actually read the SVG and MathML specs, it’s important to be familiar with those formats at a high level.


The simplest way to approach accessible web or ebook content is to study the semantics that are built in to HTML5. High-quality semantic markup will not only help a range of human users, it’ll aid in discovery and ranking by search engines.

Follow that up with the non-technical best practices in Web Content Accessibility Guidelines, and this overview of creating accessible interactive content.


It’s not dead yet! There’s a lot of cruft in the list, but ebooks are still required to be well-formed XML documents, and academic publishing remains dominated by XML (and, sigh, PDF).

Bleeding edge

If everything above is old hat, check out the emerging specs on the Shadow DOM, CSS Flexible Box Layout Module Level 1 (flexbox), and Packaging on the Web.


New W3C Recommendation: Indexed Database API

The W3C Web Applications Working Group has published a W3C Recommendation of Indexed Database API. This document defines APIs for a database of records holding simple values and hierarchical objects. Each record consists of a key and some value. Moreover, the database maintains indexes over records it stores. An application developer directly uses an API to locate records either by their key or by using an index. A query language can be layered on this API. An indexed database can be implemented using a persistent B-tree data structure.
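
The record-and-index model described here can be pictured with a toy in-memory store in Python. This is a conceptual sketch only: the class and method names are invented, it uses plain dictionaries rather than a B-tree, and it is not the Indexed Database API itself (which is an asynchronous browser API).

```python
from collections import defaultdict


class TinyObjectStore:
    """Toy in-memory model: records by key, plus named secondary indexes."""

    def __init__(self):
        self._records = {}        # key -> value (a dict of fields)
        self._index_fields = {}   # index name -> field name
        self._indexes = {}        # index name -> {field value -> set of keys}

    def create_index(self, name, field):
        """Create an index over `field`, covering records already stored."""
        self._index_fields[name] = field
        index = defaultdict(set)
        for key, value in self._records.items():
            if field in value:
                index[value[field]].add(key)
        self._indexes[name] = index

    def put(self, key, value):
        """Store or replace a record and keep every index up to date."""
        self.delete(key)
        self._records[key] = value
        for name, field in self._index_fields.items():
            if field in value:
                self._indexes[name][value[field]].add(key)

    def delete(self, key):
        """Remove a record, if present, along with its index entries."""
        old = self._records.pop(key, None)
        if old is None:
            return
        for name, field in self._index_fields.items():
            if field in old:
                self._indexes[name][old[field]].discard(key)

    def get(self, key):
        """Locate a record directly by its key."""
        return self._records.get(key)

    def get_all_by_index(self, name, field_value):
        """Locate records through an index, ordered by key."""
        keys = sorted(self._indexes[name].get(field_value, ()))
        return [self._records[key] for key in keys]


store = TinyObjectStore()
store.create_index("by_author", "author")
store.put(1, {"title": "Moby Dick", "author": "Herman Melville"})
store.put(2, {"title": "Typee", "author": "Herman Melville"})
store.put(3, {"title": "Emma", "author": "Jane Austen"})
print([book["title"] for book in store.get_all_by_index("by_author", "Herman Melville")])
```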


DPUB IG Metadata Task Force Report Published

The Digital Publishing Interest Group has published a Group Note of DPUB IG Metadata Task Force Report. The Metadata Task Force of the DPUB IG found, through extensive interviews with representatives of various sectors and roles within the publishing ecosystem, that there are numerous pain points for publishers with regard to metadata but that these pain points are largely not due to deficiencies in the Open Web Platform. Instead, there is a widespread lack of understanding or implementation of the technologies that the OWP already makes available for addressing most of the issues raised. However, some of the very technologies that are little used or understood in most sectors of publishing are widely used and understood in certain other sectors (e.g., scientific publishing, libraries). Priorities that have emerged are the need for better understanding of the importance of expressing identifiers as URIs; the need for much more widespread use of RDF and its various serializations throughout the publishing ecosystem; and the need to develop a truly interoperable, cross-sector specification for the conveyance of rights metadata (while remaining agnostic as to the sector-specific vocabularies for the expression of rights). This Note documents in detail the issues that were raised; provides examples of available RDF educational resources at various levels, from the very technical to non-technical and introductory; and lists important identifiers used in the publishing ecosystem, documenting which of them are expressed as URIs, and in what sectors and contexts. It recommends that while little new technology is called for, the W3C is in a unique position to bridge today’s currently siloed metadata practices to help facilitate truly cross-sector exchange of interoperable metadata. This Note is thus intended to provide background and a context in which concrete work, whether by this Task Force or elsewhere within the W3C, may be undertaken.

Program Announced for EDUPUB Summit and Workshop (Feb 26-27)

The next gathering of the EDUPUB community will take place in Phoenix, Arizona (USA) on February 26 and 27, 2015. A preliminary program is now available, and registration is open. The summit will launch the implementation phase of EDUPUB, a cross-organizational initiative to develop a comprehensive open platform for next-generation learning content based on EPUB 3, IMS standards for learning environment integration, and the overall Open Web Platform.


First Public Working Draft: Web Annotation Data Model

The Web Annotation Working Group has published a First Public Working Draft of Web Annotation Data Model. Annotations are typically used to convey information about a resource or associations between resources. Simple examples include a comment or tag on a single web page or image, or a blog post about a news article. The Web Annotation Data Model specification describes a structured model and format to enable annotations to be shared and reused across different hardware and software platforms.
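
As a rough picture of what such a structured annotation might look like, the Python sketch below builds a simple comment on a web page and serializes it as JSON. The key names and URLs are simplified assumptions chosen for illustration (a body that says something about a target); the exact vocabulary and serialization are defined by the draft itself and may differ.

```python
import json


def make_annotation(body, target):
    """Build a minimal annotation: a body that is about a target.

    The key names are illustrative assumptions, not the draft's
    normative serialization.
    """
    return {"type": "Annotation", "body": body, "target": target}


# Hypothetical example: a blog post commenting on a news article.
annotation = make_annotation(
    body="http://example.org/blog/post1",
    target="http://example.com/news/article1",
)

# Serializing to JSON lets the annotation be shared and read back elsewhere.
serialized = json.dumps(annotation, indent=2, sort_keys=True)
print(serialized)
restored = json.loads(serialized)
```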