What publishing needs from the web (and how you can help)

(This blog post was originally published by Liza Daly on the Safari Flow Blog on March 24th. It is reproduced here in full, with the permission of the author.)

For a few months now I’ve served as co-chair of “DPUB”, the W3C Digital Publishing Interest Group (with Markus Gylling, who somehow has time to be a wonderful CTO of two different standards organizations). DPUB acts as a channel for those of us in digital publishing to influence the development of web standards like HTML5 and CSS3. The group has already produced two public documents, one describing use cases for text layout and one for annotations, and we’re quite proud of both. But we’d like to do more, and we need your help.

Widows and orphans, oh my

In Requirements for Latin Text Layout and Pagination, Dave Cramer of Hachette has compiled an exhaustive list of requirements for laying out text using best practices developed over centuries of print publishing, including guidelines for pagination, table layout, and other formatting. By design, these requirements focus on problems that are not natively solvable today in CSS. Those of you who have tried to lay out drop caps or figure/caption pairs know that the current techniques are fragile, easily breaking down when the viewport size changes or the user adjusts reading controls. Dave’s work is well worth reading for anyone interested in CSS-based layout.



Annotation Use Cases by Robert Sanderson of Los Alamos National Laboratory documents issues that those of us who have implemented annotations in our reading systems must deal with: annotation metadata, position and text selection, and annotation styling. Rob includes advanced cases as well, such as annotation of documents under revision and the elusive cross-format annotation (which I consider to be one of the hard problems in digital publishing).
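To make the position and revision problems concrete, here is a minimal sketch of why a stored character offset alone is fragile, and how keeping the quoted text plus a little surrounding context lets a reading system find the annotation again after the document changes. Everything in it is an assumption for illustration: the Annotation fields and the make_annotation and reanchor helpers are hypothetical names, not taken from Rob's document or from any real reading system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Annotation:
    """Hypothetical annotation record; field names are illustrative only."""
    body: str    # the note the reader wrote
    exact: str   # the selected text
    prefix: str  # a little context before the selection
    suffix: str  # a little context after the selection
    start: int   # character offset of the selection when it was made


def make_annotation(text: str, start: int, end: int, body: str,
                    context: int = 10) -> Annotation:
    """Capture the selection together with some context on either side."""
    return Annotation(
        body=body,
        exact=text[start:end],
        prefix=text[max(0, start - context):start],
        suffix=text[end:end + context],
        start=start,
    )


def reanchor(text: str, ann: Annotation) -> Optional[Tuple[int, int]]:
    """Find the annotated span in a possibly revised text."""
    end = ann.start + len(ann.exact)
    # Fast path: the stored offset still points at the same words.
    if text[ann.start:end] == ann.exact:
        return ann.start, end
    # Fallback: search for the quote together with its context.
    needle = ann.prefix + ann.exact + ann.suffix
    pos = text.find(needle)
    if pos != -1:
        new_start = pos + len(ann.prefix)
        return new_start, new_start + len(ann.exact)
    # Last resort: the quote alone (ambiguous if the phrase repeats).
    pos = text.find(ann.exact)
    if pos != -1:
        return pos, pos + len(ann.exact)
    return None  # the annotation is orphaned in this revision


original = "Call me Ishmael. Some years ago, never mind how long precisely."
start = original.index("Some years ago")
ann = make_annotation(original, start, start + len("Some years ago"), "Nice opening")

# A revision inserts text earlier in the document, shifting every offset.
revised = "Chapter 1. " + original
span = reanchor(revised, ann)
if span is not None:
    print(revised[span[0]:span[1]])
```

Even this toy version shows why the hard cases are hard: if the quoted passage itself is rewritten, or the same book is delivered in another format with different whitespace, exact string matching fails and something fuzzier is needed.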

Why this matters

It’s hard to believe this in 2014, but when EPUB was first born, it was provocative to suggest that ebooks would be a part of the web. Early ereaders were implemented with proprietary software stacks that presumed interoperability at the level of file interchange only. We’re still paying the price for some of these decisions, from DRM, to strict XML-centric markup, to a collective deficit in documented best practices, free software, and open dialog.

This is changing fast, thanks to projects like Readium, but we now have a new problem: the web isn’t entirely ready for us. Today, nearly all ereading systems are built either directly on the web (like Safari) or using low-level browser libraries (like most reader apps on mobile devices). There are unique requirements in publishing that need to be supported directly in web technologies; the alternative is continued fragmentation and lack of interoperable innovation.

What needs to be done

I’m a reading system developer, so I’ve grappled with problems like pagination (ugh), progressive enhancement, accessibility, and interactivity. Publishers and authors care about semantics and metadata. Ebook developers struggle with markup and styling. While we’re all getting by with the tools we have available today, there’s tremendous variability from platform to platform in how many of these issues are addressed. Too many organizations are starting from nothing when they endeavor to answer questions like:

  • How do we provide metadata at the level of a chapter or section?
  • How do we best convey the rich semantics of a book in a way that search engines will understand?
  • How can we preserve the beauty and order of professional typography in web-based books?
  • How can we create rich interactive publications that work across a range of platforms?

Help us help you!

We need participation from publishers, reading system developers, ebook developers, and other thinkers in digital publishing. This is important, high-profile work, and there are some hard problems to solve. Please drop me a note in the comments, by email, or on Twitter for more information on joining the interest group and/or becoming a W3C member.

About Ivan Herman

Ivan Herman is the leader of the Digital Publishing Activity at W3C. For more details, see http://www.w3.org/People/Ivan/
