Making Robust Digital Books

Bridging the Gap Between Standards and the Marketplace

A Position Paper from the Hachette Book Group

Dave Cramer, Content Workflow Specialist

Phil Madans, Director of Publishing Standards & Practices

Introduction

Five years into the era of EPUB, we are frustrated that the promise of e-book standards has not been fulfilled. Differences between e-reading systems are increasing. Proprietary extensions proliferate, while mandatory features remain unimplemented. It’s becoming more and more difficult to know how your book will look once it’s in the hands of a reader.

Hachette has pushed hard for standards, being the first major publisher to supply only EPUB2 to the industry. But history has taught us that we cannot depend on universal support for e-book standards. How can we create quality e-books in such an environment? Perhaps one answer is to make individual e-books more resilient, more adaptable to a sometimes-hostile ecosystem. We hope that the W3C (in conjunction with IDPF) can explore some ideas for making digital books more robust.

Our Principles

Create a single version of each book. Targeting e-books to individual devices is a problem, not a solution. Multiple versions make distribution harder, even after dealing with identifiers. They confuse consumers. Content authors spend their time and money worrying about compatibility, rather than making books.
Use existing Web technology as much as possible. E-books are, fundamentally, just HTML and CSS. Our experience is that technology that doesn’t exist in most browsers (for example, epub:switch) is rarely supported, and is dangerous to use in an unknown environment.
Device testing is not the answer. The number of reading systems grows with every holiday shopping season. We can’t test everything. Even looking at Amazon alone, Kindle on iOS is different from Kindle on Mac, which is different from early e-ink Kindles, which are different from later e-ink Kindles, which are different from the first Kindle Fire, which is different from the larger Kindle Fire. Does anyone expect a movie studio to test a Blu-ray disc on every Blu-ray player ever manufactured?
If content follows a standard, the reading system should render it according to that standard. Content creators, as a whole, have upheld their end of the bargain (mostly due to epubcheck). EPUB2 is the dominant interchange standard in the e-book world, and essentially every EPUB2 now validates against epubcheck. But that validation is no longer a guarantee that a book will render as intended. As publishers, we need to concentrate on our content. Gaps between the specifications, reading systems, and content need to be bridged. This may require more stringent authoring requirements and practices, and increased reading system conformance. But the goal remains: if a content author follows the standards, the reading system must render it correctly. Otherwise there is no standard.

What Can Be Done?

Different reading systems render the simplest HTML and CSS in surprising ways. Much of this is due to vendor CSS that overrides publisher-supplied styles. (a) Documenting this behavior for each reading system would be of immense value in developing workarounds and strategies for maintaining the desired rendering. (b) Trusted content creators could be allowed to bypass these overrides, as Apple allows with "specified-fonts" metadata.
Web developers manage differences between environments through browser detection. Providing a simple mechanism for an e-book to learn about the reading system, and respond appropriately, would be helpful. Media queries and @supports rules are two existing standards which could be useful for this.
We fear that EPUB3 is not really being implemented at all. For a year and a half, every meeting we’ve had with reading system developers ended with them asking us to prioritize each feature in EPUB3, as they were deciding which pieces to support. IDPF should adopt a modular approach, with a core set of EPUB3 functionality (nav, the CSS profile) and with at least Media Overlays, Interactivity, Triggers, and CFIs as optional components. Especially in the case of interactivity, the spec is mostly speculative; we have little guidance on what reading systems will eventually support.
In the early days of EPUB2, the epubpreflight project was created to test for things that were valid in theory, but often didn’t work in practice. An updated version of this could be very useful, if combined with information about individual devices’ conformance with relevant specifications. Such a report might include:
1. Content profiling. Which components of EPUB3 (for example) does this file use? Does it have video? What codecs? Does it have spine-level scripting? MathML? An NCX for backward compatibility?
2. CSS profiling. Does the file use absolute positioning? Floats? Are font sizes expressed in absolute units?
3. Character profiling. Comparing the glyphs used in a document against what’s available in common reading systems would be incredibly valuable.
4. Image profiling. Apple has a two-million-pixel limit. ADE won’t display CMYK images.
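As an illustration of the CSS profiling item in the list above, here is a minimal sketch of what such a check might look like. It is not part of epubpreflight or any existing tool; the function name `profile_css`, the returned keys, and the regular-expression approach are all assumptions made for this example, and a real tool would likely use a proper CSS parser.

```python
import re

# Units treated as "absolute" for this sketch (an assumption, not a spec list).
ABSOLUTE_UNITS = ("px", "pt", "pc", "in", "cm", "mm")


def profile_css(css: str) -> dict:
    """Return a rough profile of layout features used in a stylesheet.

    Hypothetical helper: it answers the three questions from the CSS
    profiling item (absolute positioning, floats, absolute font sizes).
    """
    # Strip comments so that commented-out rules are not counted.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)

    # Collect every font-size value, then keep those with an absolute unit.
    font_sizes = re.findall(r"font-size\s*:\s*([^;}]+)", css, flags=re.IGNORECASE)
    unit_pattern = re.compile(
        r"\d\s*(?:%s)\b" % "|".join(ABSOLUTE_UNITS), re.IGNORECASE
    )

    return {
        "absolute_positioning": bool(
            re.search(r"position\s*:\s*absolute", css, re.IGNORECASE)
        ),
        "floats": bool(re.search(r"float\s*:\s*(left|right)", css, re.IGNORECASE)),
        "absolute_font_sizes": [
            value.strip() for value in font_sizes if unit_pattern.search(value)
        ],
    }


sample = """
/* position: absolute in a comment should be ignored */
p { font-size: 12pt; }
h1 { font-size: 1.5em; float: left; }
"""
result = profile_css(sample)
```

A pattern-based scan like this is cheap to run across a whole catalog, at the cost of missing rules that a full parser would catch (shorthand properties, for instance).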
Information from the e-book, compared to conformance information about reading systems, could tell us if an e-book would work as intended on a given device. A report could be generated indicating which reading systems would fully support the content.
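The comparison described above could be sketched roughly as follows. Everything here is hypothetical: the `ReadingSystem` record, the feature names, and the placeholder systems "System A" and "System B" are invented for illustration, and no real conformance data is implied. The only figure taken from this paper is the two-million-pixel image limit.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class ReadingSystem:
    """Hypothetical conformance record for one reading system."""
    name: str
    supported_features: Set[str] = field(default_factory=set)
    max_image_pixels: Optional[int] = None  # None means no known limit


def compatibility_report(
    book_features: Iterable[str],
    largest_image_pixels: int,
    systems: Iterable[ReadingSystem],
) -> Dict[str, List[str]]:
    """Map each reading system name to the list of problems found."""
    needed = set(book_features)
    report: Dict[str, List[str]] = {}
    for system in systems:
        # Features the book uses that this system does not support.
        problems = sorted(needed - system.supported_features)
        limit = system.max_image_pixels
        if limit is not None and largest_image_pixels > limit:
            problems.append("image larger than %d pixels" % limit)
        report[system.name] = problems
    return report


def fully_supporting(report: Dict[str, List[str]]) -> List[str]:
    """Names of the reading systems with no reported problems."""
    return [name for name, problems in report.items() if not problems]


# Placeholder data for illustration only.
systems = [
    ReadingSystem("System A", {"video", "mathml", "scripting"}),
    ReadingSystem("System B", {"video"}, max_image_pixels=2_000_000),
]
report = compatibility_report({"video", "mathml"}, 3_000_000, systems)
```

The value of such a report would depend entirely on the quality of the conformance records, which is why documented reading system behavior matters so much.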