eBooks: Great Expectations for Web Standards

A W3C Workshop on Electronic Books and the Open Web Platform

11-12 February 2013, New York, USA

W3C gratefully acknowledges O'Reilly for hosting this workshop.


Summary of the Electronic Books and the Open Web Platform Workshop

W3C, together with IDPF (International Digital Publishing Forum) and BISG (Book Industry Study Group), held a Workshop on Electronic Books and the Open Web Platform, under the title eBooks: Great Expectations for Web Standards, on 11 and 12 February 2013 in New York, USA. The Workshop was co-located with, and hosted by, the O'Reilly Tools of Change for Publishing Conference.

The Workshop's technical discussions focused on the Open Web Platform technologies currently used in eBooks and on the improvements these technologies need for future digital publications. The over-arching question of how to bring the publishing industry closer to the development of Web technologies, to ensure smoother cooperation, was discussed by participants throughout the sessions and during the breaks.

Executive Summary

Today’s eBook market is dynamic, fast-changing and strong. eBooks compete with printed versions, and there is a wide choice of hardware and software available for eBook readers. Nevertheless, publishers face major business and technical challenges in this market, some of which could be reduced or removed through standardization.

The International Digital Publishing Forum (IDPF) has defined the EPUB standard (latest version 3.0), which largely builds on W3C’s Open Web Platform (OWP) technologies. OWP is also increasingly used at the core of standalone desktop and mobile applications. However, more can and should be done to address the specific issues and requirements the publishing industry has for Web Standards as applied to eBooks, the publication of Web sites, or content applications.

W3C seeks to support the wide adoption of Web technologies in digital publishing contexts. Consequently, there is a need for the Web and Publishing communities to reinforce cooperation around well-defined technical issues. This Workshop was a first step, bringing together a wide range of stakeholders to share their own perspectives, requirements, and ideas to ensure that emerging global technology standards meet the needs of the Digital Publishing industry. The Workshop identified a number of technical issues on which W3C and the publishing community could and should work together in the coming years.

The Workshop participants began discussions to prioritize lists of topics such as presentation, layout, fonts, or accessibility. As a next step, the W3C staff will work with stakeholders in the digital publishing ecosystem, such as IDPF and BISG, to identify opportunities for work on publishing standards that can be launched at W3C.

There were 43 position papers submitted and 89 registered participants. There were 24 presentations spread over 5 sessions during the day and a half of the workshop.

The top topics in the call for papers were:


  • standardization issues, including the relationship to current and future W3C standards such as HTML, SVG, MathML, Web APIs, metadata, etc.
  • layout definition and control (fixed and adaptive layout, high-quality typesetting, font definition and management, etc.)
  • accessibility (authoring accessibility guidelines including graphics, fallbacks)
  • voice control
  • color management and conversion
  • device descriptions
  • widget definitions and standardization
  • conformance (definitions, requirements, testing methodologies, certification)
  • ergonomics
  • DRM (including interoperable and open DRM systems, social DRM, etc.)
  • unique identification of eBooks
  • outreach and deployment
  • packaging
  • metadata storage and vocabularies

The Workshop participants reached a broad consensus that technologies in the Open Web Platform provide a compelling basis for eBooks, but that further work is needed. There was strong support for work on presentation issues, accessibility, and testing, as well as on other topics. The full list is given in the summary of the wrap-up session.

Main workshop discussions

First day, February 11th 2013 (afternoon)

The first day included an overview speech, a first keynote, and two sessions: the first on Presentation and the second on OWP/EPUB. [Minutes from Day 1]

Following a welcome by session moderator Karen Myers (W3C) and a scene-setting talk by Workshop Co-Chair Thierry Michel (W3C) [Slides], Jeff Jaffe (W3C) [Slides] presented an overview of the World Wide Web Consortium and the Open Web Platform as a platform for innovation, consolidation, and cost efficiencies. Jeff also presented the expectations for this workshop and the workshop success criteria.

Bill McCoy (IDPF) [Slides] gave a keynote on the need to increase collaboration between W3C and the IDPF, urging that W3C collaborate on a shared vision and roadmap, building on EPUB 3 as the standard packaged format, for the eBook (portable document) instantiation of the Open Web Platform. Bill also talked about the need to extend current W3C work to address the requirements of the publishing industry, such as high-design content and rich media, accessibility for people with disabilities, internationalization, and semantic structure. He also emphasized the difficulty developers face when adopting the large number of individual W3C Recommendations that constitute the full Open Web Platform.

Session 1

The first session focused on "Presentation" (CSS, Fonts, etc.). It included four talks:

Håkon Wium Lie (Opera Software) gave a demo-style presentation showing an implementation of extended layout capabilities in a news magazine format. It demonstrated pagination, multi-column formatting, and gesture-based navigation between pages for book-like presentations, all achieved by adding a few lines of CSS to the code. Håkon said that the technical solution he proposed should cover most digital publishing needs.

The second talk by Vladimir Levantovsky (Monotype) [Slides] emphasized that digital publications should achieve the same level of typographic quality as print publications. Proposed topics for further work included high-quality typesetting and font definition and management. To achieve these goals, the standards developed by W3C and IDPF should provide adequate, unified support for all critical technology solutions enabling high-quality typography for eBooks and on the Web.

The third talk by Jaejeung Kim (KAIST) [Slides] took the position that electronic books should resemble the paper book in both design and functionality. To that end, he talked about enriching eBook user interfaces with a skeuomorphic approach. He demonstrated a prototype with features such as thumbing through the book to get an overview, or temporary bookmarking by holding a page with one's finger, jumping back and forth between pages, and releasing the bookmarking finger when done.

Finally, Alan Stearns (Adobe) [Slides] compared and contrasted the Web and eBook ecosystems. He proposed adopting a single solution where interests converge and, where EPUB leads, using that work to improve the Web. He advocated for the prioritization of CSS specs like CSS3-Text and CSS3-Speech. He also talked about Paginated Views or Adaptive Layout as new Web features adapted from EPUB drafts (CSS Regions, CSS Exclusions and Shapes, CSS Page Templates). Alan also addressed the necessity for testing and the need for EPUB readers to contribute their own test cases to, e.g., the Caniuse site.

These four presentations were followed by a discussion and feedback on presentation issues with the audience, moderated by Alan Stearns (Adobe) [Minutes].

Session 2

The second session focused on the Open Web Platform and the EPUB format and included five presentations:

The first presentation by Daniel Glazman (Disruptive Innovations) [Slides] examined “the Good and the Bad in EPUB3”. Daniel highlighted important inconsistencies or incomplete specifications in EPUB, important changes between EPUB2 and 3, and non-normative references to W3C Working Drafts. He also criticized what he saw as a "useless manifest, too many TOCs, ID/IDrefs issues, very complex refinable metadata and complex management of property vocabularies." He recommended using only HTML5, allowing both serializations. Daniel also argued that the needs of eBooks should be reflected in specs such as CSS Regions, Exclusions, Page Templates, Grids, Flexbox, Writing Modes, Text, Fonts, and an improved Paged Media, published as RECs. He finally addressed the relationship with W3C and the complexity of building a WYSIWYG EPUB editor, and demoed snapshots of the BlueGriffon EPUB Edition.

The second talk by Soo Choi (HarperCollins Publishers) [Slides] outlined why publishers would benefit from having a clearer picture of the landscape of device manufacturers and their plans to fully enable EPUB 3.0 functionality. HarperCollins Publishers supports the widespread adoption and implementation of the IDPF’s EPUB 3.0 specifications across device manufacturers and retailers.

The third talk by Dave Cramer (Hachette Book Group) [Slides] addressed publishers' need to create a single version of each eBook that can then be targeted to individual devices, since maintaining multiple versions makes distribution harder. He argued that device testing is not the answer: the number of reading systems is growing exponentially, and a publisher can’t test everything. Dave emphasized the need to use existing Web technology as much as possible.

The fourth talk by Kim Marriott (Clayton School of IT) [Slides] argued that standards should allow eBooks to be more than just a digital version of a printed book: they could also include dynamic content, interactive diagrams, videos, customized content, responsive presentation and accessibility, and collaborative and continuous authoring.

The last talk by Robert Glushko (University of California at Berkeley) [Slides] explained that the current set of Web technologies looks promising but still has limitations, because most of the energy in recent HTML5 developments has gone into improving the runtime environment for scripting rather than into improving the Web as an information delivery platform. He asserted that eBooks, and ePublishing of pre-packaged materials in general, should be an important enough use case to influence some of the relevant HTML5 standards, and that the current landscape of over 50 specs under development makes it non-trivial to identify eBook-reader use cases.

These five presentations were followed by a Discussion and Feedback with the audience moderated by Markus Gylling (IDPF and the DAISY Consortium) [Minutes].

Second day, February 12th 2013

The second day included a wrap-up of the Day 1 sessions, a second keynote, and three sessions: Accessibility, DRM, and Metadata & Annotations. The day ended with a wrap-up of the Day 2 sessions and a final wrap-up and conclusion of the workshop. [Minutes from Day 2]

Philippe Le Hégaret (W3C) [Slides] kicked off Day 2 by summarizing the issues raised during the Day 1 "Presentation" and "OWP/EPUB" sessions. Philippe focused on CSS issues such as the extensions made at IDPF for EPUB3 and the modifications made at W3C since then. He outlined the need for collaboration and the common goal of moving towards high-quality design, with publishers engaging and voicing their needs and requirements. He also outlined CSS priorities for the current specs (Regions, Exclusions, Page Templates, Grids, Flexbox, Writing Modes, Text, Fonts, Improved Paged Media) and said that CSS Ruby was being dropped. Philippe explained that HTML5 has HTML and XHTML syntaxes, but cautioned that some future functionality may not be available for XHTML. Finally, he addressed the need to produce interoperable content for eBook readers (fragmentation of readers, unsupported features, lack of good MathML support) and the need for improved test coverage and test tools.

This Wrap-up session was followed by a Discussion and Feedback with the audience [Minutes].

The Workshop's second keynote was by Len Vlahos (BISG) [Slides], who presented BISG and its role in educating the industry about the EPUB3 standard and promoting its adoption. BISG also maintains the EPUB3 Support Grid and has released a policy statement formally endorsing EPUB 3.0. Len also talked about BISG's publication of Best Practices for Identifying Digital Products and its program of metadata certification.

Session 3

The third session was dedicated to Accessibility and began with a presentation by George Kerscher (DAISY Consortium), who identifies and develops standards to make publishing accessible. DAISY endorsed EPUB3 for accessible digital publications. He listed a number of issues: that graphical content requires more than alternate text; that adaptive data visualization needs to provide more opportunities; that MathML interfaces are in their infancy; and that interactive content in digital publishing, along with the proliferation of reading systems and apps, needs consideration.

The second talk by Janina Sajka (W3C PFWG Chair) [Slides] was on ensuring accessibility of digital publishing systems and content for people with disabilities. She listed a number of issues that still require accessibility consideration: accessibility support in standards, digital rights management that doesn’t break accessibility, accessible content and content distribution, and accessible user interface control in client and authoring applications.

Finally, a third talk by Mark Hakkinen (Educational Testing Service) [Slides] addressed the goal of the eBook as a platform for rich, interactive learning experiences. The foundation that EPUB provides for tools and the creation of accessible reading experiences must integrate HTML5 in a manner that does not compromise or limit accessibility for students with disabilities. He focused on support for accessibility guidelines from W3C (such as WCAG and UAAG), and addressed unresolved accessibility issues such as image descriptions and standards developed outside of W3C and IDPF (such as the IMS Global Consortium's APIP, AfA, and QTI).

These three presentations were followed by a Discussion and Feedback with the audience moderated by Janina Sajka (W3C PFWG Chair) [Minutes].

Session 4

After the morning break, the fourth session, on DRM, started with a presentation by Jim Dovey (Kobo) [Slides] addressing the merits and failures of Digital Rights Management. He presented two components of DRM, i.e., authentication (user authentication, device/Reading System authentication, content authentication, action authentication) and authorization. He also presented watermarking, which can be used to identify the purchaser. W3C technologies such as XML-ENC for encryption and XML-DSig for signing play a fundamental role in the definition and the deployment of these features.

The second talk by Youngwan (Samsung) [Slides] addressed general issues such as market fragmentation: various device types (e-Ink devices, smart devices), different content formats (PDF, EPUB, proprietary formats), barriers of DRM technologies, different viewers (native apps, browsers), and viewer performance.

The third talk by Gerardo Capiel (Benetech) [Slides] was on social DRM use, for example for Accessibility, including techniques such as watermarking an eBook, fingerprinting the user’s name and ID in the downloaded eBook, monitoring transactions, and searching for illegal copies of content.

The last talk by Oliver Brooks (Valobox) [Slides] presented a “Buy once, sync anywhere” solution that enables a user to buy a book on platform A and have that purchase synced so that it is also available on platforms B, C, and D.

These four presentations were followed by a Discussion and Feedback with the audience, moderated by Jim Dovey (Kobo) [Minutes].

Session 5

The last session, on Metadata and Annotations, started with a presentation by Mark Bide (EDItEUR) [Slides] on book metadata and identification, such as managing a huge catalog of products with ISBN, a huge volume of transactions with EDI, and a huge volume of metadata with ONIX. He also addressed what these standards have in common, and how to manage the metadata explosion.

The talk by Rob Sanderson and Paolo Ciccarese (Co-Chairs of the W3C Open Annotation CG) [Slides] introduced the concept of annotations and presented the results of the work at the W3C Open Annotation Community Group. They also talked about the basic Annotation data model and outlined eBook use cases such as commenting, bookmarking, highlighting text, and comparing text.

The talk by Peter Meirs (Time Inc) [Slides] covered how managing digital source content with the PRISM Source Vocabulary (PSV) facilitates the automated delivery of magazines to EPUB3 eReaders. He mentioned the agreement to collaborate on cross-association specification efforts: EPUB 3.0 (packaging, delivery & display for eReaders) and PRISM 3.0 (source content format). He then detailed PSV as a standard metadata schema.

The talk by Stuart Myles (Associated Press) [Slides] presented ODRL and RightsML as means to develop and apply rights expressions. He introduced three dilemmas (general-purpose REL versus industry-specific REL, sophistication versus simplicity, and tool support for adoption versus implementation of a new standard) and three adoption strategies for rights mechanisms.

Finally, a talk by Todd Carpenter (NISO) [Slides] introduced NISO’s interests in electronic books (identification, description, preservation, discovery and distribution, XML production, and accessibility of digital content). The talk also addressed the limitations of MARC and ONIX components and finally introduced NISO’s Bibliographic Roadmap Initiative as a path towards a future bibliographic information exchange ecosystem.

These five presentations were followed by a Discussion and Feedback with the audience, moderated by David Wood (3 Round Stones Inc) [Minutes].


After the afternoon break, Liam Quin (W3C) [Slides] summarized the issues raised during the Day 2 presentations and discussions and highlighted specific areas of possible work:

  • Accessibility (DRM as a barrier; extending WCAG and/or UAAG; authoring guidelines; maths, graphics, video, simulations, interactivity and navigation; accessibility in education)
  • Digital Rights Management (the accessibility barrier; XML Sig/Enc possibly converging to a single system; fingerprinting and social DRM; long-term archiving; device convergence for DRM)
  • Metadata (identifying digital objects, cf. fingerprints; annotations; the relationship between PRISM, JDF, ODRL+RightsML and other metadata, and HTML/Web; self-cataloging eBooks; user rights)

The emphasis was on better collaboration between W3C and IDPF, and on more direct participation by publishers and other stakeholders in W3C groups.

This wrap-up of the Day 2 sessions was followed by a Discussion and Feedback with the audience [Minutes].

Conclusion and follow-up

To conclude this workshop, Ivan Herman (W3C) gave a final wrap-up of the Workshop [Slides]. Ivan emphasized the need for participation by the publishing industry in W3C and with related experts in the IDPF and BISG work. To move forward, the publishing industry should consider participation in W3C work in the following areas: various CSS issues, HTML/XHTML issues, font management, MathML, DRM/Payments & WebCrypto, metadata, etc. Stronger cooperation in testing is also necessary.

The next steps should be:

  • Organize an informal discussion between IDPF, major publishers, and the CSS WG in the coming three to four weeks to provide the CSS Working Group with specific user requirements and issues.
  • Set up a group at W3C to identify use cases and requirements for CSS, HTML, Encryption, etc.
  • Consider organizing a one-day eBook workshop in Japan in June 2013, concentrating on Asian rendering issues.

Ivan also proposed the participation of W3C experts in the IDPF work, and the Metadata work at BISG.

Finally, Ivan presented W3C plans for other workshops: a back-end/workflow workshop, likely in September in Europe, and possibly workshops on journal and magazine publishing and on metadata issues in publishing.

Ivan extended thanks to O’Reilly for hosting the Workshop; sponsors Pearson, Adobe, Google, and Microsoft; the program committee members; co-chairs Thierry Michel (W3C), Markus Gylling (IDPF) and Angela Bole (BISG); and W3C Administrative, Communications and Business Development staff for logistical and other arrangements.