First face-to-face meeting of the Digital Publishing Interest Group

Fairy Lake Botanical Garden, Shenzhen, China

The Digital Publishing Interest Group had its first face-to-face meeting during the W3C TPAC week in Shenzhen, China. The meeting’s main goal was to give more direction to the various task forces that the Interest Group had started to define in earlier weeks, by specifying their scope and main focus. Not all task forces were covered, because the two days also included meetings with experts from other W3C groups; the final scoping of the remaining task forces will have to be done in subsequent teleconferences.

The issues around pagination (based on the very first draft document) were, unsurprisingly, the most complex. Producing a document covering all aspects for all kinds of publishing would be a huge task, going beyond what the IG could reasonably do. After much discussion, the scope of the first version of the document was restricted along two axes: publishing genres and writing systems. For the former, some of the possible areas were set aside for the time being: journals and magazines, poetry, children’s picture books, comics, and manga. These genres require very special considerations (e.g., possibly fixed layout), and additional effort and resources will be needed to cover them. As far as writing systems are concerned, the group had to take into account that W3C had already published documents on Japanese Layout, and similar documents may become available in the future for Chinese, Korean, or Indic writing systems. As a consequence, the current document will concentrate on Latin-based pagination, including the various local variations for different languages or cultures.

Beyond the current pagination concerns (i.e., headers, footers, page breaks, etc.), it was also recognized that typography issues, again concentrating on Latin languages, should be considered very much in scope, along the same lines as pagination. Whether this will be a separate document or part of the pagination document is still to be decided.

Although the pagination work primarily raises issues around CSS (possible missing features, setting priorities, etc.), which was also the subject of a joint meeting of this group with the CSS Working Group, it was also recognized that pagination raises a number of problems in terms of the content model, the DOM, and the available events (e.g., an event should be raised when the user turns a page). These notions are necessary for reading systems, and it is not clear at this moment whether all the necessary features are covered by the current set of events defined for HTML, or whether DOM extensions would be necessary. A separate document will have to be published to look into this, which may result in further joint work with the HTML Working Group in the future.

A very different problem area the group looked at is what is currently known as “Behavioral Adaption”, exemplified by some use cases on the IG Wiki. Solving those problems requires some sort of additional markup identifying, e.g., the publisher’s semantics for specific elements (chapter title, index, etc.). There are different approaches: one is to use more powerful metadata syntaxes like microdata or RDFa Lite to annotate the content; the other is to use e-book specific attributes as extensions to the core HTML5 set. After discussing the pros and cons of these two alternatives, the IG decided in favor of the attribute approach. This will be considered in more detail in the months to come. The current EPUB specification already introduces an EPUB namespace, yielding epub:type attributes; however, that approach may lead to issues in the future in view of the evolution of HTML5. The direction that will be explored further is attributes of the form epub-XXXX, i.e., without the XML namespace syntax. It was recognized that a document specifying these attributes, as well as their possible values, should be produced (probably by IDPF) to get this accepted as a bona fide HTML5 extension.
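
To make the attribute approach more concrete, here is a minimal Python sketch of how a reading system might pick up publisher semantics from non-namespaced attributes. The attribute name `epub-type` is only an assumption derived from the “epub-XXXX” pattern mentioned above, and treating its value as a space-separated list is likewise an assumption; the report does not fix the actual names or values. The helper names (`SemanticCollector`, `collect_semantics`) are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical attribute name following the "epub-XXXX" pattern; the report
# leaves the final attribute names and values open.
SEMANTIC_ATTR = "epub-type"


class SemanticCollector(HTMLParser):
    """Collect (tag, semantic value) pairs for elements carrying SEMANTIC_ATTR."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == SEMANTIC_ATTR and value:
                # Assumption: the value is a space-separated list of terms.
                for token in value.split():
                    self.found.append((tag, token))


def collect_semantics(html):
    """Return every (tag, term) pair found in the given HTML fragment."""
    parser = SemanticCollector()
    parser.feed(html)
    parser.close()
    return parser.found


sample = (
    '<section epub-type="chapter">'
    '<h1 epub-type="title">Chapter 1</h1>'
    '<p>Text</p>'
    '</section>'
)
print(collect_semantics(sample))  # [('section', 'chapter'), ('h1', 'title')]
```

Because the attribute has no colon in its name, an ordinary HTML parser treats it like any other attribute, with no namespace handling required; that is the practical appeal of this direction.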

The issue of security was also addressed. After quite some discussion it was decided that this large area of concern should be made more specific, in order to decide what is, and what is not, in scope for the Interest Group. The issue of DRM on books naturally came up; indeed, it would be possible, in theory, for the IG to collect use cases for various forms of DRM. However, the feeling was that the IG would never reach consensus on such use cases, due to differing views of the underlying business models. As a consequence, the IG decided that DRM is out of scope for this IG. There are, however, other security and privacy issues that are relevant for digital publishing: e.g., what happens if a malicious URI is added to the spine of an electronic book, or what happens to the private data a reading system may collect on the user’s behaviour. These issues are very much in scope, and the decision of the IG is to explore those areas further.

There were other discussion areas, sometimes with guests coming from different groups within W3C, e.g., on accessibility or testing. The more detailed minutes, both for the first day and the second, are available online.

It was a good meeting, which also gave many participants the opportunity to meet in person for the first time! Additionally, members of the Digital Publishing Interest Group attended other working group meetings throughout the week, which was hopefully useful for everyone involved.

Last Call: CSS Syntax Module Level 3

The Cascading Style Sheets (CSS) Working Group has published a Last Call Working Draft of CSS Syntax Module Level 3. This module describes, in general terms, the basic structure and syntax of CSS stylesheets. It defines, in detail, the syntax and parsing of CSS – how to turn a stream of bytes into a meaningful stylesheet. CSS is a language for describing the rendering of structured documents (such as HTML and XML) on screen, on paper, in speech, etc. Comments are welcome through 17 December.

HTML Working Group updated HTML 5.1, HTML Canvas 2D Context, Level 2, and HTML Microdata

The HTML Working Group has updated two Working Drafts and a Working Group Note today:

  • A Working Draft of HTML 5.1, which defines the 5th major version, first minor revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML). In this version, new features continue to be introduced to help Web application authors, new elements continue to be introduced based on research into prevailing authoring practices, and special attention continues to be given to defining clear conformance criteria for user agents in an effort to improve interoperability.
  • A Working Draft of HTML Canvas 2D Context, Level 2. This specification defines the 2D Context for the HTML canvas element. The 2D Context provides objects, methods, and properties to draw and manipulate graphics on a canvas drawing surface.
  • A Group Note of HTML Microdata, which defines the HTML microdata mechanism. This mechanism allows machine-readable data to be embedded in HTML documents in an easy-to-write manner, with an unambiguous parsing model. It is compatible with numerous other data formats including RDF and JSON.
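
As a rough illustration of how microdata embedded in HTML can be turned into JSON, the following Python sketch reads the `itemscope`, `itemtype`, and `itemprop` attributes from a fragment. It is deliberately simplified and is not the conversion algorithm defined in the specification: it assumes a single, flat item whose property elements contain only text, and the output shape is chosen for illustration. The class and function names are hypothetical, and the sample markup is made up for the example.

```python
import json
from html.parser import HTMLParser


class FlatMicrodataParser(HTMLParser):
    """Simplified reader: one flat item, property values taken from text content."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self._current_prop = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # itemscope is a boolean attribute; itemtype names the vocabulary type.
            self.item["type"] = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]
            self._buffer = []

    def handle_data(self, data):
        if self._current_prop is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Assumption: property elements hold only text, so the next end tag
        # closes the property that is currently open.
        if self._current_prop is not None:
            value = "".join(self._buffer).strip()
            self.item["properties"].setdefault(self._current_prop, []).append(value)
            self._current_prop = None


def microdata_to_json(html):
    """Return a JSON string describing the single flat item in the fragment."""
    parser = FlatMicrodataParser()
    parser.feed(html)
    parser.close()
    return json.dumps(parser.item, sort_keys=True)


sample = (
    '<div itemscope itemtype="http://schema.org/Book">'
    '<span itemprop="name">Moby-Dick</span> by '
    '<span itemprop="author">Herman Melville</span>'
    '</div>'
)
print(microdata_to_json(sample))
```

Nested items, URL-valued properties (such as those on `a` or `img` elements), and `itemref` are all ignored here; a real consumer would need to follow the specification’s parsing model.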

Internationalization Tag Set (ITS) Version 2.0 is a W3C Recommendation

The W3C MultilingualWeb-LT Working Group has published a W3C Recommendation of Internationalization Tag Set (ITS) Version 2.0. ITS 2.0 provides a foundation for integrating automated processing of human language into core Web technologies. ITS 2.0 bears many commonalities with its predecessor, ITS 1.0, but provides additional concepts that are designed to foster the automated creation and processing of multilingual Web content. Work on application scenarios for ITS 2.0 and gathering of usage and implementation experience will now take place in the ITS Interest Group.

New draft for CSS writing modes

The Cascading Style Sheets (CSS) Working Group has published a Working Draft of CSS Writing Modes Level 3. CSS Writing Modes Level 3 defines CSS support for various international writing modes, such as left-to-right (e.g. Latin or Indic), right-to-left (e.g. Hebrew or Arabic), bidirectional (e.g. mixed Latin and Arabic) and vertical (e.g. Asian scripts). Inherently bottom-to-top scripts are not handled in this version.

Event Report: Multimedia Archives and Metadata for Digital Publishing

The W3C Germany and Austria office has published a report on the Multimedia Archives and Metadata for Digital Publishing September 2013 event, which was jointly held with Xinnovations. The metadata topic is covered in detail in the report and shows high relevance for a wide range of technologies – from Semantic Web to Digital Publishing and Web technology in general – and application areas: from general or scientific publishers and libraries to Wikipedia related communities. More information in German is provided by a dedicated press release.

At the CONTEC Conference

I had the pleasure of participating in the CONTEC Conference last week, which took place in conjunction with the Frankfurt Book Fair. It was really good to be there, and I would like to thank Kat Meyer for the invitation to participate. I had lots of conversations and informal or semi-informal meetings with various people; I do not want to list names because I would risk forgetting, and thereby offending, someone… Suffice it to say that it was really good for networking!

I spent most of my afternoon at two open sessions, both around EPUB 3: one on IDPF and one on Readium.org. Although, through W3C, we of course have contact with IDPF, this session was extremely useful for gaining a bit more insight into what is happening there these days. I knew about some of the work going on (e.g., the fact that EPUB 3.0.1 goes to ISO), but other items were new to me. For example, I did not know until that day that work is planned to adapt the Open Annotation Model (developed in the corresponding W3C Community Group) to EPUB. This work makes a lot of sense: portable annotations are a hugely important area for electronic books, and I am quite excited to see this work happening; I will try to keep up to date on this. The other extensions to EPUB (e.g., on indexes, usage of dictionaries) also look interesting and important. Finally, it was also interesting to see that IDPF is continuing its efforts in outreach (e.g., that it will take over the Support Grid of BISG and develop it further); I think outreach is yet another area where future cooperation between IDPF and W3C may happen.

While I of course knew many things about IDPF, the presentation on Readium.org was different: previously, I had only very vague ideas about what was going on there. The goal is to develop an open source implementation to be at the core of EPUB3 readers. This “Readium SDK” will sit on top of Open Web Platform based rendering engines like Gecko or WebKit, and should take care of all the core EPUB3-specific features (e.g., table of contents, management of indexes, packaging, etc.). The code is expected to be available at the end of the year, and we can expect the first full-blown readers in mid-2014. This can become hugely important: it means EPUB3-compliant readers can really come to the fore and, due to the architecture, those readers can evolve in parallel with browser developments.

There was also a separate presentation on the thorny issue of content protection, through the separate sub-project called LCP (Lightweight Content Protection). The way I understood it, as a kind of elevator pitch: consider what is currently available for PDF in terms of password protection and rights expressions, and adapt it to EPUB3. It is not really strong content protection, as far as I know, but it seems that at least the Readium.org participants (who include a number of publishers) consider it good enough. I do not know whether this is a solution to the current DRM issues and discussions on books, and I guess it is still controversial, but it was interesting to see that new ideas are at least being sought and implemented as alternative solutions. (To avoid any misunderstandings: the Readium SDK is not dependent on LCP; it is up to the final users of the code whether they want to include that module or not.)

Last but not least :-), Markus Gylling and I also had a session on the relationship between IDPF and W3C, entitled “Digital publishing and the open web: The W3C’s digital publishing interest group”. (The slides of the session are also available online.) We explained the reasons for setting up the W3C Interest Group; that the publishing industry should play a more active role in the development of the Open Web Platform; what has already been achieved; and also how the cooperation between IDPF and W3C is essential in this respect. Although it was not a huge room, it was certainly full, with around 50-60 people (out of around 250 attendees overall at CONTEC). It was great to see that many of the participants, who may not have heard of us before, became really interested in the issues around the Open Web Platform; hopefully, this will be the basis for more contact and cooperation in the future!

It was a good day!

Update from the Co-Chairs

In the first two months since the launch of the Digital Publishing Interest Group, we have already identified approximately 36 Use Cases. They include narratives for pagination, annotation, the representation of mathematical and scientific content in reflowable MathML, and accessibility scenarios for learning materials personalized for specific conditions such as dyslexia. Robert Sanderson provided a suite of Use Cases for the basic model for commenting, annotating, and tagging with persistent layout, and with that we have a full spectrum of social reading examples. New use cases are added weekly, so please check in regularly with our Directory on the DigiPub wiki.

Having real-world examples from users is critical in identifying the technical requirements and the Working Groups that will provide the specification for a seamless, portable, and enjoyable reading or learning experience. User experience will no doubt provide more information as our last weekly meeting explored internationalization, second screen / multi-screen, and the convergence of journals, books, and testing. Use Cases for these are hotly anticipated.

Meanwhile, two Task Force developments are underway. Dave Cramer (Hachette Livre) kindly agreed to lead the Pagination team, and Suzanne Taylor (Pearson Education) will lead Accessibility. Both of these bring attention to the evolving expectations of the digital narrative as we discover different “rules” for the various kinds of publishing, e.g. STEM, Professional, Education – Testing.

Thanks to the participants of the group for their generosity. If we deliver these open specifications, we will have the potential to significantly change the way we deliver and consume information. With the recent announcement from Digital Book World magazine of a $13 e-reader called Beagle, the idea of an eventual free e-reader can’t be far off. Smartphones are also beginning to use better e-ink to display text and, with 87% of the population owning one, their reach should not be underestimated.

On behalf of my Co-Chair Markus Gylling and myself, thank you; we look forward to keeping you updated on our progress.

Use Cases and Exploratory Approaches for Ruby Markup (Note)

The W3C Internationalization Working Group has published a Group Note, Use Cases & Exploratory Approaches for Ruby Markup.

This document was designed to support discussion about what is needed in the HTML5 specification, and possibly other markup vocabularies, to adequately support ruby markup. It describes a number of use cases associated with ruby usage, and then examines a number of possible ruby markup approaches for each use case, listing pros and cons for each approach.
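
For readers unfamiliar with ruby, the following Python sketch generates the basic HTML ruby structure using the `ruby`, `rt`, and `rp` elements. It is only an illustration of the simplest case (one base text, one annotation) and does not reproduce any of the specific approaches compared in the Note; the function name and its parameters are hypothetical.

```python
from html import escape


def ruby(base, annotation, fallback_parens=True):
    """Return HTML ruby markup pairing a base text with its annotation."""
    rt = f"<rt>{escape(annotation)}</rt>"
    if fallback_parens:
        # <rp> supplies parentheses for user agents that do not render ruby.
        rt = f"<rp>(</rp>{rt}<rp>)</rp>"
    return f"<ruby>{escape(base)}{rt}</ruby>"


# One annotation for the whole word.
print(ruby("東京", "とうきょう"))

# One annotation per base character, as an alternative arrangement.
print("".join(ruby(b, a) for b, a in [("東", "とう"), ("京", "きょう")]))
```

The two calls show why markup choices matter: the same word can be annotated as a whole or character by character, and each arrangement leads to different markup.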

CSS Style Attributes Proposed Recommendation Published

The Cascading Style Sheets (CSS) Working Group has published a Proposed Recommendation of CSS Style Attributes. Markup languages such as HTML and SVG provide a style attribute on most elements, to hold inline style information that applies to those elements. This draft describes the syntax and interpretation of the CSS fragment that can be used in such style attributes. Learn more about the Style Activity.
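
To give a feel for what the content of a style attribute looks like to a processor, here is a small Python sketch that splits a declaration list such as `color: red; margin-left: 2em` into property/value pairs. It is a simplification under stated assumptions, not the parsing rules of the specification: it splits naively on `;` and on the first `:`, so semicolons inside strings or `url()` values, comments, and escapes are not handled. The function name is hypothetical.

```python
def parse_style_attribute(value):
    """Split a style attribute value into a {property: value} dict.

    Simplification: splits on ';' and the first ':' only, so strings, url()
    values containing ';', comments, and escapes are not handled.
    """
    declarations = {}
    for chunk in value.split(";"):
        name, sep, val = chunk.partition(":")
        name, val = name.strip(), val.strip()
        if not name.startswith("--"):
            # Ordinary property names are case-insensitive in CSS.
            name = name.lower()
        if sep and name and val:
            # A later declaration of the same property replaces an earlier one.
            declarations[name] = val
    return declarations


print(parse_style_attribute("color: red; margin-left: 2em"))
# {'color': 'red', 'margin-left': '2em'}
```

Note that a style attribute holds only declarations: there are no selectors or braces, because the declarations apply to the element that carries the attribute.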