The W3C Pointer Events Working Group has published a W3C Recommendation of Pointer Events. The Pointer Events specification defines a unified set of events and interfaces for device-neutral pointer input, such as a mouse, touchscreen, and pen-tablet, including capabilities for handling pointer pressure, contact geometry, and tilt; it also defines a mapping to traditional mouse events. This specification provides additional functionality not available in the related Touch Events specification; for more information on the relationship between these two specifications, see the Touch Events Community Group.
(Meta comment: the W3C Digital Publishing IG has weekly teleconferences. The minutes of the meetings, as well as short summaries, are available online. However, to give them greater visibility, from now on these summaries will be published on this blog rather than only on the wiki.)
Metadata Task force and identifiers
Some of the crucial issues related to EPUB-WEB are around identifiers, fragments, etc. It was suggested that the former Metadata Task Force would concentrate on these, identifying use cases and requirements primarily in the area of fragment identifiers. While the problem area around fragments is relatively clear, the issues on identifiers, and how they would affect EPUB-WEB, are more complex. Indeed, many identifiers used out there are based on registries and are only loosely coupled with HTTP URI-s; also, many discussions in that space are happening outside this group. The way forward is probably to “reset” the Metadata Task Force, essentially by creating a new task force to make the intentions clear.
(There are some very initial thoughts on identifiers and EPUB-WEB on the epubweb wiki.)
Overview of the Web Packaging draft
Three main areas of attention in the draft are:
- Packaging itself, based on (essentially) a multipart MIME approach. The important point is that, conceptually, a package is a concatenation of HTTP responses, including HTTP headers, for specific resources into one package resource; the package itself may also have its own HTTP header. This approach brings the package very close to current Web technologies, and provides a rich possibility of metadata on each resource as defined in the HTTP standard. (E.g., an EPUB “spine” can be implemented through these headers.)
- Fragment identifier, as defined in the document, is based on the idea of:
  - define a set of “candidate” parts within the package (listing a set of possible URL-s, for example)
  - choose among the candidates using some filters (essentially content negotiations based on
  - use a fragment as defined for that specific media type; i.e., EPUB-WEB can rely on existing and evolving fragment identifications for different media without having to reinvent its own.
- “Link relations”, either in the form of an HTTP Link header or an HTML <link> element. These provide a suitable entry point to an EPUB-WEB document: e.g., a landing page refers to the package (i.e., the possibly offline document).
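To make the "concatenation of HTTP responses" idea more tangible, here is a minimal sketch in Python. It is not the format defined in the Web Packaging draft: it only uses the standard library's `email` package as a stand-in for multipart MIME handling. The `multipart/mixed` subtype, the use of `Content-Location` to name each part, and the function names `build_package` and `read_package` are assumptions made for illustration.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package(resources):
    """Serialize (url, text subtype, body) triples into one multipart document.

    Each part carries its own headers, much like an HTTP response would.
    The header choice (Content-Location) is an assumption for this sketch.
    """
    package = MIMEMultipart("mixed")
    for url, subtype, body in resources:
        part = MIMEText(body, subtype, "utf-8")
        part["Content-Location"] = url
        package.attach(part)
    return package.as_string()


def read_package(serialized):
    """Return (url, content type, body) for every part, in package order."""
    package = message_from_string(serialized)
    parts = []
    for part in package.get_payload():
        body = part.get_payload(decode=True).decode("utf-8")
        parts.append((part["Content-Location"], part.get_content_type(), body))
    return parts
```

Because the parts keep their order and their per-part headers, something like a reading order could in principle be expressed through them, and the serialized result remains plain text that can be inspected in an editor.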
Subsequent discussions looked at the question of where such packaging would be advantageous compared to ZIP. The document mentions facilities of streaming, tooling support, and richer per-part metadata; the feeling on the call was that the last argument is the strongest in favor of Web Packaging (although the availability of HTTP related tooling when handling the content of a package was also deemed to be important).
It is worth mentioning that Dave Cramer made a test of what the (ubiquitous) Moby Dick could look like in a package. The package can be downloaded from the Web (note that the fact that it is a “ZIP” file is just a means to make the file smaller in an email; the package itself can be looked at in a text editor).
It was emphasized that the Digital Publishing community is in a unique position to strongly influence the evolution of Web Packaging, because the work is at its starting phase; there is a window of opportunity right now to join the relevant Working Group, possibly acting as editor.
Overview of the Manifest draft
The question, from the EPUB-WEB point of view, is whether that manifest format can be used as a manifest for EPUB-WEB documents.
The manifest is a JSON file that can be associated to a resource via a specific <link> element. It has a number of metadata terms that are currently aimed at web applications (icons with their sizes, display formats, etc.). Three specific issues were brought forward:
- The manifest has a notion of “scope”: a URL that represents the scope of URLs that can be navigated within context (note that web packaging also has the notion of a “scope”). It is not clear whether that functionality is enough for EPUB-WEB to help in identification.
- Display mode: this is one of the terms defined by the manifest and may be very important for personalization.
- Openness (or closedness) of the manifest terms: is it possible to add or define additional terms that are more important to the publishing community? It was felt that some sort of an extension structure, whereby various communities could add their own terms, would be a way forward, rather than casting a specific set of terms in concrete.
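The “scope” notion can be illustrated with a small sketch. The manifest content, the example URLs, and the function name `within_scope` are hypothetical, and the check is a simplification (same origin plus a path prefix), not a restatement of the manifest draft's exact rules.

```python
import json
from urllib.parse import urlsplit

# A hypothetical manifest for a publication; the values are made up.
MANIFEST_JSON = """
{
  "name": "Moby Dick",
  "display": "standalone",
  "scope": "https://example.org/books/moby-dick/"
}
"""


def within_scope(manifest, url):
    """Simplified check: same origin as the scope URL and path under the scope path."""
    scope = urlsplit(manifest["scope"])
    target = urlsplit(url)
    same_origin = (scope.scheme, scope.netloc) == (target.scheme, target.netloc)
    return same_origin and target.path.startswith(scope.path)


manifest = json.loads(MANIFEST_JSON)
```

With this manifest, a chapter URL under `/books/moby-dick/` would count as part of the publication, while a URL elsewhere on the same site or on another host would not. Whether such a prefix test is sufficient to identify a whole EPUB-WEB document is exactly the open question raised above.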
(This is a reproduction, with permission, of a blog post by Liza Daly, published on Safari’s blog on the 22nd of January.)
The World Wide Web Consortium (W3C) is a standards organization serving the “open web” — the set of freely available specifications that underpin most of the visible internet. In the years since the W3C was founded, all modern businesses have become “web” businesses, with their own industry-specific processes, jargon, and priorities. To that end, the W3C has formed interest groups for those industries which are adjacent to the web, with a goal to promote web technologies and ensure that the web is meeting common commercial needs.
I was co-chair for the Digital Publishing Interest Group for a time, and I have first-hand exposure to their work in interviewing publishers, documenting best practices, and writing recommendations for future specifications.
One of those deliverables is an intimidating table of W3C specifications and standards that were considered relevant to digital publishing. There’s a lot to digest there, and it’s unlikely that any single human is deeply familiar with all of it. I’ve provided an opinionated gloss of the most relevant or active standards, and feel free to comment if I’ve disparaged or ignored your favorite specification.
I’m assuming that the reader is one of the following:
- A developer who is working in digital publishing
- A curious non-developer who isn’t afraid of the word “normative” and acronyms that begin with ‘X’
- A standards wonk who wants to be more familiar with publishing activity
These are the “bread and butter” of digital publishing — whether it’s commercial ebooks, academic publishing, or journals:
There’s the workhorse CSS 2.1 specification which has been around for a decade. Unfortunately for the curious but lazy, all the cool new stuff is in CSS3, and that spec is broken out into many modules. Here’s a drive-by of the most interesting or publishing-relevant ones:
- Start with Dave Cramer’s highly readable Requirements for Latin Text Layout and Pagination (“Latin” here means Western languages, not veni, vidi, vici). Note that this is a requirements document, not a spec, which means much of what Dave recommends won’t actually work anywhere yet. Welcome to standards!
- CSS Text Module Level 3 is the “real world” equivalent to the above. Though it’s technically a spec in progress, most everything in here is available in modern browsers and reading systems.
- CSS Regions Module Level 1 is a good read when you want to be angry about something. Regions can do some amazing things for advanced layout, but there’s a long and sordid history behind their implementation and deployment. There’s a lot of momentum behind getting Regions or an equivalent standard moving again, so there’s hope.
Extra credit assignments: CSS Media Queries and CSS Fonts Module Level 3. And while it’s unlikely that you’d need to actually read the SVG and MathML specs, it’s important to be familiar with those formats at a high level.
The simplest way to approach accessible web or ebook content is to study the semantics that are built in to HTML5. High-quality semantic markup will not only help a range of human users, it’ll aid in discovery and ranking by search engines.
It’s not dead yet! There’s a lot of cruft in the list, but ebooks are still required to be well-formed XML documents, and academic publishing remains dominated by XML (and, sigh, PDF).
- Extensible Markup Language (XML) 1.0 (Fifth Edition) The ur-spec. If you’re new to XML, don’t try to read this.
- XSL Transformations (XSLT) Version 2.0 Even if you never write any XSLT, you should know what it is and when it’s useful. There’s a version 3, but even version 2 is only somewhat common; you may need to refer back to XSLT 1 to work in Python or many other languages.
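As a quick illustration of what “well-formed” means in practice, here is a minimal sketch using Python's standard library parser. The helper name `is_well_formed` is hypothetical, and the check covers well-formedness only; it says nothing about whether a document is valid against any schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, multiple root elements, etc.
        return False
    return True
```

A fragment such as `<p>Call me <em>Ishmael</em>.</p>` passes, while the same fragment with the closing `</em>` missing does not.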
The W3C Web Applications Working Group has published a W3C Recommendation of Indexed Database API. This document defines APIs for a database of records holding simple values and hierarchical objects. Each record consists of a key and some value. Moreover, the database maintains indexes over records it stores. An application developer directly uses an API to locate records either by their key or by using an index. A query language can be layered on this API. An indexed database can be implemented using a persistent B-tree data structure.
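The idea of records with keys, values, and indexes can be sketched in a few lines of Python. This is not the Indexed Database API itself, which is a browser API with transactions and asynchronous requests. The class `TinyIndexedStore` and its method names are hypothetical, and plain dictionaries are used here in place of the persistent B-tree mentioned above.

```python
from collections import defaultdict


class TinyIndexedStore:
    """In-memory illustration: records located by primary key or through an index."""

    def __init__(self):
        self._records = {}  # primary key -> value (a dict)
        self._indexes = {}  # index name -> (property name, {property value -> set of keys})

    def create_index(self, name, prop):
        """Index existing and future records by one property of their value."""
        entries = defaultdict(set)
        for key, value in self._records.items():
            if prop in value:
                entries[value[prop]].add(key)
        self._indexes[name] = (prop, entries)

    def put(self, key, value):
        """Insert or overwrite a record, keeping every index in sync."""
        old = self._records.get(key)
        for prop, entries in self._indexes.values():
            if old is not None and prop in old:
                entries[old[prop]].discard(key)
            if prop in value:
                entries[value[prop]].add(key)
        self._records[key] = value

    def get(self, key):
        """Look a record up by its primary key."""
        return self._records.get(key)

    def get_by_index(self, name, prop_value):
        """Return all records whose indexed property equals prop_value."""
        _, entries = self._indexes[name]
        return [self._records[k] for k in sorted(entries.get(prop_value, ()))]
```

A caller would create an index such as `by_author` on the `author` property, `put` a few book records, and then retrieve them either with `get(key)` or with `get_by_index("by_author", "Herman Melville")`.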
The Digital Publishing Interest Group has published a Group Note of DPUB IG Metadata Task Force Report. The Metadata Task Force of the DPUB IG found, through extensive interviews with representatives of various sectors and roles within the publishing ecosystem, that there are numerous pain points for publishers with regard to metadata but that these pain points are largely not due to deficiencies in the Open Web Platform. Instead, there is a widespread lack of understanding or implementation of the technologies that the OWP already makes available for addressing most of the issues raised. However, some of the very technologies that are little used or understood in most sectors of publishing are widely used and understood in certain other sectors (e.g., scientific publishing, libraries). Priorities that have emerged are the need for better understanding of the importance of expressing identifiers as URIs; the need for much more widespread use of RDF and its various serializations throughout the publishing ecosystem; and the need to develop a truly interoperable, cross-sector specification for the conveyance of rights metadata (while remaining agnostic as to the sector-specific vocabularies for the expression of rights). This Note documents in detail the issues that were raised; provides examples of available RDF educational resources at various levels, from the very technical to non-technical and introductory; and lists important identifiers used in the publishing ecosystem, documenting which of them are expressed as URIs, and in what sectors and contexts. It recommends that while little new technology is called for, the W3C is in a unique position to bridge today’s siloed metadata practices to help facilitate truly cross-sector exchange of interoperable metadata. This Note is thus intended to provide background and a context in which concrete work, whether by this Task Force or elsewhere within the W3C, may be undertaken.
The next gathering of the EDUPUB community will take place in Phoenix, Arizona (USA) on February 26 and 27, 2015. A preliminary program is now available, and registration is open. The summit will launch the implementation phase of EDUPUB, a cross-organizational initiative to develop a comprehensive open platform for next-generation learning content based on EPUB 3, IMS standards for learning environment integration, and the overall Open Web Platform.
The Web Annotation Working Group has published a First Public Working Draft of Web Annotation Data Model. Annotations are typically used to convey information about a resource or associations between resources. Simple examples include a comment or tag on a single web page or image, or a blog post about a news article. The Web Annotation Data Model specification describes a structured model and format to enable annotations to be shared and reused across different hardware and software platforms.
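As a rough picture of what a structured annotation might look like, here is a minimal sketch that builds a comment about a web page and serializes it as JSON. The key names (`type`, `body`, `target`, `value`) and the function `make_annotation` are assumptions chosen for illustration; the draft's own serialization may use different terms.

```python
import json


def make_annotation(target_url, comment):
    """Build a simple annotation: a textual comment (body) about a resource (target).

    The key names are illustrative assumptions, not taken from the draft.
    """
    return {
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment},
        "target": target_url,
    }


annotation = make_annotation(
    "https://example.org/news/article.html",  # hypothetical page being commented on
    "A useful summary of the announcement.",
)
serialized = json.dumps(annotation, indent=2)
```

Because the result is plain JSON, it can be stored, exchanged, and read back by different software, which is the kind of sharing and reuse the data model aims to enable.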
The Digital Publishing Interest Group has published a Group Note of Digital Publishing Annotation Use Cases. This document describes the set of use cases generated for Annotation and Social Reading within the W3C Digital Publishing Interest Group, in coordination with the Open Annotation Community Group. This Note will also serve as an input for the W3C Web Annotation Working Group.
The HTML Working Group published HTML5 as a W3C Recommendation. This specification defines the fifth major revision of the Hypertext Markup Language (HTML), the format used to build Web pages and applications, and the cornerstone of the Open Web Platform.
“Today we think nothing of watching video and audio natively in the browser, and nothing of running a browser on a phone,” said Tim Berners-Lee, W3C Director. “We expect to be able to share photos, shop, read the news, and look up information anywhere, on any device. Though they remain invisible to most users, HTML5 and the Open Web Platform are driving these growing user expectations.”
HTML5 brings to the Web video and audio tracks without needing plugins; programmatic access to a resolution-dependent bitmap canvas, which is useful for rendering graphs, game graphics, or other visual images on the fly; native support for scalable vector graphics (SVG) and math (MathML); annotations important for East Asian typography (Ruby); features to enable accessibility of rich applications; and much more.
With today’s publication of the Recommendation, software implementers benefit from Royalty-Free licensing commitments from over sixty companies under W3C’s Patent Policy. Enabling implementers to use Web technology without payment of royalties is critical to making the Web a platform for innovation.
Read the Press Release, testimonials from W3C Members, and acknowledgments. For news on what’s next after HTML5, see W3C CEO Jeff Jaffe’s blog post: Application Foundations for the Open Web Platform. We also invite you to check out our video Web standards for the future.
The Digital Publishing Interest Group has published a new Working Draft of “Requirements for Latin Text Layout and Pagination”. This document describes requirements for pagination and layout of books in Latin languages, based on the tradition of print book design and composition. It is hoped that these principles can inform the pagination of digital content as well, and serve as a reference for the CSS Working Group and other interested parties.