## New PWP Draft Published

As one of the results of the busy TPAC F2F meeting of the DPUB Interest Group (see the separate reports on TPAC for the first and second F2F days), the group has just published a new version of the Portable Web Publications for the Open Web Platform (PWP) draft. This draft incorporates the discussions at the F2F meeting.

As a reminder: the PWP document describes a vision of the future relationship between Digital Publishing and the Open Web Platform. The vision can be summarized as:

Our vision for Portable Web Publications is to define a class of documents on the Web that would be part of the Digital Publishing ecosystem but would also be fully native citizens of the Open Web Platform. In this vision, the current format- and workflow-level separation between offline/portable and online (Web) document publishing is diminished to zero. These are merely two dynamic manifestations of the same publication: content authored with online use as the primary mode can easily be saved by the user for offline reading in portable document form. Content authored primarily for use as a portable document can be put online, without any need for refactoring the content. Publishers can choose to utilize either or both of these publishing modes, and users can choose either or both of these consumption modes. Essential features flow seamlessly between online and offline modes; examples include cross-references, user annotations, access to online databases, as well as licensing and rights management.

The group has already had lots of discussions on this vision, and published a first version of the PWP draft before the TPAC F2F meeting. That version already included a series of terms establishing the notion of Portable Web Documents and also outlined a draft architecture for PWP readers based on Service Workers. The major changes in the new draft (beyond editorial changes) include a better description of that architecture, a reinforced view and role for manifests and, mainly, a completely rewritten section on addressing and identification.

The updated section distinguishes between the role of identifiers (e.g., ISBN, DOI, etc.) and locators (or addresses) on the Web, typically an HTTP(S) URL. While the former is a stable identification of the publication, the latter may change when, e.g., the publication is copied, made private, etc. Defining identifiers is beyond the scope of the Interest Group (and indeed of W3C in general); the goal is to further specify the usage patterns around locators, i.e., URL-s. The section looks at the issue of what an HTTP GET would return for such a URL, and what the URL structure of the constituent resources is (remember that a Web Publication is defined as a set of Web Resources with its own identity). All these notions will need further refinement (and the IG has recently set up a task force to look into the details) but the new draft gives a better direction to explore.

As always, issues and comments are welcome on the new document. The preferred way is to use the GitHub issue tracker but, alternatively, emails can be sent to the IG’s mailing list.

## DPUB IG Telco, 2015-11-23: PWP Locator task force, planning on PWP Work

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Following the discussions on PWP identifiers last week, a task force has been set up, led by Bill Kasdorf. There were some discussions on the call about the goals of the task force (this has to be cleaned up), but the general ideas are:

• The task force should concentrate on locators (as opposed to identifiers) both for the PWP level as well as on the individual resources’ level
  • I.e., dealing with identifiers (ISBN-s of different sorts, ISTC work, DOI-s, etc.) is out of scope, as are the issues around fragment identifiers; hence also the name of the task force
• The task force should dig into the addressing/identifier work described in the PWP document, flesh out the details, possibly produce a mock-up implementation, and identify whether, and which parts of, this work would require targeted Recommendation/standardization work (either at W3C, or at IDPF, or in a joint group)
• The task force should also provide input to the IDPF EPUB3.1 work, which is looking at a “browser friendly manifestation” of EPUB. The goal of EPUB3.1 work, in this respect, would be to be forward compatible with an eventual PWP work

There was also some technical discussion, emphasizing the fact that a PWP can be a collection of very different resources from all over the place, where the order of resource access (reading) can differ from one PWP to another even if they share resources. The locator structure should make this possible (e.g., via a manifest).
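As a rough illustration of that last point, the sketch below models two publications that share resources but present them in a different order. The `PwpManifest` shape, its field names, and the example URLs are all assumptions made for this illustration; the PWP draft does not define a manifest format at this stage.

```typescript
// Hypothetical manifest shape; the PWP draft does not define these field names.
interface PwpManifest {
  locator: string;        // URL of the publication as a whole
  readingOrder: string[]; // resource URLs, in the order this publication presents them
}

// Two publications that draw on the same resources, spread over several hosts
const anthology: PwpManifest = {
  locator: "http://example.com/anthology",
  readingOrder: [
    "http://example.org/essays/intro.html",
    "http://example.net/stories/a.html",
    "http://example.net/stories/b.html",
  ],
};

const reader: PwpManifest = {
  locator: "http://example.com/reader",
  readingOrder: [
    "http://example.net/stories/b.html",
    "http://example.org/essays/intro.html",
  ],
};

// Returns the resource that follows `current` in this publication,
// or null if `current` is the last resource or is not part of the publication.
function nextResource(manifest: PwpManifest, current: string): string | null {
  const index = manifest.readingOrder.indexOf(current);
  if (index === -1 || index === manifest.readingOrder.length - 1) {
    return null;
  }
  return manifest.readingOrder[index + 1];
}
```

The same resource (`b.html`) ends one publication and opens the other, so "what comes next" can only be answered relative to a given manifest, not from the resource URL alone.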

## Planning PWP Work

There is a need for more general planning on where the PWP work ought to be going. The terminology-state-identifier-locator discussion has resulted in a more stable basis, and the task force on locators will dig into the details. What else? Ideas that came up:

• Looking at the library and archiving community. A focussed work will be pursued to see what specific needs that community may have and whether what is in the PWP document is adequate or not, whether it has to be extended, etc.
• The presentation control issue needs further work
• Other issues listed in the PWP draft should also be checked.
• Some sort of a proof-of-concept implementation is necessary to identify the necessary missing bits

For the last issue: Dave Cramer has recently created a simple mock-up based on the earlier discussion with, and work of, Jake Archibald. (Dave’s repo is also available for cloning.) This is a tremendous start, and it has been agreed that Dave would give a more detailed overview of what is happening there on one of the next calls.

## Miscellaneous

The Interest Group has agreed to publish the next version of the PWP document as a formal Interest Group Draft. It should be out on Thursday the 26th.

The group has been reminded of the need for better CSS examples, and some further ideas came up.

## DPUB IG Telco, 2015-11-16: CSSWG examples, DPUB-AAM, PWP Identifiers

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

Note that we experienced telco problems, which cut some of the discussions short and made them slightly chaotic…

## CSS WG Examples

As agreed on the last call, the IG is supposed to collect CSS examples of typesetting issues the community has. This is an ongoing effort; participants were reminded of this. Some new volunteers came forward on the call.

## DPUB ARIA Update

The ARIA technology has two parts:

• The definition of the ARIA terms proper, for which, in the digital publishing domain, there is now a (soon to be updated) working draft
• Mapping of the ARIA terms onto the various Assistive Technology Interfaces available today; this makes it possible to use the ARIA terms with those technologies.

Richard Schwerdtfeger has edited a draft for the mapping of the DPUB ARIA terms. It should complement the DPUB ARIA term specifications themselves. The DPUB IG was asked to approve the publication of that draft (formally done by the ARIA Working Group). The approval was voted on at the meeting.

## PWP Identifiers

Ivan Herman gave an overview of some of the proposed changes to the PWP draft. The new, proposed draft introduces changes based on the various discussions at the Sapporo F2F meeting.

Some of the proposed changes are minor: reinforcing the importance of manifests, or raising issues on how files on the local file systems should be handled by service workers. The major changes relate to the role and usage of identifiers, based on the specific session at the meeting (introduced by a slide set for the discussion). There are several aspects listed below; it has been agreed to provide more comments and issues on the draft and try to publish a new, official draft soon.

### What type of identifiers do we have

The previous discussions included references to the fact that identifiers may have several usages (the work, a particular copy, a particular edition, etc.) and each would have to have several identifiers. However, it was also emphasized that the DPUB IG, or a future formal PWP specification, cannot decide on these issues. On the other hand, a clear locator, to uniquely ‘find’ a PWP on the Web, is essential. The proposal is therefore to include, in the document, both an identifier and a locator; the identifier is stable and can be any kind of URN (e.g., a DOI, an ISBN, etc.), whereas a locator should be unique, and should be an HTTP(S) reference on the Web. Subsequent discussions made it clear that (a) the two URI-s may coincide and (b) it may be possible to have several identifiers. The PWP level metadata may include some extra relationships (e.g., on provenance) between those two URI-s, but, at this moment, those are not specified.
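The identifier/locator split can be pictured as a small record in which the identifiers travel with the publication while the locator changes whenever a copy is moved. The following is only a sketch under stated assumptions: the `PwpAddressing` type, its field names, and the `relocate` helper are hypothetical, and the identifier value is a made-up placeholder rather than a real ISBN.

```typescript
// Hypothetical record; field names are illustrative and not taken from the PWP draft.
interface PwpAddressing {
  identifiers: string[]; // one or more stable identifiers (e.g., DOI- or ISBN-based URIs)
  locator: string;       // HTTP(S) URL where this particular copy can be found
}

// True if the URI uses the http or https scheme
function isHttpLocator(uri: string): boolean {
  return /^https?:\/\//i.test(uri);
}

// Copying or moving a publication changes where it lives, not what it is:
// the identifiers are carried over unchanged, only the locator is replaced.
function relocate(publication: PwpAddressing, newLocator: string): PwpAddressing {
  if (!isHttpLocator(newLocator)) {
    throw new Error("A locator is expected to be an HTTP(S) URL: " + newLocator);
  }
  return {
    identifiers: publication.identifiers.slice(),
    locator: newLocator,
  };
}

// Placeholder values for illustration only
const original: PwpAddressing = {
  identifiers: ["urn:isbn:0000000000"],
  locator: "http://example.com/2",
};
const privateCopy = relocate(original, "https://intranet.example.org/books/2");
```

Here `privateCopy` keeps the identifier of `original` but has a different locator, which matches the distinction drawn in the discussion; what relationships (if any) should be recorded between the two copies is left open, as in the draft.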

### If one dereferences the canonical URL, what is returned?

Essentially, a manifest: either directly, or via a <link> element or a Link: header in the HTTP response. The role of the manifest, beyond containing additional metadata, is to “represent” the PWP as a whole.
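To make the header-based route concrete, here is a minimal sketch of how a client might pull a manifest URL out of an HTTP Link header. It assumes a relation name of `manifest`, which is a choice made for this example only (the draft does not fix a relation type), and the parser is deliberately simplified: it does not handle commas inside quoted parameter values.

```typescript
// Simplified Link header lookup: returns the target of the first link whose
// rel parameter contains the requested relation, or null if there is none.
function findLinkTarget(header: string, rel: string): string | null {
  const linkPattern = /<([^>]*)>([^,]*)/g; // "<target>" followed by its parameters
  let match: RegExpExecArray | null = linkPattern.exec(header);
  while (match !== null) {
    const target = match[1];
    const params = match[2];
    const relMatch = /;\s*rel\s*=\s*"?([^";]*)"?/i.exec(params);
    if (relMatch !== null) {
      // A rel value may list several space-separated relation types
      const rels = relMatch[1].trim().toLowerCase().split(/\s+/);
      if (rels.indexOf(rel.toLowerCase()) !== -1) {
        return target;
      }
    }
    match = linkPattern.exec(header);
  }
  return null;
}

// Example header value, as it might accompany a response for http://example.com/2
const exampleLinkHeader =
  '<http://example.com/2/style.css>; rel="stylesheet", ' +
  '<http://example.com/2/manifest.json>; rel="manifest"';

const manifestUrl = findLinkTarget(exampleLinkHeader, "manifest");
```

A client that receives the manifest directly would skip this step; one that receives HTML would look for the equivalent <link> element instead.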

### What is the URL of the constituent Resources within a PWP

The URL of the PWP as a whole establishes some sort of a “context” for URLs. I.e., if the URL of the PWP is http://example.com/2, then the constituents may be http://example.com/2/index.html. I.e., everything is interpreted with the URL of the PWP as the base.

This is a simple approach, though the Resources may be spread over the Web, so this may not be enough. An idea is to have some sort of a mapping within the manifest to map this view onto “real” URI-s in that case.
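A small sketch may help with both points. It uses the standard `URL` constructor to resolve a constituent against the PWP locator, and a hypothetical manifest mapping (the `mapping` object and the `locate` helper are assumptions of this example, not part of the draft) to redirect a PWP-scoped URL to wherever the resource really lives.

```typescript
// Treats the PWP locator as a directory-like base so that relative paths resolve inside it.
function resolveInPwp(pwpLocator: string, relativePath: string): string {
  const endsWithSlash = pwpLocator.charAt(pwpLocator.length - 1) === "/";
  const base = endsWithSlash ? pwpLocator : pwpLocator + "/";
  return new URL(relativePath, base).href;
}

// Hypothetical manifest mapping: PWP-scoped URL -> "real" URL elsewhere on the Web.
// Resources without an entry are assumed to live at their PWP-scoped URL.
function locate(
  pwpLocator: string,
  relativePath: string,
  mapping: { [scopedUrl: string]: string }
): string {
  const scoped = resolveInPwp(pwpLocator, relativePath);
  return Object.prototype.hasOwnProperty.call(mapping, scoped) ? mapping[scoped] : scoped;
}

const exampleMapping: { [scopedUrl: string]: string } = {
  "http://example.com/2/img/cover.png": "http://cdn.example.net/covers/2.png",
};

const page = locate("http://example.com/2", "index.html", exampleMapping);
const cover = locate("http://example.com/2", "img/cover.png", exampleMapping);
```

One detail worth noting: plain URL resolution against http://example.com/2 (without a trailing slash) would replace the last path segment and yield http://example.com/index.html, which is why the sketch normalizes the base first. Whether a PWP locator should be treated this way is exactly the kind of question left to the task force.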

Fragments should not be defined by and for PWP. With this approach, the fragment identifiers are “simply” those that are defined by the community at large for the specific media type.

### Cooperation with the IDPF EPUB 3.1 effort on identifiers

The EPUB 3.1 effort also looks at the issue of identifiers in a possible approach of “forward compatibility” with an eye on PWP. Details of this should be discussed. To be picked up at future meetings.

## DPUB IG Telco, 2015-11-09: DPUB-ARIA Update, CSS WG examples, POM

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

## DPUB ARIA Update

The latest version of the DPUB ARIA Module Working Draft was published in July; a new version should come soon. Tzviya Siegman summarized some of the new terms that have been added since: colophon, credits, epigraph, errata. There was also discussion on noteref, glossref, etc.

There were also discussions on the ARIA mailing list on how to handle roles on links. EPUB has been doing that (‘noteref’) and it is very useful for Assistive Technologies. It is not clear, at this moment, whether @role should be used for that, or whether, for example, @rel is more appropriate. This is a decision the ARIA WG should make.

Another issue is the long term planning of the evolution of the vocabulary vs. the vocabulary currently available in EPUB (the latter is much larger). At the moment the plan is to back port the ARIA terms into EPUB and, at the same time, shrink the EPUB terms. A golden middle will have to be found. (There are some tensions with the ARIA group about how many terms we should use there.)

Overall, the IG is in favor of publishing a new WD (although the final decision is the ARIA WG’s). There is also a call for testimony from organizations that use, or plan to use, this vocabulary.

## CSS WG Examples

The discussion with the CSS WG at TPAC (see the minutes of the relevant TPAC session) revealed that a more systematic set of use cases should be provided, including screen dumps, etc, to show what should be rendered and how this should be achieved through CSS. Florian (who is also part of the CSS WG) gave additional rationale for this.

There was some discussion on how to do that in practice, and what the priority for those should be. At the moment, two use cases came to the fore for a first round: table alignment (e.g., aligning table cells on, say, the fraction sign of numbers) and inline grid management for CJK languages. A wiki page will be set up to collect these and there is a general call for members of the IG (and anybody else…) to provide as many cases as possible.

## Publication format for the POM CG

The Publication Object Model Community Group has been set up by Daniel Glazman, following a discussion at TPAC. That community group needs examples of various publication formats beyond EPUB and PDF to have enough input to be able to define the POM API in a general way. This may include Manga and Comic formats, and also KF8. It is important to provide such information to the POM CG. (Information on Kindle’s KF8 has already been provided after the meeting.)

## TPAC Summary: Day 2: Ed & Outreach, IDs, ARIA

Full minutes for Day 2 are available.

After an exciting night of hunting for green Kit Kats and Pokemon, we regrouped for another day of meetings.

### Education and Outreach

Karen Myers is looking for more authors from the DPUB IG to write ~500-word pieces, frequent updates to tell everyone what we are doing. Topics might include working with IDPF or TF updates. Every time we publish anything, we should blog about it. Our blog (the one you are reading right now) needs to have more than short minutes.
Conversation then shifted to trade press and conferences. We had a great brainstorming session about organizations outside of the US that Karen and others can contact. We are also considering running webinars. This is a call for action to the whole IG. Do you have a reflection on what happened this week? Write a short blog post! Tweet about us. If you’re speaking at a conference, please let Karen know. Please let Karen know where you go for publishing industry news.

### Identifiers

Slides are available here.
Ivan Herman prepared a quick overview of PWP and the need for identifiers. PWP is basically a URL for a collection of web resources with the advantage of portability. We need an identifier to get to the package as well as a method to get to its components and sub-components.
Ivan mentioned that the publishing industry has a variety of identifiers, and this group is not setting out to resolve the issue of creating one identifier to solve them all. It is important to keep in mind that PWP is a collection of resources, so we need to be able to access the collection as well as the insides.
Overall consensus is that DPUB must decide what is in the package before deciding how to point to it. Further, it is important to understand that a URL is a locator, not an identifier. It points to a page. The page may have all sorts of stuff on it, but that URL is not the thing. It is a location. We still have a lot of questions, but we have some direction about how to begin answering them.

### Joint Meeting with ARIA WG

We met with several members of the ARIA WG to go over several loose ends.
Extended descriptions: DPUB provided feedback about the ARIA WG’s extended descriptions grids. The ARIA WG plans to rule out some of the proposed options. ARIA will compile the feedback and bring it back to the stakeholders with info about how specific options might meet the use cases.
Mark Hakkinen provided an overview of his work on web components with ETS and IMS Global. Mark has been transforming the DIAGRAMMAR model into web components. There was some discussion about whether it is possible to implement this today and browser support for web components.
After some discussion about roles, attributes, and code samples, we agreed that DPUB-ARIA Module will go to Second Public WD in mid-November. If code samples are not updated at that point, we will release a third public working draft later.

And, that closed our formal sessions. We had a few breakouts with great attendance. I have compiled about 20 action items from the F2F. We have a lot of work to do! Thanks everyone for a great week!


## TPAC Summary: DPUB F2F Day 1 (Overview, Anno, EPUB 3.1, CSS, POM, ServiceWorkers)

Full minutes are available.

What a turnout! We had about 10 of the regular IG participants in Sapporo. At all points, we had at least 20 people present, sometimes closer to 30. This shows the growth and impact of the DPUB IG. Ivan commented that in Shenzhen (just 2 years ago), few had heard of us. Dave pointed out that of all things happening at TPAC (and there are so many things happening at TPAC at once), several people considered DPUB to be the most interesting. Maybe it was the cream puffs and Pocky! Thanks to all who contributed, scribed, memed, called in, and provided Japanese sugar.
After three days of meetings with others and two days of DPUB meetings, my biggest takeaway for DPUB is that we don’t yet have a clear idea of what a PWP manifest must/may/should include. Without understanding this, it is difficult to move forward with several of the topics discussed. So, we have a lot of work ahead of us, but we have accomplished a lot already. Here’s a summary of a great few days.

### Annotations

Rob Sanderson provided an overview of the Annotations WG model. We discussed the FindText API (formerly RangeFinder API), which accomplishes both search and locate in the URL. Doug Schepers explained that this stores hashes, not strings. Character offsets are possible. The group is also exploring other selectors, including XPath and CSS Selectors. DPUB and Anno should remain in contact, especially if we know of real world implementations.

### Summary of current work

Take a look at the minutes to see how much we have already accomplished and how much is in progress. Here is a quick list:

• PWP: we outlined a vision. Now we must work toward functionality
• CSS: published modules, priorities list, keep the highly-informed input coming
• DPUB-ARIA: module exists. People are eager to use it in EPUB world as well as in scholarly publishing
• A11y TF: working with ARIA WG to get extended descriptions right and then point out other issues specific to publishing or where publishing can contribute
• STEM TF: a lot of exploratory work, next steps will probably be around the future of math on the web. Major outcome is that there is a need for those with understanding of Math/MathJax/polyfills to talk to Houdini
• Metadata: published interviews, learned that publishers use metadata heavily, need some rights expression/management, and maybe make metadata more aligned w OWP

This became a fascinating discussion about the intersection of Math and CSS and the need for communication between those working on math on the web (MathML, MathJax, polyfills) and those working on Houdini. End result: MathJax has done a great deal of research regarding MathML and polyfills. Houdini wants to know and wants to talk to you.

### EPUB 3.1 overview

Dave and Tzviya offered an overview of the ongoing work at IDPF on the first major revision of EPUB (see slides). Some suggestions from this discussion:

• Reconsider schedule (note: Ivan pointed out that IDPF and W3C have different modes of working, and this was not really up for discussion)
• Bring in libraries, especially wrt metadata (Heather and Lars offer to help)
• Do not deprecate elements. Kill them. Deprecation will cause problems
• Assess what is the interoperable core of EPUB 3.0.1 to determine the best way to move forward with EPUB 3.1.
• CSS Profile: snapshot may not be best option because it includes CSS specs that are rec level or almost rec level. It would be unwise to require all UAs to support all of snapshot. Good starting point though.

### Meeting with CSS WG

Dave Cramer led a discussion of CSS priorities. He chose to skip the topic of pagination, because it’s too big. The group covered several topics, and the CSS WG wants more detailed examples from DPUB for all of these items. It is important for DPUB to file bugs as well. Need help with samples or filing bugs? We have several members of CSS WG among us, and they are friendly. Don’t hesitate to ask.

• Table alignment: CSS WG asks DPUB what is missing? Send your sample tables to Dave and Florian. (note that David Baron filed issues on https://drafts.csswg.org/css-text-4/#character-alignment while we spoke)
• a11y of generated content: There is concern that generated content is not accessible. CSS WG concludes this is an implementation bug, and DPUB should file implementation bugs.
• Hyphenation control: There was much discussion about parameters that control hyphenation, line breaking, line balance, and how this affects performance. Discussion pointed to this being an issue with line breaking, not hyphenation, which means that it would not affect performance and is an issue for Houdini.
• Keeping image and caption together in paged view: This pains the publishing industry. Fantasai wrote some CSS using flexbox. Dave is testing it out.

Actions: DPUB should not hesitate to file bugs. If you need help, ask members of CSS WG. If they don’t know about issues, they can’t fix them. Provide specific examples, not just complaints. Explain reasoning, not just requests. Communicate often. (These are friendly people who also want beautiful typography.)

### POM

Daniel Glazman presented his proposal for implementing POM, a Publication Object Model for EPUB. It is a framework for resources, packaging, metadata, authoring, and reading on the Web with an abstraction layer to hide the “publication manager interface”. This layer can reach the individual resources as needed. Individual resources can be anything: HTML, PDF, futureFormat. The only thing common to all publications is the manager. The specific file type is a plugin to POM. There will be an open source framework that implements POM in C++, JavaScript, and potentially other languages, such as Swift and Python.
Next steps: Assess different publication formats, what is common to all of them, how the components are connected. Daniel created POM CG.

### Service Workers

Guest: Jake Archibald, picking up where we left off after IG call on 19 Oct.
The group asked Jake many questions about Service Workers, including whether our thought experiment in PWP makes sense. One hot topic is that SW requires https and does not work with local files. If the origin is localhost, it can be http. SW cannot use the file: protocol. There is a same-origin policy for the Worker script, and the requests intercepted by the SW are limited to a scope defined by the script location (although, as Jake pointed out, this can be configured via the “Service-Worker-Allowed” HTTP header).
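The constraints Jake described can be approximated with a few pure functions. This is a simplified model written for this summary, not browser code: the function names are made up, the secure-origin check only covers https and a couple of loopback host names, and real browsers apply further rules.

```typescript
// Simplified origin check: https is fine, plain http only for local development hosts.
// file: URLs (and everything else) are rejected.
function canUseServiceWorker(pageUrl: string): boolean {
  const url = new URL(pageUrl);
  if (url.protocol === "https:") {
    return true;
  }
  return url.protocol === "http:" &&
    (url.hostname === "localhost" || url.hostname === "127.0.0.1");
}

// The widest scope a worker script may claim: by default the directory of the script;
// a Service-Worker-Allowed header value (resolved against the script URL) can change it.
function maxScope(scriptUrl: string, serviceWorkerAllowed?: string): string {
  if (serviceWorkerAllowed !== undefined) {
    return new URL(serviceWorkerAllowed, scriptUrl).href;
  }
  return new URL("./", scriptUrl).href;
}

// A request falls under a scope if it is same-origin and its URL starts with the scope URL.
function isControlled(scope: string, requestUrl: string): boolean {
  const sameOrigin = new URL(scope).origin === new URL(requestUrl).origin;
  return sameOrigin && new URL(requestUrl).href.indexOf(scope) === 0;
}

const defaultScope = maxScope("https://example.com/books/sw.js");
const widenedScope = maxScope("https://example.com/books/sw.js", "/");
```

With the default scope, a worker served from /books/sw.js would see requests for /books/2/index.html but not for /other/page.html; with the header set to "/", the whole origin could be covered.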
Daniel Weck explained how a prototype implementation of the Readium web reader uses SW to intercept HTTP requests for files located inside EPUB archives (using a URL path convention like http://domain.com/ebook.epub/META-INF/container.xml). HTTP responses are created by extracting / inflating file contents on-the-fly (using a JavaScript zip library and HTTP byte range requests). As a result, the web browser can consume EPUB-bundled resources transparently via “normal” URLs, with no reliance on the “Blob URI trick” currently used by Readium for accessing files inside EPUB archives (which requires complex prefetching and pre-processing, thereby interfering with webview features such as caching and streaming). See the Readium repo for details.
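The “deep” URL convention can be illustrated with a small helper that separates the archive URL from the path of the entry inside it. This is a guess at the general idea only, not Readium’s actual code: the `splitArchiveUrl` name and the `ArchiveReference` type are invented for this sketch, and it ignores query strings and fragments.

```typescript
// Hypothetical result type for a URL that points inside an EPUB archive
interface ArchiveReference {
  archiveUrl: string; // URL of the .epub file itself
  entryPath: string;  // path of the requested file inside the archive
}

// Splits e.g. http://domain.com/ebook.epub/META-INF/container.xml into its two parts.
// Returns null when the URL does not point inside an .epub archive.
function splitArchiveUrl(url: string): ArchiveReference | null {
  const marker = ".epub/";
  const index = url.toLowerCase().indexOf(marker);
  if (index === -1) {
    return null;
  }
  return {
    archiveUrl: url.slice(0, index + marker.length - 1), // keep ".epub", drop the slash
    entryPath: url.slice(index + marker.length),
  };
}

const reference = splitArchiveUrl("http://domain.com/ebook.epub/META-INF/container.xml");
```

In a service worker, a fetch handler could use such a split to decide whether to pass a request through to the network or to answer it by reading `entryPath` out of the archive at `archiveUrl`.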
Readium’s experiment with SW aims at transparently extracting + serving packaged content (i.e. from a zip archive), based on a “deep” URL syntax that references bundled resources. By contrast, the proposed use of SW in PWP focuses on enabling a seamless online-offline reading experience for regular content URLs. The PWP use-case requires some sort of offline local storage. There is a cache API accessible within SW, so this can be used to manage offline copies of remote resources.
There was also much discussion of benefits of SW for publications wrt security considerations and the ultimate goals of the industry. Jake requests use cases, specifically what is hard on web vs native + web view? As hybrid native apps do not normally need a built-in HTTP server to feed content to the webview, and as native URL protocol handlers can be implemented to manage access to bundled or filesystem resources, what is the role of SW in this context?
After the session, Jake created an offline-enabled publication available at https://github.com/jakearchibald/ebook-demo. Test it and send your feedback.

## DPUB IG Telco, 2015-10-19: Service Workers, extended description analysis

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

## Overview of Service Workers

Jake Archibald, one of the editors of the Service Workers Draft, was a guest of the meeting, and gave a short overview of Service Workers.

(The reason for this discussion is that Service Workers have been identified as one of the possible means to create a PWP Architecture.)

SW itself is just a JavaScript runtime that can operate on a separate thread. The key difference in life cycle is that it can spin up without the existence of pages on the origin. The starting point for the web becomes the SW rather than the page. One of the key features is offline support: you get a fetch event for every request the page makes, so you can create an offline experience by listening for fetch events. You get to choose what to do, the default being nothing, but you can create a response (a string or blob) and send that back, or you can fetch things from named caches, from IndexedDB, etc. The gold standard of this type of app development is “offline first”, i.e., create the offline experience before even attempting to go to the network, seeing the network as progressive enhancement. The aim is to ship stuff from the caches as quickly as possible, and then go to the network. There are lots of user experience questions we are trying to figure out now, but the SW specification doesn’t make any of these decisions.
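The “offline first” ordering can be shown with a small synchronous model. This is not service worker code: in a real worker the same decision would sit inside a `fetch` event listener and use the asynchronous cache and `fetch` APIs. The `Sources` interface, the `offlineFirst` function, and the in-memory stand-ins for cache and network are assumptions made for this sketch.

```typescript
// Stand-ins for the cache and the network; null means "not available".
interface Sources {
  cacheLookup: (url: string) => string | null;
  networkFetch: (url: string) => string | null; // returns null when offline
  cacheStore: (url: string, body: string) => void;
}

interface Outcome {
  body: string | null;
  from: "cache" | "network" | "none";
}

// Offline first: answer from the cache when possible, go to the network only
// on a miss, and remember what the network returned for the next request.
function offlineFirst(url: string, sources: Sources): Outcome {
  const cached = sources.cacheLookup(url);
  if (cached !== null) {
    return { body: cached, from: "cache" };
  }
  const fetched = sources.networkFetch(url);
  if (fetched !== null) {
    sources.cacheStore(url, fetched);
    return { body: fetched, from: "network" };
  }
  return { body: null, from: "none" };
}

// In-memory sources for experimentation; `isOnline` simulates connectivity.
function makeSources(isOnline: () => boolean): Sources {
  const store: { [url: string]: string } = {};
  return {
    cacheLookup: (url) =>
      Object.prototype.hasOwnProperty.call(store, url) ? store[url] : null,
    networkFetch: (url) => (isOnline() ? "content of " + url : null),
    cacheStore: (url, body) => { store[url] = body; },
  };
}
```

With these stand-ins, a chapter requested once while online is served from the cache on later requests, including when the simulated network is down; a chapter never requested while online cannot be served offline. What to do in that last case is one of the user experience decisions the specification leaves to authors.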

Service workers can also be used to unpack content on the fly, whether the packed content is in some ZIP format or other packaging format (the issue around the streaming ability of ZIP and other formats came up during the discussion).

There was some discussion about the availability of SW in browsers (they are available in Chrome and Firefox, support in Microsoft Edge is under development, and nothing is known about Safari). There are also plans to create more tutorials and introductory texts for the specification.

Subsequent discussions at the meeting concentrated on the experience of using Service Workers. Dave Cramer, from Hachette, has already played with this, with a good first impression. The Readium consortium has also tried using it, and Daniel Weck, from the consortium, shared his experiences and questions. The Readium proof-of-concept implementation is able to handle EPUB content that is exploded on the web server, and is also able to do some ZIP unpacking (the current implementation does not do caching, only fetch intercepts). There were issues about same-origin constraints and the usage of HTTPS as opposed to HTTP.

## Extended Description Analysis

The PF Working Group has created a document analyzing the various proposed content description techniques for accessibility purposes (i.e., usage of longdesc, aria-describedat, <details>, etc.). That document aimed at describing the various needs of the publishing industry; it has now been reviewed, primarily by Deborah and Mia, with comments from others. The document is now ‘back’ with the PF Working Group.


## Two W3C drafts on annotations published

• New Working Draft of Web Annotation Data Model. Annotations are typically used to convey information about a resource or associations between resources. Simple examples include a comment or tag on a single web page or image, or a blog post about a news article.
• First Public Working Draft of FindText API. The FindText API specification describes an API for finding ranges of text in a document or part of a document, using a variety of selection criteria.

## CSS Snapshot 2015

The Cascading Style Sheets (CSS) Working Group has published a Group Note of CSS Snapshot 2015. This document collects together into one definition all the specs that together form the current state of Cascading Style Sheets (CSS) as of 2015. The primary audience is CSS implementers, not CSS authors, as this definition includes modules by specification stability, not Web browser adoption rate.

## DPUB IG Telco, 2015-10-12: Portable Web Publication publication and outreach, F2F planning

See minutes online for a more detailed record of the discussions. (The headers below link into the relevant sections of the minutes.)

## Portable Web Publication publication and outreach

Following the discussion last week, the PWP Editor’s draft has been updated. Ivan Herman gave a status report; the participants unanimously decided to publish the document as a First Public Draft. The plan is to publish on the 15th of October.

There was a discussion on the outreach following the publication. It has been agreed that the outreach activities should not be huge (e.g., press release) and that a set of blogs and home page news at W3C and at IDPF should suffice. Some points of necessary emphasis in the outreach message came to the fore:

• explaining the process, i.e., what is a WD, how to provide input, etc
• what the relationship of the work around PWP is with the EPUB3.1 work at IDPF
• background both for publishers (who may not know about W3C) and the Web community (who may not know about publishing)
• position of W3C relative to the rest of the publishing community

Because this is the Frankfurt Book Fair week, it is probably better to set the dates for the outreach sometime next week. The volunteers will regroup about this by the end of the week.

## F2F planning

The agenda of the F2F meeting at the W3C TPAC meeting is shaping up. The group spent some time working out some details, planning for some more meetings (if possible), and determining session chairs.