Publishing WG Telco, 2017-11-20: Entry Page, FPWD

See minutes online for a more detailed record of the discussions.

Entry Page

There was a lot of discussion last week on the exact details of the entry page (formerly “landing page”) and what it would contain. The discussion took place on GitHub issue #103. The call picked up some of the issues, and the group reached agreement on several points. This is in line with PR #101 which, essentially, said that the entry page, i.e., the result of dereferencing the WP’s address:

  • MUST be an HTML document
  • MUST include a link to the manifest
  • SHOULD be a publication resource
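
As an illustration only, the first two requirements could be checked along the following lines. This is a minimal sketch using Python's standard `html.parser`; the link relation value `publication` and the file name `manifest.json` are assumptions made for the example, not values agreed by the group.

```python
from html.parser import HTMLParser

# Hypothetical link relation; the group had not settled on a value.
MANIFEST_REL = "publication"


class ManifestLinkFinder(HTMLParser):
    """Collects href values of <link> elements whose rel contains MANIFEST_REL."""

    def __init__(self):
        super().__init__()
        self.manifest_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated token list; a missing value parses as None.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if MANIFEST_REL in rel_tokens and href:
            self.manifest_hrefs.append(href)


def find_manifest_links(html_text):
    """Return the manifest link targets found in an entry page."""
    finder = ManifestLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.manifest_hrefs


# Hypothetical entry page, used only to exercise the function.
entry_page = """<!DOCTYPE html>
<html>
  <head>
    <title>Example publication</title>
    <link rel="publication" href="manifest.json">
  </head>
  <body><p>First page of the publication.</p></body>
</html>"""

print(find_manifest_links(entry_page))
```

An empty result would mean the page fails the second MUST; whether the page is itself a publication resource (the SHOULD) cannot be decided from the HTML alone and would need the manifest.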

It was also agreed on the call that the term “entry page” is not really good either, but there was no discussion on the call for providing an alternative.

FPWD

The goal is to publish three documents (Web Publications, Locators, and PWP) before the end of the year. For that to happen it would be good to have a version to vote on at our next call, to allow for the one week consensus period and then the subsequent administration. (The publication moratorium begins on the 13th of December.)

Publishing WG Telco, 2017-11-13: TPAC/PWA, Landing Page

See minutes online for a more detailed record of the discussions.

TPAC

The group discussion summarized the impression and experiences of the TPAC F2F meeting, as well as the various hallway conversations last week (Minutes of the F2F meeting days are found here and here).

Most of the discussion crystallized around the relationships between Web Publications and Progressive Web Applications (PWA-s). There were several voices during TPAC that claimed that Web Publications are “just” PWA-s, and there is nothing else to do. The WG has to prepare a more convincing “story” on why WPUB-s have special issues to consider that are not covered by PWA-s “out of the box”. The discussion converged towards a view that it is perfectly fine to say that a WPUB is also a PWA, but there are things that have to be built upon, and standardized, on top of the concept. There should be a writeup of what those extra considerations are (the draft discussion on the call included offline reading, strong a11y and privacy considerations, specific navigation, search annotation, explicit “collection”/boundaries, collection metadata, authors’ rights, etc.).

Landing Page (Issue #94)

There was a discussion around Issue #94 (“What does a WPub URL resolve to?”) which introduced the concept of a landing page that would include a <link> to the publication manifest. Is that page part of the WPUB or not (ie, is it listed as part of the resources)? Should it include some information items like a TOC or not?

There is no final conclusion yet on these questions; the discussion is ongoing on GitHub.

Publishing WG Telco, 2017-10-23: Packaging, Archiving, F2F

See minutes online for a more detailed record of the discussions.

Packaging

The group started to collect requirements to be documented as part of the FPWD for the Packaged Web Publication document (also scheduled to be published by the end of the year). The first set of issues and comments that came to the fore on the call:

  • Compression vs. flat and related processing requirements and limited bandwidth / cost considerations.
  • Large (enabled) – in past has been limited in size
  • Package format neutral
  • Are all resources required to be packaged (ie. remote or non-essential resources)?
  • Don’t invent something new
  • Support for signing and/or authenticity
    • And don’t reinvent the wheel here either…
  • Ability to work on small devices
  • Support for non-web hosted resources
  • Usable by humans

A separate document has been set up for an initial list.

Archiving

The topic of archiving has not yet been addressed in the WP Editors’ draft. The question is how this influences the WP and/or the PWP technology.

It turned out that NISO/ISO has started some work related to archival and EPUB, with a proposal for work sent out by NISO. There will be a need for synchronization with them. Further issues that were discussed:

  • how to construct a file that is deconstructable and preservable
  • some section on archiving should be part of both the WP and PWP documents
  • it may be worth having a separate get-together (workshop or other) gathering the main archival institutions and experts

Face-to-Face meetings

  • The decision for the spring F2F meeting: May 30-31, 2018, Toronto, Canada, hosted by Kobo.
  • The agenda for the TPAC F2F is almost final; some minor adjustments may come this week.

Publishing WG Telco, 2017-10-16: Lifecycle, Locators

See minutes online for a more detailed record of the discussions.

Lifecycle

Matt Garrish began working on a section for the “lifecycle” of a WP. In line with recent discussion, the (accepted) Pull Request is reduced to a series of general principles (e.g., it is not the goal to fork the Web, a Web Publication shouldn’t have to be its own app…) and technical issues and requirements on what should be achieved (e.g., continuous search). By listing this in the (upcoming) FPWD we would require contributions from the Browser world on how this fits, e.g., the view of browser contexts.

Locators

There is a Pull Request and an accompanying Locator document to review.

Originally, the locator document’s content was part of the pull request to the main WPUB spec, but it turned out to be way too long. Hence the proposal to fork that part into the separate document. The pull request itself for the WPUB spec only contains some general principles and delegates the main content to the other document.

The Locator document is also a spin-off: it is a slightly updated version of the Selectors and States WG Note, published by the Web Annotation Working Group. What it defines is a JSON-based structure to “describe” possible selections in documents (within a WP or not), accommodating several different means of identification (using a traditional fragment ID, using text matching, defining ranges, combining various selections into one). The use cases for this approach should fit the use cases for EPUBCFI.
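
To give a feel for what such JSON structures look like, here is a minimal sketch written as Python dictionaries. The selector types and keys (`FragmentSelector` with `value`, `TextQuoteSelector` with `exact`, `prefix`, `suffix`) come from the Web Annotation model; the function `resolve_text_quote`, the sample text, and the matching logic are hypothetical simplifications that ignore the text normalization a real implementation would need.

```python
def resolve_text_quote(selector, text):
    """Return (start, end) of the first match of a TextQuoteSelector, or None."""
    exact = selector["exact"]
    prefix = selector.get("prefix", "")
    suffix = selector.get("suffix", "")
    # Simplification: search for prefix + exact + suffix as one literal string.
    position = text.find(prefix + exact + suffix)
    if position == -1:
        return None
    start = position + len(prefix)
    return (start, start + len(exact))


# Selection by a traditional fragment ID (the id value is made up).
fragment_selector = {
    "type": "FragmentSelector",
    "value": "chapter-2",
}

# Selection by text matching; prefix and suffix disambiguate repeated text.
text_quote_selector = {
    "type": "TextQuoteSelector",
    "exact": "white whale",
    "prefix": "the ",
    "suffix": " again",
}

sample_text = "He saw a white whale once, then saw the white whale again."
span = resolve_text_quote(text_quote_selector, sample_text)
print(span, sample_text[span[0]:span[1]])
```

The prefix and suffix are what let the selector pick the second occurrence of the phrase rather than the first, which is the kind of robustness that motivates text matching next to plain fragment IDs.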

The document relies on the Web Annotation model, but has two extensions to it:

  • There are (at the moment, two) additional selector structures defined that are WPUB specific; the details are currently under discussion
  • It also defines a fragment ID that serializes the JSON structures; it is a question whether this is really necessary (knowing the administrative issues to register a fragment ID, as well as the complexity of the fragments themselves).

These details will have to be discussed in the future and the question whether a fragment ID format is necessary in the first place will also be decided. Several EPUBCFI use cases have been discussed at the meeting. The pull request has been merged.

Publishing WG Telco, 2017-10-02: Lifecycle, pagination, security

See minutes online for a more detailed record of the discussions.

Lifecycle

Matt Garrish began working on a section for the “lifecycle” of a WP. However… We don’t have a specific user agent in mind, in contrast to the web-app manifest which is focused on browsers. There is also not any one lifecycle for a web publication. If you’re a browser and you’ve loaded a web-app manifest directly, it’s different from what an ebook reading application might do with it. There’s an endless web of scenarios we’d have to flesh out if we wanted to.

Should we focus rather on the data model, and then maybe further down the line try to standardize if things come to that? We could then start looking at more specific issues on how a browser could represent a publication. What are we expecting a browser to do, for example regarding the browsing context? What would it entail for a web publication to be layered on top of the existing HTML object model experience? There are probably interesting avenues to pursue instead of spending a lot of time speccing out a lifecycle.

A PR will be made available along those lines soon.

Pagination, layout

There is a Pull Request on GitHub (see also draft version thereof) on pagination. The pull request collects a number of issues, and starts with the assumption that most of what publishing wants should be handled by CSS and, if there are additional features, these should be settled with the CSS WG.

The author of the PR could not be on the call. It was agreed that Dave Cramer will provide some editing on the text and merge it afterwards.

Security, Integrity

A security section was on the agenda, but it could not be provided in time for the meeting. Nevertheless, a discussion occurred on the notion of integrity (i.e., that the WP’s content has not changed) and how to ensure that (probably through the usage of digital signatures). It is not clear whether this issue is relevant for a Packaged WP only, or for WP-s in general.

Misc

  • There is a draft agenda for the upcoming TPAC meeting
  • Next week’s meeting is cancelled (Frankfurt Book Fair, various holidays in North America)
  • A doodle will be provided with choices for a possible F2F meeting in Spring 2018.

Publishing WG Telco, 2017-09-25: Metadata, Packaging

See minutes online for a more detailed record of the discussions.

Metadata

There was an accepted Pull Request on metadata that generated some questions on whether some extra metadata items (publication and modification dates, mainly) should be labelled as “should” or “may” in the current draft. After some discussions it was decided to keep to the current “should”, to clearly differentiate a publication from “just” a Web Page. It was also discussed that we may revisit this at some point and label this as a “should” only as part of a Packaged WP (which opens up a more general discussion on some metadata becoming more stringent when getting to packaged versions).

PWP

Discussion began on what a FPWD would include regarding Packaged Web Publication. It was agreed that deciding on a specific packaging format is premature, since the Web Packaging work is not yet cast in concrete. Besides, we should strictly separate PWP as a general concept from any specific packaging format; the general concept would allow for different types of packaging via some “profiles” (e.g., allowing for a packaging format for EPUB4 but also allowing for a PDF based packaging for other communities). In any case, the FPWD, at this point, would probably only put down the different alternatives rather than deciding on any specific ones.

TPAC

TPAC is not that far away; planning for the agenda has already begun and participants are encouraged to register asap…

Publishing WG Telco, 2017-09-18: SMIL Lite, metadata, fragment locators

See minutes online for a more detailed record of the discussions.

SMIL Lite

Marisa DeMeglio presented some work that the accessibility task force started on Media Overlay (MO)/Synchronized Multimedia (SMIL) related issues. The task force collected a number of use cases that show that synchronized multimedia is an important feature for publishing. Use cases include text+sign language, or video+descriptive audio synchronization, for example.

The problem is that while SMIL has been implemented in several EPUB3 readers, its adoption on the Web is very poor. This is a source of problems for Web Publications. The question is how to move forward from there. One avenue is to explore the possibility to define a minimal SMIL (possibly transferring the syntax to JSON) and explore the avenue of creating, eg, polyfills for that level, with the goal of standardizing that level.

Administratively it is not clear whether that should happen within this WG (and whether that is possible, in terms of chartering) or whether a separate CG+WICG+WG route at W3C is preferable. To be explored.

Metadata section

A Pull Request has been proposed by Baldur Bjarnason et al on extra metadata issues. This proposal adds some (possibly optional) items to the information set, and also references to external metadata items in their own vocabularies. There have been some discussions on GitHub already, and the resolution is to merge the pull request and move some of the problems (also discussed on GitHub) into separate GitHub issues.

Fragment Locators

There is a separate discussion around a GitHub issue on what formalism should be used for identifying fragments within resources, as a possible alternative to EPUB CFI. One approach is to use the Web Annotation Selector and State Model, though its usability and applicability still have to be explored.

The issue includes some questions on details that must be answered. It has been decided that the group would look at the WA document and the issue and a separate Pull Request would be provided.

Publishing WG Telco, 2017-09-01: Web Packaging

See minutes online for a more detailed record of the discussions.

Web Packaging

Apart from some administrative issues, the meeting was around the Web Packaging work, that is currently on the way to define a packaging spec for the Web. This effort replaces the older Web Packaging spec by the TAG. The work has been presented by our guest, Jeffrey Yasskin, who is the main editor of the work.

The work is currently planned to be under the auspices of IETF rather than W3C, although some of the main sections may eventually be taken over by the W3C. The “Explainer” document on the Web gives an introduction to the technology. The project came out of our emerging markets, where a system might have an expensive or limited data plan, so there is peer-to-peer data sharing. Our team wanted to share web pages in the same way. The current sketch is that the whole thing will be a CBOR file (CBOR being a binary version of JSON with a few extra features); it has a sequence of features with an index of sections pointing to offsets in the file. The sections are HTTP requests and answers. There will be some mechanisms for sub-packages, as well. The request is where the interesting stuff happens. It has a set of signatures and a certificate on how to trust those signatures. Then there is the manifest, which is the app manifest, and a set of hashes of the sub-resources. There can be a set of hashes for each resource. The thing that is hashed is the concatenation of the request headers, the request, and the body.
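
The hashing step described in the presentation can be pictured with a small sketch. Everything specific here is an assumption made for illustration: the choice of SHA-256, the byte layout of the headers and the request, and the function names `resource_digest` and `verify_resource`. The actual packaging draft defines its own serialization, which is not reproduced here.

```python
import hashlib


def resource_digest(header_bytes, request_bytes, body_bytes):
    """Hex SHA-256 digest over the plain concatenation of the three parts."""
    hasher = hashlib.sha256()
    for part in (header_bytes, request_bytes, body_bytes):
        hasher.update(part)
    return hasher.hexdigest()


def verify_resource(header_bytes, request_bytes, body_bytes, expected_digest):
    """Check a resource against the digest recorded for it in the package."""
    return resource_digest(header_bytes, request_bytes, body_bytes) == expected_digest


# Made-up resource, used only to exercise the functions.
headers = b"accept: text/html\r\n"
request = b"GET https://example.org/chapter1.html\r\n"
body = b"<p>Call me Ishmael.</p>"

recorded = resource_digest(headers, request, body)
print(verify_resource(headers, request, body, recorded))         # unchanged content
print(verify_resource(headers, request, body + b" ", recorded))  # modified body
```

Note that a plain concatenation does not record where one part ends and the next begins, so a real format would need unambiguous framing of the parts; the sketch leaves that out.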

The short presentation was followed by some questions and answers, considering issues like relationships to Service Workers (which is very good, ie, a service worker based implementation may hide the details of packaging on the network layer), how to handle this approach with non-browser clients, relationships to (and difficulties with) certificate management, relationship to ZIP, etc.

Publishing WG Telco, 2017-08-28: Issue cleanup, Manifest serialization, WAM, metadata

See minutes online for a more detailed record of the discussions.

Issue cleanup

A number of open issues have accumulated over the weeks. With the latest changes in the Editor’s Draft, and the discussions leading to that point, a number of issues have been closed.

Manifest Serialization

The group discussed whether the serialization of the manifest should follow a JSON syntax (without getting into the details on whether it should be JSON-LD, or any other dialect of JSON). After some discussions the group agreed to go for JSON.

Metadata Proposal

A Task Force has put up a first version of a document on metadata, which had a number of additional discussions off-line. The main question that had to be discussed during the call is what distinguishes a metadata item from a manifest information item; the two overlap a bit in the document.

One approach that was put forward (but not yet decided) is based on what the metadata represents and what we are doing with it:

  • is it for the user agent in order to consume?
  • is it meant for external processors?

However, the borderline is nevertheless fuzzy (e.g., the schema.org based accessibility metadata, which is an important group of data).

Discussion will continue with a cleanup of the document.

Relationships to Web App Manifests

The group must answer what the relationships are between the Web Publication manifest and the Web App Manifest (WAM). No decision can be taken in a short time; instead, the WG sets up a separate Task Force to look into this and come back to the WG. The goal is to be in a position to talk to the WAM editors (at the latest at TPAC), but that requires some sort of a consensus in this group.

Publishing WG Telco, 2017-08-21: HTML TOC proposal, milestones

See minutes online for a more detailed record of the discussions.

HTML TOC as manifest proposal

Dave Cramer and Benjamin Young have put forward a proposal for using HTML as a binding document. It became clear in the discussion that the proposal has several facets that may have to be discussed separately:

  1. Using an HTML file for TOC and its connection to the list of primary/secondary resources
  2. Is HTML a suitable syntax for the Manifest information as a whole

Although no decision has been taken yet on these (and the discussion should continue offline), the more general question came up of whether it is time to define the serialization of the manifest or whether this should be postponed in order to cover the abstract manifest first. Some of these issues are also covered in the (currently open) pull request cleaning up the current views on the abstract and concrete manifests.

Milestones

The chairs have prepared a set of “official” milestones between now and the end of December, when the FPWD is planned. The current issues will be assigned to these milestones in the coming days.