Portable Web Publications: Technology Challenges

Ivan Herman, W3C

W3C Track @ WWW2016, Montréal, Canada

2016-04-13

These Slides are Available on the Web

See:
http://www.w3.org/2016/Talks/W3CTrack-IH/

(Slides are in HTML)

Is it a book? Is it a Web site?

Extract from “Big Java”, by Cay Horstmann, John Wiley & Sons, 2013

The main message:

Digital Publishing
=
Web Publishing!

Or, to put it another way…

Web Publishing
=
Digital Publishing!

What does this mean?

Portable Web Publication at a glance

The separation between publishing “online” (as Web sites) and publishing offline and/or packaged should be reduced to zero

For example: book in a browser

Joseph Reagle's book as a web page
Extract of Joseph Reagle’s Book
  • On a desktop I may want to read a book just like a Web page:
    • easily follow a link “out” of the book
    • create bookmarks to “within” a page in a book
    • use useful plugins and tools that my browser may have
    • create annotations
    • sometimes I may need the computing power of my desktop for, e.g., interactive 3D content

For example: book in a browser

Joseph Reagle's book as an ebook in reader
Extract of Joseph Reagle’s Book as ePUB
  • But, at other times, I may also want to use a small dedicated reader device to read the book on the beach…
  • All of this with the same book (not conversions from one format to another)!

For example: I may not be online…

Person sitting in a station with a mobile in hand
Bryan Ong, Flickr
  • I may find an article on the Web that I want to review, annotate, etc., while commuting home on a train
  • I want my annotations to be synchronized back online once I am connected to the Internet again
    • note: some browsers have an “archiving” feature, but these features are not interoperable

For example: educational publications

University hall with students, most of them with a tablet
Merrill College of Journalism, Flickr

Synergy effects of convergence

Advantage for the publishers’ community

Photo of a bookshelf with lots of technical books
Jeffrey Zeldman, Flickr
  • The main interest of publishers is to produce, edit, curate, etc., content
  • Publishers have invested heavily in technology developments, but the Web developers’ community can complement that with a wider reach and perspective
  • Working closely with Web developers avoids re-inventing wheels

Advantage for the Web community

image of a medieval manuscript
Oliver Byrne's edition of Euclid, University of British Columbia
  • Publishers have experience in:
    • ergonomics, typography, aesthetics…
    • publishing long texts, with the right readability and structure
  • Workflow for producing complex content

But… why not rely only on the Web?
(i.e., forget about downloaded content, it is outdated!)

Several reasons…

How do we get there? (Technically)

Moyan Brenn, Flickr

Warning: everything I say is subject to change!

Catherine Kolodziej, Flickr

Technical Challenge: Fundamental Terminology

Web Publications

a collection of resources with different URL pointers
  • The current Web has the notion of a single resource:
    • conceptually, a single piece of data
      • HTML source, metadata, CSS style sheet, etc.
    • each has its own URL
  • Presentation is based on the interoperation of many such resources

Web Publications

a collection of resources in a 'blob' with one URL pointer
  • But publishers need the concept of a single Publication:
    • a collection of pages, together with the relevant CSS, images, video, etc., files
    • it is the collection that has a real distinct identity (URL), not its constituents

Formally

  • A Web Publication: an aggregated set of interrelated Web Resources, intended to be considered as a single entity, and which can be addressed on the Web as a unit (is itself a Web Resource)
a collection of resources in a 'blob' with one URL pointer
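
The definition above can be illustrated with a small data structure. This is only a sketch, not part of the PWP draft: the class name `WebPublication`, its fields, and the `add` helper are hypothetical, and the example URLs reuse the `http://www.ex.org/MyPWP/` placeholder.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class WebPublication:
    """Hypothetical model: a set of Web Resources addressed as one unit."""

    # The single URL that identifies the publication as a whole.
    locator: str
    # URLs of the constituent resources (HTML, CSS, images, ...).
    resources: Set[str] = field(default_factory=set)

    def add(self, relative_path: str) -> str:
        """Register a constituent resource and return its full URL."""
        url = self.locator + relative_path
        self.resources.add(url)
        return url

    def __contains__(self, url: str) -> bool:
        # The publication itself is a Web Resource, so its own locator counts.
        return url == self.locator or url in self.resources


pub = WebPublication("http://www.ex.org/MyPWP/")
pub.add("Chapter1.html")
pub.add("style.css")
```

The point of the sketch is that the identity lives on the collection (`locator`), while the constituents are only reachable as members of it.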

Portable Web Publications

More Formally

What kinds of documents are we talking about?

What kinds of documents are we not talking about?

But there are of course differences

Envisioned “states” of a Portable Web Publication

          Protocol Access                            File Access
Packed    PWP as one archive on a server             PWP as one archive on a local disc
Unpacked  PWP spread over several files on a server  PWP spread over several files on a local disc
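
The four states above are the product of two independent choices (how the PWP is accessed, and whether it is packed). A minimal sketch of that classification follows; the names `Access`, `Packaging`, `PWPState` and `classify` are hypothetical, and it assumes that an `http`/`https` locator means protocol access and anything else means file access. Whether a locator points to an archive is passed in by the caller, since the draft does not fix how that is detected.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlsplit


class Access(Enum):
    PROTOCOL = "protocol"  # retrieved from a server
    FILE = "file"          # read from a local disc


class Packaging(Enum):
    PACKED = "packed"      # one archive
    UNPACKED = "unpacked"  # spread over several files


@dataclass(frozen=True)
class PWPState:
    access: Access
    packaging: Packaging


def classify(locator: str, is_archive: bool) -> PWPState:
    """Map a locator to one of the four envisioned states.

    Assumption: only http/https count as protocol access.
    """
    scheme = urlsplit(locator).scheme
    access = Access.PROTOCOL if scheme in ("http", "https") else Access.FILE
    packaging = Packaging.PACKED if is_archive else Packaging.UNPACKED
    return PWPState(access, packaging)


online_unpacked = classify("http://www.ex.org/MyPWP/", is_archive=False)
local_packed = classify("file:///home/me/MyPWP", is_archive=True)
```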

Technical challenge: an overall architecture to handle PWPs

Envisioned architecture:
a “PWP Processor”

Envisioned architecture:
unpacked state

Document consumed through the Web in a traditional way

Envisioned architecture:
cached state

Document consumed through a Service Worker, possibly cached

Envisioned architecture:
packed state

Document consumed through a Service Worker, possibly unpacked

Draft…

Is this approach at all feasible?

Advances in modern browsers: Web and Service Workers

Work in progress

A PWP Processor could be implemented as a Service Worker

Not only a wild idea…

Technical challenge: addressing, identification

Is it "addressing" or is it "identification"?

Three layers of addressing

  1. Locator for the PWP itself:
    http://www.ex.org/MyPWP/
  2. Locating a resource within a PWP:
    http://www.ex.org/MyPWP/Chapter1.html
  3. Locating a target within a resource:
    http://www.ex.org/MyPWP/Chapter1.html#section1
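
The three layers can be separated mechanically once the locator of the PWP is known. The following is a sketch only: the function name `split_pwp_address` and the returned dictionary keys are hypothetical, and it assumes the PWP locator is a plain URL prefix of the resource URL, as in the examples above.

```python
from urllib.parse import urldefrag


def split_pwp_address(url: str, pwp_locator: str) -> dict:
    """Split a URL into the three addressing layers.

    Assumption: the resource URL starts with the PWP locator.
    """
    # Layer 3: the fragment identifies a target within the resource.
    resource_url, fragment = urldefrag(url)
    if not resource_url.startswith(pwp_locator):
        raise ValueError(f"{url!r} is not inside the PWP at {pwp_locator!r}")
    # Layer 2: what remains after the PWP locator is the resource path.
    resource_path = resource_url[len(pwp_locator):]
    return {
        "pwp": pwp_locator,                # layer 1
        "resource": resource_path or None,  # layer 2
        "target": fragment or None,         # layer 3
    }


layers = split_pwp_address(
    "http://www.ex.org/MyPWP/Chapter1.html#section1",
    "http://www.ex.org/MyPWP/",
)
```

For the third example URL this yields the PWP locator, the resource `Chapter1.html`, and the target `section1`; for the bare PWP locator, the last two entries are `None`.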

Locating the different PWP “states”

Canonical locators

The PWP Processor can take care of the rest…

What does an HTTP GET return for L?

Getting hold of all locators

Flow diagram on accessing and combining various sources of Metadata

Work in progress

Manifests

Technical challenge: presentation control
(a.k.a. Personalization)

How do we get there? (Practically)

Moyan Brenn, Flickr

DPUB IG and Portable Web Publications

screen dump of the PWP draft

IDPF, W3C, and others

Some references

DPUB IG Wiki
https://www.w3.org/dpub/IG/wiki/Main_Page
Latest PWP Official Draft:
http://www.w3.org/TR/pwp/
PWP Editors’ draft:
https://w3c.github.io/dpub-pwp/
PWP Issue list:
https://github.com/w3c/dpub-pwp/issues

Thank you for your attention!

This presentation:
http://www.w3.org/2016/Talks/W3CTrack-IH/
(PDF is also available for download)
My contact:
ivan@w3.org