Publishing BG, IG, & WG Joint F2F, 2nd day — Minutes

Date: 2017-06-23

See also the Agenda and the IRC Log

Attendees

Present: Leonard Rosenthol, Ivan Herman, Tzviya Siegman, Matt Garrish, Ric Wright, Charles LaPierre, Cristina Mussinelli, Avneesh Singh, Dave Cramer, Jun Gamou, Brady Duga, Romain Deltour, Dan Sanicola, Mateus Teixeira, Rick Johnson, Micah Bowers, Garth Conboy, Liisa McCloy-Kelley, Rachel Comerford, Bill McCoy, Takeshi Kanai, George Kerscher, Bill Kasdorf, Laurent Le Meur, Vagner Diniz, Selma Morais, Karen Myers, Hadrien Gardeur

Regrets: Peter Krautzberger

Guests: Fred Chasen

Chair: Tzviya Siegman, Garth Conboy

Scribe(s): George Kerscher, Charles LaPierre, Rachel Comerford, Romain Deltour, Leonard Rosenthol, Bill McCoy, Karen Myers, Matt Garrish

Content:



1. web packaging

Tzviya Siegman: Leonard’s slides: https://www.dropbox.com/s/qyzc96c1unpt6kq/Web%20Package.pdf?dl=0

Leonard Rosenthol: the Packaging on the Web spec, with 4 changes
… an index of content that enables local fetching with random access
… offline access and faster processing
… sub processing, different orig
… one thing was removed: the “idea of fragment identifiers”. CFIs refer to sub-resources. How do we find something in a package with “#something”?
… identified 5 use cases: 1. Local sharing (Bluetooth, Wi-Fi, etc.). 2. Snapshotting a page or site; there is no standard for doing this (“web archives”). 3. Signed applications (PWAs, Chrome extensions, APKs on Android); this would enable them to be signed correctly. 4. Physical web devices like beacons: a store can send you info as a package sent to your device. Interesting use cases.
… AMP: addresses CDN / web cache, one big chunk.
… goals: authenticated sharing, from one origin to one person, but it would trace back to the original origin; moving something off the web wouldn’t change this. Important for streaming: content coming over a line can be processed as it arrives.
… multiple origins one library ref. to another JS library
… crypto agility
… cross signatures: a publisher may want to sign something, and the distribution network can then sign as well. There could be unsigned content. The file format is binary.
… signatures can be revoked: if a publisher goes out of business, you can revoke access.
… backwards compatibility.
… 4 non-goals. They are not part of the spec, but none are prohibited: storing confidential data; streamed processing regeneration; non-origin identity (the only type of identity is the web origin; it is web-centric and everything needs to be tied to an origin); DRM is not included.
… the HTTP client protocol relies on request/response; these are specific messages, documented in IETF and W3C specs. Just plain text. The request carries the operation (GET/PUT/POST), what language, who am I; then the response comes back with an error code, dates, who am I, the type of data coming back, and then the actual data.
… take those requests/responses and bundle them up, and this becomes a web package. It uses existing IETF standards put together in a different way: CBOR (binary JSON).
… they took JSON and made it binary. Magic header/footer, using emoji: the web emoji and the package emoji.
… Binary offsets for each item
… Manifest / Authenticity. Metadata URL to ORIGIN and date is mandatory. Web app manifest.
… list of hashes to validate no changes. signing and all signatures go with it, and support for sub packages.
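
To make the bundling idea above more concrete (request/response pairs packed into one binary file, with an index of offsets for random access), here is a minimal sketch. It is not the WICG draft format: JSON stands in for CBOR because the Python standard library has no CBOR encoder, the magic header/footer is omitted, and the layout and the names `pack` and `read_one` are assumptions made for illustration.

```python
import json
import struct

def pack(exchanges):
    """Pack {url: (status, headers, body_bytes)} into one blob.

    Layout (an assumption for illustration, not the draft format):
    4-byte big-endian index length, JSON index, then the raw bodies.
    """
    index = {}
    bodies = b""
    for url, (status, headers, body) in exchanges.items():
        index[url] = {
            "status": status,
            "headers": headers,
            "offset": len(bodies),  # relative to the start of the body area
            "length": len(body),
        }
        bodies += body
    index_bytes = json.dumps(index).encode("utf-8")
    return struct.pack(">I", len(index_bytes)) + index_bytes + bodies

def read_one(blob, url):
    """Fetch a single response using the index, without scanning other bodies."""
    (index_len,) = struct.unpack(">I", blob[:4])
    index = json.loads(blob[4:4 + index_len].decode("utf-8"))
    entry = index[url]
    start = 4 + index_len + entry["offset"]
    return entry["status"], entry["headers"], blob[start:start + entry["length"]]
```

Because every entry records its own offset and length, a reader can jump straight to one resource, which is the random-access property mentioned in the presentation.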

Garth Conboy: packaging on the web that we defined.
… who is working on this?

Bill McCoy: see: https://mailarchive.ietf.org/arch/msg/dispatch/NQ0deHSsRvt4BL4alk_WYVnhhvo

Leonard Rosenthol: Google.

Bill McCoy: “The Chromium project has started work on this sort of packaging format within the W3C’s WICG, at https://github.com/WICG/webpackage. We have a list of use cases, some goals and explicit non-goals, and a draft for the format itself. We believe the IETF is the ideal place to standardize the format, and in parallel we’ll specify within the W3C how browsers should load it.”

Bill McCoy: I have put a link in. The latest info seems to indicate that they will split this: the archive part goes to the IETF, and the W3C will take the web piece (LinkRel). But this may look different once split.

Leonard Rosenthol: This is currently being led by Google and Node, for portable applications. Note that the Electron project with NPM node packages, and self-extracting archives, are the two leaders.

Ivan Herman: let’s finish the technical piece.

Leonard Rosenthol: each of the resources defined with HTTP headers encoded in HPACK
… method scheme authority / path is our request
… response: status . this is the pairing. you are not limited you can add additional headers as long as you follow this .. fully extensible.
… manifest shows an example of JSON code
… SHA3/4 hashes; you define the signature algorithm you are using, and the signature
… does this help us?
… in PWP EPUB4
… it is already defined and part of web standards / specs. Other companies may pick this up, and it does address the origin problem and the hardest problems facing us.
… it adds some features: sub packages (combining two EPUBs together, a useful feature) and extensible metadata.
… big item: it’s NOT ZIP! Keep this in mind.
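
The "list of hashes to validate no changes" can be sketched as follows. This is only an illustration under assumptions: the manifest shape, the default of SHA-256, and the names `build_manifest` and `changed_resources` are hypothetical, and the signing step itself (which the draft places over the manifest) is left out.

```python
import hashlib

def build_manifest(resources, algorithm="sha256"):
    """Return {"algorithm": ..., "hashes": {path: hex digest}} for {path: bytes}."""
    return {
        "algorithm": algorithm,
        "hashes": {
            path: hashlib.new(algorithm, data).hexdigest()
            for path, data in resources.items()
        },
    }

def changed_resources(manifest, resources):
    """List paths that are missing or whose bytes no longer match the manifest."""
    algorithm = manifest["algorithm"]
    return sorted(
        path
        for path, expected in manifest["hashes"].items()
        if path not in resources
        or hashlib.new(algorithm, resources[path]).hexdigest() != expected
    )
```

Recording the algorithm name next to the digests is one simple way to leave room for the crypto agility mentioned among the goals.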

Ivan Herman: the archival algorithm and the web-specific part are separate… let’s say this is true. If we put the archive aside, there is the fact of the signature; we may want to sign
… there is more in it than the packaging PWP and web packaging can benefit. Web publication should be browsers want to use web archive then web publication is done for them.
… I have no idea who carries that at W3C. Is this active work inside the W3C?
… I am not sure where we are because this bothers me.

Leonard Rosenthol: this is coming out of a CG, the WICG; it came out of the tech.
… as this version is in the WICG, it can’t be published as a Rec; we could offer a place for it to come out as a Rec.

Ivan Herman: not part of our current charter. But it is still under the Web Platform WG.
… many things are in the community group so these are the details that come out.

Liisa McCloy-Kelley: combining EPUBs is a challenge: now we need another level of navigation and structure, and I don’t know if that is taken into account.

Leonard Rosenthol: no, this is not covered; there is nothing that precludes this, but it would need to be developed.

Bill McCoy: we could have more than one format, and higher level stuff could apply to both. You can’t put zips inside of zips, but I think we should look into this even if they do split this up, and zip compatibility is important for this group. But we need to separate out this.

Ivan Herman: we need to talk to Phillip

Leonard Rosenthol: raise IETF may not even want to take this on.

Dave Cramer: https://discourse.wicg.io/t/proposal-packaging-for-the-web-signed-and-indexed/1827/13

2. Service Workers

Brady Duga: service workers are a fundamental part of getting data from a webapp
… service workers exist and we will need to use them in the js layer if you are using them

George Kerscher: my questions might be answered by an overview but my understanding is that a service worker fetches and combines

Brady Duga: if you need to find and load a jpg, it gives js code a way to do this in a secure manner
… for example

Rick Johnson: do we expect to require these?

Brady Duga: no - it’s basically: here are the resources I want to be called for, say a picture of the sun, typically you grab a URL. This gives the service worker a chance to say, I am going to find you the picture of the sun that you have on your desktop instead
… it will store locally the resources that you need to access - when a URL is requested, the service worker grabs the info it already has instead of going to the web
… it’s a hook for the js
… the open web can have a cache of data because of service workers so if you’re writing a webapp you can do this easily
… helpful for webapp users
… . other than a reading system, who would be interested in service workers?

Tzviya Siegman: what if I want my book offline?

Brady Duga: what are you using for that?

Ivan Herman: if I do not want this offline always, I can have an environment where specific pages can be offline where I as a user do not have to do anything to use it in that way. I can turn off my browser/machine and the content will still be there

Ivan Herman: the service worker as a technology to specify things like “I want to be offline and implement x”
… it’s not the magic bullet for everything but it gives the request for offline access a possibility

Leonard Rosenthol: the point Brady is making is that systems have always been able to offline; service workers allow a publication to offline content

Brady Duga: I disagree - you’re just making the publication a reading system

Dave Cramer: Should I show an example?

Ivan Herman: it makes certain requirements in our spec realistic

Brady Duga: it’s a footnote in the spec

Ivan Herman: it gives us the tools in an open web platform

Ric Wright: to support Rick on this, I see it as an implementation detail but it’s not something we want in the spec I think

Dave Cramer: from the broader perspective, our motivation over the last few years has been: can we do all of these long-form, book-like experiences and make them closer to what the web does every day?
… one of the fundamental things that we want to do is - it’s easy enough to make this experience on the web, but for service workers, if I click this save button a service worker is started up like a concierge that opens the link
… the service worker will intercept a request and grab what it already has and make it available to the user
… reading systems can already do this but it makes it easier for experimentation
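
The "intercept a request and grab what it already has" behaviour described above can be modelled in a few lines. Real service workers are JavaScript running in the browser and respond to fetch events; the Python class below is only a language-neutral toy model of the cache-first idea, and every name in it (`OfflineCache`, `save_for_offline`, `fetch`) is hypothetical.

```python
class OfflineCache:
    """Toy model of a cache-first fetch handler (not a real service worker)."""

    def __init__(self, network):
        self.network = network  # callable: url -> bytes, may raise ConnectionError
        self.stored = {}        # url -> bytes saved for offline use

    def save_for_offline(self, urls):
        """Roughly what a 'save' button might trigger: copy resources locally."""
        for url in urls:
            self.stored[url] = self.network(url)

    def fetch(self, url):
        """Intercept a request: answer from the local copy if there is one."""
        if url in self.stored:
            return self.stored[url]
        return self.network(url)
```

Once a resource has been stored, `fetch` never touches the network for it, so the content stays readable when the network callable starts failing.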

Rick Johnson: given that this is the commenting with the business group - talking about this in absence of the security piece that Leonard mentioned, publishers would not allow me to implement this without the security piece

Ivan Herman: my example would be a scientific journal. Each article is a web publication and I want to read them either online or offline. Today I need to download a pdf version.
… we will have to define what a web publication is
… we will have to set up an environment where offline is possible

Brady Duga: service workers are not a magic bullet
… it has to be implemented in the reader
… for service workers there is always a reading system involved

Dave Cramer: reading system can cover a broad range

Charles LaPierre: in dauwhe’s example there was a push button application, but you would also want a resource for (for example) sudden loss of power

Dave Cramer: google has a set of guidelines, but there is a user consent element

Tzviya Siegman: the average online element in a journal includes a download version - right now users do not assume that offline access is available. Right now the button needs to be there

Brady Duga: you can address it with various UI solutions

Ric Wright: I am struggling to see how this relates to the spec - it sounds a lot like authoring to me

Brady Duga: we never put network requirements into the epub spec

Romain Deltour: the publication, the website, and the app can be the very same thing. (??? I had trouble hearing this)

Brady Duga: we need to make it clear that your book can be a reading system

Ivan Herman: if we decided in the spec to say “you should not put js in your manifest” we may be shooting ourselves in the foot. We need a model for how this should work and be implemented. Service workers are not a unique way to do this, but in coming years they will be an option for implementing offline access, so we need to be aware of them

Brady Duga: we need to make sure we don’t exclusively look at service workers, i.e. we shouldn’t require that my Android has service workers

Matt Garrish: remember that browsers have limits in what they can house offline

Dave Cramer: I think we’re touching on a fundamental question which is who is responsible for the experience
… the publisher packages the content but then hands it off to the reading system which determines how the user interacts with it
… this is different from the web experience where someone is responsible for content and the user experience for their own website
… how could we author content that could act in both contexts - both putting it into a reader and just throwing it up on the web

Brady Duga: if you want to make a book as a website - you can do that today using browser technologies

Leonard Rosenthol: if all content is consumed by dedicated reading systems this is fine, but if we are using browsers for access, do we expect them to have the same functionality that a reading system has?

Brady Duga: Ivan raised the example of an article - a book that isn’t a book. What about a receipt, a memo, a boarding pass - are these in scope?

Leonard Rosenthol: I certainly hope so, @clapierre

Brady Duga: are we replacing the technology that allows us to store these locally?

Micah Bowers: right now an epub3 could be a receipt, a boarding pass…

Micah Bowers: do I want to use the browser’s built-in reading system or my own? We have a cookbook demo from Hachette that you can interact with in a reading system but the controls are all screwed up - so you wouldn’t want it used on a Kindle. But you would use it on the Hachette website

Bill Kasdorf: we should be disciplined in our terminology. Document = thing; publication = collection of things
… avoid using the term document for something that has more than one file or resource
… webpage is a document, but fonts are outside
… publication includes the font

Tzviya Siegman: what about a journal article

Bill Kasdorf: it can be a publication

Brady Duga: you need service workers to deal with multiple elements

Liisa McCloy-Kelley: part of the struggle we’ve seen publishers have is that we don’t have access to the whole frame - running heads and pagination - we can’t control. It would be nice to pull this back into the control of the publisher

Tzviya Siegman: +1 Liisa

Leonard Rosenthol: we move into the realm of author intent - the difference between epub and pdf is that pdf has always been an example of author intent

Brady Duga: unless it’s FXL

Leonard Rosenthol: the philosophical position of publisher/author intent over user control/user intent is something we’ll need to continue

Dave Cramer: responding to Liisa and Leonard this is a fundamental problem between the author and the user - sometimes authors do things that aren’t great for the user

George Kerscher: I have a concern about a variety of user interfaces that I am going to need to use
… if I was a college student with ten interfaces to learn it would be a nightmare

Brady Duga: sw gives us the ability to create one reading experience
… the reality of the unique reading systems is that service workers are as hard to write as apps - publishers have pushed back on the effort here

Brady Duga: my feeling is that we won’t see 10 million reading systems, but 10 reading systems, because of the effort involved
… this spec won’t change the effort publishers want to put into this

Garth Conboy: disagrees with Liisa about level of control author/publisher needs

Liisa McCloy-Kelley: I don’t want control over everything, I want some though

Garth Conboy: I understand that but I think the last thing we should give publishers control of is pagination

Leonard Rosenthol: @dauwhe, is that circular?

everyone: argues about page breaks which has nothing to do with service workers

Bill McCoy: there may be some reading systems that are a large percentage of the usage but if we’re successful there will be many publishers that want to create their own reading systems
… the answer is that the scope of the wg is broad, and we can use the pdf reading systems as an example

Leonard Rosenthol: +1

Garth Conboy: though not a great acronym

Rachel Comerford: publisher control and unique required reading systems are HUGELY important to educational publishers

3. why content producers provide all the data?

Garth Conboy: dauwhe asked ‘why am I, as a content producer, providing all this data that the reading system could work out itself?’, i.e. a manifest. I said, that’s insanity: you need reading order. I’m just curious about what people think about this manifest need.

Dave Cramer: one way to summarize the thought, why is it my responsibility to remember the mimetype of an open source font? as a content author, why am I worrying about the inside baseball of reading systems? I wrote a script that finds and gathers the manifest information.
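
A script of the kind mentioned above could look roughly like the following. This is not Dave's script, which the minutes do not show; it is a guess at the general approach, using the Python standard library to walk a folder and look up each file's media type so the author does not have to remember it. The function name `gather_manifest` and the fallback type are assumptions.

```python
import mimetypes
import os

def gather_manifest(root):
    """Walk `root` and return a sorted list of (relative path, media type)."""
    entries = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            media_type, _encoding = mimetypes.guess_type(name)
            # Fall back to a generic type when the extension is unknown.
            entries.append((rel, media_type or "application/octet-stream"))
    return sorted(entries)
```

A walk like this lists every file that is present in the folder, whether or not the content actually references it, so it says nothing about reading order or author intent.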

Dave Cramer: the webapp manifest is not a manifest in the traditional sense

Liisa McCloy-Kelley: I would say from a content provider’s perspective, it’s a good way to validate what you need to maintain. What do I need to look at?
… e.g. a 1x1 pixel PNG file that keeps getting thrown into the manifest whether it’s in the publication or not

Rachel Comerford: +1 what Liisa says

Dave Cramer: in that case the manifest caused the problem instead of finding one

Leonard Rosenthol: from a technical perspective: 1) random access - a reading system knows where to find it and gets it quickly 2) mimetypes - security benefit; when they are wrong they show potential attack
… the manifest allows us to identify internal and external resources; manifest allows a change in direction

Dave Cramer: I’m not arguing the value to reading systems, but why is it an author responsibility

Leonard Rosenthol: because the RS/UA can’t know the intent of the author/publisher

Charles LaPierre: in evaluating publications I have found many extra files; echoes security issues

Brady Duga: if you don’t have a manifest but you do have js - the offline experience is broken (Rachel: did I get this right @duga??)

Brady Duga: Yes, if there is no manifest, we can’t know all the resources that are needed if JS is used

Garth Conboy: (correct, as the RS can’t, with accuracy, figure out what’s referenced internally or externally by JS code in a publication)

Dave Cramer: maintenance point that Liisa raised - we’ve been through several cycles of updating the scaffolding around content without changing content. I think there is a layer of work that is not content change. We could make this easier.

Ivan Herman: you guys are talking about publishers that are putting in energy to create a big book, but when we talk about web publications it can be anyone: myself, who is not a publisher, putting together a few pages and wanting to make them available for offline use. This is the same community that makes webpages for themselves. It is essential that this is simple for that community.
… the script that turns w3c documents into epub - I didn’t want the spec author to be forced to do this authoring. The need to make this simple for the authors is being lost - reading systems must take the load.

4. EPUB 4 and WP, hope & dreams

5. hopes and dreams for WP & EPUB 4

Dave Cramer: we touched on what I wanted to talk about already
… remind everybody of our priority of constituencies: users over authors over specifiers over technical purity
… not the other way around!

Rachel Comerford: an additional grievance to what Dave said
… specs are using the language of the people writing them, rather than the language of the spec readers
… plain language is extremely important

Bill Kasdorf: another very important thing is examples

Tzviya Siegman: so rachel is gonna be an editor and bill will write examples :)

6. (P)WP Security

Ivan Herman: Slides at https://www.dropbox.com/s/dnztoypteq83xpt/PWP%20Security.pdf?dl=0

Leonard Rosenthol: at Adobe and within the PDF association we’ve been talking about our version of portable web documents for a year
… security is very important
… certain things are addressed by EPUB, not everything
… we have a 70+ page spec just focusing on security
… starting with areas related to web publication
… 1. secure contexts
… the web app security spec says it allows a UA to enable features
… if you load a secure page, it loads in a secured context
… when we start talking about something off the web, should a secured context be applied or not, and what does it mean?
… some features of the Web platform are only available in secure context, e.g. SW
… if we want to use them, it will require a secure context
… location services, other things, also require a secure context
… also push notifications
… the other thing to keep in mind is you can’t cross the streams
… if you start from a secure context, you can’t reference something from an unsecured context
… you can’t call insecure from secure; you can do it the other way
… we have to define what’s our recommended model
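
The "can't cross the streams" rule can be written down as a small check. This is a deliberate simplification: it treats `https` as the only secure scheme, whereas real user agents apply a longer set of rules when deciding what counts as a secure context. The function names are hypothetical.

```python
from urllib.parse import urlsplit

def is_secure(url):
    """Simplified assumption: https is secure, everything else is not."""
    return urlsplit(url).scheme == "https"

def may_reference(page_url, resource_url):
    """A secure page may not pull in insecure content; the reverse is allowed."""
    return not (is_secure(page_url) and not is_secure(resource_url))
```

The open question in the discussion is what `is_secure` should return for a publication that was opened from a local file or an email attachment rather than loaded from a URL.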

Romain Deltour: 2. restricting content

Leonard Rosenthol: best case scenario is: you can put everything you want
… we should talk about CSP (content security policy)
… it’s a mechanism to specifically restrict what is allowed on a page
… so do we want to allow a publication to specify CSPs?
… do we want RS to override CSPs?

Ivan Herman: how do you do CSP setting in HTML?

Leonard Rosenthol: I think it’s in the head
… now, looking at the content, some interesting projects: e.g. AMP
… one of the things it does is restrict CSS; we might want to look at that
… also need to consider other formats (SVG, …)
… another interesting tech: plugins. What if a publication uses an embed tag to load a plugin, do we allow that?
… by using an embed tag, you could embed a publication within another. Is recursion allowed?
… do we warn the user about what happens with the web publication itself?
… what happens when the app opens full screen, with no UI, is the user warned?
… then javascript
… an answer is to let the web solve its own problem, but we have to consider the ramifications
… turning it off gets us a greater level of security, but disables some features
… not necessarily on/off, can be disabled on a feature basis
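
On the question of how a CSP is set in HTML: besides the `Content-Security-Policy` HTTP response header, a policy can be declared with a `meta` element carrying `http-equiv="Content-Security-Policy"` in the document head. The sketch below builds a policy string and renders such an element; the helper names and the example directives are assumptions, not a recommendation from the group.

```python
from html import escape

def build_csp(directives):
    """Join {"directive": ["source", ...]} into a CSP policy string."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )

def csp_meta_tag(policy):
    """Render the policy as a meta element for the document head."""
    return (
        '<meta http-equiv="Content-Security-Policy" '
        f'content="{escape(policy, quote=True)}">'
    )
```

For example, `build_csp({"default-src": ["'self'"], "script-src": ["'none'"]})` describes a page that loads resources only from its own origin and runs no script, which is one way to express the per-feature restrictions discussed above.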

6.1. PWP risks

Leonard Rosenthol: do packaging and enabling network access create a greater risk than a malicious URL?
… how does a user know where a doc comes from?
… the browser gives you some protection with the web origin
… they can also validate the content, ahead of time
… today we don’t have mechanism for controlling maliciousness off the web
… how does the stuff get checked when it comes from email, a USB key, etc.?
… what happens when you send out a publication, and realize after that it includes a bug, a vulnerability?
… how to update users that have already downloaded the publication?

6.2. Origins

Leonard Rosenthol: each publication needs its own origin
… it’s a good thing™
… on the other hand, without persistence, offline and caching is out the window
… we’re gonna have to think it through and talk to the Web Platform group
… when we talk about web packages, how do we connect the base origins?
… we talk about receipts, reservations, memos, self-authored content: what is the origin? We need something that scales
… another mechanism that can be very interesting to us are Sub-Origins
… it enables things to be related to each other ( share cookies, share preferences)
… we want to be connected with that group
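
As background for the origin discussion, a web origin is the combination of scheme, host, and port. The sketch below computes that tuple; it is a simplified model (it ignores opaque origins and other special cases), and the function names are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) tuple that identifies a web origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a, b):
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)
```

In this model a publication opened from a local file has no host or port at all, which illustrates why receipts, memos, and self-authored content raise the question of what their origin should be.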

6.3. Restricting Network access

Leonard Rosenthol: do we assume that all PWP are allowed to talk to the network, or all the trusted ones?
… it’s not a new issue, but we need to revisit it
… if we turn off network access completely, we’re losing some features (e.g. web fonts)
… blocking downloads – downloads are very common on the web, what does it mean for publications?
… what if you have a link to a different site, does that move you out of the context, open a browser, etc
… that answer is different depending on the user’s context (e.g. browser vs. RS)
… another consideration is URL schemes. What does a gopher or ftp link mean for publications?

6.4. Security issues due to non-updatability

Leonard Rosenthol: if there’s no update mechanism, how to deal with the flaws and vulnerabilities
… it might come from your code, or 3rd-party libraries
… there is no solution necessarily, but it’s a concern we have to be aware of

6.5. Privacy

Leonard Rosenthol: phoning home, 3 specific cases. 1. silent tracking 2. spammer abuse 3. CSRF
… you can have customized images for tracking reasons (e.g. tracking pixel)
… CSRF: cross site request forgery
… I hope all this will give us things to think about
… it’s clearly one of the fundamental topics

Charles LaPierre: about updatability, that can also go the other way: what if a 3rd party wants to inject malicious code into a valid EPUB?

Garth Conboy: I always thought this topic is less a problem in the PWP land, somewhat analogous to EPUB land
… what do you think we need to solve in PWP?

Leonard Rosenthol: in EPUB, curation is a big advantage
… also, you have a controlled set of RS
… having that level of control, many of these problems didn’t matter

Tzviya Siegman: EPUB has no control over RS
… e.g. 10 RS based on webkit, all of them different

Rachel Comerford: privacy question: much like in EPUB with the education profile, is there a need for a privacy specification?

Liisa McCloy-Kelley: we don’t really have good support of javascript
… the ability to specify what to do with local storage would be great
… it’s a priority I hope we don’t have to wait too long for

Bill McCoy: many people don’t know there’s scripting in PDF too
… as we look into security considerations, we have to look at how PDF answers it
… we got to learn from the 20 years of experience

George Kerscher: on the a11y side, for a while a11y was shot down for security
… people were afraid that clear text sent to AT would be abused
… it should be remembered that if you can see it, it must be redirected to AT
… also, about tracking, schools are interested in tracking students’ progress (e.g. QTI things, analytics)
… it’s useful information, but there are privacy issues

Ric Wright: EPUB is curated, but it only takes a few malicious EPUBs to cause issues
… in Readium, we have several issues logged about our use of javascript
… content javascript might have access to the outside, it’s not clear it’s possible to block that
… maybe EPUB and PWP just can’t be made secure

Ivan Herman: it’s also an area where PWP and EPUB 4 might be different
… in my vision, WP and PWP should be simple, as webby as possible
… if we want to constrain that, maybe that’s better left for EPUB 4

7. Accessibility challenges

Avneesh Singh: plan for accessibility work. High level
… must be possible to make WPs accessible to wide range of readers. We want to keep this in mind up front as we work through technologies
… Scope - make sure to use WCAG and WAI requirements but add our own publication needs to fill gaps
… success criteria need to include impacts and challenges and how to resolve them
… coordinate with Acc Platform Arch WG
… Objectives - all deliverables are accessible according to WCAG and dig pub requirements
… also work on accessibility features such as Media Overlay, metadata, and other content items
… such items as drop caps
… How to meet these requirements? Both technical and political.
… A11y review will help for the base but won’t cover pub reqs. WCAG 2.1 is the place to finalize the work done in other groups. But it’s complex politically
… it may take a long time to get things into WCAG
… for us the CR is due 2019 and 2020, but not sure that will make WCAG
… also WCAG 2.1 is more a place to finalize things not do the actual work. Work should happen here in DigPub
… we should have a specific task force to focus on this work
… create a document to help with horizontal review with references/links to WCAG and other docs plus calling out the publishing specific ones
… task that it can address: 1) developing the pub specific requirements, 2) help meeting the requirements from note and WCAG, 3) work with AG WG to inc. into WCAG and 4) explore ways to address new issues (overlays, etc)
… Coordination is needed within the publishing community (WG and BG), A11y Platform Arch WG, A11y guidelines WG, and other groups like WAI, ARIA, etc. And let’s not forget ISO, the EU and others that will adopt and use these specs.
… there is also a group working on math, so addressing a11y there would also be a good place to be

Tzviya Siegman: involved with WCAG group when we proposed some additional techniques. What happened?

Matt Garrish: it’s not a high priority for them, other things are higher
… metadata will come up later on, but some things have been accepted

Avneesh Singh: minor things are in, but big ones (like metadata) are not
… for Section 508 and others, it is happening but slowly and not widely

Matt Garrish: lots of opinions

Tzviya Siegman: if we are willing to present them with something complete, then hopefully it won’t be a problem - but I am concerned about items where we are making changes to their docs

Leonard Rosenthol:

Ivan Herman: you addressed one thing that is important for not just a11y.
… lack of support in browsers. For example lack of support for SMIL
… not even sure how we’ll handle it for PWP anyway…
… XML specifications will cause adverse reactions
… not sure of the solution but this is something we definitely need to address.

Avneesh Singh: which is probably why SMIL isn’t the right direction and we need something else…

Ivan Herman: not just for a11y

George Kerscher: audio publishing industry needs to also be delivering (P)WP - so this is mainstream
… and not just for a11y, but also for language learning and others
… also the ACT (conformance testing) activities
… which are also at W3C and feeds into metadata and more
… we just need to be aware of
… Read Out loud feature also needs to be addressed due to lack of implementation. We should try to move this forward
… but will require resources and also authoring (lexicons needed, etc.)

Cristina Mussinelli: what happened to report about a11y on digpub?

Avneesh Singh: that was the gap analysis, now we will work on the specific tech issues

Ric Wright: SMIL may be outdated but many browsers support it
… for R2, this has come up as well and not sure what the solution is…yet

Charles LaPierre: we should also consider personalization as part of this work, cognitive and more.

Tzviya Siegman: there is a TF about this

Laurent Le Meur: we are not just thinking about SMIL for R2, but have done some prototyping with JSON. Perhaps some we can promote to the browsers.
… for R2, we are looking at how to use machine learning to improve a11y, such as language detection, named entities - all of which can help with lexicon creation

George Kerscher: that would mean that the machine would do it, not the publisher?

Laurent Le Meur: it could be validated by pub, but not necessarily authored

George Kerscher: let’s talk!

Tzviya Siegman: borrow Daniel Weck…

Tzviya Siegman: thanks - go eat!

8. Document planning

Tzviya Siegman: aggressive timeline… we’ll talk more later about editors
… hope to have a shell of documents today (laughter ensues…)
… level 1 headings only

Ivan Herman: The section headings and associated tentative editors have been collected at https://docs.google.com/document/d/1sXM51YzrfahFmkJBL-rt69Jvo0LGbOesleuEgwRWvP0/edit

8.1. WP Outline

Tzviya Siegman: Matt = editor
… a section = identifiers

Leonard Rosenthol: should we figure out the areas (task forces) and then pull back together, as we did in IG?

Ivan Herman: this is different, we are trying to carve out what the main areas are, how we organize the work and sub-editors comes later

Tzviya Siegman: SMIL or something like SMIL, security, …

Leonard Rosenthol: we have 4 deliverables, are 4 in one doc or…??

Ivan Herman: we can slice/combine deliverables, for time being we were talking about only WP
… PWP part we can talk more about when we have a clearer idea about WP, and then EPUB 4

Ivan Herman: WP sections are scope of this discussion

Laurent Le Meur: my list: starts with conceptual model (high level, not yet serialization specifics)
… when we go to HTML5, is it everything or do we have a profile that will work with all reading systems?
… when we speak of content we may need to speak about constraints
… you didn’t speak about doc- roles from ARIA replacing epub:type, do we need to reference the ARIA doc?

Tzviya Siegman: Identifier:

Tzviya Siegman: Metadata:

Tzviya Siegman: Manifest:

Tzviya Siegman: SMIL:

Tzviya Siegman: Content:

Tzviya Siegman: Security:

Laurent Le Meur: we have people who want to study web comics

Ivan Herman: I don’t know whether page transitions must be defined by this group, as opposed to CSS WG… we don’t have answer yet
… it has to be included on our list of concerns but not sure if it is in WP doc

Matt: TOC/navigation - if separate from manifest/reading order

Ivan Herman: identifier is different than locator

Ivan Herman: annotations have selectors (not called locators there)

Romain: section about integration with larger web platform facilities, e.g. storage?

Ivan Herman: that is similar case to Service Workers
… but that stuff is not part of the WP content model, it is more about implementation issues / implementability, we will see later whether it is in the same document or not
… localStorage, ServiceWorkers, etc. - maybe important but we do not specify
… there is no requirement for a separate section for i18n, we can decide to do it

Garth Conboy: scripting?

Tzviya Siegman: do we need User Agent conformance section?

Ivan Herman: we must have

Matt: some people say we should not have mixed RS and content conformance sections

Ivan Herman: we must have conformance section, status section, etc. but that’s for later

Leonard Rosenthol: stop using Reading System term?

Liisa McCloy-Kelley: navigation file is uncomfortable because of user aspects

Tzviya Siegman: we have in EPUB specs the triad of stuff inc. packaging… we don’t need that in WP…

Leonard Rosenthol: make annotations a top-level item?

Ivan Herman: web annotation model does not have anything related to UI of annotation, only exchange model of annotation structures

Matt: CSS overrides, personalization?

Mateus Teixeira: interoperability between publications?
… not addressed by EPUB today (beyond referencing), 2 pubs that “talk to” each other
… in trade world, a series of books, plot elements that could be related to each other and make stories more interesting
… in EDU world 2 content pieces could share test info (QTI) or 2 textbooks could share a study guide

Tzviya Siegman: sharing annotations on works

Ric Wright: core media types?

Tzviya Siegman: subheading of content model

Ivan Herman: to Mateus’s comment: I am not sure it is different from the Web. We start from the Web, so if the Web can do it, Web Publications can do it.

Ivan Herman: I don’t want to make any kind of restriction, maybe for EPUB 4

Rick: EDUPUB aka EPUB for Education, had things like launch-outs, need to figure out where these things land

Liisa McCloy-Kelley: want URL-ish construct that could link to product idea, that could be resolved in context (in case of book it could be resolved to the work, in context of users app and environment)
… mags are struggling to show a picture and bind to shopping

Ivan Herman: we cannot solve everything… this has come up in the earlier discussion, book ID systems used in the world is an immense problem, zillions of orgs have tried to solve… so far W3C has decided not to enter this game

Ivan Herman: we can’t solve it in group / during our 2+ years

Ivan Herman: we must have proper identifiers for a work, the tools to do it must be there, but we will not define a general identifier scheme plus all the services around it needed by industry

Karen Myers: Bill McCoy: Minus one to Ivan

Ivan Herman: to decide what we will or won’t do over the next two years
… we should focus on the outline of the spec
… if it does not need to be on the spec
… only come to consensus on what is in this outline

Ivan Herman: I believe the identifier issue we explicitly put out of scope for the charter

Bill McCoy: I don’t think we are doing it for this document
… but I +1 other point if something should be a web general thing
… if something is already on the web we should not do
… and the corollary, make sure the Web Platform group does it

DaveC: the Web works, let’s start there (per Baldur)
… need to address relationship between author and reading environment, flip side of personalization
… this is interaction between author and reading environment, how do we know what’s going on with reading system chrome, hand off events, etc.
… I think we need to make that an explicit part of our work

Avneesh: in EPUB world, NAV doc was required, to facilitate reliable a11y, to among other things create a unified document heading hierarchy
… but in EPUB there were not many restrictions in the HTML content so we required NAV doc to be complete
… if we don’t have restrictions in HTML in WP then we will similarly need to have complete NAV for a11y
… maybe task force needed?

Ivan Herman: why is scripting a top-level item in the heading?

Garth Conboy: could be restrictions on scripting, although it could end up a sub-section of security

Brady Duga: maybe each section needs a scripting sub-section

Ivan Herman: web just works, use it… this is WP so it’s by definition online, on the Web (not like today’s EPUB or PDF) at this level I would be uneasy about any kind of restriction

Ric Wright: fundamental difference of book or other publication from web page is it has multiple pages (whether fixed or dynamically generated)
… 1/3 of Readium code and 1/4 of EPUB spec are devoted to pagination

DaveC: within a content item CSS handles it but handling a “run” of multiple content items is in scope for us

Ivan Herman: a Web Publication is a collection of Web Resources that together form an entity
… you need identification of the collection of stuff, it is not just a set of individual resources
… we start with the concept that we have several content elements that are identifiable and have an order - that is the core thing around which everything else is put
… collection
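
[Editor's sketch, not part of the discussion: the model described above, an identified collection of Web resources with a defined order, could be represented roughly as follows. All class and field names here are illustrative assumptions; the group had not agreed on any structure or serialization at this point.]

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One Web resource belonging to the publication (hypothetical shape)."""
    url: str
    media_type: str = "text/html"


@dataclass
class WebPublication:
    """An identified collection of resources with a default order."""
    identifier: str  # identifies the collection as a whole, not one resource
    reading_order: List[Resource] = field(default_factory=list)
    other_resources: List[Resource] = field(default_factory=list)

    def all_urls(self) -> List[str]:
        # Ordered content items first, then supporting resources.
        return [r.url for r in self.reading_order + self.other_resources]


# Hypothetical example values, for illustration only.
pub = WebPublication(
    identifier="https://example.org/pubs/sample",
    reading_order=[Resource("chapter1.html"), Resource("chapter2.html")],
    other_resources=[Resource("style.css", "text/css")],
)
```

The point of the sketch is only that the identifier belongs to the collection, while the order is a property of the list of content items; interactivity and everything else would hang off this core.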

Ivan Herman: we have forgotten about huge library of documents that is the Web

Leonard Rosenthol: @bigbluehat pondering it in what context?

Ivan Herman: interactivity and other things come in when they are necessary but they are not the center

bigbluehat: archived paginated publication for the Web

Ivan Herman: not necessarily text, it can be graphics, video, but it’s not gmail or a game (borderline is fuzzy)

bigbluehat: pondering this (for the logs) https://tools.ietf.org/html/rfc5005#section-4.1

Ivan Herman: it’s more what it is not than what it is (it is NOT a game)

Garth Conboy: Outline: https://docs.google.com/document/d/1sXM51YzrfahFmkJBL-rt69Jvo0LGbOesleuEgwRWvP0/edit

Leonard Rosenthol: if sharing data with a VP offline on phone they want an interaction experience like dashboard but containment like PWP

Garth Conboy: anyone with link can edit

Tzviya Siegman: Editors???

Tzviya Siegman: Matt, Editor in Chief

Leonard Rosenthol: will edit security section

Tzviya Siegman: luc and boris for metadata
… we will check with them
… dauwhe volunteers for the manifest section
… defer navigation until known if this will be needed
… daniel weck and marisa demeglio volunteered for synchronized media
… leaving scripting section for later
… leaving page transitions for later

Dave Cramer: there has been interest in the css group about transitions

Tzviya Siegman: will defer personalization for later

Ivan Herman: need an editor for identifiers - that will be important

Tzviya Siegman: takeshi (if he joins) and bill kasdorf will take on
… mattg to take over introduction
… tim cole potentially for locators
… also potentially benjamin young

Avneesh Singh: george and I could take on navigation

Mateus Teixeira: i could try taking on personalization

Hadrien Gardeur: just for fun, I just used the Readium Web Publication Manifest model to convert the first example in the Web App manifest spec: https://gist.github.com/HadrienGardeur/ffc13b9ae5029b188c41907f365ab2c3

Tzviya Siegman: we have a broad outline now, do we want to try another?

Hadrien Gardeur: not really because Web App Manifest lacks a good abstract model and extensibility

Hadrien Gardeur: we would lose a bunch of things that we already use in Readium-2

bigbluehat: gotcha. are those outlined some place? I’d love to catch up on the distinctions

Hadrien Gardeur: bigbluehat: sure let me give you a link

Hadrien Gardeur: bigbluehat: the draft is available at https://github.com/readium/webpub-manifest and we use this mainly in https://github.com/readium/readium-2

bigbluehat: fabulous Hadrien thanks!

8.2. PWP outline

Garth Conboy: I pasted into irc the outline we did thus far for WP
… a fine idea to do the same thing
… for PWP
… let me paste it back in

Garth Conboy: https://docs.google.com/document/d/1sXM51YzrfahFmkJBL-rt69Jvo0LGbOesleuEgwRWvP0/edit#

Garth Conboy: that is the document that has stuff to look up
… On PWP
… Tzviya, Ivan and I did not do any early
… drafts of that, so let’s go around and get some input
… either verbal or typing into the document as the case may be
… take the terminology and conformance section from previous ones

Ivan Herman: PWP
… you get all the issues with origin, security
… how to maintain the origin
… that is a packaged level problem typically

Garth Conboy: Ok, that is a vote for identifiers to come down
… PWP itself

Ivan Herman: Origin
… subsection of it is the issue of Origin

Garth Conboy: Ok

Leonard Rosenthol: a separate topic

Ivan Herman: there was a load of security issues that you raised
… do we make PWP light and get more stringent on EPUB 4 level

Leonard Rosenthol: we can start on PWP; fine to be assigned to that section

Garth Conboy: my plan worked
… we’re going through a few sections for PWP
… We need to pick an archive format
… is that here or down at this level?

Ivan Herman: I don’t know what terms to use for that

Leonard Rosenthol: Don’t use the word archive
… Packaged format; it’s what the P stands for

Garth Conboy: compression format

Hadrien Gardeur: not sure we need archive format
… manifest more important
… well indication or convention where we find manifest
… how do you get that manifest into the package

Leonard Rosenthol: It pre-supposes that we know what the manifest is; but it doesn’t…we don’t care what foo is
… good point

Rick: Pre-supposes there is some known way for the package?
… some known way to package things

Leonard Rosenthol: does not have to be same thing if I zip or PDF
… always something called foo and I always package it
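
[Editor's sketch, not part of the discussion: the idea of a convention for locating the manifest inside a package, independent of what the manifest itself contains, might look like this for a ZIP container. The entry name `manifest.json`, the JSON encoding, and the function names are purely hypothetical assumptions; the group had chosen neither a container format nor a manifest format.]

```python
import io
import json
import zipfile

MANIFEST_NAME = "manifest.json"  # hypothetical well-known entry name


def read_manifest(package_bytes: bytes) -> dict:
    """Find the manifest by its conventional name and parse it."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        if MANIFEST_NAME not in zf.namelist():
            raise KeyError(f"{MANIFEST_NAME} not found in package")
        return json.loads(zf.read(MANIFEST_NAME).decode("utf-8"))


def build_example_package() -> bytes:
    """Build a tiny in-memory package, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(MANIFEST_NAME, json.dumps({"reading_order": ["ch1.html"]}))
        zf.writestr("ch1.html", "<p>Hello</p>")
    return buf.getvalue()
```

A different container would need a different lookup function, but the caller would still only ask for "the manifest", which is the separation being argued for here.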

Liisa McCloy-Kelley: * liisamk comment NOT about linear

Garth Conboy: I put package/compression/bundling and question whether we put it in PWP
… if one of profiles in zip, another in web package; I don’t know
… insufficient data; I would prefer one answer

Rick: I misunderstood your profiles as EPUB 4

Garth Conboy: right, EPUB 4 is the one profile we are on the hook to deliver
… whether it’s a zip archive or not I don’t know the answer

Liisa McCloy-Kelley: hearing this, comes to mind
… heard someone said yesterday
… someone is looking for archival format of the source material rather than what ends up in the package
… like high res art or video before it has been down sampled

Ivan Herman: what Liisa just said made me realize
… even on the WP level
… we are using one word for too many things
… I know there are people interested in archival (not zip sense) but the library communities
… maybe that is an extra item in the WP level
… may be additional information, metadata
… and people from UofM would want that

Hadrien Gardeur: using more IIIF for data docs and then decide which image types; whereas in our use case we would have one image
… need those two to work together

Ivan Herman: that word archive triggered that for me

Karen Myers: ..for Origin we have to have a section that defines what a profile is

Ivan Herman: I don’t know where we put that?

Garth Conboy: in the terminology section of PWP

Leonard Rosenthol: Actual archiving
… long-term storage/preservation
… is correct term
… For long-term preservation, how archivists look at
… is that there are two approaches
… either have a very specific format for collecting all of these things
… or you have a format that has an archival format
… like an EPUBa profile for archiving EPUB

Karen Myers: ..ensuring work we do

Leonard Rosenthol: that EPUB 3 or 4 can be used for archive purposes
… and not unreasonable to say we may need an additional profile
… that is a valid use and should not do anything to prevent that use case

Charles LaPierre: IIIF (http://iiif.io)

Hadrien Gardeur: worried about use of the word profile
… might be very different from PWP

Leonard Rosenthol: there aren’t profiles yet of PWP

Garth Conboy: from a charter perspective we have PWP and profiles of PWP

Hadrien: no profiles of @

Garth Conboy: We can have more profiles

Tzviya Siegman: and we can write specs forever! [laughs]
… shall we assign sections again?

Leonard Rosenthol: small section of PWP, I would take the packaged section that says go see the profiles for a starting point
… that would be a contentious area
… at least get it started

Tzviya Siegman: Heather Flanagan offered to do some writing

Garth Conboy: I tend to think we will approach this topic when we’re more in aspirational
… what is the last?

Ivan Herman: We may end up earlier
… the practicalities

9. How to write specs

Karen Myers: Dave Cramer: I think we had a section in the agenda about writing fact and not fiction when writing specs

Ivan Herman: this is a plea for implementation and experimenting…
… even on a small scale
… EPUB history of making a final spec and hoping someone builds it
… hope we can operate in a slightly different manner
… when we come up with something new, try to demo it and code it
… and see if it meets our expectations and iterate on things
… incubation and experimentation
… Hope we can make it a practice of doing all those things
… and match up with idea of doing tests early
… have experimental file be the core of tests

Bill: all in favor [raises hand]

10. practicalities

Bill: how do we do all this; task forces, not task forces
… this is still repositories, short names; a bunch of things to get through that are details
… we can stop here and pick up on call Monday if we want

Garth Conboy: Looking at WP which is longer with the sub-bullets
… we will make more progress when we start mailing manifest
… we’ll get more input

Leonard Rosenthol: all the way back to WP

Garth Conboy: yes, and other thing is what is our agenda for Monday

Tzviya Siegman: Yes, I was hoping we would address that today
… We will have a call on Monday
… we have a ton of work to do
… we have to deliver the first public working draft by the end of the year
… which is November in US

Karen Myers: Ivan; and Europe as well

Tzviya Siegman: WG won’t have a summer hiatus, although people will be taking vacations
… so we have about three/four months to do this
… and we have to get to work on it immediately
… we could start working on the manifest right now
… but people are getting tired

Leonard Rosenthol: was the plan
… question to the chairs
… is the plan to do this in one big group, or to break into task forces

Ivan Herman: Either we discuss now how we organize ourselves
… or we do it on Monday’s call
… Matt can look at old document and take a start from there
… but maybe some do require some sort of a task force

Garth Conboy: hard to do at this point without answering that question
… about PWP
… whether the packaging piece is an attribute of PWP or the profiles of PWP

Rick: two reasons I asked
… one area where I thought my name would go on the list, which is ok
… you can do later
… but was hoping to discuss as a group

Rick Johnson: http://idpf.org/ongoing

Rick: look at that page
… that is how the world stopped with EPUB
… are we going to define where all these things land in the new world?

Hadrien: A lot of them are not part of the core

Tzviya Siegman: I think we need to bring this up with the BG
… a lot of these are out of scope

Rick: Oh, I agree, but should not say what is/is not out of scope without the BG

Tzviya Siegman: yes, as terrifying as that is

Rick: That’s all I wanted to bring up

Avneesh Singh: for Accessibility

Karen Myers: ..we have to figure out the work we need to do

Avneesh Singh: and figure out whether we need another group
… and what group will do
… other thing is to note the DAISY board meeting

Garth Conboy: Rick feeling bad about not having his name there

Rick: I was expecting something would come my way
… I think Matt is editor-in-chief, but I can be his flunky

Garth Conboy: ok

Tzviya Siegman: Sounds like people would like to form the task forces
… we are not restricted
… if you want to be on TF; or in WG; or special circumstances, contact us
… Let’s look at our groups
… Intro does not need a TF
… Identifiers
… may or may not have two leads

Ivan Herman: We have to wait for that

Hadrien Gardeur: in EPUB 3 a lot of those devolved separately
… can we really do all those in parallel
… like cannot do manifest separate from metadata
… my feeling was some yes, that can live on their own
… but not the case for every item

Leonard Rosenthol: depends how far the groups go
… if manifest TF defines a manifest or a format
… here are all the things we require
… and get down into PWP or profile, then we say we will use JSON

Hadrien: I strongly disagree

Tzviya Siegman: we have check-ins by TFs with the larger group
… Dave is our manifest lead
… Matt, Mateus, Garth, Hadrien, Leonard, Laurent, Benjamin Young

Garth Conboy: … Brady too

Romain Deltour: + Romain

Karen Myers: [Karen: if you are not there, add your name please]

Leonard Rosenthol: Question of chairs
… when and how we have such larger conversations; if you have given thought?

Tzviya Siegman: on the weekly calls

Leonard Rosenthol: just asking

Ivan Herman: yes, weekly calls is where everyone is present and all the things are put on the table
… for example, I think that on Monday
… spending some time on what we think the manifest will be
… you two guys can have your fight [laughs]
… and have something more specific in mind for what to put there
… that is decision we have to take together

Garth Conboy: So Monday’s agenda is the kick-off of the manifest

Ivan Herman: yes

Tzviya Siegman: You don’t have to wait for a meeting; use email, etc.
… we can have this discussion continue

Garth Conboy: For Monday, do we have to switch to WG members’ email?

Ivan Herman: yes, at the moment that is correct
… it’s up to us whether we use the email list or the core GitHub list

Garth Conboy: from an agenda POV
… is it ok to send to the IG while we are still ratcheting up?

Karen Myers: Ivan; in some sense, no

Garth Conboy: it should be a WG call where people can make commitments
… the clean state should be that those who are not yet members of the WG should become WG members asap

Garth Conboy: I will send a short agenda then in the next 4-5 hours
… this first one I will send to IG and WG to send to right people

Tzviya Siegman: Or link to IG

Ivan Herman: with the extra step of people around this table, please lobby your AC Reps to join the Working Group

Garth Conboy: Leonard, will you and Hadrien both join?

Hadrien: yes

Garth Conboy: fair enough

Leonard Rosenthol: Procedural question
… given we know this and other things where we may not reach consensus
… how do W3C WGs resolve non-consensus

Ivan Herman: The WG will fight as long as necessary to find a consensus

Leonard Rosenthol: no voting process?

Ivan Herman: there is a voting; you can raise a formal objection
… and then it goes up to the Director

Ivan Herman: I think finding a consensus is important

Ivan Herman: Very practical question
… we will use the webex set up for IG for last time
… one offer came up
… since we all “love” webex
… an offer came up to use Go-to-meeting
… It’s fine for W3C as long as members are happy to do that
… nothing binds us to webex
… is there anyone here who prefers to stay with webex and not go to Go-to-meeting?

George Kerscher: it’s totally inaccessible
… webex is somewhat
… Blackboard is working on accessibility with their tool and it’s the most accessible I have seen

Ivan Herman: neither webex nor Go-to-meeting use shared boards?

Garth Conboy: have to be together on call

Karen Myers: ..is that level accessible?

George Kerscher: I use the phone line

Avneesh Singh: use the phone line

Ivan Herman: We will keep using irc and other tools

Avneesh Singh: depends who is chairing meeting

Ivan Herman: not done in GtM or Webex

Tzviya Siegman: just phone line?

Avneesh Singh: If phone line, then they are about the same

Rick: Someone has to start the meeting

Avneesh Singh: When I chaired accessibility group, someone has to be there to start the meeting

Ivan Herman: GtM…if we need a one-off call, webex is also at our disposal
… what we have experienced
… is that with a call for a year or two, the call gets hacked and then we find out Cisco cuts off calls five minutes before our call
… if TF prefers for webex
… then I can set those up for you

Garth Conboy: I can always start it

Leonard Rosenthol: Adobe’s connect product is accessible if we need it
… I can give individual chairs their own groups
… you can let me know

Ivan Herman: That’s it

Leonard Rosenthol: Connect accessibility info - https://www.connectusers.com/tutorials/2008/11/meeting_accessibility/

Garth Conboy: I will send an agenda
… and hope to have the majority of this group and others on Monday’s call
… thank you!