Publishing Working Group Telco — Minutes

Date: 2017-08-21

See also the Agenda and the IRC Log


Present: Ivan Herman, Avneesh Singh, Baldur Bjarnason, Luc Audrain, Benjamin Young, Jeff Printy, Jun Gamou, Rachel Comerford, Tzviya Siegman, Leonard Rosenthol, Matt Garrish, Romain Deltour, Laurent Le Meur, Tim Cole, Charles LaPierre, Ben Schroeter, Leslie Hulse, Bill Kasdorf, George Kerscher, Yuri Khramov, Garth Conboy, Brady Duga, Hadrien Gardeur, David Stroup, Harriett Green

Regrets: Chris Maden, Marisa DeMeglio, Reinaldo Ferraz, Peter Krautzberger, Bill McCoy


Chair: Garth Conboy, Tzviya Siegman

Scribe(s): Jeff Printy


Tzviya Siegman:

1. last minutes

Tzviya Siegman: Comments on last week’s meeting?

Resolution #1: last week’s minutes approved

Tzviya Siegman: next item, welcome to new members
… no new members on call

Tzviya Siegman: Moving along, a number of things to cover today

2. Dave & Benjamin’s HTML TOC as manifest proposal

Tzviya Siegman:

Tzviya Siegman: Let’s take a look, Dave wisely went on vacation after sending this out, Benjamin please give an overview

Benjamin Young: What dave and I wrote up was a basis for using HTML as a binding document
… By referencing just primary resource you could get the boundaries of a publication and the browser has enough builtin tooling to bring in secondary resources/dependencies and thereby offline the whole thing

Ric Wright: muting in GTM:

Benjamin Young: My contribution was a proof of concept that it could be done. IT brings a number of publications in, in including an audiobook. It can grab the primary resource and retrieve them for offline consumption.
… Using service workers to update the local cache to maintain a local copy of the remote resource. Including notification for updates to the retrieved data.

Benjamin Young:

Benjamin Young:

Benjamin Young: There’s a demo section where you can try demoing public domain books using this system

Benjamin Young:

Benjamin Young: Issue #35 contains the proposal
… No comments thus far
… Add comments and feedback to the above linked issue.

Leonard Rosenthol: Interesting conversations already happening in the comments of the issue

Garth Conboy: There are a handful of high level issues wit this approach raised by myself and adrian as well as higher level comments about using HTML for this kind of purpose. This is a fundamental direction and there is a competing fundamental direction and I don’t know how to drive this toward consensus. Ivan is in the queue and he should see a clear path through this

Ivan Herman: My personal conclusion is that to use the HTML manifest in general and TOC in particular as an exclusive syntax for the manifest is not necessarily the way we should go
… The issue we didn’t discuss is what happens to other manifest items.
… If we go the other way and the only other avenue we have is to basically do things in JSON
… We have already discussed in general if we use JSON we have a number of fallback cases where the info is not present in JSON but maybe be available in HTML
… I see the HTML manifest as a fall back for use in those cases. I would propose we go ahead with some form of JSON (which format is a separate discussion). While we’re doing that we’ll have to talk about what kind of data can and can’t be encoded in the JSON manifest and what should the fallback target be

Bill Kasdorf: Comment re: How prescriptive vs how accommodating we can be. We should aim to be as accommodating as possible. Are there cases where this is sufficient?

Garth Conboy: For simple publications we could perhaps provide a fallback to HTML, Ivan’s solution is a good way of “splitting the baby

Benjamin Young: marcos’s contribution about JSON and the web app manifest. Summed up well in issue #7
… web app manifest is additive information that isn’t used by the browser rather provided to the browser API to create universal windows platform apps or install an icon on your home screen which is a shortcut to chrome without the UI
… it is entirely additive information, secondary to the primary information. Our JSON must be authoritative, exhaustive and contain all the dependencies as well as accounting for what happens when resources are unavailable.
… Garth the issue you mentioned was potentially spidering wikipedia
… If you reference wikipedia it will get only that one article and not all related links. We could add specification about what should be gathered for offline use but at this point you can’t get stuff off other people’s sites even if you add it as a primary resource
… The issue is less about the serialization format and more about the type of info we’re encoding

Ivan Herman: More an admin comment, let’s separate between two different questions.
… 1) We propose to use HTML to encode the data
… 2) We discuss the details on how the entries of the TOC, the list of primary and secondary resources, etc, relate

Jeff Printy: Let’s move through the queue and circle back

Leonard Rosenthol: Back to Bill_Kasdorf’s comment about serialization, it made me wonder whether or not we needed a single mandatory serialization, or maybe there are multiple serialization for different use cases. We’ve had arguments in favor of JSON and HTML, we should keep in mind multiple options.

Baldur Bjarnason: Just want to note that having multiple serialisations is an interoperability and adoption nightmare.

Garth Conboy: We have to have a place that is the root, HTML vs JSON we could argue a long time. I have a little trouble with multiple choice there. I wouldn’t have a problem with an agreed route with fallbacks but multiple choice could be a problematic design philosophy

Tim Cole: The html avenue is a little premature, minimum manifest vs maximum manifest, manifest vs toc, manifest vs reading order. Are the separate or can the be conflated.

Tzviya Siegman: +1 to Tim

Benjamin Young: +1 to Tim also. Big issues first. Serialization second (perhaps even last)

Tim Cole: It doesn’t seem that complicated to offer multiple serializations for a minimum manifest but for a complex manifest it becomes more complicated to offer multiple serialization options.

George Kerscher: George’s 2 cents is that one way of doing things is good.

Tzviya Siegman: We seem to be agreeing that this proposal is preliminary we should spend this time discussing other issue in the queue

Yuri Khramov: My note is that we should take into account the implementation and development that are going on already (epub3 readers) using JSON manifest

Bill Kasdorf: I think that it would be helpful instead of considering this a proposal consider it as an ongoing Proof of Concept. Dave and Benjamin have a working model to show whether HTML can accomplish it or not. It’s very useful work they have done, it’s a good way for us to test this particular syntax as the definition of manifest evolves

Ivan Herman: Admin issue, are we now talking about serialization or something else?

Garth Conboy: I was unclear on the request, I thought we were in semi-agreement on the manifest

Leonard Rosenthol: @garth - no, we are not there…

Ivan Herman: We are not yet complete

Leonard Rosenthol: We’re not there yet because we haven’t sorted out what belongs in the manifest and what doesn’t and we’re nowhere near deciding what belongs in the manifest

Tzviya Siegman:

Leonard Rosenthol: It’s not clear that we should be talking serialization if we don’t know what we’re serializing

Benjamin Young:

Garth Conboy: Specifically I would like folks to look at section 3.2

Leonard Rosenthol: This is a list that we all agreed were important we just haven’t agreed on musts vs shoulds.
… Some of these are even more subtle. eg. #2, it’s unclear whether the language of the manifest is the language of the publication or the language of the manifest.
… I think we need some more time to nail down what the MVP of the manifest is then we can start serialization

Tzviya Siegman: Add comments to the particular draft to make remarks about the proposed language used.

Baldur Bjarnason: A lot of the extant issues are IMO serialisation issues: i.e. where the data belongs, how it is expressed, where the UA should look for it, how to deal with incompleteness or incorrectness.

Leonard Rosenthol: @tzviya - where you would like the text comments? In new github issues that are duplicative of the existing (unresolved) ones?

Benjamin Young: The github page details what is built in to the document. The target audience of the HTML proposal is the browser because if you give the browser JSON data it doesn’t know what to do with it but if you give browser HTML it renders it even if you add more data to it.

Benjamin Young: The target audience has been the browser, I just want to make sure that whatever serialization we decide on we keep the browser in mind. JSON is used in browsers but if you load it directly in a browser it chokes. Javascript is the consumer of json, not the browser

Tim Cole: I’m not comfortable specifying a serialization format until we know whether the manifest format can handle the types of data we may want to include in the infoset longterm

Benjamin Young: +1 to Tim’s cart/horse concerns about infoset and seralizations

Bill Kasdorf: We need to be careful not to confuse whether something is a “should” for the manifest vs. a “should” for the publication. A TOC, for example, can be a “must” for the publication but a “should” or “may” for the manifest.
… the manifest might just point to the TOC, not contain it.
… that is may or may not contain it.

Ivan Herman: semi-administrative comment, let’s try to focus on our goals and significant milestones. The goal is to have a first working draft at the end of the year. it’s important to have a working draft that puts down the significant milestones of what we want to do even if they’re not completely polished. This may help new members etc decide to join or not, talking about the browser makers here who are not represented in the wg

Laurent Le Meur: @Bill_Kasdorf, this is what the spec tends to express via the section “infoset” (-> publication) vs “manifest” (-> a specific structure)

Bill Kasdorf: +1

Ivan Herman: an abstract manifest will not cut it, the community needs something more specific. We need to have a direction for solving this problem, abstract discussion will not solve it because we need to have a fairly good idea about serialization by the end of the year. We need to define for ourselves how we work with the web app manifest which obviously everyone will ask about and we’re presently far away from giving an answer about this
… We can lose our way in very detailed arguments but it’s too early to get lost in details
… I had the pleasure to work for several years with RDF and I have seen 4 different serializations that have been defined for the RDF. Let’s not underestimate the complications of defining and abstract model. I’m against defining 2 or 3 syntaxes as standards for the manifest

Leonard Rosenthol: @ivan - you are allowed to be sorry :)

George Kerscher: +1 to a single way

Tzviya Siegman: I was wondering if we should leave the serialization discussion to github and instead go to the next agenda item

Laurent Le Meur: @leonard, about options, pls see the XML goals #5: The number of optional features in XML is to be kept to the absolute minimum, ideally zero.

Garth Conboy: Given ivan’s comment it probably makes sense to set some sort of deadline to say we’re as far as we are on the abstract manifest and we’re going to move on to serialization and given our WG schedule it shouldn’t be multiple weeks from now. If there is further discussion on section 3.2 it should be a homework assignment for the group to be able to say a week from today we start to move in to the concrete manifest/serialization? Reasonable

Jeff Printy: ?

Tzviya Siegman: Original deadline was tomorrow so you’re being far more generous

Leonard Rosenthol: Can you clarify where you’d like feedback on section 3.2

Tzviya Siegman: Comment on the pull request

Laurent Le Meur:

3. milestones

Garth Conboy: Now there are 10 minutes left, the eclipse is starting on the west coast

Tzviya Siegman:

Garth Conboy: moving on

Tzviya Siegman: On friday and over the weekend Ivan, Garth and I looked through the taskforce status and milestones

Ric Wright: MattG

Tzviya Siegman: Abstract manifest due next week. We saw the metadata this morning, we’ll work that into the agenda next week. I put out a call for the archiving task force next week. So far one person. Any more interest in the task force?
… Security and privacy, any work we don’t know about yet?

Leonard Rosenthol: No work on security yet

Tzviya Siegman: We’ll continue with the feedback and once the doc is cleaned up we’ll ask matt to add it to the main doc and add it to the agenda for next week

Benjamin Young: I volunteer for offline

Tzviya Siegman: Offline instance of web publications, do we have volunteers?

Leonard Rosenthol: clarification: Offline is not the same as packaged

Tzviya Siegman: Tim, working on web publication reference, deadline sept 30th

Tim Cole: Getting ben to help, it’s good

Baldur Bjarnason: I can help with offline.

Tzviya Siegman: There’s a lot of writing to do, these are very broad categories

Leonard Rosenthol: Will do (on mateus)

Tzviya Siegman: Matteus offered to work on privacy/security section
… We’ve got to get this out before TPac:

Ivan Herman: The goal is to have a clear set of problems we have, a direction we’re planning to go and an idea for the solution

Tzviya Siegman: metadata proposal

Luc Audrain: Metadata TF : Baldur, Boris and Luc

Garth Conboy: Keep in mind this is a public working draft, there will be issues but we need to drive toward completeness
… get on a task force and help us nail down the manifest before the next meeting

Ben Schroeter: sun is a crescent now here on west coast

Garth Conboy: Let’s go look at the sun but not really

4. Resolutions