Publishing Working Group Telco — Minutes

Date: 2019-04-01

See also the Agenda and the IRC Log


Present: Tzviya Siegman, Ivan Herman, Rachel Comerford, Ric Wright, Wendy Reid, Romain Deltour, Franco Alvarado, Jun Gamou, Ben Schroeter, Charles LaPierre, Bill Kasdorf, Marisa DeMeglio, George Kerscher, Avneesh Singh, Luc Audrain, Garth Conboy, Joshua Pyle, Mateus Teixeira, Laurent Le Meur, David Stroup

Regrets: Dave Cramer


Chair: Tzviya Siegman

Scribe(s): Romain Deltour


1. Approving last week minutes

Tzviya Siegman:

Tzviya Siegman: any comments or feedback?
… minutes approved

Resolution #1: last week’s minutes approved

2. Web Pub doc reorganization

Matt Garrish: I’ve done a couple of PR merges
… we said the ambition was to pull apart the manifest from Web Pub
… so audiobooks may not conflict with Web Pub
… we split it in 3 parts
… Part I is the abstract manifest, shared with all digital publications
… the section largely looks the same as it did (manifest, IDL, etc), just generalized
… in 2.6 properties have been generalized as well, categories are now defined for each property
… it will help with extensibility, but also helps describing the property type
… the most recent update was to move the manifest lifecycle into this section as well
… once you have the manifest, the process tends to be the same across the different formats
… Ivan generalized the steps between the different manifests
… Part II is about Web Publication
… all the sections say how it extends from the general manifest section, so no need to repeat
… the rest of it is unchanged
… Part III is where we intend to define how to extend the manifest, and compatibility requirements
… that’s the quick summary of the latest changes, questions welcome

Ivan Herman: one more thing to emphasize: in Part III we define what is needed when you define a new profile (audio profile will be a test case)
… a profile can define extra steps in canonicalization, extra checks, etc. the extension points are well defined

Tzviya Siegman: other questions, comments?

George Kerscher: must there be a profile for all formats?
… if you want to do a single file journal article, would that be a profile?

Ivan Herman: in theory, no.
… the question will be if for some categories there are requirements that make it necessary to have a specific profile
… e.g. if having an author is a MUST, then you need a profile (as the generic rule is a SHOULD)
… or an audiobook could say all entries in reading order must be audio files

Tzviya Siegman: for things like a basic journal article, which is not different from HTML, I don’t expect people to start using it

George Kerscher: the base Web Pub would support trade books, and text books?

Ivan Herman: as far as I can see, yes

Bill Kasdorf: would it be reasonable to say that a Web page of a journal article is inherently a Web pub?

Tzviya Siegman: I don’t think so, there are requirements that many web sites don’t meet

Luc Audrain: +1

Bill Kasdorf: ok, so it’s one direction not the other

George Kerscher: +1

Tzviya Siegman: I think we can vote, are we ready to publish that as the next editor draft?

Proposed resolution: publish the latest ed as a TR document (Ivan Herman)

Tzviya Siegman: +1

Wendy Reid: +1

Garth Conboy: +1

Mateus Teixeira: +1

Ivan Herman: +1

Marisa DeMeglio: +1

Franco Alvarado: +1

Rachel Comerford: +1

Joshua Pyle: +1

Matt Garrish: +1

Charles LaPierre: +1

Bill Kasdorf: +1

Luc Audrain: +1

Ric Wright: +1

Ben Schroeter: +1

Romain Deltour: 0 (didn’t take time to review, sorry)

Resolution #2: publish the latest ed as a TR document

Ivan Herman: the group has a possibility to react within 5 days!

Tzviya Siegman: thx Matt and Ivan for the work!

3. ToC algorithm

Tzviya Siegman:

Tzviya Siegman: issue 378
… the issue is “what goes into ToC?”
… the proposal is to leave things as is, unless we have evidence it needs to be adjusted
… mateus said they need extra types (chemistry, music) in the ToC

Mateus Teixeira: I can provide examples from NN

Matt Garrish: there are two issues, one is allowing markup within the ToC labels (#414), but #378 is more about the various structures of ToC
… what kind of different structuring of the ToC should we try to account for
… maybe we should wait and see

Ivan Herman: my impression is that the possibility of putting advanced markup in label is different from #378
… the reason I raised it back then is that some structural things (e.g. section elements) are ignored by the algo, and I was wondering if other things should be ignored too
… my feeling is that the answer is no; but it doesn’t mean we can’t allow MathML
… it seems we can close #378 without much problems, regardless of what we decide for the markup of labels

Tzviya Siegman: I agree with that

Mateus Teixeira: +1, but I’ll share examples either way

Tzviya Siegman: mateus can provide examples for #414 then

Matt Garrish: right, we’re waiting for evidence for more table of contents
… we can close it and raise specific issues, specific kind of ToC when we have evidence or examples

Tzviya Siegman: the proposal is to leave the algo as is and close #378

Ivan Herman: yes, I have the impression that the structure of the ToC, as currently described, should be fine as-is.

Mateus Teixeira: true

Avneesh Singh: +1

Tzviya Siegman: ok, so overwhelming support for closing #378, and mateus will add comment to #414

Romain Deltour: +1

Charles LaPierre: +1

Wendy Reid: +1

Luc Audrain: +1

Mateus Teixeira: +1

Joshua Pyle: +1

Ivan Herman: +1

Resolution #3: overwhelming support for closing #378, and Mateus will add comment to #414

4. draft proposal for audiobooks

Wendy Reid:

Tzviya Siegman: also works

Wendy Reid: special thanks to Laurent for the link
… I proposed a first PR, Ivan and Laurent left a ton of feedback
… we’re planning on merging it shortly, as it’s not easy to comment right now
… the approach is one of balancing between WebPub and Audiobooks
… I wanted to make it readable for someone who didn’t read WebPub
… happy to fix it wherever needed

Ivan Herman: can we pick one or two of the issues I listed, which require brainstorming?

Wendy Reid: yes!

4.1. Does Reading order only includes audio files?

Ivan Herman: one question I had is: if I take an audio book, and want to differentiate it from other Web Pub, is it reasonable to say that the reading order contains audio files only?
… (and anything else can be in the list of resources)
… is it reasonable to say? would it make it easier to produce? or is it unnecessary?

Wendy Reid: I think we discussed it, about supplemental content
… we came to the conclusion that for the initial version, we would make the reading order audio only, and non-audio resources would be in the resource list

Avneesh Singh: I remember the same. Audio profile will have list of audio in reading order.

Garth Conboy: +1 to Wendy

Marisa DeMeglio: I don’t disagree, but looking at the examples, I wouldn’t know where to render cover and content (rel values)

Wendy Reid: I think Laurent opened issue #405 about this

Marisa DeMeglio: historically it seems UA don’t want to do anything, they want to be told what to do

Tzviya Siegman: before we continue, I think we’ll want to keep the audiobook issues in the main repo

Wendy Reid: right, do not log issues to the audiobook repo, use the main Web Pub repo

George Kerscher: when we discussed this there were two proposed solutions, one is the rel value, one is the supplemental ToC

Wendy Reid: I did include it in the draft that if you have supplemental content, you must have a ToC

Ivan Herman: then this should be in the document? Formally this is one area where audiobooks impose a restriction that WebPub doesn’t have, so it should be documented
… in the general structure there is a possibility that I do not provide a reading order at all, in which case it falls back to the default where the entry page is the only content in reading order
… which is HTML and not audio, so we have to say that a reading order MUST be provided

Bill Kasdorf: there can be audiobooks of podcasts, with a transcript for accessibility?

Wendy Reid: yes

4.2. Usage of duration; is it required?

Ivan Herman: so, duration is a new property, there was several questions there
… are there 2 places where ‘duration’ can be used? For the entire publication and individual audio resources?

Wendy Reid: yes

Ivan Herman: is the duration, for the whole publication, a required property?

Wendy Reid: I think it should be required as UA would find it very useful, but on the other hand it could be calculated
… it’s nice to know

Tzviya Siegman: based on discussions we had (calculation can be incorrect), I think I would make it a required property

Avneesh Singh: yes, it is a burden to expect RS to calculate it; it is also an important piece of metadata
… so it should be mandatory for the publication, and can be optional for individual files
… another question was about media fragments, can you point to a precise position into an audio resource?

Wendy Reid: yes, you’re right we should specify that

Luc Audrain: for the duration, it’s true that publishers will expose the total duration to the users
… it’s displayed on vendor’s web sites; but we don’t want to rely on metadata that is inside the package
… but I agree with Avneesh that it should be mandatory

Bill Kasdorf: that sounds like an ONIX issue

Garth Conboy: thinking back of EPUB where we have metadata about properties about individual resources: it can be useful, but the RS will have to process the resources and figure out what the real length is anyway, there’s no guarantee the metadata is right

Tzviya Siegman: historically we had a problem with EPUB about the ability to rely on the provided information, but that should not be a reason why it can’t be in the specification
… not all distributors are using ONIX
… since the property exists in we can reuse it

Joshua Pyle: +1 to Tzviya

Garth Conboy: the question is whether it should be required, you can have arguments both ways

George Kerscher: a+

Wendy Reid: right. I think it should be required; of course it puts the onus on content creators

Laurent Le Meur: the metadata must be taken as a descriptive metadata
… we should not make it required; like all descriptive metadata it can be optional
… if there’s a discrepancy with ONIX metadata and in-package metadata, it’s not an issue if it’s descriptive only

George Kerscher: the ONIX record needs to find out from somewhere what the information is
… it seems like having metadata about the book in the book enables correct cataloging in all kinds of different places
… I would think the metadata would be the authoritative

Avneesh Singh: I agree with George; we’re producing audio books since 1996
… if there are inaccuracies it’s because there are no proper standards, but if this becomes mandated, production tools will generate this metadata correctly

Ivan Herman: I think editorially we have an open issue (no consensus)
… you (wendy) as an editor makes one choice, and link to the issue

Joshua Pyle: as a system implementor, I love required things
… if something is optional, then my implementation will have to do it, to calculate it
… and if I do it, I’m always gonna use it

Luc Audrain: +1 to Josh

Joshua Pyle: so as an implementor an optional info is useless
… that is critical info, so why would you not want it there?

Luc Audrain: I agree with Josh’s earlier comment; the content creator knows the duration of the book
… it’s not an issue to make it mandatory in the manifest, it’s a better situation of UA developers

George Kerscher: the audio book (complete publication) is in audio, with supplemental material
… with a Web publication with video clips, there can be video clips here and there, it’s not a global duration
… the publication is not fundamentally time based
… so we need to differentiate something completely time-based, and something with time-based content

4.3. Duration for non-audio

Ivan Herman: can we move on to a different question on duration?
… I presume that you refer to duration?

Wendy Reid: yes

Ivan Herman: the duration is defined on the MediaObject type
… which also applies e..g. to video
… so maybe duration as a property should move to the generic Web Publication spec
… maybe the property itself can be defined in general, but generally defined on individual resources
… but audiobooks (and “videobooks”) would specify its usage for the book as a whole
… we separate between the usage of the property on top-level and individual media entities

George Kerscher: +1 to what Ivan is saying

Ivan Herman: there are vocabs that we have to discuss, but we can leave it to later discussions

Wendy Reid: I agree

Wendy Reid: this was a productive discussion!

Tzviya Siegman: +1

5. Resolutions