Publishing Working Group Telco — Minutes

Date: 2019-02-25

See also the Agenda and the IRC Log


Present: Tzviya Siegman, Simon Collinson, Wendy Reid, Luc Audrain, Nick Ruffilo, Benjamin Young, Gregorio Pellegrino, Avneesh Singh, Romain Deltour, Mateus Teixeira, Ivan Herman, Marisa DeMeglio, Ben Schroeter, Rachel Comerford, Charles LaPierre, Brady Duga, Garth Conboy, George Kerscher, Geoff Jukes

Regrets: Matt Garrish, Bill Kasdorf, Deborah Kaplan, Joshua Pyle, Vladimir Levantovsky, Franco Alvarado, Dave Cramer


Chair: Tzviya Siegman, Wendy Reid

Scribe(s): Nick Ruffilo, Mateus Teixeira


Tzviya Siegman: A number of regrets, so a change in the agenda. We thought we would have Josh and Franco to talk about use cases and requirements. Franco is on vacation and Josh sent regrets. The second half of the meeting will be Wendy following up on audiobook issues.
… Let’s approve the minutes from the Asia-friendly timezone meeting, as well as our meeting from 2 weeks ago

Tzviya Siegman: Any comments? I hope you read the summary, but minutes can be approved

Resolution #1: meeting of the 8th (asia) approved

Tzviya Siegman: any comments on the meeting from 2 weeks ago? Minutes approved

Resolution #2: meeting of the 11th approved

1. ToC in JSON-LD

Tzviya Siegman: Next item on the agenda is the TOC. We need to come to a resolution on this issue. GitHub link posted. This is about whether the TOC is directly encoded in the manifest. We came close to a resolution but didn’t quite get there. Discussion continued on GitHub (and Twitter…)
… the last point of discussion in our meeting was that - with audiobooks they don’t necessarily have a TOC, but we don’t make our specification audiobook specific or exclusive. So if serializing HTML or JSON were really very different…
… then some discussion about if the TOC is required. There’s a link in the github comments… If the TOC isn’t required then it doesn’t matter if it’s HTML or JSON. I believe what we’re coming around to is the TOC is required to be in HTML but it still remains optional

Proposed resolution: for AudioBooks, ToC is optional; serialized in HTML. (Garth Conboy)

Tzviya Siegman: the proposal is for WP, not just Audiobooks

Garth Conboy: we can have different rules for different profiles.

Wendy Reid: I want to state, we discussed this last time, and I believe it was Geoff – for audiobooks, we’re having the opposite problem. They don’t want HTML. The current environment for audiobooks, no one is using HTML. None of the readers are using HTML for audiobooks, so introducing it could cause a lot of issues.
… it’s still good that different profiles will have a different implementation, I just want to say that it shouldn’t have to be in HTML. Audiobooks have TOCs, but requiring HTML is the issue.

Ivan Herman: the fact that the TOC is optional is already in the document.

Tzviya Siegman: thank you for clarifying

Brady Duga: Is JSON already used by audiobook publishers? For reading systems/listening systems?

Wendy Reid: Blackstone and Kobo use JSON.

Garth Conboy: I was under the impression that the proposal I put in was where we were in agreement. If it’s optional, then we don’t really care about the serialization, so any system ingesting audiobooks can read any TOC. And we’d have rules as to how to serialize a TOC in HTML…
… I am not in favor of widening it to additional serialization

Geoff Jukes: We don’t use HTML for TOC, we use JSON. The main reason is that we don’t want our publishers to define how things are displayed. It’s optional and its use is optional, so we don’t mind it being part of the spec. If the TOC is optional, but having it requires we use it as such, that will not be OK.
… our applications are not EPUB3, they are audiobooks, and different.
… It would require additional application development for us to unserialize the HTML. It would also provide no value to us or our customers, so it seems crazy to think about.

Tzviya Siegman: I will point out that one of the things that comes up when working on specs is that we have to make compromises. It might require changes to the tool chains and how they operate. What we’re hearing is that Google creates audiobooks in one way and Blackstone another. One will have to change.
… The value to creating the table of contents in HTML is that it would be useful in more than 1 implementation, not just audiobooks. All types of publications would have to allow for that possibility.
… instead of one publisher saying “I have to adjust my tool chain” all might have to think about the options. One possibility is we make a change that affects only audiobooks — but we’re trying to make things generic if we can.

Mateus Teixeira: +1 duga

Brady Duga: This has nothing to do with existing tool chain — we can parse HTML. If we adopt JSON it’s trivial. My note is that HTML is a better representation because it can do things like ruby which allows for more languages

Benjamin Young: +1 to HTML has more (i.e. i18n, a11y, etc)–and that’s good

Laurent Le Meur: the reading system would have to handle both — just to clarify, not the publishers.

Tzviya Siegman: we do have to make a decision. Consensus is difficult, but compromise is key

Ivan Herman: What we can try to do as a way forward, in general, for Web Publications and the various profiles: we do not define a TOC in JSON in general. Some profiles may do something specific to their profile. The Audiobook profile could say: “for audiobooks, it’s possible to do it in JSON or HTML. It’s up to the publishers which way to go.” What I do not want to see is that Audiobooks say: ‘you must not do it in HTML, you must do it in JSON’. That will really create a schism that would be detrimental.

Garth Conboy: +1 Ivan

Wendy Reid: I agree with Ivan. I understand why HTML is important. It would be interesting to see audiobooks adopt rich-text experience. In terms of adoption as the industry stands today would be better with JSON. A JSON TOC within the manifest, then an HTML document with a rich version of the TOC.
… I know giving publishers options can create issues, but at the very least, if the publisher wants to create a rich-text version, they can.

Luc Audrain: +1 to wendyreid

Ivan Herman: What this means is that in the audio profile, you’ll have to spec out not only the format, we’ll have to define if the publisher has two types of TOCs, which one wins? For the general web publication, what we have is what we have.
… We’ll have to spec this out for audiobooks only, and what it means for something to be audiobook and not audiobook.

Garth Conboy: I don’t think of the HTML as a rich-text format, as I don’t see it as something that is displayed. It’s meant to be machine readable. What I’d be proposing in the lines of compromise. In Audiobook land, the TOC can be JSON or HTML. If there are both, the HTML version wins.

Avneesh Singh: +1 Garth, html should win

Benjamin Young: Looking at linked resource, this is more an ask for Wendy and Geoff, but is that sufficient for the TOC you’re producing now? Or are you needing a TOC that is more complex and doesn’t map to the reading order?

Wendy Reid: Right now, the average audiobook we receive has a very basic TOC. Chapter 1 is this, and X long; chapter 2, etc… Sometimes we don’t even get chapters. Having a detailed TOC for the audio industry might be a push…

Benjamin Young: So that could be ranges of parts of files, right?

Wendy Reid: Yes, it could point to timings within files.
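
A minimal sketch of what "timings within files" could look like in practice, assuming ToC entries use a temporal fragment of the form `#t=start,end` in plain seconds (as in the W3C Media Fragments URI syntax). The minutes do not fix any syntax; the function name `parse_time_range` is hypothetical, and clock formats such as `npt:` or `hh:mm:ss` are not handled here.

```python
from urllib.parse import urlsplit


def parse_time_range(url):
    """Split a ToC target into (resource URL, start seconds, end seconds or None).

    Assumption: timings are plain seconds in a "t=start,end" fragment.
    """
    parts = urlsplit(url)
    start, end = 0.0, None
    for piece in parts.fragment.split("&"):
        name, _, value = piece.partition("=")
        if name == "t" and value:
            first, _, second = value.partition(",")
            start = float(first) if first else 0.0
            end = float(second) if second else None
    # Drop the fragment so the caller gets the bare audio resource.
    return parts._replace(fragment="").geturl(), start, end
```

An entry without a fragment simply maps to the whole file, which matches the "Chapter 1 is this file" style of ToC described above.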

Proposed resolution: For AudioBooks ToC may be provided as serialized in either HTML or JSON, if both are present, only the HTML will be processed by the Reading System. (Garth Conboy)
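
The precedence rule in this proposal can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_toc` and the way the two serializations are passed in are hypothetical, and the proposal was not adopted on this call.

```python
from typing import Any, Optional, Tuple


def select_toc(html_toc: Optional[str],
               json_toc: Optional[Any]) -> Optional[Tuple[str, Any]]:
    """Pick the ToC a Reading System would process under the proposal.

    Returns ("html", toc) or ("json", toc), or None when the publication
    has no ToC at all (the ToC itself is optional).
    """
    if html_toc is not None:
        # If both are present, only the HTML serialization is processed.
        return ("html", html_toc)
    if json_toc is not None:
        return ("json", json_toc)
    return None
```

Under this rule a Reading System still has to be able to parse both serializations, since a publisher may supply either one on its own.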

Nick Ruffilo: Could we have two different designations for a TOC? Machine readable & visual? JSON could be machine-readable, for example.

Geoff Jukes: We don’t use the JSON to define parts of resources — we don’t include timestamps. Our dataset is very minimal. It’s covered by the resource duration, the name of the file, the label — although we deal primarily in English…
… From a business-to-business perspective, I don’t have issues passing HTML as long as it is strictly defined — as the JSON is. My assumption is that when we tell users they can produce HTML, they will send anything.

Ivan Herman: Current structure for TOC in HTML ->

Geoff Jukes: if it is strictly defined, then I don’t have a problem with it. For our existing applications, it’s of no use, but we’d translate that from B2B to something our apps use. As long as it’s strictly defined, I need it machine parsable.

Garth Conboy: Ivan just pasted in a relevant link for Geoff. The TOC we have spec’d in WP for HTML serialization. There are rules about pulling out anchors, chapter names, and it’s purely machine processible. Either HTML or JSON, if it comes in the package with the audiobook, it’s a huge improvement.
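
To illustrate the "pulling out anchors, chapter names" idea, here is a rough sketch that reduces an HTML ToC to (target, label) pairs using only the Python standard library. It assumes the ToC element is marked with `role="doc-toc"`; the class and function names are hypothetical, and this is not the extraction algorithm from the WP draft, which has more rules (nesting, headings, error handling).

```python
from html.parser import HTMLParser


class TocExtractor(HTMLParser):
    """Collect (href, label) pairs from anchors inside the ToC element."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._toc_tag = None   # tag name of the element carrying role="doc-toc"
        self._depth = 0        # nesting depth of that tag; 0 means outside the ToC
        self._href = None      # href of the anchor currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth == 0:
            if attrs.get("role") == "doc-toc":
                self._toc_tag, self._depth = tag, 1
            return
        if tag == self._toc_tag:
            self._depth += 1
        if tag == "a" and "href" in attrs:
            self._href, self._text = attrs["href"], []

    def handle_endtag(self, tag):
        if self._depth == 0:
            return
        if tag == "a" and self._href is not None:
            self.entries.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == self._toc_tag:
            self._depth -= 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)


def extract_toc(html):
    parser = TocExtractor()
    parser.feed(html)
    parser.close()
    return parser.entries
```

The result is a flat list that could be re-serialized as JSON for clients that do not want to handle HTML themselves.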

Proposed resolution: For AudioBooks ToC may be provided as serialized in either HTML or JSON, if both are present, only the HTML will be processed by the Reading System. (Garth Conboy)

Garth Conboy: It’s a big improvement over just a collection of files with an external TOC. With either of these, I think we’d be in pretty good shape.
… It is designed to be machine-processable TOC, not necessarily a visual representation

Tzviya Siegman: In summary, our spec has very strict rules with HTML.

Garth Conboy: The spreadsheet is the worst of all possible worlds, but it does exist with some large publishers.

Brady Duga: We have experience using machine readable TOCs. We don’t have a huge issue with publishers going to town with the HTML. We convert it to something else. It can be converted to JSON and sent to clients.
… I wanted to add: I would rather have an ascii text based format than 2 formats. I don’t want to have JSON and HTML. I just want one. I prefer the HTML.

Tzviya Siegman: Geoff, we very much want to include the existing toolchain. I want to make it clear what our intent is with HTML. It’s a restriction on what is allowed in HTML — and it is very limited. I’m hoping we can come around to some agreement.
… We want this to work for the audiobook retailers that exist today. To comment on Nick’s proposal for rendered vs machine readable. That existed in EPUB2 — but we fought to get rid of it, so we aren’t going back. Nav was created to address that…

Proposed resolution: For AudioBooks ToC may be provided as serialized in either HTML or JSON, if both are present, only the HTML will be processed by the Reading System. (Garth Conboy)

Tzviya Siegman: Let’s just review what Garth proposed, that way we can continue to review that and come to resolution on this.

Avneesh Singh: All the good things about HTML have already been said. The second part: B2B and B2C are two use cases. Are we targeting B2B? It may shape the specification in a different way. Our focus seems to be B2C.

Geoff Jukes: We deal with studios sending us raw audio, publishers sending us produced audio, and a consumer website. We send quite a bit of audio to other publishers as well — Kobo, etc. When I’m looking at this, I’m trying to find one system to rule them all…
… The end-user scenario is different, but the more that we can bake in, as early as possible, makes sense to me. If we can get publishers/partners to produce things early… We lose data as part of the process, because publishers do not send us standard data…
… publishers don’t send things with the same file naming conventions, etc. The more data we can standardize at the beginning, the better. I’m still new and trying to get caught up.
… As I can see it right now, an HTML TOC provides some options that we would use for our enhanced audiobooks, and it would be a natural lead-in. HTML does not seem like a good way to include the extended information we’d like to get from publishers, like hashes or filenames…
… that we can confirm the data we have is in the right order, etc. I’m a fan of JSON because it’s strictly structured, there are strict definitions for resources…

Ivan Herman: I would propose, at this point, Geoff: on the one hand it’s extremely important to have you on board with what we do. On the other hand, it’s clear that there are lots of things that need to be caught up on. Some of the remarks are solvable with what we have now…
… I would be happy if we had a separate call — you, Wendy, I, Matt, etc. To look at some of the technical details to see where your problems are and if what we’re proposing is answering your concerns. In a sense help you to catch up on 1 1/2 years of WG work.

Avneesh Singh: +1 to Ivan

Ivan Herman: I think it would be important to have it done. I don’t want to leave you behind or make a resolution while this is still missing.

Tzviya Siegman: Wendy has been working with the audiobook publishers to try to get a meeting, but it’s difficult to get on the agenda. I like Ivan’s suggestion. A year or so ago we had a meeting with newer members. Not sure how many new members we have, but maybe it’s time.

Geoff Jukes: +1

Geoff Jukes: me alone :)

Laurent Le Meur: +1

Laurent Le Meur: oh, -1

Tzviya Siegman: Maybe we’ll talk about that at the chair’s meeting…

Tzviya Siegman: I agree with Ivan, that we should not have a resolution, as we have incomplete information.

2. Mixed media in audiobook linear reading order (Issue 322)

Wendy Reid: Some of this is complicated by the TOC situation. We have 6 open issues that are tagged as audio issues.

Wendy Reid: There was a discussion about the reading order - there is no restriction as to what can be in the reading order. We were wondering if in audiobooks, since it has a use case of being mainly audio, but sometimes it contains supplemental info (pictures, charts, etc)…
… would that be considered a resource or part of the reading order? It’s innovative/wishful thinking. Based on the discussion, the likely best resolution is that it should be considered not part of the reading order

Ivan Herman: I don’t fully understand as I was not part of the discussion. Why do we need to specify? Why isn’t it something that the author can supply/note?

Wendy Reid: It’s a decision we have to make. I’m pro having it in the reading order, but the issue could come down to the reading order. The User-agent could end up not showing it. That is another option we have to consider.
… It might be easier to say: “these are all resources”

Benjamin Young: I don’t like sub-typing. I prefer to keep things consistent — so readers with varying skills can play and handle only what they know. If you have a screen, it can show you what you want, otherwise not. As a spec, we already accommodate it

Laurent Le Meur: I would like to have authors coming back to developers — they say: “your system doesn’t work as it doesn’t show HTML when playing audiobooks” We have to set expectations. There is no expectation that a classic audiobook will render HTML.
… I understand that some special reading agents would be able to read something else. A standard one would not. We should set the expectation that it SHOULD be audio files, and it’s expected that a reading system can handle that, and everything else should be in resources. We can let the JSON schema be open to any resource…
… but expectations should be clearly set.

Marisa DeMeglio: I don’t have a strong opinion if audiobooks should include other resources, but is this issue about having no playback model for non audio files?

Wendy Reid: There is no model, at least not for now…

Marisa DeMeglio: A resource in the middle of playing audio — you don’t know how long you should keep it up on the field…

Garth Conboy: There is a class of audiobooks today that come with a PDF or a bunch of PDFs as ancillary material. There’s no expectation for them to render along with the content, or that a reading system is to show them, but the content is to be bundled…
… so a user’s action is to look at the supplemental content. We need to have the ability to have those types of audiobooks to be produced, but we should allow supplementary material

Ivan Herman: I’m trying to combine those issues with others that are around — there may be other connections. If we say that an audiobook is defined that everything in the reading order must be an audio file. Which is fine — it’s a sensible definition for a well-existing market…
… if we do that, nevertheless, there is a need for books that have a mixture of the various media for a bunch of other reasons. They are web publications, but they are not audiobooks. Audio still has items open about items in the manifest - readBy, etc.. All those properties should not be in the audiobook profile…
… they remain general properties that are relevant to audio. If we are strictly audiobooks, then everything related to describing the audio items needs to be generally available, because it could describe items generally available.

Tzviya Siegman: Based on Garth and some email descriptions and from HC — I’m a bit confused about supplemental materials. If the method is a URL that the publisher provides, how does the user see it? Is it HTML? I don’t know what Google does, but I know it’s a huge problem.

Wendy Reid: Publishers are providing a PDF file — I’ve yet to see anything but PDFs. Right now, if a customer asks for it, we can send it, but we don’t have a good way to deliver it. It’s a very awkward system if you’re using a reading system or an app. There’s no way to surface a URL.

Tzviya Siegman: If we have a method of conveying HTML to a user — we have a solution for ToC

Garth Conboy: I agree with what Wendy said, on our initial implementation — a user contacted us and we sent the PDF. Right now our listening system surfaces in the player “open supplemental” and we deliver the PDF behind the scenes so the client can click to open…
… it works very well. What we need to do, capability wise, is allow the supplemental PDF to tag along with the content. If the reading system has a way to deal with it… Publishers need to be able to package the material and it’s up to
… the reading system to implement

Nick Ruffilo: I agree with a lot of what’s been said. I think if we go with a web publication with audio and other stuff interspersed, versus just audio… If we give the user the option to download the supplements and letting them know… I’m failing to see why that would be a bad thing.

Tzviya Siegman: It sounds like with the existing workflows — this isn’t something that exists today. Maybe we should hold off on spec’ing out. It’s a problem, but maybe it’s something we defer until we understand the scope of the issue.

Wendy Reid: To clarify, this issue is about — every discussion the audiobook group talks about, it should include supplemental content. Knowing that not all reading systems will render it. So the question is if the supplemental content should be in the TOC or resource only

Garth Conboy: +1 Wendy

Wendy Reid: so if we use the type specifically, the audiobook TYPE would only see audio files in the TOC.

Benjamin Young: I’m not sure what the distinction is. Why split the hair? You have the ability to skip a file that you didn’t understand. If you stick in a resource that the reader doesn’t understand, it either fails, or sees what is next and keeps going, possibly with a mention that it had to skip.
… RSS and ATOM feeds that distribute podcasts have a very similar data model. Those can mix and do often mix. You’ll have a text blog, media blog, etc. The podcast consumer will only process the audio ones. A richer RSS reader will process all…
… We shouldn’t limit the resources, but include resources and information that gives implementors what to do with the files, but we should try not to limit ourselves and paint ourselves into a corner.
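
The "skip what you don't understand" behaviour described here can be sketched as below. This is a hedged illustration, not anything agreed on the call: the item keys `url` and `encodingFormat` are assumed to follow the manifest's linked-resource shape, and the function name `split_reading_order` and the default audio-only check are hypothetical.

```python
def split_reading_order(reading_order,
                        supports=lambda fmt: fmt.startswith("audio/")):
    """Walk the reading order, keeping supported items and noting skipped ones.

    `supports` decides per media type; the default models an audio-only
    player, while a richer reader could pass a more permissive check.
    """
    played, skipped = [], []
    for item in reading_order:
        fmt = item.get("encodingFormat", "")
        if supports(fmt):
            played.append(item["url"])
        else:
            # Keep going, but remember what was skipped so the player
            # can mention it to the user.
            skipped.append(item["url"])
    return played, skipped
```

An audio-only player and a richer reader can then consume the same reading order and differ only in the `supports` check, much like the podcast-feed comparison above.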

Garth Conboy: I agree with what Wendy said before — we have to allow for supplemental materials. If they are in the reading order or other resources. I lean towards other resources, but it’s just an opinion. To harken back to early EPUB discussion, it can always just skip resources.

Avneesh Singh: This has some TOC dependency. Reading order and TOC are different items; they may or may not overlap. It’s possible the default reading order keeps just audio files and the HTML TOC has all

George Kerscher: I get that novels are traditionally what is done today, but we should open to a wider range of materials. Magazines — yes there’s a reading order, but more likely I want to go to article 6 etc. A book of poetry — human narration of poetry is much better than machine read poetry…
… I don’t want to read poetry book cover to cover, but to navigate to the poem, bookmark, etc. If we ignore this market of different types of publishing, then it gets relegated to someone figuring out how to put together a bunch of podcasts.

Mateus Teixeira: +1 George… I’ve even seen requests for audio-textbooks

George Kerscher: so we want to come up with something that does all these types of publications.

Nick Ruffilo: +1 to mateus with the audio-textbooks

Wendy Reid: we’ll talk about this next time we have audio issues. Tzviya, anything else?

Tzviya Siegman: Hopefully we’ll do implementations of use cases in the document — so bring your implementation hats. If you can take a look at the use cases in the use case document and think about that for next week
… and a discussion with new members, we’ll set that up shortly

3. Resolutions