Publishing Working Group Telco — Minutes

Date: 2019-10-07

See also the Agenda and the IRC Log


Present: Ivan Herman, Dave Cramer, Geoff Jukes, Laurent Le Meur, Juan Corona, Gregorio Pellegrino, Wendy Reid, Romain Deltour, George Kerscher, Matt Garrish, Avneesh Singh, Ric Wright, Brady Duga, Ben Schroeter, Marisa DeMeglio, Franco Alvarado, Bill Kasdorf, Nellie McKesson, Charles LaPierre, Benjamin Young, David Stroup

Regrets: Luc Audrain, Joshua Pyle, Mateus Teixeira, Garth Conboy


Chair: Wendy Reid

Scribe(s): Dave Cramer, Wendy Reid


Wendy Reid: let’s get started

Wendy Reid: See Last week’s minutes

Wendy Reid: let’s approve minutes from last week
… minutes approved

Resolution #1: last week’s minutes approved

1. Directionality

Wendy Reid:

Wendy Reid: first issue is about directionality

Ivan Herman: See Pull request for directionality

Ivan Herman: I have already reported that this whole direction issue now has a clear solution
… I’ve made a PR
… you can do direction much like you can do language
… with the same syntax, either globally or for a given member
… so the PR fixes all the bits
… there is one issue which I put into the document
… having direction in the system means that we have a dependency on JSON LD 1.1
… which is running in parallel with our spec; they plan CR at the same time, they have implementations.
… so they may be ahead of us
… but if you want to run a manifest through a JSON-LD processor that only supports 1.0 you will get errors
… so I created a separate context file that doesn’t have direction
… and made an appendix in the doc
… so you can use this context if you don’t need direction, and need to use a JSON-LD 1.0 processor
… so the question to the group is that it’s OK to keep this as an appendix in the document
… or put it somewhere else, or not do it at all and depend on JSON-LD 1.1
… so I want help from the group in deciding what to do

Wendy Reid: any questions?

Brady Duga: I don’t have strong feelings, but it does feel strange to put in the spec itself
… most things that process this won’t be JSON-LD processors
… but I’m not against it
… I think it should be in a read me or impl guide

Matt Garrish: I’m concerned about how this might be interpreted
… changing our context which is required seems weird
… it could add confusion to have an alternate context

Ivan Herman: I put it there without being convinced it’s a good idea
… I’m fine removing it
… so where would this information go?
… we could give it to the testing people and have this added to information about testing

Matt Garrish: I would put in our wiki for implementation details, so it’s discoverable

Wendy Reid: a wiki is good

Dave Cramer: This seems like it would be a helpful note in the spec itself, if this language property is used, mention that JSON-LD 1.1 is required
… it’s a good use of non-normative text

Ivan Herman: any other comments?
… one comment… for those of you who have been looking at this
… if you look at this example
… it is a good example there for real cases where base direction is necessary
… some people were wondering if this stuff was necessary
… We found a book with a title in arabic and english

Ivan Herman: so it’s good to have a real-world example

Matt Garrish: I think it’s useful to have direction; we have it in EPUB.

Ivan Herman: is there a consensus to merge?

Matt Garrish: +1

Wendy Reid: +1

Brady Duga: +1

Dave Cramer: +1

Gregorio Pellegrino: +1

Ivan Herman: we should also pledge to never talk about direction again :)

Resolution #2: ivan to merge direction PR

2. wrap-up items for CR

Wendy Reid: the rest of the meeting will be wrap-up items for CR

Wendy Reid:

Wendy Reid: we have three remaining issues

2.1. issue 57, metadata in the publishing world

Wendy Reid:

Wendy Reid: dave raised an issue about how metadata works in the publishing world

Dave Cramer: Everybody knows I worry about a lot of things, our experience with EPUB has been spent in metadata rabbit holes, new vocabularies
… everyone has a property that is important to them, we spend a lot of effort
… metadata is not always exposed to the reader, and it travels separately from the EPUB itself

Charles LaPierre: except VitalSource is starting to expose the Accessibility Metadata

Dave Cramer: I raised this so we could be thoughtful about the metadata we expose

Ben Schroeter: to add to that
… we do supply a11y metadata in the epub that is used by distributors

Charles LaPierre: I’m on the a11y metadata thing
… vitalsource is exposing EPUB a11y metadata to users

Wendy Reid: that metadata is in ONIX, too?

Charles LaPierre: some of it; it’s not a 1:1 mapping; there’s more in ONIX 3
… but it’s not in ONIX 2.1, which is still widely used in US

Brady Duga: what Dave said is true for publisher-supplied ebooks, but not so much from user supplied epub
… but the metadata that matters is mostly author and title

Ivan Herman: I’ve said several times, if this manifest exercise becomes successful, it may not be in the worlds where EPUB is already successful
… we should not be bound by EPUB or ONIX

Laurent Le Meur: in the thorium reader we try to present metadata in the OPDS feed or in EPUB
… but there is no consistent set of user-oriented metadata
… but we would like to get it right… publisher, language, category, subject, narrator…
… all would be useful

Matt Garrish: there’s room to do metadata standardization outside of the standard itself
… rather than putting every metadata scheme in the spec itself, leave it to the communities
… there should be some core stuff
… it’s probably more efficient to do things outside

Bill Kasdorf: will there be a generic way to incorporate community-specific metadata?
… so scholarly publishers can include what they think is essential?

Matt Garrish: that’s exactly how it would work and how it is set up right now
… we’re a proxy for; we can use anything there without having it directly in our spec
… and our context files include more prefixes
… we are very flexible

Avneesh Singh: +1 Matt

Matt Garrish: there should be a clear purpose to list metadata in the core spec

Bill Kasdorf: I am anti-bloat

Gregorio Pellegrino: I understand what Matt says but some metadata is essential, like description
… we should suggest to use some metadata, because otherwise reading systems won’t implement

Laurent Le Meur: I agree that in the core spec maybe we don’t need an extensive set
… communities can define their own community
… who is the community? Is the audiobook community literally this working group?
… but we need something defined somewhere

Ivan Herman: one of the reasons we took JSON-LD is because used it
… but JSON-LD is ideally suited for this… you can just add things and it’s OK.
… laurent is right; for different areas there should be communities defining metadatas
… I don’t know if there is additional metadata required by audiobooks, if so let’s add it
… it depends… a CG might be able to define some of these things
… the main goal is to provide a framework

Dave Cramer: I’m not opposed to metadata, we seem to think that embedding metadata is always good but past experience shows this data is rarely used, I’m aware of few reading systems that use title and author but we’ve made many EPUBs using copyright statements

Avneesh Singh: let this spec go to CR as is

Ivan Herman: See example to link external metadata (ONIX in this case)

Avneesh Singh: and we have the audiobooks spec; we need to know what audio publishers want

Ivan Herman: +1 to Avneesh

Avneesh Singh: and we can do a note or registry with metadata

Gregorio Pellegrino: I agree with avneesh
… the possibility to define the role of contributor in are very poor
… we need ways to add that

Laurent Le Meur: adding metadata can be done step-by-step
… we need a group that can host these needs
… which group is it? This WG working on audiobooks?
… then we can wait for needs from publishers of audiobooks

Matt Garrish: to what gregorio said, we can request that add stuff that’s missing

Ivan Herman: +1 to matt

Gregorio Pellegrino: +1 to matt

Matt Garrish: it never ends well when we add metadata to our own standards

Ivan Herman: a partial answer to laurent
… there are two issues. one is, who are the groups that develop metadata? I don’t think there is one answer.
… two: how do you find the metadata that has already been developed?
… we may need a registry

Bill Kasdorf: in some sectors of publishing there are organizations that govern metadata
… IPTC, JATS, etc
… as we reach out to other sectors, we will find there are already metadata standards

Wendy Reid: this is not a question for us to solve today
… we have sufficient metadata in our specs for now. We’ll see how it goes in CR.
… so let’s move on

Ivan Herman: is it OK if we close this issue?

Dave Cramer: refresh your github

Gregorio Pellegrino: if we close the issue, how can we say we are thinking about this?

Gregorio Pellegrino: Fine

Ivan Herman: we could say it’s deferred

Gregorio Pellegrino: Defer

2.2. WebIDL yea or nay?

Wendy Reid:

Romain Deltour: this is mostly an editorial issue?
… after talking to marcos at TPAC
… for web app manifest spec they are going to stop using WebIDL
… they think it’s a poor fit for data structures
… there’s an issue in their github
… so I wondered if we should align with what they are doing
… and there’s also a TAG issue about the proliferation of manifest formats
… so I think it’s good we use the same kind of specification process

Ivan Herman: for those who did not read the thread
… there are two issues
… one, the community is coming up with an agreement to use similar language when describing processing for these things
… matt has looked at it, and we could adopt this language
… the other problem is that those approaches do not provide one place where you can look at the whole data structure
… and WebIDL does
… nobody really likes webidl, neither do I
… but I don’t have an alternative
… if an alternative comes up, then we can use a new things
… but I think we should go ahead with CR

Romain Deltour: a couple things
… since we first discussed things, the specification landscape evolved
… we now know how to parse JSON into infra parts
… and that’s what the app manifest folks will use
… the question to matt, our use of webidl
… we don’t rely on it in the same way, I’m told, and I don’t understand

Matt Garrish: what I’m getting at is that it’s integrated–they define via their dictionary
… we don’t do that–our properties are not defined via webidl dictionaries
… and our processing is separate
… so we’re independent, it’s just a way of visualizing our data structure
… what they’re going to do in WAM is to take the interleaving of the webidl, and depend less on that and more on infra

Ivan Herman: if I remember well, use of WebIDL is not only for data structure but for functions

Dave Cramer: I’m a little puzzled by Ivan saying that there is no alternative for WebIDL but that is only as a way to visualize the data structure? I am confused that Matt says we’re defining it in the spec but we depend on it as expressing vocabulary?

Ivan Herman: it would be equivalent to using Typescript interface

Dave Cramer: And where do things like JSON schemas fit into this question?
… I don’t know what other kind of formalism would I use

Ivan Herman: json schemas are orthogonal to this

Romain Deltour: one of the similarities we have with app manifest
… the result of processing the manifest is what we define with webidl
… so this algo defines a string is converted into webidl
… I’m not an expert in webidl, and I’ve only read their issue a couple of times
… the issues are in defining the details of the conversion logic
… and then we have to define the conversion of json to web idl

Matt Garrish: I wouldn’t say that what we’re processing is intended to be webidl
… it was canonical json
… we’ll probably end up with infra types
… which is problematic
… and the fact that it gives the impression we have a web api
… I don’t know what else there is that easily describes the objects that are going to be created
… possibly the solution is that we don’t define the webidl
… and web devs figure it out

Romain Deltour: it does seem that it’s mostly an editorial issue? It doesn’t affect normative language?
… we can deal after CR?

Wendy Reid: can we call this postponed?

Ivan Herman: I would love to have a replacement for WEbIDL

Romain Deltour: in some ways it’s editorial, on the other hand it’s a big thing to change how things are described
… it’s a profound editorial change; not lightweight
… the sooner we know the better

Ivan Herman: yes, it is a profound editorial change
… I don’t see what we are changing for
… and I don’t see alternatives
… I’ve looked at many things
… using typescript is not a good idea
… it’s weird that there’s no standard formalization for data structures

Romain Deltour: one thing worth clarifying before CR, is to clarify what is the type of the result of processing algo
… we say we convert to an internal representation, but we don’t say what the internal representation is
… what does the algo produce? a json object?
… we should clarify that?

Matt Garrish: that’s part of what I hope to clean up with the infra types
… the normalization is more complicated than I thought
… our outcome will be an infra map
… if it’s purely about visualizing the data, maybe it doesn’t belong in the spec, maybe it should be in a wiki
… so we have the processing steps and then a separate numbering

Wendy Reid: could we see a PR with the infra in it?

Matt Garrish: it’s coming
… I hope it’s done later today

Ivan Herman: I’ll have to merge the other PR

3. CR timing

Wendy Reid: we’re out of time
… goal is CR by end of month
… so we need horizontal review
… I’ve sent messages to everyone
… next monday, the 14th, is the final day to accept issues
… speak now or forever hold your peace
… that’s all
… bye

4. Resolutions