Publishing Working Group Telco — Minutes
Date: 2019-10-07
See also the Agenda and the IRC Log
Attendees
Present: Ivan Herman, Dave Cramer, Geoff Jukes, Laurent Le Meur, Juan Corona, Gregorio Pellegrino, Wendy Reid, Romain Deltour, George Kerscher, Matt Garrish, Avneesh Singh, Ric Wright, Brady Duga, Ben Schroeter, Marisa DeMeglio, Franco Alvarado, Bill Kasdorf, Nellie McKesson, Charles LaPierre, Benjamin Young, David Stroup
Regrets: Luc Audrain, Joshua Pyle, Mateus Teixeira, Garth Conboy
Guests:
Chair: Wendy Reid
Scribe(s): Dave Cramer, Wendy Reid
Content:
Wendy Reid: let’s get started
Wendy Reid: See Last week’s minutes
Wendy Reid: let’s approve minutes from last week
… minutes approved
Resolution #1: last week’s minutes approved
1. Directionality
Wendy Reid: https://github.com/w3c/pub-manifest/issues/75
Wendy Reid: first issue is about directionality
Ivan Herman: See Pull request for directionality
Ivan Herman: I have already reported that this whole direction issue now has a clear solution
… I’ve made a PR
… you can do direction much like you can do language
… with the same syntax, either globally or for a given member
… so the PR fixes all the bits
… there is one issue which I put into the document
… having direction in the system means that we have a dependency on JSON LD 1.1
… which is running in parallel with our spec; they plan CR at the same time, they have implementations.
… so they may be ahead of us
… but if you want to run a manifest through a JSON-LD processor that only supports 1.0 you will get errors
… so I created a separate context file that doesn’t have direction
… and made an appendix in the doc
… so you can use this context if you don’t need direction, and need to use a JSON-LD 1.0 processor
… so the question to the group is that it’s OK to keep this as an appendix in the document
… or put it somewhere else, or not do it at all and depend on JSON-LD 1.1
… so I want help from the group in deciding what to do
Wendy Reid: any questions?
Brady Duga: I don’t have strong feelings, but it does feel strange to put in the spec itself
… most things that process this won’t be JSON-LD processors
… but I’m not against it
… I think it should be in a read me or impl guide
Matt Garrish: I’m concerned about how this might be interpreted
… changing our context which is required seems weird
… it could add confusion to have an alternate context
Ivan Herman: I put it there without being convinced it’s a good idea
… I’m fine removing it
… so where would this information go?
… we could give it to the testing people and have this added to information about testing
Matt Garrish: I would put in our wiki for implementation details, so it’s discoverable
Wendy Reid: a wiki is good
Dave Cramer: This seems like it would be a helpful note in the spec itself, if this language property is used, mention that JSON-LD 1.1 is required
… it’s a good use of non-normative text
Ivan Herman: any other comments?
… one comment… for those of you who have been looking at this
… if you look at this example
… it is a good example there for real cases where base direction is necessary
… some people were wondering if this stuff was necessary
… We found a book with a title in arabic and english
Ivan Herman: so it’s good to have a real-world example
Matt Garrish: I think it’s useful to have direction; we have it in EPUB.
Ivan Herman: is there a consensus to merge?
Matt Garrish: +1
Wendy Reid: +1
Brady Duga: +1
Dave Cramer: +1
Gregorio Pellegrino: +1
Ivan Herman: we should also pledge to never talk about direction again :)
Resolution #2: ivan to merge direction PR
2. wrap-up items for CR
Wendy Reid: the rest of the meeting will be wrap-up items for CR
Wendy Reid: https://github.com/w3c/pub-manifest/issues/
Wendy Reid: we have three remaining issues
2.1. issue 57, metadata in the publishing world
Wendy Reid: https://github.com/w3c/pub-manifest/issues/57
Wendy Reid: dave raised an issue about how metadata works in the publishing world
Dave Cramer: Everybody knows I worry about a lot of things, our experience with EPUB has been spent in metadata rabbit holes, new vocabularies
… everyone has a property that is important to them, we spend a lot of effort
… metadata is not always exposed to the reader, and it travels separately from the EPUB itself
Charles LaPierre: except VitalSource is starting to expose the Accessibility Metadata
Dave Cramer: I raised this so we could be thoughtful about the metadata we expose
Ben Schroeter: to add to that
… we do supply a11y metadata in the epub that is used by distributors
Charles LaPierre: I’m on the a11y metadata thing
… vitalsource is exposing EPUB a11y metadata to users
Wendy Reid: that metadata is in ONIX, too?
Charles LaPierre: some of it; it’s not a 1:1 mapping; there’s more in ONIX 3
… but it’s not in ONIX 2.1, which is still widely used in US
Brady Duga: what Dave said is true for publisher-supplied ebooks, but not so much from user supplied epub
… but the metadata that matters is mostly author and title
Ivan Herman: I’ve said several times, if this manifest exercise becomes successful, it may not be in the worlds where EPUB is already successful
… we should not be bound by EPUB or ONIX
Laurent Le Meur: in the thorium reader we try to present metadata in the OPDS feed or in EPUB
… but there is no consistent set of user-oriented metadata
… but we would like to get it right… publisher, language, category, subject, narrator…
… all would be useful
Matt Garrish: there’s room to do metadata standardization outside of the standard itself
… rather than putting every metadata scheme in the spec itself, leave it to the communities
… there should be some core stuff
… it’s probably more efficient to do things outside
Bill Kasdorf: will there be a generic way to incorporate community-specific metadata?
… so scholarly publishers can include what they think is essential?
Matt Garrish: that’s exactly how it would work and how it is set up right now
… we’re a proxy for schema.org; we can use anything there without having it directly in our spec
… and our context files include more prefixes
… we are very flexible
Avneesh Singh: +1 Matt
Matt Garrish: there should be a clear purpose to list metadata in the core spec
Bill Kasdorf: I am anti-bloat
Gregorio Pellegrino: I understand what Matt says but some metadata is essential, like description
… we should suggest to use some metadata, because otherwise reading systems won’t implement
Laurent Le Meur: I agree that in the core spec maybe we don’t need an extensive set
… communities can define their own community
… who is the community? Is the audiobook community literally this working group?
… but we need something defined somewhere
Ivan Herman: one of the reasons we took JSON-LD is because schema.org used it
… but JSON-LD is ideally suited for this… you can just add things and it’s OK.
… laurent is right; for different areas there should be communities defining metadatas
… I don’t know if there is additional metadata required by audiobooks, if so let’s add it
… it depends… a CG might be able to define some of these things
… the main goal is to provide a framework
Dave Cramer: I’m not opposed to metadata, we seem to think that embedding metadata is always good but past experience shows this data is rarely used, I’m aware of few reading systems that use title and author but we’ve made many EPUBs using copyright statements
Avneesh Singh: let this spec go to CR as is
Ivan Herman: See example to link external metadata (ONIX in this case)
Avneesh Singh: and we have the audiobooks spec; we need to know what audio publishers want
Ivan Herman: +1 to Avneesh
Avneesh Singh: and we can do a note or registry with metadata
Gregorio Pellegrino: I agree with avneesh
… the possibility to define the role of contributor in schema.org are very poor
… we need ways to add that
Laurent Le Meur: adding metadata can be done step-by-step
… we need a group that can host these needs
… which group is it? This WG working on audiobooks?
… then we can wait for needs from publishers of audiobooks
Matt Garrish: to what gregorio said, we can request that schema.org add stuff that’s missing
Ivan Herman: +1 to matt
Gregorio Pellegrino: +1 to matt
Matt Garrish: it never ends well when we add metadata to our own standards
Ivan Herman: a partial answer to laurent
… there are two issues. one is, who are the groups that develop metadata? I don’t think there is one answer.
… two: how do you find the metadata that has already been developed?
… we may need a registry
Bill Kasdorf: in some sectors of publishing there are organizations that govern metadata
… IPTC, JATS, etc
… as we reach out to other sectors, we will find there are already metadata standards
Wendy Reid: this is not a question for us to solve today
… we have sufficient metadata in our specs for now. We’ll see how it goes in CR.
… so let’s move on
Ivan Herman: is it OK if we close this issue?
Dave Cramer: refresh your github
Gregorio Pellegrino: if we close the issue, how can we say we are thinking about this?
Gregorio Pellegrino: Fine
Ivan Herman: we could say it’s deferred
Gregorio Pellegrino: Defer
2.2. WebIDL yea or nay?
Wendy Reid: https://github.com/w3c/pub-manifest/issues/98
Romain Deltour: this is mostly an editorial issue?
… after talking to marcos at TPAC
… for web app manifest spec they are going to stop using WebIDL
… they think it’s a poor fit for data structures
… there’s an issue in their github
… so I wondered if we should align with what they are doing
… and there’s also a TAG issue about the proliferation of manifest formats
… so I think it’s good we use the same kind of specification process
Ivan Herman: for those who did not read the thread
… there are two issues
… one, the community is coming up with an agreement to use similar language when describing processing for these things
… matt has looked at it, and we could adopt this language
… the other problem is that those approaches do not provide one place where you can look at the whole data structure
… and WebIDL does
… nobody really likes webidl, neither do I
… but I don’t have an alternative
… if an alternative comes up, then we can use a new things
… but I think we should go ahead with CR
Romain Deltour: a couple things
… since we first discussed things, the specification landscape evolved
… we now know how to parse JSON into infra parts
… and that’s what the app manifest folks will use
… the question to matt, our use of webidl
… we don’t rely on it in the same way, I’m told, and I don’t understand
Matt Garrish: what I’m getting at is that it’s integrated–they define via their dictionary
… we don’t do that–our properties are not defined via webidl dictionaries
… and our processing is separate
… so we’re independent, it’s just a way of visualizing our data structure
… what they’re going to do in WAM is to take the interleaving of the webidl, and depend less on that and more on infra
Ivan Herman: if I remember well, use of WebIDL is not only for data structure but for functions
Dave Cramer: I’m a little puzzled by Ivan saying that there is no alternative for WebIDL but that is only as a way to visualize the data structure? I am confused that Matt says we’re defining it in the spec but we depend on it as expressing vocabulary?
Ivan Herman: it would be equivalent to using Typescript interface
Dave Cramer: And where do things like JSON schemas fit into this question?
… I don’t know what other kind of formalism would I use
Ivan Herman: json schemas are orthogonal to this
Romain Deltour: one of the similarities we have with app manifest
… the result of processing the manifest is what we define with webidl
… so this algo defines a string is converted into webidl
… I’m not an expert in webidl, and I’ve only read their issue a couple of times
… the issues are in defining the details of the conversion logic
… and then we have to define the conversion of json to web idl
Matt Garrish: I wouldn’t say that what we’re processing is intended to be webidl
… it was canonical json
… we’ll probably end up with infra types
… which is problematic
… and the fact that it gives the impression we have a web api
… I don’t know what else there is that easily describes the objects that are going to be created
… possibly the solution is that we don’t define the webidl
… and web devs figure it out
Romain Deltour: it does seem that it’s mostly an editorial issue? It doesn’t affect normative language?
… we can deal after CR?
Wendy Reid: can we call this postponed?
Ivan Herman: I would love to have a replacement for WEbIDL
Romain Deltour: in some ways it’s editorial, on the other hand it’s a big thing to change how things are described
… it’s a profound editorial change; not lightweight
… the sooner we know the better
Ivan Herman: yes, it is a profound editorial change
… I don’t see what we are changing for
… and I don’t see alternatives
… I’ve looked at many things
… using typescript is not a good idea
… it’s weird that there’s no standard formalization for data structures
Romain Deltour: one thing worth clarifying before CR, is to clarify what is the type of the result of processing algo
… we say we convert to an internal representation, but we don’t say what the internal representation is
… what does the algo produce? a json object?
… we should clarify that?
Matt Garrish: that’s part of what I hope to clean up with the infra types
… the normalization is more complicated than I thought
… our outcome will be an infra map
… if it’s purely about visualizing the data, maybe it doesn’t belong in the spec, maybe it should be in a wiki
… so we have the processing steps and then a separate numbering
Wendy Reid: could we see a PR with the infra in it?
Matt Garrish: it’s coming
… I hope it’s done later today
Ivan Herman: I’ll have to merge the other PR
3. CR timing
Wendy Reid: we’re out of time
… goal is CR by end of month
… so we need horizontal review
… I’ve sent messages to everyone
… next monday, the 14th, is the final day to accept issues
… speak now or forever hold your peace
… that’s all
… bye
4. Resolutions
- Resolution #1: last week’s minutes approved
- Resolution #2: ivan to merge direction PR