Publishing Working Group Telco — Minutes

Date: 2019-06-03

Attendees

Present: Tzviya Siegman, Ivan Herman, Nick Ruffilo, Wendy Reid, Laurent Le Meur, Deborah Kaplan, Simon Collinson, Jun Gamou, Charles LaPierre, Teenya Franklin, Nellie McKesson, Matt Garrish, Bill Kasdorf, Mateus Teixeira, Luc Audrain, Benjamin Young, Franco Alvarado, Gregorio Pellegrino, Marisa DeMeglio, Garth Conboy, Ric Wright, Brady Duga, George Kerscher, David Stroup, Rachel Comerford

Regrets: Avneesh Singh, Tim Cole, Geoff Jukes, Dave Cramer

Guests:

Chair: Tzviya Siegman

Scribe(s): Nick Ruffilo

Laurent Le Meur: I’m there

Tzviya Siegman: https://www.w3.org/publishing/groups/publ-wg/Meetings/Minutes/2019/2019-05-20-pwg

Tzviya Siegman: Any comments on the minutes? … Minutes approved

Resolution #1: last telco’s minutes approved

1. open action items

Tzviya Siegman: https://github.com/w3c/publ-wg/labels/action

Tzviya Siegman: Next agenda item is to go through the list of open action items. Ivan put together a good list. We don’t need to solve them, just a quick comment on the people assigned
… #48 - Benjamin - the publication address ending in / - comments? (Benjamin is not here)
… #47 Wendy - you added a comment about audio object. You have this in progress. Anything to add?
… #46 - Franco - working on a gap analysis. (Franco is not here today)
… #45 - Ivan - set up discussion on the archival committee.

Ivan Herman: I will close this because we had the meeting 2 hours ago. They will probably set up a CG at some point.

Tzviya Siegman: Does this need to be standalone or will you launch as well?

Ivan Herman: Standalone, as there might be issues that are not publication-only, but I will continue and follow up.

Tzviya Siegman: Part of the point of the CG was to launch from within the group, so I’d say it’s still a good candidate, but we can discuss separately
… #44 - Talking to Ping about walking through the questionnaire. Ralph suggested someone in the group walk through and I need to talk to someone
… #43 - is Marisa here? It’s on the agenda for next week, so we’ll have an update then
… #42 - Ivan and Garth - where are you on the bespoke previews

Garth Conboy: I have given it thought… I have not created the pull request but it’s on my todo list.

Tzviya Siegman: Please add a comment on #42 for that.
… #41 - editing the explainer for web-app manifest. I haven’t done that yet but hopefully this week.
… #37 - Wendy talked to Tess about the audio explainer issues

Wendy Reid: I’m still waiting on Tess’s reply to my email.

Tzviya Siegman: #35 - write a proposal to web publication. Jeremy is not here…

Ivan Herman: I had a discussion with him an hour ago. He’s in progress with this.

Tzviya Siegman: #34 - find a home for the sync narration spec

Wendy Reid: We’re going to have the sync media group publish their findings as a note, then have the working group publish that as a note.

Tzviya Siegman: #33 - something for Dave (who is not here) - looks like there’s some work to be done… We’ll leave to next week.
… Please get your work done you awesome people
… This might be good to put on the agenda every other week to review

2. Packaging spec issues

Tzviya Siegman: Laurent - we have the list of packaging specs. There was a list of issues in the agenda; hopefully people have reviewed them

Laurent Le Meur: https://raw.githack.com/w3c/pwpub/master/spec/ocf-lite.html

2.1. ua conformance

Laurent Le Meur: https://github.com/w3c/pwpub/issues/36

Laurent Le Meur: Link to the draft spec in minutes. First item #36 - followup on discussion about conformance requirements for user agents. What should we include inside the packaging specification. What I did was to specify that - simply - a user agent is conformant if it’s capable of processing a conformant package…
… and if it fulfills the criteria specification. And then it links to the W3C specification.
… That way we could rely just on the WP specification. We added a line to ensure that the User Agents conform to the LPF spec.

Laurent Le Meur: https://raw.githack.com/w3c/pwpub/master/spec/ocf-lite.html#sec-ua-conformance

Laurent Le Meur: This is something you find in section 3. (link in chat) If you agree, we can move forward and close this issue

Ivan Herman: +1

Luc Audrain: +1

Nick Ruffilo: +1

Nellie McKesson: +1

Wendy Reid: +1

Ric Wright: +1

Benjamin Young: I was just looking where processing is defined.
… Is it just unzipping - or is it other things
… ‘it is capable of processing a conformant package’ what do you mean by processing

Laurent Le Meur: there is no strict definition - just following the rules of section 2 of the same spec.
… But I don’t see a big semantic difference between handling and processing

Benjamin Young: Is an LPF compliant processor going to turn this stuff into something else, and what is that?

Laurent Le Meur: we decided not to define an LPF processor that turns the package into something else. Most would turn it into a WP, but we won’t exclusively add it. If we only make a reference to the user agent conformance, there would be no mention of the file name (index.html or publication.json)
… we can come back to that later - by making a reference to the conformance of WP, we will have to go through - step by step retrieval of the manifest…

Tzviya Siegman: I hear Benjamin’s concern, but I’m not sure what we can put in there to address it.

Benjamin Young: Zip has a clear intention, but is this meant to be like zip, where it’s just a transfer format, and you explode it, then process it… but if it’s going to a mobile app, there are expectations. I realize the scope here is mixed because it’s tied to web publications.
… I’m just - since these are conformance requirements, this is not a testable phrase.

Luc Audrain: This document defines a file format and processing model. I suppose we should have something to explain the processing model?

Laurent Le Meur: the previous decision was to rely on the processing model.

Luc Audrain: so maybe we add a line pointing to the web publication spec? It will help the concern perhaps?

Tzviya Siegman: We’re then defining two different processing models in one document. In 2.1 - a web publication lightweight package - we’re already referencing the zip spec.

Ivan Herman: There is an LPF processing model - all it is is the zip processing format that is defined elsewhere. The result is processed as defined. In a sense, this does not define its own separate processing model, just the combination
… We might have to be a little more clear about the two different processing models, but that is the intention
… Maybe the user agent conformance is the first bullet item - we can make it more clear that the user agent must follow whatever is described in the zip - must unpack the content, and the content must follow web package conformance.

Laurent Le Meur: this is only one type of processing - unzip, explode, use after that, but there might be other types of processes.

Ivan Herman: I think what we have to be careful about is that we define a conceptual processing engine, but can it be implemented? Conceptually speaking - per spec, the result should be a web publication
… if someone does something different inside, that’s not for us to decide

Laurent Le Meur: I would say that reading or exploiting the zip is possible.

Benjamin Young: I was going to suggest that we use the same phrasing that the zip spec uses when it refers to accessing the contents of files. Laurent is right - unzipping is a specific step, but there are other ways to access files.

Tzviya Siegman: https://raw.githack.com/w3c/pwpub/master/spec/ocf-lite.html#sec-zip already exists in the document

Benjamin Young: but we have to make sure you can go from that knowledge to whatever is in the publication. The discovery methods seem different - so some sort of processing description determining the difference between the zip and the web publication - that would be really helpful

Laurent Le Meur: Benjamin, do you think we should write the processing model in this document?

Benjamin Young: in so much as it differs. In the case where there is a missing primary entry page - that would be different from standard zip

Laurent Le Meur: there are also differences in the discovery of the manifest. In the web publication document - we use the word ‘fetch’ which is an HTTP word, but in zip you don’t do that, it’s a different concept, so there is something to discuss there.
… the action item I propose is to state that we agree on adding wording that will be modified from the original. Some wording that links to the web publication specification
… we will see in another issue the exact issues we have with the web publication processing model and try to solve this there. I propose we table this issue and see the next issue
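
The point raised in this discussion, that a user agent can read a zipped package directly rather than unpacking it first, can be illustrated with a minimal sketch. It assumes the two file names mentioned above (index.html and publication.json) sit at the root of the archive; the function names and the sample package are hypothetical and are not taken from the draft spec.

```python
import io
import zipfile

# Assumed entry names, taken from the file names mentioned in the discussion.
MANIFEST_NAME = "publication.json"
ENTRY_PAGE_NAME = "index.html"


def locate_entry_points(package_bytes: bytes) -> dict:
    """Report which well-known entries a zipped package contains.

    The archive is read in memory; nothing is unpacked to disk.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        names = set(archive.namelist())
    return {
        "manifest": MANIFEST_NAME in names,
        "entry_page": ENTRY_PAGE_NAME in names,
    }


def build_sample_package() -> bytes:
    """Build a tiny illustrative package that has a manifest but no entry page."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(MANIFEST_NAME, '{"name": "Sample"}')
    return buffer.getvalue()
```

Calling `locate_entry_points(build_sample_package())` reports a manifest and no primary entry page, which is the case Benjamin singles out as differing from plain zip handling.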

2.2. better define ua conformance

Laurent Le Meur: https://github.com/w3c/pwpub/issues/46

Laurent Le Meur: issue #46 - I propose we close it to clean the table. I propose to keep 36 open and close 46…

Tzviya Siegman: so the idea is the previous issue will solve this?

Laurent Le Meur: Yes

2.3. metadata specific to PWP

Laurent Le Meur: https://github.com/w3c/pwpub/issues/10

Laurent Le Meur: the next is about descriptive metadata. We have 3 specifications… Issue #10 - what metadata is specific to PWP - we saw that there is no descriptive metadata related to a package, but there is a canonical ID and web publication address.

Ivan Herman: +1

Laurent Le Meur: both are discussed in a different issue. I propose we close this and note that there is no descriptive metadata related to the package - but keep open #47

Tzviya Siegman: +1

Luc Audrain: +1

Nick Ruffilo: … do you agree and can we close?

Benjamin Young: The manifest includes a reference to the canonical ID - are you saying that sufficient?

Laurent Le Meur: we must treat descriptive metadata and canonical ID separately. We’re discussing specifically descriptive metadata. This issue is exclusively about descriptive metadata.

Nick Ruffilo: +1

Wendy Reid: +1

2.4. Packaged publications vs canonical id and WP address

Ivan Herman: https://github.com/w3c/pwpub/issues/47

Laurent Le Meur: #47 - packaged publications vs canonical ID: this is more complex. A WP can have an address, because we know where it’s located. By definition the canonical ID is the preferred WP address (a URL). If an LPF file is born inside the publisher’s house, before it becomes a WP, it has no address
… so the solution is to have no canonical ID. It’s possible in the spec to have no canonical ID. It can have a DOI, or an ISBN - so we had an issue with canonical ID.
… We have a different issue with WP URL…

Ivan Herman: The current text does say that the unique identifier - but I don’t know if Bill is here, but what we are really talking about here is a unique identifier for the publication, which is preferred to be a URL.
… If the unique identifier is different, that should be perfectly accepted and the text in the web publication should be amended. In your case the DOI or URN should be OK. It is used to identify something.
… Yes, the DOI can be tacked to the URL but when you look at the DOI, it’s an identifier. I think we might go the other way from the web packaging work - it’s OK as is (we might want an editorial note that it’s not always a URL)
… and then make the text on a canonical ID a little bit lighter.

Luc Audrain: Can this identifier be an ISBN?

Laurent Le Meur: https://www.w3.org/TR/wpub/#example-49-example-for-setting-both-the-isbn-and-the-address-of-the-same-document

Laurent Le Meur: If you look at the example in #49 - you’ll have your answer.

Ivan Herman: Wait - we may have a problem. The ID is a JSON-LD term, and the value must be a URI. Benjamin, is that correct?
… it’s a question. In JSON-LD - the ID must be a URI?

Benjamin Young: Yes, IRI…
… If you were to use an ISBN, it could be ISBN:[number]
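
The distinction drawn here, between an identifier that is a web address and one that merely has IRI form, can be sketched as below. This is an illustration only: the function name and category labels are hypothetical, the `urn:isbn:` spelling is one assumed way of writing the "ISBN:[number]" form mentioned above, and the ISBN is a placeholder rather than a real book.

```python
from urllib.parse import urlparse


def describe_identifier(value: str) -> str:
    """Classify an identifier string by its scheme.

    Returns "relative" when there is no scheme, "url" for http(s)
    addresses, and "non-url" for any other scheme such as urn.
    """
    scheme = urlparse(value).scheme.lower()
    if not scheme:
        return "relative"
    if scheme in ("http", "https"):
        return "url"
    return "non-url"


# Placeholder values for illustration only.
SAMPLE_ISBN_ID = "urn:isbn:9780000000002"
SAMPLE_WEB_ID = "https://example.org/book/"
```

Under these assumptions `SAMPLE_ISBN_ID` is classified as "non-url" and `SAMPLE_WEB_ID` as "url", while a bare value such as `index.html` has no scheme and is classified as "relative".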

Laurent Le Meur: so it’s solvable. For the WP address?
… they are related because they are sometimes the same.

Ivan Herman: What should we do with #47?

Laurent Le Meur: should we add a note in the spec that any URI can be used as a canonical ID?

Tzviya Siegman: I think we should include some examples.
… if I understood what Benjamin said, the examples we had were not quite right?

Ivan Herman: that’s a schema.org issue. They have a separate term for ISBN. That’s a problem in schema.org

Deborah Kaplan: +1 for including examples that are non-URL URNs

Benjamin Young: The examples aren’t wrong. We show an example of schema.org ISBN - but we don’t state that there is any use for that. If we move it into use it as the ID of the document.
… It just changes - not everything damaging

Garth Conboy: I was not concerned, just wanted to go back to what Benjamin said for the value of ID. Not sure it’s a problem, but it was a good example

Ivan Herman: there is a need to make a small review of that section

Tzviya Siegman: Do we have a resolution for this point?

Laurent Le Meur: There is no need to resolve anything. The issue #47 is also about the WP address. The WP address is another issue because the property is a URL yet the package has no URI

Ivan Herman: Isn’t it issue #45?

Laurent Le Meur: We should open a new issue for the publication address

Ivan Herman: and the canonical issue should be closed - we put some examples in the document and update the main document

Benjamin Young: This isn’t as massive as it sounds but it’s going to come down to the origin one. If we have a canonical ID - whether or not reference into the publication - scrolling, etc should use the canonical ID. It doesn’t have to be dereferenceable - so it can be any identifier.
… so we’re going to have to make some determinations about package referencing. ‘Is that going to be the thing’ or is the web publication address the thing we point to. It’s no longer a packaged web publication, then it would be a packaged publication
… because if it’s offline, we’re just pulling it out of the box. If they are more epub-y, then they are already off the web. The identifiers have a whole set of new issues, and may not have a URL… it’s an identity crisis

Garth Conboy: https://www.w3.org/TR/wpub/#canonical-identifier

Garth Conboy: I didn’t quite understand if we made a decision or moved to another issue, but are we also saying - as it’s relevant to audiobooks - that the canonical ID could be an ISBN or DOI… Did we get to anything?

Ivan Herman: My understanding on the canonical IDs is it can be missing or be any type of valid URI - ISBN or DOI or whatever else

Garth Conboy: Is that a change?

Ivan Herman: it’s a slight change from the text - because the text requires dereferenceable.

Garth Conboy: It also requires the term ID

Ivan Herman: It uses the term ID. We have a schizophrenic issue with ID. We have explicit terms for many of them, but it’s not exhaustive. They don’t have a term for DOI but they do for ISBN…
… people can use the ISBN because it’s schema.org - but ideally they should use the ID in a valid URI form so it’s valid for JSON-LD

Garth Conboy: We explicitly say it’s expressed using the ID property

Tzviya Siegman: if we’re not clear on it, it’s confusing in the document

Benjamin Young: If we - whatever we put in the ID field as the canonical - should/must (probably must) would become the identifier of the publication. Ideally on the hard drive or the server… We made it a URL but accommodated for DOIs - so you could use a DOI so it’s more permanent than just a URL…
… we were trying to have a canonical ID. When it was just web publications, we dereferenced the ID. That ID doesn’t have to be dereferenceable and there might not be a URL in it at all. But we need to think about the systems and what they do with the ID. In JSON-LD, it becomes the identifier of the thing.
… so if you release version 2 of this, you give it a different ID - or it has a different name entirely.
… so there’s a whole host of identification issues that this brings back up

Ivan Herman: I think we have discussed this a long time ago - the difference between an ID and a URL. This is not a packaging issue. You should be able to use a URN as an identifier - even if it’s on the web. You have the address, which is different from the ID. What we have right now - regardless of packaging - is a bit too restrictive
… and that’s why we have 2 different things, the URL and the ID.

Tzviya Siegman: I think what we do need in the document is clearer language. It’s confused enough of us. Not sure how to propose that. Garth, it sounded like you had some suggestions.

Garth Conboy: I had more questions than suggestions. I found that what was in there wasn’t matching this discussion. Matt can probably work magic on it though.

Tzviya Siegman: Well, if that’s how we’re leaving it…

Matt Garrish: I can give it a try but I’m a bit confused - does it have to dereference, should it dereference?

Ivan Herman: we’ll come up with something

Laurent Le Meur: #47 should be closed then?

Ivan Herman: we can put an example there, but nothing really heavy

Ivan Herman: +1

Nick Ruffilo: +1

Luc Audrain: +1

Benjamin Young: it should stay open because it’s related to packages - and it’s a packaging driven requirement

Laurent Le Meur: in this case - I want to move to #49 - the real issue - most packages created by publishers will have no address or URL until they are exposed on the web

2.5. Packages vs WP Address

Ivan Herman: https://github.com/w3c/pwpub/issues/49

Laurent Le Meur: The web specification requires a URL inside the manifest. I would propose that this address must be index.html - a relative URL which is the name of the primary entry page - the URL value. Even if the primary entry page isn’t in the package. The URL will be updated by the processor when it has a proper URL (when it becomes a web publication)
… this is the only way I see being able to relax the web specification

Ivan Herman: I almost agree, but we have an open action - something to discuss - is to use the URL with a trailing ‘/’ as a required address. I think he has an action on coming up with a proposal for that
… if that approach flies, that would be a resolution to what you asked. Wouldn’t need index.html - even if it’s localhost, using that works with anything
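
As background to the relative-URL and trailing-slash options discussed here, the sketch below shows how standard URL resolution treats a relative value such as index.html against a base address with and without a trailing ‘/’. It illustrates generic resolution behaviour only and is not the proposal under action; the function name and the example.org addresses are hypothetical.

```python
from urllib.parse import urljoin


def resolve_entry(base: str, relative: str = "index.html") -> str:
    """Resolve a relative publication address against a base URL."""
    return urljoin(base, relative)


# Hypothetical base addresses, differing only in the trailing slash.
WITH_SLASH = resolve_entry("https://example.org/book/")
WITHOUT_SLASH = resolve_entry("https://example.org/book")
```

With the trailing slash the relative value resolves inside the publication (`https://example.org/book/index.html`); without it, the last path segment is replaced and the result is `https://example.org/index.html`.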

3. WPUB changes

Matt Garrish: the changes are related to the responses to the TAG comment (e.g., WebIDL restructuring, changes in the i18n related wordings,…). Better use next week’s call for more details.

4. Resolutions

Resolution #1: last telco’s minutes approved