Publishing Working Group F2F Meeting—Day 1 — Minutes

Date: 2019-05-06

See also the Agenda and the IRC Log


Present: Deborah Kaplan, Wendy Reid, Charles LaPierre, Ivan Herman, Romain Deltour, Garth Conboy, Dave Cramer, Rachel Comerford, Luc Audrain, Marisa DeMeglio, Ralph Swick, Avneesh Singh, George Kerscher, Tim Cole, Matt Garrish, Franco Alvarado, Karen Myers, David Stroup, Benjamin Young, Brady Duga, Nellie McKesson, Tzviya Siegman, Toshiaki Koike, Daniel Weck, Laurent Le Meur, Nick Ruffilo, Ric Wright, Geoff Jukes


Guests: Andrea Martucci, Geeth (Sangeetha) Sivaramakrishnan, Jeremy Morse, Leslie Hulse, Maurice York, Jeff Jaffee, Ralph Swick, Hadrien Gardeur

Chair: Garth Conboy, Tzviya Siegman, Wendy Reid

Scribe(s): Romain Deltour, Wendy Reid, Rachel Comerford, Dave Cramer, Nick Ruffilo, Jeff Jaffee, Nellie McKesson, Brady Duga


Deborah Kaplan: to talk to the irc bot

Tzviya Siegman: Introducing the schedule for today. We are going to cover getting WP to CR, audiobooks to FPWD, and some ideas for future things.

1. horizontal reviews

Garth Conboy: we’ve taken the draft to the TAG for review

1.1. TAG review on WPUB

Garth Conboy: tzviya and dave are set up to lead us in this session

Ralph Swick: -> WPub #417 Audiobooks

Tzviya Siegman: we opened two issues: audiobooks and webpub
… let’s start with webpub
… we haven’t received much formal feedback on webpub

Ralph Swick: -> TAG #344 Web Publications

Dave Cramer: -> TAG on manifests

Tzviya Siegman: we’re not going to come up with meaningful answers this morning, but it’s worth reviewing these questions now so we can keep them in the back of our minds as we go through the rest of these sessions
… let’s go through the highlights
… the biggest question is the relationship to manifest and what a manifest is
… what is it that we are trying to accomplish with a manifest
… what is a package and why do we need it
… what is it we’re trying to accomplish with jsonld
… the question of protection in the use case document
… see the document for the rest

Dave Cramer: some of this feedback starts around the manifest
… TAG has a design principle around not recreating the web application manifest
… I think that’s a key issue in getting buy in from the tag and the browser vendors
… I think the argument we’ve heard is that the extension model does make people happy

Ivan Herman: More detailed (and final?) comments:

Ivan Herman: I put in IRC a reference to more detailed comments
… if we look at dbaron’s comments from a couple of weeks ago they are much more specific
… they will allow us to avoid these philosophical discussions
… they look at why webidl is right to use, etc
… unless/until tag says we must use web app manifest, I think we just move on
… and use it

Dave Cramer: I wouldn’t characterize it that way, I think they gave us a bunch of feedback about the manifest that we described

Ivan Herman: if I look at the comments in 344
… I don’t see anywhere that this general principle is something that they came back to us with
… this manifest conversation is an endless loop

Luc Audrain: +1 to Ivan

Matt Garrish: from his comments I think he is saying that the expectations from our manifest description are not clear

Tzviya Siegman: I think it’s important that we not try to solve this in this session
… what’s important is this was less than clear
… and we consider this in refinements
… I think we need to go back to dbaron and consider what it is we need to clarify: webidl (why), bounds (urls), obtaining a manifest

Matt Garrish: they would like to see more of what a user agent would do with this manifest

Ivan Herman: there are some very specific issues
… we can carve it up into specific assignments to tackle here or later with dbaron
… he has created an issue with a bunch of sub-issues and we have to divide and conquer

Benjamin Young: we need to be more explicit in our use of this thing

George Kerscher: it sounds as though naming may be part of our issue with the manifest

Ivan Herman: how about waybill?

Rachel Comerford: I want to go back to a point of Ivan’s that there’s some action items to follow up on, instead of circling back on the WAM discussion
… we may want to point to the new version and ask dbaron

Action #1: talk to david about WebIDL (Ivan Herman)

Action #2: talk to david about Localizable strings (Ivan Herman)

Action #3: edit explainer about WAM (Tzviya Siegman)

Action #4: report back to dbaron that link rel value issue is resolved (Matt Garrish)

Action #5: obtain manifest integrating with CORS explanation - there is an open issue for resolution as well (Dave Cramer)

1.2. TAG feedback on audiobooks

Dave Cramer: there are quite a few comments - a basic one is why are we zipping things that are already compressed
… we could use bundled http exchanges or tar
… we were asked to go into why things were rejected
… the “do not reinvent the wheel” principle
… i.e. the Atom format
… I think the fundamental thing they want to see is why we’re not using more of what’s out there - is an audiobook closer to a webpub or a podcast, and why?

Wendy Reid: we can make some action items out of this
… we need to better explain the discussion on the packaging options

Action #6: better explain the packaging options we rejected and why (Wendy Reid)

Wendy Reid: we need more feedback from podcasters
… we might be trying to solve a problem for the podcast industry that they just don’t have

Matt Garrish: when this is open and unpackaged, is this a web publication? We don’t want to proliferate formats. That’s part of the reason we are looking at these formats

Dave Cramer: being the audience to a podcast makes you a valuable participant
… I think we should detail why we are looking into this

Ivan Herman: avneesh asked about accessibility and the alternate format issue is one we have to continue to consider

Deborah Kaplan: don’t judge what a podcast can do based on what they do now
… 1. Podcasts can do structure
… 2. TAG is keeping us honest; “publishers know how to do zip” is a good feature in zip’s favor, but it is not the last word when we write a spec we get to recommend a new technology.
… 3. We have a great document of our needs and requirements, so we can compare that to existing techs, eg. podcasts.
… podcasts can have structure, but implementation is non standard

George Kerscher: I’m a little annoyed by the lack of structure in the audio spec
… why aren’t we defining the webpub TOC
… it’s the collection that’s the most important: the navigation through the collection

Benjamin Young: We need to determine if we’re targeting a package-based distribution model (ala EPUB) or a Webby “constructed” model of distribution (ala Web Apps) or both.

Benjamin Young: Determining that will clarify our description format (manifest), our package (or lack of packaging) choice, and how we expect these to be created and deployed

Leslie Hulse: I think it’s important when you’re talking through the delivery format that it’s zipped up and clean when you’re talking to a publisher

Wendy Reid: Action items!!!

Action #7: talk to Tess on the audio explainer issues (Wendy Reid)

Benjamin Young: in thinking about the audiobook - our processing model is a little fluid
… this is our first attempt at a profile - I wonder if there are upstream rules we need to set for future profiles of wpub
… like, you need a TOC

Matt Garrish: I fully agree with that

Luc Audrain: I think we should bring back the question of “for who is this spec written”
… we should be thinking about these specific needs that we have as we consider moving forward

Avneesh Singh: +1 Leslie

Leslie Hulse: we need to think about efficiency and consistency as an element of this as we did in the development of epub

1.3. I18n review

Ivan Herman: The only major I18n issue is the base direction of text
… which we are not in a position to properly solve
… we are dependent on jsonld - and we also know it is not up to the jsonld working group to solve
… I have spoken to the i18n group extensively about this
… the most recent comments are that what we have now is okay as is
… dbaron also said that this paragraph needs improvement
… its not a proper/final solution
… there is one more issue that was not raised which was to use language maps in the manifest
… I wouldn’t discuss that in the face to face
… apart from that i18n seems complete

1.4. accessibility review

Tzviya Siegman: we’re in pretty good shape with a11y
… Avneesh has been working on arranging a formal a11y review

Avneesh Singh: when should it start?

Tzviya Siegman: now

Avneesh Singh: we had a discussion about the FAST checklist - it’s not complete
… after we take care of that we can start the formal review

1.5. security/privacy

Tzviya Siegman: we have no volunteers for working in security/privacy
… we really need to get through this questionnaire
… PING is the Privacy Interest Group
… bigbluehat and dkaplan3 can help a little

Action #8: ask PING if someone can walk us through the questionnaire because we are without an expert (Ralph Swick)

Tzviya Siegman: Brady? Can you help?

Brady Duga: I guess because you’re making me?

Tzviya Siegman: Task Force: bigbluehat, dkaplan3, duga

Benjamin Young: it’s not as scary as it seems
… (gives inspirational speech)

Deborah Kaplan: if you care about business impact - this is a good place for you to participate

Brady Duga: who’s in charge?

Tzviya Siegman: I can help coordinate to start

2. Synchronized media

Marisa DeMeglio: I work for DAISY and what we’ve done for 20+ years is provide standards and support for synchronized audio books, for people with print disabilities
… there are various flavours of audio books
… sync clips of audio and HTML text, sync audio with no text, etc
… we looked at how to do this for Web Pub
… the precedent is DAISY tech, and after that EPUB Media Overlays (MO)
… we looked at a lot of different technology

Marisa DeMeglio: the documents I linked to are the first draft of an overview, explainer, use cases, and specs
… the CG is called “Synchronized Media” which could cover a lot of things (captions, audio, etc)
… after looking at our UC, we decided to focus on synchronized audio narration
… so right now we call the spec “Synchronized Narration” (sync audio with HTML text)
… we have an explainer that goes over why we didn’t pick other technologies (we talked about that in the previous f2f)
… we decided to go for a custom solution
… Synchronized Narration isn’t necessarily tied to Web Publications, you could use it for standalone HTML
… it’s a simple and straightforward JSON structure
… essentially a sequence of audio clips and text pairs
… the advantages of keeping it very simple is that it’s easy to implement
… (an experimental implementation is almost ready)
… what we need to do still is flesh out the spec and the explainer, and look at how to include it in Web Pub
… we looked at how to apply this narration structure to an audio book or a textual Web Pub
… and decided it makes more sense to add that as a property to the Web Pub, rather than making it a Web Pub profile
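The structure Marisa describes (a simple JSON sequence of audio-clip/text pairs, nestable, with a role for skippability and escapability) can be sketched roughly as below; the property names here are illustrative assumptions, not the draft’s actual vocabulary:

```python
import json

# Hypothetical Synchronized Narration document: pairs of text references
# and audio clips, nestable, with a "role" a player can use to skip or
# escape ancillary content. Property names are assumptions for illustration.
sync_narration = {
    "role": "chapter",
    "narration": [
        {"text": "#para1", "audio": "chapter1.mp3#t=0,5"},
        {
            "role": "footnote",  # a role lets a player skip/escape this block
            "narration": [
                {"text": "#note1", "audio": "chapter1.mp3#t=5,8"},
            ],
        },
        {"text": "#para2", "audio": "chapter1.mp3#t=8,15"},
    ],
}

def flatten(node, skip_roles=frozenset()):
    """Yield (text, audio) pairs in playback order, skipping given roles."""
    if node.get("role") in skip_roles:
        return
    for item in node.get("narration", []):
        if "narration" in item:
            yield from flatten(item, skip_roles)
        else:
            yield item["text"], item["audio"]

# Round-trips as plain JSON, and a player can skip footnotes:
doc = json.loads(json.dumps(sync_narration))
print([t for t, _ in flatten(doc, skip_roles={"footnote"})])
# → ['#para1', '#para2']
```

A player would walk the structure in order, highlighting each text target while playing its clip; the role is what makes meaningful skipping and escaping possible.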

Ralph Swick: I’m curious about the potential other publishers
… I wonder if you got feedback from other groups like SMIL, TTML, etc

Marisa DeMeglio: we used to be in the SMIL WG when it existed

Daniel Weck: smil :(

Daniel Weck: (pun intended)

Marisa DeMeglio: SMIL isn’t really implemented outside of EPUB 3
… our usage of SMIL in EPUB 3 always was a bit different compared to how it was used in other places
… for TTML, the text lives in the same file as the timing information
… for WebVTT, there was no easy way to put text references other than using metadata

Ralph Swick: what I was wondering if e.g. the WebVTT group agreed with this analysis
… and record these conversations

Marisa DeMeglio: we discussed with them at TPAC

Ralph Swick: you can record that you had these discussions, the next step is to file an issue on their tracker

Garth Conboy: this looks like how we used SMIL in EPUB, which is to my view a feature
… maybe you can walk us through the spec, e.g. what is role? do you need to add IDs to the HTML? etc

Marisa DeMeglio: the examples are based on HTML having the IDs
… we thought about using CFI, but I’m reluctant to make this complex
… the dream is to have a non-destructive way to add narration to an HTML
… if there is a way to do that in Web Pub, we’ll be the first to use it
… another problem we had in MO was that you had to structure your document in the way the HTML was structured
… I don’t think anybody ever did this except for skippability/escapability
… (see for definitions of “skip” and “escape”)
… so in order to have meaningful skipping and escaping you need to have a “role” on the narration object

Romain Deltour: [marisa describes the structure of Sync Narration, based on the example in the spec draft]

Marisa DeMeglio: linking the synchronized narration can be done with the HTML link element
… one sync narration corresponds to exactly one HTML doc (which is a useful simplification from MO)

George Kerscher: in the last f2f we went to the APA meeting and they asked the same questions about why we were doing something different
… we explained to them and they gave their blessing, WebVTT people were in the room
… another point: to have the synchronized narration file, do you have to touch/edit the HTML file?

Marisa DeMeglio: if the elements already have IDs, you don’t have to do anything else, if they don’t have IDs you need to add them

George Kerscher: pretending you have an independent publication that’s text, another one that’s audio, can you add that in between to synchronized both?

Marisa DeMeglio: that’s the dream, but we’re not there yet

Leslie Hulse: I’m working with a vendor that is doing media overlay on epub - will that map?

Marisa DeMeglio: I think it’s reasonably straightforward, you can reasonably go from MO to sync narration, and the other way around too

Ivan Herman: on the issue of the text reference, there are 2 things that are worth referring to
… if you say this is a URL and not an ID, it leaves the door open
… there is a spec out there that the Web Annotation group has developed, on how you can express references in JSON
… this use case is very close to what is done in Web Annotation
… the problem we had was that if you want to use a URL it becomes controversial, but we don’t have this constraint
… so we could say either you have a URL, or an object whose structure is defined in the Web Annotation spec
… another thing that may be a problem is if this file (sync narration) is referred to from a manifest as a separate file, and all the discussion about the origin and absolute URLs apply
… the third question I had was administrative: are we confident enough to say that this would be a Rec from this WG (currently it’s a CG report)?
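Ivan’s suggestion of accepting either a plain URL or a Web Annotation-style selector object for the text reference could look like this in a consuming implementation; the function name and object shapes are hypothetical sketches, with the selector object modeled loosely on the Web Annotation Data Model:

```python
# A text reference may be a URL string (fragment acts as the selector)
# or a Web Annotation style object: {"source": ..., "selector": {...}}.
def resolve_text_ref(ref):
    """Return a (source, selector) pair from a string-or-object reference."""
    if isinstance(ref, str):
        source, _, fragment = ref.partition("#")
        selector = {"type": "FragmentSelector", "value": fragment} if fragment else None
        return source, selector
    return ref["source"], ref.get("selector")

print(resolve_text_ref("chapter1.html#para1"))
print(resolve_text_ref({
    "source": "chapter1.html",
    "selector": {"type": "TextQuoteSelector", "exact": "Call me Ishmael"},
}))
```

The appeal of the object form is that selectors like TextQuoteSelector can point into text non-destructively, without requiring IDs to be added to the HTML.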

Tim Cole: in addition to the technical note, we talked about CFI in Web Anno and one feedback was: are we making enough use of CSS Selectors Level 3?
… they go well beyond IDs and allow you to refer to some text without IDs
… another thing that came out was shadow dom
… sometimes content can come from JS, as shadow DOM, there is an API to point to that
… we want to make sure we look at all those issues and possibilities

Dave Cramer: -> Selectors 3 and selectors 4

Wendy Reid: -> EPUB CFI 1.1 (Canonical Fragment Identifiers)

Luc Audrain: the explainer says that Web Annotations are not implemented in browsers, but if we’re doing something new it won’t be implemented either
… I’m a bit sad that Web Annotations cannot be used here, is there a possibility that they could still be used?

Marisa DeMeglio: I’m open to revisiting that, what we found is that there was not an associated processing model for playback
… it also didn’t feel like a good fit for nesting
… there are possibly some customization you can do in Web Annotation, if you know more or have example please share

Luc Audrain: Web Annotations are used for a11y, for instance for dyslexic students to colorize syllables, this is close to what synchronized audio could do

Marisa DeMeglio: it’s all worth exploring, but I’m always hesitant about standardizing something that we don’t have people representing here

Benjamin Young: +1 to building on a shared foundation–i.e. the selection/targeting/pointing-at model from Web Annotation

Ivan Herman: we can use the selection model from Web Annotation without using the full Web Annotation
… it can be used without referring to the processing model

Marisa DeMeglio: how is the implementation support on Web Annotation selectors?

Benjamin Young: in what language and what platform?
… there are a lot of JS libraries
… audio and video-focused libraries predate the Web Annotation spec
… the textual selectors have a lot of support in JS libraries in the browser
… Chromium developers work on a simplified version, possibly done in an extensible way
… Apache Annotator is the one that has the most advanced text selectors implementation

Marisa DeMeglio: I don’t worry so much about the audio
… the text is the trickier case
… when I looked at the Annotation selectors model, there were so many ways to do it that I was afraid of confusing people

Benjamin Young: in your document, you almost have a Web Annotation selector (given some property renaming), it’s the same model

Ivan Herman: I wouldn’t be shocked if we limit the option of selectors for this use case
… there is already a mechanism that can be used to refer to portion of the text without having to modify the text

Tzviya Siegman: we should think about whether this needs to be kept in the Pub WG or could even be moved to another WG

Daniel Weck: I have a concern with our current proposal in that it uses the same “hack” we used in MO that allows content creators to define the style of the currently active narrated text
… for that they define the CSS class that is used to style these elements
… I feel that this is a hack and we can do better, like reusing CSS pseudo-classes (:current, :past, :future), which looks great on paper
… we need to figure this out with the CSS WG

Romain Deltour: +1

Marisa DeMeglio: yes, totally agree with you, we added a note about that in the draft

Garth Conboy: I think this draft is amazingly cool, also that it can be converted to EPUB MO
… we don’t have to tell people to stop doing MO if they can easily migrate

Dave Cramer: I haven’t looked at the draft, but if you need me to bring things to CSS I’m happy to do it!

George Kerscher: regarding the groups, several people can work together: Pub WG, EPUB WG, etc
… I just see those two groups need to work together to “birth this child”

Wendy Reid: Action items!!
… this spec is very important to Audiobooks, it needs to have a home
… who would like to adopt Sync Narration?

Ivan Herman: I think that formally [looks at his boss] we need to publish a formal and finished CG report
… then we have a draft we can use to carry the torch
… then the difficult question is do we feel confident enough to move it to the Rec track?
… my personal feeling is that this WG charter ends in approx a year, we need to discuss (in Fukuoka) what we want after the charter ends
… I would think that this is a document that can go in the Rec track in a renewed charter
… for the time being, we would need to refer to this document as an informal document
… a next step is to take over the document from the CG and republish it as a WG Note to say that it’s important to this community

Luc Audrain: the European Accessibility Act has been voted on by the EU
… one of the things it includes is that if an ebook has an audio equivalent it needs to be synchronized
… so it will be mandatory around 2025

Avneesh Singh: a process-oriented question: we know that the accessibility horizontal review relies on sync narration; does it matter if it is a Rec track document or not?

Ralph Swick: the horizontal reviewers may not object, but transition to Rec can be difficult

George Kerscher: when you have an audiobook we know it’s not fully accessible (it’s a specialized publication)
… it doesn’t address the issue mentioned by Luc, correct?

Luc Audrain: yes

Tzviya Siegman: we should try to avoid being stricter than is necessary
… we don’t want to box ourselves into a corner; we know we want to achieve global accessibility, but even if we can only get there in a year with Rec track documents, we can already reach milestones
… we need to accomplish what we can

Garth Conboy: coming back to what Luc said, what do you mean by synchronized? sentence level? word level?

Luc Audrain: the directive doesn’t specify the details, I think at least at the file level
… the community will decide what to recommend

Garth Conboy: if the publisher has the digital rights to an ebook and not the audio ?

Luc Audrain: I understand it has to be synchronized when both are in the same ebook

Wendy Reid: we need to explore finding a home for Sync Narration

Action #9: find a home for the Synchronized Narration spec (Wendy Reid)

Marisa DeMeglio: I have a lot of action items related to finishing the draft

Action #10: complete the drafts of explainers and specs (Marisa DeMeglio)

3. audiobook previews

Garth Conboy: I have four slides on audiobook samples
… Google Play Books needs to provide samples for customer previews
… to help people make purchase decisions
… sometimes it’s a percentage of the content
… sometimes they just want to hear the narrator’s voice
… publishers are willing for samples to be given away for free
… it’s usually 10% or 5%, but ten percent of a super-long book is not desirable
… so publishers might want to be able to set the sample size on a per-title basis
… or there might be bespoke samples with music, bells, whistles, etc
… that’s happening today
… I would like our specification to have a way to support this, so metadata could express what the publisher’s desire is for sampling content
… I thought about this while [redacted]
… I’d like to talk about whether this is a good idea, and if my first thoughts are going in the right direction
… often we think audio-specific stuff, and then later realize it’s applicable to web publications
… we need duration, either a time or percent
… we might want a start position
… and a link to a bespoke sample
… I’ve talked to our ingest people, and it seems to cover their use cases
… (outlines priority of these choices, depending what’s present)
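A rough sketch of the priority Garth outlines (a bespoke sample link wins, then an explicit start/duration, then a percentage fallback); all property names here are hypothetical, not proposed spec vocabulary:

```python
# Resolve a publisher's sample preferences into a concrete sample, using
# a hypothetical priority: bespoke link > start/duration > percentage.
def resolve_sample(meta, total_seconds):
    if "sampleUrl" in meta:                  # bespoke, pre-produced sample
        return {"kind": "bespoke", "url": meta["sampleUrl"]}
    start = meta.get("sampleStart", 0)
    if "sampleDuration" in meta:             # explicit duration in seconds
        return {"kind": "clip", "start": start,
                "end": start + meta["sampleDuration"]}
    percent = meta.get("samplePercent", 10)  # assumed default: 10% of total
    return {"kind": "clip", "start": start,
            "end": start + total_seconds * percent / 100}

# A 10-hour audiobook with a publisher-set 5% sample:
print(resolve_sample({"samplePercent": 5}, total_seconds=36000))
# → {'kind': 'clip', 'start': 0, 'end': 1800.0}
```

The same shape would generalize to textual WP if start/end were expressed as positions in the reading order rather than seconds.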

Ralph Swick: “bespoke” – made for a particular customer or user

Dave Cramer: (much chaos and merriment)

Garth Conboy: from a generic WP perspective
… this stuff exists in EPUBland
… most epubs at retail sites are sampleable
… they usually start at the beginning, and have a percentage
… we could use % for any type of publication
… some way of setting start/end
… and the ability to set a bespoke sample
… what I’d like to do before lunch
… is see if this is a good idea
… I think this is important for audio at least
… and should this be an audio thing or more general?

Dave Cramer: I have spent too much of my life making samples for ebooks
… I have always avoided awkward endings of samples, not a good user experience, we need to address the end

Garth Conboy: I think that’s a good point
… we do something to fudge it

Dave Cramer: Other question I have, this is a business matter

Leslie Hulse: +1

Dave Cramer: almost describing contractural matters with retailers
… we run into this problem with EPUB, the metadata is ignored, would it be ignored in Audiobooks
… would this better be communicated with ONIX (or similar)

Garth Conboy: that’s an interesting point
… this gets done haphazardly
… there’s nothing in ONIX now
… it’s not served well by any metadata now
… so I decided this needed a better way, per title

Ivan Herman: you won’t be surprised if I think this should be WP-level
… we already have a bunch of metadata we add to links wherever the links reside
… in resources or in readingOrder
… I think that’s the right place for this information
… each resource might have a sample begin/end
… it could be in extra resources
… what this metadata should be, and how it should be named, is a detail for later
… the media fragment URLs are great because you can have intervals
… but there is not yet something like that for textual HTML
… except for selectors, which can select an interval
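The media fragment URLs Ivan mentions carry the interval in the URL fragment itself. A minimal parser for the plain-seconds form (the full Media Fragments syntax also allows npt: and clock-time values, which this sketch ignores):

```python
# Parse the temporal dimension of a media fragment URI, e.g.
# "audio.mp3#t=10,20" meaning seconds 10 through 20.
def parse_temporal_fragment(url):
    """Return (start, end) in seconds, with None for an open end."""
    _, _, fragment = url.partition("#")
    for part in fragment.split("&"):
        if part.startswith("t="):
            begin, _, end = part[2:].partition(",")
            return float(begin or 0), float(end) if end else None
    return None

print(parse_temporal_fragment("chapter1.mp3#t=10,20"))  # → (10.0, 20.0)
print(parse_temporal_fragment("chapter1.mp3#t=30"))     # → (30.0, None)
```

As Ivan notes, nothing this compact exists for intervals of HTML text, which is where selectors come in.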

Garth Conboy: the thing with textual selectors, if this wants to be a WP concept it needs to flow across HTML resources
… begin and end might do that, for different files in reading order

Tzviya Siegman: I echo Dave’s concerns

Tzviya Siegman: we tried this with EPUB, we had the EPUB preview spec
… I’m concerned we’ll have the same problem–spending energy writing a spec that everyone ignores
… today, every retailer has a previewing mechanism; some will allow publisher overrides

Geoff Jukes: the metadata you have fits our use cases
… we produce original content and redistribute other content
… we get lots of custom samples
… that have different contractual information
… and not all books are created equal
… we generate thousands of samples
… and there is an accepted amount of time, it works out to 5min of audio
… and most people don’t listen to the end, they mostly just want to sample the narrator
… we also sample based on the longest track
… and we never do the beginning of the book because it’s not real content
… from a redistribution perspective; we receive samples via ONIX url or physical MP3
… most of our partners like samples as separate files
… from a packaged audiobook perspective, it doesn’t make sense for us to have it in metadata, but we don’t do audiobooks on the web

Garth Conboy: I view, much like with EPUB, EPUB is most used as from publisher to retailer, then something happens after that
… and to tzviya’s comment
… I thought it wasn’t a finished spec, but point taken :)
… if this is getting traction, it’s doing something new from the mess today
… maybe we can do better

Dave Cramer: I am going to ask the Rachel question, what problem are we trying to solve, is there a need for interoperability here?
… what is happening currently seems to work

Garth Conboy: You may be right, but if we are starting from scratch, is this possibly worth doing? If the information came in the package, would it work?

Leslie Hulse: this is a business issue, we set it at the account level, so when we change our mind we don’t want to have to change files

Garth Conboy: what about bespoke samples?

Leslie Hulse: we don’t usually do that
… maybe we should just spec the bespoke sample part

Nellie McKesson: Leslie and Geoff have presented interesting use cases
… as an EPUB maker and file creator, are we talking about previews or samples?
… previews are a thing with video etc on the web. A sample may be a different thing
… with specific business uses

Garth Conboy: and it was the preview spec :)

Nellie McKesson: I feel like the web publication standard is a response to EPUB
… we know there have been issues around reading systems not supporting all the features of EPUB
… but if you’re moving to a more generic web standard, the possibility is greater for things working in a web environment

Romain Deltour: +1 to Nellie

Wendy Reid: I understand the arguments against this. I like the video preview idea
… it’s like a highlight reel, it’s not the first five seconds
… it’s bespoke :)
… the thing about audio is that they’re not focusing on content, they are focusing on the narrator
… and you don’t care so much about the content
… there might be an argument for wanting that 30sec bespoke
… I think this is worth exploring, and it might not be used by everyone

Garth Conboy: it could open a new market for Patrick Stewart to record all previews :)

Ivan Herman: in order to move on
… I think garth or brady should come with a clear proposal pull request
… this is what should be added to the spec
… and then we can see if it works

Garth Conboy: it seems like rough consensus that bespoke previews might be more valuable
… I could make a proposal for that

Action #11: create PR for bespoke previews in WP (Garth Conboy)

Luc Audrain: what is difference between web publication preview and entry page?

Garth Conboy: I think of the entry page as the first page of content
… the preview could be anything

4. libraries and archiving

Tzviya Siegman: slides ->

Maurice York: I’m at the U-Mich library and do IT there

Jeremy Morse: In the publishing division of the U-Mich library. Includes u-mich press

Deborah Kaplan: AUL = associate university librarian

Maurice York: This is about us and where we come from. There are 6 divisions - IT, built HathiTrust digital library (16.9M volumes). We also have fulcrum, our publishing platform.
… we also have the University of Mich press, digital collections activities. We have the Early English Books project - 125,000 books from 1475-1700. Creating structured text markup to make them searchable.
… we have a large papyrus collection - all digitized and accessible. We work across a wide range spanning hundreds of years.
… We would like to walk through our process and some of the challenges we face.
… [narration over slide] The key question is the long-term durability of content, as we work over hundreds of years. Preservation repo software and discovery are purpose-built with strict requirements
… Items that don’t meet requirements need to get stripped down. Going to walk through 7 examples

Jeremy Morse: [image of a preservation advertisement ‘preservation works!’ on a restored building]. I’m always nervous that digital preservation isn’t going to work, if we find out in the future that something fails…
… We deal with many challenges. I’m a publisher and a preservationist. We publish new and reformated content into our preservation platform. Here are the challenges we see:
… [content on slide] Linking and URIs. How do we create an archival package for a complex digital object that links to other objects on the open web that may not be preserved or stable. How do we implement graceful degradation?
… there is a 3D WebGL application embedded within the content. The text can be navigated independently of the WebGL content.
… There are interlinks between the WebGL and the text, which take you to specific locations. There are also interactive elements. These are all URLs. We’re managing the text and the 3D content, which ensures things will remain live…
… from the content, we’re linking to a database to get refined dig notes. In a print version, this would be added into the print, but instead of putting them in the epub, they are just linked out, but it is still considered essential data.
… Example 2 - how do we handle deep linking to content within a publication. Canonical fragments. I’m happy to hear about text-fragment selectors.

Jeff Jaffee: what I wanted to get clear in my mind is standardization vs. research. A standards problem is one where doing things the same way makes it easier for all, but a research problem is just “we have a hard problem and don’t know how to solve it”…
… When it comes to content disappearing - is it a standards or a research problem? We need some deep thinking. What is the level of maturity of this issue?

Jeremy Morse: I’d venture to say that it’s new. I’m here to present problems, not solutions. There is quite a bit with RGIS data - where content is subordinate to the text…
… There is certainly a pattern there and something we need to replicate. But it is not a mature problem.

Maurice York: The particular problem is linking and URIs - and graceful degradation. There are many known solutions, but it’s so difficult to know which is best that we lean towards standards.

Tzviya Siegman: That’s the reason I want to chair the publishing group…

Jeremy Morse: Annotations - how do we improve annotations in 3rd party systems. Hypothesis is using annotations well, so we’re using them as a layer on top, but we’d like to possibly see a link back to publications…
… There is now the hypothesis publisher group that will have annotations that can be part of the version of record - but this is something we don’t have a plan for yet.

Ivan Herman: On that front - without making negative comments - annotations being stored on a server is an implementation choice. The architecture of servers storing annotations is there; if you wanted those to be part of your ebook, it’s a matter of implementation only…

Jeremy Morse: Hypothesis appeals to us because it is a cloud service. We don’t want to know identity of the user, etc…

Ivan Herman: They could give you a tool that you set up yourself and you could get the info - they control the server, which is not required by the standard.

Benjamin Young: Go to iAnnotate - that’s where things get down to brass tacks. Most annotations are public domain, so they can be archived. Your best bet is to archive annotations with the publications themselves.

Jeremy Morse: Accessibility - many features are embedded, but some are provided by the access layer only. For the webGL content, we didn’t attempt to make it accessible, but if you change the browser to mobile-mode, then it skips the map entirely…
… So you get the same data, but not in visual form. We don’t know how to encode a different location - it’s all done based off the ID out of the HREF.
… Rights: how can we determine verifiable claims. If we misrepresent the trustworthiness of the archive or content. Some content may have more restrictive use after it has been archived. Licenses can change.
… Sometimes we’ve had metadata with a Creative Commons license but no version on it, so we had to apply the latest version, even though some content was under previous versions…
… our lawyer noted that use of that content under a previous license is still permissible, but new uses may not know which license applies. And if it’s from a 3rd party, we can’t track the versioning of that license.

Maurice York: It’s a complementary problem to deep linking. If you have complex packages with different digital content, you could have different URIs and different rights to each object. So the problem gets worse and worse.
… especially after 2 decades

Jeremy Morse: Validation - how can we add preservation validation early into the process. There is content we produce - that’s great, we can validate it and follow a spec. More and more of our content is created by scholars - we are more and more moving production outside of us…
… so we want to provide better validation tools around the content so we reduce the iteration of telling the authors to change their works to validate.
… Finally - packaging interactive maps. These are components within the epub, but authors need to know what it is they are going to transmit. We need a manifest which is part of the larger application. It’s another example… a web-like resource that could be organized in any way (a mini website), with just a top-level HTML document.
… When it’s in a repo, and the type of leaflet is deprecated, I’d like to be able to update things in batch knowing their old convention and the new one.
… those are the 7 examples, hopefully interesting. Any questions?

Deborah Kaplan: 1. Dynamic, even-if-local content (eg. embedded interactives, or rich visualizations based on manipulating large datasets)

Deborah Kaplan: … 2. Metadata questions in general: descriptive metadata authorities; the standard accessibility metadata; discovery and access; access rights; retention; preservation metadata.

Deborah Kaplan: … 2a. How much of this can or should be embedded in an object?

Deborah Kaplan: … 2b. If the metadata is in the object, can it be usefully used? E.g.:

Deborah Kaplan: … – Can the preservation system query accessibility metadata in response to user query?

Deborah Kaplan: … – Can retention metadata be queried by a records management system?

Deborah Kaplan: … – Can embargos be parsed and enforced?

Deborah Kaplan: … 2c. Can metadata be updated in objects in the system? Does that modify audit trail?

Deborah Kaplan: … 3. Verifiability and provenance.

Deborah Kaplan: … 4. Versioning, versioning metadata, and change notification. Audit trails.

Deborah Kaplan: … 5. Defining the scope of a publication?

Deborah Kaplan: … 6. Documenting what’s been lost in a lossy conversion (special case of preservation metadata)

Deborah Kaplan: … 7. File formats, format databases, and obsolete data

Deborah Kaplan: … 8. Live datasets as part of publications

Jeff Jaffee: Presumably an archivist could make decisions about all this - but I’m curious if you see a demand in the archival community that these things need to be standardized and what the view is?

Maurice York: I think it’s an excellent question and part of why we are here. Right now there is a big gap between the work going on in this group, with standards and the W3C more broadly, and the archiving community.
… the preservation community has to catch things at the end. What often happens is a loss of content, or a loss of functionality. We lose a lot over time. It damages the content, etc.
… we are trying to look at the whole lifecycle and move the conversation to the early stage and engage better and solve things at a more holistic level.

Laurent Le Meur: An experiment we did with the French national library: the file would not be archived without the metadata, so the file plus metadata needed to be distributed as it would be to any vendor. This was done as an experiment in the context of a legal deposit.
… The context is quite clear, and the idea is unfortunately that what is archived is what is published first - but what about editions and versioning?

Tzviya Siegman: Ivan, Deborah, and I put together a workshop on archiving. We pushed it out to next year but some (but not all) of the issues are already on our list to talk about. I think the point to discuss is what do we do with this information. Are there concrete next steps you want to see?

Jeremy Morse: It’s food for thought, I’m still learning how the group works.

Maurice York: Having conversations with karen and ivan - what are the opportunities to link the concerns
… is there a way to strengthen the presence of library and archive within the W3C? They are both large communities, but they use different languages, so it’ll be unique to see the intersection. We need to figure out the important issues, and figure out how they incorporate, etc…
… Particularly - organizing and working on the library side. We know our people - what’s the best way to generate a presence and crosswalk for these important conversations

Ivan Herman: I am wondering - we are in the phase where in a couple months we have to do a feature freeze. But that is not today. Would it be possible for you to look at the specification draft we have today to see if there are some entries / metadata that would be a high priority for you?
… that we could feasibly add to what we have today. There are some issues, like external links that we cannot solve here and now - that goes beyond what we can do, but there might be other things that are simpler or more obvious that we could try to incorporate right now. Even if just to signal to the world that there are issues.

Deborah Kaplan: (btw for publishing people, a subset of archives standards, which are orthogonal to what we do but help you understand the archives mindset and where those specs are well-understood and developed - and these have broad uptake and adoption, and are quite mature, mostly)

Ivan Herman: and that we’re trying to take it seriously and we’re trying to lead the way

Maurice York: We have to look at the longer term conversations, but it would be an interesting follow up and what it would look like. We’ll get some of our folks to look at it.

George Kerscher: When the epub spec went through ISO, accessibility was part of WCAG. It has benefits for archivists - we’ve seen archives of collections just becoming image PDFs, but we provide information that textual content is available. Continue having the accessibility metadata, as it describes content well.
… I think all specifications should have accessibility provisions. The accessibility data should make its way into our spec.

Tim Cole: We make extensive use of in the spec. This group knows ONIX as a metadata scheme. There are different resource types that might be helpful.

Jeremy Morse: there’s no way to describe relationships of resources to links. Each record has a DOI, but items could be links - so there is no way to show that something is a child link, or that one link is equivalent to a specific resource. We need a way to express relationships within links.

Deborah Kaplan: This is one of those places where the W3C and publishing community should know that the archives community has a very robust, well-established set of standards and specs that it uses internally. OAI-ORE is young and has less adoption…
… It’s probably not exactly what we need here, but the archives community has a robust and practical (research-based) history of describing what is important - so it’s a good starting point, especially for relationships.
… the archives community has done the work and the research and written and adopted a spec. Instead of inventing a wheel for a need, we can borrow or use a wheel from the existing archives community.

Maurice York: Great point - the possibility here is for a really rich 2-way conversation. What can we bring from the W3C world and from the Archives community. Both communities are standards oriented so how can we be productive

Wendy Reid: Anything else? An action item?

Jeff Jaffee: In terms of next steps it sounds like there is a workshop happening next year. Also - I’m not sure if the archive community has a community group, but it would be a great way to get some communications going at the W3C…
… it doesn’t address all the items that could be shoehorned in, but i want to make sure we get started on some longer-term things as well.

Wendy Reid: Ok - 2 items. One to have the two of you write a proposal for the things you’d like to see that you don’t see today. The second is a much larger effort to create your own community group or present to the publishing community group as a task force.
… Jeremy/Maurice - will you take the action item to write the proposal?

Action #12: write a proposal to the WPUB on changes (Jeremy Morse)

Wendy Reid: Does someone want to resurrect the old archive group or make a proposal to the publishing group?

Ivan Herman: Maurice and Jeremy - lets chat at some point. The 4 of us can sit down and figure out what the community group would mean, how to set it up, to see if it makes sense.

Jeff Jaffee: [Robustness and archiving CG might be a place to find more participants]

Action #13: set up discussions with Karen, Jeremy, Maurice on an archival CG (Ivan Herman)

Deborah Kaplan: also there is Web Archivability.

5. open issues

Wendy Reid:

5.1. what is the origin of a Web publication

Wendy Reid: Gist of the problem is about origin
… can we use a base?
… does it interfere with doc URLs?

Laurent Le Meur: Main question is whether to first speak about packaging
… session later
… but should come first

5.2. Manifest files need their own MIME Media Type

Wendy Reid:

… to be discussed today or tomorrow.

Benjamin Young:

Benjamin Young: Create a mime type for manifest files
… have operational set of actions
… convert from authored manifest to canonical manifest
… user needs
… beyond json.parse
… beyond graph representation
… 2 expressed formats
… operationally different
… so if people implement the canonicalization process
… we need a new media type
… wpub + json or some such
… as activity streams people did
… beyond json-ld
… needed their own media type
… we should do the same for both authored and canonical

Ivan Herman: This is the issue about which we say “specification purity less important than good of community”
… the authored manifest, if not using the JSON-LD media type
… then will be ignored by processors
… killing its raison d’etre
… should not touch MT
… could add profile
… for whatever reason
… we could decide to give a different MT to canonical manifest
… but CM can be used as AM
… same format
… same data
… so should not be different MT
… strictly speaking CM and AM have different RDF representations
… but that is specification purity
… backfire on practicality
… processors say something is a URI or stream
… we accept the lack of purity
… we should not touch

Benjamin Young: a profile does not solve the issue
… it is ignored
… jsonld.js is going into Chrome Lighthouse
… so they use JSON-LD going forward
… if we don’t go through some process
… they are equivalent in doc; but not really
… so pub has different states of meaning
… authored v consumed
… If Wiley takes Moby Dick as authored, you get one result
… through canonicalization it has a different meaning
… could do what does
… but how does an implementor know?

Tzviya Siegman: Is there a way to end the stalemate?

Ivan Herman: Say the authored manifest must use creative work
… or a subtype thereof
… could define a separate type and demand that it is added to the manifest
… we signal it is not just a creative work
… also a web pub
… needs canonicalization to get web pub features
… the type is an array of types
… AB, VBs
… this works and answers concerns
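Ivan’s suggestion could be sketched roughly as below; the type names, context URL, and helper function are illustrative assumptions, not anything the group has settled on:

```python
# Hypothetical authored manifest per Ivan's suggestion: "type" is an
# array pairing a schema.org CreativeWork subtype with an additional
# publication type that signals the need for canonicalization.
# All names and URLs here are assumptions for illustration.
authored_manifest = {
    "@context": ["https://schema.org", "https://www.w3.org/ns/wp-context"],
    "type": ["Audiobook", "WebPublication"],
    "name": "Moby-Dick",
}

def declares_web_publication(manifest):
    """True if the manifest's type array includes the publication type."""
    types = manifest.get("type", [])
    if isinstance(types, str):
        types = [types]
    return "WebPublication" in types

print(declares_web_publication(authored_manifest))  # True
```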

Laurent Le Meur: I fear it would be an abuse of the mechanism of context types in
… used to indicate properties within a structure

Ivan Herman: It’s an RDF type… no more

Tim Cole: defines additional type property
… can be used for this
… make sure understands
… could do an extension
… as long as not primary

Ivan Herman: A subtype of creative work?

Tim: An extension
… creative type by inheritance
… external vocabulary

Ivan Herman: It is a syntactic hack

Benjamin Young: This came from canonicalization
… not to express more
… VC has a processing model
… an intended use for data models
… equivalent to using a JSON-LD parser
… but we have two types: AM and CM
… a consumer does not know what you have
… you are left wondering
… it may be a question of who runs canonicalization
… publisher does not want messy author thing
… we want a canonicalized thing
… developer won’t know
… will consume messy thing wrong

Ivan Herman: Your solution works in an ideal world
… too high for publishers
… need to lower the bar
… (except Wiley)
… there are self-publishers, etc.
… we want a simple manifest
… requiring a canonical manifest is not realistic

Wendy Reid: I don’t hear a conclusion
… can the participants work it out?

Benjamin Young: Developers can be smaller than Wiley
… but the technology does not say when to use CM
… signal what processing to do
… today; nothing that distinguishes
… no clarity about process
… different from structured data testing tool
… need to signal when to execute

George Kerscher: Does a wpub check resolve this problem?

Benjamin Young: “The tools will save us”

Matt Garrish: You always run the canonicalization
… but maybe nothing to do to AM if everything is already there
… don’t bypass

Ivan Herman: Can clarify doc to say “when reading system turns AM into abstracted web idl, in that process it canonicalizes the manifest and converts to JS classes”
… if AM is complete, then canonicalization is the empty set
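That idea can be sketched roughly as follows; the default values and property names are illustrative assumptions, not the spec’s actual processing rules:

```python
# Rough sketch of canonicalization as always-run but possibly a no-op:
# missing values are filled with defaults, and a complete authored
# manifest passes through unchanged. Defaults and property names are
# assumptions for illustration only.
DEFAULTS = {"readingProgression": "ltr", "type": ["CreativeWork"]}

def canonicalize(authored):
    canonical = dict(authored)
    for key, value in DEFAULTS.items():
        canonical.setdefault(key, value)
    return canonical

complete = {"type": ["Audiobook"], "readingProgression": "ltr", "name": "X"}
print(canonicalize(complete) == complete)  # True: nothing to add
```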

Benjamin Young: The way you know to run that is linkrel
… SEO bots will get something different
… the AM output
… which will differ and may not be found

Romain Deltour: To George’s question
… epub checking very different
… on web, content is not validated
… don’t require valid content
… so future web pub checker cannot be used this way
… just a lint
… user agents won’t request content

George Kerscher: You can require consistency

Romain Deltour: But you can have content fail; OK for the web

Wendy Reid: Do not see consensus
… need working the issue + referee

Benjamin Young: Ivan and Matt have pointed out that canonicalization only targets wpub processors
… they are looking for the rel relationship and abstracting it
… so we are ok
… seo bots and post processors will be confused, but that’s ok
… we can close the issue
… do not need a media type

Wendy Reid: Can you formalize that proposal
… (Issue #44 is for tomorrow)

Proposed resolution: the rel=”publication” discovery mechanism will be what signals the need for canonicalization/processing (Benjamin Young)

Ivan Herman: +1

Wendy Reid: +1

Tim Cole: +1

Matt Garrish: +1

Marisa DeMeglio: +1

David Stroup: +1

Tzviya Siegman: +1

Deborah Kaplan: +1

Benjamin Young: +1

Nellie McKesson: +1

Dave Cramer: just make it stop!

Rachel Comerford: +1

Romain Deltour: +1

Resolution #1: the rel=”publication” discovery mechanism will be what signals the need for canonicalization/processing

Wendy Reid: So resolved
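As a rough illustration of the resolved mechanism, a processor would scan the entry page for a rel=”publication” link and treat its presence as the signal to fetch and canonicalize the referenced manifest. The markup and file name below are assumptions:

```python
from html.parser import HTMLParser

# Minimal sketch of the resolved discovery mechanism: a processor scans
# the entry page for <link rel="publication"> and only then fetches and
# canonicalizes the referenced manifest. The markup is illustrative.
ENTRY_PAGE = """
<html><head>
  <link rel="publication" href="publication.json">
</head><body>Moby-Dick</body></html>
"""

class PublicationLinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.manifest_href = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and "publication" in (a.get("rel") or "").split():
            self.manifest_href = a.get("href")

finder = PublicationLinkFinder()
finder.feed(ENTRY_PAGE)
print(finder.manifest_href)  # publication.json
```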

6. Audiobook document publication

Wendy Reid: The audiobook spec has been cleaned up

Tzviya Siegman:

Wendy Reid: the only question is the packaging problem
… it likely does not block the push to WD

Ivan Herman: compared to most public WD, it’s way beyond the usual state and should probably be pushed.

Proposed resolution: Publish the Editor’s Draft of the Audiobooks Profile of WP (Wendy Reid)

Proposed resolution: publish the editor’s draft of the audiobooks profile of WP as a First public working draft, shortname should be ‘audiobooks’ (Ivan Herman)

Dave Cramer: there is concern amongst the general publishing community that there isn’t a problem to solve here
… he doesn’t see adoption being driven.

Leslie Hulse: +1

Wendy Reid: has done work to talk to publishers, and the current state is based on who has been willing to talk with her.

Leslie Hulse: +1 to dauwhe

Wendy Reid: the european publishers have been very keen on this; Canadians have been interested but wary; USA doesn’t see a problem.

Brady Duga: it was a lot of work to sort through the various formats that were being dumped on them.
… An example of a problem this would solve: they can’t accept TOCs, because they get TOCs in spreadsheet form.

Leslie Hulse: not the same situation as with ebooks, because publishers don’t deal with the pain directly.

Deborah Kaplan: adoption from Google, that duga implies we would get, is a very consequential adoption.

Leslie Hulse: there are middlemen who’ve founded their whole business on cleaning up messes, who likely won’t cooperate.
… so, there’s much that makes sense in the spec, but it’ll be a hard sell. A meeting or working session with the Audio Publishers Association is likely in order.

Benjamin Young: +1 to co-coordinating with the larger audio publishing groups

Benjamin Young: Audio Publishers Association -

George Kerscher: There should be a tool to help publishers get their content into this format.
… TOC gives them a path to do more, which should be attractive.
… Would like to see page numbers and accessibility added to the spec.

Avneesh Singh: issue #44 is still pending – should we resolve it before sending the draft?

Tzviya Siegman: we hope to resolve issue #44 tomorrow.
… page list is part of the spec now, and many of the accessibility features are built in already.
… Re: collaboration, attempts have been made to work with the Audio Book Association, without much success. If anyone can provide a connection, that would be greatly appreciated.
… Geoff Jukes joining the group has been a big win, but is just a start.

Ivan Herman: for the first public working draft, its ok to leave some issues unsolved
… but references to any open issues should be included in the document

Luc Audrain: Hachette Livre is very interested in a new format/spec for audiobooks, particularly for metadata, so that they can add more info like keywords.

Proposed resolution: publish the editor’s draft of the audiobooks profile of WP as a First public working draft, shortname should be ‘audiobooks’ (Wendy Reid)

Ivan Herman: +1

Tzviya Siegman: +1

Matt Garrish: +1

Garth Conboy: +1

Rachel Comerford: +1

Tim Cole: +1

Marisa DeMeglio: +1

Romain Deltour: +1

Nellie McKesson: +1

Deborah Kaplan: +1

Franco Alvarado: +1

George Kerscher: +1

Charles LaPierre: +1

Avneesh Singh: +1

Resolution #2: publish the editor’s draft of the audiobooks profile of WP as a First public working draft, shortname should be ‘audiobooks’

David Stroup: +1

Dave Cramer: +1

Ivan Herman: practical thing: I will be at a conference in San Francisco next week, so won’t be available to act on this before/during that period.

Jeff Jaffee: Wondering at a higher level what the plan is to get more people involved and drive adoption
… About 6 months ago, they thought we’d get more impact by focusing on audiobooks, but is now concerned about larger strategy given the feedback about lack of interest in adoption
… So, can those companies who are enthusiastic about this reach out to other companies, and help to push this forward?

Wendy Reid: Has made herself very available to talk to publishers and the APA. The last update she got was that they are very concerned about anti-trust, but she hasn’t heard from them in a couple months.

Leslie Hulse: There shouldn’t be an antitrust issue, and she’d love to be able to push for an in-person meeting about this with the APA, via the big publishers who are already onboard.

Benjamin Young: We have the publishing community group onboard now, so this seems like a potential opportunity to organize a community group.

Garth Conboy: A retreat to the community group doesn’t seem like the right thing to do now
… Amazon has never participated in epub, yet they continue to support it as an ingestion format. Similarly, if people just start giving this new format to distributors, they will begin to support it.
… That said, he doesn’t expect Amazon to ever participate in these kinds of meetings.

George Kerscher: If someone has a tool to demonstrate ingestion of this format, then we could make a demo video, showing the side-by-side comparison of the current messy process
… So, tooling would dramatically help with adoption.

Tzviya Siegman: The EU is eager to adopt, so there likely won’t be a problem there.
… So this isn’t a case of creating a spec with no path for adoption – there’s an established group of adopters ready and waiting.

7. lightweight packaging

Tzviya Siegman: slides ->

Laurent Le Meur: Audiobook use cases: B2B, who need to distribute files to distributors, etc., to all supply channels.
… For print-disabled people, and direct to consumer.
… Goals: must be easy to author – we really need some authoring software.
… Must be usable for B2B supply chain, must be usable to replace pure audio, and must be usable for B2C
… Even if EPUB3 isn’t going to be replaced, there are still use cases. E.g., an academic who wants to put together a long article, package it, and send it out to a publisher.
… Out of scope: Synchronized media, DRM, and WP to LPF conversion
… What is this package? A zip, that has a primary entry page, a json manifest (possibly embedded in the entry page), a cover, contents, and supplemental contents
… Where we are: pretty close to finalization. See the link in slides for the draft spec.
… There are some remaining open issues: LPF processor: what requirements apply to these processors? and what happens if there is a reference to something in the manifest that is not in the package?
… And what should happen if there are extra resources in the package?
… Once we wrap up these conformance requirements, we should be done.
… Another question: Is there something in the package that can be called a “base url” or “origin”
… Maybe we can talk about those issues.
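A minimal sketch of such a package, assuming illustrative file names (publication.json for the manifest, an HTML primary entry page, a cover); the reserved names remain whatever the draft spec says:

```python
import io
import json
import zipfile

# Sketch of the lightweight package: a plain zip with a JSON manifest,
# a primary entry page, a cover, and the audio content. File names are
# illustrative assumptions; the draft spec defines the real ones.
manifest = {
    "@context": "https://schema.org",
    "type": "Audiobook",
    "name": "Example Audiobook",
    "readingOrder": [{"url": "audio/chapter1.mp3"}],
}

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as pkg:
    pkg.writestr("publication.json", json.dumps(manifest))
    pkg.writestr("index.html", "<html><body>Example Audiobook</body></html>")
    pkg.writestr("cover.jpg", b"")           # placeholder bytes
    pkg.writestr("audio/chapter1.mp3", b"")  # placeholder bytes

with zipfile.ZipFile(buffer) as pkg:
    names = set(pkg.namelist())
print("publication.json" in names)  # True
```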

Ivan Herman: A comment for those who aren’t familiar: this packaging format is NOT on recommendation track, so it’ll end up as a working draft but not a recommendation.

Tzviya Siegman: The main feedback from TAG: we need to explain/understand better why we’re going with zip rather than an existing packaging format that’s widely used on the web
… So part of what we need to accomplish in this F2F is to put together a better explanation/understanding to put out into the world

Charles LaPierre: What about something like epubcheck?

George Kerscher: Assuming any links in toc point to audio file with time offset
… assumes files are gathered and dumped into a folder in package
… concerned that we have lost URLs to original files
… for roundtripping, do we have a way to preserve those?

Ivan Herman: Should remember this is not a replacement for full-blown web packaging
… that is why it is a note
… doing this because web packaging spec isn’t ready, this is a stopgap
… This is one of the things that web packaging should give you
… beyond our charter to produce web packaging spec
… this is lightweight because it does the minimum
… can’t put full web publication in this package. Just have to accept that

Matt Garrish: Don’t want to bikeshed but …
… is lightweight really conveying what we want?
… should we discuss today?

Ivan Herman: Yes, we should discuss today or tomorrow

Benjamin Young: Packaging format so far is targeted to a publisher distribution model
… not webby, and may be fine (may never be on the web)

Romain Deltour: +1

Benjamin Young: Could exist in this distribution format, could exist in some webby sort of thing later
… Make clear we are not trying to make something that can package web stuff
… Have concerns about introducing a new zip format (next to epub)

Romain Deltour: Note so not rec track, but still need to be careful about language
… “exposed” is not a well defined term
… Need to be clear for UA requirements

Laurent Le Meur: Agree not webby
… Disagree that “it’s only a zip, just throw in what you want”
… need a little more. Need a manifest.
… Need to specify the files names and formats, even if it is not a web package
… Also agree with romain that this needs more specification

Ivan Herman: Want to avoid going to the other extreme
… True, not a web format. But to say it is not useful for web publications goes too far
… there are a number of web pubs that could be placed in this package
… eg scholarly article
… also eg the HTML spec
… the only non-relative references are to CSS
… (to stylesheets, not the css spec)

Garth Conboy: Utility is highest in distribution world
… don’t want to put this effort into a standard OCF from epub
… the content isn’t epub, lots of differences
… we are now more webby

George Kerscher: In agreement with what has been said
… the tool that tends to make these puts everything in a big folder
… LPF-ing those is pretty straightforward
… it would be easy to turn that into a web pub in the future

Benjamin Young: Once these things exist and you get a .lpf (or whatever), who opens this? What do you expect to run?
… Why isn’t this an appendix in web pub spec with a list of names to use if you zip it?
… The only reason for these specs is to say what files you should look for after you unzip it
… What do we expect to open these, and what do we think they should do once they open it?
… Need a new package format to do interesting things like properly hook up offline annotations
… if this is just distribution, you just zip it up and send it wherever, and whoever receives it just does whatever they want with those files
… Just need to have some well defined names

Ivan Herman: Completely agree! Could be an informative appendix
… but WPUB is already so big that adding this as an appendix makes it unmanageable
… editors are even now wondering if the existing spec should be split up

Benjamin Young: +1 for making WPUB multiple docs each with their own value(s)

Ivan Herman: making laurent do this is just more efficient than leaving it to the current editors

Tzviya Siegman: We seem to be spinning in circles
… will try to summarize
… not really resolved, which is surprising
… we do need to respond to the TAG

Laurent Le Meur: many things were about the profile, not just the packaging
… there are webby and non webby parts

Tzviya Siegman: Maybe packaging is a bad name, might consider distribution

8. Implementations, testing plans, CR

Tzviya Siegman: Testing plans
… our testing champion is gone
… need tests for everything

Ivan Herman: What does testing mean?
… depends on which part of the spec
… first, manifest for specific metadata
… what does it mean to move a vocab to CR
… have to prove that any term we define, we have at least 2 potential users who use or intend to use that term
… to show that every term makes sense

Ralph Swick: As you go into CR you expect that before rec there will be at least 2 users of each spec

Dave Cramer: Can I just build a JS implementation?

Ralph Swick: Not sure that really meets the intent

crowd: What do you mean?!

Wendy Reid: Colibrio has agreed to do an audiobooks implementation
… but we need to define exactly what we need from them

Tzviya Siegman: Isn’t there a test subgroup that worked on this?

Ralph Swick: The question is what does testing a vocab mean
… essentially there is both a producer and a consumer, and somewhere along the line there is a term that is produced and consumed

Dave Cramer: Not sure how we red/green test “a toc should be processed”

Laurent Le Meur: Readium committed to audiobook implementations
… there will be a way to ingest an lpf file

Matt Garrish: There are some things we can’t explicitly test, have to rely on people
… what happens when you read the json on the primary entry page? How does that relate to the whole pub?
… may need to look at that again

Benjamin Young: We did a spec once (web annotations spec)
… did validation tests, did run automatically
… just hoped people would implement it and test
… it is just a validation spec
… There are a few musts, so we can do those tests
… there isn’t much that is user-facing
… and a blank white page is a legal web pub
… there is no requirement that we have a linear nav. It is hard to test eg “can you get to page 2”

Ivan Herman: In Lyon we discussed “minimum viable reader”. Really minimal stuff we expect an implementation to do
… We can test the minimal set. We stopped, but maybe we should revisit as a test harness

Dave Cramer: Looking for musts, it is all about information processing

Benjamin Young: Discovery is another thing we can do
… for web pubs, there is an entry point file (which is also the identifier), which must have a rel
… that points at some json which must have some things, that can then be canonicalized
… maybe can check if some terms are properly set. Can test all those things
… 2 days to 2 months depending on how we implement

Romain Deltour: Not exposed by UA, so can’t really test

Tzviya Siegman: Need some volunteers for testing
… no one will test, so we will never release
… Nellie can help, but not lead
… need a minimal amount of testing
… need someone to go through the spec and figure out what needs to be tested
… look at the musts and shoulds
… and do what web annotations did

Benjamin Young: No, that was too much effort

Jeff Jaffee: Make the plan or tests?

Tzviya Siegman: Both

Jeff Jaffee: break it down then

Tzviya Siegman: Ok, anyone want to make the plan?

Brady Duga: crickets…

Benjamin Young: web annotations (and others) did mocha tests
… those are red/green tests, you either pass or fail
… need to understand is this packaging? Or just wpub?
… There is also some vagueness in the spec, need to understand if that is on purpose
… the tests have to test the entire spec
… audiobook tests would fail many wpub tests, so that needs its own tests
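Red/green validation tests of that kind might look like the sketch below; the checked requirements are illustrative assumptions, not the spec’s actual MUSTs:

```python
# Sketch of red/green validation testing: each check passes or fails
# against a single requirement. The requirements below are illustrative
# assumptions, not the spec's actual conformance list.
def check_manifest(manifest):
    errors = []
    if "type" not in manifest:
        errors.append("manifest must declare a type")
    if not manifest.get("readingOrder"):
        errors.append("manifest must have a non-empty readingOrder")
    return errors

good = {"type": "Audiobook", "readingOrder": [{"url": "c1.mp3"}]}
bad = {"type": "Audiobook"}
print(check_manifest(good))      # []
print(len(check_manifest(bad)))  # 1
```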

Dave Cramer: Say we have a test plan
… and some things that pass
… we could end up with a web pub spec that does nothing and is unusable

Tzviya Siegman: We can discuss that depressing topic at dinner

Jeff Jaffee: Positive spin on dauwhe’s comments
… this is a great opportunity to learn all about the spec!
… The test plan people will prevent the terrible future that dauwhe predicts

Tzviya Siegman: We are out of steam
… tomorrow we will organize all that

9. Resolutions

10. Action Items