Publishing BG, IG, & WG Joint F2F, 1st day — Minutes

Date: 2017-06-22

See also the Agenda and the IRC Log


Present: Leonard Rosenthol, Ivan Herman, Rick Johnson, Charles LaPierre, Takeshi Kanai, Peter Krautzberger, Dave Cramer, Brady Duga, Jun Gamou, George Kerscher, Tzviya Siegman, Ric Wright, Laurent Le Meur, Bill Kasdorf, Avneesh Singh, Romain Deltour, Garth Conboy, Mateus Teixeira, Matt Garrish, Vagner Diniz, Vladimir Levantovsky, Dan Sanicola, Rachel Comerford, Matt Kuznicki, Selma Morais, Hadrien Gardeur, Toshiaki Koike, Leslie Hulse, Micah Bowers, Liisa McCloy-Kelley, Cristina Mussinelli, Bill McCoy


Guests: Fred Chasen

Chair: Tzviya Siegman, Garth Conboy

Scribe(s): Dave Cramer, Ric Wright, Brady Duga, Mateus Teixeira, Matt Garrish


Dave Cramer: Hello, world

Tzviya Siegman: let’s get started

1. introductions

Tzviya Siegman: welcome everyone
… Ivan will talk about administrative stuff

2. administrative stuff

Ivan Herman: quite a few of us have never been to a w3c meeting
… so I’ll go into some details which may be boring for the old hands :)

Dave Cramer: [general webex disastrousness]

Ivan Herman: how we work, how we operate day-to-day, what kind of tools we use…
… there is a home page for the working group

Tzviya Siegman: does everyone know how to use IRC?

Ivan Herman: we use IRC for lots of things, including taking minutes during the meetings
… the minutes themselves are stored on the web
… published a day or two after the call

Dave Cramer: info about IRC:

Ivan Herman: there are lots of references on the working group home page
… apart from IRC, we have two major tools for work
… one is email, the other is github
… there’s a public mailing list (the archives themselves are public). this is the one we’ll use most of the time
… probably mainly for administrative things
… there is also a member-only mailing list
… it is rarely used
… mostly for very confidential stuff
… often for copyright, patent stuff
… if you want to send something only to the chairs, there’s an email list for us

Dave Cramer: [various newcomers enter]

Ivan Herman: there’s more info in the work mode page
… the other big tool is github
… all our documents will be in github
… we plan to have a separate repo for each document
… there is also an overall repo for the wg (which we are using for the home page)
… every repo also has a wiki
… and the most useful tool of all is github issues
… threads around specific issues can be organized, found, labelled, etc
… for those who haven’t used github before, it can be frightening
… I’ve put together a GitHub for Poets :)
… everyone will need to have a github account, and let me (Ivan) know
… due to access control for github
… WG members can edit wikis, etc

Leonard Rosenthol: @pkra - we don’t have audio working right now in the conference room, sorry!

Ivan Herman: and there’s a separate admin group with doc editors who can accept PRs and make direct updates to the repo
… some of you might be already in that group

Peter Krautzberger: @leonardr np. just wondering if that was an attempt at dialing in.

Ivan Herman: everything I’m saying here is for working group members (as opposed to BG)
… we plan to have one 1hr call per week
… we will reuse the slot we used for DPUB
… it’s Monday at Noon Eastern time

Tzviya Siegman: the timing works well for US and Europe, but we need to take everyone in account

Jun Gamou: The call is 1AM in Japan; I think it’s OK

Takeshi Kanai: I’m used to it

Tzviya Siegman: when the US changes clocks in November, should we ask again?

Jun Gamou: yes

Ric Wright: the IDPF oscillated

Ivan Herman: that was complicated

Dave Cramer: csswg has one call a month scheduled for an Asia-friendly time

Vladimir Levantovsky: we’ve tried things like that

Ivan Herman: time changes are always messy
… for next Monday, we will have lots to do
… (US holiday coming up)
… I would propose to use the DPUB IG webex data for next Monday, then I’ll set up a new series
… for face-to-face meetings
… every year W3C has a week-long extravaganza called TPAC
… where most WGs have their F2F meetings, and talk to other groups
… this will be important to us due to our interactions with lots of other WGs
… TPAC is in Burlingame, CA Nov 6-10. Make hotel reservations now!

Tzviya Siegman: somewhat less fun than Lisbon :(

Ivan Herman: TPAC 2018 is in Lyon, France, which is more fun than silicon valley
… TPAC 2019 is in Korea, possibly
… we will need to decide if we want a springtime F2F
… much depends on money
… we also need to talk about the patent policy (IPR)
… there are a number of process steps at w3c that look complicated
… and some of that complication is due to the IPR policy
… w3c’s goal is that any formal specification can be implemented by anyone without any patent encumbrances or royalties
… what is expected from w3c members of a WG is, even if they have a patent, they accept that the patent is free to use

Leonard Rosenthol: when we start to have documents, official notices will be sent out

Ivan Herman: yes
… we recognize that there may be companies who consider these patents as their crown jewels
… in that case they are required to disclose the patent
… and the group has to work around that
… there are milestones in the spec process, and FPWD is one of the milestones where all members are asked about patent exclusions

Dave Cramer: FPWD = First Public Working Draft

Vladimir Levantovsky: does not responding to the call for exclusions mean that you accept things?

Ivan Herman: yes

Tzviya Siegman: lots of people in publishing don’t pay much attention to IP law
… when you submit to github, you are agreeing to the license terms of that repo
… if you have questions, talk to your own lawyer

Vladimir Levantovsky: if you want to get something done, don’t talk to your lawyer :)

Garth Conboy: we can make an example of dave
… he did a “q+” in the IRC channel when he wanted to ask a question
… in a f2f it’s somewhat stilted to do that
… the chairs will watch the queue

Tzviya Siegman: or raise your hand

Garth Conboy: we want to make sure everyone can participate

Ivan Herman: anything I forgot?

Bill McCoy: just being a good example :)

Dave Cramer: [fun discussion about details of licensing of github repos]

Cristina Mussinelli: what if you want to add other people to the WG?

Ivan Herman: The AC rep of the company has to appoint a person to the WG
… it’s a two-step process. the company has to join the WG, due to the IPR issues (as IPR is on a company)
… then the AC rep has to nominate a person, even if that person is the AC rep
… there are two companies who’ve joined the WG, but have not nominated any people

Tzviya Siegman: if you joined the IG (DPUB), your membership does not automatically transfer

Ivan Herman: one more thing on membership
… we do have the notion of an “invited expert”
… it’s an exceptional thing, but there are some exceptional people who can’t join as members
… these people can apply for invited expert status, and we have to evaluate
… right now we have two
… both were members of DPUB, one is Heather Flanagan, IETF spec editor
… and the other is Peter Krautzberger
… who is a core person for MathJax

Peter Krautzberger: waves

Cristina Mussinelli: you email the chair?

Ivan Herman: no, there’s a form

George Kerscher: what is the minimum time commitment to the working group

Ivan Herman: official is half a day a week that you’d need to commit to be effective
… the reality is that some people only lurk, which is OK

Tzviya Siegman: invited expert form:

Ivan Herman: as long as they don’t veto everything at the last minute :)
… even if you look at the DPUB list, there are some unfamiliar names who’ve never been on a call

Tzviya Siegman: any other questions or comments?

Charles LaPierre: for a particular company, how many people can you nominate?

Ivan Herman: as many as you want
… w3c operates on the principle of consensus. It is our goal.
… this can require a lot of talking.
… we may have votes and record resolutions, and votes are taken per individuals
… but there might be decisions where the vote may be taken by members
… anything else?
… let’s go on with the agenda

Dave Cramer: [general praise for Ivan’s voice]

3. Charter

Ivan Herman: there’s a link to the charter from our home page
… we are chartered for three years, which is unusually long… it’s usually two years
… the reason I will come back to
… let me go to the deliverables
… we are chartered to deliver four recommendations (in the jargon of w3c)
… REC = web standard
… and we can publish w3c notes
… in this charter we did put in examples for possible notes that we might publish
… and we took over one document from DPUB, which is Latinreq
… a collection of knowledge about western typesetting, inspired by JLreq
… work continues on Korean, Chinese, Indic scripts, etc.
… we are chartered to do a web publications spec
… with publications as first-class citizens on the web
… we separated the publications from packaging

… these are books, journals, all sorts of publications

George Kerscher: these are two separate RECs?

Ivan Herman: yes.

Leonard Rosenthol: this afternoon we’ll start on the details

Ivan Herman: then we have a separate rec for EPUB4
… we don’t know what these will contain
… EPUB4 might be one page :)
… but the core work is web publications
… BFF work was done in EPUB 3.1 WG
… there is also a rec called DPUB-ARIA 2.0
… there has been joint work between DPUB and ARIA, which resulted in DPUB-ARIA 1.0
… the epub:type vocabulary has been transferred to ARIA @role

Matt Garrish: I think it’s in CR

Ivan Herman: here, there are many vocabulary items that the EPUB community uses, especially around Education, which may be added to make DPUB-ARIA 2

Ivan Herman: these are RECs, which means we have to follow The Process
… the first draft we produce, a First Public Working Draft, is a milestone
… it may not include all features, it may just be an outline

Rachel: * dauwhe… DPUB-ARIA 2 - This time it’s personal

Ivan Herman: FPWD gives us a list of what we want to develop
… FPWD triggers IPR review
… some members were advised not to join the WG by their lawyers since the scope wasn’t completely clear
… FPWD should make that more clear
… from that point on, we are supposed to publish relatively frequently
… my preference is that we publish frequently
… then there is CR, Candidate Recommendation
… where the work is feature-complete and relatively stable, and we invite implementations
… then we have to document multiple implementations and tests
… before it can become a REC
… I’ve had experience with other WGs where testing comes as an afterthought, and it’s a major pain
… the earlier we start, the better
… one reason we’re chartered for 3 years is that for this complexity, testing will require a lot of work and time

Vladimir Levantovsky: from webfonts WG experience
… developing the test suite took more time than the spec itself
… for version 2 of webfonts, every time we had a “must” in the spec, we started developing the test then
… you must work in parallel to make life better

Ivan Herman: absolutely
… I used to be RDFa staff contact and implementer
… every time a new feature came in, we implemented it in our own systems
… at some point we will have a feature freeze, unless you come with a test :)

Vladimir Levantovsky: you have 2 options, you have capital “MUST” or lowercase “must”, the latter a common-sense statement that does not require testing
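
The capital “MUST” versus lowercase “must” distinction can double as a test-planning aid. What follows is only a minimal sketch, not an existing W3C tool: the function name `normative_statements` and the sample text are hypothetical, and the sentence splitting is deliberately naive. It lists the sentences containing an uppercase MUST so that each one can be paired with a test as soon as it is written.

```python
import re


def normative_statements(spec_text):
    """Return the sentences that contain an uppercase MUST (or MUST NOT)."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", spec_text.strip())
    # The search is case-sensitive, so a lowercase "must" is ignored.
    return [s for s in sentences if re.search(r"\bMUST\b", s)]


# Hypothetical spec text, for illustration only.
sample = (
    "A user agent MUST expose the title. "
    "Authors must use common sense. "
    "A publication MUST NOT omit its manifest."
)

for statement in normative_statements(sample):
    print(statement)
```

Run on the sample, this reports the two uppercase statements and skips the lowercase one, which matches the idea that only the former need tests.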

Ivan Herman: one more thing…
… that came up during charter review
… this work is not done in isolation
… this is work done to put publications on OWP
… we always must look at what other technologies are available at w3c or under development
… we must not reinvent wheels
… even if the other wheels are not ideal
… furthermore, every WG has a constant shortage of volunteers
… we can’t really ask other WGs to do stuff for us, we should work together with them
… some comments during charter review were afraid we were trying to fork the web
… we want to be an integral part of the web

Ivan Herman: perhaps quick re-introductions

Dave Cramer: [another round of introductions]

Ivan Herman: if you are already on IRC, please do a “present+”
… if you’re not on IRC, ask your neighbor to help

4. horizontal reviews

Ric Wright: rkwright now scribe

Garth Conboy: And continued problems with WebEx — so this channel is the only path for now.

Ric Wright: This means that every REC has to be reviewed by experts in the W3C in several areas. Tech arch, accessibility, etc.

Ric Wright: These reviews are required. There are checklists and patterns to smooth it but it must be done

Ric Wright: Intent is to ensure there are no security, privacy or accessibility problems.

Ric Wright: This group will need to be very careful that we stick to the WP aspects of the spec and not intrude into the other areas of the W3C (e.g. HTML, CSS)

Ivan Herman: We then have to have champions for each of these areas (i18n, A11y, security, privacy)

Ivan Herman: Leonard will speak shortly about security. We have good coverage on accessibility

Ivan Herman: On the i18n side, we will need to be careful about ids, uris, iris, etc. w/respect to i18n char-sets

Ivan Herman: another area we need to be careful about is metadata, which also has issues with the char-sets for the actual text content

Ivan Herman: One example is mixing bidi text in the metadata content.

Laurent Le Meur: Do we need to review the work of other groups whose work touches on us?

Ivan Herman: No, we just need to make sure that they are aware and there is no conflict

Tzviya Siegman: Remember that we need to be sure to be proactive about letting other WGs know about what we are doing. This also extends to the IETF

Tzviya Siegman: We cannot assume other WGs are following our work. We need to be proactive in letting them know what we are doing

Ivan Herman: We also need to ensure that our OWN documents are accessible

Ivan Herman: For accessibility, we will need to defer to Avneesh, Romain, Charles and others

Leonard Rosenthol: Security is one of our key horizontals.

Garth Conboy: Slides at

Leonard Rosenthol: Security is all about trust

Leonard Rosenthol: First is where is the content coming from, what is its origin (domain)

Ric Wright: OK. Works for me.

Leonard Rosenthol: How does this relate to ad hoc distribution of web publications?

Leonard Rosenthol: How does the origin aspect relate to WP?

Leonard Rosenthol: Protecting against attacks:

Leonard Rosenthol: Note that there are many overlaps between security and privacy. We will need to bear that in mind.

Ivan Herman: Some of these seem to be general Web security problems, which are not our purview. What is different?

Leonard Rosenthol: I will cover this tomorrow but there are differences.

Bill McCoy: We cover that today in EPUB where the origin is the root of the document

Leonard Rosenthol: But that is not a requirement, only a lowercase “must”; as we move to browsers as UAs, that significantly changes this problem

Brady Duga: In the web today we have third parties providing content which breaks the model.

Leonard Rosenthol: #3: Don’t surprise the user

Hadrien Gardeur: Google serves third-party documents from their own CDN and domain in the case of AMP

Leonard Rosenthol: We want to leverage the capabilities of the web, but will users be ready or willing to accept this?

Rick Johnson: Talking about publications, we don’t always know which part is the document and which is the UA.

Leonard Rosenthol: Yes

Bill Kasdorf: Isn’t it true that the user expects that the UA is sandboxing the document for the user?

Leonard Rosenthol: Yes

Leonard Rosenthol: @hadrien - funny you should mention AMP, I am talking about that tomorrow as a good example of security…

Tzviya Siegman: Again, we need to ensure that we work with other WGs, but that the work we are doing is specific to WP

Tzviya Siegman: But if we feel other WG are missing something we should work with them to fill that gap

George Kerscher: We have succeeded with the a11y folks in getting some publication recognition into the a11y spec. We need to continue to do so.

George Kerscher: But we need to ensure that our work doesn’t break any of the AG specs.

Bill Kasdorf: Ivan spoke of choosing a champion for the 4 areas, which is good, but perhaps we should have some more systematic connection to the other relevant WGs

Ivan Herman: True it is my job to do this, but it would help if the load could be shared

Ivan Herman: Dave helps out, DanielW will help, but any help would be appreciated

Bill Kasdorf: But perhaps it would help to make this more explicit

Brady Duga: One of the horizontals not listed is rights-management. It’s a tricky area but…

Tzviya Siegman: IDPF never considered it.

Brady Duga: Yes, but on the open web…

Avneesh Singh: Speaking of the plurality of groups, there are many a11y groups and perhaps it is too much for a single champion but instead a TF

Tzviya Siegman: Yes, but the champion would be the point person for a TF, perhaps

Ivan Herman: Perhaps “champion” is not a good name. More like the point person or something else.

Leonard Rosenthol: At the last TPAC there was a meeting with the POE group (permissions, etc.). Would they be a good contact re “rights”?

Ivan Herman: No, they are only about vocabulary

Ivan Herman: That will be relevant when we get to the metadata relevant to “rights”

Charles LaPierre: Would it be useful to have a chart that lays out how all these WGs intersect?

Ivan Herman: Yes, but it is a very complex graph. Charles, would you like to start that?

Charles LaPierre: Yes, I’ll look at it.

Tzviya Siegman: Who would volunteer?

Ivan Herman: I would be the “ambassador” to the i18n area

Avneesh Singh: A TF for a11y, I would be willing to lead that.

Tzviya Siegman: So avneesh is ambassador for a11y

Leonard Rosenthol: I will volunteer to be ambassador for security

Ivan Herman: There is no security WG in the W3C at the moment

Ric Wright: Oops. Meant “privacy”

Garth Conboy: Brady may well be excited about helping Leonard out on security.

Tzviya Siegman: Please note that Brady knows a lot about security and is willing to help.

Charles LaPierre: Benetech announces “Global Certified Accessible”

Garth Conboy: Start with Ric on testing and implementation plans

5. Testing, implementation plans

Ivan Herman: Ric’s slides at

Ric Wright: Only here because he doesn’t trust webex
… structured the talk more as a list of questions
… wants this to be more interactive, looking for input from the group
… Readium 2 to look at what might be the future of epub
… Hadrien and Laurent will discuss in more detail
… started in authoring
… epub authoring has been the Achilles heel
… epub 3 has been particularly terrible, no decent tool for it
… has written some epub 3s, but using only existing web tools and text editors

Leonard Rosenthol: Why is that bad?

Ric Wright: There is a chicken and egg problem. For instance, webgl - need to use a text editor
… there is no decent tool for it

George Kerscher: What RS supports webgl?

Ric Wright: Readium does!
… Doesn’t work as well in iBooks for instance, as they may muck with the css

Garth Conboy: Probably doesn’t work in other RS that may be quite large
… so people code to lowest common denominator

Tzviya Siegman: publishers don’t write epubs with these features because there are no tools and RSes don’t support them

Ric Wright: There isn’t much authoring support for epub 3, epub 4 will likely be more complex

Rachel: * 100% agree tzviya but have forgotten how to say that in IRC… +1?

Ric Wright: When we talk about testing, what are we testing?
… consumption side? That is the standard w3c approach
… but what about authoring tests? Traditionally ignored

Ivan Herman: testing also means whether the spec is consistent and precise
… so it is testing the spec as well
… Should be making sure the spec is implementable, and all the details are exposed and known to be implementable
… originally the whole point of testing was the spec
… and has now also become a test of the consumption apps (UAs)
… but need to remember that original reason for the requirement

Ric Wright: Are we inventing a spec that can actually be implemented?

Ric Wright: Who decides that the tests are written, run, and pass?

Ivan Herman: Usually honor system, self conducted tests
… which is why we require 2 independent implementations
… The fact that it is self interested parties is fine
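
The “two independent implementations” rule can be illustrated by tallying self-reported results per feature. This is a minimal sketch under stated assumptions: the function name, the report layout, and the feature names are all hypothetical, and it does not model whether two implementations are truly independent, which remains a judgment call for the group.

```python
def features_meeting_exit_criteria(results, minimum=2):
    """Map each feature to True if at least `minimum` implementations pass it.

    `results` maps an implementation name to a dict of {feature: passed}.
    """
    counts = {}
    for report in results.values():
        for feature, passed in report.items():
            # Record every feature, even one that nobody passes.
            counts.setdefault(feature, 0)
            if passed:
                counts[feature] += 1
    return {feature: n >= minimum for feature, n in counts.items()}


# Hypothetical self-reported results from two implementations.
reports = {
    "impl-a": {"manifest": True, "offline": True},
    "impl-b": {"manifest": True, "offline": False},
}

print(features_meeting_exit_criteria(reports))
```

In this made-up example the `manifest` feature has two passing implementations and `offline` has only one, so only the former would satisfy the criterion.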

Leonard Rosenthol: A lot of RSes use the same core engine. Do those count?

Ivan Herman: No known answer

Tzviya Siegman: Part of what we need to talk about as we write is test as we go
… don’t want FPWD and definitely not CR with no testing
… don’t want a unicorn spec
… Need to make sure that actual humans can implement this
… need a bunch of people familiar with testing to cover all the areas
… We talked about ambassadors before, do we need a testing one as well?
… Probably yes, maybe rkwright?

Rick Johnson: One problem with epub 3 was not being fully implemented. Self testing tends to be the stuff they support
… who decides what is a correct test?

Ric Wright: Atomic tests are nice, but they are really just testing the underlying UA.
… We need to test the more complex stuff to really test the RS

Hadrien Gardeur: Question about the number of testers
… what if the tests are in multiple languages? Is that multiple implementations?

Dave Cramer: I don’t think so

Ivan Herman: We are testing ourselves, so there is no reason to cheat
… we are doing it to convince ourselves
… if you can convince this group that the two implementations are different, then that is good enough

Hadrien Gardeur: In the context of Readium 2, one thing we are struggling with is a consistent enough set of examples

Tzviya Siegman: for an example of a small set of tests for DPUB-AAM, see

Hadrien Gardeur: having two groups, one writing the tests, the other working on integrations
… want to have people with knowledge of the spec writing the examples

Ivan Herman: Exactly right, and why we are discussing this now
… need to write the tests from the start
… Granularity of the tests?
… no simple answer
… in annotations, had to discuss exactly what was meant by a “feature”
… Did not define a feature as each term in the metadata
… defined it as groups of metadata
… What is it in this case? don’t know yet
… We should not test things that just test the browser engine
… that is irrelevant for this group
… we won’t repeat the 10K to 100K tests for css and html
… our goal is to add to that those things that make sense just for us

Micah Bowers: real world requirements of implementors become the test bed
… need to find out who out there has a mission to create these things and will work with us to do that
… If we are having resource problems then we aren’t getting the right people to help us

Tzviya Siegman: So open source?

Micah Bowers: not necessarily

Tzviya Siegman: Did this before and just collected results
… at the end had a question if that was good enough, without the code behind it
… Was ok, but may not be enough for exit criteria

Ivan Herman: Not sure what the scale is for us, less than CSS/html
… RDFa was several hundred tests, there was a web site where all the tests were available
… it ran the tests and provided a report, everything was public
… test site remained so new implementors could test their code

Ric Wright: At Readium, we were lucky to get a school of QA people to test for us
… Had 4 or 5 people at a time
… But results mainly just got stored somewhere
… Goals?
… adherence to the spec
… interop
… quality (find bugs)

Leonard Rosenthol: But that’s not a spec issue
… out of scope?

Tzviya Siegman: It’s a bonus!

Ivan Herman: We had spec testing for RDFa which helped implementors later

Ric Wright: Performance. Not really part of spec, but must be able to make a performant impl
… reliability
… a11y
… failure conditions. Need to test failure cases

George Kerscher: One of the objectives is to make sure UAs are out there when we go to rec
… when we test these RSes are we also testing a11y of them?
… Don’t think it was done for annotations.
… just adds another layer of difficulty

Garth Conboy: Tough nut to crack.
… Spec needs to be implementable with a11y, but product may be able to ignore it

Tzviya Siegman: Epub specs support a11y, but implementations don’t all provide it
… testing can’t really solve that

Dave Cramer: Can we add tests that check a11y?

Leonard Rosenthol: You are imagining a RS with a user on the other end. May just be extracting data, etc

Rick Johnson: I hear us looking at the qualifications for exiting CR
… pass/fail tests
… then I look at the screen and see things that aren’t reasonable for pass/fail for exit criteria
… are we going down a rat hole?

Ric Wright: Just asking questions here
… Need to find a good compromise between not enough and too much testing

Charles LaPierre: Need some tests for the horizontals, otherwise we have a spec that claims it supports x,y,z, but without tests how do we know

Brady Duga: ?

Tzviya Siegman: We do have checklists for the horizontals, which we have to work through

Ivan Herman: At the end we have WP check, then we have more than we need, but it is good for the community
… There is what we need for the group, then there is a little extra like performance, etc. We need to decide what the right mix is
… This looks really scary, but a lot is already covered by the other specs. We don’t need to test those
… just need to be careful to test what we define
… we do not test any html check. But epub 4 check should (when it comes out)
… because it will integrate those tests

Leonard Rosenthol: Why should we do that?

Ivan Herman: Because it is a useful service

Leonard Rosenthol: But not for exit criteria!

Ivan Herman: Right!
… one thing we need before tpac is what already exists and how we can leverage it
… then use tpac to discuss with other groups

Garth Conboy: How much is left?

Ric Wright: Tons!

Rick Johnson: The tests for the RSes, sound like a successor to the epub test grid
… should the business group be helping with it?

Bill McCoy: This is a joint meeting
… so when we say “we”, it isn’t just the WG
… so things like epub 4 check might not be exit criteria, but also not a “worry about it later”
… so we can get people from the other groups (who are here) to sign up for some work

Garth Conboy: Test grid is kind of stale
… epubcheck is also kind of stale
… Approach from different directions

Liisa McCloy-Kelley: Need to keep going so we have a reference platform
… almost no one sells what is actually provided
… so side loading doesn’t really help
… currently best way to test is to actually put it in the market and see what happens

Avneesh Singh: In general we seem to be talking about testing the spec … While testing in has been to test what was in the market … so this is a more constrained environment, where we develop specs, implement specs in some reading systems/user agents and test it. If we want to do specs testing for WG, then we need to revisit the mission of and figure out, how to make it suitable for specs testing.

Ric Wright: epub test suite not that useful. Mainly just regression testing
… automated tests hard to build and maintain, very expensive
… Unit tests are nice for the RS, but irrelevant for us
… BISG test suite, mainly a RS test
… doesn’t cover 3.1, lots of browser tests, nothing to do with RS
… doesn’t really test the spec

Garth Conboy: The purpose was for publishers to know what works on each reading system
… so a publisher would know that say, forms, doesn’t work on a particular RS

Ric Wright: We are testing 3 things: spec, and something else, and RSes
… Missing a lot of what we covered
… Missing fxl!
… limited Japanese tests, almost no Chinese, a little Arabic
… hard to verify as non-native speakers
… authoring: I like it, but don’t need to talk about it

Leonard Rosenthol: What even is testing authoring?

Ric Wright: Must test authoring systems against RSes. Does the authored content match the expected result?

Leonard Rosenthol: So more about testing the tool, not the spec

Ric Wright: Right, which is why I skipped over it

Leonard Rosenthol: How is this different than testing the files?

Ric Wright: It is making sure that the output is what the user expected to get out

Rick Johnson: Agrees with Leonard!
… irrespective of the workflow that creates the file, why do we care?

Ric Wright: We don’t! Stop talking about it!

Garth Conboy: css html tests don’t cover authoring?

Dave Cramer: correct

Leonard Rosenthol: There may be cases where it is relevant
… for instance, zip
… we don’t test the actual zipping
… if we move away from that we might want to test package creation

Ric Wright: Slippery slope! Packaging isn’t much different than content

Leonard Rosenthol: Does epub check validate the zip?

Ric Wright: To some extent

Garth Conboy: Does look at the signature

Dave Cramer: Checks other things specific to epub

Ric Wright: Exit criteria
… because of the nature of WPs, testing whether a feature works could be interesting
… might in practice be simple, but seems unlikely
… need a better test suite

George Kerscher: Do we need a test to take a WP to a PWP?

Leonard Rosenthol: Some may be under service workers
… but some may be ours

Ric Wright: Tastes like implementation to me

Ivan Herman: And in theory may not use service workers
… so we may not require it

Garth Conboy: We could define a PWP that you can’t create with a WP

Garth Conboy: (with a given set of technologies)

Garth Conboy: we either need to list the technologies explicitly required, or leave it up to implementors. Hopefully the latter

Ric Wright: defined the spec, have a bunch of tests
… at what point do we say this is so slow we can’t say it is complete

Ivan Herman: Judgement call. But if it is that bad it is a spec bug

Ivan Herman: If one is fast and one is slow, then it isn’t our problem

Ric Wright: If we have 2 implementations, do they have to be perfect?
… no bugs?

Ivan Herman: Yes, no bugs!

Ric Wright: These tests tend to be very atomic, so may not be too bad

Dave Cramer: Yes, they should be pass/fail

Ric Wright: Need a champion

Ivan Herman: Reach out to a group that lives on testing
… testing is what they do
… Shane is there, reaching out to them
… Annotation wg was saved by them
… org is Specops
… problem is Shane knows how to handle the testing side, but doesn’t necessarily have the domain background
… so may need co-ambassadors

Ivan Herman: Shane and someone with domain knowledge

Ric Wright: I am willing to start the process

Ric Wright: not sure how I will do my other job (Readium)
… looking at BillM

Garth Conboy: Ivan, anything else for “these” other topics?

Ivan Herman: No, covered it all

Garth Conboy: Agenda wise we move input technologies until after the break
… take early break now, back in 15 minutes

Leonard Rosenthol: And sugary snack is here

Garth Conboy: Wrong syntax: 15-min break

6. input technologies

6.1. Web Publications

Garth Conboy: looking into input tech: wp, pwp, epub 4, aria, browser manifestations, web app manifests
… starting with tzviya, continuing tomorrow with other technologies

Garth Conboy: Digital Publishing WG Doc:

Tzviya Siegman: we had some publications in the interest group, will do quick overview of input documents
… what’s a web publication? wp is a publication that lives on a browser
… won’t use word “package” because not necessarily packaged like epub

Garth Conboy: The logo possibilities are wonderful!

Tzviya Siegman: got started because we were frustrated with how things functioned in epub land
… trying to have something that works when you open them in a browser, more than just a regular document, but a collection of linked documents
… collected together and with a relationship, work TBD in this group
… packaging is important if we want to share publications; also need linking mechanisms to reference other publications
… metadata is a huge part of epub and needs to be in WP as well
… important for distribution, need to look at things like URI and IRI and find out how these work in the world of publications
… need to consider how ISBN is expressed in the world of the web
… we love the word “manifest” here; we don’t always mean the same thing, but we need to figure out how to get to that in web publications
… the manifest for our purposes might just mean a list of things included in a publication, and that binds it together

Leonard Rosenthol: @dauwhe - let’s start a support group :)

Tzviya Siegman: WPs also need to be styled, which might be work for the CSS WG, but we talked a bit about how to control this in WP and allow for different presentations

Ivan Herman: there should be a mechanism like service workers to also provide offline access
… a bridge would have to be built between the info that characterizes a web pub and what characterizes a service worker
… service workers are now in a separate WG from the Web Platform WG

6.2. Packaged WP

Garth Conboy: talking a little bit about packaged web publications (PWP)
… and how epub 4 might play into it as well
… PWP should be “renderable” offline, but most of it might go down when you’re offline
… whether this is done by service workers or another system is TBD, there are many valid ways of accomplishing this
… may be a long while before someone like PRH puts a web publication online and allows offline access or distributes a file to third parties or distributes on a flash drive
… there’s an example of this in epub (zip package), so that is a possibility
… there is work on web packaging, done in early 2000s when we were talking about what the packaging format should be
… switched to what became zip, ocf, etc. in epub – Bill M’s fault

Leonard Rosenthol: need to look at whether definition of the packaging format belongs in pwp or individual profiles like epub 4, for backwards compatibility
… look at the option of having the flexibility of different packaging formats

Garth Conboy: Completely agrees with Leonard!
… the definitions for packaging format have evolved, and epub 4 could be different constraints on what a PWP is
… but we could also say that epub 4 can use a different method, in theory; there is some flexibility about how we approach it
… pwp and epub 4 are currently set up as somewhat separable

Bill McCoy: it was the IDPF’s fault, really
… it was a group effort, but we chose zip over multi-part mime because the latter was mostly used for email; the situation is almost the opposite now

Garth Conboy: the issue of security for a PWP that doesn’t necessarily have a domain associated with it is also an interesting problem
… do we go no further than we did with epub or explore it more?

Ivan Herman: the only thing we can say is that this group will not define a new packaging format
… that is an important thing; we will not define something that is neither zip nor whatever; need to rely on packaging formats defined somewhere, whether it is w3c work for a new rec (?) or zip

Garth Conboy: (yes, I see folks on the queue)

Leonard Rosenthol: i don’t disagree, but need to clarify if w3c won’t allow, or if this is based on experience

Ivan Herman: a little bit of both
… if there’s work in w3c on a technology, we would not be allowed to replicate the work
… at this moment there is open work for web packaging, so we should not do it
… one of the reviewers in the charter process said he doesn’t understand why we need the separate recommendation in the first place, because we will have packaging already defined somewhere else

Garth Conboy: And somewhat disagreeing with Dave (back a bit): I will continue to harp on some level of round-trip-ability between EPUB3.x and EPUB4.

Ivan Herman: we may need a 1-page document to specify, e.g., metadata, like in epub
… may be that the pwp spec will be no more than one page

Brady Duga: we will just use something that is already there like zip or web package? zip is not a packaging format
… will we use the epub definition from IDPF directly and not modify it?

Garth Conboy: paraphrasing ivan to say we will not do an archiving format

Ivan Herman: we may not simply refer to IDPF format, because, e.g., we might not want those files in XML; need to modernize
… we would not define a compression or archiving format
… I agree with what Garth said… this time

Rick Johnson: while our document specifies we will create a profile of PWP, will we possibly create others?

Garth Conboy: we will not rule it out but we are not chartered to do it

Tzviya Siegman: the language is flexible, intentionally vague, but we have four defined deliverables

Ivan Herman: we may need to define this in a separate document

Bill Kasdorf: new technologies can come along that define these things

George Kerscher: two thoughts: tzviya mentioned we won’t talk just about books, but it’s not a sinful thing
… we can bring publishing industry into the fold
… point two: when we talk about a profile of PWP with more accessibility requirements, I intend to lobby for WP to be fully accessible, so be ready!

Garth Conboy: next thing on agenda– leonard wanted to talk about next-gen pdf

Leonard Rosenthol: there’s an organization called PDF association; int’l org of 150-ish members representing companies, universities, vendors, etc.
… that have some amount of interest in PDF
… been working for about 9 months on what we are now calling “next-generation PDF”, aka PDF next, responsive PDF, etc.
… it is not a revolution and is not just for books, though we love books!
… attempt to bring PDF and OWP closer together by leveraging existing features (mathml, SVG, etc.), so we can present a classic PDF representation and also a more responsive web representation

Ric Wright: NextGen PDF talk by Leonard

Leonard Rosenthol: ideas of what may go into a manifest, accessibility, etc., are also in the picture; we may use this as a profile of PWP in the future – there is a lot of alignment between next-gen PDF and WP/PWP

Rick Johnson: preso about Next Generation PDF in the first session at

Garth Conboy: Ric posted a youtube video of your presentation on IRC

Ivan Herman: what would be in that profile? or, why would it be different from PWP?

Leonard Rosenthol: in my vision, i believe PWP defines things like “here’s what has to be there”, the natural reading order, certain types of metadata–things that are common across all profiles
… profiles then define their own packaging formats, manifest types, etc.
… might decide certain things like metadata is in json-ld and all WPs need to use that–still an open question
… if we keep packaging out of PWP spec and leave it up to profiles, it enables us to make more useful profiles in future use cases

Ivan Herman: understood, but what i want to understand is: why can’t we say that what you want to achieve with PDF-next is exactly the same as PWP or epub 4?
… why have two doc formats when we can produce one that works for everyone?

Leonard Rosenthol: from our constituency, need backwards compatibility with existing PDF, just like bc is important in epub community
… if we can have a rainbows/unicorns solution, that would be great, but my position is that we have a community that needs backwards compatibility

Tzviya Siegman: even if epub 4 is backwards compatible, it does not mean PWP will be; we don’t know this yet

Leonard Rosenthol: if we think of profiles as relative to their constituencies (epub, PDF, etc.), that’s fine

Garth Conboy: probably something like a json manifest would be higher level, like in a WP spec

Ivan Herman: at this point, goal would be to try to get to a point where difference between EPUB 4 and PDF-next (in vague terms) disappears
… we have the opportunity to do that, but we might not be able to; we have to minimize the differences, but if we could achieve one format lots of people would be very happy

Hadrien Gardeur: in Readium-2, the PWP part of it is simply a naming convention for the manifest in a package

Bill McCoy: elephant in the room is that PDF is not a packaged format, so whether PWP can be jammed into this kind of format, we don’t know yet
… at the moment, we need to consider if it’s worth stretching PWP to allow a paginated format like PDF

Garth Conboy: it’s good to acknowledge this now

6.3. dpub aria

Garth Conboy: now mattg will discuss dpub-aria

Matt Garrish: ARIA is Accessible Rich Internet Applications, means for user agent and AT to communicate with each other so AT can present content to user

Leonard Rosenthol: PDFNext->Next Gen PDF

Matt Garrish: talks to OS APIs and builds a representation (accessibility DOM)
… ARIA was a bridge between interactive components so they could still be used by ATs
… also includes document structure like landmarks, which is key for this group–the semantics that allow a user to navigate around
… we had the epub:type attribute and a vocabulary, but one of the issues is we don’t have mappings and user agents don’t understand it
… it’s useful for landmarks, but doesn’t have high binding with AT
… a question was how we can find an OWP-friendly mechanism for doing semantics?
… trying to come up with a specification and vocabulary for industry semantics in dpub-aria
… what we did was outline fundamentals/essential semantics first, though we need to decide exactly what these criteria are
… we need to meet the 2 implementations exit criterion, so we can’t just prescribe the specifications; publishers need to actually use them
… originally envisioned this going straight into ARIA; now we have an extension mechanism (reason for “doc-” prefix)
… we talked to different groups working on APIs, can now identify note references, landmarks, etc., in current dpub-aria spec
… but we need to figure out the next iteration of landmarks specifically, and extend the vocabulary

Garth Conboy: this is my first time looking at aria spec. looks influenced by epub vocabulary

Matt Garrish: yes, then got worked a little further by ARIA WG

George Kerscher: the accessibility aspect i totally get; does this help publishing and using HTML as a master format for internal production?

Tzviya Siegman: yes, we use it at wiley … no longer need to create classes in CSS to identify something
… can use roles (e.g., role="doc-abstract") and don’t need to create something artificial
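
The point about roles replacing ad hoc CSS classes can be illustrated with a small sketch. It assumes the "doc-" prefixed role names from dpub-aria (doc-abstract, doc-noteref); the sample markup and the `DpubRoleCollector` class name are made up for illustration and are not part of any tool mentioned in the discussion.

```python
from html.parser import HTMLParser


class DpubRoleCollector(HTMLParser):
    """Collect (tag, role) pairs for elements whose role uses the "doc-" prefix."""

    def __init__(self):
        super().__init__()
        self.roles = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "role" and value:
                # A role attribute may carry several space-separated tokens.
                for token in value.split():
                    if token.startswith("doc-"):
                        self.roles.append((tag, token))


# Hypothetical sample markup, for illustration only.
SAMPLE = """
<section role="doc-abstract"><p>Summary of the chapter.</p></section>
<p class="note">A class name alone carries no semantics for assistive technology.</p>
<a href="#fn1" role="doc-noteref">1</a>
"""

collector = DpubRoleCollector()
collector.feed(SAMPLE)
print(collector.roles)  # [('section', 'doc-abstract'), ('a', 'doc-noteref')]
```

The class-only paragraph is not reported, which is the practical difference: a role is a shared vocabulary term that tools can look for, while a class name is private to one stylesheet.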

Matt Garrish: moving forward we don’t need to map everything, so we don’t overload documents with so many semantics that it impedes rendering

George Kerscher: one of our deliverables is a 2.0 version of this vocabulary, and we can’t have 10k options

Tzviya Siegman: anything we add to the vocabulary needs to be added to HTML later, but don’t need to create API mapping document as mattg said

Charles LaPierre: when this was done, we needed two implementations by publishers, but don’t need implementations for AT?

Matt Garrish: yes, for AAM particularly

Tzviya Siegman: we had API mappings done by Microsoft, Gecko … Apple was first one to do it and helped us with testing

Laurent Le Meur: the body matter is not expressed here yet?

Matt Garrish: no because it could conflict with other semantics. need to see how body matter is split. didn’t add to first version yet, we needed more time to figure it out, but we could do so in the future

Laurent Le Meur: who chose “doc” prefix?

Matt Garrish: i think i did! but we had other variants like “dpub”

Tzviya Siegman: to see implementation report:

Ivan Herman: that was pushed back because the roles should be usable by those who are not publishers

Leonard Rosenthol: good that it addresses things other than HTML documents, we can use it elsewhere

George Kerscher: is there an existing AT that uses this?

Tzviya Siegman: yes, fully implemented in webkit, VoiceOver, and there are some other partial implementations

George Kerscher: this is very useful

Ivan Herman: the difference in this group is that we define it and publish it, but it is not the ARIA version
… still need to coordinate with others, but we have to determine who; aria wg determines the mechanism, which is already defined, so we don’t need aria wg anymore
… only need to confirm with HTML groups?

Tzviya Siegman: yes, for validation… if we have a role that is not mapped, it would override the semantics of the HTML element
… let’s start with aria group before HTML

Ivan Herman: yes, as a courtesy, but they would just check with HTML group as well; we can do that directly
… are we free to determine our own terminology and validation to meet our exit criteria? my understanding is that if we have a properly defined document, the values can be added to the validator as just another step

Matt Garrish: we need to formally state that we can define our own vocabulary

Ivan Herman: yes, the charter says that, and this is not a joint deliverable with ARIA; it is our deliverable

6.4. browser friendly format

Garth Conboy: shuffling next agenda items and go to browser friendly manifestations with dave

Dave Cramer: browser friendly format! (BFF)
… a little history: if there’s an epub on a server and you use a browser to access it, it will just download the file instead of displaying it
… one of the missions of EPUB 3.1 WG was decreasing distance between epub and OWP
… an obvious thing to do is unzip/explode an epub on a server so a browser can display it
… turns out readium was already doing this internally
… even further back in history, there was an alternate epub format that wasn’t even zipped
… so we started exploring idea of unzipping epub and making content directly available to web browser
… but so much of the structure of epub is encoded in custom XML vocab, which browsers don’t know
… e.g., container.xml, various namespaces, etc.
… thought was we can take the information from these files and put in an easier format for browsers/dev to use them
… the idea evolved to have this exploded epub with an alternative serialization for the structural/metadata info
… and started experimenting with json, which is useful for JS, but almost impossible for humans to use
… hadrien and i went through details of the json, e.g., to avoid duplication in defining sequence in manifest, nav document, etc.
… but we needed to consider whether it should support every idea the IDPF had
… Hadrien took off with the idea and built things with it
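
To make the "custom XML vocab" point concrete, here is a rough sketch of the first step a reading system has to take with a zipped epub: open the archive and read META-INF/container.xml to find the package document. The archive is built in memory and its file names (EPUB/package.opf) are hypothetical; the helper names are also invented for this sketch.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
CONTAINER_PATH = "META-INF/container.xml"


def find_package_document(epub_file):
    """Return the full-path of the first rootfile listed in container.xml."""
    with zipfile.ZipFile(epub_file) as archive:
        root = ET.fromstring(archive.read(CONTAINER_PATH))
    rootfile = root.find(f"{{{CONTAINER_NS}}}rootfiles/{{{CONTAINER_NS}}}rootfile")
    if rootfile is None:
        raise ValueError("container.xml lists no rootfile")
    return rootfile.get("full-path")


def build_sample_epub():
    """Build a tiny in-memory archive; the file names are made up for this sketch."""
    container = (
        '<?xml version="1.0"?>'
        f'<container version="1.0" xmlns="{CONTAINER_NS}">'
        '<rootfiles><rootfile full-path="EPUB/package.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr(CONTAINER_PATH, container)
    buffer.seek(0)
    return buffer


print(find_package_document(build_sample_epub()))  # EPUB/package.opf
```

A browser does none of this on its own, which is why an exploded layout with a JSON description of the same information was attractive.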

6.5. readium’s bff and readium-2

Ivan Herman: Slides at

Laurent Le Meur: a little history on BFF on readium: the initial scope was EPUB 3.1, and needed better format for browsers
… using or json-ld to refine manifests for BFF

Laurent Le Meur: IPTC has set of XML structures for expressing news documents
… gives a full ontology of a vocabulary, including for books
… the web publication manifest is a further refinement of BFF
… Readium-2 evolves the Readium reading engine, using RESTful API for web publication manifests, platform-native languages, OWP technologies, and designed to be accessible
… several companies are involved

Hadrien Gardeur: want to point out we’re working on internal plumbing, not a distribution format at all
… the BFF work turned out very useful for Readium-2; when we started this project, we realized quickly that BFF was a good starting point
… core ideas: manifests tend to be long lists, but this is not the influence here; the influence was hypermedia formats, where building blocks express something simple or something very complex
… involves concepts of metadata, linked objects for discoverable interactions (i.e., via REST), and collections
… everything must have basic requirements, but must be powerful and extensible
… single requirement in this manifest model is that we need a title element, expressed as a single string
… using JSON-LD contexts to map, e.g., with
… we can express something fairly complex with simple expressions linked to definitions
… for example, expressing author names in different scripts or languages
… most important part is the idea of core extensibility
… don’t necessarily consider “epub” as first-class, but consider different parts of epub as extensions
… can use different context documents to express different elements depending on the implementation’s needs
… harnessing JSON-LD, can also use URIs to point to other vocabularies, and easily extend the metadata
… this same idea also applies to link objects
… the only requirement is that there is a self link so the manifest can always be found
… started with idea that if something is on the web, it needs a URI for access; so we need to provide that URI in the link object
… in epub we tend to reinvent the wheel, so we have one element for a spine item, another one for the manifest, etc.
… the concept of a link object works in many places, instead of reinventing the wheel in different contexts
… also useful for URI templates
… using URI templates you can add as many services as you want and extend it easily
… manifest should also be a way to discover different services, e.g., that a publisher can provide
… such as a search engine, index, etc.
… we also want to have good support for other media, not just text
… can add basic additional information about different asset types, such as dimensions, duration, and so on, so browsers can select the most appropriate asset depending on your device
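
The idea of discovering services through templated link objects could look roughly like the sketch below. The key names (`rel`, `href`, `type`, `templated`), the example.org URL, and the `expand` helper are assumptions made for illustration; only plain `{name}` substitution is handled, not the full URI template syntax.

```python
import re
from urllib.parse import quote

# Hypothetical link object advertising a publisher-provided search service.
search_link = {
    "rel": "search",
    "href": "https://example.org/book/search/{query}",
    "type": "text/html",
    "templated": True,
}


def expand(link, **variables):
    """Expand simple {name} expressions in a templated link's href."""
    if not link.get("templated"):
        return link["href"]

    def substitute(match):
        # Unknown variables expand to the empty string; values are percent-encoded.
        return quote(str(variables.get(match.group(1), "")), safe="")

    return re.sub(r"\{(\w+)\}", substitute, link["href"])


print(expand(search_link, query="moby dick"))
# https://example.org/book/search/moby%20dick
```

Because the service is just another link object, adding an index or any other service means adding one more entry rather than a new element type.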

Leonard Rosenthol: why mix metadata with a link? why not have a link to the metadata for that object rather than directly including it?

Hadrien Gardeur: we could do that, as IIIF does, which finds metadata first. you don’t necessarily want to fetch the resource before you know more info about it

Leonard Rosenthol: yes, but why include the metadata and the link together upfront?

Hadrien Gardeur: it is easier to manipulate the JS object like this
… the idea is to have something easy to work with
… so we have some first-class metadata/properties
… can also extend link objects by listing multiple values for one key in JSON data
… some examples: multiple rels, media types, and properties, where metadata can be stored; two properties are default part of vocabulary, orientation and one other i can’t remember now

Takeshi Kanai: this looks like a new metadata scheme
… some of the properties can be delivered as properties of the vocabulary element
… if you have no intention to reinvent the linked data scheme, we should just use the LD specification

Ivan Herman: in some sense, we come back to leonardr’s question: if we have metadata and in the metadata we use, it seems misleading/unnecessary to separate this scheme from what’s already in
… why separate all these things? why don’t we say we have all the metadata already available in, but then only specify outliers in the JSON?

Hadrien Gardeur: the purpose is to balance the information available externally on, along with usability of the data

Ivan Herman: in the world of web publications, i would prefer to have as homogeneous a structure as possible. we could use but it would be good to have one homogeneous structure.

Hadrien Gardeur: i think that would be sub-optimal for our use case

Leonard Rosenthol: that’s fair, but we should value purity of semantics

Ivan Herman: not necessarily about purity of semantics; using this format for your plumbing is fine, but we have to find a balance between the data and clarity of the metadata in the package

Garth Conboy: we ought to treat this as an input document but not necessarily a proposal of our exact semantics; we will have lots of other opportunities to debate

Laurent Le Meur: even a discussion about the syntax would be useful

Hadrien Gardeur: a lot of the epub-specific stuff is not accounted for in

Tzviya Siegman: lives in W3C, so should look at that work

Leonard Rosenthol: there are other standards that we can look at, considering, for example, publishers with a lot of metadata, who want to incorporate the data in their publications

Bill Kasdorf: we can use different schemas in the same format

Hadrien Gardeur: that’s the goal. let them use what they already use

Bill McCoy: this is super general. was there nothing this general that already existed?

Hadrien Gardeur: not really. most hypermedia models that already exist have missing components, e.g., metadata definitions
… the final principle is the idea of “collections”
… the concept is pretty simple; only one requirement: need one collection (e.g., “spine”) with one resource
… the collections use the same linked object format; no reinventing the wheel
… spine vs resources: no need to repeat yourself; spine contains reading order, and resources are linked objects pointing to assets needed for rendering, but not necessarily in reading order; no need to repeat spine elements and resource elements

Leonard Rosenthol: are all hrefs absolute?

Hadrien Gardeur: yes, it’s much easier to work with absolute URIs
… but nothing forces us to use absolute URIs
… a collection in this model can also be nested: it can include a role, metadata, links, etc., but it can include sub-collections as well
… finally, a minimal manifest has a context declaration, a title, a self link, and one resource in the spine
… can also provide text direction
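
The four requirements listed for a minimal manifest can be sketched as a small check. The JSON key names (`@context`, `metadata`, `links`, `spine`), the context URL, and the `missing_requirements` helper are assumptions for illustration; the minutes only state which pieces must exist, not how they are spelled.

```python
import json

# Hypothetical minimal manifest: context declaration, title, self link, one spine item.
minimal_manifest = {
    "@context": "https://example.org/context.jsonld",
    "metadata": {"title": "Moby-Dick"},
    "links": [
        {
            "rel": "self",
            "href": "https://example.org/moby-dick/manifest.json",
            "type": "application/json",
        }
    ],
    "spine": [
        {"href": "https://example.org/moby-dick/c001.html", "type": "text/html"}
    ],
}


def missing_requirements(manifest):
    """Return a list of the minimal requirements the manifest fails to meet."""
    problems = []
    if "@context" not in manifest:
        problems.append("context declaration")
    if not isinstance(manifest.get("metadata", {}).get("title"), str):
        problems.append("title as a single string")
    rels = [link.get("rel") for link in manifest.get("links", [])]
    if "self" not in rels:
        problems.append("self link")
    if not manifest.get("spine"):
        problems.append("at least one resource in the spine")
    return problems


print(missing_requirements(minimal_manifest))   # []
print(missing_requirements({"metadata": {}}))   # all four requirements reported
print(json.dumps(minimal_manifest, indent=2))
```

The spine entry reuses the same link-object shape as the self link, which is the "no reinventing the wheel" point made earlier in the presentation.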

6.6. Readium-2 internal structure

Laurent Le Meur: in readium 2 we have streamer and navigator
… work of the streamer is to pass the content to the navigator

Leonard Rosenthol: +1 to @mateus!

Laurent Le Meur: navigator takes the information and puts it on the web view
… the streamer has several flavours - golang is used in web application
… streamer and navigator are two parts on the same server
… streamer can take epub 2 and 3, cbz and later pwp or epub4, etc.
… with one piece of code you can handle all
… relies on caching
… navigator exists in swift, java and typescript - it takes the web manifest and paginates
… nypl implementation makes use of service workers to take content offline for reading
… plug-in system allows different navigators even on the same system
… we are finalizing the streamers and then will work on the navigators
… by end of 2017 we want to finalize the navigators
… is a current implementation
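
The streamer/navigator split described above can be sketched as two small classes sharing one manifest model. This is a loose illustration under heavy assumptions, not Readium-2 code: every class, function, and field name here is hypothetical, and the "parsers" take already-unpacked data instead of real epub or cbz files.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Shared internal model handed from the streamer to the navigator."""
    title: str
    spine: list = field(default_factory=list)


def parse_epub_like(raw):
    # `raw` stands in for an already-unpacked EPUB: a title plus ordered documents.
    return Manifest(title=raw["title"], spine=list(raw["documents"]))


def parse_cbz_like(raw):
    # `raw` stands in for the image file names found in a comic-book archive.
    return Manifest(title=raw["name"], spine=sorted(raw["images"]))


class Streamer:
    """Turns a source publication, whatever its format, into a Manifest."""

    def __init__(self):
        self._parsers = {}

    def register(self, extension, parser):
        self._parsers[extension] = parser

    def open(self, filename, raw):
        extension = filename.rsplit(".", 1)[-1].lower()
        if extension not in self._parsers:
            raise ValueError(f"no parser registered for .{extension}")
        return self._parsers[extension](raw)


class Navigator:
    """Walks the spine of a Manifest; knows nothing about the source format."""

    def __init__(self, manifest):
        self.manifest = manifest
        self.position = 0

    def current(self):
        return self.manifest.spine[self.position]

    def next(self):
        # Advance to the next spine item, stopping at the last one.
        if self.position < len(self.manifest.spine) - 1:
            self.position += 1
        return self.current()


streamer = Streamer()
streamer.register("epub", parse_epub_like)
streamer.register("cbz", parse_cbz_like)

book = streamer.open("novel.epub", {"title": "Novel", "documents": ["c1.html", "c2.html"]})
comic = streamer.open("comic.cbz", {"name": "Comic", "images": ["p2.png", "p1.png"]})

for manifest in (book, comic):
    navigator = Navigator(manifest)
    print(manifest.title, navigator.current(), navigator.next())
```

The navigator only ever sees the manifest, so supporting a new input format means registering one more parser with the streamer.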

Hadrien Gardeur: they create a static version of the manifest - and web app manifest so you can add to your home page
… using appcache and service workers because sw are not yet implemented on ios

Laurent Le Meur: have a prototype of readium, aldiko to build r2-based mobile apps by end of year

Ivan Herman: i’m trying to understand what is the takeaway for the group - one thing is the work on manifest affects how we implement web publications - using or not using json-ld and how it works with web manifests, etc are all questions we need to answer. how flexible is the code in terms of demonstrating a test implementation?

Laurent Le Meur: if the group adopts the same vocabulary and model then you have a reference implementation in readium; the second option, if you don’t, is that we could ingest it - we keep our internal model so it won’t be a perfect model

Hadrien Gardeur: when work started on bff we wanted something roundtrippable - we have shown that it can be done with a json manifest - only requires one json manifest and not all the files that epub uses
… we use this not just as a representation but as an internal model - two core building blocks: ingestor is all about building blocks - should be able to work with other formats - the navigator needs to work with the internal model and whatever pwp offers - pwp will always be consumed by the streamer part

6.7. web app manifests

Ivan Herman: slides available at

Romain Deltour: web application manifest is part of progressive web apps - try to reproduce with web technologies what you get with native apps
… web manifest provides the ability to install the app on your device
… is a simple json document that you can link from your web page
… example has lang, short_name, name, description, icons, etc.
… google recognizes that you can add the site to your home screen when the manifest is found
… can identify the background color and links to related apps in the store
… the start url is launched when you open the app
… the scope defines which URLs of your web site are part of your application
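
A rough sketch of the members just listed, using the names from the web app manifest draft (lang, short_name, name, description, icons, background_color, start_url, scope). The values, the example.org URLs, and the `in_scope` helper are invented for illustration, and the scope check is a simplified same-origin, path-prefix test, not the draft's full algorithm.

```python
import json
from urllib.parse import urljoin, urlsplit

# Hypothetical location of the manifest; relative members resolve against it.
manifest_url = "https://example.org/reader/manifest.json"
manifest = json.loads("""
{
  "lang": "en",
  "short_name": "Reader",
  "name": "Example Reader",
  "description": "A hypothetical reading app.",
  "icons": [{"src": "icon-192.png", "sizes": "192x192", "type": "image/png"}],
  "background_color": "#ffffff",
  "start_url": "index.html",
  "scope": "/reader/"
}
""")


def in_scope(url, manifest, manifest_url):
    """Same-origin, path-prefix check; a simplification for illustration."""
    scope = urlsplit(urljoin(manifest_url, manifest["scope"]))
    target = urlsplit(urljoin(manifest_url, url))
    same_origin = (scope.scheme, scope.netloc) == (target.scheme, target.netloc)
    return same_origin and target.path.startswith(scope.path)


start = urljoin(manifest_url, manifest["start_url"])
print(start)                                                            # https://example.org/reader/index.html
print(in_scope(start, manifest, manifest_url))                          # True
print(in_scope("https://example.org/store/", manifest, manifest_url))  # False
```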

Ivan Herman: does this mean any resource in the scope will be made available offline?

Romain Deltour: it is up to the developer whether to make available offline

Hadrien Gardeur: you have a different scope when you define a service worker - this scope is not the same

Romain Deltour: the orientation property defines the default orientation of the app
… display mode specifies the preferred mode: fullscreen, standalone, minimal-ui, browser - potentially could be a reading system mode

Tzviya Siegman: web app manifest is possibly extensible - we can work with them on extending

Ivan Herman: we may need to step up as potential co-leaders to move this forward and add the features that we need
… web app manifest is only a working draft at this time

Leonard Rosenthol: tomorrow i’m going to talk about w3c packaging

Ivan Herman: web app manifest is not bound to progressive apps - different emphases between the two - we need to find a balance

Leonard Rosenthol: and web packaging uses web app manifest as a basis for its metadata

Hadrien Gardeur: there is a lack of abstract model and extensibility in the web app manifest model

Tzviya Siegman: that’s something we can work with them on - was only a first draft

Ivan Herman: we can’t push json-ld on them

Garth Conboy: we need to research and see how practical it is

7. In Memoriam Pierre Danet

Bill McCoy: one of the former IDPF board members - Pierre Danet - passed away recently - i want to take some time to honour him today - he was a super-influential person in the formation of this effort - it would not have happened without his support as he kept things rolling even when times were tough - i would like to honour his memory by trying to accomplish his goals: to create an open standard and to innovate
… in one of his last emails he said we need to revolutionize the reading experience - we have to forget the traditional turn page and reinvent reading - I encourage us as we move forward that we focus on the future - this is our chance to evolve publishing for the future, not just about the details or moving forward what we’ve already done
… the boundaries between books and apps are converging - where do we want to get with this
… i’m very sad to not have him here as a friend and colleague

Matt Garrish: [One minute silence in the room]