DPUB IG - EPUB WG Joint Meeting on manifests

27 Jun 2016

See also: IRC log


Dave Cramer (dauwhe), George Kerscher, Luc Audrin, Avneesh Singh, Markus Gylling (mgylling), Leonard Rosenthol (lrosenth), Romain Deltour, Takeshi Kanai, Ben De Meester (bjdmeest), Bill Kasdorf, Daniel Weck, Makoto Murata, Ric Wright, Garth Conboy


tzviya: let's get started… a short agenda was sent out earlier today. It's a pretty loose agenda anyway. Dave will offer an overview of BFF, then Ben on DPUB locators. We have a hard stop at the hour.

dauwhe: basic history in case people are unfamiliar… BFF has been a side project of the EPUB 3.1 WG. It started out as a 3.1 deliverable, but given timing and ambition level we decided to put it on a separate track
... origins go way back. The fundamental idea is that EPUB is not the most natural thing for a web developer to deal with, or anyone else for that matter… a zip object with twists and custom XML vocabularies, and so over the years there’s been some interest in how we can make it easier to understand and use. There’s been some historical precedent: there was a file system container in EPUB in 2010, and some RS like Readium already work with what they call “exploded versions”, where everything is unzipped, and with JSON structures. EPUB Zero was also a set of thought experiments on simplification. And so all those threads came into it

<dauwhe> https://github.com/dauwhe/epub31-bff

dauwhe: within 3.1 there’s an unofficial github repo

<lrosenth> (whenever Dave hits a breaking point)

dauwhe: where the existing work has been done, major players have been Hadrien and myself. The basic idea is to unzip the EPUB and use a JSON manifest that conveys the same information as classic EPUB
... as for the relationship between BFF and PWP: many of us are working in both corners, and I see it as experiments vs theory
... in one way this is about playing around with the PWP ideas and seeing how they look in reality
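
[Illustration, not part of the call: a minimal sketch of what "a JSON manifest that conveys the same information as classic EPUB" might look like for an unzipped publication. The key names (`metadata`, `spine`, `resources`), the function name and the sample file names are assumptions made for this example; they are not taken from the epub31-bff repository.]

```python
import json


def build_manifest(title, language, reading_order, other_resources):
    """Build a JSON-serializable manifest for an unpacked publication.

    All key names here are illustrative assumptions, not the BFF vocabulary.
    """
    return {
        "metadata": {"title": title, "language": language},
        # Ordered content documents (the role played by the EPUB spine).
        "spine": [{"href": href, "type": "text/html"} for href in reading_order],
        # Everything else the publication needs (stylesheets, images, ...).
        "resources": [
            {"href": href, "type": mime} for href, mime in other_resources
        ],
    }


manifest = build_manifest(
    title="Moby-Dick",
    language="en",
    reading_order=["c001.html", "c002.html"],
    other_resources=[("style.css", "text/css"), ("cover.jpg", "image/jpeg")],
)

# Serialize as it might be stored next to the unzipped content.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

[The point of the sketch is only that reading order and resource list, which classic EPUB expresses in its XML package document, fit naturally into a small JSON structure.]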

lrosenth: the way you described it, it is really about authors, but what you didn't speak about is what the name seems to imply

dauwhe: I used web developer as a synonym for a RS
... the goal is to reduce the distance between those concepts to zero

lrosenth: are the two sides equal?
... philosophically I am trying to understand where the BFF group is coming from

dauwhe: a tough question, philosophically I am coming from many directions, a desire for simplicity. The EPUB ecosystem can be confusing, a lot in there that exists for good historical reasons but still confusing to newcomers
... how can we make it simpler, how can we make it closer to the web of today and tomorrow

[markus got dropouts but is back now]

<dauwhe> https://github.com/dauwhe/epub31-bff

<tzviya> https://github.com/dauwhe/epub31-bff#example-1-omitting-linked-data-and-other-enhancements

dauwhe: example one from the github repo is a good example

garth: one should look at where EPUB is today and see that it is pretty successful in trade; one would hope that BFF would get us to a place between the packaged format and the web. Roundtrippability is a core feature

lrosenth: as long as we keep it within the EPUB context it makes a lot of sense. What I don't understand is what the context is in relation to PWP

ivan: garth just said that in the case of BFF, the roundtripping with classic EPUB was very important. In PWP we don't have that requirement [???]

tzviya: to summarize, BFF is looking for a browser friendly format with roundtrippability with classic EPUB. Let's now talk about PWP Locators

ben: we were looking for how to do common locators for different states
... without needing to know the state you are in at the moment, and transfer that locator to other users and everything keeps working

<bjdmeest> http://w3c.github.io/dpub-pwp/#state_definition

ben: we were talking about two dimensions of states: protocol and file access, as well as packed or unpacked

<bjdmeest> http://w3c.github.io/dpub-pwp/#locator-pwp-func

ben: in the end what we came up with is the canonical locator. Every PWP also includes a concept of a canonical locator that stays the same regardless of state

<bjdmeest> http://w3c.github.io/dpub-pwp/#manifest_algorithm

ben: and we would incorporate the CL as part of the PWP manifest
... and in the end we came up with a proposal whereby, starting with a GET request, we could in the end get the full publication without depending on the state that the PWP is in
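
[Illustration, not part of the call: a minimal sketch of the idea that any state-specific locator leads, via one GET, to a manifest carrying the same canonical locator (CL). The network is simulated with a dictionary; the URLs, the `canonical_locator` and `resources` keys, and the functions `fake_get` and `resolve` are all hypothetical and do not reproduce the algorithm in the PWP draft.]

```python
# One manifest, shared by every state of the same publication.
MANIFEST = {
    "canonical_locator": "https://example.org/books/moby-dick",
    "resources": ["c001.html", "c002.html", "style.css"],
}

# Simulated "web": the same PWP reachable in different states.
FAKE_WEB = {
    # The canonical locator itself.
    "https://example.org/books/moby-dick": MANIFEST,
    # Unpacked, served over HTTP.
    "https://example.org/books/moby-dick/unpacked/": MANIFEST,
    # Packed, served from another host.
    "https://cdn.example.org/moby-dick.pwp": MANIFEST,
    # Unpacked, on the local file system.
    "file:///home/reader/moby-dick/": MANIFEST,
}


def fake_get(url):
    """Stand-in for an HTTP GET that ends up returning the manifest."""
    return FAKE_WEB[url]


def resolve(url):
    """Map any state-specific locator to (canonical locator, resources)."""
    manifest = fake_get(url)
    return manifest["canonical_locator"], list(manifest["resources"])


packed = resolve("https://cdn.example.org/moby-dick.pwp")
local = resolve("file:///home/reader/moby-dick/")
print(packed[0] == local[0])
```

[In this sketch a locator shared between users can be the CL, and the recipient does not need to know which state the sender was reading in.]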

<lrosenth> well that’s why we stayed away from the syntax and focused on concepts...

ivan: I think that the manifest work of the two groups has been complementary. They looked at different issues. PWP did not go into details on how the manifest is encoded; for us the manifest is just a kind of buzzword for the information we need in one place. On the other hand we looked at locators, which led to questions like combining manifests
... as far as I remember the BFF work did not concentrate on these issues, the combination of the two is somewhere where we should go
... there is work going on at W3C on the webapp manifest, and we have been in contact with the editor

<ivan> https://w3c.github.io/manifest/

ivan: it's a JSON manifest; it includes a number of features that neither of the two groups has looked at yet, relating to security, but the basic structure has an extension mechanism built in

dauwhe: I already have a few examples that use hybrids of the webapp manifest; it seems possible to extend that with the kind of information we need

tzviya: and feedback from Marcos indicates we will be able to do that

lrosenth: it's not clear to me what BFF is trying to solve in the PWP context
... in PWP do we want to be focusing on authoring or consumption friendly, or both?

ivan: I think we have to be careful about this, but at some point in time the goal of the work we have may be EPUB 4 or 25. There are things that are already clear from the use case collection even if it's not complete yet
... we already know that we have to have somewhere the information about the resources that make up the PWP, and that the resources have to be categorized as essential or not essential; what BFF has done is to start to give structure to how these things should be encoded
... we still have to work on the use cases a lot, but I don't think the relationships are so wildly different that we should not think ahead
... I think we all know that eventually we will use JSON for manifest and we need to keep that in mind

Bill: the whole point of this call is that both of these activities are in mid stream, and we want them to converge, not collide; that's the whole point

tzviya: yes, it was clear that it is time to have a joint call, it was getting absurd

lrosenth: if you go into PWP with the expectation that PWP is the future of EPUB, I think that's the major problem; we don't know that it is true, and it is not a conversation we have had
... EPUB may not be the only or the best manifestation of a PWP

ivan: the way I was reading BFF: sure, there is a high priority for roundtripping with classic EPUB, but since it became a side project the discussions have become more and more general, so there is a convergence there, and how that will evolve depends on a number of issues that may not even be technical
... decision will have to be made within weeks or months

billk: roundtrippability means that they are highly related but still two different formats

lrosenth: I’ve been under the assumption that the TPAC F2F was to do the deep technical dive into PWP. Do we need to do something before TPAC?

ivan: “before TPAC” means vacations and sunshine…

tzviya: we have a few proposed manifests, we have the PWP locators proposal, there are a variety of issues we could discuss but lets start with these two

lrosenth: your comment about the removal of CFI for linking in: I have a call with the UK government on the topic of open data and document formats, and the ability to link in is one of the attributes of this

garth: we didn't remove CFIs per se; only linking out from EPUBs was removed, because it had zero adoption
... it's still the way you link externally into an EPUB

ivan: let me push a very specific technical issue to the BFF guys. One of the features we thought was important in PWP locators is the fact that the final manifest can be the combination of several manifests that are at different places.
... and of course that makes things a bit more complicated. Is this something that BFF considers important, and how does it modify what is out there? At the moment this is the biggest technical difference between the two approaches

lrosenth: the context is that certain publications cannot be modified and would have to exist as separate entities

hadrien: we haven't considered that. I think it is certainly doable; the way we see it, it's possible for anyone to point to any resource on the web and make a publication of it, so in BFF it's not about combining manifests, but linking to resources
... we have mechanisms to discover manifests

<Bill_Kasdorf> +1

hadrien: one HTML resource can be part of any number of publications

ivan: we will have to revisit this in our own use case work. We ran into this issue not only because of the locator structures: combining two JSON structures cannot be done purely mechanically, since how you combine may depend on the semantics of keys etc. If we can avoid it, that is fine

lrosenth: just to add a little more to the context, we felt the need to do this; it's an include followed by a set of overrides
... for example adding annotations, or replacing an image
... we’ve seen the need for merge and replace in multiple use cases
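
[Illustration, not part of the call: a minimal sketch of "an include followed by a set of overrides", restricted to a flat map of resources. The function `apply_overrides`, the convention that `None` removes an entry, and all URLs are assumptions made for this example; neither group has specified such a mechanism.]

```python
def apply_overrides(base_resources, overrides):
    """Return a new resource map: the included base entries, then overrides.

    An override value of None removes the resource; any other value
    adds a new entry or replaces the existing one with the same name.
    The base map is left unmodified.
    """
    merged = dict(base_resources)  # the "include" step
    for name, href in overrides.items():
        if href is None:
            merged.pop(name, None)
        else:
            merged[name] = href
    return merged


# Resources of a publication that cannot itself be modified.
base = {
    "chapter1": "https://publisher.example/book/c1.html",
    "figure1": "https://publisher.example/book/fig1.png",
}

# A second manifest that replaces an image and adds annotations.
teacher = {
    "figure1": "https://school.example/fig1-annotated.png",
    "annotations": "https://school.example/notes.json",
}

combined = apply_overrides(base, teacher)
print(combined)
```

[This only covers a keyed resource list, where "last one wins" is a plausible rule. It says nothing about metadata or reading order, where no such simple rule is assumed here.]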

hadrien: combining manifests is extremely complex. It's very easy, however, to combine multiple resources; merging metadata is hard or impossible, and merging reading order is hard or impossible
... I strongly believe in our core idea of keeping it very simple, the teacher who adds annotations or changes an image would create a new manifest instead

ivan: yes, we’ve left it largely open, but we have to look hard at the use cases to see whether it's necessary or not

lrosenth: we focused only on resources, and we haven't gotten to metadata and reading order yet

tzviya: the original point that Dave made, the BFF group has gotten to practicalities whereas PWP has been more about use cases. I think we are at a good point to talk about where to go from here. What are our next steps?

billK: there should at least be some kind of regular joint meetings between these two groups

<tzviya> PWP Use Cases: http://w3c.github.io/dpub-pwp-ucr/

ivan: yes of course, the work in the IG is now really concentrating on the use cases, for very practical reasons as well, but we have already heard that without use cases this work is meaningless. I would love, when our use cases document becomes more mature, for the use cases to be looked at also from a BFF POV. Eventually, at some point in time, 6-7 months from now, if the IDPF-W3C combination happens, these efforts will be merged anyway.

garth: the IDPF-W3C discussions will be more clear by TPAC time.

tzviya: please look at the use cases document if you haven't yet

lrosenth: I agree completely. My hope has been that the TPAC agenda was going to be to start applying concrete technical decisions against the use cases
... see one or more directions on how we want to move things forward

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/06/28 06:59:49 $