TPAC 2015 breakout on EPUB Level 0, Sapporo, 2015-10-28

EPUB 0

dauwhe: session decided at last minute
... surprised the room is packed

idea a few years old

inspired by glazou and his posts about epub3

and his epub editor that is the only one

he found issues

interesting features in epub3

for instance duplication in package files

if we have to add a file to an epub, we add it to folder, manifest, spine, nav, landmarks, ncx and and and

significant work

five ordered lists of content

idea was what if we started the spec from scratch, simplest as possible

what would it look like ?

hence the name epub0

it's an experiment

no intention of changing the world

(room is _packed_)

what if we try

goals were simplicity and keep things closer to the web

html css images and stuff

but the scaffolding around is xml dialects

<kwkbtr> (oops)

certain level of complexity

tzv: no bash of EPUB today please :-D

tzviya: we can talk about that later :-)

dauwhe: started thinking about the various lists of content

the one that is important is the nav file

accessibility does matter here

I started thinking about could we use the nav file for the other manifests?

first idea was let's take the epub nav file

and call it index.html

all of a sudden you can make usable books just w/ that

opera can make nice things with that

navigation is automatically generated
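
A minimal sketch of the "nav file as index.html" idea, assuming the reading order can be taken from the order of links inside the nav element. The names NavLinkCollector, reading_order and SAMPLE_INDEX are hypothetical and only serve the illustration; nothing here comes from an EPUB specification or from the experiment described in the session.

```python
from html.parser import HTMLParser


class NavLinkCollector(HTMLParser):
    """Collect href values of <a> elements that appear inside <nav>."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0   # greater than 0 while inside a <nav> element
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif tag == "a" and self.nav_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1


def reading_order(index_html):
    """Return the unique documents linked from the nav, in first-seen order."""
    collector = NavLinkCollector()
    collector.feed(index_html)
    order = []
    for href in collector.hrefs:
        doc = href.split("#", 1)[0]   # drop fragment identifiers
        if doc and doc not in order:
            order.append(doc)
    return order


# Hypothetical index.html: a table of contents and nothing else.
SAMPLE_INDEX = """<!DOCTYPE html>
<html><body>
<nav>
  <ol>
    <li><a href="chapter1.html">Chapter 1</a></li>
    <li><a href="chapter1.html#s2">Section 1.2</a></li>
    <li><a href="chapter2.html">Chapter 2</a></li>
  </ol>
</nav>
</body></html>"""

print(reading_order(SAMPLE_INDEX))
```

Under that assumption the single nav list doubles as a linear reading order, so no separate list has to be kept in sync with it.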

some people became of course mad at that experiment

the W3C has embarked on that effort

the publishing community was isolated from web and web standards

the DPUB was created with IDPF and W3C together

that collaboration is shown in EPUB+WEB

<ivan> http://www.w3.org/TR/2015/WD-pwp-20151015/

Portable Web Publications

dauwhe: and so that effort of writing that document has been ongoing for a year

tzviya: started before last tpac

dauwhe: vision for fully-web compatible ebooks

online, offline, etc

dauwhe: at same time, IDPF has chartered epub3.1 WG

starting from 3.0.1

some work areas include "browser-friendly manifestation"

exploding the zip package on server to get access to components

dauwhe: so what other things can we do to ease the pain for browsers?

create a data structure that describes the epub, easier for web developers

Florian_: where can I read about that?

tzviya: IG meeting tomorrow
... first major revision of EPUB3 but must be backwards compatible

ivan: IG started with idea that if possible be backwards compatible BUT not mandatory

<tzviya> formal work plan for EPUB 3.1 http://www.idpf.org/workplans/2015/epub/

dauwhe: explore various versions to make that browser friendly epub so E0 reappeared
... EPUB WG proposes a JSON version of manifests

some other people said can we avoid yet another format?

what is interesting about ebooks we have these collections of html files and we also have to define relationships between these files

some files may have special properties

that's where html has not quite fully addressed the requirements

glazou is working for example on transitions between pages

address questions unaddressed by web

Florian_: howcome is also active in that space

dauwhe: leads to an interesting problem space

link relations in html between various files but those things are underused

Florian_: each file describing its relationships with others, you can create things beyond what we actually care about
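
A minimal sketch of files describing their own relationships, assuming each document carries a link element with rel="next" pointing at its successor. The BOOK dictionary stands in for files on disk, and RelLinkParser and follow_next are hypothetical helper names, not part of any spec discussed here.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Record the href of each <link rel="..."> element, keyed by rel token."""

    def __init__(self):
        super().__init__()
        self.rels = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        for token in (attributes.get("rel") or "").lower().split():
            if href:
                self.rels.setdefault(token, href)   # keep the first href per rel


def follow_next(files, start):
    """Walk rel="next" links from `start`; stop at a missing file or a loop."""
    order, current = [], start
    while current in files and current not in order:
        order.append(current)
        parser = RelLinkParser()
        parser.feed(files[current])
        current = parser.rels.get("next")
    return order


# Hypothetical three-file book held in memory.
BOOK = {
    "index.html": '<head><link rel="next" href="ch1.html"></head>',
    "ch1.html": '<head><link rel="prev" href="index.html">'
                '<link rel="next" href="ch2.html"></head>',
    "ch2.html": '<head><link rel="prev" href="ch1.html"></head>',
}

print(follow_next(BOOK, "index.html"))
```

The loop guard matters: nothing prevents an author from writing a cycle or a dangling reference, which is one way the relationships can go beyond what a reading system expects.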

johanneswilm: and you want those files to be readable a long time from now in the future

dauwhe: most reading systems don't support scripting for various reasons

most authors don't have knowledge of scripting

tzviya: in PWP not sure it's clear about first P

offline not always most important for portability

archiving in publishing world is crucial

what does that mean ?

linking to it and identifying it is crucial

archivable format

traversing the package has not been defined in the past

navigation is one issue but we also need to traverse the package

a11y is super important for publishing too

dauwhe: long term goal of epub is a11y by default

brady_duga: two things

<Norm> https://en.wikipedia.org/wiki/Zork

comment about having relationships between docs powerful but easy to get into a linking mess

dauwhe: preview of an ebook for instance is a subset of files

brady_duga: about sets, what is a set and what is the publication?

how do you define where the spidering should end ?
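
One possible answer to the spidering question, sketched under a stated assumption: the publication is everything reachable from the entry page whose URL stays under a given scope prefix. The PAGES dictionary replaces real network fetches, and the example.org URLs, HrefParser and spider are hypothetical names for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class HrefParser(HTMLParser):
    """Collect the href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def spider(pages, start, scope):
    """Breadth-first crawl from `start`, keeping only URLs under `scope`."""
    seen, queue = [], [start]
    while queue:
        url = queue.pop(0)
        # Stop at pages already visited, outside the scope, or unavailable.
        if url in seen or not url.startswith(scope) or url not in pages:
            continue
        seen.append(url)
        parser = HrefParser()
        parser.feed(pages[url])
        for href in parser.hrefs:
            absolute, _fragment = urldefrag(urljoin(url, href))
            queue.append(absolute)
    return seen


# Hypothetical site: a two-page book plus one page outside the book folder.
PAGES = {
    "https://example.org/book/index.html":
        '<a href="ch1.html">1</a> <a href="../about.html">About</a>',
    "https://example.org/book/ch1.html":
        '<a href="index.html#top">Back</a> '
        '<a href="https://en.wikipedia.org/wiki/Zork">Zork</a>',
    "https://example.org/about.html":
        '<a href="book/index.html">Book</a>',
}

BOUNDED = spider(PAGES, "https://example.org/book/index.html",
                 "https://example.org/book/")
print(BOUNDED)
```

A prefix rule is only one choice of boundary; an explicit list of resources would draw the line differently.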

dauwhe: fundamental question is the idea of the manifest

does the author need to explicitly list all resources?

brady_duga: manifest is almost silly, stupid

fsasaki: yes

brady_duga: manifest existed because we did not have the zip archive
... but now we have a manifest soo.....

tzviya: what if we generate the manifest instead of authoring it?

brady_duga: that's the question
... doable looking at package constraints
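
A minimal sketch of generating the manifest instead of authoring it, assuming the zip archive itself is the source of truth and that media types can be guessed from file extensions. The MEDIA_TYPES table is a small illustrative assumption, not the list of EPUB core media types, and generate_manifest is a hypothetical helper.

```python
import io
import posixpath
import zipfile

# Small extension-to-media-type table; an assumption for this sketch.
MEDIA_TYPES = {
    ".html": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".css": "text/css",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}


def generate_manifest(archive):
    """List every file in an open ZipFile with a guessed media type."""
    manifest = []
    for name in sorted(archive.namelist()):
        if name.endswith("/"):   # skip directory entries
            continue
        extension = posixpath.splitext(name)[1].lower()
        manifest.append({
            "href": name,
            "media-type": MEDIA_TYPES.get(extension, "application/octet-stream"),
        })
    return manifest


# Build a tiny archive in memory so the example runs on its own.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("index.html", "<nav></nav>")
    archive.writestr("css/book.css", "body { margin: 1em }")
    archive.writestr("images/cover.png", b"")

with zipfile.ZipFile(buffer) as archive:
    MANIFEST = generate_manifest(archive)

for entry in MANIFEST:
    print(entry["href"], entry["media-type"])
```

The result is a plain list of dictionaries, so it could be serialized with json.dumps if a JSON manifest is wanted. Resources in different folders need no special handling because the archive listing already carries their paths.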

dauwhe: how can we keep at least the simple cases simple

heavyweight machinery even for simpler cases right now

brady_duga: even in simple cases we have stuff in different folders

tzviya: what is a simple case ?

dauwhe: a simple case might be the simplest ebooks today
... (explanations about simple books)

tzviya: we could define simplicity

glazou: but authors will always abuse the system despite definitions

tzviya: epub check the validation tool is very powerful

glazou: I have found bugs and issues in validator so....

tzviya: but we can still prevent undesirable files from appearing

ivan: industry regulates itself

dauwhe: right

Florian: +1

dauwhe: the spec is what epubcheck decides
... that's a fact

tzviya: some retailers don't accept anything else

clapierre: they just updated it right ?

<tzviya> https://github.com/IDPF/epubcheck

tzviya: if missing a resource, fallback ?
... we should kill fixed layout

brady_duga: +1

tzviya: questions for us?

dauwhe: curious about other crazy ideas other people have

what would you do?

tzviya: I don't care what a book is (answer to Florian)

<astearns> books don't need any ordering: http://www.fastcodesign.com/1664818/composition-no1-an-ipad-art-book-you-read-on-shuffle

Florian: complex question though

tzviya: a spec can be used and abused and I don't care what is an ebook

dauwhe: a book is a bounded set of web content

ivan: we began to give an answer to that

a web publication different from a web page?

as a collection

we began to flesh out all these terms

ivan: how do I define those resources that make a publication?

it is a collection of resources

Florian: is wikipedia a publication?

ivan: yes

but I don't know how we could describe the content

we will have to add additional things

HeatherF: I'm an editor for IETF
... being archivable is critical
... and contain information on provenance
... history of document and related documents
... should be contained in archive

tzviya: yes

dauwhe: there is metadata applying to collections and subsets of collections

fsasaki: also question related to metadata

I want to use a book like a database

want to be able to ask structured questions to my book, like I can do with dbpedia or other linked data sources

<astearns> books with apis

I'm saying having a well-defined place in the content for such structured data

fsasaki: not necessarily adding them directly like with RDFa

dauwhe: some of the work we're doing explores such options

tzviya: in journal publishing, most content is html

so gives me a lot of ideas about books

dauwhe: have referencing to whatever metadata vocab is available

tzviya: if you want to see publications do this, could be in package or online

<Zakim> tzviya, you wanted to mention what learned about books from journals

dauwhe: (digression about metadata provenances)

rossen_: couple of questions to ask

<astearns> not a digression - crucial that packaged documents don't get updated often enough

rossen_: not as poised with epub as you are
... never had to build one
... I am hearing a revolutionary approach to this
... yet I hear evolutionary thoughts
... so revolution or evolution?
... have you spent any time on the minimal viable set of requirements for a publication?
... and then revolutionize from there
... but if evolution, what are the things that work really well

glazou: the zip package, html and css :-)

(laughs)

dauwhe: funny because the core works really well and that core is OWP

ivan: problems with CSS, right?

dauwhe: when we have access to a rendering engine implementing html5 as in 2015, we can do really a lot

mhakkinen: OEB started up in the 90s

and we needed more than audio and ToC

then the navigation structure we needed for audio files

tzviya: EPUB has replaced that and it is the official format now

mhakkinen: can use SMIL to sync html and audio files

ivan: only one good SMIL app...
... interesting question
... similar to other story at W3C... never asked ourselves the question in such terms
... yes we tried to look at it

<dauwhe> http://w3c.github.io/dpub-pwp/

what it means to be a publication on the web, what it means to be portable

ivan: we carefully avoided the "epub" term

how will it evolve ? we'll see

epub with all existing quirks is a remarkable success, full industry using it

we have to consider evolution as well because we cannot tell industry to start all over from scratch

that's why epub3.1 is backwards compatible

but not a mandatory requirement for us but we keep it in mind

<Ralph> [where "us" == IDPF+W3C working together on PWP]

johanneswilm: we can have a browser in readers so what's the point of a separate standard for books
... we should be able to work better because more restrained
... archiving not for hundred years IMO

Florian: annotations should be archivable and sharable

rossen: annotations not right in html world because of html tag complexity

rossen_: I was provoking that conversation
... agree with ivan, this is going to be an evolution

Florian: was in email, what is the simplest thing we can do?

ivan: same people wearing different hats in this industry

<Zakim> ShaneM, you wanted to ask about the dbpedia approach

ShaneM: would it satisfy fsasaki if the triples available were extracted and stored in the zip file?

fsasaki: we may want to comment on the data
... so probably yes
... (gives an example)

<Ralph> [the 'copy-paste incorporating the triples' use case]

ivan: when we are talking about publications, that's not only about html content
... epub has restrictions about content types in the package

but will not last forever

dauwhe: thanks everyone for participation!

(adjourned)

<fsasaki> [additional requirement would be to point from the data to the content, using e.g. annotation mechanism]
