TPAC breakout meeting on offlining -- 08 Nov 2017

<scribe> scribenick: mateus

duga: two questions: what do we need to offline, what are the parts of the publication? and, how do we do it?
... what are the publication bounds, what's referenced from it? download just a single chapter vs. the whole book?
... we need to solve this for all cases and somehow make the publication offlineable
... before we have native browser support
... one way is service workers, but not necessarily (depending on platform)

bigbluehat: not untrodden ground, there was a descriptive format (app cache), didn't quite work; service workers are more prescriptive about how to get the resources
... we're not in a scripting business, so we just want to give the "shape" of a publication, a "waybill", and afford a way of offlining based on that
... including sub-resources of a publication like a single chapter
... there has to be some amount of object permanence (e.g., on a hard drive)
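
To make the "waybill" idea concrete, here is a minimal sketch of a declarative manifest and a function that derives the list of URLs to offline, either for the whole publication or for a single chapter. The `PublicationManifest` shape, its field names, and the `"*"` key for shared resources are assumptions made for illustration; none of them come from the discussion or from any specification.

```typescript
// Hypothetical manifest shape; field names are illustrative only.
interface PublicationManifest {
  name: string;
  // Primary resources in reading order (e.g., chapters).
  readingOrder: string[];
  // Supporting resources keyed by the chapter that uses them.
  // The key "*" (an assumption) marks resources shared by every chapter.
  resources: Record<string, string[]>;
}

// List the URLs to store offline for the whole publication,
// or for one chapter only when `chapter` is given.
function resourcesToOffline(
  manifest: PublicationManifest,
  chapter?: string
): string[] {
  const chapters = chapter === undefined ? manifest.readingOrder : [chapter];
  const urls: string[] = [];
  // Append a URL once, preserving first-seen order.
  const add = (url: string): void => {
    if (urls.indexOf(url) === -1) {
      urls.push(url);
    }
  };
  for (const c of chapters) {
    if (manifest.readingOrder.indexOf(c) === -1) {
      throw new Error(`Not part of the publication: ${c}`);
    }
    add(c);
    (manifest.resources[c] || []).forEach(add);
  }
  (manifest.resources["*"] || []).forEach(add);
  return urls;
}
```

The function only describes what belongs to the publication; how the list is fetched is left to the platform. In a service worker it could be handed to `cache.addAll()`, while a native app could pass it to its own download routine.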

(JakeA): browser obtains files, browser checks manifest for updates, refetches files on the list of changed files; user would need to refresh the page to see updates
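
The update cycle described here can be sketched as a comparison between the file list already stored and the one in the freshly fetched manifest. This is only an illustration of the model as stated above, not of how any browser implements it; the `FileVersions` type and the use of a per-file version tag are assumptions.

```typescript
// Hypothetical listing: URL -> version tag (e.g., a content hash).
type FileVersions = Record<string, string>;

interface UpdatePlan {
  refetch: string[]; // new or changed files
  evict: string[];   // files no longer listed
}

// Compare the stored listing with the latest one and decide what to do.
function planUpdate(cached: FileVersions, latest: FileVersions): UpdatePlan {
  const has = (obj: FileVersions, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);
  const refetch = Object.keys(latest).filter(
    (url) => !has(cached, url) || cached[url] !== latest[url]
  );
  const evict = Object.keys(cached).filter((url) => !has(latest, url));
  return { refetch, evict };
}
```

The plan only covers fetching; as noted above, the page the user is looking at would still show the old files until it is refreshed.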

scribe: service workers run in a different direction; given the authority to proactively find the resources
... if wpubs need to run on native apps as well as on the web, we need a layer for something that is not just javascript

<tzviya> scribenick: mateus

ivan: yes, this is one of the main issues; in our mind, we would like to see some sort of an extension (for the time being) built into browsers; the only thing i need to provide as a publisher is a declaration, no lines of code
... in some sense, this is similar to app cache; publisher has to deliver some sort of js together with the publication--this is a major issue because publishers might distribute 2000 instances of a book
... and these would be archived and updating would be a problem

JakeA: why exactly is this a problem?

dwood: a maintainable js runtime would be a requirement

dauwhe: just pointing out that some of the experiments @bigbluehat and i worked on just read a manifest to find a list of cacheable resources

ivan: that means the app cache model can be implemented and works?

nick doty: you could build a layer around service workers that can be a progressive enhancement; what you fall back on could be a directory of html files
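
A minimal sketch of that progressive-enhancement layer, assuming a worker script at a publisher-chosen URL: the publication stays a plain directory of HTML files, and the offline layer is added only where service workers are available. `NavigatorLike`, `ReadingMode`, and `enhance` are hypothetical names; the only real API relied on is `navigator.serviceWorker.register()`.

```typescript
// Structural stand-in for the part of `navigator` used here, so the
// sketch can also be exercised outside a browser.
interface NavigatorLike {
  serviceWorker?: { register(scriptUrl: string): unknown };
}

type ReadingMode = "offline-capable" | "plain-html";

// Add the offline layer when the platform supports it; otherwise do
// nothing and leave the reader with ordinary HTML pages.
function enhance(nav: NavigatorLike, workerUrl: string): ReadingMode {
  if (!nav.serviceWorker) {
    return "plain-html";
  }
  // Registration is asynchronous; the result is ignored here for brevity.
  nav.serviceWorker.register(workerUrl);
  return "offline-capable";
}
```

In a page this would be called as `enhance(navigator, "/sw.js")`, where `/sw.js` is a placeholder path.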

bigbluehat: right, but what publishers need is a way to do this without depending on what's baked into a service worker

JakeA: a declarative format, if designed too soon, might not be sufficient; a solution is to wait for a solid implementation first
... if your content is purely html, css, images, media, etc., the implementation can be a lot simpler (famous last words), so it can react to files changing and can revalidate if models change

ivan: are there security related implications that make this difficult?
... in general, the model of delivering something from one source and the script comes from someone else--are there security problems?

JakeA: same as html--book could have a form, for example

ivan: right, but resources should be served via https

JakeA: yes

ivan: there might be similar restrictions that i don't know about

rdeltour: a service worker is bound to an origin; what's considered a publication has an origin, maybe we can simplify that by saying a reading system also has an origin, like a web app
... can that web app with service workers import many other publications that also have service workers?

JakeA: a service worker has a context, usually root (/); but you can have other contexts too, and you can have other service worker registrations with other scopes
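
As an illustration of how several registrations with different scopes can coexist on one origin, the sketch below picks the scope that controls a given URL by longest-prefix match on the absolute URL strings, which is how registration matching is commonly described. The function name and the example URLs are hypothetical.

```typescript
// Return the registered scope that controls `url`: the longest scope
// string that is a prefix of the URL, or undefined if none matches.
function controllingScope(scopes: string[], url: string): string | undefined {
  let best: string | undefined;
  for (const scope of scopes) {
    const matches = url.indexOf(scope) === 0;
    if (matches && (best === undefined || scope.length > best.length)) {
      best = scope;
    }
  }
  return best;
}

// Example: a reading system at the root plus one publication with its
// own, narrower scope on the same origin.
const exampleScopes = [
  "https://reader.example/",
  "https://reader.example/books/moby-dick/",
];
```

Under these assumptions, a chapter under `/books/moby-dick/` is handled by the publication's own registration, everything else on the origin by the root one, and a URL on another origin by neither.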

ivan: if you have one book with combined resources from many places, is that a problem?

bigbluehat: the situation that breaks app cache is that the app cache file needs to be updated; if the manifest and resources are out of sync, app cache gets screwed up

JakeA: no; a service worker can still cache other resources
... don't know how we would avoid that in publishing, because we are not just using cache; we don't always want to check whether anything is stale or not, you might want to keep old versions
... in that case, you get separate URLs

dwood: wondering if it's in or out of scope for a service worker to modify the manifest... we read in some json structure that's in the scope--are we allowed to modify the structure?
... technically we can do it; should we allow it in the wpub spec?

nick doty: you could have a separate library that can do whatever it wants... it would be cool to start with a common library to customize service workers... i'm not sure if you need to get into the question of prohibiting service workers

scribe: people can find other interesting ways of maintaining a manifest

dwood: accessing a single url in the 90s versus now, you can have html or javascript
... a service worker would go nuts over that

bigbluehat: likely will be a wild west, even while we polyfill functionality, we can just, as @JakeA said, change URL for a new version
... if a resource changes, if we grant unique URLs, we curtail the problem; but this is not reasonable to expect from everyone (and how they access the resource); example: /TR URLs from w3.org
... i don't want to figure out a new URL to find new resources

timCole: this is ubiquitous in the web; archives try a few different mechanisms to try to understand what version is correct and that was captured at a certain point in time
... would be nice to develop more rigor around our solution
... JSTOR, e.g., have their own solutions

duga: we're moving into archiving, but offlining might not have anything to do with archiving
... my impression is, "here's my chunk of the web" -- i want to still read it when i get on a plane
... we're also working on a "packaged web publication", something you can extract, keep on an HDD, etc., and archive
... i hope archives aren't using service workers to find an online version to archive
... hoping they archive a packaged version

bigbluehat: offlining and archiving... offlining is "i have a piece of the web in my browser, and i still have it when i go offline"
... i don't really "have" it, i have it in cache
... archiving is actually "keeping" the publication... this might be closely related to packaging, as @duga said

duga: right

dwood: quick comment-- heard the group say we do things the "web way", which is laudable, but we should bring book-like use cases to the web as well as part of the process
... however, i do take @duga's point
... we have a different set of expectations for books and web pages

JakeA: there's a demo that uses a service worker to offline resources, but could also package those resources... this was after the Sapporo meeting
... i'm hearing conflicting things--the idea of the manifest and the idea of "apps" with a standardized experience
... like a service worker would be an implementation of a manifest

<rdeltour> Jake's demo https://jakearchibald.github.io/ebook-demo/publisher-site/readme/

ivan: we're hoping browsers will eventually incorporate functionality to read books, that the framework we create for these publications eventually become native features
... we hope the book becomes a normal web presence with maybe some special behavior, but that's down the line

JakeA: if the book works on the web, what does having something in the browser do?

ivan: i don't have to install it; i can read the book, and if it's just html and a manifest, there needs to be a reading application... in the meantime, maybe we have extensions, polyfill, etc. to create the reading experience

duga: the reading experience for a book is different from a regular browser experience; we need that experience to be there for web publications to be successful

dwood: just like eventually browsers developed PDF support; the difference between having some code to handle it vs. the browser understanding these things natively

ivan: if i'm a publisher, and i publish the book 10000 times, if there's an obligation for me to add a piece of js to the publication itself for it to be enjoyable, that becomes a problem (e.g., for maintainability)

tzviya: i publish many journal articles, do i need to include the polyfill every time?

bigbluehat: depends what we're affording with the polyfill -- if it's just offlining, maybe not, but if it's supplying the reading experience, we're not making a standard anymore

garth: that's why it's a bad idea for the publication to provide the reading experience
... the content should be declarative and not stop working 10 years from now because the polyfill doesn't work anymore, for example

timCole: about archiving: archives might not use service workers, but others might, if, e.g., university libraries want to track changes and decide for themselves if those updates are archivable

bigbluehat: want to talk about keeping a copy and doing something with it: what is its identification? are we packaging something local and putting it on the web? are we packaging part of the web and keeping it local?
... archives are currently holding onto something that might not work in a few years, holding a package that no longer has its identification because it's lost its relationship to the web
... these are the kind of relationships publishers want, and to solve

JakeA: i wouldn't be afraid of shipping js with a book any more than shipping css as long as the publication is using progressive enhancement
... css layer adds better styles to account for user experience; just so, js would add the "offlining" experience

ivan: yes, but this means the book carries its reading system with it; if i have some disabilities, i might want to override that and use a different reading system

JakeA: and you wouldn't create the competition of systems outside of the browser if you were to prescribe it in the publication

dwood: i believe that there's an argument to be made that the addition of js to browsers has fundamentally changed the social contract between the user, provider, and the web
... i fear we're breaking the social contract between reader and the publisher if we allow js in epub4
... i fear i'm right about this, and i fear for the social consequences
... we should really think about that

stevez: we want to deliver "a" reading system as a default, which can fall back if the reader doesn't want it
... it would just be a fall back if there is no installed reading system

JakeA: the competition for creating a reading system would lie with the publishers

garth: publishers would hate this

laudrain: we just want to deliver books, not reading systems

dwood: we can still add a script tag that's ignored if the UA wants to

<tantek> +1 to being concerned about including JS in publications/epub

ivan: it's okay for us on the technical side to think of this as a solution, but not so with publishers

JakeA: if pubs want to provide books but not prescribe an experience, that sounds kind of broken

tzviya: no, because i want to provide content online; i produce content and documents; i am not going to produce a platform

duga: people who create web pages don't create browsers

bigbluehat: i can publish a page or web app, and we publishers could ship a reader web app so we can own the experience
... and that's where we are now
... what the publishers want now is to go from this state to another one where we're getting a standardized, implicit experience for publications
... the web has never had a collection of documents smaller than a site but bigger than a page... we're trying to figure out how to make this bookish experience without having to figure out the reading experience with every publication

dsinger: if one is billed as a "game" and one is billed as a "book" and has the same content... what does that mean?

dwood: books are becoming software, which is where the problem stems

jcraig: maybe it makes sense to include a js polyfill, but i'm not recommending that
... the idea that publisher would provide its own UI seems like a crazy idea
... they're not ebook app developers, they're publishers
... each experience would be different
... there are certain features, like a11y support, that would not be satisfied by this
... maybe there's a place for a small polyfill that provides a basic experience, but that can also be part of the spec--minimal viability when a UA accesses a publication

JakeA: not sure why we would trust browsers to do that correctly more than the people who really care, like publishers

ivan: we trust that browsers will display html correctly...
... the analogy here is the same: as a publisher, i just want to create html and css and trust that the browser will do the right thing
... if it doesn't, i'll go to another browser
... publishers want to make sure a publication is presented correctly, just like web developers trust that html will be displayed correctly

JakeA: but we already do html

ivan: true, but we would like you (browsers) to also do books

garth: ideally, browsers would have a custom experience for document collections (e.g., books)

dauwhe: i wonder sometimes why link relations don't exist more often... it solves some of these book problems
... but there's a fundamental issue about how we use websites vs. books
... i buy a book and i know how it works
... every book has a relatively similar UI
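
The link relations mentioned above could give every publication the same "turn the page" affordance without any per-book UI. As a rough sketch, assuming a reading order is available from some manifest, the `prev` and `next` relations for a given resource can be derived mechanically; the function name and return shape are hypothetical.

```typescript
// Derive rel="prev" / rel="next" targets for `current` from the
// publication's reading order. A missing key means no such neighbour.
function linkRelations(
  readingOrder: string[],
  current: string
): { prev?: string; next?: string } {
  const links: { prev?: string; next?: string } = {};
  const i = readingOrder.indexOf(current);
  if (i === -1) {
    return links; // not part of this publication
  }
  if (i > 0) {
    links.prev = readingOrder[i - 1];
  }
  if (i < readingOrder.length - 1) {
    links.next = readingOrder[i + 1];
  }
  return links;
}
```

A user agent or reading system that understands these relations could then offer identical navigation controls for every book.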

<jcraig> unless it’s ruby vertical text in japan

dauwhe: even ebooks... i know exactly what i need to do to turn pages, annotate, find ToC, etc.
... there's value in not having a learning curve

stevez: this is interesting--where i come out of this is, i think publishers should define what a book reading system should do, and whoever implements it, as long as it follows the requirements, should satisfy that need

ivan: of course, that's why we have the working group
... but there's the distribution aspect, which is a problem

stevez: don't beat on the browser makers! :)

JakeA: the understanding seems to be that we need to provide room on the platform for these things (publications) that have been shown to be valuable

<jcraig> I keep hearing ~“browsers won’t support a book reading experience.” doesn’t “browser” include ebook readers? iBooks for example is a browser that runs WebKit under the hood.

bigbluehat: when TimBL created the web on his HDD, he just wanted to access documents easily, and the rest of the UI grew around it (e.g., back/forward buttons)
... publishers want a similar evolution of the experience that is catered to web publications

avneesh: having the reading system embedded in the book is a big problem for a11y
... i as the user need to learn the system; i can't learn it for every single book
... predictability is critical for people with disabilities

ivan: what i still don't fully see is, we're talking about an extensible browser... is it possible to separate the roles that "actor A" extends the browser as needed and "actor B" just provides the content
... are there technical limits?
... is that kind of separation of role possible?
... or are there technical limits?

tzviya: @jcraig asked about reading systems

jcraig: what i don't understand is... readers/browsers are supporting some native functionality... what else would we need that a dedicated reading device/app doesn't do?

ivan: the same question arises--ibooks reader will do it, but what happens if i encounter this out on the web?

tzviya: they're apart from the web, and that creates problems if i need to integrate with the rest of the web (e.g., citations)
... web publications seek to solve this problem
... we don't have an answer to these questions, and we're out of time
... last time, @JakeA made a demo on the plane... he's obviously going to do this again :)

(and we're done)

- DRAFT -
