See also: IRC log
dauwhe: suggested the session last week. See summary: https://www.w3.org/wiki/TPAC2016/SessionIdeas#What.27s_new_in_pubrules_and_automated_publishing
dauwhe: how should the web address discrete collection of things?
dauwhe: higher level than the doc object? Is a web app
manifest the way?
... this has implications for digital publishing, and more than that
... may want to bundle web content
<leonardr> https://mikewest.github.io/origin-policy/#app-manifest
leonard: you also talked about metadata over a series of
docs.
... "origin policy and origin policy manifest" which describes a way to
define a set of metadata (or anything), common response headers,
... so it is a concept that might be tweaked for some of our needs
... leveraging a collection-wide metadata mechanism
dauwhe: is there a possibility of having a collection DOM element, above the document object?
mike: the group that's working on problems around publications
tzviya: DPUB is working on something for web publications.
A publication is a collection of web documents which, for our purposes
today, is published on the web
... a digital book may have many elements--video, images, text--and
these may be actually separate files. In a book, you have a table of
contents (TOC)
... and can easily jump from TOC to page $foo
... we talk about books, journals, magazines, self-published newsletters,
etc
mike: at a high level, we should assume that we can make
the web do what's needed for these problems
... making the user experience of books on the web better
... arguably, books are not a first class citizen of the web, and we
want users on the web to have the best reading experience we can, even
of long format content
<dauwhe> http://www.clickhole.com/blogpost/time-i-spent-commercial-whaling-ship-totally-chang-768
mike: in thinking in terms of scripting, or
programmatically, what's possibly lacking that we could think about
further is some representation in the DOM above document
... the highest level of representation in DOM is document
... for fulltext search, we have a bounded set of documents, and we
don't have a great story about how to do this. We need to optimize for
full text search against a collection of docs.
... aside: the word collection is not great; we deprecated objects in
the DOM with the word "collection" in their names. Currently developers
are using "sequence", which is ordered
... I think you want things to be ordered; a book is an ordered set of
documents (usually)
... this makes the case different from a website, which may not need to
be ordered
... another case that's important is TOCs. We don't have a good
mechanism to easily generate a TOC/outline. We have the outline
algorithm in the HTML spec, but browsers don't implement it
... it is good for accessibility, and would be good for other things.
Regardless, that's the single doc case
... we need a way in many books to generate an outline for more than h1
to h6; that's what the outline algorithm is for
... but many books aren't a single doc; how do you generate an outline
for a sequence of documents? There are ways of doing these things, but
we don't have a standard way to do it
... we need a standard way to represent a book in the DOM
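Illustrative sketch (not from the meeting): one way to picture an outline over a sequence of documents is to walk the documents in order and nest headings by rank. The type names (Doc, Heading, TocEntry) and the function buildToc are hypothetical, and this is plain rank-based nesting, not the HTML outline algorithm.

```typescript
// Hypothetical shapes for illustration; none of these names come from a spec.
interface Heading { rank: number; text: string; id: string; }
interface Doc { url: string; headings: Heading[]; }
interface TocEntry { text: string; href: string; children: TocEntry[]; }

// Builds one nested TOC across an ordered sequence of documents.
// Nesting follows heading rank (h1..h6) only; this is NOT the HTML outline algorithm.
function buildToc(sequence: Doc[]): TocEntry[] {
  const root: TocEntry[] = [];
  // Stack of currently open entries, each paired with the rank that opened it.
  // It is deliberately kept across document boundaries.
  const stack: { rank: number; entry: TocEntry }[] = [];
  for (const doc of sequence) {
    for (const h of doc.headings) {
      const entry: TocEntry = { text: h.text, href: `${doc.url}#${h.id}`, children: [] };
      // Close every open entry of the same or deeper rank.
      while (stack.length > 0 && stack[stack.length - 1].rank >= h.rank) {
        stack.pop();
      }
      const siblings = stack.length > 0 ? stack[stack.length - 1].entry.children : root;
      siblings.push(entry);
      stack.push({ rank: h.rank, entry });
    }
  }
  return root;
}
```

Because the stack survives from one document to the next, an h2 in a later file nests under the last h1 seen in an earlier file. Whether that is the desired behavior for a real publication is an open design question.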
dauwhe: aside: while generating an outline from a sequence
of documents, there is also value in having an additional navigation
document
... EPUB does something like this and it comes close to what we're
looking for
leonardr: about human curation, and extending the idea
further, while this also needs to show up in the DOM, there needs to be
a declarative model for that organization/sequence as well as the
outline and subsections
... these are two of the aspects we look at when we talk about manifests
... Another point, it's not just the structure and navigation, but also
to represent additional aspects
... e.g., here are elements in the reference and why they are important
so a user agent doesn't have to parse the entire document; those items
are called out up front (fetch this early, make sure this is cached)
duga: Are you intending that an entire pub or book would be loaded in the DOM at once?
Mike: no. You have these docs already, so no, you don't want to construct a DOM object for the whole book
duga: so how would this work?
Mike: magic (we don't know yet)
... a big component that needs to still happen is the offline case. The
case where the user doesn't have a network but they want to continue to
read the same book without losing content
... that's a solved problem with Service Worker. We will have
implementation in all user agents soon.
... outside the document sequence idea, it would be a good idea to think
of solutions in terms of building on top of SW
... we're already assuming that the solution involves using HTML, CSS,
and JavaScript, so we should also assume it will involve SW
ivan: the usage of SW is one of the reasons why our community considered that listing all the resources a publication may need...
???: you can already do that in the SW; there is tooling that goes through all the resources
Kennethe: you do this when you create the app, it discovers the resources you need, what you need in what order
???: these need to be fetched, these need to be installed
ivan: it's good that you say that, because we may have some
terminology issues to clear up
... when I, as an author, create a book, at that point I have a place
where I put the list for the SW to use
Kennethe: it is an array in the SW file
ivan: what we have been talking about up until now is that this is info in the manifest
Kennethe: marcos is almost convinced that you'll have a SW in a manifest
ivan: to clarify, there was some email from marcos this morning in which he referred to the SW API in the manifest - is that it?
Kennethe: yes. For a long time there was a discussion about whether
the SW should be included in the manifest.
...: installing a SW means to download the SW and run an install step
... if that succeeds, then the pieces are installed/downloaded
... this is not yet in the editor's draft.
... SW is more stable now, and there are several ways you can call it.
So, now the next step is to do this with manifest as well.
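Illustrative sketch (not from the meeting): the "array in the SW file" and the all-or-nothing install step can be modeled without browser APIs. The resource list, the Fetcher type, and the install function below are hypothetical stand-ins; a plain Map plays the role of the cache.

```typescript
// Hypothetical resource list an author might place in the service worker file.
const PUBLICATION_RESOURCES: string[] = [
  "index.html",
  "chapter1.html",
  "chapter2.html",
  "book.css",
];

// Stand-in for a network fetch: resolves to the body, or rejects on failure.
type Fetcher = (url: string) => Promise<string>;

// Models the install step: fetch every resource first, and write to the cache
// only if all fetches succeed. Returns whether the install succeeded.
async function install(
  resources: string[],
  fetcher: Fetcher,
  cache: Map<string, string>,
): Promise<boolean> {
  try {
    const bodies = await Promise.all(resources.map((url) => fetcher(url)));
    resources.forEach((url, i) => cache.set(url, bodies[i]));
    return true;
  } catch {
    // Nothing was written, so a failed install leaves the cache untouched.
    return false;
  }
}
```

In an actual service worker, the equivalent would typically be an install event handler that passes the list to Cache.addAll inside event.waitUntil; the model above only mirrors that success-or-nothing behavior.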
tzviya: there is an experimental SW reading system; it needs attention, but maybe more than one person should work on this. Any volunteers with experience with manifests and SWs would be appreciated!!!
Dave and Kennethe - FTW!
rdeltour: want to make sure we are done with SW? yes, so
web client
... it was promising, but it failed. It made assumptions on the level of
the headings based on hierarchical position instead of name
... was one of the few people that tried to implement it, but it has
some value
... using the TOC generation will be even harder in a multidoc concept.
Since browser vendors have moved on, is it even worth pursuing?
<tzviya> https://www.w3.org/TR/html51/sections.html#the-h1-h2-h3-h4-h5-and-h6-elements
Mike: the outline algorithm is a consequence of our having
changed HTML to allow h1 elements to be arbitrarily nested. As long as
HTML has section and article, we're stuck.
...: we could say "don't use h1 anymore"
... what they fixed in HTML5.1 is to say "don't nest h1"
... on the UA processing side, they still have to deal with the nested
h1 case
... accessibility software doesn't use the outline algorithm either
... if all you've used in your doc is h1, then the screen reader won't
see any structure
rdeltour: should we just move on, or should we put more effort into this?
Mike: don't know how to move on from this; in the multi-doc case, this doesn't change anything.
rdeltour: just ignore automatically generated TOC and manually create it?
Mike: that's what we are doing now
dauwhe: while talking about the collection DOM element,
this might be a mechanism for solving another problem in the multidoc
space
... There is info, esp CSS stuff, that needs to persist past doc
boundaries, such as counters
... we need to resume from last known value, and there's no place to
keep track of that info. If there's a higher level object, that might
allow counter values to persist
... The other question is, if there is such an element in the DOM, can
we have an element in meatspace? If there is an element in a doc, should
it point to a file entity that instantiates this doc element?
ivan: to add to the list of things we don't understand,
doesn't that mean that the CSS processing should also have this notion
of multidoc?
... if we have to have list counters that jump from one doc to another,
then this goes to the same way. It's not only the DOM in HTML, but also
the way CSS processes things?
astearns: I don't think so; we have longform docs in multiple chapters, and people figure out how to get list counters to persist. Let's keep doing that.
leonardr: a conversation this morning about storage and
security, this is an interesting problem for a security perspective
... If we assume that we want each publication to be unique (unique origin)
because as we add rich scripts, we don't want them to be able to
influence other publications
... that then influences how we want their storage to go (both temporary
and persistent storage)
... things like cookies, local storage, things we can do today - there's
the simple case of local storage, but now span that to a collection of
books (e.g., collection of Harry Potter that you'd want to have info
available across the collection)
... whatever we do has to abide by the security model of the web. In our
design, we need to ensure we are designing around the security model of
the web.
... some of these questions, esp. as we think about collections, become
interesting and complex to fit into the security model of the web
tzviya: what we're talking about today is that we have ten files; you are talking about files of files
duga: please don't make us rely on previous chapter's
counters to render current chapters. Don't want to have to render the
whole book just to know this is list item #74
... it's a lot of work.
tzviya: if I am a publisher, if I have 3k footnotes in a book, please don't make me manually number them
duga: if you spread 500 footnotes per file, just note at the start of each file that it starts at 500/1000/etc
tzviya: my author may change the footnote numbers the day before publication
Kennethe: why footnote numbers? why not identifiers?
tzviya: that would be lovely, but that's not how the scholarly publishing community functions (and we can't change that, as much as we'd like to)
ivan: this is market reality
... we can't say "change how you've done things for centuries"
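Illustrative sketch (not from the meeting): the per-file starting numbers duga describes could be computed at build time from footnote counts, so that nobody numbers by hand and a late change only requires a rebuild. The function names and the counter name "footnote" are hypothetical.

```typescript
// Given footnote counts per file in reading order, returns the number of the
// first footnote in each file (numbering starts at 1 for the whole book).
function footnoteStarts(countsPerFile: number[]): number[] {
  const starts: number[] = [];
  let next = 1;
  for (const count of countsPerFile) {
    starts.push(next);
    next += count;
  }
  return starts;
}

// Emits a CSS rule that seeds a counter so the first footnote in the file
// displays `start`, assuming each footnote increments the counter by one
// before it is displayed. The counter name "footnote" is an assumption.
function counterResetCss(start: number): string {
  return `body { counter-reset: footnote ${start - 1}; }`;
}
```

For three files of 500 footnotes each, footnoteStarts([500, 500, 500]) gives [1, 501, 1001], so each file can be rendered without rendering the ones before it.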
dauwhe: there are also human usability things here. If I'm
reading Moby Dick in print, I get a visceral sense of where I am in the
ocean of text
... as humans, we like having guideposts to things
<astearns> A reference number in printed material will be useful much longer than a URL will be, given current standards of web archivability
dauwhe: our reading systems have created analogs to these things (e.g., character counts, progress bars)
<astearns> so the conservative academic community might be on to something, at least for now
dauwhe: having a one dimensional indication of progress
through a thing is of value
... what the best way to do that in an Internet environment is, not sure
... the system digesting the content is going to have to do some
figuring out of how to do this
leonardr: yes and no. You cannot realistically consume an
entire large book. There are cases where the current state of reading is
not over the entire thing; it might just be the current piece you are
viewing
... maybe you started in the middle of the content.
... Current models of navigation are evolving.
dauwhe: just saying there has to be a model, but saying
that it's hard for computers is a cop out.
... it will have to take into account the entire environment
leonardr: strongly disagree
glazou: listening to everything so far, long ago we wanted
to have multi views, now we want multi doc, single view.
... we have to look at what remains constant. What is constant is the
browsing context and the viewport.
... it could be a way to preserve data across the rendering of the
documents, because it is about the rendering
... the browsing context and the viewport have the concept of scrolling,
which takes into account the "size" of the book. This can be rendered in
different ways
... it's worth investigating if we can glue something here to solve the
collection problem, the CSS persistency problem
... earlier I heard need for search, manipulation, materializing into a
document instance a collection. That's probably an HTML document, but
the content is still TBD
Kennethe: if you want to use the manifest, it would probably make sense to have an entry called the TOC, and it refers to the doc that has an HTML element - where you start
tzviya: this is what we have now in the EPUB space
... any other thoughts from the silent observers in the room?
*crickets*
*brief rabble rousing moment*
<leonardr> https://w3c.github.io/manifest/
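Illustrative sketch (not from the meeting): Kennethe's suggestion of a manifest entry that points at the TOC document might look roughly like the following. The members "toc" and "readingOrder" are hypothetical names chosen for illustration and are not defined by the Web App Manifest spec linked above.

```typescript
// Hypothetical publication manifest shape. Only "name" corresponds to an
// existing Web App Manifest member; "toc" and "readingOrder" are assumptions.
interface PublicationManifest {
  name: string;
  toc: string;            // URL of the document containing the table of contents
  readingOrder: string[]; // ordered sequence of documents in the publication
}

// Resolves the TOC entry against the URL the manifest was loaded from,
// giving the absolute address where reading would start.
function resolveToc(manifest: PublicationManifest, manifestUrl: string): string {
  return new URL(manifest.toc, manifestUrl).href;
}

const example: PublicationManifest = {
  name: "Moby Dick",
  toc: "toc.html",
  readingOrder: ["toc.html", "chapter1.html", "chapter2.html"],
};
```

Resolving relative to the manifest URL keeps the publication relocatable: the same manifest works wherever the set of files is hosted.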