W3C

Digital Publishing Interest Group Teleconference

02 May 2016

Agenda

See also: IRC log

Attendees

Present
Ivan Herman, Dave Cramer (dauwhe), Tim Cole, Shane McCarron (ShaneM), Markus Gylling, Alan Stearns (astearns), Brady Duga, Bill Kasdorf, Luc Audrain, Ben De Meester, Leonard Rosenthol, Peter Krautzberger (pkra), Chris Maden, Charles LaPierre, Romain Deltour (deltour_), Rebeca Ruiz (rebecaruiz), Bert Bos
Regrets
Ayla, Heather, Daniel, Vlad, Jean, Tzviya
Chair
Markus
Scribe
dauwhe


<scribe> scribenick: dauwhe

mgylling: major topic today is annotations
... but rob is stuck in traffic
... tim, ivan... is this a reason to reschedule?

ivan: I think we can manage

Bill_Kasdorf: can we move it to the end?

mgylling: sure
... OK. Let's do everything else first

<mgylling> https://www.w3.org/2016/04/25-DPUB-minutes.html

mgylling: minutes from last week
... any objections to approving?
... minutes approved.
... it's conference season. Next week many of us will be in Chicago for IDPF
... so I suggest we cancel the meeting on the 9th
... and there's another conference in Stockholm the next week
... let's wait on deciding about the 16th
... we should be back to normal week of May 23
... which is the week of the virtual F2F

Virtual F2F agenda

<mgylling> https://www.w3.org/dpub/IG/wiki/May_2016_Virtual_F2F

mgylling: so we need to work on the agenda for the virtual F2F
... so let's spend some time on this
... they are sometimes great and sometimes not
... which topics are most critical to cover?
... we've already mentioned documents that are nearly finished
... Charles, Deborah, and their team finished the a11y note
... the main topic would be the use cases collection
... are there other things we should add?

ivan: on the previous comment
... I've made a new version of the pwp draft
... which merged with locator work and jettisoned the use cases
... it's on a separate branch
... if you think it's ok I can merge
... or I can merge now then respond to issues

mgylling: is this a f2f topic?

ivan: no, this is about documents
... I think that the use case document is the absolutely highest priority document
... i don't mind if this is the only topic

mgylling: i agree

ivan: we can talk about the manifest format
... but that has a lower priority

lrosenth: and we need a straw man

mgylling: straw man of what?

ivan: manifest
... the real issue is format and syntax of manifest

mgylling: we could ask dave and hadrien to discuss BFF manifest work
... in order to see what topics within the use cases we should focus on
... manifest is one
... we've talked about portability and packaging
... why do you need portable documents when you have the web?
... what are those high-level use cases?
... we've been touching on what are the requirements of the package itself?
... like streamability, random access, digital signatures, etc
... we have some work to do there
... and maybe manifests is a sub-discussion there

ivan: I think so
... when you talk to web people, the question is why isn't the web enough?
... that's the general question to which we have to have very good answers
... including business cases, production flows, etc
... the packaging format (zip vs nonzip) is secondary

lrosenth: I agree
... i think the use cases will address a lot of that
... to both package and offline cases
... one of which being stuff we put to the side--the package and offline scenario

mgylling: packaging, manifest... do we want a 3rd and 4th topic?

ivan: security
... i'm not talking about the D-word

mgylling: privacy?

ivan: the security in terms of hacking into a publication, javascript as security issue, etc

<laudrain> security in terms of rights

lrosenth: in some ways they're related
... both fall into the category of surprising the user
... they're used to web pages, and what they can and can't do
... they're used to publications, and what they can and can't do
... but now we're blending
... are people used to movies in publication, touch events
... but how do users feel about analytics?
... about publications that phone home?
... that's where security and privacy come together
... we're creating publications that are different than they've ever been

ivan: these are all important points

dauwhe: we should learn from the problems of EPUB. Single origin is core of web security, and that might not work here

lrosenth: we also need to consider the environments, like package and manifest
... how is this material exposed? The origin model may vary

mgylling: OK. packaging with manifest as sub-discussion. Security/Privacy. Those may be enough.
... we don't want to be shallow. We want to work on prose and materials.
... so we probably want an interactive editing environment, like google docs
... Dave, would you agree that the f2f is a good time to do an introduction to manifest?

dauwhe: I'm worried we'd get lost in the weeds

clapierre: do we just go and add ourselves as participants to the f2f?

mgylling: On wiki page? Sure.
... ivan, is that the intent?

ivan: yes.

<ShaneM> BTW I would be happy to talk about note / noteref / notegroup at the f2f if the agenda feels thin.

mgylling: we probably have enough of an agenda so we probably couldn't get to noteref

rdeltour: can you hear me?
... remind us of the timeframe for the f2f on the wiki page? Starts at 12 UTC?

ivan: yes, it lasts four hours

Charles_LaPierre: does it conflict with epub meeting?

mgylling: yes
... but they should move :)
... anything else about the virtual f2f

<rdeltour> https://rawgit.com/w3c/dpub-pwp-ucr/index-update/index.html

rdeltour: quick comment on use cases
... heather is working on this branch

<pkra> that never happens!

mgylling: we've wanted to talk with annotations wg for a while
... can you describe where you are? A brief recap?

Introduction: Rebeca Ruiz (new member of the group)

rebecaruiz: hello

mgylling: can you introduce yourself

rebecaruiz: I work for Cornac Servicios Editoriales, a publishing service company
... we are very focused on syntax and typography in Spanish text
... we hope to work in this group a lot

Overview of the Web Annotation Work

<TimCole> https://www.w3.org/TR/annotation-model/

<ivan> http://w3c.github.io/web-annotation/

<TimCole> https://www.w3.org/TR/annotation-vocab/

mgylling: tim, would you like to give us an intro

<TimCole> https://www.w3.org/TR/annotation-vocab/

TimCole: (silence)

ivan: why don't I start
... I hope he hears us
... I have put in the URL for the github because that includes the references to the three docs we've published
... we had a charter that included some other items
... we could not get on with them as we didn't have the right people etc
... the work that we started
... we started with the work done in the CG
... that's more-or-less what we went on with, the data model and the vocab
... plus we have the annotation protocol between a server and client exchanging annos and data structures
... the biggest difference between what we have now and the CG doc
... is that we've made a cleaner separation between the model and vocab
... and between the serialization in JSON, and the RDF vocab
... the main reason being we wanted to have a doc
... that is readable, acceptable, and usable by web app devs
... the web anno data model is describing data model in pure json
... we had lots of discussion on what terms to use to make it palatable to json users and js
... it's actually json-ld but we don't fuss about that
... we also have web anno vocab, which is clearly rdf
... with examples in turtle
... there is also a json-ld context file which maps between json and rdf
... i will hand it over to tim in a minute, for the diffs between the cg and the current doc
... the fundamentals of the model are the same
... an anno is a small json structure
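
As a rough illustration of the JSON/RDF split described above, the sketch below hand-expands a few plain JSON keys into RDF property IRIs. The three mappings are an excerpt based on the Web Annotation context file as later published and may not match the draft discussed on this call. `CONTEXT_EXCERPT` and `expand_keys` are hypothetical names, and this is not a JSON-LD processor.

```python
# Hand-written excerpt of a JSON-LD context: plain JSON key -> RDF property IRI.
# Assumption: these mappings follow the published Web Annotation context
# (http://www.w3.org/ns/anno.jsonld); the 2016 drafts may have differed.
OA = "http://www.w3.org/ns/oa#"
CONTEXT_EXCERPT = {
    "body": OA + "hasBody",
    "target": OA + "hasTarget",
    "motivation": OA + "motivatedBy",
}


def expand_keys(annotation: dict) -> dict:
    """Replace known JSON keys with RDF property IRIs; leave other keys untouched."""
    return {CONTEXT_EXCERPT.get(key, key): value for key, value in annotation.items()}


# What a web app developer writes (illustrative example.org URLs).
json_view = {
    "type": "Annotation",
    "body": "http://example.org/comment1",
    "target": "http://example.org/page1",
}

# What an RDF-minded consumer would see for the same keys.
rdf_view = expand_keys(json_view)
print(rdf_view)
```

The point of the split is that a developer can work with the short keys only, while the context file carries the link to the RDF vocabulary.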

azaroth: the model is quite straightforward
... there's an anno, which is a web resource
... it has a body, which is the comment or the tag
... and it has the target, which the comment is about
... there's the intention, the body, and the target
... each of those can have provenance, such as creator, license, intended audience
... there can be multiple bodies and multiple targets
... and any could have any format
... you could have a video about an image
... that hasn't changed from the CG work
... we've just added some more metadata
... the new work is based around the community group's efforts on descriptions of parts of resource
... most annos will be about some fragment of something
... like a range of text on an html page
... so we have several selectors to find the content of interest on a page
... and there is workflow stuff like a state class
... that lets you record the http headers that were sent
... so if you had a single resource with a uri which gives either html or pdf
... then the selectors for how to retrieve the text would be different
... so you need to know if user was looking at html or pdf when they made the anno
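
A minimal sketch of the structure described above: an annotation with a body, a target, a selector for a range of text, and a state recording the request header. Key names follow the Web Annotation Data Model as later published, so the draft under discussion may have differed. The example.org URLs, the sample text, and the `make_annotation` helper are illustrative assumptions.

```python
import json


def make_annotation(anno_id, comment, source, exact, accept_header):
    """Build one annotation as a plain dict (hypothetical helper)."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # Provenance: who created the annotation.
        "creator": "http://example.org/user1",
        # The body is the comment itself.
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        # The target is what the comment is about.
        "target": {
            "source": source,
            # Selector: which part of the resource is of interest.
            "selector": {"type": "TextQuoteSelector", "exact": exact},
            # State: which representation the user was looking at.
            "state": {"type": "HttpRequestState", "value": accept_header},
        },
    }


annotation = make_annotation(
    anno_id="http://example.org/anno1",
    comment="This sentence needs a citation.",
    source="http://example.org/report",
    exact="most annotations are about a fragment",
    accept_header="Accept: text/html",
)
print(json.dumps(annotation, indent=2))
```

If the same URI had been retrieved as PDF, the state value would record `Accept: application/pdf` and a different selector would likely be needed to find the same passage.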

TimCole: we've also clarified how to do very simple text annotations, made easier for json
... we also refined the idea of motivations, the role of the body esp. if you have multiple bodies
... like in a copyedit environment
... you can distinguish between replacement text and commenting on substance
... we anticipate that communities may focus on parts of the model most interesting to them, and develop them further
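
The copyedit case can be sketched as one annotation with two bodies, each tagged with its own role. The `purpose` key and the values `editing` and `commenting` follow the data model as later published and are an assumption about the draft. The sample text and the `bodies_with_purpose` helper are hypothetical.

```python
# One annotation, two bodies: a replacement text and a remark about it.
copyedit_annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno2",
    "type": "Annotation",
    "motivation": "editing",
    "body": [
        # Replacement text proposed by the copyeditor.
        {"type": "TextualBody", "value": "its", "purpose": "editing"},
        # A comment on the substance of the change.
        {
            "type": "TextualBody",
            "value": "Possessive, so no apostrophe.",
            "purpose": "commenting",
        },
    ],
    "target": {
        "source": "http://example.org/manuscript",
        "selector": {"type": "TextQuoteSelector", "exact": "it's"},
    },
}


def bodies_with_purpose(annotation: dict, purpose: str) -> list:
    """Return the bodies of an annotation that carry the given purpose."""
    bodies = annotation.get("body", [])
    if isinstance(bodies, dict):  # a single body may appear without a list
        bodies = [bodies]
    return [body for body in bodies if body.get("purpose") == purpose]


replacements = [body["value"] for body in bodies_with_purpose(copyedit_annotation, "editing")]
print(replacements)
```

A copyediting tool could apply the `editing` bodies as text changes and show the `commenting` bodies in a margin.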

azaroth: talk about protocol?

TimCole: you can do server to server...
... the question is how this would play out in pwp, take books offline, then come online
... we don't quite know how this will play out
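
One possible shape for the offline-then-online case, offered only as a sketch since the group has not settled how this would work for PWP. It assumes annotations are created by an HTTP POST to a container with the JSON-LD media type and annotation profile, as in the protocol as later published. `build_create_request`, `OfflineQueue`, and the container URL are hypothetical, and no network call is made.

```python
import json
from typing import Callable, Dict, List

# Assumption: media type and profile as in the published Web Annotation Protocol.
ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'


def build_create_request(container_url: str, annotation: Dict) -> Dict:
    """Describe the HTTP request that would create an annotation (nothing is sent)."""
    return {
        "method": "POST",
        "url": container_url,
        "headers": {"Content-Type": ANNO_MEDIA_TYPE},
        "body": json.dumps(annotation),
    }


class OfflineQueue:
    """Hold requests made while offline and replay them in order once online."""

    def __init__(self) -> None:
        self.pending: List[Dict] = []

    def add(self, request: Dict) -> None:
        self.pending.append(request)

    def flush(self, send: Callable[[Dict], None]) -> int:
        """Send each pending request; a request is dropped only after send() returns."""
        sent = 0
        while self.pending:
            send(self.pending[0])
            self.pending.pop(0)
            sent += 1
        return sent


queue = OfflineQueue()
for n in (1, 2):
    anno = {"type": "Annotation", "body": f"http://example.org/note{n}",
            "target": "http://example.org/book/chapter1"}
    queue.add(build_create_request("http://example.org/annotations/", anno))

delivered: List[Dict] = []
sent_count = queue.flush(delivered.append)  # stand-in for a real HTTP client
print(sent_count, "requests replayed")
```

Conflict handling, authentication, and server-assigned identifiers are left out; those are exactly the open questions for the offline scenario.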

ivan: one more thing

<azaroth> Model: https://www.w3.org/TR/annotation-model/

<ivan> http://w3c.github.io/web-annotation/selector-note/index-respec.html

ivan: rob has already talked about the selectors and states as being very powerful in selecting parts of a resource

<azaroth> Protocol: https://www.w3.org/TR/annotation-protocol/

ivan: we also made a separate note which makes it clear that that part of the model can be used outside annotations
... it's general approach to selecting parts of a complex resource
... it extracts it and makes it available to people who aren't interested in annotations
... could be an alternative to CFI
... we've also put CFI as a possibility to express a fragment
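
To make the two points above concrete, the sketch below shows a CFI carried inside a fragment selector, and a selector serialized as a fragment identifier for use outside annotations. The `conformsTo` IRI and the `selector(...)` syntax are modelled on the specifications as later published, and the escaping shown is a simplifying assumption. `selector_to_fragment` is a hypothetical helper.

```python
from urllib.parse import quote

# A CFI expressed as a fragment selector (assumed conformsTo IRI for EPUB CFI).
cfi_selector = {
    "type": "FragmentSelector",
    "conformsTo": "http://www.idpf.org/epub/linking/cfi/epub-cfi.html",
    "value": "epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)",
}

# A selector that does not depend on the document's internal structure.
quote_selector = {
    "type": "TextQuoteSelector",
    "exact": "annotation",
    "prefix": "this is an ",
}


def selector_to_fragment(selector: dict) -> str:
    """Serialize a flat selector as a 'selector(...)' fragment identifier."""
    parts = ["type=" + selector["type"]]
    for key, value in selector.items():
        if key != "type":
            # Percent-encode everything outside the unreserved set (simplified).
            parts.append(key + "=" + quote(str(value), safe=""))
    return "selector(" + ",".join(parts) + ")"


fragment = selector_to_fragment(quote_selector)
print("http://example.org/page1#" + fragment)
```

A reading system that is not interested in annotations could still use such a fragment to point at a passage, which is the reuse the separate note is meant to enable.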

TimCole: in terms of where the group is
... charter ends oct 1
... we have these three mature WDs
... we have an upcoming f2f in berlin to move these forward, get them to CR this summer
... try to get to rec by the end of the charter
... we don't have as much on client side API
... we may have a note about html serialization
... and there is work on privacy, on publishing annotations that everyone can see
... you can signal behaviour with robots.txt, but this is beyond our scope

lrosenth: could you talk about implementation status?
... how many are known, both client and server?

azaroth: because of the CG's work in the past
... there are a lot of implementations of the CG spec
... there are fewer implementations of the WG spec, as it's not yet CR
... Europeana has implemented the model and the protocol
... for cultural heritage
... pund.it has upgraded from the CG to the WG, they have a client
... Hypothesis has not yet released their version that implements the model
... but I've seen it
... that's the client, and they're working on the protocol
... the IIIF community is committed to moving from the CG spec to the WG spec
... that would be both model and protocol

<lrosenth> that was great - thanks!

mgylling: does this mean your exit criteria fears are no longer something to fear

azaroth: I'm still a bit afraid, but we hope to

TimCole: lots of work to do on testing

azaroth: there's still work around the edges
... things like intended audience and license and provenance that we've added in the WG, there are fewer implementations

<ShaneM> there are some concerns about testing ;-)

ivan: let me add to the long list two items
... one is that I had a discussion with EDRLab
... it's on their plans to implement and add to readium
... it's not clear when it will happen
... we are hopeful they can do it, and help with testing for CR
... the other one which is more a kind of hope
... the new version of bluefire will have very good annotation facilities

<azaroth> IIIF: http://iiif.io/ and in particular: http://iiif.io/api/presentation/2.1/ is the spec that uses the Annotation work

ivan: for the time being this is their own thing, but might migrate to spec

mgylling: there are a few other epub implementations that use the CG version
... I can try to dig up the list
... rob and tim, rob, you helped us with epub adoption of CG spec, which is still a draft

<Jean_K> @dauwhe: Please do!!! I'm interested to know the extent of implementation.

mgylling: which is now waiting to be updated to the WG
... is it ready now?

azaroth: I don't think there are significant areas that will change
... maybe after Berlin F2F would be the time to start work on that, it's when we hope to go to CR
... there are a few minor things we need to discuss that might result in trivial changes

mgylling: cool

ivan: the epub version put the emphasis on JSON, which is part of our current rec
... the other thing is that one of the work items
... the json schema done for epub should be reused and updated
... the testing mechanism will use that schema

<ShaneM> (maybe).

lrosenth: your comments are around EPUB, but we're not far enough along with PWP to talk about that

ivan: the rec is about anything

mgylling: so you have this note with frag selectors
... so you won't require implementation
... but 90% of it is exactly the same as the rec itself
... the only thing which is not in the rec is a mapping onto fragment identifiers
... so people who are not interested in annos can use the same stuff for other uses
... so an implementation of the rec will have to support xpath and css selectors

ivan: that is correct with one caveat
... is that not all selectors make sense with all media types
... so implementations may concentrate on particular media types

mgylling: the w3c has tried before to introduce new selectors
... and it hasn't gone well
... do you see a dependency on browsers here?

ivan: long term, yes. It's a big unknown.
... implementations can do internally what they want
... ideally this would be implemented in browsers

azaroth: in the meantime, everything can be implemented in JS
... if browsers were to take up heavy lifting
... it would make impl much easier

mgylling: have you had any expression of interest with browsers

TimCole: we've had a long conversation about text
... interest but no action

mgylling: we're at the top of the hour

TimCole: this is a good time for feedback

mgylling: review the EDs

<astearns> possible new avenue for dealing with new selectors: https://drafts.csswg.org/css-extensions/#custom-selectors

mgylling: thanks everyone
... next week meeting is cancelled

<ivan> trackbot, end telcon

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/05/02 16:26:09 $