W3C

DPUB IG Locator TF call

03 Feb 2016

Agenda

See also: IRC log

Attendees

Present
Tzviya Siegman, Ben De Meester (bjdmeest), Romain Deltour, Bill Kasdorf, Luc Audrain, Dave Cramer (dauwhe)
Regrets
Ivan, Markus, Daniel
Chair
Ben De Meester
Scribe
dauwhe


<laudrain> Present+ Luc

<scribe> scribenick: dauwhe

bjdmeest: let's begin

<tzviya> agenda: https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0024.html

Use cases

bjdmeest: romain reviewed the use cases; I tried to come up with a couple
... do these use cases make sense?
... Romain said we need more use cases across states

rdeltour: thanks for use cases
... i think that's what we need
... the review I did was for existing use cases; some were outdated
... some are missing
... in terms of process, we should not write new ones to the wiki
... and then we edit in document or create issues
... directly on github
... I haven't reviewed your new cases yet, but they look like what we need

bjdmeest: does it make sense to go over them now?
... we could start locators doc and put new use cases there

tzviya: Ivan says we should have use cases in a single place
... we want these right in Romain's doc

rdeltour: I think this is OK
... we can discuss use cases on calls, then when we have consensus I can add to html doc

tzviya: we could enter use cases as issues in github

rdeltour: that could work
... they won't need to be fully edited; can be just an idea

tzviya: I'm trying to avoid having this be a once-a-week topic

rdeltour: some epub use cases are not there
... there is one source file that is sent to many retailers--that's how EPUB works today

laudrain: all these retailers make the book available to customers
... and the epub is downloaded to devices or cloud + devices
... or synced to several devices
... or made available to customer-specific cloud
... and I can access this file through different applications
... either directly or downloaded from private cloud
... this epub file is duplicated many, many, many, many, many times
... so there are a huge number of items
... there is one source manifestation, one isbn identifier,
... there are lots of items spread across devices and buyers
... how will we address this in pwp?

<Bill_Kasdorf> thanks, Luc, this is a really core issue

rdeltour: that's unclear to me as well
... we are lacking use cases about transitioning from state to state
... i was thinking about distribution and delivery
... in most of our cases we're talking about a single user
... what distribution model is the user using

<bjdmeest> +1

laudrain: I should try to describe this use case, perhaps with small graphic

tzviya: I think writing the use case as you described it is very clear

<rdeltour> +1

tzviya: publishers distribute one epub to many retailers, and we have to explain what this means for pwp
... is it the packaged state that goes to the retailer?
... is it up to the retailer to maintain portability?

Bill_Kasdorf: it's not necessarily the packaged file
... VitalSource might provide the unpackaged file online and not deliver a packaged state

tzviya: we need to point out that this is how publishers work
... and we need to explore retail model
... VitalSource does both packaged and unpackaged

Bill_Kasdorf: we need to be agnostic around packaged vs unpackaged

rdeltour: maybe copy/pasting the minutes into github issue is what we need
... but Luc's use case is about epub, may need to adapt to pwp

laudrain: it's highly probable that publishers will produce packaged and unpackaged thing in pwp in future
... the packaged/unpackaged might not be a publisher question, but more a consumer/retailer issue
... publisher needs to prepare doc in advance
... there is a delivery function, packaged or unpackaged
... delivery means something coherent as a whole, maybe a single folder or a zip or something newer
... this delivery is of a whole

rdeltour: that's the kind of detail we want in the use cases
... how does publisher want to transmit to the retailer?
... we must be very detailed in requirements
... sending a zip is sending a package
... if you want to allow transmitting the unpackaged publication, using ftp or rsync or something
... we should be explicit about that

laudrain: it's true with canonical location we could transmit only the locator to the retailers
... and they would have to grab the publication
... not sure if we will be working like this in future
... question is what is the canonical location

bjdmeest: this is something to be discussed
... we should start from the down-to-earth current use cases
... without getting lost in possible solutions first

<rdeltour> +1

bjdmeest: we can move to the other use cases
... I will add those use cases as github issues
... and concerning the old ones,
... are there any other comments?
... the review of romain was to the point and made sense
... are there other remarks on those use cases?
... otherwise we can focus on new use cases

tzviya: you had some questions?

bjdmeest: just want to wrap up use cases discussion

(vocal and enthusiastic consent)

Locators

bjdmeest: do we agree on some sort of canonical url
... if we have pwps that are the same content-wise, is there a url that binds them?
... doesn't have to be stateless
... grouping pwps that are the same but in a different state

tzviya: someone must have an opinion about this :)

rdeltour: not sure I agree on canonical url as we define it now

<tzviya> paginated view of use case repo: http://w3c.github.io/dpub-pwp-ucr/

rdeltour: mostly because we have 2 dedicated urls, one for packaged, one for unpackaged, plus a third for canonical

<tzviya> Use case repo: https://github.com/w3c/dpub-pwp-ucr

rdeltour: I'm not convinced 3 urls is good design
... once we have unpackaged pub on the web, the url to this will be shared via social media
... and users will access this unpackaged url
... so the canonical url is not used
... so where does canonical url fit

Bill_Kasdorf: I'm confused about whether we need a canonical url
... or a canonical segment of a url to be shared across states
... I want the url to resolve to an actual thing
... either to packaged or unpackaged
... but then you can get to the other
... in order to do that, the url syntax must begin with something canonical across all states
... it's a component of all the urls for that pwp

bjdmeest: best practice would be that canonical url refers to the state accessible in the easiest way--unpacked online

Bill_Kasdorf: how is that different from unpacked?
... so you declare unpackaged as canonical?

bjdmeest: yes
... inside that package you have metadata with canonical url
... so you have connection between all urls

Bill_Kasdorf: are we by default saying unpacked is canonical
... if publisher says packed is canonical
... can you get online?

tzviya: I don't think publisher defines what is canonical

laudrain: at which stage of distribution can canonical location be defined?

bjdmeest: I think that's a valid point

Bill_Kasdorf: additional states might be created over time
... all pointing to the same canonical pwp

tzviya: why is that a problem

Bill_Kasdorf: zip, tar are packaged but not the same

rdeltour: I'm not sure that pwp always has a canonical url
... if a user creates a local pwp using local tools
... it wouldn't have a canonical url before it's published

<bjdmeest> +1000

rdeltour: I'd say it doesn't have a canonical url unless it's available online

laudrain: we could try to define from reader POV
... use cases are mainly about annotations--it's a reading question
... maybe publisher has nothing to do with canonical url
... it inherits a canonical url when distributed

tzviya: I'd hesitate to define in reader terms, as they don't have understanding
... should use http perspective

laudrain: perhaps define from upstream
... not downstream from publisher

bjdmeest: you know where you got the pwp from
... if you got it from a certain website, that's the first location you know
... as a reader

laudrain: in that case depending on where I buy this pwp
... it may come from different locations

bjdmeest: I think so
... this relates to ivan's comments
... about breadcrumbs of locators
... this pwp is derived from this pwp
... I got it from company a, from publisher b
... we might need more granular derivation
... to move forward
... i'll write a first definition from an upstream point of view
... it will probably be very bad, but we can improve

tzviya: are we using canonical url when we mean base url

rdeltour: we're using canonical url without consensus on what it means

tzviya: ivan says we might not need one
... we need to talk about what we're trying to accomplish
... this group is tending towards the idea of a base url, www.book.com
... book.com that becomes book.com(#|/)other things
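
As a rough illustration of the base-url idea described here, the sketch below derives several locators from one shared base. The base `https://book.example/`, the file names, and the choice of `/` or `#` as separator are all hypothetical; the group has not agreed on any syntax.

```python
from urllib.parse import urljoin

# Hypothetical base url for one publication; not an agreed PWP syntax.
BASE = "https://book.example/"

def path_locator(base: str, suffix: str) -> str:
    """Build a 'base/other-things' style locator."""
    return urljoin(base, suffix)

def fragment_locator(base: str, fragment: str) -> str:
    """Build a 'base#other-things' style locator."""
    return f"{base}#{fragment}"

def shares_base(locator: str, base: str) -> bool:
    """True if the locator begins with the shared base."""
    return locator.startswith(base)

# Assumed names for a packaged state, an unpacked state, and a fragment.
packaged = path_locator(BASE, "packaged.zip")
unpacked = path_locator(BASE, "unpacked/index.html")
chapter = fragment_locator(BASE, "chapter-2")
```

Under this assumption every locator for the publication starts with the same base, which is the "canonical segment" Bill_Kasdorf asked about earlier.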

Bill_Kasdorf: exactly

tzviya: we've seen those proposals, with positives and negatives
... right?

bjdmeest: yes

tzviya: let's talk about that base url structure
... I don't know that we need to define the state of the base url
... what's the technical meaning of a publication having that base url

laudrain: instead of book.com, could we say book.isbn.com?

tzviya: it could be laudrain.com :)

laudrain: it's unique

Bill_Kasdorf: let's call it pwp1.com

laudrain: idea is it's unique

tzviya: could be ORCiD

Bill_Kasdorf: if you use actionable DOI, then that could point to different states

tzviya: I'm not happy with DOI right now

Bill_Kasdorf: that's how it works for journals now

tzviya: when it works

rdeltour: do we want to discuss nature of canonical url

bjdmeest: would be good to get more comments in advance

rdeltour: can we list consensus points
... pwp doesn't have canonical url until it's published online

<bjdmeest> +1

Bill_Kasdorf: my reservation is
... principle is that there is no difference between online and offline

rdeltour: if there is no online version, what happens?

bjdmeest: doesn't make sense to have locator to offline version

laudrain: in pwp there is web

Bill_Kasdorf: I was confusing online with unpacked

rdeltour: there is still a need to define locators for offline pwp
... as users would want to annotate
... and we need to reach internals

tzviya: right

bjdmeest: we first need to locate the pwp
... then locate resource in that pwp
... base url is about locating the entire thing
... then need to locate resources inside pwp

<rdeltour> +1

bjdmeest: I don't think the two should depend on each other
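
One way to picture the two-step lookup (locate the pwp, then locate a resource inside it) is the sketch below. It assumes, purely for illustration, that a locator is a base followed by a relative resource part; the example bases `https://pwp1.example/` and `file:///books/pwp1/` are hypothetical.

```python
def split_locator(locator: str, base: str) -> tuple[str, str]:
    """Split a locator into (publication base, resource inside it)."""
    if not locator.startswith(base):
        raise ValueError("locator does not belong to this publication")
    return base, locator[len(base):]

def relocate(locator: str, old_base: str, new_base: str) -> str:
    """Keep the internal resource part and swap only the base."""
    _, resource = split_locator(locator, old_base)
    return new_base + resource

online = "https://pwp1.example/chapters/ch2.html#p5"
offline = relocate(online, "https://pwp1.example/", "file:///books/pwp1/")
```

Because the resource part is untouched when the base changes, the two steps do not depend on each other, which also leaves room for locators into an offline copy.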

laudrain: I'd like to go back to Romain's statement about offline pwp
... I don't think an offline pwp can exist first
... i think the first state is online, when it's published
... before it's published there's no base url or canonical url
... it's not finished, it's not stable

<bjdmeest> +1

laudrain: it becomes available when published on the web
... so it cannot be first offline

tzviya: my concern is that this is saying
... a web publication doesn't mean it has to be published on the web
... we want this to encompass all the things
... I'm not going to create a website for my business doc
... some companies use epub for internal documentation, possibly on an intranet
... but might just be shared in other ways

<rdeltour> +1000

tzviya: or if we want pwp to be used for contracts, it can't be required to be online

bjdmeest: does this mean it can't have a url?

laudrain: it's like a namespace
... which might not have a webpage, although it's a url

bjdmeest: that's clear agreement
... a pwp needs to have some kind of URL
... so that if it's moved or copied, the URL can connect them

laudrain: the question after that is if we do publish to the web
... this url is supposed to point to this content

bjdmeest: i wouldn't make it a must
... it's a best practice
... to say that the URL is dereferenceable to the actual content
... but that might not be possible or practical

laudrain: also retailers :)

bjdmeest: OPDS might be related work
... we can close this for now
... try to write something up as a base for discussion
... the last thing is more about the discussion between ivan and bill about breadcrumbs of locators
... if you download a pwp, then it's different from where you downloaded it from
... but points to it

Bill_Kasdorf: if you download *and change* it's no longer the same pwp

bjdmeest: yes
... it's always a different object

Bill_Kasdorf: we're saying the pwp is really focused on the item, not the manifestation
... but Luc's scenario... do you now have thousands of pwps?

tzviya: that's the point of what we're doing, to avoid that

Bill_Kasdorf: only if you change it is it a different pwp

tzviya: but we annotate

bjdmeest: we could say if the checksum is different, then you have a non-identical pwp

Bill_Kasdorf: does the annotation use case survive that

bjdmeest: yes, if you don't use the anno inside the pwp

tzviya: as long as annos are not standalone
... same issue with annotating websites

Bill_Kasdorf: that's why anno spec is important

tzviya: the data model is what ties this together

laudrain: is annotation part of the content?

tzviya: it's more of a philosophical question

bjdmeest: indeed

Bill_Kasdorf: your tech description of the checksum is a nice clean way to do that
... the annos are outside that
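
The checksum idea could look roughly like the sketch below. It assumes a pwp can be treated as a mapping of resource paths to bytes and that SHA-256 is an acceptable hash; neither was decided on the call, and the sample resources are made up.

```python
import hashlib

def pwp_checksum(resources: dict[str, bytes]) -> str:
    """Hash resource paths and contents in a stable (sorted) order."""
    digest = hashlib.sha256()
    for path in sorted(resources):
        name = path.encode("utf-8")
        data = resources[path]
        # Length prefixes keep path and content boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()

original = {"index.html": b"<h1>Guide</h1>", "style.css": b"h1 {}"}
copy = dict(original)  # a downloaded, unmodified copy
edited = {**original, "index.html": b"<h1>Guide, 2nd ed.</h1>"}

# Annotations are kept outside the hashed resources, so adding one
# leaves the checksum of the publication itself unchanged.
annotations = [{"target": "index.html", "body": "nice intro"}]

same = pwp_checksum(original) == pwp_checksum(copy)
changed = pwp_checksum(original) != pwp_checksum(edited)
```

In this sketch an unmodified copy hashes the same as its source, an edited copy does not, and annotations stored separately have no effect on either.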

laudrain: I have another use case
... pwp where I have made updates (like a travel guide)
... so restaurant or hotel website or email address or phone can be updated
... the content changes

bjdmeest: the content of your files themselves doesn't change

laudrain: there may be a default, an initial value

Bill_Kasdorf: there are 2 diff use cases
... if you issue a new one it's diff
... but if you dynamically update online it's not new

bjdmeest: in first case you can say new pwp derived from old pwp
... can still make annos while keeping link to old one
... and your annos might survive all the different versions

Bill_Kasdorf: that's where ivan's breadcrumbs come into play
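
A minimal sketch of what such breadcrumbs might look like as data follows. It assumes each pwp records the locator it was derived from; the `Pwp` class, the `derived_from` field, and the example locators are hypothetical, not something the group has defined.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Pwp:
    locator: str
    derived_from: Optional["Pwp"] = None  # the pwp this one was derived from

def breadcrumbs(pwp: Pwp) -> list[str]:
    """Walk the derivation chain from this pwp back to its origin."""
    trail = []
    current: Optional[Pwp] = pwp
    while current is not None:
        trail.append(current.locator)
        current = current.derived_from
    return trail

# Hypothetical chain: publisher b -> company a (retailer) -> reader's copy.
publisher = Pwp("https://publisher-b.example/pwp1")
retailer = Pwp("https://company-a.example/pwp1", derived_from=publisher)
local = Pwp("file:///books/pwp1", derived_from=retailer)

trail = breadcrumbs(local)
```

With a chain like this, an annotation made against the reader's copy could still be traced back to the retailer's and publisher's versions.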

tzviya: do we want action items

laudrain: I'm ok to copy/paste use case to github issue

bjdmeest: I'll try to make a first suggestion on how this canonical url can be defined
... these last few minutes of discussion were important for the breadcrumbs idea
... that's it

tzviya: I'll send around minutes

bjdmeest: thanks all

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/02/03 16:17:26 $