Digital Publishing Interest Group Teleconference

05 May 2014

See also: IRC log


Present
Liza Daly (liza), Ivan Herman (ivan), Ben Ko (benjaminsko), Tzviya Siegman (Tzviya), Markus Gylling (mgylling), Julie Morris (Julie_Morris_BISG), Tim Cole (Tim_Cole), Brady Duga (duga), Phil Madans (philm), David Stroup (david_stroup), Michael Miller (AH_Miller), Laura Fowler (Laura_Fowler), Bill Kasdorf (Bill_Kasdorf), Bert Bos (Bert), Karen Myers (karen), Gerardo Capiel (gcapiel), Luc Audrain (Luc), Dave Cramer (dauwhe), Suzanne Taylor (Suzanne_Taylor), Liam Quin (Liam), Paul Belfanti (pbelfanti), Alan Stearns (Stearns), Shinyu Murakami (murakami)

Regrets
Thierry Michel
Stuart Sim, Scholastic

Chair
Markus Gylling

Scribe
Ben Ko


<mgylling> prev weeks minutes: http://www.w3.org/2014/04/28-dpub-minutes.html

Metadata TF

mgylling: 2 agenda topics for today and previous week's minutes to be accepted. minutes accepted.
... second topic is structural semantics, first being continuation of last week which ended prematurely
... what are next steps for metadata? bill, in terms of harvesting info from interviews, where are we?

<mgylling> BillK’s summary: https://www.w3.org/dpub/IG/wiki/Task_Forces/Metadata#Phase_1_Strategy

mgylling: last week's discussion coalesced in a good area--here's what we were hearing--priorities are: folks want to associate metadata with publications...down to phrase level..people's names..
... liza gave good introduction framing this. we're talking about metadata in web contexts, not in contexts like products
... top of list is subject metadata, lots of vocabs very large, subject of keywords came up frequently, may handle separately...onix code lists...id's, extensibility, pedagogical information, rights information. those are priorities bill heard
... two important things: schema.org already supports a lot, as well as microdata and rdfa
... in schema.org, many vocabularies and complex. many publishers want the detail...we ought to be agnostic as to vocabs, just support the mechanism. schema.org has lots of these mechanisms. suggest partnering with BISG

<TimCole> +q

<liza> Good summary, Bill

TimCole: dynamism of metadata of certain fields, tends to change a lot more often and build up over time vs natural resource. does this have implications for best practices of ????

Bill_Kasdorf: typically terms don't go away, new terms are added. old terms aren't made obsolete. lots are volatile. changes: rights, territorial, marketing. it's arguable that one needs to be careful what metadata you embed unless you can maintain it

pbelfanti: agree we need to articulate how to support it vs back one particular standard. where there are standards, is there a way to acknowledge that. e.g. with EDUPUB
... can those particular instances (like EDUPUB) be acknowledged?

mgylling: where acknowledged?

pbelfanti: in the guidelines and recommendations

Bill_Kasdorf: can be put in the to do bucket. ability is already there to point to an authority

pbelfanti: there are different entities....there should be as much consistency as possible

ivan: i've reached out to schema.org. Dan Brickley, public face of schema.org, answered. when he is back home, happy to come to one of these calls and answer questions.

mgylling: what are we gearing up to do with schema.org?

ivan: if there is an agreement that we would work together with light vocab for schema, we should find out if they (schema.org) are interested in getting it into schema.org

mgylling: what problem are we trying to solve in one line?

Bill_Kasdorf: since we resolved last week it is appropriate to focus on books and since BISG is already working on this in examining schema.org and how well books are covered, there is an ongoing activity that we shouldn't have to duplicate

Julie: we are not far enough along to report on findings. done some initial work on common core, mapping elements... people know onix very well, trying to map onix elements to schema.org, lrmi metadata
... gerardo from benetech(?) completed a metadata crosswalk specific to accessibility within edupub. see it expanded to other data points where it makes sense. not all of onix.
... identify where there are mappings and where add'l work might be needed. consult with schema.org right and conduct a mapping

mgylling: what is the problem you are attempting to address julie?

Julie: identify where if particular metadata is input into onix where pubs can take advantage of metadata and include it in their epub files to be able to have structural metadata as well.
... with less specific goal in mind: general win to show consistency between different metadata schemas
... whether that is worth it to future of onix, hopefully they are in alignment as much as possible

Bill_Kasdorf: keep from coming up with yet another vocabulary. this is in response to enthusiasm for developing an "onix-lite" (not official name). advise we go slowly on that. instead see what BISG is doing. they represent whole book pub industry and are responsible for onix in North America

mgylling: one liner summation: current metadata schemes used by publishers are not web enabled, separate envelopes that don't blend with content properly. is that a problem?

liza: yes that is a problem

mgylling: there is an absence of mappings between various problems out there.

Bill_Kasdorf: true but i would say the problem is the capabilities of schema.org to express publishers' metadata is not well understood

mgylling: i can live with those

ivan: .... [sorry, accidentally lost a line here].. if people use same metadata but different syntax, then that metadata is visible. my mental model is to do something similar. they could map 80% of their properties to schema.org
... if they do that to onix, that is not possible as onix is way bigger. a model is to make a version of it and map it to schema.org hierarchy. so conceptually same terms could be used on web and searchable on web

Bill_Kasdorf: onix--while it's huge, it's also modular. it's conceivable, at a high level, we have the properties that could be put in schema.org. the codelists in onix would not be put in schema.org. but referenced

mgylling: two diff levels. should we have web-enabled metadata? who should host the vocab

<Luc> +100

dauwhe: our onix data is all over web, amazon's book pages is example of that. not a direct process. onix travels separate from books. quite deliberate. want to update metadata without having to update epub.
... one thing lacking: onix describes the entire publication. we may want more granular publication. no way to describe part of pub

Bill_Kasdorf: that's a fundamental point

Luc: onix.. distribution schema for prices territories rights and so on.. made it available on the web. agree with dave. metadata made available by bookstores. e.g., user is looking at book from certain country, needs to be able to be shown country-specific rights, prices per territory
... when ingested by bookstores, max is available on the web

ivan: understand that. question is: does it still make sense to define onix schema having a version of a subset of the whole thing that can be put as metadata for specific book in the book, but in various catalogs, e.g., in critiques of books, descriptions of books, so search engines can find them
... should not have pricing in there because that's too complicated. maybe publishers don't need that at all

Luc: we have this possibility already. made available to bookstore or website to consumers. the right thing as publisher for me is to include metadata inside the epub file.
... include an onix lite without pricing for reading systems to use it

ivan: having it there in a way that would be compatible with rest of schema.org

Luc: agreed

mgylling: when you french guys say onix lite, do you just mean a better dublin core or retail metadata as well?
... in the epub

Luc: both bibliographic and marketing
... often at the end of the print books, there are quotes, reviews, that would be put inside the epub itself, moved/transported/exported to onix, could be an html file that could be a new page of the book. but it would not be true metadata, just html
... onix tags enable true metadata
... through metadata tags not just div or span

TimCole: comes back to last week's discussion. libraries often use 10% of onix to create bibliographic record
... other things we talked about is linking--pricing, structural--so we have to look at onix not as onix lite, but as a new module or focus: these are the elements we need to enhance discovery, with the possibility to link to info that is dynamic
... hopefully won't require new stuff, but selectively into RDFa or schema.org approaches

ivan: it would be helpful for me if you, luc, wrote down your vision of where the importance of this core work would be
... just email or wiki page


Bill_Kasdorf: there is another party that is crucial. that is EDItEUR they have a global focus, not only looking at this from english language or US POV, seeing it from asian, etc point of view, completely familiar with RDFa, etc

Julie: we have reached out to graham to ask him to participate

mgylling: to summarize, we are kinda sorta noting that there is possibly a use for web-enabled vocab for books that has roughly same scope as onix and should live on schema.org
... in order to ascertain that this is the case, we need to discuss more with stakeholders, not just graham and EDItEUR but also with BISG so that would be first next step
... next step would be modeling how would that look. but must make sure all stakeholders are at the table. as we know, schema.org already has book object. think it's great. it's richer than dublin core, with which epub is stuck, horribly under-specified, too weak for books
... identify where dublin core needs greater specification
... action item is with chairs, and with julie and bill, to arrange necessary calls. figure out logistics offline

<liza> +1 everyone!

mgylling: any final comments on metadata?

Content & Markup TF

mgylling: topic 2. quite a while ago we had the task forces define their current top level goals in order to get going in some of the slower groups. the content and markup group had action to focus on structural semantics--problems and use cases. that was delayed and that was my fault

<mgylling> https://www.w3.org/dpub/IG/wiki/StructuralSemantics

mgylling: today we are on our way with the wiki page with walkthrough of structural semantic inflection in open web platform. link in chat.
... hopefully it is straightforward, broken into sections. first one is use cases, aiming for one dozen use cases, not going for 100's.
... many things publishers want to do are similar, so wanted bunch of core use cases. use cases at the moment are divided into 3 categories: enhanced user behaviors based on clarity of metadata, glossaries, indices, ... footnotes, generalized here to be about conditional exposure of optional content

<liza> Everybody loves popups!!!

mgylling: second category is assistive technology: one about navigation, context, cues, ability (DAISY feature for audio reading)
... third is for re-purposing of content/re-mixing, a vast problem area to address. simple structural semantics would, perhaps surprisingly, get us more than 1/2 way there
... feel free to propose add'l categories and use cases
... second part of page: we will be bringing in people from ARIA working group and from (??) the html editor to understand which of the html5 attributes we should be using to do this...
... beyond the use cases, understanding of the scope is that this interest group can come up with a recommendation to digital publishing, including future epub specs, on which way to go
... epub is using first of options, namespaced attribute, and there are several agencies(?) and good reasons for revising that. that's a quick run through of the document

<liza> @epub:data-aria-role-type

ivan: several things: one-for use cases, there is a general use case-profile specification. as far as i know, it defines a specific profile and that must be marked up somehow

mgylling: that is not structural semantics, that is publication semantics(?)

ivan: so that means within the content i would not have to mark up that certain elements are playing a certain role?

mgylling: what did you mean by profile? top level metadata

Luc: for table, missing a column "suitability" whether usage of that specific construct is suitable for whatever we want to do
... e.g., data attribute series could be used, but is sort of against the text and spirit of html5 spec.

mgylling: html editors get angry whenever we ask
... so we can add column to reflect wisdom of w3c editors themselves

ivan: look at row of custom elements. that would mean in terms of open web platform, use the web components, gives structured way of defining structured elements, make use of very small but existing scripting, and is that acceptable?
... do we allow for scripting?

mgylling: the custom elements would be eligible to be prime candidate since it would be yes in all cells?

ivan: maybe, but it has a drawback which is not in the table, which is scripting

mgylling: it falls under the weirdly named "static provisioning" vs dynamic provisioning. custom elements would be no under static provisioning. scripting is still 2nd class citizen. disabled by many reading systems. is that true in the future?
... maybe scripting will be just as it is on the web

ivan: at least somehow we can record that in the table

mgylling: those who read through the use case collection have asked "why you haven't added ...."? gerardo are you here?
... do we have a good use case around infographics? typing SVG fragments

gcapiel: this...would also extend into SVG. use cases that rich and i and others can be incorporated into this. we can link to that. didn't think you'd go into SVG itself

mgylling: this is for open web content so not only html5, could be mathml as well. could you have a look and bring in something around complicated diagrams infographics, how it could be used in that area

gcapiel: other things ...big use for braille transcription. maybe have someone from that community to see if we have covered what we need. George would know some folks
... folks from APH would know that well. also Bert Bos.
... thinking of the person who has been working..

[lost connection. dialing back in]

<murakami> O'Reilly's HTMLBook will be a good use case of structural semantics http://oreillymedia.github.io/HTMLBook/

<astearns> Bert Frees?

<tzviya> me: thanks, ben!

mgylling: thanks to Ben for scribing!
... thanks everyone. Goodbye!

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-05-06 07:20:22 $