Publishing Working Group F2F, Day 1 — Minutes

Date: 2018-05-30

Attendees

Present: Ivan Herman, Dave Cramer, Rick Johnson, Joshua Pyle, Brady Duga, Garth Conboy, Rachel Comerford, Tzviya Siegman, Marisa DeMeglio, Ben Walters, Luc Audrain, Romain Deltour, Wendy Reid, Reinaldo Ferraz, Charles LaPierre, George Kerscher, David Wood, Avneesh Singh, Benjamin Young, David Stroup, Zheng Xu, Ben Dugas, Leonard Rosenthol

On the phone: Deborah Kaplan, Jun Gamou, Jean Kaplansky, Laurent Le Meur, Toshiaki Koike, Ric Wright, Chris Maden, Hadrien Gardeur, Tim Cole, Jeff Buehler

Regrets:

Guests: Kroner the Wonderful Guide Dog

Chair: Tzviya Siegman, Garth Conboy

Scribe(s): Dave Cramer, Rachel Comerford, Joshua Pyle, Romain Deltour, Garth Conboy

Image of the WG around the meeting table

(See further images of the meeting, thanks to Ben Dugas, Kobo.)

Content:

1. technical summary
2. epub:type
3. Infoset
- 3.1. list of resources
4. Manifest serialization
- 4.1. Infoset items in json + schema.org
5. Resolutions

1. technical summary

Tzviya Siegman: where are we?
… we have a lot of open issues on WPUB
… our goal is to clean stuff up, so we can get to the next release
… we want the spec to be more readable
… we want to pull out of the weeds and get to the big picture
… we need to agree on the direction
… we want to formalize infoset and manifest
… we need to resolve some bigger-picture ideas
… like WPUB in isolation, the relationship with PWP, and EPUB4
… hopefully by the end of the meeting we can revise our spec and go to another draft

Ivan Herman: I would like to be able to go home in two days with a clear idea on what should be in the manifest in what format
… we have had too many philosophical discussions over the months
… and the clock is ticking

Tzviya Siegman: we don’t want to rewrite the charter
… we also need to address the use cases and business requirements
… Josh and Nick have been working on use cases
… we have a session tomorrow on affordances and use cases
… we need to think about who will use this specification
… and the relationship with EPUB 3.2
… if we do EPUB 3.2 next month, and EPUB 4 the month after that, it might be confusing to the market
… there is stuff in EPUB that doesn’t fit into the HTML world, like epub:type

2. epub:type

Tzviya Siegman: let’s have a history lesson
… epub:type derived from DAISY’s Digital Talking Book (DTB)
… there were terms like chapter
… EPUB 3 incorporated these terms, to make epub more accessible
… chapter, abstract, index, frontmatter, backmatter

Deborah Kaplan: technically, DAISY DTB made books accessible to blind and VI people; DTB is not for all PWD.

Tzviya Siegman: the idea is that this would make books more accessible

Garth Conboy: epub:type vocab: https://idpf.github.io/epub-vocabs/structure/

Tzviya Siegman: and provide publishers with structural semantics
… but it didn’t really help a11y
… but assistive tech did not really pick this up
… reading systems do pop-up footnotes based on epub:type
… but that’s about it
… it helps publishers move from XML to EPUB

Deborah Kaplan: pagebreak gets used, in some reading systems, too.

Tzviya Siegman: every publisher wants to say this is frontmatter or this is a chapter
… they want to maintain this vocab

David Wood: that list you give…
… these terms didn’t come from DAISY, but from typesetting
… Should these terms move into an e-reading environment?
… with this lack of update, have there been discussions that these terms don’t have value in digital formats?

Tzviya Siegman: the original intent was for a11y
… then we joined w3c and we’re trying to find out how to make this list useful

Tzviya Siegman: https://www.w3.org/TR/dpub-aria-1.0/

Tzviya Siegman: so we worked with the ARIA WG to make this useful for AT
… Matt and I found the most useful terms
… they went into the DPUB-ARIA vocab
… they map directly to a11y APIs
… these are the most widely used
… and I’ve asked what people wanted to add to the list, and there haven’t been too many requests
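Illustrative sketch (not presented at the meeting): the overlap between the epub:type structure vocabulary and DPUB-ARIA can be pictured as a lookup table. The pairs below are a small sample from the two published vocabularies. The table is deliberately partial, and the function name is an assumption made for this example. A term reported as unmapped here is only missing from this sample table; a real audit would need the complete vocabularies.

```python
# A partial, illustrative mapping from epub:type values to DPUB-ARIA roles.
# Only a few terms are listed; this is not the full vocabulary.
EPUB_TYPE_TO_DPUB_ARIA = {
    "abstract": "doc-abstract",
    "chapter": "doc-chapter",
    "footnote": "doc-footnote",
    "glossary": "doc-glossary",
    "index": "doc-index",
    "noteref": "doc-noteref",
    "pagebreak": "doc-pagebreak",
    "toc": "doc-toc",
}


def split_by_dpub_aria(epub_types):
    """Separate epub:type values into those found in the sample table and the rest."""
    mapped = {}
    unmapped = []
    for value in epub_types:
        role = EPUB_TYPE_TO_DPUB_ARIA.get(value)
        if role is None:
            unmapped.append(value)
        else:
            mapped[value] = role
    return mapped, unmapped


mapped, unmapped = split_by_dpub_aria(["chapter", "noteref", "frontmatter", "bodymatter"])
print(mapped)    # {'chapter': 'doc-chapter', 'noteref': 'doc-noteref'}
print(unmapped)  # ['frontmatter', 'bodymatter']
```

Terms such as frontmatter and bodymatter are the kind of structural value that has no DPUB-ARIA counterpart.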

Luc Audrain: re: XML and publishing
… publishers are using XML, but not for that
… we have XML for cooking or gardening or dictionaries
… that have semantics for those domains
… the XML for epub:type is the “book” structure
… we need that in EPUB 3
… we need to use the EPUB 3 as an asset and not just a publication
… we can chunk an EPUB per chapter, for example
… and you need the front matter to go with the chapter, so you need to know what all these things are
… when we moved to EPUB3 at HL, we asked all our suppliers to use epub:type, so we can retrieve the book structure
… it happens now that when we implemented ACE, the epub:type allowed us to easily implement the aria roles
… so I’m glad we used epub:type
… I hope that in WP we keep this mechanism, the structural description of the publication
… HTML5 will not add the book structure types
… but we need this for our assets

George Kerscher: there’s a thing called AAP DTD ISO 12083, an SGML vocab for publishing, which was adapted for HTML, and DAISY’s vocab came from there
… the importance of semantics is not only for a11y
… but to create meaningful comments

George Kerscher: AAP DTD ISO 12083 was created in the 80s. The first HTML elements were extracted from this DTD. Also, DAISY used the AAP DTD for the small vocabulary used in the DTD. Later ANSI/NISO Z39.98 defined a vocabulary and the epub:type terms came from that work, i.e. Z39.98.

Deborah Kaplan: re dpub-aria
… one thing that is still true
… is that some of the ARIA documents imply that all aria roles should only be used for AT
… strongly, strongly imply that @role is ONLY for AT
… we would need to talk to ARIA about that

Tzviya Siegman: dkaplan3, you have hit the nail on the head
… I agree with you

Jean Kaplansky: +1 dkaplan3’s comments.

Tzviya Siegman: based on my recent knowledge of ARIA
… adding more roles increases confusion
… and they won’t add more roles
… or change their documentation
… default semantics are often sufficient, and you risk overwriting default semantics
… so aside role=chapter gets messy, and is often unintentional
… 1. what are our options for epub:type
… 2. what do we need as standard? Is this a workflow thing? or a standard thing? a best practice?
… 3. If we talk about standardizing, what are our options

Leonard Rosenthol: the ARIA group is not clear on what they’re doing with @role
… MS is talking about using commenting, the annotations into the content itself
… they’re using ARIA in a similar way
… there’s a need to enhance HTML semantics
… but no one has a good way to do it
… maybe our group should do that

Tzviya Siegman: I’ve talked to anno people; they’ve nixed the idea of using role
… if we proposed DPUB-ARIA today, it would be shot down in two minutes

Laurent Le Meur: if we try to use aria, we have conflict between semantics and a11y
… we could use microdata

Deborah Kaplan: laurentlemeur: ++, with conflicts between semantics and aria

Laurent Le Meur: specific html attributes

Ivan Herman: the alternatives we have from a technical point of view…

Dave Cramer: in html it is in theory possible to define another attribute

Ivan Herman: any HTML parser would parse it, it would end up in the DOM, but the HTML would not be valid
… but we have extended HTML with new attributes

Benjamin Young: https://w3c.github.io/html-extensions/

Ivan Herman: ARIA is one of those, there is another one for controlling translations
… I’ve started asking the team about how it might be done
… we have to define an attribute, what are the possible values, and we have to write conformance statements
… on which elements are the attributes usable, etc
… this is a possible route ahead
… the question is, do we really need it
… what are the usages today
… which are not covered by DPUB-ARIA today
… it’s usable today
… are there values that are not in DPUB-ARIA? They will never be in DPUB-ARIA.
… then we have to define a new attribute, with all that work

Romain Deltour: what’s the use case?
… we keep mentioning ARIA
… aria is used for a11y

Deborah Kaplan: romain++

Romain Deltour: aria doesn’t need new stuff
… I don’t feel like there are use cases for the readability of the publications

Luc Audrain: +1

Romain Deltour: there might be special affordances based on epub:type

Benjamin Young: +1 to affordances first

Romain Deltour: the primary use case for such a beast is a workflow one, to help with production
… I think it’s orthogonal to what we’re doing

Benjamin Young: epub:type is partly data semantics, which is workflow
… and partly interaction semantics, like footnotes—something happens experientially
… and then look at tools
… we might not want to map those different things into the same attribute
… or there might be other tools for data semantics, like RDFa or microformats

Tzviya Siegman: the 2 questions
… 1. how is epub-type used today
… 2. what solutions are available

Joshua Pyle: I was mostly going to complain :)
… looking at an epub right now
… I have doc-abstract, a zillion things that are tagged at least three ways
… we shouldn’t keep piling stuff into aria
… structural semantics are not necessarily a11y
… what we’re doing is bigger than WP
… as I look at this HTML, it has epub:type everywhere, now I have an epub namespace in a WP
… that’s a problem
… it’s weird to have epub tagging in something that might not be epub

Tzviya Siegman: Luc gave an example of how epub:type is valuable to Hachette, separate from a11y

David Stroup: do we have that mapping of what exists in epub type and not in others?

Tzviya Siegman: at wiley we use data-*

Garth Conboy: if wiley needs to use hachette’s data, then we might need to do something

Luc Audrain: I think semantics are more than workflow

Garth Conboy: what we have from epub:type into aria is sufficient?

Benjamin Young: very rough slides of options in HTML for expressing various epub:type-s https://usercontent.irccloud-cdn.com/file/ZS6lycM6/epub-type-alternatives-in-html.pptx

Zheng Xu: we use epub:type to improve the view, like footnote
… the semantic info for epub:type is necessary
… we want to avoid different types of semantic info from different publishers
… but it might not need to be epub:type

Joshua Pyle: you can use dpub-aria

Luc Audrain: it’s not sufficient

Brady Duga: other than footnote, what do you use it for

Zheng Xu: I’ll find out later

Rick Johnson: we see lots of it and ignore everything but page-breaks
… there’s no other way of getting info

Tzviya Siegman: that’s in aria

Ivan Herman: I want to hear a use case that is not in dpub-aria

Luc Audrain: bodymatter

Tzviya Siegman: it can be useful because kindle uses it in the deprecated guide

Deborah Kaplan: I have a mild issue with saying these things
… I was going to bring up pagebreak, it’s in aria but RSs use it
… there are 3 use cases
… what is a production thing that someone uses in-house
… and what is a11y semantics

Tzviya Siegman: +100

Deborah Kaplan: jamming them all together causes problems
… putting semantics in aria roles has a bad effect on AT users
… which is orthogonal to the question of how much the vocabs overlap
… it’s important to realize that there’s a reason to keep some of these things separate from AT

Avneesh Singh: re: history, lots were in DAISY specs

David Stroup: *

Avneesh Singh: we need to evaluate what can be used by screen readers
… I’m ok with keeping aria the same
… we want to implement epub:type in screen readers
… when we talk about different vocabs from different publishers that worries me
… we should be collaborative
… if we come up with new universal semantics
… then we can map it to aria roles
… having a definite list of semantics is important

Tzviya Siegman: we have five minutes and a queue
… a11y aside, this is important, and once we have this we can map it to a11y

Luc Audrain: +1

George Kerscher: having a common vocab is important
… when publishers talk to authors, they need words that aren’t in html like glossary
… when a publisher talks to a vendor, you need a common understanding of words
… or when I’m teaching
… I think it has value across domains

Charles LaPierre: This also parallels to personalization and we are struggling with this same problem.

Ivan Herman: i’m trying to see the way forward
… tzviya and I, we have to find out what are the official ways to add new attribute to html whose values we partially control
… we must have an attribute in html that expresses the “role” of the element without using @role, which was taken by a11y
… we might need to work with other groups
… there is a personalization group that’s also looking at new attributes
… we should work with them

Charles LaPierre: BTW I am co-facilitator of the Personalization ARIA (soon to be APA) Task Force.

Benjamin Young: I posted slides on options

Tzviya Siegman: ideas like webcomponents

Benjamin Young: role= is in the list, itemtype and typeof, etc
… none of them define interaction semantics
… but using them with a defined vocab could map into the workflow use cases, or stating what’s a chapter

Dave Cramer: (describing slides)

Dave Cramer: https://usercontent.irccloud-cdn.com/file/ZS6lycM6/epub-type-alternatives-in-html.pptx

Tzviya Siegman: let’s finish the queue

Zheng Xu: we use toc, toc, and landmarks and cover

Garth Conboy: listening to deborah and then avneesh and george
… we don’t want to screw up a11y with @role
… and having common vocabs, so Ivan’s proposal for figuring out where to put that stuff…
… what I’m missing, I haven’t heard of something that isn’t in ARIA, is used in EPUB:type today, and isn’t a workflow thing

Ivan Herman: I would hate to use RDFa or microdata, because it’s complicated
… as a co-author of RDFa I can say that :)
… if we need it, the only solution is to add one single attribute whose values we define

Joshua Pyle: or we can continue to use a namespace

Ivan Herman: namespaces in html5 are dead

Jean Kaplansky: (long live namespaces… lots of publishers will need hand-holding to let go…)

David Stroup: I agree that the first question to be answered is what isn’t already supported
… if the use cases matches what already exists, we don’t need anything
… so far all our examples have been internal to the content
… the toc and reading order and such can be external data about the content
… it’s context-sensitive
… I’d want to look at something external. Do we have a use case?

Benjamin Young: 1. we need to get back to affordances, and what we want out of the list
… 2. and what we expect to happen when the user does something
… RDFa and microdata are defined for small bits, so they are conceptually different

Avneesh Singh: Is this something in the charter? Do we need to deliver? We’ve discussed for two months, without having a solid addition
… Luc proposed a few things

Tzviya Siegman: only if we had a use case

Garth Conboy: LET’S FIND problem before solution

David Wood: I’m worried about a spot solution
… it locks us into the affordance discussion
… and it allows us to create a situation where vendors know what they’re building and don’t know about the future
… RDFa is general purpose, and can deal with the future
… we should be cautious

Tzviya Siegman: no one is preventing you from using RDFa
… we’re just not making a custom vocabulary
… we do that at Wiley

David Wood: how will you encourage reading systems to make use of it

Tzviya Siegman: I don’t care about reading systems, I’m on the web

Jean Kaplansky: Readers = web browsers

Avneesh Singh: I have the same question

Ivan Herman: how do we close this session?

Tzviya Siegman: consensus is that the only use cases for terms not in DPUB-ARIA are internal workflow
… as deborah has pointed out, it’s bad to use aria for non-a11y use cases
… so options are new attribute to stay away from aria, or teaching people to be careful with @role

Jean Kaplansky: +1 education on using ARIA correctly.

Garth Conboy: the most common use case for UI is noteref/footnote
… does anyone think that’s wrong to move that set of actions from epub:type to @role

Tzviya Siegman: if you’re assigning the role to the correct elements it’s not wrong

Dave Cramer: (detailed discussion of dpub-aria note tagging)

Tzviya Siegman: doc-footnote has the same semantics as sections
… if people are assuming footnotes should be list, it will overwrite that semantic

Garth Conboy: is our most common case broken from aria perspective

Tzviya Siegman: I don’t know

Tzviya Siegman: someone open an issue in DPUB-ARIA github, let’s make sure that footnote tagging doesn’t break a11y
… Garth, can you do that?

Rick Johnson: the key is, if I’m not doing a11y, ignore aria

Tzviya Siegman: https://github.com/w3c/dpub-aria/issues

Garth Conboy: Requested DPUB-ARIA issue: https://github.com/w3c/dpub-aria/issues/13

Tzviya Siegman: it’s easy to break stuff if you use aria incorrectly

Leonard Rosenthol: @dkaplan3 - that’s how we see it too :)

3. Infoset

Tzviya Siegman: the infoset is a hot topic that leads us down many rabbitholes
… we are going to attempt to finalize the infoset before lunch
… we need to resolve some issues, make the spec more precise, the infoset does not (or does it??) need to include everything
… we can start by going through some github issues
… luc had looked over our existing infoset and let us know nothing is missing

Luc Audrain: We should have the simplest infoset possible
… it should be possible to have a web publication starting from the webpage
… we had long discussion around things like, does it need a title
… I found the current requirements very short, but enough for the web publication and certainly for epub4 in the future
… if we would compare what we have today in epub3, this infoset is too short
… there is a gap analysis that Hadrien did
… I don’t know if we need to add the full to the WP infoset

Tzviya Siegman: thoughts?

Ivan Herman: we have to start somewhere. At some point in time we will begin to map this into clear serialization.
… the bulk of the serialization will be in json which is inherently extensible
… we should begin this work of mapping and then new items may come up

Tzviya Siegman: https://w3c.github.io/wpub/#infoset

Ivan Herman: we can always see if we need additional things but I believe we are at the point that we are ready to get our hands dirty
… bigbluehat breakdown (issue 197) was helpful

Benjamin Young: Matt has put this in the draft

Dave Cramer: https://github.com/dauwhe/html-first/wiki/WPUB-examples#1-minimal-wpub-based-on-todays-spec

Leonard Rosenthol: I looked over the current draft. I think it’s a good set of material, well defined. And we have an extensibility mechanism which gives us a good foundation to start from

Rick Johnson: I’ve not been involved in a lot of the conversation around infoset. Everything I read is around markup except for privacy
… how can we expect the system to know this

Ivan Herman: the only thing we say re privacy policy is that there should be one and it should be linked from the infoset

Rick Johnson: we are clearly defining the markup - what is the privacy policy for

Leonard Rosenthol: that is if you have a publication that is declarative

Rick Johnson: we need to make clear what the privacy is for

Tzviya Siegman: let’s open a ticket to clarify this language

Rick Johnson: my (first!) issue on the privacy policy info set https://github.com/w3c/wpub/issues/203

Benjamin Young: I’d love for us to wring out what we’re affording in these things (including privacy policy which the reading system may not know what to do with)
… are we saying this because it has an effect on manifest etc
… how does this spill out experientially

Tzviya Siegman: we have to clarify the effect on the user and the system

Benjamin Young: if we don’t explain what we’re affording for with the stuff we’re expressing, then we’re missing the point of expressing them at all

Ben Walters: of all the infoset, privacy concerns the most…
… my big concerns are:
… 1. compatibility with the web today
… 2. there’s not one privacy policy or one way to interact with privacy

Ben Walters: are we enforcing that everyone interact with privacy policies? that they all click yes on them? are we requiring that everyone follow the same policy

Tzviya Siegman: privacy seems like a publisher specific thing

Benjamin Young: Privacy Policy was added via PR #95 https://github.com/w3c/wpub/pull/95

Garth Conboy: I’m with ben - the farthest we could go with this is that you may put a privacy policy in, and Reading Systems may interact with it

Dave Cramer: websites can do privacy policies, most of them do
… often with a footer that repeats
… how do we define things that apply to the publication as a whole

Romain Deltour: +1

Leonard Rosenthol: https://github.com/w3c/wpub/issues/204 - calling out UA items

Ivan Herman: I propose to remove privacy from infoset

Luc Audrain: we do not do any privacy policies within epub but we have contracts with distributors that say how the epub can be used
… we do have privacy policies
… we have applications which are programmatic
… they include privacy policies
… there is a question of privacy, usage, and the data that is collected

Ivan Herman: we agreed that this is the minimal basic infoset
… not that this is the complete one
… we acknowledge that additional ones might come in
… the manifest, the serialization of the infoset, is based on schema.org structure in json
… I agree with everything you said but we need to decide if this is part of the basic content of the infoset
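Illustrative sketch (not presented at the meeting): a manifest that serializes a minimal infoset as schema.org-flavoured JSON might look roughly like the following. This is a guess at the general shape only. "name", "inLanguage" and "url" are existing schema.org properties, while "readingOrder" and "resources" are placeholder names assumed for this example. The set of required items in the helper is also an assumption, since the group had not settled it.

```python
import json

# Hypothetical manifest sketch. "name", "inLanguage" and "url" are schema.org
# properties; "readingOrder" and "resources" are placeholder names assumed
# for this example, not terms fixed by the group.
manifest = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    "name": "An Example Publication",
    "inLanguage": "en",
    "url": "https://example.org/book/",
    "readingOrder": ["chapter1.html", "chapter2.html"],
    "resources": ["css/book.css", "images/cover.jpg"],
}


def missing_basic_items(candidate, required=("name", "readingOrder")):
    """Return the assumed-required keys that are absent or empty in the candidate."""
    return [key for key in required if not candidate.get(key)]


print(json.dumps(manifest, indent=2))
print(missing_basic_items(manifest))       # []
print(missing_basic_items({"name": "x"}))  # ['readingOrder']
```

Because JSON objects accept extra keys, additional infoset items could be added later without changing this basic shape.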

David Wood: of all the affordances that we’ve discussed, privacy is the key difference between webpage and epub. it’s more important than page break.
… There are lots of good reasons to sweep this under the rug and good reasons not to

Luc Audrain: +1

David Wood: it is the difference between a society where we have the expectation of privacy when reading a book
… if we do the convenient thing by treating privacy as a legal requirement or vendor requirement, we risk being complicit in a shift in the social experience of reading

Tzviya Siegman: I don’t think we can solve that with a privacy policy

Brady Duga: I think that’s an important point - privacy is important. I don’t know that we can hit the requirements necessary to address that in this spec
… the interaction of privacy policies makes it impossible for us to specify this

David Wood: we should at least make a philosophical stance
… we expect privacy even if it is not provided

David Wood: I have an outstanding action to approach this in a ticket

Rick Johnson: I am the privacy officer for our company and implemented GDPR
… we’re conflating a concern we have with privacy and privacy policy determined by jurisdiction
… we need to separate privacy policy, rather than privacy

Tzviya Siegman: RickJ and DavidWood will be our privacy task force
… please clarify to our group what we can and cannot do

Ivan Herman: 3.3.9 in the current draft doesn’t cover what RickJ and DavidWood are talking about

David Wood: in relation to gdpr and I go to a website and the website collects info from me they have to tell me what they’re collecting, allow the right to be forgotten, etc
… if we put books on the web anyone that reads a book
… even on an ereader rather than a traditional browser
… the requirements apply

Ivan Herman: what should I, as an author, put on the webpage

Benjamin Young: it’s recommended that it be in html - why isn’t this just content in the publication
… if your jurisdiction requires it, why not just add it to the content itself and as a publisher, you can express the requirements/concerns to the users of your content

Brady Duga: I hear the question of what do you want done - privacy policy: is it in the infoset or not
… we have to put privacy somewhere according to w3c policy
… it seems that we can’t put enough requirements around this in order to put this in the infoset

David Wood: I think part of the reason that we disagree is that we are talking about publications as if they are just content - I take a book, make it into html, and then make an epub3
… therefore the publisher of the content doesn’t have anything to say about privacy
… that may not be right, because if we allow js to be a part of that package we are in a different environment
… now we have privacy, legislation, and regulation from multiple parties
… there has to be SOME mechanism if we are going to allow js in the packages

Tzviya Siegman: is it required in the default infoset

David Wood: are we ready to say it’s not supposed to be in the infoset?

Leonard Rosenthol: it’s currently recommended - you CAN put it in the infoset, but you don’t have to

Tzviya Siegman: in the default metadata set does this belong?
… many of us are saying it does not belong
… we have other items we must discuss

Hadrien Gardeur: I think that we need to be careful for infoset/metadata that are not default
… they’re very likely to be ignored by most reading systems

Hadrien Gardeur: … for the privacy policy, there’s already a rel value in the IANA link registry

Hadrien Gardeur: … why can’t we simply use that?

Hadrien Gardeur: … there’s no need to do a lot more than point to a privacy policy

Benjamin Young: +1

Luc Audrain: +1

Joshua Pyle: +1

Tzviya Siegman: +1

Brady Duga: +1

Garth Conboy: +1

Benjamin Young: https://tools.ietf.org/html/rfc6903 has all the goods
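Illustrative sketch (not presented at the meeting): RFC 6903 registers the "privacy-policy" link relation, so pointing to a policy needs nothing more than an ordinary link. The snippet below shows how a consumer could discover such a link in a page using Python's standard HTML parser. The class name and the sample page are assumptions made for this example.

```python
from html.parser import HTMLParser


class PrivacyPolicyFinder(HTMLParser):
    """Collect href values of <link> and <a> elements whose rel includes privacy-policy."""

    def __init__(self):
        super().__init__()
        self.policies = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        # rel is a space-separated, case-insensitive list of link types.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "privacy-policy" in rel_values and attributes.get("href"):
            self.policies.append(attributes["href"])


page = """
<html><head>
  <title>Example publication</title>
  <link rel="privacy-policy" href="https://example.org/privacy.html">
</head><body><p>Chapter one.</p></body></html>
"""

finder = PrivacyPolicyFinder()
finder.feed(page)
print(finder.policies)  # ['https://example.org/privacy.html']
```

Finding the link is the easy part. What a user agent should do once it has found the policy is the open question in this discussion.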

Ben Walters: it’s there but not used by anyone

David Wood: +0 only because I’m uncertain of the consequences

Tzviya Siegman: can you fix that?

Ben Walters: to convince Edge to do something like this means convincing the other browsers which means there has to be a major user need
… it hasn’t happened yet, which means it’s unlikely

Garth Conboy: BenWaltersMS: unclear anybody uses said IANA privacy link

Dave Cramer: we have needs that are so specific that it requires a brand new data structure - why can we not just put this in html?

Ben Walters: +1 dauwhe

Benjamin Young: it’s not clear to people who write links that they need to do this - we need to define the affordance before they stick it in there

Tzviya Siegman: we need a proposal for how to include privacy policy within the publication

Garth Conboy: if we can’t mandate the reading of the policy, it belongs in the content. if the publisher cares - they’ll include

Benjamin Young: until the time the reading system can recognize it

3.1. list of resources

Tzviya Siegman: https://github.com/w3c/wpub/issues/198

Dave Cramer: github: https://github.com/w3c/wpub/issues/198

Garth Conboy: https://github.com/w3c/wpub/issues/198

Garth Conboy: this was raised by Ben - we are in agreement on a default reading order as part of the infoset
… this issue is around what other resources may/must/should be included in the infoset

Rachel Comerford: the reason for including other resources is to show the bounds of the publication

Garth Conboy: for search, offlining, packaging and other affordances of the publication
… end notes and footnotes are an example of the break from the linear pathway
… there may be things in the publication that are not in the default reading order that can’t be sussed out
… ie images referenced as top level, CSS probably
… what is required to be in this other list of resources
… reading systems may want exhaustive
… other perspectives say that the web changes constantly, how can it be exhaustive

Leonard Rosenthol: I don’t think we need to mandate anything in this list but we should say “if these resources are important to your publication in XYZ use cases, then they must be here”
… having a specific list doesn’t buy us anything

Dave Cramer: it’s a burden on the author to enumerate every single list
… that kind of thing doesn’t happen on the web in general

Rachel Comerford: +1 dauwhe

Garth Conboy: I agree with that and I’ve come to the perspective that by the time we package this the resource list is exhaustive

Joshua Pyle: +1 dauwhe

Ben Walters: I agree with everyone. I don’t like a partial list that’s confusing and so not used.
… if I’m a tool that wants to take web pub offline, how do I know which elements should be?
… images? videos? etc
… how is that decision delineated?
… we need to avoid a halfway decision

Ivan Herman: if we have the reading order which are html files mostly, any image or CSS files referred from that reading order are automatically a part of the infoset items
… we have to be precise about that
… we probably don’t want to extend that to videos, which are a dangerous thing
… any resource that the author wants to be a part of the publication needs to be included
… like datasets
… I would feel fine with images, CSS… js is tricky

Benjamin Young: rel=external is in the html spec, which can be used on link and anchor tags
… and could be extended to making exceptions to what to grab (ie images outside the publication)
… the video tag presents a similar opportunity - the video and format that I get depends on the venue I am using to view it

Brady Duga: the manifest in epub is somewhat unnecessary - it was written before there was a package document.
… for pwp - it’s the stuff that’s in the package
… for the web, it’s the stuff that’s in the web
… for an offlinable wp there will be bits that are not findable
… even the stuff that is findable, the user experience is not great because of the processing need

Hadrien Gardeur: +1 for what duga is saying

Brady Duga: it slows down downloads and burns through battery

Romain Deltour: we have to ask ourselves how our publications differ from the web
… are we asking how to cache? how to work offline?
… the author has control how she develops the service worker
… we are not clear what the user agent is in the publication.

Joshua Pyle: I am focused on the pub that is not offlinable etc. At some point it may be one of those things. But right now, we need to focus on the minimal definition of the WP.
… if you want a WP - you need these three things
… making it offlinable? here are the other things
… if there is a lot of complication in creating these files
… no one will do it

Luc Audrain: will we have badly structured web publications if we do not provide more information about what’s required
… I agree with what josh says

Leonard Rosenthol: we seem to be coming down to the offline conversation

Garth Conboy: also search

Leonard Rosenthol: not search - from a technical perspective i disagree
… caching - maybe, maybe not
… if these are the use cases, could we treat them as these specific things instead of a special resource list

Garth Conboy: maybe those use cases can be generalized

Leonard Rosenthol: I don’t think we need to establish the bounds of the publication
… when we take it offline, we need to know what is coming offline

Joshua Pyle: you must be able to search within a publication

Garth Conboy: the one bullet point is that we have a bounded publication

Leonard Rosenthol: @josh - but it’s not clear what that means when you have external references…

Wendy Reid: as a user agent/user experience rep - if we’re giving the user the option of offlining
… we need to give the appropriate information
… want to download this? You’re getting 50 gigs of video. Still want it?
… we need the info to be presentable and the user agent and user should have options around them

Ben Walters: is search needed outside the default reading order? is that a requirement?

Garth Conboy: right now in epub you search the whole thing - is it important to include nonlinear content in the search

Rick Johnson: it seems that when we talk about offline, we talk about packaging as a part of offlining
… it needs to be separate

Dave Cramer: conclusions - the exhaustive list of resources is optional
… there are circumstances where it is not needed
… the spec will not stop people from exhaustively listing resources
… we need to clearly define the boundary of the publication

Garth Conboy: I think yes

Ivan Herman: there is a difference between what the web does and a publication
… if I begin to read a book of 5 chapters and I want it offline, the current web will offline what I read
… the author has to specify somehow
… search is one of the affordances that are important
… personalization as well (ie I want my book to read in night mode) I want that to apply to all chapters, not one
… what dauwhe proposes is incomplete because we have to specify what the user agent does in terms of affordances and offlining.
… there may be an optional list

Benjamin Young: +1 to defining affordances all the places

Brady Duga: so we’re back where we started - we should have an optional list of resources with clear instructions about what they should be used for
… we also need to define default with or without this list

Hadrien Gardeur: if you don’t list something, you can’t expect things to work magically

Garth Conboy: details need to be worked out
… but it sounds as though there may be consensus

Proposed resolution: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order. (Garth Conboy)

Dave Cramer: github-bot

David Stroup: is it resources that are not directly referenced or deterministically identifiable

Garth Conboy: Perhaps a nod toward Ivan’s expansion of Proposal: “Proposal: There is a default reading order. There may be an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order (plus those images and CSS files directly referenced therefrom).”

Hadrien Gardeur: I would disagree to that last one garth

Hadrien Gardeur: there are many many ways you can reference something on the Web, should we automagically include links that get dynamically added to a page using JS for example?

Proposed resolution: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order. (Garth Conboy)

Proposed resolution: There is a default reading order. There may be an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order (plus those images and CSS files directly referenced therefrom). (Garth Conboy)

George Kerscher: If we had a compiled search index on the publication, then the bounds of the search would be that index.

Leonard Rosenthol: The problem that I have with Ivan’s addition is that it could muddy the waters.

Ivan Herman: If we keep with Garth’s proposal then we don’t know what the bounds of the publication are.

Brady Duga: The bounds depend on what you’re doing. I agree.
… the bounds for search could be different than the bounds for offlining

Ivan Herman: … but this would lead to many different ways of defining the bounds.

Tzviya Siegman: As a user, I would not assume that a giant chunk of Twitter would ever be in bounds.

Dave Cramer: In the case of an inline image, clearly that’s in bounds
… devtools, e.g. know what goes with a publication.

Hadrien Gardeur: I think that the bounds should be explicit.
… If different UA determine the bounds differently, then we have a big problem.

Garth Conboy: Is that an argument to have an exhaustive list?

Hadrien Gardeur: We shouldn’t discuss exhaustiveness.

Garth Conboy: Do you have a proposal? Or does one of mine cover it?

Benjamin Young: Depending on how we define this, it affords different stuff.
… we’re all bringing our own usage cases
… if I have a massive exhaustive list, then I have no choice but to fetch the entire publication.
… if I have an option, e.g. reading a chapter of a larger textbook, that would be good.
… the HTML that loads the content can make determinations that cannot be made from an exhaustive list.

Leonard Rosenthol: In my world people are creating custom documents (rather than books or journals)
… at some point we will need to deal with understanding resources and boundedness

Tzviya Siegman: We are trying to solve a specific problem. We need to try to find agreement.

George Kerscher: If we have our reading order and indicate “external” as appropriate, that may suffice.

Proposed resolution: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order. (Garth Conboy)

Ivan Herman: I see a fundamental schism in the group…

Benjamin Young: Let’s read the proposal…
… Ivan you have expressed concern about this…

Ivan Herman: The proposal is not specific enough.
… this gives you an idea of the boundaries, but does not define the boundaries.
… if someone doesn’t include an image in the manifest, then it is not in bounds.

Proposed resolution: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond that default reading order. (Garth Conboy)

Benjamin Young: There are two approaches..

Leonard Rosenthol: https://github.com/w3c/wpub/issues/205 - we need to define the bounds

Benjamin Young: some all-knowing author created a comprehensive list, or a UA gathered a list dynamically.

Ivan Herman: We are going in circles.
… I just want a clear specification of what the boundaries are

Tzviya Siegman: Let’s go back to the queue and try to make a decision.

Dave Cramer: What do we mean by boundary?
… The goal should be to preserve a reading experience when within the bounds.
… From a visual point of view, boundary is less important.

Ivan Herman: we are still repeating things.

Romain Deltour: The issue is that we are approaching from the content. Our spec contains too little guidance for the user agents. We should focus on “What must the UA do in the absence of a manifest?”

George Kerscher: I would like the RS to tell me whether the link is leaving the book, for example.

Garth Conboy: I think that works with either of the proposals that we have.

Leonard Rosenthol: Following up… I created an issue (205): we never defined what a UA should do. Let’s separate the issues. There is consensus on Garth’s proposal. Then let’s create another in its own section

David Wood: Leonard is right to raise the issue
… We can have a defined hard boundary, or we can collapse the boundary when required (e.g. printing, offlining)

Tzviya Siegman: from TPAC, are we wrapping the web in a package or teaching the web to package things?

Garth Conboy: Closing an issue by opening another is not progress.

David Wood: This is evidence of fundamental disagreement.

Ivan Herman: This will come back and bite us.
… taking the search example, does an SVG file within HTML get searched?

Tzviya Siegman: Accept Garth’s proposal and add a new section for boundaries.

Hadrien Gardeur: Not including a resource won’t prevent it from being rendered.
… We tend to go back and forth between two issues. We should separate “how to create a boundary” from “what does the boundary afford”

Proposed resolution: “There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order.” (Garth Conboy)

Proposed resolution: Add section on Boundary determination to spec. (Garth Conboy)

Ivan Herman: +1

Garth Conboy: Close issue #198; continue on Boundary in #205.

Romain Deltour: 0+

Garth Conboy: +1

Leonard Rosenthol: +1

Rick Johnson: +1

Ben Walters: +1

Joshua Pyle: +1

Dave Cramer: +1

Benjamin Young: +1

Wendy Reid: +1

Brady Duga: +1

Reinaldo Ferraz: +1

Rachel Comerford: +1

David Stroup: +1

Charles LaPierre: +1

Tzviya Siegman: +1

Avneesh Singh: +1

David Wood: +1

Ric Wright: +1

Ben Dugas: +1

Luc Audrain: +1

Romain Deltour: +1

Marisa DeMeglio: +1

Romain Deltour: I would vote +1 if we agree to define what the UA should do.

George Kerscher: +1

Jeff Buehler: +1

Resolution #1: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order and add section on Boundary determination to spec.
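
One possible reading of Resolution #1, sketched in Python: the bounds are the default reading order plus whatever the optional resource list adds. This is only an illustration, not a format the group agreed on. The key names `readingOrder` and `resources` and the helper `publication_bounds` are hypothetical, and how a UA actually determines the boundary is left to issue #205.

```python
def publication_bounds(manifest):
    """Return the URLs considered in bounds: reading order first, then extras."""
    bounds = []
    # The default reading order is always in bounds.
    for url in manifest.get("readingOrder", []):
        if url not in bounds:
            bounds.append(url)
    # The optional list can only extend the bounds; leaving it out is valid.
    for url in manifest.get("resources", []):
        if url not in bounds:
            bounds.append(url)
    return bounds


# Hypothetical manifest: key names and file names are assumptions for illustration.
manifest = {
    "readingOrder": ["chapter1.html", "chapter2.html"],
    "resources": ["style.css", "cover.jpg", "chapter1.html"],
}
print(publication_bounds(manifest))
```

A manifest with no `resources` key still yields a bounded publication here, namely the reading order itself.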

4. Manifest serialization

Tzviya Siegman: there are open issues around serialization
… do people know what “serialization” means?
… we’re talking about the nitty gritty, the vocabulary we’re going to use, etc

George Kerscher: you’re talking about the OPF/metadata-level thing?

Tzviya Siegman: right, not the content, think about it as OPF-next

Ivan Herman: the spec talks about “Descriptive properties” vs “structural properties”
… “descriptive properties” is generally what people think about, it’s metadata
… last week I looked at the schema.org approach, assuming that we can base our infoset on it as much as possible
… using JSON-LD, and see how much we can cover
… several questions there: can we rely on schema.org, can we rely on JSON-LD and not only JSON
… whatever we think about it, the reality is that on the Web if people want their page to be well indexed they use schema.org
… there was a slight disagreement with Leonard about DC (as used traditionally in EPUB) and schema.org
… semantic-wise, there’s good mapping between DC to schema.org
… but the migration can be a problem for the EPUB community
… when I went through the terms, only 2-3 didn’t have counterpart in schema.org
… they are: writing direction (ltr/rtl), reading progression
… but we have good contact with the people who maintain schema.org, so we can talk with them
… the other problem I found is about value types

David Wood: Schema.org does have a mechanism for community extension: http://schema.org/docs/extension.html

Ivan Herman: for instance if the author name is a string, how can I specify things like writing direction?
… another issue is about the order of multiple values (e.g. author names in a scholarly publication)
… besides those issues, I feel it’s a really good fit

David Wood: http://schema.org/author can also be a http://schema.org/Organization

Leonard Rosenthol: we should first make a decision, regardless of the schema decision, about JSON vs JSON-LD
… you can serialize schema.org or DC etc in any one of them

Ivan Herman: I almost agree
… if we make the choice of going with schema.org, then there’s no choice since they require JSON-LD

Tzviya Siegman: I don’t think we need to decide yet

Hadrien Gardeur: on the Readium side we also created a mapping between our infoset/EPUB (including DC, schema.org) and it worked well
… the ability to have context is important in JSON-LD
… in R2 we decided to define a context document that sort of hides some details (names/ns) to make it closer to EPUB
… so we’ll have to ask ourselves: do we use terms as-is or do we define our own context document?

George Kerscher: JSONLD is a W3C Rec, yes?

Ivan Herman: AFAIK, the current handling of schema.org at Google does not handle the context documents
… they’re working on it, but today if you’re not using the terms the way they define it, even if correct for JSON-LD, it won’t work
… so there’s a pragmatic issue here

Hadrien Gardeur: right. It’s in the list of “cons”, and we shouldn’t list the pros and cons now, but keep in mind the question

Tzviya Siegman: we’ve also done a lot of mapping work
… we’ll share that with the group
… almost all of the stuff is mappable
… some of us are familiar with working with schema.org to extend the vocabulary
… I fully support the schema.org approach

David Wood: schema.org does have a mechanism for community extensions
… that mechanism has been used successfully by various people
… but if we’re going fully to schema.org, what does that give us?
… it requires a file separate from HTML

Romain Deltour: [people remind it can be inlined]

David Wood: I think we’re crossing the streams
… we’re saying pick this base file (URL of the pub), put this schema.org doc, but this does nothing for your package, for the online search
… it makes it harder for RS that are not browsers
… all those RS will have to understand this schema.org data and they don’t know how to do that today

Ivan Herman: you can put an @id into the JSON-LD which is not the URL of the HTML document that contains it

David Wood: didn’t we define at TPAC that the URL of this HTML would be the address of the publication?

Ivan Herman: yes
… we can put the ID of the publication as the subject of the JSON-LD, regardless of where the document is
… whether we prefer JSON or some multitude of XML files, I think I still prefer JSON
… if we don’t use the schema.org terms (or the DC terms), we are engaging to define our own vocab, and we don’t want to do that
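
To illustrate the point about the subject of the JSON-LD, here is a small sketch in which the manifest is fetched from one URL while its `@id` names the publication at another. All URLs, the `CreativeWork` typing and the variable names are assumptions made for the example.

```python
import json

# Hypothetical URLs: the manifest file lives in one place, the publication in another.
manifest_location = "https://example.org/manifests/example-book.jsonld"

manifest = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    # The statements are about the publication, not about the file carrying them.
    "@id": "https://example.org/books/example-book/",
    "name": "Example Book",
}

print(json.dumps(manifest, indent=2))
```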

Charles LaPierre: the a11y TF did get schema.org to add in various a11y-related metadata

Ivan Herman: right, in fact I point to them in the mapping I did last week

Avneesh Singh: https://idpf.github.io/epub-vocabs/package/a11y/

Avneesh Singh: we were successful to add 4-5 properties
… but the properties critical to conformance (conformsTo, certifiedBy) we couldn’t include
… while working on EPUB Accessibility 1.0 at IDPF, we had to create this vocab for EPUB only since we couldn’t get it on schema.org
… but this is very essential to a11y conformance statements
… in case schema.org can’t accept it, we’ll have to define it in WP

Ivan Herman: as a last resort we can do that
… it’s easy to do with JSON-LD
… if I understand correctly, the terms you defined ended up in the main schema.org set of metadata
… you could also have proposed an extension

Avneesh Singh: we explored it but it wouldn’t have had as much weight

Ivan Herman: right. we’ll have to discuss this with schema.org people
… as a last resort JSON-LD allows you to add your own namespace and vocab
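
A sketch of that last-resort route: a JSON-LD context that adds a prefix for terms schema.org does not offer. The namespace IRI `https://example.org/vocab/a11y#` is a placeholder, not a real vocabulary, and `expand_prefixed` is a simplified stand-in for what a real JSON-LD processor does during expansion.

```python
manifest = {
    "@context": [
        "https://schema.org",
        # Placeholder namespace for terms missing from schema.org.
        {"a11y": "https://example.org/vocab/a11y#"},
    ],
    "@type": "CreativeWork",
    "name": "Example publication",
    "a11y:certifiedBy": "Example Certifier",
}


def expand_prefixed(term, context):
    """Expand a prefix:suffix term using prefix mappings found in the context list."""
    prefix, sep, suffix = term.partition(":")
    if not sep:
        return term  # no prefix; left to the default vocabulary
    for entry in context:
        if isinstance(entry, dict) and prefix in entry:
            return entry[prefix] + suffix
    return term  # unknown prefix; returned unchanged


print(expand_prefixed("a11y:certifiedBy", manifest["@context"]))
```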

Benjamin Young: there’s an affordance that’s driving the schema.org selection
… schema.org has been chosen for search index by various parties
… we should not be opposed to extend the vocabularies for other UA beyond just the search engines

Benjamin Young: http://www.sparontologies.net/ontologies

Benjamin Young: we use the ontology (linked above) at Wiley
… these are in use at publishers
… we must keep in mind what we’re trying to afford in the different scenarios
… search index is one, it’s not the only one

Leonard Rosenthol: +1 on what Benjamin just said
… another question is: should we want another openly available industry location for a metadata document?

Leonard Rosenthol: Adobe XDM - https://www.adobe.io/open/standards/xdm.html

Zheng Xu: from our point of view, JSON-LD is very convenient

Leonard Rosenthol: https://github.com/adobe/xdm

Ivan Herman: the fact that we’ll use JSON is decided
… it’s of course correct that search is the major reason for using schema.org
… if I look at the linked data world in general, schema.org is by far the most largely used vocab
… there is a very active community around schema.org
… if we use the schema.org approach, authors, publishers, etc can use other schema.org terms in the same manifest
… it’s full of additional terms that are actually bloody useful for publishing

Luc Audrain: yes, it’s a very rich and active community
… it tries to describe the whole World
… it covers the description of the publication, but also what’s inside the publication

Benjamin Young: just a clarification: schema.org is only expressible in JSON-LD

Benjamin Young: …and RDFa and Microdata and Turtle

Leonard Rosenthol: we all agree that using schema.org to some extent is worthwhile

Benjamin Young: ..but if it starts with “JSON” and has Schema.org in it, it needs “-LD”

Leonard Rosenthol: the question is should we use it for everything?

Joshua Pyle: Ivan mentioned that the other candidate was DC
… as someone in the publishing industry for 2 decades, we don’t really care about getting rid of DC

Luc Audrain: +1

Tzviya Siegman: +1

Joshua Pyle: we want to be more webby
… we’re happy to abandon DC!

Deborah Kaplan: Libraries and archives are used to crosswalking to and from dublin core, too

Benjamin Young: I propose we start from JSON-LD, inline in the HTML, build up from schema.org terminology
… look for anything we must add to schema.org
… and that gets us started
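
A rough sketch of what reading an inlined manifest could involve for a user agent, assuming the JSON-LD sits in a `<script type="application/ld+json">` element of the entry page. The class name `ManifestExtractor` and the sample page are made up for the example; only Python's standard `html.parser` and `json` modules are used.

```python
import json
from html.parser import HTMLParser


class ManifestExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.manifests = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.manifests.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# Made-up entry page for illustration.
entry_page = """<!DOCTYPE html>
<html><head><title>Example</title>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "CreativeWork", "name": "Example publication"}
</script>
</head><body><p>First page.</p></body></html>"""

parser = ManifestExtractor()
parser.feed(entry_page)
print(parser.manifests[0]["name"])
```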

Proposed resolution: JSON-LD in HTML entry file; starting up from schema.org, and adding from there. (Garth Conboy)

Hadrien Gardeur: I have multiple issues with always embedding JSON-LD in HTML
… I listed some of them in issues before
… I think it’s the best solution when you have a single resource in the reading order, otherwise an external file is OK

Proposed resolution: JSON-LD serialization; starting up from schema.org, and adding from there. (Garth Conboy)

Ivan Herman: +1

Luc Audrain: +1

Proposed resolution: JSON-LD serialization; starting up from schema.org, and adjusting from there. (Garth Conboy)

Garth Conboy: +1

Tzviya Siegman: +1

Ivan Herman: +1

Marisa DeMeglio: +1

Romain Deltour: +1

Zheng Xu: +1

Leonard Rosenthol: +1

Joshua Pyle: +1

Hadrien Gardeur: +1

Rachel Comerford: +1

Charles LaPierre: +1

Reinaldo Ferraz: +1

David Wood: +1

Ben Walters: +1

Ben Dugas: +1

Brady Duga: +1, but I think I fell asleep for some of the conversation

Avneesh Singh: +1

Dave Cramer: +1

Romain Deltour: (for now only for “descriptive properties”, correct?)

Toshiaki Koike: +1

Benjamin Young: +0

Resolution #2: JSON-LD serialization; starting up from schema.org, and adjusting from there.

David Stroup: +1

George Kerscher: As more information for our last agenda item:

George Kerscher: The DPLA Metadata Application Profile (MAP) is the basis for how metadata is structured and validated in DPLA, and guides how metadata is stored, serialized, and made available through our API in JSON-LD. The MAP was originally developed in 2012 and has been updated occasionally since. It is based on the Europeana Data Model (EDM), and integrates the experience and specific needs for aggregating the metadata of America’s cultural heritage institutions. The current version is 4.0.

4.1. Infoset items in json + schema.org

Leonard Rosenthol: https://github.com/w3c/wpub/wiki/Descriptive-Infoset-Properties-vs.-Schema.org-table

Tzviya Siegman: WPM terms & mapping to Schema.org – A11Y first

Garth Conboy: A11Y conformance metadata (from EPUB 3.0.1) proposed to schema.org… no action yet.

Tzviya Siegman: Dave’s taking action items — the one just above is first.
… Address: url – no issue
… Canonical Identifier: could be JSON-LD @ID and/or schema.org “identifier” (text)
… Should check with DanB

Deborah Kaplan: Pubs use lots of identifiers. Canonical Identifier can be repeated; @ID could be used for THE identifier.

Rick Johnson: run away from ISBN

Deborah Kaplan: ISBN is basically a red herring here

Deborah Kaplan: technically “@ID could be used for THE identifier.” wasn’t me; I was just pointing out the danger of schema.org identifier since it doesn’t have the way to identify the single canonical id in a repeatable term

Leonard Rosenthol: There is a schema called “thing” — which has “same as” – good way to do DOI.

Rick Johnson: also avoid the shiny object that ISSN is…. it’s the same trap as ISBN http://www.issn.org/

Deborah Kaplan: Rick++, avoid the trap of any single scheme (doi, isbn, issn, etc.). We need to be more generic for our canonical ID

Tzviya Siegman: back to “Should check with DanB” on this topic.

Benjamin Young: @id should be the canonical identifier for a publication

Benjamin Young: @id = canonical identifier/publication address

Benjamin Young: https://schema.org/identifier can (or should be made to…ideally) be used for All The Other Things

Benjamin Young: …or so goes my proposal

Ivan Herman: Agree. schema.org “identifier” can be used for additional identifiers. And okay to talk to DanB.
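
The proposal above could look roughly like this: `@id` carries the single canonical identifier, while schema.org `identifier` holds any number of others. The URLs and identifier values are placeholders, and the helper names are hypothetical.

```python
manifest = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    # The one canonical identifier (also the publication address in this proposal).
    "@id": "https://example.org/books/example/",
    # Repeatable, and not tied to any single scheme. Values are placeholders.
    "identifier": [
        "urn:isbn:0000000000000",
        "https://doi.org/10.0000/example",
    ],
}


def canonical_id(m):
    """Return the canonical identifier, or None if the manifest has no @id."""
    return m.get("@id")


def other_identifiers(m):
    """Return additional identifiers as a list, accepting a single string too."""
    value = m.get("identifier", [])
    return [value] if isinstance(value, str) else list(value)


print(canonical_id(manifest))
print(other_identifiers(manifest))
```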

Tzviya Siegman: Cover

Ivan Herman: Did we find any for this?

Hadrien Gardeur: something different in R2. Should it be metadata, or <link> rel, do we worry about data replication?
… perhaps it shouldn’t be a pure metadata item.

Leonard Rosenthol: there is also a thumbnail URL — do people have big thumbs?
… thumbnail could semantically work

Luc Audrain: +1

Tzviya Siegman: could use resource with ARIA role cover

Dave Cramer: could use “image” for cover image. That’s schema.org example.

Charles LaPierre: Covers don’t have to be images.

various: not required

Ivan Herman: just part of basic set of infoset items

Rick Johnson: should this be part of the “basic” set anyhow?

Luc Audrain: yes

David Stroup: a bibliographic extension http://bib.schema.org/CoverArt

Tzviya Siegman: it’s in there ‘cause it’s important to many publications

Benjamin Young: there are people who want a cover, we should have a standard way to do it.

Benjamin Young: http://ogp.me/#metadata

Dave Cramer: can’t not have cover

George Kerscher: need descriptive alt text

Dave Cramer: q^2

Rachel Comerford: https://usercontent.irccloud-cdn.com/file/9vzQ5mZJ/THE_Q.png

Joshua Pyle: lots of WP’s won’t have a meaningful cover
… let’s not be so “book”

Romain Deltour: the ImageObject can have description

Benjamin Young: alttext: if you provided an image in JSON-LD you should/must have a title too?

Dave Cramer: cover is not required

David Stroup: not required; not all are images; and thumbnail too, possibly.

Ivan Herman: ImageObject is pretty rich, so could be fine.

Ivan Herman: could we require ImageObject, rather than URL?
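
One of the options under discussion, sketched for illustration only: the cover expressed through schema.org `image` as an `ImageObject`, so that a text description can travel with it. The group reached no decision on this. The URL, the description text and the helper `cover_description` are assumptions.

```python
manifest = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    "name": "Example publication",
    # schema.org "image" accepts a URL or an ImageObject; the object form is used here.
    "image": {
        "@type": "ImageObject",
        "url": "https://example.org/books/example/cover.jpg",
        "description": "A lighthouse on a rocky shore at dusk",
    },
}


def cover_description(m):
    """Return the cover's text description, or None if there is no cover or only a bare URL."""
    image = m.get("image")
    if isinstance(image, dict):
        return image.get("description")
    return None


print(cover_description(manifest))
```

With a bare URL as the value of `image`, no description is available, which is the gap the accessibility comments point at.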

Marisa DeMeglio: does this lead to multiple ways to describe an image?

Avneesh Singh: need to consider AT

Ben Dugas: what about consistency of cover formatting? Landscape, portrait, et al.

Hadrien Gardeur: move Cover out of descriptive properties into structural properties.
… can also enable a number of affordances
… in HTML could use many elements — doesn’t need to be ImageObject

Ivan Herman: do we need to come up with links and options?

Hadrien Gardeur: lots of options, without inventing new stuff

Avneesh Singh: +1 to cover in document

Tzviya Siegman: make “cover” be a resource (in HTML with all its goodies), then identify with ARIA role cover?

Luc Audrain: really should be in HTML

Benjamin Young: +1 to cover in (or part of?) a document…fascinating possibilities

Hadrien Gardeur: +1 to reopen this discussion once we know how the list of resources is serialized :p

Leonard Rosenthol: assumption the cover will be displayed?
… I don’t think so.
… Purely machine readable. If you want it to be rendered, it should be in the content and reading order.

Deborah Kaplan: agree with Tzviya, disagree with Leonard
… Cover is an image that displays in a number of ways (e.g., Bookshelf)
… Lots of hoops for this in EPUB as RS’s aren’t consistent.
… if no shelf view, readers still expect to be able to see the cover
… in HTML and tagging it as such would solve these problems.

Hadrien Gardeur: +1 agree about the concept, not necessarily the solution (ARIA role)

Dave Cramer: dkaplan is my hero

Benjamin Young: +N (though perhaps there are more semantics than an ARIA role might want)

Romain Deltour: “cover” ARIA role can only apply to an <img>

Romain Deltour: https://www.w3.org/TR/dpub-aria-1.0/#doc-cover

Ben Walters: agree with Leonard. If you want to show a cover, it should be in content. But, for usage like search, it wants to be metadata — can’t expect that to be dug for.

Hadrien Gardeur: +1 to what BenjaminWalterMS said about search engines, that’s why I think the concept is good but this needs to be addressed at a list of resources level

Deborah Kaplan: arguably doc-cover could be changed, since changes to dpub-aria are in scope for our WG.

Deborah Kaplan: I am in favor for doc-cover being permissible on more legal elements

Tzviya Siegman: what about both? ARIA role on an <img> for rendering. schema.org “image” pointing to ImageObject for machine readable.

Hadrien Gardeur: starting to be like EPUB (in a bad way) — data duplication. We should wait to get into structural properties.

Romain Deltour: author can do both anyhow… is this really a best practice?
… both are already allowed now.

Leonard Rosenthol: agree. Need UA requirements. Multiple places will be used differently.
… at some point we’ll need to cross that bridge.

Deborah Kaplan: “they will be used differently” is the problem that causes massive creator confusion that I was upset about. Most publication creators don’t think that way and asking them to makes their heads explode.

Dave Cramer: should sketch these options out… it’s complex enough.

Leonard Rosenthol: @dkaplan3 - which is why we need UA requirements so there is no confusion

Luc Audrain: 2nd time today we’ve used an ARIA role for something other than A11Y… bad (maybe).

Deborah Kaplan: +1 tzviya

Tzviya Siegman: Proposal… run away!

Garth Conboy: for Creators: can we take schema.org’s robust set and call it a day?

Ivan Herman: proposal: any may be used
… all refer to Person or Organization
… is it okay to use multiple? Can’t order be important?

Hadrien Gardeur: bunch of useful schema.org items — we need to specify a set of ‘em that are expected for this purpose.

Ivan Herman: +1

Luc Audrain: a “role” would be better

Tzviya Siegman: schema.org may/may-not have contributor; restricting the list has problems too.

Deborah Kaplan: We must learn from the dangers of what happened with TEI. Too much specificity will (a) not capture all use cases and (b) will make it impossible for implementers to know what to display. We should try to get most use cases, and accept generic “creator” will catch the long tail.

George Kerscher: We will identify required schema.org entries, right? I am talking about the min set.

Deborah Kaplan: Need general one (creator).

Hadrien Gardeur: agree. Contributor is in schema.org (though role is missing).

Wendy Reid: publishers are good at delivering very detailed metadata, but in retailing, ONIX or Excel sheets (oh my!) is really what is being used.

Brady Duga: Link to contributor: http://schema.org/contributor

Garth Conboy: Language & base-direction: schema.org has language, but not text-direction (of content in metadata) or reading direction of publication.

Leonard Rosenthol: not relevant in context of metadata.

Ivan Herman: each metadata item should be able to have language and text-direction

Ivan Herman: consensus is that we should take this to DanB.
… just about text-direction of metadata (not content)

Benjamin Young: e.g., language is talking about the publication not the metadata

Zheng Xu: do we need text direction in metadata?

Ivan Herman: this is the dir attribute in HTML
… dir is missing from lots of places

Brady Duga: is it just direction? Do we need, e.g., ruby too?

Leonard Rosenthol: Richard’s doc on JSON and text direction - http://w3c.github.io/i18n-discuss/notes/json-bidi.html

Dave Cramer: we’re getting too deep; need examples; all solved in HTML; can we embed HTML fragments in JSON-LD?

Ivan Herman: there are data tags for HTML usable in JSON-LD

Marisa DeMeglio: +1 to Brady; unicode doesn’t fully cut it.

Benjamin Young: ivan: @language can be added in an inline @context as the default for a single JSON-LD file…so there’s that: https://json-ld.org/spec/latest/json-ld/#string-internationalization

Tzviya Siegman: https://w3c.github.io/wpub/#wp-language-and-dir

Ivan Herman: in our document 3.3.6 language & base-direction — we define ability to specify lang & base-direction for each text string and default — the latter would refer to the publication, not the metadata.

Benjamin Young: there is some funky way to do a default for an entire JSON-LD file.
… still no bidi, no ruby
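
A small sketch of the default-language mechanism mentioned here: `@language` in an inline `@context` sets the default for plain strings, and an individual value can override it with its own `@language`. The sample values and the helper `language_of` are illustrative, and the helper is a simplified stand-in for a JSON-LD processor. As noted in the discussion, none of this covers base direction or ruby.

```python
manifest = {
    # The inline context entry sets the default language for plain strings.
    "@context": ["https://schema.org", {"@language": "fr"}],
    "@type": "CreativeWork",
    "name": "Le Petit Prince",
    # A value object overrides the default for this one string.
    "alternateName": {"@value": "The Little Prince", "@language": "en"},
}


def language_of(value, context):
    """Return a value's own @language if present, else the context default, else None."""
    if isinstance(value, dict) and "@language" in value:
        return value["@language"]
    for entry in context:
        if isinstance(entry, dict) and "@language" in entry:
            return entry["@language"]
    return None


print(language_of(manifest["name"], manifest["@context"]))
print(language_of(manifest["alternateName"], manifest["@context"]))
```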

Garth Conboy: decided… move on, this is not the metadata you’re looking for (aka talk to DanB)

Garth Conboy: Various dates — they map nicely

Tzviya Siegman: Reading progression direction — conceptually easy; just lacking the property (as it does refer to the content – through reading order)

Tzviya Siegman: Title maps to “name”– funny, but true.

Leonard Rosenthol: should pull this out of DC – names just too wrong.

Joshua Pyle: we currently use name for title

Romain Deltour: with a context, you can map “title” to https://schema.org/name
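
The mapping mentioned here can be sketched as a context that aliases `title` to `https://schema.org/name`. The helper `expand_keys` is a deliberately simplified stand-in for JSON-LD expansion that handles only direct term-to-IRI mappings, and the sample title is arbitrary.

```python
# Context aliasing the friendlier term "title" to the schema.org property.
context = {"title": "https://schema.org/name"}

manifest = {"@context": context, "title": "Moby-Dick"}


def expand_keys(doc):
    """Replace keys with the IRIs the context maps them to; unmapped keys stay as they are."""
    ctx = doc.get("@context", {})
    return {ctx.get(key, key): value for key, value in doc.items() if key != "@context"}


print(expand_keys(manifest))
```

Whether schema.org consumers honour such a context is a separate question, as raised earlier about context documents.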

Zheng Xu: how do you express a title in multiple languages?

Ivan Herman: can be done in JSON-LD
… need to make sure it’s happy with consumers of schema.org
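
For illustration, JSON-LD can carry a title in several languages as a list of value objects, each with `@value` and `@language`. The titles and the helper `title_for` are examples only. Whether schema.org consumers accept this form is the open question noted above.

```python
manifest = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    # One value object per language.
    "name": [
        {"@value": "The Little Prince", "@language": "en"},
        {"@value": "Le Petit Prince", "@language": "fr"},
    ],
}


def title_for(m, lang):
    """Return the title tagged with the requested language, or None if there is none."""
    for value in m.get("name", []):
        if isinstance(value, dict) and value.get("@language") == lang:
            return value["@value"]
    return None


print(title_for(manifest, "fr"))
```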

5. Resolutions

Resolution #1: There is a default reading order. There is an optional list of resources that may be provided to extend the bounds of the publication beyond the default reading order and add section on Boundary determination to spec.
Resolution #2: JSON-LD serialization; starting up from schema.org, and adjusting from there.