Publishing F2F, 1st day — Minutes

Date: 2018-10-22

Attendees

Present: Wendy Reid, Romain Deltour, Ivan Herman, Tzviya Siegman, George Kerscher, Brady Duga, Avneesh Singh, Marisa DeMeglio, Toshiaki Koike, Hadrien Gardeur, Jun Gamou, Charles LaPierre, Rachel Comerford, Luc Audrain, Gregorio Pellegrino, Garth Conboy, Laurent Le Meur, Benjamin Young, Leonard Rosenthol, Cristina Mussinelli, Daniel Weck, Bobby Tung, Dave Cramer, Ralph Swick, Liisa McCloy-Kelley, Juan Corona, Karen Myers, Gregg Kellogg, Reinaldo Ferraz, Joshua Pyle, Vladimir Levantovsky, Wolfgang Schindler

Regrets:

Guests: Gregg Kellogg, Guillaume Sire, Tess O’Connor, Dan Brickley, Richard Ishida, Addison Philips, Matthias Kovatsch, David Clarke

Chair: Tzviya Siegman, Wendy Reid, Garth Conboy

Scribe(s): Benjamin Young, Brady Duga, Leonard Rosenthol, Ralph Swick, Rachel Comerford

Content:

1. Introductions, misc
2. Let’s Talk About Publishing
3. Use cases, affordances
4. Boundaries
5. language and base direction in JSON-LD
6. schema.org issues
7. Github Issues
8. Resolutions

1. Introductions, misc

(lots of wonderful people greet each other with their names and titles)

Tzviya Siegman: anyone on the phone who would like to introduce themselves?

Ivan Herman: please everyone around the table please present+ yourself in IRC

Tzviya Siegman: if you’re not familiar with IRC, please let us know

Tzviya Siegman: please RSVP for dinner on the Google Doc

Tzviya Siegman: https://docs.google.com/document/d/1Mt9PTcOdmrCwIsgfxbGMGjwHlUsySU01I0D4oBkSbcA/edit?usp=sharing

Ivan Herman: also please add your appetizer and dessert choice before noon today

2. Let’s Talk About Publishing

Tzviya Siegman: I’ll start with an overview to hopefully catch everyone up
… here’s our current position
… the content in a Web Publication is anything the Web can have: SVG, images, audio, HTML, etc.
… the big questions still pivot around the manifest and the content and how those interact
… we still talk about the abstract concept of the infoset
… and we also continue to explore metadata
… as well as the exact structure
… especially with regard to internationalization
… there is a lot of interplay between the pieces
… and we often get hung up on some of these questions
… the details of which are rather important
… but we need to be careful not to debate these at too much length, as we have more to accomplish
… so. what we have right now is:
… the content - html, css, etc.
… the abstract infoset
… the manifest
… the webidl
… and the JSON-LD context
… and we’ll be discussing how those all interact
… if there’s anyone who’s new to this, please ask questions!
… k. I think we’ll move on to the next agenda item

Ivan Herman: um. I think the whole infoset debate/question should be discussed
… when we started the work, the abstraction was valuable
… if I were comparing it to something else, it would be numbers or numerals
… numbers are the abstract concept, and numerals are the real tangibles…hexadecimal, etc.
… because I’m a mathematician this abstraction helps me a lot
… and currently the whole document is written around this construction
… but we need to determine if it would be better to focus on just the manifest
… which we did consider may happen when we introduced the manifest concept

Garth Conboy: I would have said the info set is really the content of the manifest

Hadrien Gardeur: +1 to removing the infoset

Garth Conboy: I don’t really see them as separate

Ivan Herman: so let’s not forget that your statement is not quite correct
… the manifest may be incomplete by itself
… and it will need to take information from the surrounding HTML itself
… so it’s much clearer to think about the infoset
… which may be filled from many different sources
… the infoset then is important
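
Illustrative sketch (not from the meeting): the idea that a property such as the title may be filled from more than one source can be shown as a small fallback routine. The manifest key `name`, the function name `resolve_title`, and the precedence (manifest first, then the HTML `<title>` of the entry page) are assumptions made for this example only; the draft may define a different key or order.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the text content of the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def resolve_title(manifest: dict, entry_page_html: str) -> Optional[str]:
    """Return the publication title from the manifest, else from the HTML.

    Assumption: the manifest carries the title under the key "name".
    """
    name = manifest.get("name")
    if isinstance(name, str) and name.strip():
        return name.strip()
    # The manifest is incomplete: fall back to the surrounding HTML.
    parser = _TitleParser()
    parser.feed(entry_page_html)
    title = "".join(parser.title_parts).strip()
    return title or None
```

Whichever source supplies the value, the caller sees one "title" property, which is the role the abstraction plays in this discussion.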

Romain Deltour: I feel it’s mostly a communication issue
… and fear we’ll drive people away with the “abstract infoset” language
… for instance, if we talk about “publication title” it seems obvious this is the abstract concept, and not the specific serialization

George Kerscher: +1 to what Romain said

Luc Audrain: +1

Romain Deltour: so I’m not sure continuing to distinguish this is helpful

Tzviya Siegman: +1 to romain

Avneesh Singh: is there a possibility that the infoset at the abstract level includes beyond the manifest?
… so I guess we may need to have both of them

Rachel Comerford: what is the problem that the infoset was attempting to solve in the first place?

Ivan Herman: I think at the moment when we began to discuss this, things went all over the place
… what are the things we need, and then we very quickly went into how we expressed them in JSON
… so we introduced the infoset terminology, to avoid always pushing things into JSON
… so we were then focused on what the things we needed abstractly without demanding we determine the format/expression
… do we still need that now? does it help the reader? that’s really the question for today

Rachel Comerford: I’m still struggling to understand what the infoset is actually solving for us

Tzviya Siegman: we built up this list of metadata and called it the “infoset” for abstraction purposes
… then when we got to the JSON-LD discussion, we started adding MUSTs and SHOULDs
… but we still have the infoset, because things like title may be expressed in either HTML or JSON-LD
… and therefore the abstraction may be helpful

Rachel Comerford: the problem we have, then, is that we need to determine the location of this information

Tzviya Siegman: so one idea is to focus on the origins of the metadata rather than their abstractions

Brady Duga: I’m trying to think of specs that do something like this, and CSS 2.1’s box model spec takes this approach

Tzviya Siegman: https://www.w3.org/TR/CSS2/box.html

Brady Duga: there are abstract ideas explained, and then details of their expressions
… I’m not sure, though, that our spec is that complex
… however, it still might be useful to keep some expression of abstractions
… but for things like title, I’m not sure it needs its own abstraction as people understand those already

Leonard Rosenthol: I would agree with duga and think ivan’s history that we defined the infoset before we serialized it is correct
… and that the abstraction may no longer be necessary
… and hopefully we can remove the abstraction
… and fill in the missing pieces that it dealt with

Luc Audrain: so in our intro we say that a Web Publication is “pure Web”
… so the abstractions begin to explain perhaps how a Web Publication differs from a Web Site or a Web App
… it could be very technical and expressed in something like JSON-LD
… but we really should explain how Web Publications differ

Gregg Kellogg: we ran into this on the JSON-LD specifications
… we currently discuss things in strings, arrays, and dictionaries
… abstract enough that it can be implemented in something like YAML, but concrete enough for its focus on JSON

Laurent Le Meur: perhaps we could remove the abstract infoset, and instead focus on the WebIDL or some other expression of the properties

Avneesh Singh: +1 Laurent. Infoset is not so well understood

Laurent Le Meur: also the infoset is scattered throughout the document, and perhaps moving it back together into one place could be helpful
… the options for placing the title, are encoding expressions

Hadrien Gardeur: we are working with something conceptual

Luc Audrain: +1

Hadrien Gardeur: but I do think the infoset term is confusing
… but this concept of a Web Publication is what we’re selling to the world
… perhaps one of the issues with keeping things abstract is that we don’t divide well between what actually ends up in the manifest and what doesn’t

Luc Audrain: +1

Cristina Mussinelli: +1

Tzviya Siegman: one concern we’ve had in the past is duplication
… the requirements on the abstract infoset can be confusing
… and using properties of a publication might be more clarifying

Ivan Herman: o.k. I don’t think we should go on with this discussion too much longer
… I propose that Matt, I, and perhaps the chairs look through the document
… and take these comments into account
… and essentially remove the term…which I realize may frighten some here
… but I don’t think having telcos on this longer makes sense
… so we will work up a pull request to address this

Proposed resolution: edit the infoset and properties section, and introduce a PR to the group (Benjamin Young)

Avneesh Singh: should we express that this is a non-operative section?

Tzviya Siegman: I think that is part of the intent, but as yet it’s not that concrete

Charles LaPierre: +1

Ivan Herman: we hope to have continued discussion around the specific text once it’s sent as a pull request

Resolution #1: edit the infoset and properties section, and introduce a PR to the group

Tzviya Siegman: since we have 10 extra minutes, does anyone have other questions around the interactions of the different pieces, manifest, infoset, webidl, etc?

George Kerscher: are web browsers today looking for this manifest?

Luc Audrain: +1

Ivan Herman: no. and it remains to be seen if/when that will happen

George Kerscher: so that is why we begin with an HTML document?

Tzviya Siegman: correct.

3. Use cases, affordances

Luc Audrain: so, these are at this point very similar to Web Apps.
… but it will be hard for publishers to make something like that consistently and as a standard

Tzviya Siegman: I believe this is issue #271…and we’ve discussed this frequently
… what does it mean to be “WP-aware”?
… I don’t want to derail things, but we have yet to define what it means to open a WP
… Franco has done an amazing amount of work on the use cases and requirements document

Ralph Swick: -> https://github.com/w3c/wpub/issues/271 #271 : WP rendering in non WP aware browsers

Tzviya Siegman: https://w3c.github.io/dpub-pwp-ucr/

Joshua Pyle: the changes I’ve made were checked in on Friday
… and I’ve noticed some have been merged
… they’re good. Franco did some great work identifying areas that weren’t covered
… so the status feels like it has what it needs and just needs some editing and then publishing
… there may, however, need to be some trimming of requirements that I’ve not understood when I started editing the document

Leonard Rosenthol: can you give us any sort of general overview about the changes?
… were these clean up changes? new use cases? can you generalize these into a summary for us?

Joshua Pyle: sure. there were no new use cases.
… Franco has submitted some that Tzviya and I have yet to review
… it was about a two year old document
… so it’s been tidied up a bit, and in the last 48 hours there’s some new content
… but we’ve also done some general clean up
… however, i think this group has bigger fish to fry than more use cases
… I don’t know how in depth Tzviya wants to go
… there’s one huge one, what does the UA want to do?
… and that one could take a while
… for most of these, we’d all agree easily with about 90% of these
… but 10% of the ones related to user experience–layout, user movement through the document, etc
… they sound simple, but they have frequently been hard to discuss
… I don’t know how much of these are practical for the first version

Tzviya Siegman: so, since I merged many of Franco’s PRs, I can give a brief overview
… he did add more use cases based on what’s gone into the spec
… he also added comments, so you’ll see comments like “I did not see these fields, help wanted”
… this is where we request the working group to review/contribute
… the intent was to make the UCR document match the WPUB document’s current reality
… when heather was the editor, she’d started a pattern of adding requirements at the end of the document
… Franco’s continued that with the intent of making UCR and WPUB match
… we have a WebIDL and we have metadata, but what does that mean if I open the web publication?
… we keep getting hung up here
… and if I want to understand that with the UCR, I open section 2.2.1
… and it explains more about what could be done with that metadata
… so, today, we can talk about what can be done

Ivan Herman: the most important thing is to link the UCR with WPUB…from both sides
… right now, we have the “affordances” section
… and also with the specific metadata
… and those refer to their use cases
… and for many it’s obvious, but for some it’s not
… we also want to do the same from the UCR document
… whether we can or cannot do the use case described with what’s expressed in WPUB, with links back to how to make that happen in reality

Hadrien Gardeur: the document is very useful, but I’m concerned that whenever we talk about a UCR or when we talk about what a non-WP does or does not
… we still don’t determine the minimum of what a WP-aware reader must do
… there are some things that are not minimum, like offlining, that we talk about as if they were
… and I fear that by trying to solve them all at once we solve none of them

Leonard Rosenthol: I think that what I’m hearing is two different things
… I completely agree with the desire to map WPUB requirements to the UCR document
… what I have a problem with is using the UCR document to put requirements on UAs
… in PDF specs for example, we talk about format requirements vs. process requirements
… we have format requirements in WPUB, but we lack process requirements
… that seems like a very normative core to our spec which we should add
… we’re describing not only what they’re expressing, but what they expect to have happen when they do it

Garth Conboy: +1 Leonard

Leonard Rosenthol: the UCR, however, is not normative, and has no “Thou Shalt Do X” in it

Tzviya Siegman: typically, it depends on the spec, but usually that sort of thing doesn’t appear in W3C specifications
… I’d spoken with Josh about creating three or so “focal” use cases
… one might be offlining
… it’s quite complicated
… but it seems essential
… so we write some focal use cases around that and other core requirements

Garth Conboy: how do you see the UCR? is it just augmenting the WPUB spec?

Tzviya Siegman: yes. I agree with that

George Kerscher: I agree with garth and Hadrien. That we should reference and describe the things in our UCR document
… and point to them from the WPUB spec
… but we shouldn’t go from the UCR document and then figure out how to modify the WPUB spec to match
… we should focus on implementable WPUB

Ivan Herman: I’m in agreement with a minimum viable approach to WPUB
… but the hard part is determining what is normative and what is non-normative
… if we put it as normative, then we MUST have (per W3C process) several implementations for every one of those MUSTs
… if it’s informative, then we can just leave it there
… so, for this MVP approach, those things should be normative, but we should be careful here
… because we have to test and verify all these things
… so if it goes into WPUB as normative, we should be very careful

Tzviya Siegman: so, coming back to the UCR, what things should we add to this MVP for WPUB?

Garth Conboy: all MUSTs but some SHOULDs?

Tzviya Siegman: do we even have that language in the UCR?

Garth Conboy: yeah, we do have that in the UCR currently

Joshua Pyle: I love the MVP idea
… I’ve been going through the UCR document for weeks
… for instance, showing the TOC while you’re anywhere within the document

George Kerscher: The TOC must be omnipresent

PROPOSE: the table of contents must be accessible from anywhere in the publication, part of MVP

Wolfgang Schindler: +1 to bigbluehat on TOC

Benjamin Young: All for clarifying stuff, but really need a testing champion

Tzviya Siegman: We have one! Chris

Benjamin Young: Need another
… Need to figure out how to make this omnipresence testable

Ivan Herman: we’ve discussed the toc issue several times
… there are many more of these though
… like search should be across all the things in the publication
… and figure numbering, etc, should be across all the things in the publication
… these seem like sensible and unique-to-publications features and capabilities

Hadrien Gardeur: I don’t think these are MVP
… frankly, I only see two things as MVP
… going through the reading order, and accessing the ToC

Leonard Rosenthol: I was going the same place as Hadrien.
… what ivan listed require many more discussions
… and I’d not consider them MVP

Garth Conboy: I’m very hesitant to put requirements on UAs.
… things like search, etc, seem likely to be added by UAs, but not likely to be an MVP

Ivan Herman: searching across multiple documents is something unique to publications

Garth Conboy: I’m generally agreeing. The top two (toc access and directional progression through the document) do state an MVP
… but then what can happen in non-WP aware UAs is also a consideration
… well, the actual one is understanding the manifest
… so, perhaps this is a great discussion for us to determine these things and build up from there

Tzviya Siegman: josh perhaps you can think through what’s MVP and what’s WP-aware and add these things to the UCR

Laurent Le Meur: so, we can do some of these things by going back and forward in a current browser
… or is this a directional progression from point a to point b to point c without back and forth?

Garth Conboy: I’d say directional from resource to resource without back and forth
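
Illustrative sketch (not from the meeting): "directional from resource to resource" can be pictured as stepping through an ordered list. The function names and the representation of the reading order as a plain list of URL strings are assumptions for this example, not anything defined by the draft.

```python
from typing import List, Optional


def next_resource(reading_order: List[str], current: str) -> Optional[str]:
    """Return the resource that follows `current`, or None at the end."""
    # list.index raises ValueError if `current` is not in the reading order.
    index = reading_order.index(current)
    if index + 1 < len(reading_order):
        return reading_order[index + 1]
    return None


def previous_resource(reading_order: List[str], current: str) -> Optional[str]:
    """Return the resource that precedes `current`, or None at the start."""
    index = reading_order.index(current)
    return reading_order[index - 1] if index > 0 else None
```

For example, with a reading order of `["cover.html", "chapter1.html", "chapter2.html"]`, calling `next_resource` on `"cover.html"` yields `"chapter1.html"` without the reader returning to a table of contents in between.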

Avneesh Singh: this sounds like determining the bounds of a publication

Luc Audrain: about WP-aware UAs
… is it possible to speak about a UA that understands at least JS and CSS?
… what kind of engine are we speaking about?
… what must it support?
… what if it can’t do CSS or JavaScript?
… what does it mean in that moment?
… I would like to propose that we’re talking about UAs that can do JavaScript and CSS as a minimum

Avneesh Singh: as far as I understand, WP-aware vs. non-WP-aware, the keyword is “WP”
… not JavaScript support, etc.
… there should be points where we say “this is a WP aware UA feature”
… the JavaScript and CSS are not part of those requirements

Luc Audrain: but that is my point: by not determining whether we have these things or not, we do not have a foundation to build upon
… I feel like we speak about engines that are too small and lack features, such that we can’t build a WP experience in a non-WP-aware browser

Ivan Herman: RickJ asked me to forward his greetings to everyone around

Ralph Swick: [I suggest that we can describe what it means to be “WP Aware”; what it means to conform to the WP specification but it is not practical to say what a non-“WP Aware” agent does with a WP. We can design WP such that non-aware clients are more likely than not to do something helpful.]

Avneesh Singh: just one note: if a non-WP-aware UA uses JavaScript provided with the publication to create a WP-like experience, then that non-WP-aware UA becomes WP-aware

Hadrien Gardeur: this is back to the discussion earlier, but I think going back and forth to a ToC is not an MVP feature

Tzviya Siegman: I think we need to be careful with defining User Agent
… we can define WP-aware
… but we need to be careful to determine the meaning of UA

Garth Conboy: I keep thinking Reading System, and I’m not sure if that’s a UA. I think it is, but it has been confusing
… I think what laudrain_ and Avneesh are saying is that if you can build the reading experience and distribute that with your publication then you get a WP-aware UA with your publication
… and inasmuch as WPs are distributed on the open web platform, a WP can be distributed with such a built-in WP-aware UA
… the publication itself causes the WP to be WP-aware

George Kerscher: Hadrien I like to be able to move through the document without going back and forth
… but I also want to be able to collapse a collection of 1000+ documents, and get to just part of that

Hadrien Gardeur: yes. I want that too.

George Kerscher: then we’re in agreement. great :)

Romain Deltour: sidenote: in HTML, “user agent” is defined as a conformance class (in 2.1.8: https://html.spec.whatwg.org/#conformance-classes)

Tzviya Siegman: let’s come back to the MVP for WP-aware

Hadrien Gardeur: there’s no concept of this now
… if you build something like this now, it would be a Web App
… and those lack an understanding of scope
… especially with search…so I think that should not be an MVP
… also offline-ability is hard to achieve consistently
… things like comics, etc, don’t have space available usually
… and are therefore not good MVPs

Leonard Rosenthol: and this brings us back to the discussion of boundedness

Garth Conboy: Perhaps: MVP == (get to TOC, move through reading order); MGP == search within bounds of WP

Leonard Rosenthol: do we list all resources? do we list just some resources?
… do we allow them all to be searched? offlined? etc.
… then we’ll need to determine per resource what’s possible for search, offline, etc. per resource

Ivan Herman: so we have some MVP thoughts
… but if we stop there, then we have this minimal thing
… and a huge blob of features in the UCR
… and then UA developers pick just what they want
… and then some of these affordances, however hard they may be, should be considered fundamental to a publication
… it perhaps is a difference between MUSTs and SHOULDs, but they should still be expressed

Wendy Reid: the product creation folks understand this
… and ivan is reading my mind
… what we probably have to do is create tiers
… MVP comes from product design
… one of the core product design concepts is iterative improvement
… and we could benefit here from a list of iterative improvements up from an MVP

Brady Duga: so I have an unhelpful comment, but I’m also going to be generally agreeable
… I don’t think search is an MVP because it’d be possible to ship something that can be searched, but without expectation that it will be searched
… if a large group says, it’s hard to implement and hard to create then is it an MVP feature?

Marisa DeMeglio: can someone remind me what MVP means in relation to our stack?
… when I think of our spec, I could see many browsers doing most of these
… what does it give our spec to define MVP?

Garth Conboy: I’d think of them as just the MUSTs
… and I’ll agree with duga
… searching across the bounds of the publication is probably not a MUST

Marisa DeMeglio: I don’t think we have to split so many hairs here

George Kerscher: the minimum viable reading experience is probably something beyond what we’ve discussed so far as MVP

Luc Audrain: +1

George Kerscher: and I’m concerned that if we only spec an MVP, that it won’t result in a good reading experience

Hadrien Gardeur: we should focus on MUSTs, SHOULDs, and MAYs
… there’s been a lot of discussion of search and offline
… I think those are intertwined
… once they’re offline you could index them

Benjamin Young: The web is trending offline
… service workers, new stuff
… New indexing stuff is coming for searching large documents
… Currently being explored in other groups, let’s talk to them

Romain Deltour: I had a quick look for UAs in the HTML spec
… and they use conformance classes
… interactive browsers do one set of things
… non-interactive ones get a slightly different list
… things like validators, etc.
… we could build up from these

Avneesh Singh: I believe everyone generally agrees, but perhaps MVP is a confusing term
… and in fact we’re looking for the core affordances that should be provided

Tzviya Siegman: I agree

Liisa McCloy-Kelley: I don’t think search is MVP, and I don’t think changing font size is necessarily MVP
… but I do think internal and external linking experience is MVP

Avneesh Singh: +1 to Romain, to also consider html classification

Ivan Herman: I was happy to hear what bigbluehat was saying. We should use the MUST, SHOULD, and MAY, etc.
… if it’s hard today, we should put it as a SHOULD
… but we should have a clear idea of what is being built for the future

Ralph Swick: [more tests are always better; even SHOULD and MAY]

Benjamin Young: You do have to write tests for SHOULD but don’t need to pass them (maybe?)
… There is an assumption we are on a desktop browser given our title with “Web”
… Can we assume this for conformance tests?
… If we don’t assume a proper browser, we end up in a bad place

George Kerscher: is it possible for a WP aware browser to not process JavaScript embedded affordances that it already provides?

Ivan Herman: the answer is yes: that’s what polyfills are designed to do

Tzviya Siegman: we’ve got 13 minutes
… just fyi

Liisa McCloy-Kelley: from a general mapping perspective, MVP is MUSTs, SHOULDs are next level up, and MAY is super awesome product

Liisa McCloy-Kelley: must = minimal, should = desirable, may = optimal

Liisa McCloy-Kelley: or may = sexy

Tzviya Siegman: so. I’m going to propose that josh and franco with the UCR and affordances, etc, go through the existing WPUB document
… and go through the MUSTs
… and next time we meet we look at the MVP/MUSTs
… and start listing them in some section of the document
… and then we’ll see if that’s something we can live with
… does that sound like a good plan?

Ivan Herman: I would also welcome if someone else took on editorial jobs related to this minimal stuff
… so that we have a text that might actually replace the current section we have on affordances, etc.
… matt will have quite a lot of work already to deal with our infoset choices
… so I think it’s unfair to expect matt to do this also
… so help wanted!

Gregorio Pellegrino: Reply to George: JS should check if browser is WP aware
… like jQuery does
… Allows polyfill of specific features

George Kerscher: That means each affordance can be uniquely identified
… Ran into this with footnotes, with JS footnotes vs RS finding them

Hadrien Gardeur: We have been talking about MVP, but in terms of JS don’t want to test if WP aware
… Instead want to test for features
… Transition from non-WP aware to WP aware is also important
… We haven’t discussed it but there is a UX issue
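
Illustrative sketch (not from the meeting): the difference between asking "is this UA WP-aware?" and testing for individual features can be shown with a small selection routine. It is written in Python purely for illustration; in a browser the same pattern would be script-side feature detection. Every name here (`pick_affordance`, the `native_features` and `polyfills` mappings, the affordance names) is hypothetical, since no query mechanism for affordances has been specified.

```python
from typing import Callable, Dict

Affordance = Callable[[], str]


def pick_affordance(
    native_features: Dict[str, Affordance],
    polyfills: Dict[str, Affordance],
    name: str,
) -> Affordance:
    """Prefer the user agent's own affordance; otherwise use the polyfill.

    The check is made per feature, never on a global "WP-aware" flag.
    """
    if name in native_features:
        return native_features[name]
    # KeyError here means neither the UA nor the publication provides it.
    return polyfills[name]
```

A UA that natively offers only a table of contents would keep its own ToC while the publication's script supplies, say, search. This also relies on each affordance being uniquely identifiable.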

Tzviya Siegman: Josh and Franco to find the MUSTs for MVP
… in the future we will do the same with SHOULD
… Need volunteers and an editor for Affordances section

Garth Conboy: Wanting to define a minimal list of musts
… currently we have 2 and a third, is that enough?

Tzviya Siegman: No

Marisa DeMeglio: Does WebIDL have a way to query if a specific affordance is available?

Ivan Herman: No

Brady Duga: —– BREAK —–

Tzviya Siegman: Reminder about dinner stuff

Garth Conboy: Google covering dinner

4. Boundaries

Tzviya Siegman: Reading from the agenda

Dave Cramer: Need to put some effort into describing this in an operational way

Dave Cramer: https://github.com/w3c/wpub/issues/194#issuecomment-428662128

Dave Cramer: Say I am in a WP context and click a link to another WP?
… what happens then?
… How do we discard a manifest?
… Easy to talk about what boundaries are, but what do they mean?

Leonard Rosenthol: The concept I agree with
… Thinking about boundaries from UX perspective
… eg my goal is to search this publication - what is a WP to accomplish that
… Look at use cases as they relate to boundaries

Benjamin Young: Dave mentioned UX. We talked about constraints on UAs
… Biggest one we have to consider is what happens when you cross the boundary
… There is some precedent in the web, eg web manifest
… inverse is iframes, you pull things into your scope
… third is target="_blank" which insists you leave the publication
… Anything out of scope takes you to a browser context

Hadrien Gardeur: Glad to hear this example, we had inter document linking discussions at epub
… Nice to finally be able to link between pubs
… Earlier we didn’t have the right terminology to discuss the boundaries
… No longer true. The scope is now expressed.
… Agree we have established patterns for what happens
… no need to reinvent the wheel
… When you are no longer in the bounds of the pub, the affordances are no longer available

Tzviya Siegman: We seem to be agreeing
… Goal is to address issue 205
… Maybe we are done with this?
… maybe we just need to be more specific about what happens and the UX for when we leave the pub

Garth Conboy: Searching - maybe that is a should, clearly you need bounds for that
… Are we comfortable saying we are now done?

Leonard Rosenthol: Concerned we are talking about 2 different things regarding bounds
… 1 is what the UA understands are the bounds (eg for search)
… Seems clear why we need that
… Issues around exiting and entering is a completely different issue
… Nothing to do with the actual bounds
… Just a UX issue, which is still important
… Look at both, but do not combine

Romain Deltour: Security issues - what happens when you move between origins
… Origin historically undefined in epub world
… Pubs can share local storage, etc
… Bounds is an opportunity for us to tackle this issue
… Do you have to examine every resource to determine origins?
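
Illustrative sketch (not from the meeting): one way to see whether a publication spans several origins is to compute the origin of each listed resource and collect the distinct values. The sketch assumes absolute http(s) URLs and treats an origin as the (scheme, host, port) triple; the function names are hypothetical, and this examines every resource rather than answering whether that is required.

```python
from urllib.parse import urlsplit

# Default ports, so that "https://example.org" and "https://example.org:443"
# are recognised as the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str):
    """Return the (scheme, host, port) triple for an absolute URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def origins_in(resource_urls):
    """Return the set of distinct origins across all listed resources."""
    return {origin_of(url) for url in resource_urls}
```

If `origins_in` returns more than one entry, the publication crosses origins, which is the situation where storage sharing and similar security questions arise.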

Ralph Swick: -> https://github.com/w3c/wpub/issues/205 205 - We need a section of the document that explicitly defines the bounds of a publication

Ivan Herman: Issue of bounds depends on what we discussed before the break
… May be ok to say search is only for things in the resource lists
… but may not be true for offlining

Hadrien Gardeur: But those are the same bounds?

Ivan Herman: But do we need to list eg CSS?

Dave Cramer: issue 194 talks about links to items in multiple pubs
… Do you need to define the various combinations of navigation actions?

Benjamin Young: How ready are we to decide experiential things, like whether
… clicking on something outside of bounds is different from inside?
… Are we at a place to deal with that now?

Liisa McCloy-Kelley: Yes, we are!
… If you are in the bounds you should know that
… There should be some experiential way of knowing I am navigating from something I “own” to something I don’t

Dave Cramer: Important web principle - how can a user trust their content (or not)?

Benjamin Young: +1 to experience mapping to user trust

Dave Cramer: web app manifest has a lot on this, about indicating to user they are in some special mode

Benjamin Young: There was a mention about web apps being similar but non-standard
… There is no consistency promise
… Web pubs should have more trust - clear you are in the pub
… Adding that expression of trust is valuable to publishing

Tzviya Siegman: We are revisiting why we need boundaries
… But we have already discussed that
… Need for security, offlining, wayfinding, etc
… Need to focus on how not why, people!

Luc Audrain: User trust is fine, but also need to consider author trust
… Something has been “published”
… Bounds are important to verify that

Hadrien Gardeur: In the case of web apps, it is similar to how epub RS often work
… You have a context, when you go out of it, may open a browser or web view
… so you are now in a different UX context
… Compared to web once I have switched, I don’t have the same expectation of how I get back
… Web apps often don’t really support back
… From a UX standpoint fairly common way of handling bounds

Leonard Rosenthol: Take a use case, say offlining
… Use as an example to figure out what we need
… Have default reading, resource list, etc
… [reading from spec]
… “The bounds are defined as the union of resource list and default reading order”

Benjamin Young: Before we go there
… We have avoided UA requirement so far
… Should we define those now?
… Things like leaving the pub, etc
… Should we just file issues and make Josh do them?

Tzviya Siegman: Yes

Dave Cramer: Can we define expectations?
… Eg if you do search, these are the ones you should search
… Those can be tested
… Have an operational definition instead of saying “this is a boundary”
… Breaking the back button is really bad
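
Illustrative sketch (not from the meeting): an operational, testable statement of "if you do search, these are the ones you should search" could look like the function below. The function name, the idea of passing the bounds in as a set of URLs, and the simple case-insensitive substring match are all assumptions for this example.

```python
def search_in_bounds(query, bounds, text_by_url):
    """Return the in-bounds URLs whose text contains `query`.

    `bounds` is the set of URLs considered part of the publication;
    anything outside it is never searched, even if its text is available.
    """
    needle = query.lower()
    return sorted(
        url for url in bounds
        if needle in text_by_url.get(url, "").lower()
    )
```

A test for this expectation then only needs a publication with a known set of resources and one matching page outside it, and checks that the outside page never appears in the results.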

Hadrien Gardeur: I was just pointing out how web apps work

Dave Cramer: I would be unhappy with pubs that did that

Joshua Pyle: Is it possible you have something in bounds that is not in the reading order?

chorus of voices: yes

Tzviya Siegman: https://w3c.github.io/wpub/#resource-list

Joshua Pyle: Does that mean what Leonard says was wrong?

Ivan Herman: No, you had it wrong, it is the union

Ivan Herman: I am fine putting this into a resolution
… and then we can close an issue
… it puts the responsibility on the authors: if they expect, e.g., offline to work
… then they better put the CSS in the resources
… Which is fine, but we need to decide
… I propose we do it now!

Garth Conboy: Agree
… Better flesh out your resources

Leonard Rosenthol: I support that
… The things we need to iterate on are various things we have discussed

Proposed resolution: leave draft language as is, change from a note to text, close the issue (Brady Duga)

Leonard Rosenthol: Change the language to “this defines the bounds of the publication”

Dave Cramer: What happens if you have a resource that links to CSS outside the bounds

Ivan Herman: What happens today if you have something in the cache that refers to an external file?
… That is totes the same

Dave Cramer: I am ok with not required

Tzviya Siegman: Objections?

Romain Deltour: Need to understand what happens when there are multiple origins

Ivan Herman: Need extra constraints on resources

Leonard Rosenthol: You are viewing the bounds wrong
… The fact that you can reference external CSS is irrelevant
… Bounds are what the list says, not where they are
… How we deal with bounds is another question

Benjamin Young: There is some prior art that is painful
… eg app manifest
… which is being replaced by service workers
… No master list, it just puts referenced things in the cache
… No need for boundaries
… Further constrains a service worker

Brady Duga: these lists of resources are all great
… once upon a time there was a thing called epub
… we had a manifest
… then we had a package
… which also had a list of files within the zip format
… and the only thing that we got from the manifest
… was errors
… when you’re not considering packaging, manifests sound great
… but when you get to packaging, you probably will hate the manifest

Tzviya Siegman: do you want these to be the same in WPUB and EPUB4?

Brady Duga: probably not

Garth Conboy: A wp doesn’t need to be packaged
… but we could make a rule that the list is expunged by the time we package
… I thought we were sort of close to agreeing that the bounds was the union

Leonard Rosenthol: Didn’t we agree?

Liisa McCloy-Kelley: We did have call for objections
… I disagree with duga, I think it was very useful to have a list of resources
… and am not opposed to it in WP

Dave Cramer: Does the current web packaging spec have features that support WP?
… Is there something there that we should pay attention to?

Tzviya Siegman: dauwhe++

Dave Cramer: Having some alignment with web packaging would be lovely
… Hope we can coordinate with them

Romain Deltour: +1

Tzviya Siegman: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.

Ivan Herman: +1

Tzviya Siegman: Can we agree with the statement in the spec now [reading from spec]?
… “The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.”
… Do we agree?

Avneesh Singh: +1

Luc Audrain: +1

Gregorio Pellegrino: +1

Wolfgang Schindler: +1

Ivan Herman: Do we close 205 with this?

Tzviya Siegman: Yes?

Dave Cramer: +1 in that this statement is necessary, but not sufficient. There is more work to be done with boundaries and user experiences

Wendy Reid: +1

Resolution #2: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication & close #205
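
A minimal sketch of how a user agent might compute the bounds under this resolution. The manifest shape, the property names `readingOrder` and `resources`, the example URLs, and the helper names are all assumptions for illustration, not text from the spec.

```python
from urllib.parse import urldefrag, urljoin


def publication_bounds(manifest, base_url):
    """Return the set of absolute URLs that belong to the publication.

    Assumes (hypothetically) that the default reading order is listed under
    "readingOrder" and the other resources under "resources", each entry
    being either a URL string or an object with a "url" key.
    """
    bounds = set()
    for key in ("readingOrder", "resources"):
        for entry in manifest.get(key, []):
            url = entry if isinstance(entry, str) else entry["url"]
            # Resolve against the manifest location and drop any fragment,
            # so "chapter1.html#s2" and "chapter1.html" count as one resource.
            absolute, _fragment = urldefrag(urljoin(base_url, url))
            bounds.add(absolute)
    return bounds


def is_in_bounds(url, bounds):
    """True if the URL (ignoring its fragment) is part of the publication."""
    return urldefrag(url)[0] in bounds


manifest = {
    "readingOrder": ["chapter1.html", {"url": "chapter2.html"}],
    "resources": ["style.css", "cover.jpg", "chapter1.html"],
}
bounds = publication_bounds(manifest, "https://example.org/book/manifest.json")
```

In this sketch a stylesheet that a chapter references but the author did not list is simply outside the bounds, which matches the point that authors must list whatever they expect to work offline.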

Tzviya Siegman: Objections?

Wendy Reid: also +1 dauwhe

Benjamin Young: Not -1

Leonard Rosenthol: +1

Benjamin Young: Don’t want to be the only negative one
… also discussed a CG for exploring this
… and kind of concerned about this
… Opposed because it is underexplored, has security ramifications, etc
… We are pushing ahead due to time, but we need an outlet to properly vet these things

Romain Deltour: +1 to what @bigbluehat said

Wendy Reid: Dave said something like that in his +1

Hadrien Gardeur: +1

Ivan Herman: If new problems come up, it is in our right to reopen
… but don’t want to keep issues open forever

Laurent Le Meur: +1

Joshua Pyle: +1

Garth Conboy: +1

Ivan Herman: at this moment uncertainty is bad

Tzviya Siegman: Now proposing the PCG!

Laurent Le Meur: +1 to Ivan. We have to study implication of this definition of boundaries but can use it as a ground.

Tzviya Siegman: lunch time!

Ivan Herman: —- LUNCH —-

Tzviya Siegman: https://www.w3.org/community/blog/2018/10/22/proposed-group-publishing-community-group/

5. language and base direction in JSON-LD

Tzviya Siegman: quick introductions

Tzviya Siegman: https://w3c.github.io/wpub/#language-and-dir

Tzviya Siegman: three GitHub issues

Ivan Herman: long running discussion that we’ve had with you folks in the past, about the need to represent
… the language of content as well as multi-lingual

Ivan Herman: schema.org also needs to be able to understand and address whatever “standard” is used
… base direction of a publication is not well defined in schema.org
… (page) progression direction also needs to be reflected/described
… in the case of the WP, the UA needs to know which is “next”

Addison Philips: this also applies to J books for vertical orders
… it’s not the same thing as base direction (for bidi)

Ivan Herman: we do have them separately

Tzviya Siegman: https://github.com/w3c/wpub/labels/topic%3Ainternationalization

Liisa McCloy-Kelley: are we also going to address mixed direction in a block? specifically in JSON, such as the title

Ivan Herman: let’s start with the traditional lang and dir issues
… what we did there is to have two separate items (lang + dir)
… but there is a gossip to change that

Addison Philips: what is the context, as there may be the case.
… metadata about the publication, yes?

Ivan Herman: we don’t do anything with the content (that’s HTML). This is indeed about metadata
… we set a global language via existing schema.org inLanguage.

Tzviya Siegman: clearing up some issues here…
… we are talking about specific tags in JSON-LD and possibly schema.org, which danbri may have input into
… we are particularly talking about how to express the language and/or direction of some tags. For example, the title or author of a document.

Addison Philips: the problem that you are encountering is similar to those of other groups.
… if we have a piece of natural lang text, we need to be able to apply language and the base dir of that text.

Richard Ishida: here is our base reference: https://w3c.github.io/string-meta/

Addison Philips: (sometimes you may not need one or the other, but you want to be able to set them when important)
… there are presentation cases where you need to know the proper things to do (eg. font selection)
… we recommend that each tag have its own set of values for this. Our string-meta doc tries to address this
… we agree that you need to transport both

Ivan Herman: in the current draft, we have a global setting, but we also want to enable overriding for specific items.
… inLanguage from schema.org but they don’t have an “inDirection” tag, so we added our own
… but this is an issue because we had to introduce our own. However, there is a bigger issue
… the individual override. Using @value language in JSON-LD
… but we don’t believe that schema.org understands this
… (see gkellogg who wrote that spec)
… the online tool for schema.org wasn’t happy about it
… but direction is even more complicated, because JSON-LD doesn’t have this natively.
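
A minimal sketch of the arrangement described here: a global `inLanguage`, the group’s own `inDirection` term, and a per-item override written as a JSON-LD value object with `@value` and `@language`. The sample values, the simplified `@context`, and the helper `effective_language` are assumptions for illustration.

```python
manifest = {
    "@context": "https://schema.org",  # simplified; the draft's context may differ
    "inLanguage": "en",
    "inDirection": "ltr",  # the group's own term, not defined by schema.org
    "name": {"@value": "Le Petit Prince", "@language": "fr"},
    "publisher": "Example Press",
}


def effective_language(manifest, term):
    """Return (text, language) for a term.

    A value object with its own @language overrides the global inLanguage;
    a plain string inherits the global setting.
    """
    value = manifest[term]
    if isinstance(value, dict) and "@value" in value:
        return value["@value"], value.get("@language", manifest.get("inLanguage"))
    return value, manifest.get("inLanguage")
```

There is no equivalent slot for direction inside the value object, which is the gap the rest of this discussion turns on.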

Gregg Kellogg: because JSON-LD is just RDF, then the language is coming from base. but RDF has no direction concept, there is nothing to come from
… but if a future RDF had it, then JSON-LD would get it. (but not in the plans right now)

Richard Ishida: https://w3c.github.io/string-meta/#script_subtag

Ivan Herman: going back a bit, the gossip says that for this restrictive usage we could “get away” with just the Lang tag.

Richard Ishida: if there is no other way to do it, then you could assume direction from language tag, if you had script info as well.
… or could reliably guess. Hebrew, for example, might include info about the direction
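
A rough sketch of what inferring direction from a language tag could look like. It is a heuristic under stated assumptions: the script and language sets below are deliberately incomplete, and `guess_direction` is a hypothetical helper, not part of any specification.

```python
# Four-letter script subtags for scripts written right-to-left (incomplete).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo", "Adlm"}
# Languages usually written right-to-left when no script subtag is given.
RTL_LANGUAGES = {"ar", "he", "fa", "ur", "yi", "ps", "dv"}


def guess_direction(tag):
    """Guess 'rtl' or 'ltr' from a BCP 47 language tag.

    An explicit script subtag wins; otherwise fall back to the usual
    script of the primary language.
    """
    subtags = tag.replace("_", "-").split("-")
    language = subtags[0].lower()
    # A script subtag, when present, is four letters and directly follows
    # the primary language subtag (extended language subtags are ignored).
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        return "rtl" if subtags[1].title() in RTL_SCRIPTS else "ltr"
    return "rtl" if language in RTL_LANGUAGES else "ltr"
```

For a language written in more than one script, such as `az`, the guess is only reliable when the script subtag is present (`az-Arab` versus `az-Latn`).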

Leonard Rosenthol: script - ISO 15924

Dan Brickley: are there cases where a single script could be written several ways (eg. maybe japanese sometimes vertical etc?)

Addison Philips: inferencing the direction is not the same as actually having one
… esp if you are counting on all downstream processing to do the same (right?) thing
… the challenge is to think of cases where the title is in a lang but non-standard direction

Tzviya Siegman: I can give you lots of examples!

Addison Philips: you can imagine a doc where infering direction would cause things in the wrong direction.

Ivan Herman: the reason why I would like to avoid the direction tag is that it gets rid of a headache.

Richard Ishida: our preference is to have a separate label for each string.
… but there could be any number of strings and each one needs the same treatment

Gregg Kellogg: using HTML literals as string values?

Ivan Herman: we considered but punted on that for other reasons

Gregg Kellogg: there is another level of indirection also possible but also not used commonly

Ivan Herman: setting the lang for each literal is not a problem, we know how to do that from JSON-LD perspective
… but also having the direction is the problem

Addison Philips: we get that it’s a problem but you need that info if the string is ever going to be displayed
… otherwise, things may not actually line up properly in display. This is avoidable.
… we do, however, consider this as a major flaw/limitation in RDF/JSON/JSON-LD

Ivan Herman: yes, but we are at the end of the stick
… the only thing we can set today in JSON-LD is lang

Dan Brickley: there are several dynamics happening here…
… while I am happy to put stuff into schema.org, it may not be the right place
… this group seems to be doing application modelling as well and I can help you do that too
… let’s not confuse the two

Ivan Herman: I don’t see a proper solution here

Dan Brickley: cleanup issues in early 2000’s led to the current state and we’re not going back there

Hadrien Gardeur: what we have right now is similar to your proposal but it wouldn’t be understandable by generic processors
… and that is a big part of our concern. we don’t necessarily want to go out on our own, we’d rather use general stuff
… esp. if it causes failures by general processors

Benjamin Young: concerning the suggestion of using HTML syntax for the strings, ivan is not a fan.
… I’d like to understand why we don’t want to go down that path.

Brady Duga: +1 to bigbluehat

Dan Brickley: there is no such thing as a “schema.org processor”. There are search engines, but that’s not a general case

Ivan Herman: can we take the position that we can use any valid JSON-LD and have it handled?

Dan Brickley: no, not aware of a full JSON-LD processor in use today for schema.org
… there are specific use cases where specific needs of JSON-LD are used (or not).
… at Google, we picked a specific subset of things and Bing probably as well

Ivan Herman: what we have tried to do so far is to ensure that what we want to do is understood by an existing tool and if not…

Hadrien Gardeur: we have discussed the HTML route many times, we are concerned that many UAs that would ingest these literals would not know what to do with the HTML anyway
… it would convert it down to a string and would end up dropping the useful bits

Laurent Le Meur: there is also an issue if improper elements are used
… adding HTML elements de-simplifies JSON

Richard Ishida: there are three levels - global, per element, inter-element
… HTML already knows how to do that and the Unicode controls aren’t well supported by browsers

Ivan Herman: we don’t use HTML for any strings

Leonard Rosenthol: [discussion about some Ruby specifics]

Addison Philips: Ruby shouldn’t be needed for presentation of metadata but would be useful for sorting

Addison Philips: yes, if you use HTML, you need to then have it parsed and understood.

Benjamin Young: it might be helpful to know which strings we are thinking about here. What other data could be applied here?

Addison Philips: author, publisher, title

Tzviya Siegman: let’s look at the issues

Ivan Herman: they are all around the same issues

Tzviya Siegman: proposal 1: going to HTML. (but that has been nixed by a few people)

Ivan Herman: do we think our UA/RS folks are willing to deal with the HTML?

Tzviya Siegman: would UAs rather see it as HTML or JSON/JSON-LD? what are common flows?

Addison Philips: they have fields with the values and not HTML
… you don’t get anything for free from JSON-LD but it still works with those processors

Dan Brickley: re Google SDTT, consider an example like https://gist.github.com/danbri/010ee9afeb48806c857775d062caf3ed … it is OK by the Google tester but effectively useless. SDTT is good for testing to see if specific data examples match the information needs of specific google tools, and it also catches some low-level errors.

Benjamin Young: social web working group uses the HTML representation (or a subset thereof)
… we should probably also set the set of valid values

Leonard Rosenthol: Adobe’s XDM handles some of these issues
… we follow string-meta recommendations

Dan Brickley: there seems to be nothing in between HTML and “plain text”.

Ivan Herman: we are not in the position to make a new RDF datatype

Ivan Herman: we could define a subset of HTML but that would need to be validated during workflows

Benjamin Young: in practice, lots of folks had the same issue that HTML is too big and scary and don’t expect folks to use it
… and ended up using a subset of HTML

Tzviya Siegman: we need something robust but not as scary as HTML

Garth Conboy: is having our special stuff ignored a bad thing?

Tzviya Siegman: why can’t we change JSON-LD?

Benjamin Young: because JSON-LD is modelled on RDF which we can’t change

Benjamin Young: why is violating JSON-LD not OK?

Ivan Herman: because implementors could just ignore the validation errors

Ivan Herman: the reason for JSON-LD was so that the metadata would be understood by search engines and other schema.org aware systems. So why violate it?

Benjamin Young: but maybe Google or bing will index it anyway

Ralph Swick: [ JSON punted on the lang/dir issue. RDF tried to punt on it as well, expecting the underlying serialization to handle it. When the first serialization was XML then the RDF punt nearly worked. An underlying serialization in HTML handles lang/dir as well as (potentially) markup for SVG, MathML, … ]

Richard Ishida: HTML was one option. JSON-LD was another. But there is another possibility

Gregg Kellogg: the three options I heard. (1) Inside a tag, you have to use HTML (or the like). (2) For the entire tag, use a script as part of the language tag (as that is valid RDF). (3) Or use a fully structured object with lang and direction, which is also valid JSON-LD/RDF

Ivan Herman: this means moving away from “simple literals”

Benjamin Young: the situation of feeding search engines, I’ve tested things and Google seems to handle the HTML values in JSON-LD - for some definition of “handle”
… the alternative, the complex objects, will not index by the search engine

Gregg Kellogg: if there was a standard “indirect object” that would help

Liisa McCloy-Kelley: doesn’t ONIX use HTML for the strings?

Tzviya Siegman: it’s XML not HTML. (ONIX is a common vocab for books)

Matthias Kovatsch: what are the cases where you can’t determine the direction from the lang?

Dan Brickley: re JSON-LD, could we use a complex “datatype” to carry lang+direction together? (ugly and horrible and wrong…)

Garth Conboy: ONIX: https://www.editeur.org/83/Overview/

Addison Philips: writing mode is different. examples like Azerbaijani can be written in either Latin or Arabic script

Addison Philips: … ex: <p lang=ar>W3C TPAC 2018</p>

Ivan Herman: what are the real practical cases that we would have in publishing if we only used the lang tag?

Richard Ishida: you would have to ensure a script tag!

Addison Philips: the Unicode/ICU approach also has some options (???)

Leonard Rosenthol: when you mention Unicode, you’re not referring to the deprecated substring stuff, are you?

Addison Philips: I’m referring to BCP47
… separately, Unicode bidi control characters that could be used in plain strings outside of markup
… modern changes on isolating controls are not yet widely in use, though that’s what you’d really want to recommend

Dan Brickley: FWIW Google’s JSON-LD parser doesn’t complain if it sees "name": "Stan Dinkley</em>", etc., but nothing in Schema.org says when/whether to treat ‘<’ as markup vs part of the content. Most of our applications would strip or otherwise sanitize it. But that is an application-level decision.

Richard Ishida: we really need isolation control to make bidi work
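
For reference, a small sketch of the isolating controls being discussed: wrapping a plain string in Unicode bidi isolate characters so it does not disturb, and is not disturbed by, the surrounding text. The code points are the standard Unicode ones; the helper name `isolate` is hypothetical.

```python
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE (direction taken from the content)
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE (closes any of the three above)


def isolate(text, direction="auto"):
    """Wrap text in the isolate pair matching 'ltr', 'rtl', or 'auto'."""
    opener = {"ltr": LRI, "rtl": RLI, "auto": FSI}[direction]
    return f"{opener}{text}{PDI}"
```

This only helps where the consuming software implements the isolate controls, which is the support concern raised above.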

Tzviya Siegman: what happens next?

Richard Ishida: danbri suggested evaluating underused attrs like “role”

Tzviya Siegman: http://blog.schema.org/2014/06/introducing-role.html

Dan Brickley: we tried that, but it wasn’t a success
… nobody likes it

Benjamin Young: I still feel the only way to handle this, esp with mixed language, is to use (a subset of) HTML

Dan Brickley: here’s a real world json-ld schema.org mixed language name/title from MusicBrainz, https://search.google.com/structured-data/testing-tool?url=https://musicbrainz.org/work/e664139f-6fb5-4aaf-91f1-3c109753c7ea#url=https%3A%2F%2Fmusicbrainz.org%2Fwork%2Fe664139f-6fb5-4aaf-91f1-3c109753c7ea

Laurent Le Meur: I want to go the exact opposite way, to not use HTML but standard data elements
… but that does not solve mixed language

Dan Brickley: (currently {"name":"(Si Si) Je Suis un Rock Star"})

Leonard Rosenthol: do we actually know what was indexed (by Google) in Benjamin’s experiments?

Benjamin Young: removing the tags and keeping only the content gives the wrong result

Richard Ishida: if you did it using the HTML, you would have to do it all the time for each string

Dan Brickley: I would move away from “how Google indexes things”, but instead consider that stuff comes in and then out as a “triple”
… and then sent downstream where some processors may strip out bits and others may not

Ivan Herman: the perfect is the enemy of the good
… so what is the 80/20 cut on this one?
… are we willing to sacrifice some of these requirements (eg. mixed language) in favor of supporting the others more simply?

Richard Ishida: example of strings that need base direction: في HMTL5 يتم تحقيق ذلك بإضافة العنصر المضمن bdi.

David Clarke: an example in titles such as my text book

Ivan Herman: this is where Unicode directionality might help

Richard Ishida: you are talking about a different issue. Applying bidi inside a string with the Unicode dir chars is fine. but that doesn’t address the base direction
… my example earlier shows up the problem with missing base dir

Addison Philips: when passing data around, when you need the data, you need it! And once computed, you want it understood the same way in all cases
… the options discussed here are all valid ones but you need to decide which pain points you are willing to accept

Marisa DeMeglio: another example in arabic…numerals. you switch the directions with numerals

Ivan Herman: there is no ideal solution. “which finger should I bite?”

Benjamin Young: I have a starting point from PDF for that, which we use for rich text strings…

Benjamin Young: propose that we write a simple HTML subset for strings in JSON-LD

Dan Brickley: we’d use that if you come up with it!
… alternatively, would you like us to add an inDirection property in schema.org to help move things along?

Proposed resolution: we propose for schema.org to add an inDirection term alongside inLanguage, with value of rtl, ltr, auto (Ivan Herman)

Daniel Weck: if we were to carry the HTML in the string literal, then the RS needs to process that? How would you know that it is HTML?

Ivan Herman: you would know that from the RDF/JSON-LD

Ivan Herman: calling for objections for my proposal?

Garth Conboy: Crickets

Resolution #3: we propose for schema.org to add an inDirection term alongside inLanguage, with value of rtl, ltr, auto

Ralph Swick: [acknowledging that this resolution doesn’t address multiple-direction literals]

Daniel Weck: this is not just RS, but authoring, etc. all have to deal with whatever we propose

Dan Brickley: thanks, noted in https://github.com/schemaorg/schemaorg/issues/2086

Daniel Weck: (entities, whitespace normalization, etc.) the HTML processing model

Ivan Herman: if we do a pure HTML datatype, then anything could be allowed

Benjamin Young: which is why I want a specialized version

Brady Duga: there is a history here with EPUB that the Japanese couldn’t represent some titles - let’s not do that again

Ralph Swick: [is it plausible to sanitize the HTML before putting it through a full HTML parser? Likely reading systems don’t want to have to write a separate limited HTML parser?]

Tzviya Siegman: we have a proposal to work up a small subset of HTML that could be used

Leonard Rosenthol: if someone is proposing to pursue a formal proposal, we should let them do it

Benjamin Young: I’m writing it now

Proposed resolution: allow literals (title, publisher, creators) to be expressible using an HTML datatype, restricting the HTML to a subset (Ivan Herman)

Proposed resolution: to create a sub-set of HTML–narrowed to multiple language tags–to use within all text strings used within the infoset/manifest (Benjamin Young)

Proposed resolution: to create a sub-set of HTML–narrowed to multiple language tags–to use within (a set of TBD) strings within the infoset/manifest (Leonard Rosenthol)
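
To make the idea of a restricted HTML subset concrete, here is a hedged sketch of a sanitizer that keeps only `span` elements with `lang` and `dir` attributes and reduces everything else to text. The choice of `span` and of these two attributes is an assumption for illustration; the group had not defined the subset at this point.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"span"}          # hypothetical subset: one inline wrapper
ALLOWED_ATTRS = {"lang", "dir"}  # only language and direction survive


class SubsetSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_spans = 0

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            kept = "".join(
                f' {name}="{escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED_ATTRS
            )
            self.out.append(f"<{tag}{kept}>")
            self.open_spans += 1

    def handle_endtag(self, tag):
        # Only close a span that was actually opened; stray end tags are dropped.
        if tag in ALLOWED_TAGS and self.open_spans:
            self.out.append(f"</{tag}>")
            self.open_spans -= 1

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(markup):
    """Return markup reduced to the allowed subset, with balanced spans."""
    parser = SubsetSanitizer()
    parser.feed(markup)
    parser.close()
    # Close anything left open so the result stays balanced.
    parser.out.extend("</span>" for _ in range(parser.open_spans))
    return "".join(parser.out)
```

Disallowed elements lose their tags but keep their text content in this sketch; whether that is the right behaviour would be for the actual proposal to decide.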

Tzviya Siegman: any objections?

Ivan Herman: no one is saying that you must use the HTML. Just that it would (possibly) be an option

Addison Philips: please make sure to include us in those discussions
… we want solutions for the web at large

Dan Brickley: FWIW I asked in #tpac-chat, tantek noted WICG draft around sanitization, https://github.com/WICG/purification

Tzviya Siegman: break time!
… thanks to all our guests.

Ivan Herman: —- BREAK —-

6. schema.org issues

Ivan Herman: https://github.com/w3c/wpub/wiki/Schema.org-issues

6.1. order for terms like creators

Ivan Herman: we’ve started with some problems in which we started to define our own context and terms.
… in publishing the order of authors, publishers, translator etc is deadly serious
… currently, these terms are not ordered
… we put these in an ordered list to meet our needs - is this an acceptable solution for schema.org

Dan Brickley: lists have always been a pain in rdf… schema.org made an item list construction. 6 months ago we went through an exercise with json-ld. the end result was prettier and there was consensus that the result was prettier.
… that’s as far as we got
… we would introduce new ones, not turn the current ones into lists

Ivan Herman: so for the time being it’s fine to keep it as it is now

Dan Brickley: the most obvious one is recipe instructions

Ivan Herman: it will be a timing issue

Dan Brickley: re lists in schema.org, see https://github.com/schemaorg/schemaorg/issues/1910

Ivan Herman: we will skip over language setting

6.2. accessibility terms

Tzviya Siegman: accessibility issues go back to the first idpf proposals for tags that sit in a non-normative document about certification metadata
… we have outlined what we want
… it picks up conformance rules from dublincore
… certified-by, certified-credential, certified-report

Dan Brickley: my concern would be overlap

Garth Conboy: https://idpf.github.io/epub-vocabs/package/a11y/#sec-certifierCredential

Dan Brickley: This might overlap with the credential work (certifier)

Tzviya Siegman: we didn’t add accessibility to purposefully keep it generic

Wendy Reid: certified-by points to the org, which is a recognized org
… for accessibility

Leonard Rosenthol: it should be tied to accessibility

Dan Brickley: our preference is to say it’s a relationship between two things
… certified credential carries some of that work

Gregorio Pellegrino: we also need a URL, because it is specific to that publication

Ralph Swick: certification should be a URL so that you can look up properties of the certifier; e.g. org, specifics about the certification

Leonard Rosenthol: if the document is certified to 2 purposes but you only have one set of metadata how do you address that

Tzviya Siegman: have the field repeat

Dan Brickley: what is the purpose

Tzviya Siegman: to clarify who is saying that something is accessible and what they are using to say it

Dan Brickley: if you have multiple things being investigated, you don’t know which report is which
… you might need to hack something

Ralph Swick: [we should also look for overlap with Verifiable Claims work]

Dan Brickley: EOCred work — https://www.w3.org/community/eocred-schema/

Rachel Comerford: re: purpose – in education this comes up repeatedly; we have two situations – an actual certification program run by Benetech
… Benetech will certify that a publisher is producing accessible ebooks
… and many campuses are requiring publishers to self-certify; accepting responsibility for fixing problems

Dan Brickley: it’s about each particular published thing
… rather than the publisher
… we have this corner of schema.org where we can throw things and see if they work

Dan Brickley: have a look at this: http://pending.eocred-1779.appspot.com/EducationalOccupationalCredential

Tzviya Siegman: there is some potential overlap with verifiable claims and open badges
… if a publisher is saying “I assert this is accessible” how do we verify this?

6.3. language indexing

Ralph Swick: [Ivan is now discussing https://github.com/w3c/wpub/wiki/Schema.org-issues#language-indexing Language indexing ]

Ivan Herman: languages - in json-ld there is something easier to use, called language mapping, from an authoring point of view it is easier, although it is only syntactic sugar from the base language setting structures
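
A small sketch of why a language map is only syntactic sugar: a term declared with `"@container": "@language"` lets authors key strings by language tag, and expansion turns that back into ordinary value objects. The function below imitates that one step only (it is not a JSON-LD processor), and the sample titles are made up.

```python
def expand_language_map(language_map):
    """Expand {"en": "...", "fr": [...]} into a list of JSON-LD value objects."""
    expanded = []
    # Keys are processed in lexicographical order, as JSON-LD expansion does.
    for language, values in sorted(language_map.items()):
        if isinstance(values, str):
            values = [values]
        for value in values:
            expanded.append({"@value": value, "@language": language.lower()})
    return expanded


# Context declaring that "name" is keyed by language tag.
context = {"name": {"@id": "http://schema.org/name", "@container": "@language"}}
authored = {
    "@context": context,
    "name": {"en": "The Little Prince", "fr": "Le Petit Prince"},
}
```

A consumer that does not process the context sees only an object with keys `en` and `fr`, which is the concern about search-engine support.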

Hadrien Gardeur: you need to be able to process the context

Dan Brickley: I’m not aware of anyone in the search industry doing complex processing

Ivan Herman: can you find out if this is planned? is it a good idea to use this index?

Hadrien Gardeur: are there plans to support json?

Dan Brickley: that would be company by company

Benjamin Young: [starts troubleshooting for the json working group even though he has his own meeting]

6.4. `LinkRole`

Ralph Swick: -> https://github.com/w3c/wpub/wiki/Schema.org-issues#linkrole LinkRole

Ivan Herman: we need something close to the link role
… even though it’s experimental
… what we started to do is that we defined our own type
… it is almost the same except linkrole cannot accept mime types
… I raised an issue about this on schema.org

Dan Brickley: encoding format shouldn’t be used on the link role today?

Ivan Herman: the real question is encoding format

Dan Brickley: linkrole is still pending?
… I need to check on the mechanics…

Ivan Herman: It would be better for us to rely on schema.org vocabulary than inventing our own here

6.5. audio `duration`

Ralph Swick: -> https://github.com/w3c/wpub/wiki/Schema.org-issues#duration-values [audio] duration values

Wendy Reid: duration in schema refers more to a CV - as in I was in a job from jan 2017 to feb 2018
… the html equivalent is more relevant than schema.org

Dan Brickley: see lower section of https://schema.org/Duration for properties whose value is datatyped schema.org/Duration, which does lean towards 8601

Ivan Herman: we should allow for the iso standard to be used
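
As an illustration of an ISO 8601 duration value for an audio resource, here is a hedged sketch of a parser for the day and time components only (for example `PT1H30M`). Years, months, and weeks are left out on purpose, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import timedelta

# P[nD][T[nH][nM][n(.n)S]] ; calendar units (Y, M, W) are not handled.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(text):
    """Parse a restricted ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no components at all, such as "P", "PT", or "P1DT".
    if not match or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: float(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)
```

For example, `parse_duration("PT1H30M").total_seconds()` gives the length of a 90-minute track in seconds.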

Ralph Swick: -> https://www.w3.org/TR/2018/WD-html53-20181018/infrastructure.html#durations [HTML 5.3] Durations

Wendy Reid: we would like to extend the iso standard to be used

Dan Brickley: whereas https://schema.org/temporalCoverage covers date ranges, and we had some issue about openended ranges

Dan Brickley: https://github.com/schemaorg/schemaorg/issues/2086

Dan Brickley: next step is to follow up https://pending.schema.org/duration and clarify whether it is a temporal quantity, versus a period (temporalCoverage); could be clarified

6.6. additional vocabularies

Ivan Herman: general question - there is a large vocabulary for publications in schema.org, but new types of publications are unavoidable
… what steps should we take to add new types

Ralph Swick: -> https://github.com/w3c/wpub/wiki/Schema.org-issues#what-is-the-mechanism-this-community-should-follow-for-new-publication-types mechanisms for new publication types?

Dan Brickley: we generally push to wikidata to do the work

Ivan Herman: proceedings is a good example

Dan Brickley: https://www.wikidata.org/wiki/Q1143604

Dan Brickley: you can use wikidata directly

Ivan Herman: so in my json-ld, I would use Q1143604 instead of Proceedings?

Dan Brickley: yes

Ivan Herman: we could create an alias in the JSON-LD context file for the wikidata name

Dan Brickley: that too

Dan Brickley: workaround is “Proceedings” is a term in the surface syntax defined by a json-ld context, but maps to Q1234-style URLs
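
A sketch of the aliasing workaround: the context maps the readable term `Proceedings` to the Wikidata item, so authors write the name while processors see the Q-number URL. The entity URL form and the tiny `expand_type` helper are assumptions for illustration; a real JSON-LD processor does considerably more.

```python
context = {
    "schema": "http://schema.org/",
    # Readable alias for the Wikidata item Q1143604 (assumed entity URL form).
    "Proceedings": "http://www.wikidata.org/entity/Q1143604",
}


def expand_type(type_name, context):
    """Very small stand-in for JSON-LD term expansion of an @type value."""
    if type_name in context:
        return context[type_name]
    prefix, sep, local = type_name.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return type_name  # already absolute, or unknown to this context
```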

7. Github Issues

Ralph Swick: -> https://github.com/w3c/wpub/issues issues

Wendy Reid: looking for things that could be closed (“propose closing”)

7.1. Cover image

Wendy Reid: https://github.com/w3c/wpub/issues/261

Wendy Reid: #261: use of “cover image”

Ivan Herman: ‘cover’ is a structural property in the manifest

Leonard Rosenthol: cover is optional?

Ivan Herman: yes

Wendy Reid: any objection to closing this?

Ralph Swick: [none expressed]

Garth Conboy: we have to update the editor’s draft now to close it

Ivan Herman: yes; that’s the mechanics

Resolution #4: close #261

7.2. Obtaining language from http headers #54

Tzviya Siegman: https://github.com/w3c/wpub/issues/54

Wendy Reid: Obtaining language from http headers (#54)

Wendy Reid: the issue is whether http headers can be fallback for determining the language of a publication

Tzviya Siegman: this is no longer relevant; it’s from a long time ago

Rachel Comerford: let’s add a comment saying this

Ivan Herman: I’ll clean the minutes then refer to them in a close comment

Resolution #5: close #54

7.3. Conformance criteria for UA

Tzviya Siegman: https://github.com/w3c/wpub/issues/270

Wendy Reid: #270
… Conformance criterion for UA

Ivan Herman: we should add a reference to this morning’s discussion
… and add a comment that it will be followed-up later

Wendy Reid: not proposing to close this yet

7.4. HTML TOC format (#291)

Wendy Reid: #291

Rachel Comerford: https://github.com/w3c/wpub/issues/291

Wendy Reid: Do we need a more detailed definition for the HTML TOC format?
… based on #285

Ivan Herman: and we now have pagelist, too

Garth Conboy: there’s a reference to a transcript of a meeting, though it didn’t resolve the issue

Ivan Herman: the question is whether we want to define a structure in HTML for TOC
… at the moment the spec doesn’t say anything
… we have two extremes: the EPUB model and no restriction (whatever you can express as HTML)
… the discussion was that whatever we define should be machine processable; there should be a clear algorithm for extracting the TOC from the HTML
… the current spec stops at describing an ARIA tag
… it’s relatively clear that an algorithm can be defined for what’s in EPUB, or even for something more liberal
… on the other hand, nobody so far has come up with an algorithm to get a reasonable TOC from arbitrary HTML
… if you think something more liberal should be permitted, provide an algorithm

Juan Corona: I’ve been thinking about an algorithm
… my thinking is to require UL/OL with LI and then require SPAN
… I’ve been looking at the Category content model and have an idea about explicit/implicit P
… I think there’s text there about runs of phrasing content that could be used to define a generic algorithm
… I have a tree-walker algorithm that is very preliminary
… would welcome comments
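
Note (illustrative, not presented at the meeting): the list-based idea above can be sketched as a small tree walker. This is not Juan's algorithm, which was not shown; it assumes the TOC is built from nested OL/UL elements whose entries are A elements, and it takes each entry's level from the list nesting depth. The names `ListTocParser` and `extract_toc` are hypothetical.

```python
from html.parser import HTMLParser


class ListTocParser(HTMLParser):
    """Collect (depth, href, label) for each link inside nested OL/UL lists."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # current nesting level of OL/UL elements
        self.entries = []    # finished (depth, href, label) tuples
        self._href = None    # href of the anchor currently being read
        self._text = []      # text fragments of that anchor

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.depth > 0:
            self.depth -= 1
        elif tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the label.
            label = " ".join("".join(self._text).split())
            self.entries.append((self.depth, self._href, label))
            self._href = None


def extract_toc(html):
    """Return the TOC entries of an HTML fragment in document order."""
    parser = ListTocParser()
    parser.feed(html)
    parser.close()
    return parser.entries
```

Links outside any list are ignored, so a TOC expressed in some other way (an imagemap, for example) yields an empty result and would need a different treatment.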

Tzviya Siegman: sounds interesting; I’d love to see it

Laurent Le Meur: we have to permit round-tripability
… we must be careful

Liisa McCloy-Kelley: what problem are we trying to solve?
… replacing the nav doc in some way that doesn’t sound like addressing missing formatting in that doc
… the inline TOC and the navigation are not a map of each other
… often we include more in the nav than we include in the inline TOC
… we’ve been thinking that we might not create an inline TOC where the printed version didn’t have one
… as a content item
… a renderable TOC is often very heavily designed and that doesn’t get put in the nav
… as we’re thinking about this, keep the option that they can be separate

Tzviya Siegman: nothing prevents you from creating an inline TOC
… in my ideal world they would be the same thing

Liisa McCloy-Kelley: I’d like to have the machine-readable with the option to have the pretty one
… I don’t necessarily want the pretty TOC to be the machine-readable TOC

Ivan Herman: so you’ll have two structures?

Liisa McCloy-Kelley: yes

Ivan Herman: that’s still doable; you just have to identify via ARIA which one is the machine-readable one
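
Note (illustrative, not presented at the meeting): a minimal sketch of how a user agent might pick the machine-readable structure when a page carries two TOCs. The minutes only say "via ARIA"; the use of `role="doc-toc"` as the marker, and the names `MarkedTocParser` and `machine_readable_links`, are assumptions made for this example.

```python
from html.parser import HTMLParser


class MarkedTocParser(HTMLParser):
    """Collect hrefs only from the first element carrying role="doc-toc"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self._tag = None     # tag name of the marked element, once found
        self._open = 0       # open elements of that tag name, marked one included
        self._done = False   # True once the marked element has been closed

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if self._tag is None:
            # Still searching: look for the ARIA role on any element.
            if "doc-toc" in (attrs.get("role") or "").split():
                self._tag, self._open = tag, 1
        elif tag == self._tag:
            self._open += 1
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if self._done or tag != self._tag:
            return
        self._open -= 1
        if self._open == 0:
            self._done = True


def machine_readable_links(html):
    """Return the link targets of the TOC marked as machine-readable."""
    parser = MarkedTocParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs
```

Any other TOC on the page, however heavily designed, is left untouched for presentation.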

Wendy Reid: the pretty one is the one users look at; as a RS, I’d love to offer all the information that’s there

Liisa McCloy-Kelley: two separate problems: the amount of information and the prettyfying of that

Ivan Herman: can you write down (later) the algorithm and the corresponding HTML structure?
… if this can be written, and if it includes the usual structure in EPUB 3 and other possibilities, that’s fine

Juan Corona: I can do that

Luc Audrain: this is important both for TOC and for accessibility issues with TOC

Juan Corona: I’m thinking that there’s one container, a DIV, and a text node
… that text node becomes an H1
… I get lost with mixing H2, H3, …
… I saw a proposal for something that takes the level from the nesting
… I don’t yet know how to address this

Tzviya Siegman: others here might help in understanding the outline algorithm

Rachel Comerford: when we present a TOC in a textbook we generally have two of them: a detailed one and a brief one
… in our ebooks we have a third for navigation
… we present the nav TOC in the reader navigation and within that there are links to the brief and full TOC
… the full TOC has additional links that are not in the nav
… we put these in the full TOC because students look for them while paging through a book

Benjamin Young: there are use cases for several TOC presentations
… is it a requirement that the machine version be extractable from the human-readable one?
… perhaps it’s an imagemap

Brady Duga: a+

Benjamin Young: are we trying really hard to make this possible without saying “the machine SHOULD …”?

Leonard Rosenthol: similar question; I’m not sure why we need a machine readable TOC in WP
… I understand the need for a human-presentable one
… given that reading order and resources are both handled elsewhere, why does a UA need another TOC in addition?

Tzviya Siegman: we’ve answered this over the course of several meetings

Brady Duga: briefly, for accessibility
… George can explain in more detail

Liisa McCloy-Kelley: were we planning to be able to have the machine readable have extensible formatting?
… embedding styling, images, …

Tzviya Siegman: this ties back to the JSON-LD discussion

Wendy Reid: audiobooks have an example
… an audiobook in a language that does not have a text form
… the TOC might use voice prompts or images
… we’d want a machine-readable TOC to be able to say “there’s no text for this chapter”

Luc Audrain: I’m concerned with performance to compute a machine-readable TOC
… as a publisher, I have no issue to prepare a machine readable and a presentable TOC
… I don’t care about duplication
… I care about production and performance for the user experience
… I have no issue with a complex presentation TOC

Laurent Le Meur: a good discussion would be what is the user experience in an RS if there is no machine-readable TOC only an HTML page
… the TOC could still be accessible but it might take the whole screen because the RS wouldn’t know how to make it smaller
… would this be a good user experience?

Hadrien Gardeur: we’re discussing two concepts
… something nice, presented to the user in many ways
… and something else meant for accessibility or [other] specific UI features
… in EPUB3 we tried to have something serve both and failed
… I’m wondering if we shouldn’t simply treat them as separate
… the machine-readable one may have some very specific information
… I’m not hearing yet a step forward

Ivan Herman: Juan might come up with a more general and usable solution
… if Juan’s Algorithm fails there can be a fallback for the UA to use it as is
… an imagemap can be marked as a TOC, JuansAlgorithm will fail, and Benjamin will get what he wants

Hadrien Gardeur: one can mark up the same element with rel values in a link to denote the two different navigation approaches in one element

Tzviya Siegman: we get into a tricky area if you’re thinking that the accessible version is different

Hadrien Gardeur: that’s not what I’m suggesting
… I was thinking at the manifest level

Tzviya Siegman: I think people will be happier with Juan’s approach if it works

Ivan Herman: is there another ARIA element?

Tzviya Siegman: not really

Dave Cramer: it’s possible to be both smart and good looking :)

Benjamin Young: https://wileylabs.github.io/no-can-transclude/moby-dick-from-epub-samples/

Benjamin Young: ^^ shows what one can do
… it’s an EPUB in which I renamed the nav file to index.html
… Avneesh says this is sufficient for accessibility
… next and previous are populated from what’s underneath what you read
… it’s not using LI or anything magical; just finding the next anchor
… keeps reading state

Tzviya Siegman: this is navigation, not the TOC

Benjamin Young: in this document those happen to be the same
… I claim that navigation can be as simple as next anchor
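
Note (illustrative, not the code behind the demo): the "next anchor" claim can be sketched as follows. The sketch assumes the reading sequence is simply every distinct link target in the index page, in document order; `AnchorCollector`, `reading_sequence` and `neighbours` are hypothetical names.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Record each distinct href of an A element, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href and href not in self.hrefs:
            self.hrefs.append(href)


def reading_sequence(index_html):
    """Return the ordered list of link targets found in the index page."""
    collector = AnchorCollector()
    collector.feed(index_html)
    collector.close()
    return collector.hrefs


def neighbours(sequence, current):
    """Return (previous, next) around `current`; None at either end.

    Raises ValueError if `current` is not part of the sequence.
    """
    i = sequence.index(current)
    previous_href = sequence[i - 1] if i > 0 else None
    next_href = sequence[i + 1] if i + 1 < len(sequence) else None
    return previous_href, next_href
```

No list markup is needed for this; the order of the anchors alone drives previous and next.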

Ivan Herman: the JuansAlgorithm might cover this

Juan Corona: yes; I think this idea could still work

Benjamin Young: tree order can be determined from the parentage of each link

Bobby Tung: +1

Benjamin Young: if this were structured in an HTML DOM tree they could be represented in a machine-readable navigation

Hadrien Gardeur: my proposal is to have two different rel values at a manifest level, if a resource can be both “good looking” and “machine readable”, it would simply use both rels
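
Note (illustrative, not presented at the meeting): a sketch of what two rel values at the manifest level could look like to a consumer. No rel names were proposed in the discussion, so `toc-machine` and `toc-display` are placeholders, and the manifest shape (a `resources` list whose items have `url` and `rel`) is likewise an assumption for this example.

```python
# Hypothetical manifest fragment; the rel values are placeholders.
MANIFEST = {
    "resources": [
        {"url": "toc.html", "rel": ["toc-machine", "toc-display"]},
        {"url": "fancy-toc.html", "rel": "toc-display"},
        {"url": "chapter1.html"},
    ]
}


def resources_with_rel(manifest, rel):
    """Return the URLs of all resources that carry the given rel value."""
    found = []
    for resource in manifest.get("resources", []):
        rels = resource.get("rel", [])
        if isinstance(rels, str):
            # Accept a single string as well as a list of strings.
            rels = [rels]
        if rel in rels:
            found.append(resource["url"])
    return found
```

A resource that is both "good looking" and "machine readable" appears under both values, while a purely decorative TOC appears under only one.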

Benjamin Young: I believe this suggests that the algorithm is not very hard
… the machine readable thing is calculable
… this example calculates from the TOC

Tzviya Siegman: proposed CG https://www.w3.org/community/blog/2018/10/22/proposed-group-publishing-community-group/

Tzviya Siegman: I think this discussion could proceed in the newly proposed ^^ CG

Garth Conboy: the TOC discussion belongs in this WG
… the discussion of options might come from the CG?

Tzviya Siegman: yes

Benjamin Young: I think the WG can keep the discussion

Tzviya Siegman: Juan will work on JuansAlgorithm
… Hadrien will work on a double TOC proposal

Garth Conboy: I’m skeptical that JuansAlgorithm can work

Ivan Herman: --- Adjourned ---

8. Resolutions

Resolution #1: edit the infoset and properties section, and introduce a PR to the group
Resolution #2: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication & close #205
Resolution #3: we propose for schema.org to add an inDirection term alongside inLanguage, with value of rtl, ltr, auto
Resolution #4: close #261
Resolution #5: close #54
