Publishing Business Group – 25 August 2020

Meeting minutes

EU Text Data Mining Reservation protocol

Liisa: Laurent Le Meur brought this to our attention
… Cristina noted this raises some concerns for publishers and copyright
… what users of text data mining will do with content
… Laurent shared a draft paper with us
… thinking about where this impacts people's businesses

Tzviya: I've been working on TDM at Wiley and in STM

<liisamk> link to TDM paper - https://docs.google.com/document/d/1NwtWv_ESS4ZhaHDHWnDSQSJKHrfRmLNjDTPdwNE3aeA/edit

Tzviya: STM is doing a lot of work in conjunction with CrossRef
… so there's something embedded in the metadata to make it clear what the license is and whether there's an exemption
… what a "scholarly article" is and what an "exemption" means is not currently well-defined
… you need to allow the user to access the information
… and some sort of temporary storage is allowed
… this is of great interest to Wiley

Ralph: is this just an FYI from Laurent or is he proposing some sort of activity?

Liisa: Laurent would like some feedback
… he had a conflict for today but hopefully we can discuss in future meetings

Tzviya: I'll try to pull together some stuff
… the solution is very siloed
… if metadata can be embedded in some other identifier then perhaps this can be a more universal solution
… the idea is to add metadata to an identifier
… if academic and trade publishers do this differently from scholarly publishers, that will be a problem

Liisa: was anyone else previously aware of this?
… I suspect there will be feelings about how and where this gets used
… it will take some discussion

Ralph: who are the people in our community who we'd want to try to get to a future BG discussion?

Liisa: let our members know this is going on
… and get them to provide feedback to Laurent
… do we think this is good or bad?
… you need metadata to follow along with the content to allow or disallow

Tzviya: my impression is that this is in legislation, like GDPR
… I don't know what exactly this will mean in the world of trade publishing
… it will be a matter of coming to terms on how we incorporate it into our workflows
… but I don't think we can opt out

Dave: this feels out of scope for this group to me

Daihei: from the perspective of Japanese publishers, in the next month or so we will go into deep discussion on how we apply the new EU circumstances in the Japanese industry
… I'll come back to you with some understanding of how the Japanese publishing industry wants to apply TDM as well as more generically the accessibility situation

ODRL Vocabulary [W3C Recommendation, 2018]

Ralph: perhaps an overview for BG participants on the use cases that the work is trying to solve

Liisa: yes; a use case understanding that we can bring back to our internal teams
… companies need to figure this out for themselves but awareness is our job here

Bill: the business issue is not being surprised that your content is being mined without your knowledge, or that you want it to be mined
… this sounds analogous to the EU accessibility requirements
… the business issue is for publishers not to be unpleasantly surprised

Wendy: this may fall into a CG discussion; e.g. adding another metadata property to published works
… we need use cases

Liisa: we'll ask Laurent to bring us use cases

Audiobook Recommendation status

Wendy: next steps: we're still looking for implementors
… we got two more pull requests to be added as implementors in the past 24 hours!
… we're putting together a list of content implementations as well as implementations that are not official yet
… OBI is officially there as a content-producing implementation
… I'll be adding CoreSource; they plan to implement in Q1 2021
… I'll be talking with HarperCollins about being listed as a publisher implementor
… anyone else you know of who is willing to publicly say they're planning to publish in Audiobook form, I'm happy to add
… we expect to start preparing for a Call for Consensus to advance [to Proposed Recommendation] in September
… which will put us on track for Recommendation in late November

MathML Refresh

Liisa: should we circulate this information?

MathML Refresh Community Group summary

WG draft charter [Neil Soiffer, 20-July]

Ralph: Brian Kardell from Igalia gave a presentation on their MathML work at TPAC to the PWG last year
… the MathML community group working on the refresh is now discussing a charter for a working group
… Brian told us that the CG is identifying the subset of MathML that is both implemented and used
… so this isn't redefining MathML, instead trimming it down to make it easier to implement

<Avneesh> MathML WG also includes work on Chemistry

Ralph: I haven't had the chance to review the draft charter, but MathML is important to this community

<Bill_Kasdorf> There will be two versions, the core "trimmed down" version and the full untrimmed but improved version.

Ralph: this is something that this community should provide feedback on
… bring them use cases

Ralph: as we know from the EPUB 3 WG charter process, there's a lot of time between starting and shipping a charter

Public MathML CG list archive

Liisa: Do you think this charter will improve MathML conformance in EPUB?

Liisa: do we feel there are still challenges in implementing and deploying MathML that will help the WG?

George: in addition to the MathML CG there's a CG on chemistry
… the Chemistry CG has been working with the MathML CG to add chemical semantics
… the first point was to identify the content as chemistry
… and add metadata suggestions to allow differentiation between symbols; e.g. 'K' can be potassium or Kelvin
… good work in the chemistry domain

Ivan: looking at the draft charter, it has a goal of two specs:
… a simplified MathML Core and a MathML 4 that adds terms and features
… it seems to be a big charter
… may require a lot of work to shepherd it through the process
… they have a long road ahead

Wendy: from what Brian told us at TPAC last year, the main challenge is not how MathML is referenced
… but rather on the reading system side
… WebKit doesn't support MathML natively
… we have to use a library to support it consistently
… and App Stores get angry if your package size is too large
… if Igalia's work is adopted by enough browsers, this will be a major change for us
… support in WebKit alone will be great

<Bill_Kasdorf> +1 to Wendy

<JulieBlair> +1 Wendy

<Bill_Kasdorf> It's a big deal and it's really happening

Liisa: yes; the implementation has always been the struggle
… there are some issues around authoring too

Wendy: yes; authoring has been a challenge but one of the biggest challenges was lack of interoperability

<Bill_Kasdorf> Would love to stay but have to drop for another meeting

Ralph: It's great that this is happening

George: Mellon Foundation provided the funding to NISO and NISO engaged Igalia

Tzviya: the integration with CSS and ARIA is one of the more important aspects
… the charter has so much in it; it might need to be trimmed
… integrating MathML into ARIA will be very useful and important

Ivan: early feedback on the CG's draft charter would be useful
… a clear standard for what is implemented is needed to have a core
… until that is done, extending beyond what's in MathML 3 seems a recipe for problems

Avneesh: it seems that Chrome may ship something within a year
… the charter has many deliverables; we need to help them streamline

Ralph: we need to know what the publishing community feels should be streamlined

Use cases and content types

Liisa: we'd like a TF in the CG for three content types:
… FXL + reflow
… what kinds of publications would benefit from being able to combine these?
… Full-page image + Reflow
… where the image could vary depending on the screen
… mix of FXL and Reflow in a single file
… so the reading system can select
… Daihei noted that this would be useful to the web comics publishers, so people could see something as if it were Reflow
… get a discussion going to see how these things might be implemented
… should we start a document and start sharing stuff?

Ralph: Part of the function of the BG is to find people who ought to be participating in these conversations
… whatever it takes

Wendy: I struggle with this
… they are all already EPUB features and in the spec
… in the spine
… is this a CG thing to understand the use cases?
… so the spec properly communicates how to do it
… it's not as clear in the spec as it needs to be
… maybe a GitHub thread on use cases is enough; it might not need to be a full Task Force
… we need to understand what people are trying to do and how we are failing them

Liisa: I said "task force" because that's how the CG chairs proposed the CG will work
… use cases are helpful; when there's a better understanding that it works and how, it can come back to the BG to help the reading systems understand how much content there will be
… putting together some use cases and handing them off
… showing what works, what works with some tweaking
… will help push the conversations

Avneesh: technically this is in the standard
… but it would be good to have some sample content and best practices
… people start implementing more easily with sample code

Daihei: the issues here are ...
… whether it's a formal TF or not, we should explore use cases and samples
… these subjects will help contribute to the advancement of business
… expansion of business
… I would like to see Publishing@W3C help get such contributions
… as you know, manga is a key element of digital publishing in Japan; 85% of the business is manga
… webtoons are making a big impact on manga in Japan
… the form of FXL rendering is going to be a key issue for digital publishing around the world
… the example cases could be really helpful
… and W3C's presence will be really appreciated

Tzviya: do we have a method for documenting these use cases?

Liisa: GitHub?

Ralph: we can ask around the Team for patterns for gathering use cases

PBG Meetings during TPAC

Liisa: Daihei, Cristina, and I would like to propose that we have two BG meetings on the same day as part of TPAC
… two 90-minute meetings on the 13th of October
… what big topics should we discuss there?
… should we propose cross-group meetings there?

Ivan: the EPUB WG needs to plan its virtual F2F
… it'd like to meet for several days

Wendy: I proposed some dates

TPAC Group Meetings scheduling wiki

Wendy: 19th to 23rd

<Avneesh> Oct 26 week should be week of breakout sessions

TPAC 2020 schedule

Ralph: we're asked to keep the week of 12-16 October free for joint group meetings
… so the PBG meeting on the 13th isn't necessarily a show-stopper
… but some of our participants and hoped-for observers may have conflicts with scheduled joint meetings

<tzviya> https://www.w3.org/wiki/TPAC/2020/GroupMeetings

Daihei: we'll come back with a proposed agenda

[adjourned]

– DRAFT –

Attendees