EPUB 3 Working Group Telco — Minutes

Date: 2020-12-18

Attendees

Present: Ivan Herman, Laura Brady, Masakazu Kitahara, Ken Jones, Toshiaki Koike, Avneesh Singh, Zheng Xu (徐征), Tzviya Siegman, George Kerscher, Brady Duga, Laurent Le Meur, Charles LaPierre, Wendy Reid, Hadrien Gardeur, Bill Kasdorf, Gregorio Pellegrino

Regrets:

Guests: Dave Cramer

Chair: Wendy Reid

Scribe(s): Brady Duga, Wendy Reid

Content:

1. in preperation to html5 in epub3
2. release identifier construction
3. Non-unique unique identifiers and reading system conformance
4. Resolutions

Ivan Herman: Admin stuff about rejoining.
… Grace period still in place for 45 days, but pull requests from orgs that have not rejoined will be flagged.
… So please rejoin if you do actual work in the group.

Avneesh Singh: I hope it is not flagging Matt’s PR.

Dave Cramer: This is due to the new patent policy change.

George Kerscher: Who does this? Personal or AC rep?.

Ivan Herman: AC rep does it..

1. in preperation to html5 in epub3

See github issue #636.

Dave Cramer: Reading systems and xhtml vs html.
… Thanks to those that already responded to their uses of xhtml vs html.
… Other RS vendors, please provide this information.
… I have tried a few via side loading and some of them work, but want a proper answer from everyone.
… this is to inform discussion next year.

George Kerscher: DAISY contacted devs and asked about support of features.
… “we” [DAISY or the WG?] could take the list of known RSes and ask them about xhtml vs html.
… or do we just want a feel.

Dave Cramer: Just trying to get a feel to understand cost.
… so might be too soon.

Ivan Herman: A little worried that DAISY might end up in the middle of a discussion it doesn’t want to be in.
… it would be helpful if those contacts could be disclosed to the WG so we could contact them.

George Kerscher: No problem, we can hand out the list, may need to ask a couple of people, but not a big deal.

Brady Duga: I don’t think this is too early to start asking.
… there’s not too many RSs in this group.
… it’s worthwhile to check in and let people know we’re having this discussion.
… we need to reach out to the community.

Laurent Le Meur: this is a big change and we really need to ask about it now.
… less RSes can even support html5. The issue may not be cost, but will be RSes left behind that will never support html5.

Tzviya Siegman: Coming at this as publisher and dev experience from epubcheck.
… don’t want to muddy the waters about content, but would be helpful to understand if the issue is supporting two types of books, or if it is making tools, etc.
… want to know where the issues are.

Hadrien Gardeur: Worried about the consistency.
… We can either be more aligned with the web to make it easier to use html tools to build content, etc.
… or are we being conservative?.
… We seem to be doing both.
… We have been very conservative, now we are not with this discussion.

Dave Cramer: Good point, we aren’t making a decision here, just want to understand where we are and where we might want to go.

Garth Conboy: HTML serialization won’t work out of the box, but doesn’t mean we are against it.
… The issue about stuck in time RSes is a good one.
… As is the issue about direction.
… But we also need to talk to publishers and see if they care.
… Need to understand if RSes are willing to adapt, not understand what they do now.

Dave Cramer: Let’s get back on track. We aren’t discussing the issue now, we are just preparing to discuss it.

Avneesh Singh: Also please remember there is project to move epubcheck to the w3c html validator.
… if that prototype works, then it solves the problem of epubcheck not working with html5.
… second, it is good to start collecting data now.

2. release identifier construction

See github issue #1440.

Dave Cramer: Need to do some boring stuff called work.
… The release identifier comes up from time to time (id + release time).
… So you can figure out new vs update.
… This is really just advice to RSes on how to distinguish books.
… So really hard to test. How do you determine if there was an error?.
… Should this be in the appendix non-normatively?.

Ivan Herman: Is it used ever by anyone?.

Dave Cramer: That is my favorite question.

Hadrien Gardeur: Partial reply: There are roughly 2 groups of RSes.
… one where content always comes from a controlled warehouse (eg big retailers).
… second is more generic that can support any type of epub.

See github issue #1310.

Hadrien Gardeur: this id makes more sense for the second group.
… We are talking about unique id, but also saying maybe it isn’t unique, so maybe don’t rely on it. Very confusing for RS.
… we are building a spec that should work for all types of RSes.
… Personally know that some RSes in the second group do use that id.
… Recs about looking at title is often wrong (eg if the change was to fix the title).

Garth Conboy: We are mostly a retailer, but also allow side loading. We ignore both the UID and the mod date-time..

Ivan Herman: Need to be careful, there are two notions the UID and the release ID.
… This creates an identifier based on some info in the epub.
… this issue is whether that algo makes sense.

Hadrien Gardeur: If this is specifically about release id, then I don’t know who is using it.
… UID is different answer.

Dave Cramer: Not discussing removal of id or release date.
… This is about uniqueness of UID, what happens when two books have the same UID, etc.
… We probably don’t want to start dictating how RSes present their libraries.
… I would say we remove this concept since it can’t really be tested.

Garth Conboy: +1.

Tzviya Siegman: +1 to dauwhe.

Ken Jones: Maybe have a revision number?.
… to make it clear there is id and version explicitly.

Hadrien Gardeur: But that is the issue at stake.
… epub 2 only had uid.
… this was added in epub 3, but isn’t really used.
… first group doesn’t really care, and second group doesn’t handle versioning at all.
… don’t think we should be in the business of specifying how these are used.
… Would like to not say anything about UID, just say what it is and leave it to the RS implementors to figure it out.

Zheng Xu (徐征): Will be very difficult to test something that is undefined.

Tzviya Siegman: Want to caution about versions and revisions.
… Not changing too much in the spec is good guidance.
… There has been a lot of discussion about rev vs version, don’t want to take it up here.

Gregorio Pellegrino: Publishers don’t seem to use this, we use ISBN and file hashes.

Garth Conboy: We seem to agree that release ID is not really used, and even if it is doesn’t need to be in the spec.
… So we seem to agree on this issue.

Dave Cramer: Makes sense. We won’t comment on metadata construction in the specs.
… proposal would be to remove that language without touching required metadata.

Ivan Herman: Need to be careful about how this is phrased.
… are we proposing removal or moving to non-normative section/document.
… Also DC terms has a hasVersion term, so we have that too.
… May want to mention it, but no reason why it couldn’t just be used.
… But I propose that dauwhe propose something.

Dave Cramer: [typing noises].

Proposed resolution: remove language around the concept of a release identifier, while retaining the mandatory unique ID and dcterms:modified metadata. (Dave Cramer)

Dave Cramer: I don’t there is a compat issue here.
… don’t think there is any testable change here.
… This is basically editorial.

Ivan Herman: We can say any valid 3.2 will be valid with 3.3, this change does not alter that.
… Doesn’t change behavior or authoring.

Wendy Reid: +1.

Ivan Herman: +1.

Bill Kasdorf: +1.

George Kerscher: +1.

Brady Duga: +1.

Charles LaPierre: +1.

Hadrien Gardeur: +1.

Garth Conboy: +π.

Laura Brady: +1.

Ken Jones: +1.

Tzviya Siegman: +1.

Gregorio Pellegrino: +1.

Zheng Xu (徐征): +1.

Dave Cramer: Resolved!.

Resolution #1: remove language around the concept of a release identifier, while retaining the mandatory unique ID and dcterms:modified metadata.

3. Non-unique unique identifiers and reading system conformance

See github issue #1310.

Dave Cramer: the spec says “Reading Systems MUST NOT depend on the Unique Identifier being unique to one and only one EPUB Publication. Determining whether two EPUB Publications with the same Unique Identifier represent different versions of the same publication (see Release Identifier), or different publications, might require inspecting other metadata, such as the titles or authors.”.
… Has no idea how to test this.
… Just make two books with the same UID and see what happens?.
… Seems like implementation advice.
… Would not be sad to see this go away.

Garth Conboy: Concept is useful, maybe turn must to should, remove ref to release id and declare victory.

Ivan Herman: Need to be careful about testability being the only criteria.

Tzviya Siegman: +1 to ivan.

Ivan Herman: when it comes to metadata, we can’t necessarily depend on testability.
… since, e.g., libraries want metadata for their purposes we can’t test, but we may still want to require it.
… That is a general comment on using testable as a criteria.

Wendy Reid: Think it is the case that we have more bad uids than correct ones.
… Create our own uids since they are better than what we get.

Dave Cramer: Maybe move to a note?.
… non-normatively.
… Which also solves the testing issue.

Laurent Le Meur: +1 to Dave for making it a note (the 2 paragraphs, one about the “static” aspect and one about RS).

Dave Cramer: And cleans up the spec a bit.

Wendy Reid: +1 to Dave.

Bill Kasdorf: We have a combo of a spec with recommended practice.
… These are often kept separate, prefer that since it keeps the spec only a spec, and allows for updating recs more easily.

Charles LaPierre: +1 to Bill / Dave.

Ivan Herman: Trying to reconcile this. We have to be consistent with all the metadata.
… For instance author must be correct, but that is also untestable.
… May need to clarify for all metadata that it is required, and there is an implicit must, and give a blanket statement that any of it could be wrong.

Proposed resolution: make statement about reading systems not depending on uniqueness of unique ID a note rather than a normative statement. (Dave Cramer)

Tzviya Siegman: As a publisher, the way the metadata is currently is very clear.
… and while it is processed in different ways, this is a very easy part of the spec to understand.
… One major complaint is the spec is already too broken up.
… don’t want to add to that.

Hadrien Gardeur: prefer to make non-normative and also revisit the text overall.
… don’t think it needs to exist at all.

Dave Cramer: Two proposals: 1 make it non-normative or 2 remove entirely.

Garth Conboy: If we make it non-normative it might be rewritten.

Wendy Reid: +1.

Avneesh Singh: 1.

Hadrien Gardeur: 2.

Charles LaPierre: 1.

Dave Cramer: vote 1 for non-normative, 2 for remove, 0 to abstain.

Ken Jones: 0.

Ivan Herman: 1.

Laurent Le Meur: 2.

George Kerscher: 2.

Toshiaki Koike: 1.

Garth Conboy: +1 (make non-normative; shrink hopefully; but retain at least the concept).

Bill Kasdorf: 1.

Masakazu Kitahara: +1.

Tzviya Siegman: 1.

Gregorio Pellegrino: 1.

Zheng Xu (徐征): 1.

Ivan Herman: +1 to Garth.

Laura Brady: +1.

Ivan Herman: Can we agree to make non-normative and request new, shorter text?.
… then vote on the new text later.

Laurent Le Meur: +1 to Ivan.

all: rumblings of assent.

Resolution #2: make statement about uniqueness of unique ID in #1310 non-normative.

Garth Conboy: Volunteer audio books time slot. Something about moving back two hours.

Brady Duga: dauwhe + wendyreid: chairs will discuss.

Dave Cramer: Thanks for actual progress!.

4. Resolutions

Resolution #1: remove language around the concept of a release identifier, while retaining the mandatory unique ID and dcterms:modified metadata.
Resolution #2: make statement about uniqueness of unique ID in #1310 non-normative.