EPUB Next

Notes from the 27 October 2021 publishing community meeting, a joint meeting of the W3C Publishing Business Group, Publishing Community Group and EPUB 3 Working Group, hosted as part of TPAC 2021. Thanks to Wendy Reid for taking notes in real time.

Wendy offers an overview of the EPUB 3 WG: backward compatibility, spec cleanup, FXL a11y, locators TF.

13 Ways of Looking at Web Publications: Dave Cramer on Twitter, "13 ways of looking at web publications: a thread"

EPUB 3.3 change log: https://www.w3.org/TR/epub-33/#change-log

Rick Johnson (VitalSource): How to define business/market specific issues against privacy requirements. What about profiles for market specific rules?

Wendy Reid (Rakuten/Kobo): Prior to EPUB 3 WG, we wrote Publication Manifest https://www.w3.org/TR/pub-manifest/ - idea of core backbone with market specific needs

Tzviya Siegman (Wiley): in the education space, users want to live in the browser but also want the package - how do we achieve both?

Hadrien Gardeur (EDRLab): what is our goal? We need to figure out our scope

Florian Rivoal (IE): there is a gap between EPUB and the web. Do we want to close it? Do we need multiple HTML files in an EPUB?

Ivan Herman (W3C): maybe we should be involved in discussions on WebViews

Ralph Swick (W3C): Consider what the pain points are; use that as a way to guide this conversation.

Dave Cramer: use case: someone wants to create a book with lots of interactive JS. He wants to be able to send it to people for $10 without them needing to install software. Ad-hoc distribution of books.

Tzviya Siegman (Wiley): Big +1 to Dave

Andrew Rhomberg (Jellybooks): we do a reading system for browser users; we have to remove the concept of files and downloads. The recipient does not need to know about EPUB. But a big problem is that content comes in EPUB 2.

Taking trade publishers forward takes forever.

Brian Kardell (Igalia): To underscore what Tzviya was saying before, the more that we get into browser engines the more opportunity there is to work with more people and work on more problems. To Florian’s comment about the web being based on a single page, I think there are positive things that you could explore in this space that keep you closer to the web, and help you find more alignment within the W3C. One of the problems we have in W3C is time. Everyone’s time is finite. You need to prioritize. What has the most people who can work on a problem?

Brady Duga (Google): Going back to how this meeting started, we have some successes and failures. The “if we spec it we will come” has led to all our failures. The successes are when people did something and THEN we spec’d. FXL had multiple implementations before there was a spec. We need people putting resources into an idea.

Tzviya: Brady said it better than I did.

Charles LaPierre (Benetech): with browsers, Microsoft Edge opened EPUBs natively, with good a11y features. But then MS went with another engine, and we lost that feature. I’d like to see that happen again. We just need to get the browsers on board.

Florian Rivoal: what Andrew was talking about was EPUB in the back end, but one of the major successes is that we can sell books, rather than relying on advertising-supported content. We can do the book experience with service workers, etc., but that doesn’t address the business model.

Microsoft Edge was basically bundling an EPUB reader inside the browser--it was two things in one.

Lars (Colibrio): Interesting stuff. So many things… to Florian’s comment regarding the value of being able to “buy” stuff, rather than being a service… I call it the digital hardcopy. My owned copy of an EPUB. Just like you can actually have a PDF, which we’re trying to replace maybe. You can send a PDF to someone, it has a provenance. You can take notes on it. We need to focus more on that. We always have the canonical version, but there’s the individual copy. Imagine your book being accessible in two hundred years. That can’t be done by books as a service. We need to think about archivability. Especially if we want to help orgs move away from PDF.

If we want EPUB to be a part of the web, the web needs to decide on an archive format for web content. But for now EPUB is a good format for archiving web content. OCF is a good packaging format.

Rick Johnson (VitalSource): something Brady said… we should look at what interesting things people are doing now, and bring them to a spec… the interesting stuff we’re working on is going on outside W3C--authentication, machine learning. What is our tolerance/appetite for working with people outside the domain of the W3C?

Tzviya Siegman (Wiley): That’s a good question. Lots of authentication work is done in W3C.

The next question: what do we think is missing?

Wendy Reid (Rakuten Kobo): The business group had a discussion… There’s an evolving business model of how digital content is sold. Webtoons are a big deal… vertically scrolled comics, often on a freemium model: you can watch ads to get tokens to buy more episodes. Books are being serialized. Might be a dollar per piece. There’s the rise of a chapter-based model. Every couple of days you get the next chapter. This mirrors streaming models for music or TV… but serialization is common for print--it happened in the 19th century. Some of these models are opposed to owning the entire book. I love webtoons but they’re really inaccessible. We should be making it better.

Tzviya Siegman (Wiley): the whole subscription textbook model has blown up. Now we’re renting access to a whole library.

Avneesh (DAISY Consortium): my comments come from management/strategy perspective. Backwards compatibility of EPUB is important. We must continue maintaining EPUB 3. Our biggest pain point is that we are not using Publishing CG enough. In IDPF there was one big group where everyone could talk. Now there are three groups--WG, BG, CG.

Can we take forward what we’re doing on this call, and build more of a community inside the community group? I’d like to orient the conversation to how we can work on things in the CG. We have ideas but we need to start work.

Laurent Le Meur (EDRLab): Backward compatibility of EPUB has large implications. If we have to keep that, then there is no EPUB 4, only EPUB 3.x. We can refine the spec and create profiles per business sector, but that’s all we can do. We should first validate or invalidate this idea.

I think the biggest pain point in EPUB is still interoperability. Companies distributing and selling EPUB publications (Apple, Amazon …) are filtering the files that come to them; they have their own rules and constraints, after EPUBCheck validation. A publisher cannot send the same file to everybody. This is a problem we should tackle with them.

Lars (Colibrio): That sounds wonderful. It would be nice if we could have a CG that works together… there’s more enthusiasm for new and fun stuff. I was talking to Laurent at Frankfurt, maybe we should start a marketing group inside the CG.

The marketing team next door said to me "EPUB must be the most unsexy brand we’ve ever seen." We need to work on that. We need to lift the spirits when it comes to EPUB.

Lots of awesome stuff has been done with this standard. Let’s market the positive and awesome stuff we can do. We should strive to improve, but we are sitting on a competent and well-functioning standard. We do have compatibility issues, like the web in the early 2000s.

Liisa McCloy-Kelley (PRH): I want to talk about backwards compatibility. Part of that conversation was that there’s a feeling that backwards compatibility needs to be contained for the huge numbers of manga that have been made in the past, and then coming up with something beyond EPUB for new stuff. The future could be different than the past. But we need to be mindful of the need for a11y; there is a moment over the next few years. Our community should bring the baseline up. How do we make it worthwhile to fix our old catalog, moving from EPUB 2 to EPUB 3 and improving a11y? I need to do this to 30,000 titles.

Tzviya Siegman (Wiley): we see the need for backward compatibility. But readers want more. COVID changed textbooks forever. Print textbooks are dead. No one worries about print compatibility.

Jeff Xu (Gardenia): First, in terms of the web browser, it’s really simple technology. Edge supported EPUB pre-Blink. What is the value for a web browser in adopting EPUB? It might happen in different layers. It’s not a technical issue. Think about what the value of EPUB and reading systems is. Pagination issues, reading experience issues… what is our value as a reading system? Annotations, whole-book search (in a browser you can only search the current page) and some other stuff, including DRM, publishing- or reading-specific stuff… could we bring more of this discussion into the CG… bring some low-level solution, then we can bring it to the Working Group if a spec is needed… I know we had some discussions about annotations that didn’t move forward. I’d like to see discussions in the CG.

Mateus (Norton): I’d like to connect some points made by a lot of people. It’s important to realize that we have created a strong community. We know we’re working on interesting things beyond the specs. We know that we have a need for incubating ideas, and we have a channel for that in the CG. We need to combine these two factors. Embrace the cool things, incubate them in the CG. There are things like webtoons in the world, but they don’t follow a11y principles. We need to shepherd these things so they can become more robust.

We have people working with audio books. We have people working with publication manifests. What projects are happening on the boundaries? We have three serious problems. 1. Bandwidth—people have bandwidth problems. 2. Finding people to fill leadership positions. 3. Misunderstanding of what the necessary skills are. People think they need a technology background. We do often look for expertise from people who have experience with pain points--we can connect them with people who work with text.

I hope the outcome of this meeting is a shared energy, and taking next steps in a more active way. I know there are serious constraints imposed by ideas like backward compatibility. But we need to think about that. Please join us in the CG.

Lars: This is the time, not forgetting the standard we worked so hard on, but backing it more fervently than ever. See it come to fruition. Outwardly. When it comes to interoperability, we don’t have much anarchy when it comes to HTML as a platform; we have browsers working together, with interoperability. Reading systems need to work together and hash things out so we can build a platform that publishers and content creators can trust. Otherwise it will be hard to convince anyone to build EPUB. As in the early days of the web and its interop issues, we need to work together!

We reading systems need to sit together and agree to standards.

Tzviya: What we’ve heard overall is that there’s not one problem to solve. We have a number of issues; we mainly need to talk to each other.

We have a number of ways to communicate that don’t involve weekly meetings. We have a monthly meeting of the community group, with flexible times for people around the world. There are Business Group meetings that are also time zone inclusive. The CG is open to everyone.

What is the barrier here? Tell us what is making it difficult to participate, we want to know if we’re overwhelming.

Wolfgang: You are not overwhelming. If you have a niche area, like dictionaries, you are unsure if you are occupying too much space for your area. You are very open and welcoming.

Rick: Coming back into this from being away, we have three groups with a lot of overlap, sometimes it’s three times as much work to attend all three.

Tzviya: I wanted to set this up like in the old IDPF days, where we all get into a room together. I would love to hear any ideas on overcoming these barriers. It’s not dissimilar to the IDPF days where tech people and business people sometimes had challenges.

One of the reasons we joined with the W3C was being able to work with the browsers, and yet there’s little overlap. There are people here involved in browser work, but there’s not a lot of overlap. Some of it is time. Sometimes it’s not about being able to file a bug, but discussing use cases.

Ralph: Thanks Rick. I see the agreement in Zoom chat that it’s challenging to participate in all of the activity. One of the reasons we separated the conversation into a BG, CG, and WG like this is that we thought there would be different groups of people who would feel more comfortable in those areas: business people discuss in one place, different from the Community Group or the WG. The Community Group is a place where people discuss their wishlist or prototypes.

I felt that each area would have different agendas or priorities. If you feel you need to be in all three places, as Tzviya commented, we need to come back to that, to see what these independent conversations need from each other. If there’s no one in your organization to delegate some of these conversations to, maybe we need to rethink the structure of the groups.

Tzviya: The goal of this, aside from getting everyone together, is understanding what we should work on after EPUB 3.3.

Lars: The good thing about the CG is that it’s open. Say we merge them together, we potentially exclude people due to the membership requirements.

Rick: New guy coming back: the question is, is the reality of the BG that it is a way of getting into the W3C for a cheaper price? Is there an economic conversation to be had about doing something there?

Tzviya: There is definitely a conversation to be had there, but I want to focus on the main topic we gathered here today for.

What’s missing? Is the lack of community an issue? Are you hoping EPUB opens in the browser? Are you concerned about EPUB branching out into several specs if something isn’t done?

Earlier on Dave posted a Twitter thread on publishing that points out a lot of issues. It points to books that exist on the web today that aren’t packages. We have several examples of this online today.

Hadrien: You’re listing a lot of things that are necessary if we want to create a specification. But maybe we don’t need a new standard for reading on the web, at least not for web-native publications. Maybe the mistake we made in the past was not declaring a difference between web-native content and content that might be accessed via web technologies. I haven’t seen good arguments to convince browsers or content providers to do something different.

Florian: If we’re wondering what to do next, the primary question is a business one. The web is a publishing industry. The difference between the web and the “publishing” industry is business cases. What is it about the things that can be done with web-native content that can’t be done with existing business models? What do we want to do that we’re failing to do? Many of these things have existing answers. Once we understand the business model, we can move from there.

Ben (Pearson): One of the things I’d like to see us solve is the fundamental question of remote resources. How do we open up to the web and maintain security? Fundamentally, do we want to access the potential of linking out to other resources or publications? There are challenges: the ephemeral nature of these resources, and privacy/security.

Lars: The integrity of an EPUB is a philosophical question. We need to take into account the ephemerality of resources. I saw Andrew’s comment about replacing PDF. We need to communicate to the public about PDF being an image format; it’s hard to explain to a person (even in publishing) why you can’t make them accessible. It’s not a document as such. We need to work with academic publishers on making documents accessible with web formats that can be used as data sources.

Tzviya: Wiley is an academic publisher, and we can’t tell users what to do. HTML is more likely than EPUB to replace PDF.

Andrew: I second that. PDF is difficult to peel off. Moving to HTML is more common. Where are the use cases in which EPUB is better than PDF? What has struck me recently is the use of PDF in government bodies, where accessibility is highly important; there are opportunities, and we need to spell out the use cases. When and where could EPUB replace PDF because of its accessibility and friendliness? Where can we make a case and where are we tilting at windmills?

Tzviya: As Lars said, we need tools. Friendly tools that are easy to use. If we don’t have the tool chain. Tools don’t always support modern EPUB.

Wendy: EPUB in browsers is key to some of this transition (especially government).

Tzviya: We need marketing, maybe a TF for evangelism in the CG.