EPUB 3 History and Future

Facilitator: Tzviya Siegman, Wendy Reid

Learn about the 20+ year history of EPUB, how it came to the W3C, and what the new EPUB 3 WG will accomplish.

Slides

Minutes (including discussions that were not audio-recorded)

Transcript

Good morning, good evening, good afternoon, good day.

I'm Tzviya Siegman and I'm presenting today with Wendy Reid.

I work with Wiley and I am the co-chair of the Publishing Steering Committee.

Wendy and I will be talking today about "EPUB History and Future" and I'll give Wendy a minute to introduce herself and then we'll dive into our slides.

Thanks, Tzviya.

Hi everyone, my name is Wendy Reid and I work for Rakuten Kobo.

I am one of the co-chairs of the Publishing Working Group and of the EPUB Working Group.

Thanks, Wendy.

So today we're going to talk a little bit about the history of EPUB as well as where we think EPUB is going.

We know that a lot of people don't have the context about the complete history of EPUB, and that helps in understanding where it is that we think EPUB is headed.

So today, we'll talk about exactly what I just said.

The history, how W3C absorbed the International Digital Publishing Forum, the publishing activity at W3C and the EPUB 3.0 future.

In 2011, the International Digital Publishing Forum released EPUB 3.0.

I should say, I'm sorry.

I think I skipped a slide, sorry (laughs).

Somehow the slides got cut.

But anyway, prior to that in the '90s, the first version of EPUB was released, called the Open Ebook Format.

But for our purposes, EPUB 3 is relevant.

The shift to HTML5 in content and navigation is reflected in the work of W3C.

It incorporates the DAISY digital talking book format, which increased EPUB's overall accessibility.

I'm gonna take a step back from the slide because the history before this is important.

The first version of EPUB was called the Open Ebook Format.

It was released in the late '90s and the Open Ebook Format was renamed EPUB in, I think it was 2007, when we released the EPUB 2.0 format.

A maintenance release called EPUB 2.0.1 was put out a few years later, and that version, EPUB 2.0.1, was very stable for about five years.

That brings us to this slide.

In 2011, the IDPF released EPUB 3.0.

EPUB 3.0 was a major, major update.

It shifted to HTML5 and it significantly increased accessibility by incorporating some of DAISY's accessibility features from the digital talking book format.

In 2012, the IDPF released the EPUB fixed layout format, based on EPUB 3.

It's a specification that builds on an Apple internal format.

And the IDPF really tried to keep up to date with what it was that the vendors were doing.

So Apple had introduced this internal thing that a lot of people were using.

It looks exactly like the print page and it's a way to create a fixed page view within EPUB.

However, it was many years before anyone used EPUB 3, and just note that there were four years between version two and version three of EPUB.

We're a slow-moving industry.

The IDPF met W3C in 2013.

We made very heavy use of the W3C specifications.

There are references to them all over the place, and in 2013 and 2014, we launched the Digital Publishing Interest Group.

We produced a lot of notes, some of which even went on the Rec track, in conjunction with other groups.

We have "Web Publications for the Open Web Platform", which was a white paper; it's linked to here.

We had use cases and requirements.

We had a lot of notes about accessibility.

The annotations use cases, which contributed to the Web Annotations Working Group.

Our colleague, Dave Cramer, worked with the CSS Working Group to begin the work on Latinreq, and I and some colleagues worked on DPUB-ARIA 1.0, which is a TR today.

Now we come to how EPUB and web are compatible.

The W3C merged with the IDPF to close the gap, to coordinate more closely on EPUB and enhance it as a very successful use of applied web technologies.

EPUB is really just a package of HTML, CSS and metadata, as well as other things.

Some people call it a website in a box.

Some EPUB reading systems are built around a browser, some of them are browser engines with a layer of a reading system on top of it.

And in the ebook world, we call user agents reading systems.

So it's really a very nice application, applied web technology, as we often call it.

The business and technology of publishing are somewhat different from the business and technology of the web.

And I'm going to turn this over to Wendy now, to talk about how it is that we differ and why it is that this affects the way that we work so much.

Thanks, Tzviya.

So the business and technology of publishing is a really interesting thing, especially when you think about tech today and all of the different ways that tech companies operate.

Publishing doesn't do any of those things, generally.

The bulk of the drive behind EPUB really comes from trade publishing.

They're the primary users of EPUB.

They adopted it quite early and it was to put books on the web, because they wanted to shift the business model to digital.

This came about with the launch of reading devices like the Kindle and Kobo and many others around the world.

But the supply chain is quite complicated, especially for the sale of digital books.

Publishers often distribute to distributors, who often then distribute to retailers, third parties like Amazon, Apple and Kobo.

And for many publishers, EPUB 3 is just fine.

It does exactly what they need.

However, we do still actually have a lot of content in market that's on EPUB 2.

And that is because there's not a lot of drive around upgrading.

Next slide.

And there's part of a reason for this.

So publishing companies can be 200 or more years old.

They're not tech companies and they don't think of themselves that way.

Any R&D budget that they might have is often allocated to their website resources and backend systems.

They're not thinking about the tech of building EPUBs.

The EPUB team is often just part of the production team that also probably is producing their print books.

Innovation is not treated the same way.

They're not trying to, you know, make new books, they're often just trying to find different ways to sell their content.

And even then, they're really reliant on the retail experience provided by third parties.

The authors are customers just as much as the customer is.

Authors rely on royalties and publishers provide services to their authors, like the editorial services and production services that they wouldn't get in other places.

However, we do see, I will add to this, thanks to digital publishing, there's a really big contingent of self-published authors out there that are kind of flipping this on its head.

The business model of publishing is kind of odd in that authors get paid advances before their book is ever published.

And this can be a small amount of money or a large amount of money, really, depending on both who the author is but also potentially the impact of the book.

And so it's a bit of a gamble and publishers are well aware of this when they publish a new book.

The publishing companies do find new technology interesting but oftentimes, because of that lack of R&D budget, they may not necessarily know how to implement it within their business.

And publishing is not uniform.

I'm talking mainly here about trade publishing, and Tzviya can talk a little bit more about, like, educational and scholarly publishing, but publishing for education, publishing professionally, or even news publishing is really, really different in terms of what technology is used and how they operate.

Next slide.

So, this kind of brings us all around to today.

Why do we have an EPUB 3 Working Group?

We've got this kind of ecosystem of publishers that are maybe not super-techie.

They're not as aware of what's going on in the web world.

We wanna kind of bring that together.

EPUB is, today, I wasn't able to find a concrete number, especially during COVID, but it's an over-a-billion-dollar industry.

And this year alone, we've seen actually a huge rise in the use of digital publishing for distribution, for people to get the books when they want them.

And EPUB needs a stronger foundation.

We have some longstanding problems that have been around since we launched EPUB 3 and we think that using the testing framework that's provided by the W3C's Rec track process, we can actually fix some of these problems.

We also wanna integrate better with web technologies, like HTML5 and CSS 3.

As Tzviya mentioned, they're heavily referenced within the specification, but when EPUB 3 was launched, things were still quite new and so things within the spec weren't ironed out completely.

And so we still actually use XHTML as the main format within an EPUB.

We wanna change that.

We wanna kind of align ourselves better with the web.

We wanna do some work around accessibility.

As many people have probably heard, the European Accessibility Act is coming down the pipe.

And a lot of publishers are quite concerned about that, because it's one of the first acts that explicitly mentions e-books.

And so there's a lot of work to be done in that area.

We wanna make sure that we're providing the most up-to-date information for publishers, so that they can produce the best possible content.

And we're hoping that any improvements that we make to EPUB will make production cheaper for publishers and more automated.

And we're gonna talk a little bit about the biggest problem that we've had since the launch of EPUB in general, which is interoperability.

Next slide.

So the EPUB 3 interoperability problem is, the best way to explain it I think to web people is, imagine the web back in, like, 2006, when people used to have those buttons on their websites that said "Best viewed in Firefox."

We have that today, except it's best viewed on iPad, or iBooks, or this book is optimized for Kobo, things like that.

We did a survey earlier this year of ebook users, so that could range from production people in publishing houses to tool vendors, to just readers.

And we learned that the number one problem is the lack of interoperability between reading systems.

One huge request was that the spec really needs to clarify the role of the user agent, or reading system, and the use of JavaScript and CSS. Really enhanced JavaScript and really enhanced CSS are a big part of the web today, but not so much a part of reading systems.

Interop is really expensive for publishers.

They have to spend a lot of time and money on QA and developing workarounds for each of the different reading systems that they distribute to.

And it's expensive for user agents, because as kind of illustrated in this slide, I was able to find just in the documentation from my company and the documentation from DAISY three different ways to produce a footnote.

Footnotes are really common in books and yet as a publisher, I don't know which is the right way to do it.

And if I do it the wrong way, some reading systems might not even recognize that my footnote is actually a footnote and therefore might not display it the way that I want it to be displayed.

And so this is just one small example, but there are many other examples of features within EPUB that don't work across different user agents, or work differently across these different user agents.
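As an illustration of the footnote point (not part of the spoken talk): the three variants on the slide are not reproduced in this transcript, so the sketch below assumes three markup patterns that do exist in the wild, namely the `epub:type="noteref"` attribute from EPUB 3 structural semantics, the `role="doc-noteref"` attribute from DPUB-ARIA 1.0, and a plain link with no semantics at all. The function names and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
OPS_NS = "http://www.idpf.org/2007/ops"  # namespace of the epub:type attribute

# Hypothetical sample: three note references, each marked up differently.
SAMPLE = """<p xmlns="http://www.w3.org/1999/xhtml"
   xmlns:epub="http://www.idpf.org/2007/ops">
  <a href="#n1" epub:type="noteref">1</a>
  <a href="#n2" role="doc-noteref">2</a>
  <a href="#n3">3</a>
</p>"""


def classify_note_link(anchor):
    """Return which (assumed) footnote convention an <a> element follows."""
    epub_types = anchor.get(f"{{{OPS_NS}}}type", "").split()
    if "noteref" in epub_types:
        return "epub:type"
    if anchor.get("role") == "doc-noteref":
        return "dpub-aria"
    # Nothing tells a reading system that this link points to a footnote.
    return "plain link"


def classify_all(xhtml_text):
    """Classify every <a> element in an XHTML fragment, in document order."""
    root = ET.fromstring(xhtml_text)
    return [classify_note_link(a) for a in root.iter(f"{{{XHTML_NS}}}a")]


if __name__ == "__main__":
    print(classify_all(SAMPLE))
```

A reading system that only checks one of these conventions would treat the other two references as ordinary links, which is the kind of divergence publishers end up testing for by hand.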

Next slide.

So, the future.

So what the Working Group is working on right now is the development of what, for now, we're calling EPUB 3.3.

It's all about testing.

We're working on getting that test suite, a really robust test suite, so that we can spot those interoperability problems and solve them.

We wanna align even better with the web by implementing HTML5 and CSS3.

We want clear info for reading systems.

We wanna make it easier to, you know, create interoperability between those different platforms, where possible.

And the number one thing that people always care about in the EPUB ecosystem is backwards compatibility.

We cannot implement something in the new version of the spec that potentially breaks old books, because there are a lot of old books.

And EPUB accessibility.

We're not trying to create a magic accessibility secret sauce that just solves every problem in the world but we do wanna make it a lot easier for publishers to create forward accessible content for their users so that every single EPUB that is produced in the coming years is accessible for users, regardless of what kind of needs they might have.

And we wanna apply WCAG and all the metadata guidelines that we've developed.

Next slide.

One thing that I wanted to talk about a little bit was the possibility of EPUB and web.

One thing that gets mentioned a lot by both users and by publishers is why can't I just open an EPUB?

Why do I have to have a dedicated reading system software on my computer, or in my browser?

Why do I have to rely on a retailer? And, you know, people often complain about getting stuck in these walled gardens.

If you buy a book from Amazon, you have to use Amazon's applications, things like that.

You know, if you buy on Kobo, it's easier to use the Kobo app than it is to download the book and move it somewhere else.

You can do it but you know, you have to take a couple of extra steps.

One thing is that you just can't open an EPUB anywhere.

You need those reading systems, you need a dedicated application on your computer, or on your phone or in your browser but how great would it be if you could just open EPUB in any web browser.

This very briefly was possible in Edge and then they dropped the feature, but we think that there's actually a lot of possibility out there for books to just be opened in browsers, if it was implemented.

There's a lot of good reasons for this, accessibility, making books as easy to open and use as possible.

Making them super available, especially for students and people who might not have access to like the latest e-readers and just the discoverability.

Being able to open a book anywhere makes it a lot easier for people to get to content really fast and, you know, enjoy it wherever they are.

And next slide.

I think that's it, yeah.
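As an illustration of the "website in a box" description earlier in the talk (not part of the spoken talk): an EPUB is a ZIP archive whose first entry is an uncompressed `mimetype` file, and whose `META-INF/container.xml` points to the package document that lists the content. The minimal sketch below builds a toy archive in memory and locates its package document. The file names `EPUB/package.opf` and `EPUB/chapter1.xhtml`, the placeholder file contents, and the function names are assumptions for the example; the toy archive is not a valid EPUB.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def build_demo_epub():
    """Build a toy EPUB-shaped ZIP in memory (placeholder content only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry comes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("EPUB/package.opf", "<package/>")    # placeholder
        zf.writestr("EPUB/chapter1.xhtml", "<html/>")    # placeholder
    return buf.getvalue()


def find_package_document(epub_bytes):
    """Return the path of the package document named in container.xml."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        mimetype = zf.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError(f"unexpected mimetype: {mimetype!r}")
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(
            f"{{{CONTAINER_NS}}}rootfiles/{{{CONTAINER_NS}}}rootfile")
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        return rootfile.get("full-path")


if __name__ == "__main__":
    print(find_package_document(build_demo_epub()))
```

Everything past this lookup (reading the package document, rendering the XHTML and CSS) is ordinary web content, which is why opening an EPUB directly in a browser seems within reach.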

Sponsors

Platinum sponsor

Coil Technologies

Media sponsor

Legible

For further details, contact sponsorship@w3.org