EPUB 3.3

W3C Candidate Recommendation Draft

This version:
https://www.w3.org/TR/2022/CRD-epub-33-20221206/
Latest published version:
https://www.w3.org/TR/epub-33/
Latest editor's draft:
https://w3c.github.io/epub-specs/epub33/core/
History:
https://www.w3.org/standards/history/epub-33
Commit history
Test suite:
https://w3c.github.io/epub-tests/index.html
Implementation report:
https://w3c.github.io/epub-specs/epub33/reports/
Editors:
Matt Garrish (DAISY Consortium)
Ivan Herman (W3C)
Dave Cramer (Invited Expert)
Feedback:
GitHub w3c/epub-specs (pull requests, new issue, open issues)
public-epub-wg@w3.org with subject line [epub-33] … message topic … (archives)

Abstract

EPUB® 3 defines a distribution and interchange format for digital publications and documents. The EPUB format provides a means of representing, packaging, and encoding structured and semantically enhanced web content — including HTML, CSS, SVG, and other resources — for distribution in a single-file container.

This specification defines the authoring requirements for EPUB publications and represents the third major revision of the standard.

Status of This Document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was published by the EPUB 3 Working Group as a Candidate Recommendation Draft using the Recommendation track.

Publication as a Candidate Recommendation does not imply endorsement by W3C and its Members. A Candidate Recommendation Draft integrates changes from the previous Candidate Recommendation that the Working Group intends to include in a subsequent Candidate Recommendation Snapshot.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

1.1 Overview

This section is non-normative.

EPUB 3 has been widely adopted as the format for digital books (ebooks), and this revision continues to increase the format's capabilities to better support a wider range of publication requirements, including complex layouts, rich media and interactivity, and global typography features. The expectation is that publishers will utilize the EPUB 3 format for a broad range of content, including books, magazines, and educational, professional, and scientific publications.

This specification represents the core of EPUB 3 and includes the conformance requirements for EPUB publications — the product of the standard. The other specifications that comprise EPUB 3 are as follows:

These specifications represent the formal list recognized as belonging to EPUB 3 and that contain functionality normatively referenced as part of the standard. The development of extension specifications periodically adds new functionality to EPUB publications. Features and functionality defined outside of core revisions to the standard, while not formally recognized in this specification, are nonetheless available for EPUB creators and reading system developers to use.

The non-normative EPUB 3 Overview [epub-overview-33] provides a general introduction to EPUB 3. A list of technical changes from the previous version is also available in the change log.

1.2 Organization

This section is non-normative.

This section reviews the organization of this specification through the central product it defines: the EPUB publication.

An EPUB publication is, in its most basic sense, a bundle of resources with instructions on how to render those resources to present the content in a logical order. The types of resources that are allowed in an EPUB publication, as well as restrictions on their use, are defined in 3. Publication resources.

A ZIP-based archive with the file extension .epub bundles the EPUB publication's resources for distribution. As conformant ZIP archives, EPUB publications can be unzipped by many software programs, simplifying both their production and consumption.

The container format not only provides a means of determining that the zipped content represents an EPUB publication (the mimetype file), but also provides a universally named directory of non-normative resources (/META-INF). Key among these resources is the container.xml file, which directs reading systems to the available package documents. Refer to 4. Open Container Format (OCF) for more information about the container format.
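As a rough illustration of the container checks described above, the following Python sketch builds a tiny in-memory archive and then inspects it. The media type string, the container.xml namespace, and the rootfile and full-path names are taken from 4. Open Container Format (OCF) rather than from this overview. The path EPUB/package.opf and all function names are hypothetical, and the archive is not a complete EPUB publication.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of container.xml (defined in 4. Open Container Format (OCF)).
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Hypothetical container.xml pointing at a made-up package document path.
CONTAINER_XML = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_sample_container() -> bytes:
    """Create a minimal, illustrative archive (not a full EPUB publication)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype file is written first and stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
    return buffer.getvalue()


def looks_like_epub(data: bytes) -> bool:
    """Return True if the archive carries the expected mimetype file."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        if "mimetype" not in archive.namelist():
            return False
        return archive.read("mimetype") == b"application/epub+zip"


def package_document_paths(data: bytes) -> list[str]:
    """List the package document paths declared in META-INF/container.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("META-INF/container.xml"))
    return [rootfile.attrib["full-path"]
            for rootfile in root.iter(f"{{{CONTAINER_NS}}}rootfile")]
```

A real conformance checker does far more than this; the sketch only shows how the mimetype file identifies the archive and how container.xml leads to the package document.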

An EPUB publication is typically represented by a single package document. This document includes metadata used by reading systems to present the content to the user, such as the title and author for display in a bookshelf as well as rendering metadata (e.g., whether the content is reflowable or has a fixed layout). It also provides a manifest of resources and includes a spine that lists the default sequence in which to render documents as a user progresses through the content. Refer to 5. Package document for the requirements for the package document.

The actual content of an EPUB publication — what users are presented with when they begin reading — is built on the Open Web Platform and comes in two flavors: XHTML and SVG. Called EPUB content documents, these documents typically reference many additional resources required for their proper rendering, such as images, audio and video clips, scripts, and style sheets.

Refer to 6. EPUB content documents for detailed information about the rules and requirements to produce EPUB content documents, and [epub-a11y-11] for accessibility requirements.

An EPUB publication also includes another key file called the EPUB navigation document. This document provides critical navigation capabilities, such as the table of contents, that allow users to navigate the content quickly and easily. The navigation document is a specialized type of XHTML content document which also allows EPUB creators to use it in the content (i.e., avoiding one table of contents for machine processing and another for user consumption). Refer to 7. EPUB navigation document for more information about this document.

EPUB publications by default are intended to reflow to fit the available screen space. It is also possible to create publications that have pixel-precise fixed layouts using images and/or CSS positioning. The metadata to control layouts are defined in 8. Layout rendering control.

Media overlay documents complement EPUB content documents. They provide declarative markup for synchronizing the text in EPUB content documents with prerecorded audio. The result is the ability to create a read-aloud experience where reading systems highlight the text as it is narrated. Refer to 9. Media overlays for the definition of media overlay documents.

While conceptually simple, an EPUB publication is more than just a collection of HTML pages and dependent assets in a ZIP package as presented here. Additional information about the primary features and functionality that EPUB publications provide to enhance the reading experience is available from the referenced specifications, and a more general introduction to the features of EPUB 3 is provided in the non-normative [epub-overview-33].

Refer to [epub-rs-33] for the processing requirements for reading systems. Although it is not necessary that EPUB creators read that document to create EPUB publications, an understanding of how reading systems present the content can help craft publications for optimal presentation to users.

1.3 Relationship to other specifications

This section is non-normative.

Caution

The technologies EPUB 3 builds on are constantly evolving. Some, typically referred to as "living" or "evergreen" standards, are subject to change daily and their impact on the validity of EPUB publications is immediate. Others are updated less frequently and the changes may not affect EPUB publications until EPUB 3 undergoes a new revision.

In all cases, it is possible that previously valid features may become obsolete (e.g., due to a lack of support or because of security issues). EPUB creators should therefore be cautious about using any feature without broad support and keep their EPUB conformance checkers up to date.

1.3.1 Relationship to HTML

The [html] standard is continuously evolving — there are no longer versioned releases of it. That standard, in turn, references various technologies that continue to evolve, such as MathML, SVG, CSS, and JavaScript.

The benefit of this approach for EPUB is that EPUB publications always keep pace with changes to the web without the need for new revisions. EPUB creators, however, must keep track of the various changes to HTML and the technologies it references to ensure they keep their processes up to date.

The XHTML profile defined by this specification inherits all definitions of semantics, structure and processing behaviors from HTML unless otherwise specified.

In addition, this specification defines a set of extensions to the [html] document model that EPUB creators may include in XHTML content documents.

1.3.2 Relationship to SVG

This specification does not reference a specific version of [svg], but instead uses an undated reference. Whenever there is any ambiguity in this reference, the latest recommended specification is the authoritative reference.

This approach ensures that EPUB will always keep pace with changes to the SVG standard. EPUB creators, however, must keep track of changes to the SVG standard to ensure they keep their processes up to date.

1.3.3 Relationship to CSS

EPUB 3 supports CSS as defined by the CSS Working Group Snapshot [csssnapshot]. EPUB 3 also maintains some prefixed CSS properties, to ensure consistent support for global languages.

1.3.4 Relationship to MathML

EPUB 3 only supports Presentation Markup [mathml3]. Content Markup is only allowed in structured markup annotations.

1.3.5 Relationship to SMIL

This specification relies on a subset of [smil3], from which the media overlays elements and attributes defined in 9.2.2 Media overlay document definition are derived.

1.3.6 Relationship to URL

This specification refers to the [url] standard for terminology and processing related to URLs expressed in EPUB publications. It is anticipated that new and revised web formats will adopt this standard, but until then this may put this specification in conflict with the internal requirements for some formats (e.g., valid relative paths), specifically with respect to the use of internationalized URLs. If a format does not allow internationalized URLs (i.e., URLs must conform to [rfc3986] or earlier), that requirement takes precedence within those resources.

1.4 Terminology

This specification defines the following terms specific to EPUB 3.

Note

Only the first instance of a term in a section links to its definition.

codec

Codec refers to content that has intrinsic binary format qualities, such as video and audio media types designed for optimum compression or that provide optimized streaming capabilities.

container resource

A publication resource that is located within the EPUB container, as opposed to a remote resource which is not.

Refer to 3.6 Resource locations for media type-specific rules for resource locations.

container root URL

The URL [url] of the root directory representing the OCF abstract container. It is implementation specific, but EPUB creators must assume it has properties defined in 4.2.5 URLs in the OCF abstract container.

content URL

The URL of a file or directory in the OCF abstract container, defined in 4.2.5 URLs in the OCF abstract container.

core media type resource

A publication resource that conforms to one of the MIME media types [rfc2046] listed in 3.2 Core media types and, therefore, does not require the provision of a fallback (cf. foreign resource).

The designation "core media type resource" only applies when a resource is used in the rendering of EPUB content documents and foreign content documents. A core media type resource cannot be used in the spine, for example, without a fallback unless it also has the media type of an EPUB content document.

EPUB conformance checker

An application that verifies the requirements of this specification against EPUB publications and reports on their conformance.

EPUB container
OCF ZIP container

The ZIP-based packaging and distribution format for EPUB publications defined in 4.3 OCF ZIP container.

EPUB container and OCF ZIP container are synonymous.

EPUB content document

A publication resource referenced from the spine or a manifest fallback chain that conforms to either the XHTML or SVG content document definitions.

EPUB content documents contain all or part of the content of an EPUB publication (i.e., the textual, visual and/or audio content).

EPUB creators can include EPUB content documents in the spine without the provision of fallbacks.

EPUB creator

An individual, organization, or process that produces an EPUB publication.

Note

The creation of an EPUB publication often involves the work of many individuals, and may be split across multiple organizations (e.g., when a publisher outsources all or part of the work). Depending on the process used to produce an EPUB publication, responsibilities may fall on the organization (e.g., the publisher), the individuals preparing the publication (e.g., technical editors), or automatic procedures (e.g., as part of a publication pipeline). As a result, not every party or process may be responsible for ensuring every requirement is met, but there is always an EPUB creator responsible for the conformance of the final EPUB publication.

Previous versions of this specification referred to the EPUB creator as the Author.

EPUB manifest (or manifest)

The section of the package document that lists the publication resources.

Refer to 5.6.1 The manifest element for more information.

EPUB navigation document

A specialization of the XHTML content document that contains human- and machine-readable global navigation information. The EPUB navigation document conforms to the constraints expressed in 7. EPUB navigation document.

EPUB publication

A logical document entity consisting of a set of interrelated resources packaged in an EPUB container.

An EPUB publication typically represents a single intellectual or artistic work, but this specification does not restrict the nature of the content.

EPUB reading system (or reading system)

A system that processes EPUB publications for presentation to a user in a manner conformant with this specification.

EPUB spine (or spine)

The section of the package document that defines an ordered list of EPUB content documents and foreign content documents. This list represents the default reading order of the EPUB publication.

Refer to 5.7.1 The spine element for more information.

exempt resource

Exempt resources are a special class of publication resources that reading systems are not required to render, but for which EPUB creators do not have to provide fallbacks.

Refer to 3.4 Exempt resources for more information.

file name

The name of any type of file within an OCF abstract container, whether a directory or a file within a directory.

file path

The file path of a file or directory is its full path relative to the root directory, as defined by the algorithm specified in 4.2.4 Deriving file paths.

fixed-layout document

An EPUB content document with fixed dimensions directly referenced from the spine. Fixed-layout documents are designated pre-paginated in the package document, as defined in 8.2 Fixed layouts.

foreign content document

Any publication resource referenced from a spine itemref element, or a manifest fallback chain, that is not an EPUB content document.

When a foreign content document is referenced from a spine itemref element, it requires a manifest fallback chain with at least one EPUB content document.

Note

With the exception of XHTML and SVG, all core media type resources are foreign content documents when referenced directly from the spine.

foreign resource

A publication resource with a MIME media type [rfc2046] that does not match any of those listed in 3.2 Core media types. Foreign resources are subject to the fallback requirements defined in 3.3 Foreign resources.

The designation "foreign resource" only applies to resources used in the rendering of EPUB content documents and foreign content documents.

Note

Foreign resource and foreign content document are not interchangeable terms. More types of resources are considered foreign when used in the spine than when used in EPUB content documents.

linked resource

A resource that is only referenced from a package document link element (i.e., not also used in the rendering of an EPUB publication).

Linked resources are not publication resources but may be stored in the EPUB container. They do not require fallbacks.

media overlay document

An XML document that associates the XHTML content document with pre-recorded audio narration to provide a synchronized playback experience, as defined in 9. Media overlays.

non-codec

Non-codec refers to content types that benefit from compression due to the nature of their internal data structure, such as file formats based on character strings (for example, HTML, CSS, etc.).

OCF abstract container

The OCF abstract container defines a file system model for the contents of the OCF ZIP container, as defined in 4.2 OCF abstract container.

package document

A publication resource that describes the rendering of an EPUB publication, as defined in 5. Package document. The package document carries meta information about the EPUB publication, provides a manifest of resources, and defines a default reading order.

publication resource

A resource that contains content or instructions that contribute to the logic and rendering of an EPUB publication. In the absence of this resource, reading systems may not render the EPUB publication as the EPUB creator intends. Examples of publication resources include the package document, EPUB content documents, CSS Style Sheets, audio, video, images, embedded fonts, and scripts.

EPUB creators must list publication resources in the package document manifest and typically bundle them all in the EPUB container (the exception being they may locate resources listed in 3.6 Resource locations outside the EPUB container).

Note

Resources on the web identified in outbound hyperlinks (e.g., referenced from the href attribute of an [html] a element) are not publication resources.

Data URLs are also not publication resources; they are considered part of the resource in which they are embedded.

remote resource

A publication resource that is located outside of the EPUB container, typically, but not necessarily, on the web.

Publication resources within the EPUB container are referred to as container resources.

Refer to 3.6 Resource locations for media type-specific rules for resource locations.

root directory

The root directory represents the base of the OCF abstract container file system. This directory is virtual in nature.

scripted content document

An EPUB content document that includes scripting or an XHTML content document that contains [html] form elements.

Refer to 6.3.2 Scripting for more information.

SVG content document

An EPUB content document that conforms to the constraints expressed in 6.2 SVG content documents.

synthetic spread

The rendering of two adjacent pages simultaneously on a device screen.

top-level content document

An EPUB content document or foreign content document referenced from the spine, whether directly or via a fallback chain.

unique identifier

The primary identifier for an EPUB publication. The unique identifier is the value of the dc:identifier element specified by the unique-identifier attribute in the package document.

Significant revision, abridgement, etc. of the content requires a new unique identifier.

viewport

The region of an EPUB reading system in which an EPUB publication is rendered visually to a user.

XHTML content document

An EPUB content document that conforms to the profile of [html] defined in 6.1 XHTML content documents.

XHTML content documents use the XML syntax defined in [html].

1.5 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, REQUIRED, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

All algorithm explanations are non-normative.

1.6 Authoring shorthands

This section is non-normative.

In package document metadata examples, reserved prefixes are used without declaration.

References to Dublin Core elements [dcterms] use the dc: prefix. This prefix must be declared in the package document for their use to be valid (xmlns:dc="http://purl.org/dc/elements/1.1/").

The epub namespace prefix [xml-names] is also used on elements and attributes without always having an explicit declaration (xmlns:epub="http://www.idpf.org/2007/ops").

2. EPUB publication conformance

An EPUB publication:

In addition, all publication resources MUST adhere to the requirements in 3. Publication resources.

The rest of this specification covers specific conformance details.

2.1 Conformance checking

This section is non-normative.

Due to the complexity of this specification and number of technologies used in EPUB publications, EPUB creators are advised to use an EPUB conformance checker to verify the conformance of their content.

EPUBCheck is the de facto EPUB conformance checker used by the publishing industry and has been updated with each new version of EPUB. It is integrated into a number of authoring tools and also available in alternative interfaces and other languages (for more information, refer to its Apps and Tools page).

When verifying their EPUB publications, EPUB creators should ensure they do not violate the requirements of this specification (practices identified by the keywords "MUST", "MUST NOT", and "REQUIRED"). These types of issues will often result in EPUB publications not rendering or rendering in inconsistent ways. These issues are typically reported as errors or critical errors.

EPUB creators should also ensure that their EPUB publications do not violate the recommendations of this specification (practices identified by the keywords "SHOULD", "SHOULD NOT", and "RECOMMENDED"). Failure to follow these practices does not result in an invalid EPUB publication but may lead to interoperability problems and other issues that impact the user reading experience. These issues are typically reported as warnings.

Note

Vendors, distributors, and other retailers of EPUB publications should consider the importance of recommended practices before basing their acceptance or rejection on a zero-issue outcome from an EPUB conformance checker. There will be legitimate reasons why EPUB creators cannot follow recommended practices in all cases.

3. Publication resources

3.1 Introduction

This section is non-normative.

An EPUB publication is made up of many different categories of resources, not all of which are mutually exclusive. Some resources are publication resources, some are not. Some publication resources are allowed in the spine by default, while all others require fallbacks. Some resources can be used in rendering EPUB content documents, while others can only be used with fallbacks.

Trying to understand these differences by reading the technical definitions of each category of resource can be complex. To make the categorizations easier to understand, this introduction uses the concept of different planes to explain how resources are grouped and referred to.

The three planes are the manifest plane, the spine plane, and the content plane.

The same resource may exist on more than one plane and will be referred to differently in this specification depending on which plane is being discussed. For example, a core media type resource used in the rendering of an EPUB content document (on the content plane) may also be a foreign content document if it is also listed in the spine (the spine plane).

The following sections describe these planes in more detail.

Note

Refer to H.1 Resources for a detailed example showing how resources fit into the different planes.

3.1.1 The manifest plane

The manifest plane defines all the resources of an EPUB publication. It is analogous to the package document manifest, but includes resources not present in that list.

The primary resources in this group are designated publication resources, which are all the resources used in rendering an EPUB publication to the user. EPUB creators always have to list these resources in the manifest element.

Publication resources are further classified by their use(s) in the spine plane and content plane.

The manifest plane also contains a set of linked resources. These resources are tangential to the direct rendering. They include, for example, metadata records and links to external content (e.g., where to purchase an EPUB publication).

Unlike publication resources, they are not listed in the package document manifest (i.e., because they are not essential to rendering the EPUB publication). They are instead defined in link elements in the package document metadata. These elements define their nature and purpose similar to how manifest item elements define publication resources. (In this way, they are like an extension of the manifest.)

Refer to 5.5.7 The link element for more information about linked resources.

Resources in the manifest plane are also sometimes broken down by where they are located. Although most publication resources have to be located in the EPUB container (called container resources), EPUB 3 allows audio, video, font and script data resources to be hosted outside the container. These exceptions were made to speed up the download and loading of EPUB publications, as these resources are typically quite large, and, in the case of fonts, not essential to the presentation. When remotely hosted, these publication resources are referred to as remote resources.

Since linked resources are not essential to the rendering of an EPUB publication, there are no requirements on where they are located and consequently no special naming of them based on their location. They may be located within the EPUB container or outside it.
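A simplified way to tell container resources from remote resources is to look at the reference itself. The Python sketch below assumes that remote resources are referenced with absolute http or https URLs and that container resources are referenced with relative URLs; the normative rules are in 3.6 Resource locations, and the function name and example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def resource_location(href: str) -> str:
    """Classify a manifest href by where the resource appears to be located.

    Assumption for this sketch: relative references point into the EPUB
    container, and http(s) references point to remotely hosted resources.
    """
    scheme = urlsplit(href).scheme.lower()
    if scheme == "":
        return "container resource"
    if scheme in {"http", "https"}:
        return "remote resource"
    # Anything else (for example, data: URLs) is outside this simple model.
    return "other"
```

For example, resource_location("audio/chapter1.mp3") yields "container resource", while resource_location("https://example.org/audio/chapter1.mp3") yields "remote resource".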

Note

Hyperlinked content outside the EPUB container (e.g., web pages) is not a publication resource and consequently is not listed in the manifest. Reading systems will normally open these links in a separate browser instance, not as part of the EPUB publication.

3.1.2 The spine plane

The spine plane defines resources used in the default reading order established by the spine, which includes both linear and non-linear content. The spine instructs reading systems on how to load these resources as the user progresses through the EPUB publication. Although many resources may be bundled in an EPUB container, they are not all allowed by default in the spine.

EPUB 3 defines a special class of resources called EPUB content documents that EPUB creators can use in the spine without any restrictions. EPUB content documents encompass both XHTML content documents and SVG content documents.

To use any other type of resource in the spine, called a foreign content document, requires including a fallback to an EPUB content document. This extensibility model allows EPUB creators to experiment with formats while ensuring that reading systems are always able to render something for the user to read, as there is no guarantee of support for foreign content documents.

A mechanism called manifest fallbacks allows EPUB creators to provide fallbacks for foreign content documents. In this model, the manifest entry for the foreign content document must include a fallback attribute that points to the next possible resource for reading systems to try when they do not support its format. Although not common, a fallback resource can specify another fallback, thereby making chains many resources deep. The one requirement is that there must be at least one EPUB content document in a manifest fallback chain.

Although they are not directly listed in the spine, all of the resources in the fallback chain are considered part of the spine, and by extension part of the spine plane, since any may be used by a reading system.

Refer to 3.5.1 Manifest fallbacks for more information.
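The fallback chain model described above can be sketched in Python. The manifest here is a plain dictionary standing in for parsed item elements, with invented ids and media types. Treating a media type match as proof of an EPUB content document is a simplification, since such documents also have to conform to the XHTML or SVG content document definitions.

```python
# Media types of EPUB content documents (XHTML and SVG).
CONTENT_DOCUMENT_TYPES = {"application/xhtml+xml", "image/svg+xml"}

# Hypothetical manifest: item id -> attributes of the manifest item.
MANIFEST = {
    "page-pdf": {"media-type": "application/pdf", "fallback": "page-img"},
    "page-img": {"media-type": "image/jpeg", "fallback": "page-xhtml"},
    "page-xhtml": {"media-type": "application/xhtml+xml"},
    "orphan-pdf": {"media-type": "application/pdf"},
}


def fallback_chain(manifest: dict, item_id: str) -> list[str]:
    """Follow fallback attributes from item_id and return the ids visited."""
    chain: list[str] = []
    current = item_id
    while current is not None:
        if current in chain:
            raise ValueError(f"circular fallback chain at {current!r}")
        chain.append(current)
        current = manifest[current].get("fallback")
    return chain


def chain_has_content_document(manifest: dict, item_id: str) -> bool:
    """Check that at least one item in the chain is an EPUB content document."""
    return any(manifest[i]["media-type"] in CONTENT_DOCUMENT_TYPES
               for i in fallback_chain(manifest, item_id))
```

With this sample data, the chain for "page-pdf" is three resources deep and ends in an XHTML document, so it satisfies the requirement, while "orphan-pdf" has no fallback and does not.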

Caution

Although manifest fallbacks fulfill the technical requirements of EPUB, there is little practical support for them in reading systems. Their use is strongly discouraged as it can lead to unreadable publications.

Note

It is possible to provide manifest fallbacks for EPUB content documents, but this is not required or common. For example, a scripted content document could have a fallback to an unscripted alternative for reading systems that do not support scripting.

3.1.3 The content plane

The content plane classifies resources that are used when rendering EPUB content documents and foreign content documents. These types of resources include embedded media, CSS style sheets, scripts, and fonts. These resources fall into three categories based on their reading system support: core media type resources, foreign resources, and exempt resources.

A core media type resource is one that reading systems have to support, so it can be used without restriction in EPUB or foreign content documents. For more information about core media type resources, refer to 3.2 Core media types.

Note

Being a core media type resource does not mean that reading systems will always render the resource, as not all reading systems support all features of EPUB 3. A reading system without a viewport, for example, will not render visual content such as images.

The opposite of core media type resources are foreign resources. These are resources that reading systems are not guaranteed to render. As a result, similar to how using foreign content documents in the spine requires fallbacks to ensure their rendering, using foreign resources in content documents also requires fallbacks. These fallbacks are provided in one of two ways: using the capabilities of the host format or via manifest fallbacks.

The preferred method is to use the fallback capabilities of the host format. Many HTML elements, for example, have intrinsic fallback capabilities. One example is the picture element [html], which allows EPUB creators to specify multiple alternative image formats.

If an intrinsic fallback method is not available, it is also possible to use manifest fallbacks, but this method, as cautioned against in the previous section, is discouraged. For more information about foreign resources, refer to 3.3 Foreign resources.

Falling between core media type resources and foreign resources are exempt resources. These are most closely associated with foreign resources, as there is no guarantee that reading systems will render them. But like core media types, they do not require fallbacks.

Exempt resources tend to address specific cases for which there are no core media types defined, but for which providing a fallback would prove cumbersome or unnecessary. These include embedding video, adding accessibility tracks, and linking to resources from the [html] link element.

Refer to 3.4 Exempt resources for more information about these exceptions.

Note

A common point of confusion arising from core media type resources is the listing of XHTML and SVG as core media type resources with the requirement that the markup conform to their respective EPUB content document definitions. This allows EPUB creators to embed both XHTML and SVG documents in EPUB content documents while keeping consistent requirements for authoring and reading system support.

In practice, it means that EPUB creators can put XHTML and SVG core media type resources in the spine without any modification or fallback (they are also conforming XHTML and SVG content documents), but this is a unique case. All other core media type resources become foreign content documents when used in the spine (i.e., foreign content documents include all foreign resources and all core media type resources except for XHTML and SVG).

3.2 Core media types

EPUB creators MAY include publication resources that conform to the MIME media type [rfc2046] specifications defined in the following table without fallbacks when they are used in EPUB content documents and foreign content documents. These resources are classified as core media type resources.

With the exception of XHTML content documents and SVG content documents, EPUB creators MUST provide manifest fallbacks for core media type resources referenced directly from the spine. In this case, they are foreign content documents.

The columns in the table represent the following information:

Media Type                        Content Type Definition       Applies to

Images
image/gif                         [gif]                         GIF Images
image/jpeg                        [jpeg]                        JPEG Images
image/png                         [png]                         PNG Images
image/svg+xml                     SVG content documents         SVG documents
image/webp                        [webp-container], [webp-lb]   WebP Images

Audio
audio/mpeg                        [mp3]                         MP3 audio
audio/mp4                         [mpeg4-audio], [mp4]          AAC LC audio using MP4 container
audio/opus                        [rfc7845]                     OPUS audio using OGG container

Style
text/css                          CSS Style Sheets              CSS Style Sheets.

Fonts
  1. font/ttf                     [truetype]                    TrueType fonts
  2. application/font-sfnt
  1. font/otf                     [opentype]                    OpenType fonts
  2. application/font-sfnt
  3. application/vnd.ms-opentype
  1. font/woff                    [woff]                        WOFF fonts
  2. application/font-woff
font/woff2                        [woff2]                       WOFF2 fonts

Other
application/xhtml+xml             XHTML content documents       HTML documents that use the XML syntax [html].
  1. application/javascript       [rfc4329]                     Scripts.
  2. application/ecmascript
  3. text/javascript
application/x-dtbncx+xml          [opf-201]                     The legacy NCX.
application/smil+xml              Media overlays                EPUB media overlay documents
Note

Inclusion as a core media type resource does not mean that all reading systems will support the rendering of a resource. Reading system support also depends on the capabilities of the application (e.g., a reading system with a viewport must support image core media type resources, but a reading system without a viewport does not). Refer to Core media types [epub-rs-33] for more information about which rendering capabilities of reading systems require support for which core media type resources.

The Working Group typically only includes formats as core media type resources when they have broad support in web browser cores — the rendering engines that EPUB 3 reading systems build upon. They are an agreement between reading system developers and EPUB creators to ensure the predictability of rendering of EPUB publications.
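As a rough illustration of how a tool might apply the table above, the following Python sketch checks a media type string against the listed core media types and reports whether a spine item of that type needs a manifest fallback. The function and constant names are hypothetical, and the check looks only at the media type string (it does not verify that XHTML or SVG markup conforms to the EPUB content document definitions, and it assumes the string carries no parameters).

```python
# Core media types transcribed from the table in section 3.2.
CORE_MEDIA_TYPES = frozenset({
    "image/gif", "image/jpeg", "image/png", "image/svg+xml", "image/webp",
    "audio/mpeg", "audio/mp4", "audio/opus",
    "text/css",
    "font/ttf", "application/font-sfnt", "font/otf",
    "application/vnd.ms-opentype", "font/woff", "application/font-woff",
    "font/woff2",
    "application/xhtml+xml",
    "application/javascript", "application/ecmascript", "text/javascript",
    "application/x-dtbncx+xml", "application/smil+xml",
})

# Only these two core media types are also EPUB content documents.
EPUB_CONTENT_DOCUMENT_TYPES = frozenset({"application/xhtml+xml", "image/svg+xml"})


def is_core_media_type(media_type: str) -> bool:
    """Return True if media_type appears in the core media type table."""
    return media_type.strip().lower() in CORE_MEDIA_TYPES


def needs_spine_fallback(media_type: str) -> bool:
    """Return True if a spine item of this type needs a manifest fallback."""
    return media_type.strip().lower() not in EPUB_CONTENT_DOCUMENT_TYPES
```

For example, image/png is a core media type, yet needs_spine_fallback("image/png") is True because a PNG referenced directly from the spine is a foreign content document.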

3.3 Foreign resources

A foreign resource, unlike a core media type resource, is one that is not guaranteed reading system support when used in an EPUB content document or foreign content document.

EPUB creators MUST provide fallbacks for foreign resources, where fallbacks take one of the following forms:

Note

Refer to the [html] and [svg] specifications for the intrinsic fallback capabilities their elements provide.

3.5.2 Intrinsic fallbacks also provides additional information about how fallbacks are interpreted for specific elements.

3.4 Exempt resources

An exempt resource shares properties with both foreign resources and core media type resources. It is most similar to a foreign resource in that it is not guaranteed reading system support, but, like a core media type resource, does not require a fallback.

There is only a small set of special cases for exempt resources. Video, for example, is exempt from fallbacks because there is no consensus on a core media type video format at this time (i.e., there is no format to fall back to). Similarly, audio and video tracks are exempt to allow EPUB creators to meet accessibility requirements using whatever format reading systems support best.

The following list details cases of content-specific exempt resources, including any restrictions on where EPUB creators can use them.

Fonts

All font resources not already covered as font core media types are exempt resources.

This exemption allows EPUB creators to use any font format without a fallback, regardless of reading system support expectations, as CSS rules will ensure a fallback font in case of no support.

Refer to the reading system support requirements for fonts [epub-rs-33] for more information.

Tracks

All audio and video tracks (e.g., [webvtt] captions, subtitles and descriptions) referenced from the [html] track element are exempt resources.

Video

All video codecs referenced from the [html] video element, including any child source elements, are exempt resources.

Note

Although reading systems are encouraged to support at least one of the H.264 [h264] and VP8 [rfc6386] video codecs, support for video codecs is not a conformance requirement. EPUB creators must consider factors such as breadth of adoption, playback quality, and technology royalties when deciding which video formats to include.

Note

The exemptions made above do not apply to the spine. If an exempt resource is used in the spine, and it is not also an EPUB content document, it will require a fallback in that context.

In addition to the content-specific exemptions, a resource is classified as an exempt resource if:

This exemption allows EPUB creators to include resources in the EPUB container that are not for use by EPUB reading systems. The primary case for this exemption is to allow data files to travel with an EPUB publication, whether for scripts to use in their constituent EPUB content documents or for external applications to use (e.g., a scientific journal might include a data set with instructions on how to extract it from the EPUB container).

It also allows EPUB creators to use foreign resources in foreign content documents without reading systems or EPUB conformance checkers having to understand the fallback capabilities of those resources (i.e., the requirement for a fallback for the foreign content document covers any rendering issues within it). As the resource is not referenced from an EPUB content document, it automatically becomes exempt from fallbacks.

3.5 Resource fallbacks

3.5.1 Manifest fallbacks

Manifest fallbacks are a feature of the package document that create a manifest fallback chain for a publication resource, allowing reading systems to select an alternative format they can render.

Fallback chains are created using the fallback attribute on manifest item elements. This attribute references the ID [xml] of another manifest item that is a fallback for the current item. The ordered list of all the references that a reading system can reach, starting from a given item's fallback attribute, represents the full fallback chain for that item. This chain also represents the EPUB creator's preferred fallback order.

There are two cases for manifest fallbacks:

Spine fallbacks

EPUB creators MUST specify a fallback chain for a foreign content document to ensure that reading systems can always render the spine item. In this case, the chain MUST contain at least one EPUB content document.

EPUB creators MAY provide fallbacks for EPUB content documents (e.g., to provide a fallback for scripted content).

When a fallback chain includes more than one EPUB content document, EPUB creators can use the properties attribute to differentiate the purpose of each.

Content fallbacks
Note

The original purpose for content fallbacks was to specify fallback images for the [html] img element. As HTML now has an intrinsic fallback mechanism for images, the use of content fallbacks is strongly discouraged. EPUB creators should always use the intrinsic fallback capabilities of [html] and [svg] to provide fallback content.

EPUB creators MUST provide a content fallback for foreign resources when the elements that reference them do not have intrinsic fallback capabilities. In this case, the fallback chain MUST contain at least one core media type resource.

EPUB creators MAY also provide manifest fallbacks for core media type resources (e.g., to allow reading systems to select from more than one image format).

Regardless of the type of manifest fallback specified, fallback chains MUST NOT contain self-references or circular references to item elements in the chain.
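The chain-building rule and the prohibition on self-references and circular references can be sketched in Python as follows. This is only an illustration: the function name and the dictionary representation of the manifest (item ID mapped to the value of its fallback attribute) are assumptions, not something this specification defines.

```python
from typing import Dict, List, Optional


def fallback_chain(item_id: str, fallbacks: Dict[str, Optional[str]]) -> List[str]:
    """Return the ordered fallback chain for item_id.

    `fallbacks` maps each manifest item ID to the ID named in its
    fallback attribute, or None when the item has no fallback.
    Raises ValueError on a self-reference, a circular reference,
    or a reference to an unknown item.
    """
    chain: List[str] = []
    seen = {item_id}
    current = fallbacks.get(item_id)
    while current is not None:
        if current in seen:
            raise ValueError(f"circular fallback reference at {current!r}")
        if current not in fallbacks:
            raise ValueError(f"unknown manifest item {current!r}")
        chain.append(current)
        seen.add(current)
        current = fallbacks[current]
    return chain
```

A checker could then confirm that the returned chain contains at least one EPUB content document (for spine fallbacks) or at least one core media type resource (for content fallbacks).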

Note

As it is not possible to use manifest fallbacks for resources represented in data URLs, EPUB creators can only represent foreign resources as data URLs where an intrinsic fallback mechanism is available.

3.5.2 Intrinsic fallbacks

The following sections provide additional clarifications about the intrinsic fallback requirements of specific elements.

3.5.2.1 HTML audio and video fallbacks

EPUB creators MUST NOT use embedded [html] flow content within a media element (i.e., audio or video) as an intrinsic fallback for audio foreign resources. Only child source elements [html] provide intrinsic fallback capabilities.

Only older reading systems that do not recognize the audio or the video elements (e.g., EPUB 2 reading systems) will render the embedded content. When reading systems support these elements but not the available media formats, they do not render the embedded content for the user.

Note

The requirement for fallbacks only applies to audio foreign resources referenced from audio and video elements. Fallbacks are not required for video resources; they are exempt resources.

3.5.2.2 HTML img fallbacks

Due to the variety of sources that EPUB creators can specify in the [html] img element, the following fallback conditions apply to its use:

  • If it is the child of a picture element:

    • it MUST reference core media type resources from its src and srcset attributes, when EPUB creators specify those attributes; and
    • each sibling source element MUST reference a core media type resource from its src and srcset attributes unless it specifies the MIME media type [rfc2046] of a foreign resource in its type attribute.
  • Otherwise, it MAY reference foreign resources in its src and srcset attributes provided EPUB creators define a manifest fallback.
3.5.2.3 HTML script element

Although data blocks have a separate MIME media type [rfc2046] from their containing XHTML content document, it is not possible to provide intrinsic fallbacks as no such mechanisms are specified for the [html] script element. It is also not possible to provide manifest fallbacks because data blocks cannot be defined as standalone files in the EPUB container but are always embedded as inline script elements.

But, as the script element does not represent user content — data blocks are not rendered unless manipulated by script, and content rendered by scripts already has core media type requirements — requiring fallbacks for the raw data does not serve a useful purpose.

Consequently, to ensure that EPUB creators can include data blocks for scripting purposes, they are exempt from fallback requirements.

Note

This exemption aligns data blocks with the exemption for data files.

Note

[svg] does not define data blocks as of publication, but the same exclusion would apply if a future update adds the concept.

3.6 Resource locations

EPUB creators MAY host the following types of publication resources outside the EPUB container:

EPUB creators MUST store all other resources within the EPUB container.

Storing all resources inside the EPUB container is strongly encouraged whenever possible as it allows users access to the entire presentation regardless of connectivity status.

When resources have to be located outside the EPUB container, EPUB creators are RECOMMENDED to reference them via the secure https URI scheme [rfc7230] to limit the threat of exposing their publications, and users, to network attacks. Reading systems might not load remote resources referenced using insecure schemes such as http.

These rules for locating publication resources apply regardless of whether the given resource is a core media type resource or a foreign resource.

Note

Refer to the remote-resources property for more information on how to indicate that a manifest item references a remote resource.

3.7 Data URLs

The data: URL scheme [rfc2397] is used to encode resources directly into a URL string. The advantage of this scheme is that it allows EPUB creators to embed a resource within another, avoiding the need for an external file.

EPUB creators MUST NOT use data URLs in the following scenarios where they can result in a top-level content document or top-level browsing context [html]:

Note

These restrictions on the use of data URLs are to prevent security issues and also to ensure that reading systems can determine where to take a user next (i.e., because data URLs cannot be referenced from the spine).

The list of prohibited uses for data URLs is subject to change as the respective standards that allow their use evolve.

A consequence of embedding is that the data in a data URL is not considered its own unique publication resource for manifest reporting purposes (i.e., only its containing publication resource gets listed). As this data has its own media type, however, it is still subject to foreign resource restrictions. EPUB creators MUST therefore encode data URLs as core media type resources or provide a fallback using the intrinsic fallback mechanisms of the host format.
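Because the embedded data keeps its own media type, a checker needs to read that type out of the URL before it can apply the foreign resource rules. The following Python sketch does this for the general data URL shape described in [rfc2397]; the function name is hypothetical, and the sketch does not decode or validate the payload.

```python
def data_url_media_type(url: str) -> str:
    """Return the lowercase media type of a data: URL, without parameters."""
    if not url.lower().startswith("data:"):
        raise ValueError("not a data URL")
    # Everything between "data:" and the first comma is the header.
    header, sep, _payload = url[5:].partition(",")
    if not sep:
        raise ValueError("data URL is missing the ',' separator")
    media_type = header.split(";", 1)[0].strip().lower()
    # RFC 2397: an omitted media type defaults to text/plain.
    return media_type or "text/plain"
```

The returned value can then be compared against the core media types to decide whether an intrinsic fallback is needed.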

3.8 File URLs

The file: URL scheme is defined in [rfc8089] as "identifying an object (a 'file') stored in a structured object naming and accessing environment on a host (a 'file system')." It is typically used to retrieve files from the local operating system.

Using a file URL in an EPUB publication, which can be transferred among different hosts, represents a security risk and is also non-interoperable. As a consequence, EPUB creators MUST NOT use file URLs in EPUB publications.
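A conformance check for this rule can be very small. The sketch below, with a hypothetical function name, uses Python's standard urllib.parse module to detect the file scheme; it assumes the caller has already extracted the URL string from the publication resource.

```python
from urllib.parse import urlsplit


def uses_file_scheme(url: str) -> bool:
    """Return True if url is an absolute URL with the file: scheme."""
    return urlsplit(url.strip()).scheme.lower() == "file"
```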

3.9 XML conformance

Any publication resource that is an XML-based media type [rfc2046]:

The above constraints apply regardless of whether the given publication resource is a core media type resource or a foreign resource.

Note

[html] and [svg] are removing support for the XML base attribute [xmlbase]. EPUB creators should avoid using this feature.

4. Open Container Format (OCF)

4.1 Introduction

This section is non-normative.

OCF is the required container technology for EPUB publications. OCF may play a role in the following workflows:

This section defines the rules for structuring the file collection in the abstract: the "abstract container". It also defines the rules for the representation of this abstract container within a ZIP archive: the "physical container". The rules for ZIP physical containers build upon the ZIP technologies used by [odf].

OCF also defines a standard method for obfuscating embedded fonts for those EPUB publications that require this functionality.

4.2 OCF abstract container

4.2.1 Introduction

This section is non-normative.

The OCF abstract container file system model uses a single common root directory. All container resources are located within the directory tree headed by the root directory, but no specific file system structure for them is mandated by this specification.

The file system model also includes a mandatory directory named META-INF that is a direct child of the root directory and stores the following special files:

container.xml [required]

Identifies one or more package documents that define the EPUB publication.

signatures.xml [optional]

Contains digital signatures for various assets.

encryption.xml [optional]

Contains information about the encryption of publication resources. This file is mandatory when EPUB creators use font obfuscation.

metadata.xml [optional]

Used to store metadata about the OCF ZIP container.

rights.xml [optional]

Used to store information about digital rights.

manifest.xml [optional]

A manifest of container contents as allowed by Open Document Format [odf].

Refer to 4.2.6 META-INF directory for conformance requirements for the various files in the META-INF directory.

4.2.2 File and directory structure

The virtual file system for the OCF abstract container MUST have a single common root directory for all the contents of the container.

The OCF abstract container MUST include a directory for configuration files named META-INF that is a direct child of the container's root directory. Refer to 4.2.6 META-INF directory for the requirements for the contents of this directory.

The file name mimetype in the root directory is reserved for use by OCF ZIP containers, as explained in 4.3 OCF ZIP container.

EPUB creators MAY locate all other files within the OCF abstract container in any location descendant from the root directory, provided they are not within the META-INF directory. EPUB creators MUST NOT reference files in the META-INF directory from an EPUB publication.

Note

Some reading systems do not provide access to resources outside the directory where the package document is stored. EPUB creators should therefore place all resources at or below the directory containing the package document to avoid interoperability issues.

This problem is more commonly encountered when creating multiple renditions [epub-multi-rend-11] of the publication.

4.2.3 File paths and file names

In the context of the OCF abstract container, file paths and file names are scalar value strings [infra] (i.e., their values are case sensitive).

In addition, the following restrictions are designed to allow file paths and file names to be used without modification on most operating systems:

  • File names MUST NOT exceed 255 bytes.

  • The file paths for any directory or file within the OCF abstract container MUST NOT exceed 65535 bytes.

  • File names MUST NOT use the following [unicode] characters, as commonly used operating systems may not support these characters consistently:

    • SOLIDUS: / (U+002F)

    • QUOTATION MARK: " (U+0022)

    • ASTERISK: * (U+002A)

    • FULL STOP as the last character: . (U+002E)

    • COLON: : (U+003A)

    • LESS-THAN SIGN: < (U+003C)

    • GREATER-THAN SIGN: > (U+003E)

    • QUESTION MARK: ? (U+003F)

    • REVERSE SOLIDUS: \ (U+005C)

    • VERTICAL LINE: | (U+007C)

    • DEL (U+007F)

    • C0 range (U+0000 … U+001F)

    • C1 range (U+0080 … U+009F)

    • Private Use Area (U+E000 … U+F8FF)

    • All Unicode Non Characters, specifically:

      • The 32 contiguous characters in the Basic Multilingual Plane (U+FDD0 … U+FDEF)

      • The last two code points of the Basic Multilingual Plane (U+FFFE and U+FFFF)

      • The last two code points at the end of the Supplementary Planes (U+1FFFE, U+1FFFF … U+EFFFE, U+EFFFF)

    • Specials (U+FFF0 … U+FFFF)

    • Supplementary Private Use Area-A (U+F0000 … U+FFFFF)

    • Supplementary Private Use Area-B (U+100000 … U+10FFFF)

    Note

    The Unicode Character Database [uax44] also includes a list of deprecated characters. EPUB creators are advised to avoid these characters, as well, as it is expected that EPUB conformance checkers will flag their use.

  • For compatibility with older reading systems, file names SHOULD NOT contain SPACE (U+0020) characters.

  • All file names within the same directory MUST be unique following Unicode canonical normalization [uax15] and then full case folding [unicode]. (Refer to Unicode Canonical Case Fold Normalization Step [charmod-norm] for more information.)
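The restrictions in the list above lend themselves to a mechanical check. The following Python sketch is one possible reading of them; the function names are hypothetical, byte length is measured in UTF-8 (an assumption, as the list only says "bytes"), and NFD is assumed as the canonical normalization form for the uniqueness test. It does not cover the 65535-byte path limit or the SHOULD NOT on SPACE characters.

```python
import unicodedata
from typing import Iterable, List

# Individually listed characters: / " * : < > ? \ | and DEL.
FORBIDDEN_CHARS = set('/"*:<>?\\|\x7f')


def _is_forbidden(ch: str) -> bool:
    cp = ord(ch)
    return (
        ch in FORBIDDEN_CHARS
        or cp <= 0x1F                    # C0 range
        or 0x80 <= cp <= 0x9F            # C1 range
        or 0xE000 <= cp <= 0xF8FF        # Private Use Area
        or 0xFDD0 <= cp <= 0xFDEF        # noncharacters in the BMP
        or 0xFFF0 <= cp <= 0xFFFF        # Specials
        or (cp & 0xFFFE) == 0xFFFE       # last two code points of each plane
        or 0xF0000 <= cp <= 0x10FFFF     # Supplementary Private Use Areas A and B
    )


def file_name_problems(name: str) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if len(name.encode("utf-8")) > 255:
        problems.append("longer than 255 bytes")
    if name.endswith("."):
        problems.append("ends with FULL STOP")
    bad = sorted({f"U+{ord(c):04X}" for c in name if _is_forbidden(c)})
    if bad:
        problems.append("forbidden characters: " + ", ".join(bad))
    return problems


def has_unique_names(names: Iterable[str]) -> bool:
    """Check uniqueness in one directory after normalization and case folding."""
    keys = [
        unicodedata.normalize("NFD", unicodedata.normalize("NFD", n).casefold())
        for n in names
    ]
    return len(keys) == len(set(keys))
```

Under these assumptions, Cover.png and cover.PNG in the same directory are reported as a collision, because both fold to the same string.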

Note

If EPUB creators dynamically integrate resources (i.e., where the naming is beyond their control), they should be aware that automatic truncation of file names to keep them within the 255 bytes limit can lead to corruption. This is due to the difference between bytes and characters in multibyte encodings such as UTF-8; it is, therefore, important to avoid mid-character truncation. See the section on "Truncating or limiting the length of strings" in [international-specs] for more information.
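One way to avoid mid-character truncation is to cut the encoded bytes and then discard any incomplete trailing sequence. The Python sketch below illustrates this under the assumption that names are encoded as UTF-8; the function name is hypothetical. It protects code points only, so it can still separate a base character from a following combining mark, and it makes no attempt to preserve a file extension.

```python
def truncate_utf8(name: str, max_bytes: int = 255) -> str:
    """Truncate name to at most max_bytes of UTF-8 without splitting a character."""
    encoded = name.encode("utf-8")
    if len(encoded) <= max_bytes:
        return name
    # The input was valid UTF-8, so the only invalid bytes after slicing are
    # an incomplete sequence at the very end; errors="ignore" drops them.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```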

Note

EPUB creators should use an abundance of caution in their file naming when interoperability of content is key. The list of restricted characters is intended to help avoid some known problem areas, but it does not ensure that all other Unicode characters are supported. Although Unicode support is much better now than in earlier iterations of EPUB, older tools and toolchains may still be encountered (e.g., ZIP tools that only support [us-ascii]).

4.2.4 Deriving file paths

To derive the file path, given a file or directory file in the OCF abstract container, apply the following steps (expressed using the terminology of [infra]):

  1. Let path be an empty list.
  2. Let current be file.
  3. While current is not the root directory:
    1. prepend the file name of current to path;
    2. set current to the parent directory of current.
  4. Return the concatenation of path using the U+002F (/) character.
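The steps above can be sketched in Python as follows. The Entry class is a hypothetical stand-in for a file or directory in the abstract container, with a parent of None marking the root directory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical node in the abstract container tree."""
    name: str
    parent: Optional["Entry"] = None  # None marks the root directory


def derive_file_path(file: Entry) -> str:
    path = []                            # step 1
    current = file                       # step 2
    while current.parent is not None:    # step 3: stop at the root directory
        path.insert(0, current.name)     # step 3.1
        current = current.parent         # step 3.2
    return "/".join(path)                # step 4
```

For a file package.opf inside a directory EPUB that is a direct child of the root, the function returns EPUB/package.opf.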

4.2.5 URLs in the OCF abstract container

The container root URL is the URL [url] of the root directory. It is implementation-specific, but EPUB creators MUST assume it has the following properties:

The content URL of a file or directory in the OCF abstract container is the result of parsing the file's file path with the container root URL as base.

Note

The container root URL is the URL assigned by the reading system to the root of the EPUB container. It typically depends on how the reading system internally implements the container file system.

However, a reading system cannot use an arbitrary URL; it has to use one that honors the constraints defined above. These constraints ensure that any relative URL string found in the EPUB will always be parsed to a URL of a resource within the container (which may or may not exist). The primary reason for these constraints is to avoid potential run-time security issues that would be caused by parsed URLs "leaking" outside the container files.

For example, URLs like https://localhost:12345/ or https://www.example.org:12345/ honor these properties. But URLs like https://localhost:12345/path/to.epub/, file:///path/to.epub#path=/, or jar:file:/path/to.epub!/EPUB/ do not (parsing the URL string ".." with these three examples as base would return https://localhost:12345/path/, file:///path/, and a parsing error, respectively). It is the responsibility of the reading system to assign a URL to the root directory that complies with the properties defined above.

Note

Parsing may replace some characters in the file path by their percent encoded alternative. For example, A/B/C/file name.xhtml becomes A/B/C/file%20name.xhtml.

A string url is a valid-relative-ocf-URL-with-fragment string if it is a path-relative-scheme-less-url string, optionally followed by U+0023 (#) and a url-fragment string, and if the following steps return true:

  1. Set the container root URL to https://a.example.org/A/.

    Explanation

    The goal of the algorithm is to detect whether url could be seen as "leaking" outside the container. To do that, the standard URL parsing algorithm is used with an artificial root URL; the detection of the "leak" is done by comparing the result of the parsing with the presence of the first test path segment (A). (Note that the artificial container root URL wilfully violates, for the purpose of this algorithm, the required properties by using that first test path segment.)

  2. Let base be the base URL that must be used to parse url as defined by the context (document or environment) where url is used, and according to the content URL of the package document (see 5.2 Parsing URLs in the package document).

    Explanation

    In the case of a URL in the package document the base variable is set to the content URL of the package document. In the case of a document within the META-INF directory, the base variable is set to the container root URL (see 4.2.6.2 Parsing URLs in the META-INF directory). In the case of a URL in an XHTML content document, the base URL used for parsing is defined by the HTML standard. Typically, it will be the content URL of the content document (unless the discouraged base element is used).

  3. Let testURLRecord be the result of applying the URL parser to url, with base.
  4. Let testURLStringA be the result of applying the URL Serializer to testURLRecord.
  5. Set the container root URL to https://b.example.org/B/.

    Explanation

    The reason for repeating the same steps twice with different, artificial settings of the container root URL is to avoid a collision that may occur if the url string also includes /A/. Consider, for example, the case where url is ../../A/doc.xhtml.

  6. Set base to be the base URL that must be used to parse url as defined by the context (document or environment) where url is used, and according to the content URL of the package document (see 5.2 Parsing URLs in the package document).
  7. Set testURLRecord to be the result of applying the URL parser to url, with base.
  8. Let testURLStringB be the result of applying the URL Serializer to testURLRecord.
  9. If testURLStringA does not start with https://a.example.org/ or testURLStringB does not start with https://b.example.org/, return true.

    Explanation

    If either result does not share the test URL host, it means that url, or its base URL (for example, in HTML, if it is explicitly set with the base element), was absolute and points outside the container. This is acceptable.

  10. If testURLStringA starts with https://a.example.org/A/ and testURLStringB starts with https://b.example.org/B/, return true.

    Explanation

    The presence of the test path segments (A and B, respectively) indicates that the URL doesn't leak outside the container.

  11. Return false.
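The eleven steps above can be approximated in Python with the standard urllib.parse.urljoin function. This is a sketch under several assumptions: urljoin follows RFC 3986 rather than the URL parser of [url], so edge cases may differ; the base URL is taken to be the content URL of the containing document, identified here by a hypothetical base_path argument holding its file path; and the precondition that url is a path-relative-scheme-less-url string is not checked.

```python
from urllib.parse import urljoin


def is_valid_relative_ocf_url(url: str, base_path: str) -> bool:
    """Run the leak-detection steps for url, used in the document at base_path."""
    results = []
    # Steps 1 and 5: two artificial container root URLs.
    for root in ("https://a.example.org/A/", "https://b.example.org/B/"):
        base = urljoin(root, base_path)       # steps 2 and 6
        results.append(urljoin(base, url))    # steps 3-4 and 7-8
    test_a, test_b = results

    # Step 9: an absolute URL pointing outside the container is acceptable.
    if (not test_a.startswith("https://a.example.org/")
            or not test_b.startswith("https://b.example.org/")):
        return True
    # Step 10: both results stay below their test path segment.
    if (test_a.startswith("https://a.example.org/A/")
            and test_b.startswith("https://b.example.org/B/")):
        return True
    return False                              # step 11
```

With base_path set to EPUB/package.opf, the string ../../A/doc.xhtml passes the first run (it resolves to https://a.example.org/A/doc.xhtml) but fails the second (https://b.example.org/A/doc.xhtml), which is the collision the repeated run is designed to catch.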

In the OCF abstract container, any URL string MUST be an absolute-url-with-fragment string or a valid-relative-ocf-URL-with-fragment string.

In addition, all relative-URL-with-fragment strings [url] MUST, after parsing, be equal to the content URL of an existing file in the OCF abstract container.

Note

These constraints on URL strings mean that:

  • relative URL strings starting with a / (U+002F) (for example, /EPUB/content.xhtml) are disallowed;
  • relative URL strings containing more double-dot path segments than needed to reach the target file (for example, EPUB/../../../../config.xml) are disallowed;
  • any other absolute or relative URL string is allowed.

Note that in any case, even the disallowed URL strings described above will not "leak" outside the container after parsing (as explained in the first note of this section). They are nevertheless disallowed for better interoperability with non-conforming or legacy reading systems and toolchains.

4.2.6 META-INF directory

4.2.6.1 Inclusion in OCF abstract container

All OCF abstract containers MUST include a directory called META-INF in their root directory.

This directory is reserved for configuration files, specifically those defined in 4.2.6.3 Reserved files.

4.2.6.2 Parsing URLs in the META-INF directory

To parse a URL string url used in files located in the META-INF directory the URL parser MUST be applied to url, with the container root URL as base.

4.2.6.3 Reserved files
4.2.6.3.1 Container file (container.xml)

The REQUIRED container.xml file in the META-INF directory identifies the package documents available in the OCF abstract container.

All [xml] elements defined in this section are in the urn:oasis:names:tc:opendocument:xmlns:container namespace [xml-names] unless specified otherwise.

The contents of this file MUST be valid to the definition in this section after removing all elements and attributes from other namespaces (including all attributes and contents of such elements).

Note

An XML Schema also informally defines the content of this file.

4.2.6.3.1.1 The container element

The container element encapsulates all the information in the container.xml file.

Element Name:

container

Usage:

REQUIRED root element [xml] of the container.xml file.

Attributes:
version [required]
This attribute MUST have the value "1.0".
Content Model:

In this order:

4.2.6.3.1.2 The rootfiles element

The rootfiles element contains a list of package documents available in the EPUB container.

Element Name:

rootfiles

Usage:

REQUIRED first child of container.

Attributes:

None

Content Model:
4.2.6.3.1.3 The rootfile element

Each rootfile element identifies the location of one package document in the EPUB container.

Element Name:

rootfile

Usage:

As child of the rootfiles element. Repeatable.

Attributes:
full-path [required]

Identifies the location of a package document.

The value of the attribute MUST be a path-relative-scheme-less-URL string [url]. The path is relative to the root directory.

media-type [required]

Identifies the media type of the package document.

The value of the attribute MUST be "application/oebps-package+xml".

Content Model:

Empty

If an EPUB creator defines more than one rootfile element, each MUST reference a package document that conforms to the same version of EPUB. Each package document represents one rendering of the EPUB publication.

Note

Although the EPUB container provides the ability to reference more than one package document, this specification does not define how to interpret, or select from, the available options. Refer to [epub-multi-rend-11] for more information on how to bundle more than one rendering of the content.

4.2.6.3.1.6 Examples

This section is non-normative.
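As an illustration of the element definitions above, the following Python sketch embeds a minimal container.xml file and extracts the package document paths from it using the standard xml.etree.ElementTree module. The path EPUB/package.opf and the function name are assumptions made for the example; the namespace, version value, and media type come from the definitions in this section.

```python
import xml.etree.ElementTree as ET

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Illustrative document; the path EPUB/package.opf is an assumption.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_document_paths(container_xml: str) -> list:
    """Return the full-path value of every rootfile element, in document order."""
    root = ET.fromstring(container_xml.encode("utf-8"))
    if root.tag != f"{{{OCF_NS}}}container" or root.get("version") != "1.0":
        raise ValueError("not a version 1.0 container element")
    paths = []
    for rootfile in root.findall(f"{{{OCF_NS}}}rootfiles/{{{OCF_NS}}}rootfile"):
        if rootfile.get("media-type") != "application/oebps-package+xml":
            raise ValueError("unexpected media-type on rootfile")
        paths.append(rootfile.get("full-path"))
    return paths
```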

4.2.6.3.2 Encryption file (encryption.xml)

The OPTIONAL encryption.xml file in the META-INF directory holds all encryption information on the contents of the container. If an EPUB creator encrypts any resources within the container, they MUST include an encryption.xml file to provide information about the encryption used.

4.2.6.3.2.1 The encryption element
Element Name:

encryption

Namespace:

urn:oasis:names:tc:opendocument:xmlns:container

Usage:

REQUIRED root element [xml] of the encryption.xml file.

Attributes:

None

Content Model:

In any order:

  • EncryptedKey [1 or more]
  • EncryptedData [1 or more]

The encryption element contains child elements of type EncryptedKey and EncryptedData as defined by [xmlenc-core1].

An EncryptedKey element describes each encryption key used in the container, while an EncryptedData element describes each encrypted file. Each EncryptedData element refers to an EncryptedKey element, as described in XML Encryption.

Note

An XML Schema also informally defines the content of the encryption.xml file.

OCF encrypts individual files independently, trading off some security for improved performance, allowing the container contents to be incrementally decrypted. Encryption in this way exposes the directory structure and file naming of the whole package.

OCF uses XML Encryption [xmlenc-core1] to provide a framework for encryption, allowing a variety of algorithms to be used. XML Encryption specifies a process for encrypting arbitrary data and representing the result in XML. Even though an OCF abstract container may contain non-XML data, EPUB creators can use XML Encryption to encrypt all data in an OCF abstract container. OCF encryption supports only the encryption of entire files within the container, not parts of files. EPUB creators MUST NOT encrypt the encryption.xml file when present.

Encrypted data replaces unencrypted data in an OCF abstract container. For example, if an EPUB creator encrypts an image named photo.jpeg, they should replace the contents of the photo.jpeg resource with its encrypted contents. Within the ZIP directory, EPUB creators SHOULD store encrypted files rather than Deflate-compress them.

Note that some situations require obfuscating the storage of embedded fonts referenced by an EPUB publication to make them more difficult to extract for unrestricted use. Although obfuscation is not encryption, reading systems use the encryption.xml file in conjunction with the font obfuscation algorithm to identify fonts to deobfuscate.

EPUB creators MUST NOT encrypt the following files:

  • mimetype
  • META-INF/container.xml
  • META-INF/encryption.xml
  • META-INF/manifest.xml
  • META-INF/metadata.xml
  • META-INF/rights.xml
  • META-INF/signatures.xml
  • package document
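As a rough illustration of the list above, a packaging tool could check a path before encrypting it. This is only a sketch: the function name and the `package_documents` argument (assumed to be the set of package document paths, gathered for example from container.xml) are hypothetical, and paths are compared as exact, case-sensitive strings.

```python
# Container paths that must stay unencrypted, taken from the list above.
NEVER_ENCRYPT = {
    "mimetype",
    "META-INF/container.xml",
    "META-INF/encryption.xml",
    "META-INF/manifest.xml",
    "META-INF/metadata.xml",
    "META-INF/rights.xml",
    "META-INF/signatures.xml",
}


def may_encrypt(path, package_documents):
    """Return True if `path` is not one of the files that must stay in the clear.

    `package_documents` is a hypothetical input: the set of package document
    paths found in the container.
    """
    return path not in NEVER_ENCRYPT and path not in package_documents
```

For example, `may_encrypt("EPUB/fonts/example.otf", {"EPUB/package.opf"})` is true, while the same call for `"EPUB/package.opf"` is false.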

EPUB creators MAY subsequently encrypt signed resources using the Decryption Transform for XML Signature [xmlenc-decrypt]. This feature enables a reading system to distinguish data encrypted before signing from data encrypted after signing.

4.2.6.3.2.2 Order of compression and encryption

When storing streams of data with non-codec content types in an OCF ZIP container, EPUB creators SHOULD compress them before encrypting them. EPUB creators MUST use Deflate compression. This practice ensures that file entries stored in the ZIP container have a smaller size.

EPUB creators SHOULD NOT compress streams of data with codec content types before encrypting them. In such cases, additional compression introduces unnecessary processing overhead at production time (especially with large resource files) and impacts audio/video playback performance at consumption time. In some cases, the combination of compression with some encryption schemes might even compromise the ability of reading systems to handle partial content requests (e.g. HTTP byte ranges), because it is technically impossible to determine the length of the full resource ahead of media playback (e.g. HTTP Content-Length header).

When EPUB creators compress streams of data before encrypting, they SHOULD provide additional EncryptionProperties metadata to specify the size of the initial resource (i.e., before compression and encryption), as per the Compression XML element defined below. When EPUB creators do not compress streams of data before encrypting, they MAY provide the additional EncryptionProperties metadata to specify the size of the initial resource (i.e., before encryption).

Element Name:

Compression

Namespace:

http://www.idpf.org/2016/encryption#compression

Usage:

OPTIONAL child of EncryptionProperty.

Attributes:
Method [required]

Identifies the compression method used.

Value is either "0" (no compression) or "8" (Deflate algorithm).

OriginalLength [required]

Represents the size of the initial resource (number of bytes).

Value is a positive integer.

Content Model:

Empty
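The following sketch shows one way a tool might prepare a resource for encryption and build the matching Compression element. It is an illustration only. The function name and the `is_codec` flag are hypothetical, the encryption step itself is left out, and the use of a raw Deflate stream (no zlib header) is an assumption rather than something this section states.

```python
import zlib
import xml.etree.ElementTree as ET

COMPRESSION_NS = "http://www.idpf.org/2016/encryption#compression"


def prepare_for_encryption(data, is_codec):
    """Return (payload, compression_element) for a resource about to be encrypted.

    `is_codec` is a hypothetical flag saying whether the resource has a codec
    content type; such resources are passed through uncompressed.
    """
    if is_codec:
        payload, method = data, "0"  # "0": no compression
    else:
        # Assumption: raw Deflate (negative wbits), as used for ZIP method 8.
        deflater = zlib.compressobj(9, zlib.DEFLATED, -15)
        payload = deflater.compress(data) + deflater.flush()
        method = "8"  # "8": Deflate
    element = ET.Element(
        "{%s}Compression" % COMPRESSION_NS,
        {"Method": method, "OriginalLength": str(len(data))},
    )
    return payload, element
```

The returned payload is what would then be encrypted, and the element would be placed inside an EncryptionProperty for that resource.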

4.2.6.3.3 Manifest file (manifest.xml)

The OPTIONAL manifest.xml file in the META-INF directory provides a manifest of files in the container.

The OCF specification does not mandate a format for the manifest.

Note that package documents specify the only manifests used for processing EPUB publications. Reading systems do not use this file.

Note

This feature exists only for compatibility with [odf].

4.2.6.3.4 Metadata file (metadata.xml)

The OPTIONAL metadata.xml file in the META-INF directory is only for container-level metadata.

If EPUB creators include a metadata.xml file, they SHOULD use only namespace-qualified elements [xml-names] in it. The file SHOULD contain the root element [xml] metadata in the namespace http://www.idpf.org/2013/metadata, but this specification allows other root elements for backwards compatibility.

This version of the specification does not define metadata for use in the metadata.xml file. Future versions of this specification MAY define container-level metadata.

4.2.6.3.5 Rights management file (rights.xml)

This specification reserves the OPTIONAL rights.xml file in the META-INF directory for the trusted exchange of EPUB publications among rights holders, intermediaries, and users.

When EPUB creators do not include a rights.xml file, no part of the OCF abstract container is rights governed at the container level. Rights expressions might exist within the EPUB publications.

4.2.6.3.6 Digital signatures file (signatures.xml)
Note

Adding a digital signature is not a guarantee that a malicious actor cannot tamper with an EPUB publication as reading systems do not have to check signatures.

The OPTIONAL signatures.xml file in the META-INF directory holds digital signatures for the container and its contents.

4.2.6.3.6.1 The signatures element
Element Name:

signatures

Namespace:

urn:oasis:names:tc:opendocument:xmlns:container

Usage:

REQUIRED root element [xml] of the signatures.xml file.

Attributes:

None

Content Model:
  • Signature [1 or more]

The signatures element contains child elements of type Signature, as defined by [xmldsig-core1]. EPUB creators can apply signatures to an EPUB publication as a whole or to its parts, and can specify the signing of any kind of data (i.e., not just XML).

Note

An XML Schema also informally defines the content of the signatures.xml file.

When an EPUB creator does not include a signatures.xml file, they are not signing any part of the OCF abstract container at the container level. Digital signing might exist within the EPUB publication.

When an EPUB creator creates a data signature for the OCF abstract container, they SHOULD add the signature as the last child Signature element of the signatures element.

Note

Each Signature in the signatures.xml file identifies by URL [url] the data to which the signature applies, using the [xmldsig-core1] Manifest element and its Reference sub-elements. EPUB creators may sign individual container files separately or together. Separately signing each file creates a digest value for the resource that reading systems can validate independently. This approach might make a Signature element larger. If EPUB creators sign files together, they can list the set of signed files in a single XML Signature Manifest element and reference them by one or more Signature elements.

EPUB creators can sign any or all files in the OCF abstract container in their entirety, except for the signatures.xml file since that file will contain the computed signature information. Whether and how EPUB creators sign the signatures.xml file depends on their objective.

If the EPUB creator wants to allow signatures to be added or removed from the OCF abstract container without invalidating their signature, they SHOULD NOT sign the signatures.xml file.

If the EPUB creator wants any addition or removal of a signature to invalidate their signature, they can use the Enveloped Signature transform defined in Section 6.6.4 of [xmldsig-core1] to sign the entire pre-existing signature file excluding the Signature being created. This transform would sign all previous signatures, and it would become invalid if a subsequent signature were added to the package.

Note

If the EPUB creator wants the removal of an existing signature to invalidate their signature, but also wants to allow the addition of signatures, they could use an XPath transform to sign just the existing signatures. The details of such a transform are outside the scope of this specification, however.

The [xmldsig-core1] specification does not associate any semantics with a signature; an agent might include semantic information, for example, by adding information to the Signature element that describes the signature. The [xmldsig-core1] specification describes how additional information can be added to a signature, such as by using the SignatureProperties element.

4.3 OCF ZIP container

4.3.1 Introduction

This section is non-normative.

An OCF ZIP container is a physical single-file manifestation of an OCF abstract container. The container allows:

  • the exchange of in-progress EPUB publications between different individuals and/or different organizations;

  • the transfer of EPUB publications from a publisher or conversion house to the distribution or sales channel; and

  • the delivery of EPUB publications to EPUB reading systems or users.

4.3.2 ZIP file requirements

An OCF ZIP container uses the ZIP format as specified by [zip], but with the following constraints and clarifications:

  • The contents of the OCF ZIP container MUST be a conforming OCF abstract container.

  • OCF ZIP containers MUST NOT use the features in the ZIP application note [zip] that allow ZIP files to be spanned across multiple storage media or be split into multiple files.

  • OCF ZIP containers MUST include only stored (uncompressed) and Deflate-compressed ZIP entries within the ZIP archive.

  • OCF ZIP containers MAY use the ZIP64 extensions defined as "Version 1" in section V, subsection G of the application note [zip] and SHOULD use only those extensions when the content requires them.

  • OCF ZIP containers MUST NOT use the encryption features defined by the ZIP format; instead, encryption MUST be done using the features described in 4.2.6.3.2 Encryption file (encryption.xml).

  • OCF ZIP containers MUST encode file system names using UTF-8 [unicode].

The following constraints apply to specific fields in the OCF ZIP container archive:

  • In the local file header table, EPUB creators MUST set the version needed to extract fields to the values 10, 20 or 45 to match the maximum version level needed by the given file (e.g., 20 for Deflate, 45 for ZIP64).

  • In the local file header table, EPUB creators MUST set the compression method field to the values 0 or 8.
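The two field constraints above can be approximated with Python's zipfile module, as in the sketch below. Note the assumption: zipfile exposes these values as read from the central directory, not from the local file headers the constraints refer to, so this is only an approximate check. The function name is hypothetical.

```python
import zipfile

ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}  # 0 and 8
ALLOWED_VERSIONS = {10, 20, 45}


def check_entries(infos):
    """Return a list of problems found in an iterable of zipfile.ZipInfo objects."""
    problems = []
    for info in infos:
        if info.compress_type not in ALLOWED_METHODS:
            problems.append(
                "%s: compression method %d" % (info.filename, info.compress_type)
            )
        if info.extract_version not in ALLOWED_VERSIONS:
            problems.append(
                "%s: version needed to extract %d"
                % (info.filename, info.extract_version)
            )
    return problems
```

A caller would typically pass `zipfile.ZipFile(path).infolist()` and treat an empty result as passing this check.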

4.3.3 OCF ZIP container media type identification

EPUB creators MUST include the mimetype file as the first file in the OCF ZIP container. In addition:

  • The contents of the mimetype file MUST be the MIME media type [rfc2046] string application/epub+zip encoded in US-ASCII [us-ascii].
  • The mimetype file MUST NOT contain any leading or trailing padding or white space.
  • The mimetype file MUST NOT begin with the Unicode byte order mark U+FEFF.
  • EPUB creators MUST NOT compress or encrypt the mimetype file.
  • EPUB creators MUST NOT include an extra field in the ZIP header of the mimetype file.
Note

Refer to I.2 The application/epub+zip media type for further information about the application/epub+zip media type.
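As an illustration of the mimetype requirements, the sketch below writes a container with Python's zipfile module so that the mimetype file comes first, is stored uncompressed, and has no extra field. The function name and the `extra_files` argument are hypothetical, and the result is not a complete EPUB publication.

```python
import io
import zipfile


def new_epub_container(extra_files):
    """Build an in-memory ZIP whose first entry is an uncompressed mimetype file.

    `extra_files` is a hypothetical mapping of container paths to bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        info = zipfile.ZipInfo("mimetype")        # no extra field is set
        info.compress_type = zipfile.ZIP_STORED   # stored, not compressed
        # Exact ASCII string: no byte order mark, padding or trailing newline.
        archive.writestr(info, b"application/epub+zip")
        for name, data in extra_files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Because the local file header is 30 bytes and the file name is 8 bytes long, the media type string in such a file starts at byte offset 38, which makes it easy to sniff.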

4.4 Font obfuscation

Caution

Better methods of protecting fonts exist. Both [woff] and [woff2] fonts, for example, allow the embedding of licensing information and provide some protection through font table compression. The use of remotely hosted fonts also allows for font subsetting. EPUB creators are advised to use font obfuscation as defined in this section only when no other options are available to them. See also the limitations of obfuscation.

4.4.1 Introduction

This section is non-normative.

Since an OCF ZIP container is fundamentally a ZIP file, commonly available ZIP tools can be used to extract any unencrypted content stream from the package. Moreover, the nature of ZIP files means that their contents might appear like any other native container on some systems (e.g., a folder).

While this simplicity of ZIP files is quite useful, it also poses a problem when EPUB creators do not encrypt their fonts but still do not want them to be easy to extract. An EPUB creator who wishes to include a third-party font, for example, typically does not want that font extracted and re-used by others. More critically, many commercial fonts allow embedding, but embedding a font implies making it an integral part of the EPUB publication, not just providing the original font file along with the content.

Since integrated ZIP support is so ubiquitous in modern operating systems, simply placing a font in the OCF ZIP container is insufficient to signify that the font cannot be reused in other contexts. This uncertainty can undermine the otherwise useful font embedding capability of EPUB publications.

To discourage reuse of their fonts, some font vendors might only allow their use in EPUB publications if the fonts are bound in some way to the EPUB publication. That is, the font file cannot be installed directly for use on an operating system with the built-in tools of that computing device, and it cannot be directly used by other EPUB publications.

It is beyond the scope of this specification to provide a digital rights management or enforcement system for fonts. This section instead defines a method of obfuscation that will require additional work on the part of the final OCF recipient to gain general access to any obfuscated fonts.

4.4.2 Limitations

This section is non-normative.

This specification does not claim that obfuscation constitutes encryption, nor does it guarantee that the resource will be secure from copyright infringement. The hope is only that this algorithm will meet the requirements of vendors who require some assurance that their fonts cannot be extracted simply by unzipping the OCF ZIP container and copying the resource.

Obfuscation, like any protection scheme, cannot fully protect fonts from being accessed in their deobfuscated state. The mechanism only provides an obstacle for those who are unaware of the license details. It will not prevent a determined user from gaining full access to the font through such alternative means as:

  • applying the deobfuscation algorithm to extract the raw font file;
  • accessing the deobfuscated font through a reading system that must deobfuscate it to render the content (e.g., by accessing the resources through a browser-based reading system); or
  • accessing the deobfuscated font through authoring tools that provide the visual rendering of the content.

As a result, whether this method of obfuscation satisfies the requirements of individual font licenses remains a question for the licensor and licensee. EPUB creators are responsible for ensuring their use of obfuscation meets font licensing requirements.

EPUB creators should also be aware that obfuscation may lead to interoperability issues in reading systems as reading systems are not required to deobfuscate fonts. As a result, the visual presentation of their publications may differ from reading system to reading system.

Also note that the algorithm is restricted to obfuscating fonts. It is not intended as a general-purpose mechanism for obfuscating any resource in the EPUB container.

4.4.3 Obfuscation key

EPUB creators MUST derive the key used in the obfuscation algorithm from the unique identifier.

All white space characters, as defined in section 2.3 of the XML 1.0 specification [xml], MUST be removed from this identifier — specifically, the Unicode code points U+0020, U+0009, U+000D and U+000A.

EPUB creators MUST generate a SHA-1 digest of the UTF-8 representation of the resulting string as specified by the Secure Hash Standard [fips-180-4]. They can then use this digest as the key for the algorithm.
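The key derivation described above can be sketched in a few lines using Python's hashlib. The function name is hypothetical. The input is assumed to be the text of the unique identifier as it appears in the package document.

```python
import hashlib

# The four white space code points named above: U+0020, U+0009, U+000D, U+000A.
XML_WHITESPACE = " \t\r\n"


def obfuscation_key(unique_identifier):
    """Return the 20-byte SHA-1 digest used as the obfuscation key."""
    stripped = "".join(ch for ch in unique_identifier if ch not in XML_WHITESPACE)
    return hashlib.sha1(stripped.encode("utf-8")).digest()
```

The result is always 20 bytes long, which is why the obfuscation algorithm works through the font in 20-byte runs.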

4.4.4 Obfuscation algorithm

The algorithm employed to obfuscate fonts consists of modifying the first 1040 bytes (~1KB) of the font file. (In the unlikely event that the font file is less than 1040 bytes, this process will modify the entire file.)

To obfuscate the original data, store, as the first byte of the embedded font, the result of performing a logical exclusive or (XOR) on the first byte of the raw font file and the first byte of the obfuscation key.

Repeat this process with the next byte of source and key and continue for all bytes in the key. At this point, the process continues starting with the first byte of the key and 21st byte of the source. Once 1040 bytes are encoded in this way (or the end of the source is reached), directly copy any remaining data in the source to the destination.

EPUB creators MUST obfuscate fonts before compressing and adding them to the OCF ZIP container. Note that as obfuscation is not encryption, this requirement is not a violation of the one in 4.2.6.3.2 Encryption file (encryption.xml) to compress fonts before encrypting them.

The following pseudo-code exemplifies the obfuscation algorithm.

  1. set ocf to OCF ZIP container file
  2. set source to font file
  3. set destination to obfuscated font file
  4. set keyData to key for file
  5. set outer to 0
  6. while outer < 52 and not (source at EOF)
    1. set inner to 0
    2. while inner < 20 and not (source at EOF)
      1. read 1 byte from source (Assumes read advances file position)
      2. set sourceByte to result of read
      3. set keyByte to byte inner of keyData
      4. set obfuscatedByte to (sourceByte XOR keyByte)
      5. write obfuscatedByte to destination
      6. increment inner
      end while
    3. increment outer
    end while
  7. if not (source at EOF) then
    1. read source to EOF
    2. write result of read to destination
    end if
  8. Deflate destination
  9. store destination as source in ocf
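The core of the pseudo-code (steps 5 to 7) might look as follows in Python. This is a minimal sketch that works on bytes in memory and leaves the final Deflate and storage steps to the packaging tool. The function name is hypothetical, and `key` is assumed to be the 20-byte digest derived from the unique identifier.

```python
OBFUSCATED_LENGTH = 1040  # 52 runs of the 20-byte key


def obfuscate_font(font_bytes, key):
    """XOR the first 1040 bytes of the font with the repeating key.

    Any bytes after the first 1040 are copied unchanged. A shorter font is
    modified in its entirety.
    """
    head = bytes(
        byte ^ key[index % len(key)]
        for index, byte in enumerate(font_bytes[:OBFUSCATED_LENGTH])
    )
    return head + font_bytes[OBFUSCATED_LENGTH:]
```

Because XOR is its own inverse, applying the same function with the same key to an obfuscated font returns the original bytes.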

4.4.5 Specifying obfuscated fonts

Although not technically encrypted data, all obfuscated fonts MUST have an entry in the encryption.xml file accompanying the EPUB publication (see 4.2.6.3.2 Encryption file (encryption.xml)).

EPUB creators MUST specify an EncryptedData element for each obfuscated font. Each EncryptedData element MUST contain a child EncryptionMethod element whose Algorithm attribute has the value http://www.idpf.org/2008/embedding. The presence of this attribute signals the use of the algorithm described in this specification.

EPUB creators MUST list the path to the obfuscated font in the CipherReference child of the CipherData element. As the obfuscation algorithm is restricted to fonts, the URI attribute of the CipherReference element MUST reference a Font core media type resource.

To prevent trivial copying of the embedded font to other EPUB publications, EPUB creators MUST NOT provide the obfuscation key in the encryption.xml file.
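To illustrate how these entries might be consumed, the sketch below lists the fonts that an encryption.xml file marks as obfuscated. The sample document is pared down to the elements named in this section, and its font path is hypothetical. The XML Encryption namespace `http://www.w3.org/2001/04/xmlenc#` comes from [xmlenc-core1] rather than from this section.

```python
import xml.etree.ElementTree as ET

NS = {"enc": "http://www.w3.org/2001/04/xmlenc#"}
OBFUSCATION_ALGORITHM = "http://www.idpf.org/2008/embedding"

# Hypothetical sample with a single obfuscated font.
SAMPLE = """<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
    <enc:CipherData>
      <enc:CipherReference URI="EPUB/fonts/example.otf"/>
    </enc:CipherData>
  </enc:EncryptedData>
</encryption>"""


def obfuscated_fonts(encryption_xml):
    """Return the CipherReference URIs of entries that use the obfuscation algorithm."""
    root = ET.fromstring(encryption_xml)
    fonts = []
    for data in root.findall("enc:EncryptedData", NS):
        method = data.find("enc:EncryptionMethod", NS)
        reference = data.find("enc:CipherData/enc:CipherReference", NS)
        if method is None or reference is None:
            continue
        if method.get("Algorithm") == OBFUSCATION_ALGORITHM:
            fonts.append(reference.get("URI"))
    return fonts
```

Entries that use any other Algorithm value are skipped, since they describe real encryption rather than obfuscation.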

5. Package document

All [xml] elements defined in this section are in the http://www.idpf.org/2007/opf namespace [xml-names] unless otherwise specified.

5.1 Introduction

This section is non-normative.

The package document is an XML document that consists of a set of elements that each encapsulate information about a particular aspect of an EPUB publication. These elements serve to centralize metadata, detail the individual resources, and provide the reading order and other information necessary for its rendering.

The following list summarizes the information found in the package document:

Note

An EPUB publication can reference more than one package document, allowing for alternative representations of the content. For more information, refer to 4.2.6.3.1 Container file (container.xml).

Note

Refer to I.1 The application/oebps-package+xml media type for information about the file properties of package documents.

5.2 Parsing URLs in the package document

To parse a URL string url used in the package document, the URL parser [url] MUST be applied to url, with the content URL of the package document as base.
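As a rough illustration, the sketch below resolves URL strings against the package document's content URL. Two assumptions apply: the base URL shown is hypothetical (the content URL is assigned by the reading system), and Python's `urljoin` follows RFC 3986 style resolution rather than the URL parser of [url], so edge cases may differ.

```python
from urllib.parse import urljoin


def resolve_in_package(url, package_content_url):
    """Resolve `url` using the package document's content URL as the base."""
    return urljoin(package_content_url, url)


# Hypothetical content URL for the package document.
BASE = "https://example.invalid/EPUB/package.opf"
chapter = resolve_in_package("xhtml/chapter1.xhtml", BASE)
cover = resolve_in_package("../images/cover.jpg", BASE)
```

Here `chapter` resolves inside the `EPUB/` directory, while `cover` steps up one level to the container root.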

5.3 Shared attributes

This section provides definitions for shared attributes (i.e., attributes allowed on two or more elements).

5.3.1 The dir attribute

Specifies the base direction [bidi] of the textual content and attribute values of the carrying element and its descendants.

Allowed values are:

  • ltr — left-to-right base direction;
  • rtl — right-to-left base direction; and
  • auto — base direction is determined using the Unicode Bidi Algorithm [bidi].

Reading systems will assume the value auto when EPUB creators omit the attribute or use an invalid value.

Note

The base direction specified in the dir attribute does not affect the ordering of characters within directional runs, only the relative ordering of those runs and the placement of weak directional characters such as punctuation.

Allowed on: collection, dc:contributor, dc:coverage, dc:creator, dc:description, dc:publisher, dc:relation, dc:rights, dc:subject, dc:title, meta and package.

5.3.2 The href attribute

A valid URL string [url] that references a resource.

The URL string MUST NOT reference resources via elements in the package document (e.g., via a manifest item or spine itemref declaration).

Allowed on: item and link.

5.3.3 The id attribute

The ID [xml] of the element, which MUST be unique within the document scope.

Allowed on: collection, dc:contributor, dc:coverage, dc:creator, dc:date, dc:description, dc:format, dc:identifier, dc:language, dc:publisher, dc:relation, dc:rights, dc:source, dc:subject, dc:title, dc:type, item, itemref, link, manifest, meta, package and spine.

5.3.4 The media-type attribute

A media type [rfc2046] that specifies the type and format of the referenced resource.

Allowed on: item and link.

5.3.5 The properties attribute

A space-separated list of property values.

Refer to each element's definition for the reserved vocabulary for the attribute.

Allowed on: item, itemref and link.

5.3.6 The refines attribute

Establishes an association between the current expression and the element or resource identified by its value. EPUB creators MUST use as the value a path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string that references the resource or element they are describing.

The refines attribute is OPTIONAL depending on the type of metadata expressed. When omitted, the element defines a primary expression.

When creating expressions about a publication resource, the refines attribute SHOULD specify a fragment identifier that references the ID [xml] of the resource's manifest entry.

Refinement chains MUST NOT contain circular references or self-references.

Allowed on: link and meta.
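The rule against circular and self-referencing refinement chains can be checked with a simple walk, sketched below. The input format is hypothetical: a dictionary mapping the id of each refining element to the id its refines attribute points at, with the leading `#` already removed.

```python
def has_circular_refinement(refines):
    """Return True if any refinement chain loops back on itself.

    `refines` maps an element id to the id named in its refines attribute.
    """
    for start in refines:
        seen = {start}
        current = refines[start]
        while current is not None:
            if current in seen:
                return True  # the chain revisits an element: circular or self-reference
            seen.add(current)
            current = refines.get(current)  # None once the chain reaches a primary expression
    return False
```

A chain such as `{"role": "creator01"}` ends at an element that refines nothing and passes, whereas `{"a": "b", "b": "a"}` and `{"a": "a"}` are both reported.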

5.3.7 The xml:lang attribute

Specifies the language of the textual content and attribute values of the carrying element and its descendants, as defined in section 2.12 Language Identification of [xml]. The value of each xml:lang attribute MUST be a well-formed language tag [bcp47].

Allowed on: collection, dc:contributor, dc:coverage, dc:creator, dc:description, dc:publisher, dc:relation, dc:rights, dc:subject, dc:title, meta and package.

5.4 The package element

The package element encapsulates all the information expressed in the package document.

Element Name:

package

Usage:

REQUIRED root element [xml] of the package document.

Attributes:
Content Model:

In this order:

The version attribute specifies the EPUB specification version to which the given EPUB publication conforms. The attribute MUST have the value "3.0" to indicate conformance with EPUB 3.

Note

Updates to this specification do not represent new versions of EPUB 3 (i.e., each new 3.X specification is a continuation of the EPUB 3 format). The Working Group is committed to minimizing any changes that would invalidate existing content, allowing the version attribute value to remain unchanged.

The unique-identifier attribute takes an IDREF [xml] that identifies the dc:identifier element that provides the preferred, or primary, identifier.

The prefix attribute provides a declaration mechanism for prefixes not reserved by this specification. Refer to D.1.4 The prefix attribute for more information.

5.5 Metadata section

5.5.1 The metadata element

The metadata element encapsulates meta information.

Element Name:

metadata

Usage:

REQUIRED first child of package.

Attributes:

None

Content Model:

In any order:

The package document metadata element has two primary functions:

  1. to provide a minimal set of meta information for reading systems to use to internally catalogue an EPUB publication and make it available to a user (e.g., to present in a bookshelf).

  2. to provide access to all rendering metadata needed to control the layout and display of the content (e.g., fixed-layout properties).

The package document does not provide complex metadata encoding capabilities. If EPUB creators need to provide more detailed information, they can associate metadata records (e.g., that conform to an international standard such as [onix] or are created for custom purposes) using the link element. This approach allows reading systems to process the metadata in its native form, avoiding the potential problems and information loss caused by translating to use the minimal package document structure.

In keeping with this philosophy, the package document only has the following minimal metadata requirements: it MUST contain the [dcterms] dc:title, dc:identifier, and dc:language elements together with the [dcterms] dcterms:modified property. All other metadata is OPTIONAL.
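A checker for these minimal requirements might look like the sketch below. It is an illustration, not a validator: the function name and the sample package document (including its identifier and timestamp) are hypothetical, and it assumes the dcterms:modified property is expressed in a meta element's property attribute.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample that deliberately omits dc:language.
SAMPLE_PACKAGE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-4000-8000-000000000000</dc:identifier>
    <dc:title>Example Title</dc:title>
    <meta property="dcterms:modified">2022-12-06T00:00:00Z</meta>
  </metadata>
</package>"""


def missing_required_metadata(package_xml):
    """Return the names of required metadata items that are absent."""
    metadata = ET.fromstring(package_xml).find("opf:metadata", NS)
    if metadata is None:
        return ["metadata"]
    missing = []
    for name in ("title", "identifier", "language"):
        if metadata.find("dc:" + name, NS) is None:
            missing.append("dc:" + name)
    metas = metadata.findall("opf:meta", NS)
    if not any(m.get("property") == "dcterms:modified" for m in metas):
        missing.append("dcterms:modified")
    return missing
```

For the sample above the function reports only `dc:language` as missing.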

The meta element provides a generic mechanism for including metadata properties from any vocabulary. Although EPUB creators MAY use this mechanism for any metadata purposes, they will typically use it to include rendering metadata defined in EPUB specifications.

Note

See [epub-a11y-11] for accessibility metadata recommendations.

5.5.2 Metadata values

The Dublin Core elements [dcterms] and meta element have mandatory child text content [dom]. In the descriptions for these elements, this specification refers to this content as the element's value.

These elements MUST have non-empty values after leading and trailing ASCII whitespace [infra] is stripped (i.e., they must consist of at least one non-whitespace character).

Whitespace within these element values is not significant. Sequences of one or more whitespace characters are collapsed to a single space [infra] during processing.
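The stripping and collapsing described above can be sketched as follows. The function names are hypothetical. ASCII whitespace is taken to be the five code points that [infra] defines: tab, line feed, form feed, carriage return and space.

```python
import re

ASCII_WHITESPACE = "\t\n\f\r "


def normalize_value(text):
    """Strip leading and trailing ASCII whitespace and collapse inner runs to one space."""
    return re.sub(r"[\t\n\f\r ]+", " ", text.strip(ASCII_WHITESPACE))


def is_valid_value(text):
    """A value is acceptable only if something is left after stripping."""
    return normalize_value(text) != ""
```

A title broken across several indented lines in the package document therefore yields a single line of text with single spaces between words.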

5.5.3 Dublin Core required elements

5.5.3.1 The dc:identifier element

The dc:identifier element [dcterms] contains an identifier such as a UUID, DOI or ISBN.

Element Name:

dc:identifier

Namespace:

http://purl.org/dc/elements/1.1/

Usage:

REQUIRED child of metadata. Repeatable.

Attributes:
  • id [conditionally required]

Content Model:

Text

The EPUB creator MUST provide an identifier that is unique to one and only one EPUB publication (its unique identifier) in a dc:identifier element. This dc:identifier element MUST specify an id attribute whose value is referenced from the package element's unique-identifier attribute.

Although not static, EPUB creators should make changes to the unique identifier for an EPUB publication as infrequently as possible. Unique identifiers should have maximal persistence both for referencing and distribution purposes. EPUB creators should not issue new identifiers when making minor revisions such as updating metadata, fixing errata, or making similar minor changes.

EPUB creators MAY specify additional identifiers. The identifiers should be fully qualified URIs.

EPUB creators MAY use the identifier-type property to indicate that the value of a dc:identifier element conforms to an established system or an issuing authority granted it.
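The link between the package element's unique-identifier attribute and the matching dc:identifier element can be followed as in the sketch below. The function name and the sample package document, including its identifier value, are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample with two identifiers, one of which is the unique identifier.
SAMPLE_PACKAGE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier>urn:isbn:0000000000000</dc:identifier>
    <dc:identifier id="pub-id">
      urn:uuid:00000000-0000-4000-8000-000000000000
    </dc:identifier>
  </metadata>
</package>"""


def unique_identifier(package_xml):
    """Return the value of the dc:identifier referenced by unique-identifier, or None."""
    root = ET.fromstring(package_xml)
    idref = root.get("unique-identifier")
    if idref is None:
        return None
    for element in root.findall("opf:metadata/dc:identifier", NS):
        if element.get("id") == idref:
            return (element.text or "").strip()
    return None
```

A result of `None` signals either a missing unique-identifier attribute or an IDREF that does not match any dc:identifier element.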

5.5.3.2 The dc:title element

The dc:title element [dcterms] represents an instance of a name for the EPUB publication.

Element Name:

dc:title

Namespace:

http://purl.org/dc/elements/1.1/

Usage:

REQUIRED child of metadata. Repeatable.

Attributes:
Content Model:

Text

The first dc:title element in document order is the main title of the EPUB publication (i.e., the primary one reading systems present to users).

EPUB creators should use only a single dc:title element to ensure consistent rendering of the title in reading systems.

Note

Although it is possible to include more than one dc:title element for multipart titles, reading system support for additional dc:title elements is inconsistent. Reading systems may ignore the additional segments or combine them in unexpected ways.

For example, the following shows a basic multipart title:

<metadata …>
   <dc:title>
      THE LORD OF THE RINGS
   </dc:title>
   <dc:title>
      Part One: The Fellowship of the Ring
   </dc:title>
</metadata>

The same title could instead be expressed using a single dc:title element as follows:

<metadata …>
   <dc:title>
       THE LORD OF THE RINGS, Part One:
       The Fellowship of the Ring
   </dc:title>
</metadata>

Previous versions of this specification recommended using the title-type and display-seq properties to identify and format the segments of multipart titles (see the Great Cookbooks example). It is still possible to add these semantics, but they are also not well supported.

5.5.3.3 The dc:language element

The dc:language element [dcterms] specifies the language of the content of the EPUB publication.

Element Name:

dc:language

Namespace:

http://purl.org/dc/elements/1.1/

Usage:

REQUIRED child of metadata. Repeatable.

Attributes:

id [optional]

Content Model:

Text

The value of each dc:language element MUST be a well-formed language tag [bcp47].

Although EPUB creators MAY specify additional dc:language elements for multilingual publications, reading systems will treat the first dc:language element in document order as the primary language of the EPUB publication.

Note

Publication resources do not inherit their language from the dc:language element(s). EPUB creators must set the language of a resource using the intrinsic methods of the format.

5.5.4 Dublin Core optional elements

5.5.4.1 General definition

All [dcterms] elements except for dc:identifier, dc:language, and dc:title are designated as OPTIONAL. These elements conform to the following generalized definition:

Element Name:

dc:contributor | dc:coverage | dc:creator | dc:date | dc:description | dc:format | dc:publisher | dc:relation | dc:rights | dc:source | dc:subject | dc:type

Namespace:

http://purl.org/dc/elements/1.1/

Usage:

OPTIONAL child of metadata. Repeatable.

Attributes:
  • dir [optional] – only allowed on dc:contributor, dc:coverage, dc:creator, dc:description, dc:publisher, dc:relation, dc:rights, and dc:subject.

  • id [optional] – allowed on any element.

  • xml:lang [optional] – only allowed on dc:contributor, dc:coverage, dc:creator, dc:description, dc:publisher, dc:relation, dc:rights, and dc:subject.

Content Model:

Text

This specification does not modify the [dcterms] element definitions except as noted in the following sections.

5.5.4.2 The dc:contributor element

The dc:contributor element [dcterms] is used to represent the name of a person, organization, etc. that played a secondary role in the creation of the content.

The requirements for the dc:contributor element are identical to those for the dc:creator element in all other respects.

5.5.4.3 The dc:creator element

The dc:creator element [dcterms] represents the name of a person, organization, etc. responsible for the creation of the content. EPUB creators MAY associate a role property with the element to indicate the function the creator played.

The dc:creator element should contain the name of the creator as EPUB creators intend reading systems to display it to users.

EPUB creators MAY use the file-as property to associate a normalized form of the creator's name, and the alternate-script property to represent the creator's name in another language or script.

If an EPUB publication has more than one creator, EPUB creators should specify each in a separate dc:creator element.

The document order of dc:creator elements in the metadata section determines the display priority, where the first dc:creator element encountered is the primary creator.

EPUB creators should represent secondary contributors using the dc:contributor element.

5.5.4.4 The dc:date element

The dc:date element [dcterms] defines the publication date of the EPUB publication. The publication date is not the same as the last modified date (the last time the EPUB creator changed the EPUB publication).

It is RECOMMENDED that the date string conform to [iso8601], particularly the subset expressed in W3C Date and Time Formats [datetime], as such strings are both human and machine readable.

EPUB creators should express additional dates using the specialized date properties available in the [dcterms] vocabulary, or similar.

EPUB publications MUST NOT contain more than one dc:date element.

5.5.4.5 The dc:subject element

The dc:subject element [dcterms] identifies the subject of the EPUB publication. EPUB creators should set the value of the element to the human-readable heading or label, but may use a code value if the subject taxonomy does not provide a separate descriptive label.

EPUB creators MAY identify the system or scheme they drew the element's value from using the authority property.

When a scheme is identified, EPUB creators MUST associate a subject code using the term property.

The term property MUST NOT be associated with a dc:subject element that does not specify a scheme.

The values of the dc:subject element and term property are case sensitive only when the designated scheme requires it.
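As a non-normative sketch, the pairing rule for the authority and term properties could be checked as follows. The sample subjects, the code value, and the helper check_subjects are hypothetical, and the sketch assumes both properties are expressed as meta elements that refine the dc:subject by fragment identifier.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical metadata; the second subject deliberately breaks the pairing rule.
SAMPLE_SUBJECTS = """
<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject id="subject01">FICTION / Fantasy / General</dc:subject>
  <meta refines="#subject01" property="authority">BISAC</meta>
  <meta refines="#subject01" property="term">FIC009000</meta>
  <dc:subject id="subject02">Dragons</dc:subject>
  <meta refines="#subject02" property="term">dragons-001</meta>
</metadata>
"""


def check_subjects(metadata_xml):
    """Report subjects where authority and term are not used together."""
    root = ET.fromstring(metadata_xml.strip())

    # Map each refined id to the set of property names that refine it.
    properties = {}
    for meta in root.findall("opf:meta", NS):
        target = meta.get("refines", "")
        if target.startswith("#"):
            properties.setdefault(target[1:], set()).add(meta.get("property"))

    problems = []
    for subject in root.findall("dc:subject", NS):
        subject_id = subject.get("id")
        found = properties.get(subject_id, set())
        if "authority" in found and "term" not in found:
            problems.append(f"{subject_id}: authority without term")
        if "term" in found and "authority" not in found:
            problems.append(f"{subject_id}: term without authority")
    return problems


subject_problems = check_subjects(SAMPLE_SUBJECTS)
```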

5.5.4.6 The dc:type element

The dc:type element [dcterms] is used to indicate that the EPUB publication is of a specialized type (e.g., annotations or a dictionary packaged in EPUB format).

EPUB creators MAY use any text string as a value.

Note

The former IDPF EPUB 3 Working Group maintained a non-normative registry of specialized EPUB publication types for use with this element. This Working Group no longer maintains the registry and does not anticipate developing new specialized publication types.

5.5.5 The meta element

The meta element provides a generic means of including package metadata.

Element Name:

meta

Usage:

As child of the metadata element. Repeatable.

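Attributes:

  • dir [optional]
  • id [optional]
  • property [required]
  • refines [optional]
  • scheme [optional]
  • xml:lang [optional]

Content Model: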

Text

Each meta element defines a metadata expression. The property attribute takes a property data type value that defines the statement made in the expression, and the text content of the element represents the assertion. (Refer to D.1 Vocabulary association mechanisms for more information.)

This specification defines two types of metadata expressions that EPUB creators can define using the meta element:

  • A primary expression is one in which the expression defined in the meta element establishes some aspect of the EPUB publication. A meta element that omits a refines attribute defines a primary expression.
  • A subexpression is one in which the expression defined in the meta element is associated with another expression or resource using the refines attribute to enhance its meaning. A subexpression might refine a media clip, for example, by expressing its duration, or refine a creator or contributor expression by defining the role of the person.

EPUB creators MAY use subexpressions to refine the meaning of other subexpressions, thereby creating chains of information.
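The distinction between primary expressions and subexpressions, and the way chains form, can be sketched non-normatively as below. The sample metadata and the helpers kind and refinement_chain are invented for illustration, and the sketch only follows refines values that are fragment identifiers (it ignores references to other resources).

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical metadata: a title, one primary meta, and a two-step refinement chain.
SAMPLE_METADATA = """
<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="title01">An Example Title</dc:title>
  <meta property="dcterms:modified">2022-12-06T00:00:00Z</meta>
  <meta id="alt01" refines="#title01" property="alternate-script" xml:lang="fr">Un titre d'exemple</meta>
  <meta id="sort01" refines="#alt01" property="file-as">titre d'exemple, Un</meta>
</metadata>
"""


def kind(meta):
    """A meta element without a refines attribute is a primary expression."""
    return "subexpression" if meta.get("refines") else "primary"


def refinement_chain(root, meta):
    """Follow refines references from a meta element to the expression it ultimately refines."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    chain, current = [], meta
    while current is not None and current.get("refines", "").startswith("#"):
        target_id = current.get("refines")[1:]
        if target_id in chain:
            raise ValueError("circular refines chain")
        chain.append(target_id)
        current = by_id.get(target_id)
    return chain


root = ET.fromstring(SAMPLE_METADATA.strip())
metas = root.findall(OPF + "meta")
kinds = [kind(m) for m in metas]
chain = refinement_chain(root, metas[-1])
```

Here the last meta element refines another subexpression, which in turn refines the dc:title element, so the chain ends at a primary expression.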

Note

All the [dcterms] elements represent primary expressions, and permit refinement by meta element subexpressions.

The Meta Properties Vocabulary is the default vocabulary for use with the property attribute.

EPUB creators MAY add terms from other vocabularies as defined in D.1 Vocabulary association mechanisms.

The scheme attribute identifies the system or scheme the EPUB creator obtained the element's value from. The value of the attribute MUST be a property data type value that resolves to the resource that defines the scheme. The scheme attribute does not have a default vocabulary (i.e., all values require a prefix).

5.5.6 Last modified date

The metadata section MUST contain exactly one dcterms:modified property [dcterms] containing the last modification date. The value of this property MUST be an [xmlschema-2] dateTime conformant date of the form: CCYY-MM-DDThh:mm:ssZ

EPUB creators MUST express the last modification date in Coordinated Universal Time (UTC) and MUST terminate it with the "Z" (Zulu) time zone indicator.

EPUB creators should update the last modified date whenever they make changes to the EPUB publication.

EPUB creators MAY specify additional modified properties in the package document metadata, but they MUST have a different subject (i.e., they require a refines attribute that references an element or resource).
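A minimal, non-normative sketch of checking these requirements follows. It assumes the last modification date is expressed as a meta element with the property value dcterms:modified and no refines attribute; the sample metadata and helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

OPF = "{http://www.idpf.org/2007/opf}"

# The required lexical form: CCYY-MM-DDThh:mm:ssZ
MODIFIED_FORM = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")

# Hypothetical metadata: one publication-level date and one that refines a resource.
SAMPLE_METADATA = """
<metadata xmlns="http://www.idpf.org/2007/opf">
  <meta property="dcterms:modified">2022-12-06T00:00:00Z</meta>
  <meta refines="#chapter01" property="dcterms:modified">2022-11-30T12:30:00Z</meta>
</metadata>
"""


def is_valid_modified(value):
    """Check the lexical form, then confirm the fields form a real date and time."""
    if not MODIFIED_FORM.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True


def publication_modified_dates(metadata_xml):
    """Return dcterms:modified values that have no refines attribute."""
    root = ET.fromstring(metadata_xml.strip())
    return [
        (meta.text or "").strip()
        for meta in root.iter(OPF + "meta")
        if meta.get("property") == "dcterms:modified" and meta.get("refines") is None
    ]


modified_dates = publication_modified_dates(SAMPLE_METADATA)
is_conformant = len(modified_dates) == 1 and is_valid_modified(modified_dates[0])
```

The second meta element in the sample is not counted because its refines attribute gives it a different subject.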

Note

The requirements for the last modification date are to ensure compatibility with earlier versions of EPUB 3 that defined a release identifier [epubpackages-32] for EPUB publications.

5.6 Manifest section

5.6.1 The manifest element

The manifest element provides an exhaustive list of publication resources used in the rendering of the content.

Element Name:

manifest

Usage:

REQUIRED second child of package, following metadata.

Attributes:

id [optional]

Content Model:

item [1 or more]

EPUB creators MUST list all publication resources in the manifest, regardless of whether they are container resources or remote resources. Moreover, the manifest MUST only list publication resources.

Note that the manifest is not self-referencing: EPUB creators MUST NOT specify an item element that refers to the package document itself.

Note

Failure to provide a complete manifest of resources may lead to rendering issues. Reading systems might not unzip such resources or could prevent access to them for security reasons.

5.6.2 The item element

The item element represents a publication resource.

Element Name:

item

Usage:

As a child of manifest. Repeatable.

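Attributes:

  • fallback [conditionally required]
  • href [required]
  • id [required]
  • media-overlay [optional]
  • media-type [required]
  • properties [optional]

Content Model: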

Empty

Each item element identifies a publication resource by the URL [url] in its href attribute. The value MUST be an absolute- or path-relative-scheme-less-URL string [url]. EPUB creators MUST ensure each URL is unique within the manifest scope after parsing.
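To illustrate why uniqueness is assessed after parsing, the following non-normative sketch resolves each href against a base URL and reports collisions. The base URL, sample manifest, and helper duplicate_hrefs are hypothetical, and Python's urljoin is used as an approximation of URL parsing rather than a full [url] implementation (it does not normalize percent-encoding, for example).

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urljoin

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical base URL standing in for the location of the package document.
PACKAGE_URL = "https://example.invalid/EPUB/package.opf"

# Two of these href values differ as strings but identify the same resource.
SAMPLE_MANIFEST = """
<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  <item id="c1" href="text/chapter1.xhtml" media-type="application/xhtml+xml"/>
  <item id="c1dup" href="./text/../text/chapter1.xhtml" media-type="application/xhtml+xml"/>
</manifest>
"""


def duplicate_hrefs(manifest_xml, base_url=PACKAGE_URL):
    """Return resolved URLs that more than one manifest item points to."""
    root = ET.fromstring(manifest_xml.strip())
    resolved = [
        urljoin(base_url, item.get("href"))
        for item in root.findall(OPF + "item")
    ]
    return sorted(url for url, count in Counter(resolved).items() if count > 1)


duplicates = duplicate_hrefs(SAMPLE_MANIFEST)
```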

The publication resource identified by an item element MUST conform to the applicable specification(s) as inferred from the MIME media type [rfc2046] provided in the media-type attribute. For core media type resources, EPUB creators MUST use the media type designated in 3.2 Core media types.

The fallback attribute specifies the fallback for the referenced publication resource. The fallback attribute's IDREF [xml] value MUST resolve to another item in the manifest.

The fallback for one item MAY specify a fallback to another item, and so on, creating a chain of fallback options. Refer to 3.5.1 Manifest fallbacks for additional requirements related to the use of fallback chains.
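A fallback chain can be walked as in the non-normative sketch below. The sample manifest and the helper fallback_chain are invented for illustration. Rejecting circular chains is an assumption of this sketch; the authoritative requirements are those in 3.5.1 Manifest fallbacks.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical manifest: a PDF falls back to an SVG, which falls back to XHTML.
SAMPLE_MANIFEST = """
<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="page-pdf" href="page.pdf" media-type="application/pdf" fallback="page-svg"/>
  <item id="page-svg" href="page.svg" media-type="image/svg+xml" fallback="page-xhtml"/>
  <item id="page-xhtml" href="page.xhtml" media-type="application/xhtml+xml"/>
</manifest>
"""


def fallback_chain(manifest_xml, item_id):
    """Return the ids visited when following fallback attributes from item_id."""
    root = ET.fromstring(manifest_xml.strip())
    items = {item.get("id"): item for item in root.findall(OPF + "item")}

    chain, current = [], item_id
    while current is not None:
        if current not in items:
            raise ValueError(f"IDREF {current!r} does not resolve to a manifest item")
        if current in chain:
            raise ValueError(f"circular fallback chain at {current!r}")
        chain.append(current)
        current = items[current].get("fallback")  # None ends the chain
    return chain


pdf_chain = fallback_chain(SAMPLE_MANIFEST, "page-pdf")
```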

The media-overlay attribute takes an IDREF [xml] that identifies the media overlay document for the resource described by this item. Refer to 9.3.5 Media overlays packaging for more information.

Note

The order of item elements in the manifest is not significant. The spine element provides the presentation sequence of content documents.

5.6.2.1 Resource properties

The properties attribute provides information to reading systems about the content of a resource. This information enables discovery of key resources, such as the cover image and EPUB navigation document. It also allows reading systems to optimize rendering by indicating, for example, whether the resource contains embedded scripting, MathML, or SVG.

The Manifest Properties Vocabulary is the default vocabulary for the properties attribute.

EPUB creators MUST set the following properties whenever a resource referenced by an item element matches their respective definitions:

  • mathml
  • remote-resources
  • scripted
  • svg
  • switch

These properties do not apply recursively to content included into a resource (e.g., via the [html] iframe element). For example, if a non-scripted XHTML content document embeds a scripted content document, only the embedded document's manifest item properties attribute will have the scripted value.

EPUB creators MUST declare exactly one item as the EPUB navigation document using the nav property.
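The properties attribute holds a space-separated list of values, so checking for the nav property means testing for a whole token rather than a substring. The following non-normative sketch counts the items that carry a given property; the sample manifest and the helper items_with_property are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical manifest with one navigation document and a cover image.
SAMPLE_MANIFEST = """
<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  <item id="cover" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/>
  <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml" properties="scripted mathml"/>
  <item id="c2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
</manifest>
"""


def items_with_property(manifest_xml, name):
    """Return ids of manifest items whose properties attribute contains the token."""
    root = ET.fromstring(manifest_xml.strip())
    return [
        item.get("id")
        for item in root.findall(OPF + "item")
        if name in item.get("properties", "").split()
    ]


nav_items = items_with_property(SAMPLE_MANIFEST, "nav")
has_single_nav = len(nav_items) == 1
```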

If an EPUB publication contains a cover image, EPUB creators are encouraged to identify it with the cover-image property, but setting this property is OPTIONAL.

EPUB creators MAY add terms from other vocabularies as defined in D.1 Vocabulary association mechanisms.

5.6.2.2 Examples

5.6.3 The bindings element (deprecated)

The bindings element defines a set of custom handlers for media types not supported by this specification.

Use of the element is deprecated.

Refer to the bindings element definition in [epubpublications-301] for more information.

5.7 Spine section

5.7.1 The spine element

The spine element defines an ordered list of manifest item references that represent the default reading order.

Element Name:

spine

Usage:

REQUIRED third child of package, following manifest.

Attributes:

  • id [optional]
  • page-progression-direction [optional]
  • toc [optional] (legacy)

Content Model:

itemref [1 or more]

The spine MUST specify at least one EPUB content document or foreign content document.

EPUB creators MUST list in the spine all EPUB and foreign content documents that are hyperlinked to from publication resources in the spine, where hyperlinking encompasses any linking mechanism that requires the user to navigate away from the current resource. Common hyperlinking mechanisms include the href attribute of the [html] a and area elements and scripted links (e.g., using DOM Events and/or form elements). The requirement to list hyperlinked resources applies recursively (i.e., EPUB creators must list all EPUB and foreign content documents hyperlinked to from hyperlinked documents, and so on).

EPUB creators also MUST list in the spine all EPUB and foreign content documents hyperlinked to from the EPUB navigation document, regardless of whether EPUB creators include the EPUB navigation document in the spine.

Note

Resources outside the EPUB container that are only hyperlinked to (e.g., web pages and web-hosted resources) are not publication resources, so they are not subject to the requirement to be listed in the spine.

Publication resources used in the rendering of spine items (e.g., referenced from [html] embedded content) similarly do not have to be included in the spine.

The page-progression-direction attribute sets the global direction in which the content flows. Allowed values are ltr (left-to-right), rtl (right-to-left) and default. When EPUB creators specify the default value, they are expressing no preference and the reading system can choose the rendering direction.

Although the page-progression-direction attribute sets the global flow direction, individual EPUB content documents and parts of EPUB content documents MAY override this setting (e.g., via the writing-mode CSS property). Reading systems may also provide mechanisms to override the default direction (e.g., buttons or settings that allow the application of alternate style sheets).

The legacy toc attribute takes an IDREF [xml] that identifies the manifest item that represents the NCX.

5.7.2 The itemref element

The itemref element identifies an EPUB content document or foreign content document in the default reading order.

Element Name:

itemref

Usage:

As a child of spine. Repeatable.

Attributes:

  • id [optional]
  • idref [required]
  • linear [optional]
  • properties [optional]

Content Model:

Empty

Each itemref element MUST reference the ID [xml] of an item in the manifest via the IDREF [xml] in its idref attribute. The ID of a given item element MUST NOT be referenced by more than one itemref element.
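Both conditions on the idref attribute can be checked together, as in this non-normative sketch. The package fragment (which omits the metadata section for brevity) and the helper spine_idref_problems are invented for illustration.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical package fragment: "c1" is referenced twice and "appendix" is not in the manifest.
SAMPLE_PACKAGE = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="c2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="c1"/>
    <itemref idref="c2"/>
    <itemref idref="c1"/>
    <itemref idref="appendix"/>
  </spine>
</package>
"""


def spine_idref_problems(package_xml):
    """Report itemref elements whose idref is unresolved or repeated."""
    root = ET.fromstring(package_xml.strip())
    manifest_ids = {item.get("id") for item in root.iter(OPF + "item")}

    seen, problems = set(), []
    for itemref in root.iter(OPF + "itemref"):
        idref = itemref.get("idref")
        if idref not in manifest_ids:
            problems.append(f"{idref}: no matching manifest item")
        elif idref in seen:
            problems.append(f"{idref}: referenced more than once")
        seen.add(idref)
    return problems


idref_problems = spine_idref_problems(SAMPLE_PACKAGE)
```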

Each referenced manifest item MUST be either a) an EPUB content document or b) a foreign content document that includes an EPUB content document in its manifest fallback chain.

Note

Although EPUB publications require an EPUB navigation document, it is not mandatory to include it in the spine.

The linear attribute indicates whether the referenced item contains content that contributes to the primary reading order and that reading systems must read sequentially ("yes"), or auxiliary content that enhances or augments the primary content that reading systems can access out of sequence ("no"). Examples of auxiliary content include notes, descriptions, and answer keys.

The linear attribute allows reading systems to distinguish content that a user should access as part of the default reading order from supplementary content which a reading system might, for example, present in a popup window or omit from an aural rendering.

Specifying that content is non-linear does not require reading systems to present it in a specific way, however; it is only a hint to the purpose. Reading systems may present non-linear content where it occurs in the spine, for example, or may skip it until users reach the end of the spine.

Note

EPUB creators should list non-linear content at the end of the spine except when it makes sense for users to encounter it between linear spine items.

A linear itemref element is one whose linear attribute value is explicitly set to "yes" or that omits the attribute — reading systems will assume the value "yes" for itemref elements without the attribute. The spine MUST contain at least one linear itemref element.
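The default value of the linear attribute matters when counting linear items, as the following non-normative sketch shows. The sample spine and the helper linear_itemrefs are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical spine: one itemref omits linear, one sets "yes", one sets "no".
SAMPLE_SPINE = """
<spine xmlns="http://www.idpf.org/2007/opf">
  <itemref idref="cover"/>
  <itemref idref="c1" linear="yes"/>
  <itemref idref="notes" linear="no"/>
</spine>
"""


def linear_itemrefs(spine_xml):
    """Return idrefs of linear itemref elements; a missing attribute counts as "yes"."""
    root = ET.fromstring(spine_xml.strip())
    return [
        itemref.get("idref")
        for itemref in root.findall(OPF + "itemref")
        if itemref.get("linear", "yes") == "yes"
    ]


linear_items = linear_itemrefs(SAMPLE_SPINE)
has_linear_content = len(linear_items) >= 1
```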

EPUB creators MUST provide a means of accessing all non-linear content (e.g., hyperlinks in the content or from the EPUB navigation document).

The Spine Properties Vocabulary is the default vocabulary for the properties attribute.

EPUB creators MAY add terms from other vocabularies as defined in D.1 Vocabulary association mechanisms.

5.8 Collections

5.8.1 The collection element

The collection element defines a related group of resources.

Element Name:

collection

Usage:

OPTIONAL sixth element of package. Repeatable.

Attributes:

  • dir [optional]
  • id [optional]
  • role [required]
  • xml:lang [optional]

Content Model:

In this order: metadata [0 or 1], ( collection [1 or more] or ( collection [0 or more], link [1 or more] ))

The collection element allows EPUB creators to assemble resources into logical groups for a variety of potential uses: enabling reassembly into a meaningful unit of content split across multiple EPUB content documents (e.g., an index split across multiple documents), identifying resources for specialized purposes (e.g., preview content), or collecting together resources that present additional information about the EPUB publication.

EPUB creators MUST identify the role of each collection element in its role attribute, whose value MUST be one or more NMTOKENs [xmlschema-2] and/or absolute-URL-with-fragment strings [url].

The requirements for authoring specialized collections are defined by their respective specifications.

Note

The former IDPF EPUB 3 Working Group maintained both a registry of role extensions and a list of custom extension roles. This Working Group no longer maintains these registries.

5.8.2 Defining collection types (deprecated)

The creation of new collection element roles is now deprecated.

Refer to the collection element definition in [epubpackages-32] for more information about the creation of specialized collections, including the requirements and restrictions on their use.

5.9 Legacy features

5.9.1 Introduction

The package document legacy features are retained from EPUB 2 only to allow EPUB creators to author content that can function, to some degree, in reading systems that only support EPUB 2 publications.

These features were added primarily to address the overlap period as EPUB 3 reading systems were developed, as there was still a high probability at that time that users would be opening EPUB 3 publications on EPUB 2 reading systems.

As reading systems that only handle EPUB 2 publications are now rare, EPUB creators should consider the likelihood of their publications still being opened on these types of older devices before making the effort to add these legacy features.

5.9.2 Support

EPUB creators MAY include the legacy features defined in this section for compatibility purposes with EPUB 2 reading systems.

EPUB 3 reading systems will not use these features when presenting publications to users.

Note

EPUB conformance checkers should not alert EPUB creators about the presence of legacy features in an EPUB publication, as their inclusion is valid for backwards compatibility. EPUB conformance checkers must alert EPUB creators if a legacy feature does not conform to its definition or otherwise breaks a usage requirement.

5.9.3 The meta element

The meta element [opf-201] provides a means of including generic metadata for EPUB 2 reading systems.

Refer to the meta element definition in [opf-201] for more information.

Note

The EPUB 3 meta element, which uses different attributes and requires text content, provides metadata capabilities for EPUB 3 reading systems.

The [opf-201] meta element also allows EPUB creators to identify a cover image for EPUB 2 reading systems. In EPUB 3, the cover image must be identified using the cover-image property on the manifest item for the image.

5.9.4 The guide element

The guide element [opf-201] provides machine-processable navigation to key structures in EPUB 2 reading systems.

Refer to the guide element definition in [opf-201] for more information.

Note

The landmarks nav in the EPUB navigation document provides this functionality in EPUB 3 reading systems.

5.9.5 NCX

The NCX [opf-201] provides a table of contents for EPUB 2 reading systems.

Refer to the NCX definition in [opf-201] for more information.

Note

The EPUB navigation document replaces the NCX for EPUB 3 reading systems.

6. EPUB content documents

6.1 XHTML content documents

6.1.1 Introduction

This section is non-normative.

This section defines a profile of [html] for creating XHTML content documents. An instance of an XML document that conforms to this profile is a core media type resource and is referred to in this specification as an XHTML content document.

6.1.2 XHTML requirements

An XHTML content document:

  • MUST be an [html] document that conforms to the XML syntax.

  • MUST conform to the conformance criteria for all document constructs defined by [html] unless explicitly overridden in 6.1.4 HTML deviations and constraints.

  • MAY include extensions to the [html] grammar as defined in 6.1.3 HTML extensions, and MUST conform to all content conformance constraints defined therein.

Unless specified otherwise, XHTML content documents inherit all definitions of semantics, structure, and processing behaviors from the [html] specification.

Note

The recommendation that EPUB publications follow the accessibility requirements in [epub-a11y-11] applies to XHTML content documents. See Accessibility.

6.1.3 HTML extensions

This section defines EPUB 3 XHTML content document extensions to the underlying [html] document model.

Note

Although [html] allows user agents to support vendor-neutral extensions, unless such extensions are listed in this section, they are not supported features of EPUB 3.

6.1.3.1 Structural semantics

EPUB creators MAY use the epub:type attribute in XHTML content documents to express structural semantics.

The attribute MUST NOT be used on the head element or metadata content [html].

6.1.3.2 RDFa

The [html-rdfa] specification defines a set of attributes that EPUB creators MAY use in XHTML content documents to semantically enrich the content. The use of these attributes MUST conform to the requirements defined in [html-rdfa].

The [html-rdfa] specification defines changes to the [html] content model when authors use RDFa attributes. This modified content model is valid in XHTML content documents.

Note

The listing of RDFa does not express a preference on the part of the Working Group, only that these attributes represent an extension of the HTML grammar. EPUB creators can also specify microdata attributes [html] and linked data [json-ld11] in XHTML content documents as both are natively supported.

6.1.3.3 Content switching (deprecated)

The switch element provides a simple mechanism through which EPUB creators can tailor the content displayed to users, one that is not dependent on the scripting capabilities of the EPUB reading system.

Use of the element is deprecated.

Refer to the switch element definition in [epubcontentdocs-301] for more information.

6.1.3.4 The epub:trigger element (deprecated)

The trigger element enables the creation of markup-defined user interfaces for controlling multimedia objects, such as audio and video playback, in both scripted and non-scripted contexts.

Use of the element is deprecated.

Refer to the epub:trigger element definition in [epubcontentdocs-301] for more information.

6.1.3.5 Custom attributes

XHTML content documents MAY contain custom attributes, which are prefixed [xml-names] attributes whose namespace URL does not include either of the following strings in its domain [url]:

  • w3.org
  • idpf.org

When using custom attributes, the content MUST remain consumable by a user without any information loss or other significant deterioration, regardless of the reading system it is rendered on.
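The namespace restriction on custom attributes can be approximated as in this non-normative sketch. The helper is_custom_attribute_namespace and the example namespace URLs are hypothetical. The sketch tests whether either reserved string occurs anywhere in the host portion of the URL, and it assumes that a namespace with no host (for example, a URN) is not excluded by this rule.

```python
from urllib.parse import urlsplit

# Strings that must not appear in the domain of a custom attribute's namespace URL.
RESERVED_DOMAIN_STRINGS = ("w3.org", "idpf.org")


def is_custom_attribute_namespace(namespace_url):
    """Return True when the namespace URL's domain contains neither reserved string."""
    host = urlsplit(namespace_url).hostname or ""  # hostname is lowercased by urlsplit
    return not any(reserved in host for reserved in RESERVED_DOMAIN_STRINGS)


xhtml_allowed = is_custom_attribute_namespace("http://www.w3.org/1999/xhtml")
ops_allowed = is_custom_attribute_namespace("http://www.idpf.org/2007/ops")
vendor_allowed = is_custom_attribute_namespace("https://reader.example/ns/annotations")
```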

Note

Custom attributes are usually defined in a reading system-specific manner and are not intended for use by other reading systems. Features that multiple independent reading systems can use are better provided by extending this specification.

6.1.4 HTML deviations and constraints

This section defines deviations from, and constraints on, the underlying [html] document model applicable to EPUB 3 XHTML content documents.

6.1.4.1 Embedded MathML

XHTML content documents support embedded [mathml3]. Occurrences of MathML markup MUST conform to the constraints expressed in the MathML specification [mathml3], with the following additional restrictions:

Presentation MathML

The math element MUST contain only Presentation MathML, except within the annotation-xml element.

Content MathML

EPUB creators MAY include Content MathML within MathML markup in XHTML content documents, and, when present, MUST include it within an annotation-xml child element of a semantics element.

When EPUB creators include Content MathML per the previous condition, they MUST set the given annotation-xml element's encoding attribute to either of the functionally-equivalent values MathML-Content or application/mathml-content+xml, and the name attribute to contentequiv.
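The placement and attribute requirements for Content MathML can be sketched non-normatively as below. The sample markup and the helper content_annotation_problems are invented for illustration. The sketch only inspects annotation-xml elements that already declare one of the two Content MathML encoding values; it does not attempt to detect Content MathML placed elsewhere.

```python
import xml.etree.ElementTree as ET

MATHML = "{http://www.w3.org/1998/Math/MathML}"
CONTENT_ENCODINGS = {"MathML-Content", "application/mathml-content+xml"}

# Hypothetical fragment: Presentation MathML with a Content MathML annotation.
SAMPLE_MATH = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
    <annotation-xml encoding="MathML-Content" name="contentequiv">
      <apply><plus/><ci>x</ci><cn>1</cn></apply>
    </annotation-xml>
  </semantics>
</math>
"""


def content_annotation_problems(math_xml):
    """Check annotation-xml elements that declare a Content MathML encoding."""
    root = ET.fromstring(math_xml.strip())
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    problems = []
    for annotation in root.iter(MATHML + "annotation-xml"):
        if annotation.get("encoding") not in CONTENT_ENCODINGS:
            continue  # not declared as Content MathML; outside this sketch
        parent = parent_of.get(annotation)
        if parent is None or parent.tag != MATHML + "semantics":
            problems.append("annotation-xml is not a child of semantics")
        if annotation.get("name") != "contentequiv":
            problems.append("name attribute is not 'contentequiv'")
    return problems


math_problems = content_annotation_problems(SAMPLE_MATH)
```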

This subset eases the implementation burden on reading systems and promotes accessibility, while retaining compatibility with [html] user agents.

Note

The mathml property of the manifest item element indicates that an XHTML content document contains embedded MathML.

6.1.4.2 Embedded SVG

XHTML content documents support the embedding of SVG document fragments [svg] by reference (for example, from an img or object element) and by inclusion (placing the svg element directly in the XHTML content document).

The content conformance constraints for SVG embedded in XHTML content documents are the same as defined for SVG content documents in 6.2.3 Restrictions on SVG.

Note

The svg property of the manifest item element indicates that an XHTML content document contains embedded SVG.

6.1.4.3 Discouraged constructs

This section is non-normative.

6.1.4.3.1 The base element

The [html] base element can be used to specify the document base URL for the purposes of parsing URLs. When using it in an EPUB publication, the interpretation of the base element may inadvertently result in references to remote resources. It may also cause reading systems to misinterpret the location of hyperlinks (e.g., relative links to other documents in the publication might appear as links to a web site if the base element specifies an absolute URL). To avoid significant interoperability issues, EPUB creators should not use the base element.

6.1.4.3.2 The rp element

The [html] rp element is intended to provide a fallback for older reading systems that do not recognize ruby markup (i.e., it displays parentheses around the ruby text). As EPUB 3 reading systems are ruby-aware, and can provide fallbacks, EPUB creators should not use rp elements.

6.1.4.3.3 The embed element

Since the [html] embed element does not include intrinsic facilities to provide fallback content for reading systems that do not support scripting, EPUB creators are discouraged from using the element when the referenced resource includes scripting. The [html] object element is a better alternative, as it includes intrinsic fallback capabilities.

6.2 SVG content documents

Caution

Reading systems may not support all the features of [svg] or support them across all platforms that reading systems run on. When utilizing such features, EPUB creators should consider the inherent risks on interoperability and document longevity.

6.2.1 Introduction

This section is non-normative.

The Scalable Vector Graphics (SVG) specification [svg] defines a format for representing final-form vector graphics and text.

Although EPUB creators typically use XHTML content documents as the top-level document type, the use of SVG content documents is also permitted. EPUB creators will typically only need SVGs for certain special cases, such as when final-form page images are the only suitable representation of the content (e.g., for cover art or in the context of manga or comic books).

This section defines a profile for [svg] documents. An instance of an XML document that conforms to this profile is a core media type resource and is referred to in this specification as an SVG content document.

Note

This section defines conformance requirements for SVG content documents. Refer to 6.1.4.2 Embedded SVG for the conformance requirements for SVG embedded in XHTML content documents.

6.2.2 SVG requirements

An SVG content document:

Note

The recommendation that EPUB publications follow the accessibility requirements in [epub-a11y-11] applies to SVG content documents. See Accessibility.

6.2.3 Restrictions on SVG

This specification restricts the content model of SVG content documents and SVG embedded in XHTML content documents as follows:

Note

Although the [svg] title element allows markup elements, support for this feature is limited. EPUB creators are advised to use text-only titles for maximum interoperability.

6.3 Common resource requirements

This section defines requirements for technologies usable in both XHTML and SVG content documents.

6.3.1 Cascading Style Sheets (CSS)

6.3.1.1 Introduction

This section is non-normative.

CSS is an integral part of the Open Web Platform. Readers, publishers, and document authors expect CSS to "just work," as they expect HTML to just work.

In the past, EPUB defined a profile of CSS that mandated support for certain properties and provided prefixed versions of numerous other properties. Although the CSS Working Group no longer recommends the use of prefixed properties, this specification maintains some prefixed properties to avoid breaking existing content. But with the minor exceptions defined in this section, EPUB defers to the W3C to define CSS.

Note

Keep in mind that some reading systems will not support all desired features of CSS. The following are known to be particularly problematic:

  • Reading system-induced pagination can interact poorly with style sheets as reading systems sometimes paginate using columns. This may result in incorrect values for viewport sizes. Fixed and absolute positioning are particularly problematic.

  • Some types of screens will render animations and transitions poorly (e.g., those with high latency).

6.3.1.2 CSS requirements

A CSS style sheet:

Note

This specification restricts the use of the direction and unicode-bidi properties because reading systems might not implement, or might switch off, CSS processing. EPUB creators must use the following format-specific methods when they need control over these aspects of the rendering:

6.3.1.3 Prefixed properties

Earlier versions of EPUB included prefixed CSS properties, as many CSS features related to world languages were not yet mature. To ensure backwards compatibility for content authored using these prefixes, they have been retained in this specification. Unless otherwise noted, prefixed properties and values behave exactly as their unprefixed equivalents as described in the appropriate CSS specification. The prefixed properties are documented in E. Prefixed CSS properties.

Caution

EPUB creators should use unprefixed properties and reading systems should support current CSS specifications. This specification retains the widely used prefixed properties from [epubcontentdocs-301] but removes support for the less-used ones. EPUB creators should use CSS-native solutions for the removed properties whenever available.

The Working Group recommends that EPUB creators currently using these prefixed properties move to unprefixed versions as soon as support allows, as the Working Group does not anticipate supporting them in the next major version of EPUB.

6.3.2 Scripting

6.3.2.1 Script inclusion

EPUB content documents MAY contain scripting using the facilities defined for this in the respective underlying specifications ([html] and [svg]). When an EPUB content document contains scripting, this specification refers to it as a scripted content document. This label also applies to XHTML content documents that contain [html] form elements.

The scripted property of the manifest item element is used to indicate that an EPUB content document is a scripted content document.

When an [html] script element contains a data block [html], it does not represent scripted content.

Note

[svg] does not define data blocks as of publication, but the same exclusion would apply if a future update adds the concept.

EPUB creators should note that reading systems are required to behave as though a unique origin [html] has been assigned to each EPUB publication. In practice, this means that it is not possible for scripts to share data between EPUB publications.

Which context a script is used in also determines the rights and restrictions that a reading system places on it (refer to Scripting [epub-rs-33] for more information).

Note

Reading systems may render scripted content documents in a manner that disables other EPUB capabilities and/or provides a different rendering and user experience (e.g., by disabling pagination).

6.3.2.2 Scripting contexts

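EPUB 3 defines two contexts for script execution:

  • container-constrained (see 6.3.2.2.1 Container-constrained scripts)
  • spine-level (see 6.3.2.2.2 Spine-level scripts)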

Note

Scripts may execute in other contexts, but reading system support for these contexts is optional. For example, a scripted SVG document may be referenced from an [html] object element.

Refer to the processing of scripts [epub-rs-33] for more information.

Whether EPUB creators embed the code directly in a script element or reference it via the element's src attribute makes no difference to its executing context.

Which context EPUB creators use for their scripts affects both what actions the scripts can perform and the likelihood of support in reading systems, as described in the following subsections.

Note

Refer to H.2 Scripting contexts for an example of the difference between the two contexts.

6.3.2.2.1 Container-constrained scripts

A container-constrained script is either of the following:

A container-constrained script MUST NOT contain instructions for modifying the DOM of the EPUB content document that embeds it (i.e., the one that contains the iframe element). It also MUST NOT contain instructions for manipulating the size of its containing rectangle.

EPUB creators should note that support for container-constrained scripting in reading systems is only recommended in reflowable documents [epub-rs-33]. Furthermore, reading system support in fixed-layout documents is optional.

EPUB creators should ensure container-constrained scripts degrade gracefully in reading systems without scripting support (see 6.3.2.5 Scripting fallbacks).

Note

EPUB creators choosing to restrict the usage of scripting to the container-constrained model will ensure a more consistent user experience between scripted and non-scripted content (e.g., consistent pagination behavior).

6.3.2.2.2 Spine-level scripts

A spine-level script is an instance of the [html] script or [svg] script element contained in a top-level content document.

EPUB creators should note that support for spine-level scripting in reading systems is only recommended in fixed-layout documents and reflowable documents set to scroll [epub-rs-33]. Furthermore, reading system support in all other contexts is optional.

Top-level content documents that include spine-level scripting SHOULD remain consumable by the user without any information loss or other significant deterioration when scripting is disabled or not available (e.g., by employing progressive enhancement techniques or fallbacks). Failing to account for non-scripted environments in top-level content documents can result in EPUB publications being unreadable.

6.3.2.3 Event model

This section is non-normative.

EPUB creators should consider the wide variety of possible reading system implementations when adding scripting functionality to their EPUB publications (e.g., not all devices have physical keyboards, and in many cases a soft keyboard is activated only for text input elements). Consequently, EPUB creators should not rely on keyboard events alone; they should always provide alternative ways to trigger a desired action.

6.3.2.4 Scripting accessibility

EPUB content documents that contain scripting SHOULD employ relevant [wai-aria] accessibility techniques to ensure that the content remains consumable by all users.

6.3.2.5 Scripting fallbacks

EPUB content documents that contain scripting MAY provide fallbacks for such content, either by using intrinsic fallback mechanisms (such as those available for the [html] object and canvas elements) or, when an intrinsic fallback is not applicable, by using a manifest-level fallback.
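A minimal sketch of an intrinsic fallback, assuming a script that draws a chart on a canvas element. The id and image path are hypothetical; the content nested inside the canvas is what a user sees when scripting is unavailable.

```xml
<!-- XHTML fragment; the id and the image path are hypothetical -->
<canvas xmlns="http://www.w3.org/1999/xhtml"
        id="sales-chart" width="600" height="400">
  <!-- Intrinsic fallback: shown when the canvas cannot be drawn by script -->
  <img src="images/sales-chart.png" alt="Bar chart of quarterly sales"/>
</canvas>
```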

EPUB creators MUST ensure that scripts only generate core media type resources or fragments thereof.

7. EPUB navigation document

7.1 Introduction

This section is non-normative.

The EPUB navigation document is a mandatory component of an EPUB publication. It allows EPUB creators to include a human- and machine-readable global navigation layer, thereby ensuring increased usability and accessibility for the user.

The EPUB navigation document is a special type of XHTML content document that defines the table of contents for reading systems. It may also include other specialized navigation elements, such as a page list and a list of key landmarks. These navigation elements have additional restrictions on their content to facilitate their processing.

The EPUB navigation document is not exclusively for machine processing, however. There are no restrictions on the structure or content of the EPUB navigation document outside of the specialized navigation elements (i.e., EPUB creators can mark the rest of the document up like any other XHTML content document). As a result, it can also be part of the linear reading order, avoiding the need for duplicate tables of contents. EPUB creators can hide navigation elements that are only for machine processing (e.g., the page list) with the hidden attribute.

Note that reading systems may strip scripting, styling, and HTML formatting as they generate navigational interfaces from information found in the EPUB navigation document, and this may make the result difficult to read. If EPUB creators require such formatting and functionality, then they should also include the EPUB navigation document in the spine. The use of progressive enhancement techniques for scripting and styling of the navigation document will help ensure the content will retain its integrity when rendered in a non-browser context.

7.2 Navigation document requirements

A valid EPUB navigation document:

7.3 The nav element: restrictions

When a nav element carries the epub:type attribute in an EPUB navigation document, this specification restricts the content model of the element and its descendants as follows:

Content Model:
nav

In this order:

ol

In this order:

  • li [1 or more]

li

In this order:

  • (span or a) [exactly 1]

  • ol [conditionally required]

span and a

In any order:

Note that there are no restrictions on the attributes allowed on these elements.

Refer to the definition below for additional requirements.

The following elaboration of the content model of the nav element explains the purpose and restrictions of the various elements:

Caution

Although the headings and links in nav elements allow any [html] phrasing content, app-based reading systems often only support simple text labels. Because these apps create their own navigation widgets that are not based on HTML rendering, they often cannot retain embedded images and multimedia, MathML, inline styling and other element- and attribute-based rendering instructions. EPUB creators should avoid using these types of elements where their absence may lead to usability issues.

As a conforming XHTML content document, EPUB creators MAY include the EPUB navigation document in the spine.

In the context of this specification, the default display style of list items within nav elements is equivalent to the list-style: none property [csssnapshot]. EPUB creators MAY specify alternative list styling using CSS for rendering of the document in the spine.
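The restrictions in this section can be illustrated with a small navigation document. This is only a sketch: the file names, headings, and labels are hypothetical. It shows an a element as a link label, and a span label that is followed by the nested ol it then requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <head>
    <title>Contents</title>
  </head>
  <body>
    <nav epub:type="toc">
      <h1>Contents</h1>
      <ol>
        <!-- a link, optionally followed by a sublist -->
        <li><a href="chap1.xhtml">Chapter 1</a>
          <ol>
            <li><a href="chap1.xhtml#sec1">Section 1.1</a></li>
          </ol>
        </li>
        <!-- an unlinked span label, which requires a sublist -->
        <li><span>Part II</span>
          <ol>
            <li><a href="chap2.xhtml">Chapter 2</a></li>
          </ol>
        </li>
      </ol>
    </nav>
  </body>
</html>
```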

7.4 The nav element: types

7.4.1 Introduction

This section is non-normative.

The nav elements defined in an EPUB navigation document are distinguished semantically by the value of their epub:type attribute.

This specification defines three types of navigation aid:

toc

Identifies the nav element that contains the table of contents. The toc nav is the only navigation aid that EPUB creators must include in the EPUB navigation document.

page-list

Identifies the nav element that contains a list of pages for a print or other statically paginated source.

landmarks

Identifies the nav element that contains a list of points of interest.

An EPUB navigation document may contain at most one navigation aid for each of these types.

The EPUB navigation document may include additional navigation types. See 7.4.5 Other nav elements for more information.

7.4.2 The toc nav element

The toc nav element defines the primary navigational hierarchy. It conceptually corresponds to a table of contents in a printed work (i.e., it provides navigation to the major structural sections of the publication).

EPUB creators SHOULD order the references in the toc nav element such that they reflect both:

7.4.3 The page-list nav element

The page-list nav element provides navigation to static page boundaries in the content. These boundaries may correspond to a statically paginated source such as print or may be defined exclusively for the EPUB publication.

The page-list nav element is OPTIONAL in EPUB navigation documents and MUST NOT occur more than once.

The page-list nav element SHOULD contain only a single ol descendant (i.e., no nested sublists).

EPUB creators MAY identify the destinations of the page-list references in their respective EPUB content documents using the pagebreak term [epub-ssv-11].
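A possible page-list, sketched with hypothetical file names and ids. The first fragment is the nav element with a single flat ol; the second shows a destination in a content document marked with the pagebreak term.

```xml
<!-- Fragment 1: in the EPUB navigation document (names are hypothetical) -->
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="page-list">
  <h2>Page list</h2>
  <ol>
    <li><a href="chap1.xhtml#page1">1</a></li>
    <li><a href="chap1.xhtml#page2">2</a></li>
  </ol>
</nav>

<!-- Fragment 2: in chap1.xhtml, the destination of the second link -->
<span xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops"
      epub:type="pagebreak" id="page2"></span>
```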

7.4.4 The landmarks nav element

The landmarks nav element identifies fundamental structural components in the content to enable reading systems to provide the user efficient access to them (e.g., through a dedicated button in the user interface).

The landmarks nav element is OPTIONAL in EPUB navigation documents and MUST NOT occur more than once.

The landmarks nav element SHOULD contain only a single ol descendant (i.e., no nested sublists).

The epub:type attribute is REQUIRED on a element descendants of the landmarks nav element. The structural semantics of each link target within the landmarks nav element is determined by the value of this attribute.

The landmarks nav MUST NOT include multiple entries with the same epub:type value that reference the same resource, or fragment thereof.

EPUB creators should limit the number of items they define in the landmarks nav to only items that a reading system is likely to use in its user interface. The element is not meant to repeat the table of contents.

Including the following landmarks, when available, is recommended:

  • bodymatter [epub-ssv-11] — Reading systems often use this landmark to automatically jump users past the front matter when they begin reading.
  • toc [epub-ssv-11] — If the table of contents is available in the spine, reading systems may use this landmark to take users to the document containing it.

Other possibilities for inclusion in the landmarks nav are key reference sections such as indexes and glossaries.

Although the landmarks nav is intended for reading system use, EPUB creators should still ensure that the labels for the landmarks nav are human readable. Reading systems may expose the links directly to users.
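A short landmarks nav might look like the following sketch. The file names and labels are hypothetical; the glossary entry is included only as an example of a key reference section. Each a element carries an epub:type attribute and a human-readable label.

```xml
<!-- Fragment of the EPUB navigation document; names are hypothetical -->
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="landmarks">
  <h2>Guide</h2>
  <ol>
    <li><a epub:type="toc" href="nav.xhtml">Table of Contents</a></li>
    <li><a epub:type="bodymatter" href="chap1.xhtml">Start of Content</a></li>
    <li><a epub:type="glossary" href="glossary.xhtml">Glossary</a></li>
  </ol>
</nav>
```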

7.4.5 Other nav elements

EPUB navigation documents MAY contain one or more nav elements in addition to the toc, page-list, and landmarks nav elements defined in the preceding sections. If these nav elements are intended for reading system processing, they MUST have an epub:type attribute and are subject to the content model restrictions defined in 7.3 The nav element: restrictions.

This specification imposes no restrictions on the semantics of any additional nav elements: they MAY represent navigational semantics for any information domain, and they MAY contain link targets with homogeneous or heterogeneous semantics.
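As one possible additional navigation aid, the sketch below lists the illustrations in a publication. It assumes the loi (list of illustrations) term from [epub-ssv-11]; the file names, ids, and labels are hypothetical. Because it carries an epub:type attribute, it follows the same content model as the other nav elements.

```xml
<!-- Fragment of the EPUB navigation document; names are hypothetical -->
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="loi">
  <h2>List of illustrations</h2>
  <ol>
    <li><a href="chap1.xhtml#fig1">Figure 1: Map of the harbour</a></li>
    <li><a href="chap2.xhtml#fig2">Figure 2: Floor plan</a></li>
  </ol>
</nav>
```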

7.5 Using in the spine

This section is non-normative.

Although it is possible to reuse the EPUB navigation document in the spine, it is often the case that not all of the navigation structures, or branches within them, are needed. EPUB creators will often want to hide the page list and landmarks navigation elements or trim the branches of the table of contents for books that have many levels of subsections.

While the display property [csssnapshot] controls the visual rendering of EPUB navigation documents in reading systems with viewports, reading systems without viewports may not support CSS. To better ensure the proper rendering in these reading systems, EPUB creators should use the [html] hidden attribute to indicate which (if any) portions of the navigation data are excluded from rendering in the content flow.

The hidden attribute has no effect on how reading systems render the navigation data outside of the content flow (such as in dedicated navigation user interfaces provided by reading systems).

Note

The hidden attribute can be used together with the display property to maximize interoperability across all reading systems.
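The following sketch shows one way to apply the hidden attribute when the navigation document is also in the spine. The file names are hypothetical. The table of contents stays visible, a deeper branch is trimmed from the content flow, and the page list is hidden entirely; reading systems can still use all of the data in their own navigation interfaces.

```xml
<!-- Body of a navigation document reused in the spine; names are hypothetical -->
<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <nav epub:type="toc">
    <h1>Contents</h1>
    <ol>
      <li><a href="chap1.xhtml">Chapter 1</a>
        <!-- Subsections excluded from rendering in the content flow -->
        <ol hidden="hidden">
          <li><a href="chap1.xhtml#sec1">Section 1.1</a></li>
        </ol>
      </li>
    </ol>
  </nav>
  <!-- Page list intended for machine processing only -->
  <nav epub:type="page-list" hidden="hidden">
    <ol>
      <li><a href="chap1.xhtml#page1">1</a></li>
    </ol>
  </nav>
</body>
```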

8. Layout rendering control

8.1 Introduction

This section is non-normative.

Not all rendering information can be expressed through the underlying technologies that EPUB is built upon. For example, although HTML with CSS provides powerful layout capabilities, those capabilities are limited to the scope of the document being rendered.

This section defines properties that allow EPUB creators to express package-level rendering intentions (i.e., functionality that can only be implemented by the EPUB reading system). If a reading system supports the desired rendering, these properties enable the user to be presented with the content as the EPUB creator optimally designed it.

8.2 Fixed layouts

8.2.1 Introduction

This section is non-normative.

EPUB publications, unlike print books or PDF files, are designed to change. The content flows, or reflows, to fit the screen and to fit the needs of the user. As noted in Rendering and CSS, "content presentation adapts to the user, rather than the user having to adapt to a particular presentation of content." [epub-overview-33]

But this principle does not work for all types of documents. Sometimes content and design are so intertwined it is not possible to separate them. Any change in appearance risks changing the meaning or losing all meaning. Fixed-layout documents give EPUB creators greater control over presentation when a reflowable EPUB is not suitable for the content.

EPUB creators define fixed layouts using a set of package document properties to control the rendering in reading systems. In addition, they set the dimensions of each fixed-layout document in its respective EPUB content document.

Note

EPUB 3 affords multiple mechanisms for representing fixed-layout content. When fixed-layout content is necessary, the EPUB creator's choice of mechanism will depend on many factors including desired degree of precision, file size, accessibility, etc. This section does not attempt to dictate the EPUB creator's choice of mechanism.

8.2.2 Fixed-layout package settings

8.2.2.1 Layout

The rendition:layout property specifies whether the content is reflowable or pre-paginated.

When the rendition:layout property is specified on a meta element, it indicates that the paginated or reflowable layout style applies globally (i.e., for all spine items).

EPUB creators MUST use one of the following values with the rendition:layout property:

reflowable

The content is not pre-paginated (i.e., reading systems apply dynamic pagination when rendering). Default value.

pre-paginated

The content is pre-paginated (i.e., reading systems produce exactly one page per spine itemref when rendering).

Note

Reading systems typically restrict or deny the application of user or user agent style sheets to pre-paginated documents because dynamic style changes are likely to have unintended consequences on the intrinsic properties of such documents. EPUB creators should consider the negative impact on usability and accessibility that these restrictions have when choosing to use pre-paginated instead of reflowable content. Refer to Guideline 1.4 - Provide text configuration [uaag20] for related information.

When the property is set to pre-paginated for a spine item, its content dimensions MUST be set as defined in 8.2.2.6 Content document dimensions.

EPUB creators MUST NOT declare the rendition:layout property more than once.

They also MUST NOT declare the property using the refines attribute. Refer to 8.2.2.1.1 Layout overrides for setting the property for individual EPUB content documents.

8.2.2.1.1 Layout overrides

EPUB creators MAY specify the following properties locally on spine itemref elements to override the global value for the given spine item:

rendition:layout-pre-paginated
Specifies that the given spine item is pre-paginated.
rendition:layout-reflowable
Specifies that the given spine item is reflowable.

EPUB creators MUST NOT use more than one of these overrides on any given spine item.
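A minimal package document sketch of a global layout setting with one local override. The idref values are hypothetical, and the required metadata and the manifest are omitted for brevity.

```xml
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata>
    <!-- required metadata omitted -->
    <!-- Global setting: every spine item is pre-paginated by default -->
    <meta property="rendition:layout">pre-paginated</meta>
  </metadata>
  <!-- manifest omitted -->
  <spine>
    <itemref idref="cover"/>
    <itemref idref="page1"/>
    <!-- Local override: this one item is reflowable -->
    <itemref idref="credits" properties="rendition:layout-reflowable"/>
  </spine>
</package>
```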

8.2.2.2 Orientation

The rendition:orientation property specifies which orientation the EPUB creator intends the content to be rendered in.

When the rendition:orientation property is specified on a meta element, it indicates that the intended orientation applies globally (i.e., for all spine items).

EPUB creators MUST use one of the following values with the rendition:orientation property:

landscape

Reading systems should render the content in landscape orientation.

portrait

Reading systems should render the content in portrait orientation.

auto

The content is not orientation constrained. Default value.

EPUB creators MUST NOT declare the rendition:orientation property more than once.

They also MUST NOT declare the property using the refines attribute. Refer to 8.2.2.2.1 Orientation overrides for setting the property for individual EPUB content documents.

8.2.2.2.1 Orientation overrides

EPUB creators MAY specify the following properties locally on spine itemref elements to override the global value for the given spine item:

rendition:orientation-auto
Specifies that the reading system determines the orientation to render the spine item in.
rendition:orientation-landscape
Specifies that reading systems should render the given spine item in landscape orientation.
rendition:orientation-portrait
Specifies that reading systems should render the given spine item in portrait orientation.

EPUB creators MUST NOT use more than one of these overrides on any given spine item.
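The orientation property follows the same pattern of one global declaration plus optional overrides. The sketch below uses hypothetical idref values and omits the required metadata and the manifest.

```xml
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata>
    <!-- required metadata omitted -->
    <!-- Global intent: render the content in landscape orientation -->
    <meta property="rendition:orientation">landscape</meta>
  </metadata>
  <!-- manifest omitted -->
  <spine>
    <itemref idref="spread1"/>
    <!-- Local override: this item is intended for portrait orientation -->
    <itemref idref="poster" properties="rendition:orientation-portrait"/>
  </spine>
</package>
```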

8.2.2.3 Synthetic spreads

The rendition:spread property specifies the intended reading system synthetic spread behavior.

When the rendition:spread property is specified on a meta element, it indicates that the intended synthetic spread behavior applies globally (i.e., for all spine items).

EPUB creators MUST use one of the following values with the rendition:spread property:

none

Do not incorporate spine items in a synthetic spread. Reading systems should display the items in a single viewport positioned at the center of the screen.

landscape

Render a synthetic spread for spine items only when the device is in landscape orientation.

portrait (deprecated)

The use of spreads only in portrait orientation is deprecated.

EPUB creators should use the value "both" instead, as spreads that are readable in portrait orientation are also readable in landscape.

both

Render a synthetic spread regardless of device orientation.

auto

The EPUB creator is not defining an explicit synthetic spread behavior. Default value.

EPUB creators MUST NOT declare the rendition:spread property more than once.

They also MUST NOT declare the property using the refines attribute. Refer to 8.2.2.3.1 Synthetic spread overrides for setting the property for individual EPUB content documents.

Note

When synthetic spreads are used in the context of XHTML and SVG content documents, the dimensions given via the viewport meta element and the viewBox attribute, respectively, represent the size of one page in the spread.

Note

Refer to the spine element for information about declaration of global flow directionality using the page-progression-direction attribute and that of local page-progression-direction within content documents.

8.2.2.3.1 Synthetic spread overrides

EPUB creators MAY specify the following properties locally on spine itemref elements to override the global value for the given spine item:

rendition:spread-auto
Specifies the reading system determines when to render a synthetic spread for the spine item.
rendition:spread-both
Specifies the reading system should render a synthetic spread for the spine item in both portrait and landscape orientations.
rendition:spread-landscape
Specifies the reading system should render a synthetic spread for the spine item only when in landscape orientation.
rendition:spread-none
Specifies the reading system should not render a synthetic spread for the spine item.
rendition:spread-portrait

The rendition:spread-portrait property is deprecated.

Refer to the spread-portrait property definition in [epubpublications-301] for more information.

EPUB creators MUST NOT use more than one of these overrides on any given spine item.
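As a sketch of synthetic spread settings, the package document fragment below requests spreads in landscape orientation globally and disables them for a single item. The idref values are hypothetical; the required metadata and the manifest are omitted.

```xml
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata>
    <!-- required metadata omitted -->
    <!-- Global intent: synthetic spreads only in landscape orientation -->
    <meta property="rendition:spread">landscape</meta>
  </metadata>
  <!-- manifest omitted -->
  <spine>
    <itemref idref="page1"/>
    <itemref idref="page2"/>
    <!-- Local override: never place this item in a synthetic spread -->
    <itemref idref="foldout" properties="rendition:spread-none"/>
  </spine>
</package>
```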

8.2.2.4 Spread placement

When a reading system renders a synthetic spread, the default behavior is to populate the spread by rendering the next EPUB content document in the next available unpopulated viewport, where the next available viewport is determined by the given page progression direction or by local declarations within EPUB content documents. An EPUB creator MAY override this automatic population behavior and force reading systems to place a document in a particular viewport by specifying one of the following properties on its spine itemref element:

rendition:page-spread-center
The rendition:page-spread-center property is an alias of the rendition:spread-none property for centering a spine item.
rendition:page-spread-left
The rendition:page-spread-left property is an alias of the page-spread-left property for placing a spine item in the left-hand slot of a two-page spread.
rendition:page-spread-right
The rendition:page-spread-right property is an alias of the page-spread-right property for placing a spine item in the right-hand slot of a two-page spread.

The rendition:page-spread-center, rendition:page-spread-left, and rendition:page-spread-right properties apply to both pre-paginated and reflowable content. They only apply when the reading system is creating synthetic spreads.

Although EPUB creators often indicate that reading systems should use a spread in certain device orientations, the content itself does not necessarily represent true spreads (i.e., two consecutive pages that reading systems must render side-by-side for readability, such as a two-page map). To indicate that two consecutive pages represent a true spread, EPUB creators SHOULD use the rendition:page-spread-left and rendition:page-spread-right properties on the spine items for the two adjacent EPUB content documents, and omit the properties on spine items where one-up or two-up presentation is equally acceptable.

EPUB creators MUST NOT declare more than one rendition:page-spread-* property, and/or their unprefixed equivalents, on any given spine item (e.g., it is not valid to specify both "rendition:page-spread-left page-spread-left" in case reading systems only support one of the properties).

Note

The rendition:page-spread-left and rendition:page-spread-right properties were created to allow the use of a single vocabulary for all fixed-layout properties. EPUB creators can use either property set, but older reading systems might only recognize the unprefixed versions.

The rendition:page-spread-center property was created to make it easier for EPUB creators to understand the process of switching between two-page spreads and single centered pages. EPUB creators can use either rendition:page-spread-center or rendition:spread-none to disable spread behavior in reading systems.
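A sketch of spread placement for a true spread, such as a two-page map, assuming a left-to-right page progression. The idref values are hypothetical. Only the two map pages carry placement properties; the title page is centered, and the ordinary page is left to the default population behavior.

```xml
<spine xmlns="http://www.idpf.org/2007/opf">
  <!-- Single centered page -->
  <itemref idref="title-page" properties="rendition:page-spread-center"/>
  <!-- True spread: the two halves of the map must appear side by side -->
  <itemref idref="map-left" properties="rendition:page-spread-left"/>
  <itemref idref="map-right" properties="rendition:page-spread-right"/>
  <!-- No placement property: one-up or two-up is equally acceptable -->
  <itemref idref="page4"/>
</spine>
```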

8.2.2.5 Viewport dimensions (deprecated)

The rendition:viewport property allows EPUB creators to express the CSS initial containing block (ICB) [css2] for XHTML and SVG content documents whose rendition:layout property has been set to pre-paginated.

Use of the property is deprecated.

Refer to the rendition:viewport property definition in [epubpublications-301] for more information.

8.2.2.6 Content document dimensions

This section defines rules for the expression and interpretation of dimensional properties of fixed-layout documents.

Fixed-layout documents specify their initial containing block [css2] in the manner applicable to their format:

Expressing in XHTML

For XHTML fixed-layout documents, the initial containing block [css2] is obtained from the REQUIRED height and width definitions in a viewport meta tag, where:

  • the height property MUST have as its value a positive number or the keyword device-height; and
  • the width property MUST have as its value a positive number or the keyword device-width.

The device-width and device-height values refer to 100% of the width and height, respectively, of the reading system's viewport.

EPUB creators MUST NOT specify duplicate height or width definitions either within a single viewport meta tag or by specifying multiple viewport meta tags.

Expressing in SVG

For SVG fixed-layout documents, the initial containing block [css2] dimensions MUST be expressed using the viewBox attribute [svg].

Note

The initial containing block definition affects only the document where it is defined. The dimensions of the containing blocks in the other content documents within the same publication may be different.
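The two ways of expressing dimensions can be sketched as follows. The 1200 by 1600 size and the titles are hypothetical values chosen for illustration. The first fragment is the head of an XHTML fixed-layout document; the second is the root of an SVG fixed-layout document.

```xml
<!-- Fragment 1: XHTML fixed-layout document, one viewport meta tag
     with exactly one width and one height definition -->
<head xmlns="http://www.w3.org/1999/xhtml">
  <title>Page 1</title>
  <meta name="viewport" content="width=1200, height=1600"/>
</head>

<!-- Fragment 2: SVG fixed-layout document, dimensions given by viewBox -->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 1600">
  <title>Page 2</title>
</svg>
```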

8.3 Reflowable layouts

Although control over the rendering of EPUB content documents to create fixed layouts is an obvious need not handled by other technologies, there are also considerations for reflowable content that are unique to EPUB publications (e.g., how to handle the flow of content in the viewport). This section defines properties that allow EPUB creators to control presentation aspects of reflowable content.

8.3.1 The rendition:flow property

The rendition:flow property specifies the EPUB creator's preference for how reading systems should handle content overflow.

When the rendition:flow property is specified on a meta element, it indicates the EPUB creator's global preference for overflow content handling (i.e., for all spine items). EPUB creators MAY indicate a preference for dynamic pagination or scrolling. For scrolled content, it is also possible to specify whether consecutive EPUB content documents are to be rendered as a continuous scrolling view or whether each is to be rendered separately (i.e., with a dynamic page break between each).

EPUB creators MUST use one of the following values with the rendition:flow property:

paginated

Dynamically paginate all overflow content.

scrolled-continuous

Render all EPUB content documents such that overflow content is scrollable, and the EPUB publication is presented as one continuous scroll from spine item to spine item (except where locally overridden).

Note that EPUB creators SHOULD NOT create publications in which different resources have different block flow directions, as continuous scrolled rendition in EPUB reading systems would be problematic.

scrolled-doc

Render all EPUB content documents such that overflow content is scrollable, and each spine item is presented as a separate scrollable document.

auto

Render overflow content using the reading system default method or a user preference, whichever is applicable. Default value.

Note that when two reflowable EPUB content documents occur sequentially in the spine, the default rendering for their [html] body elements is consistent with the page-break-before property [csssnapshot] having been set to always. In addition to using the rendition:flow property, EPUB creators MAY override this behavior through an appropriate style sheet declaration, if the reading system supports such overrides.

EPUB creators MUST NOT declare the rendition:flow property more than once.

They also MUST NOT declare the property using the refines attribute. Refer to 8.3.1.1 Spine overrides for setting the property for individual EPUB content documents.

Figure 6 Rendering of an EPUB publication with a single spine item, and with the rendition:flow set to paginated.
The continuous progression of paginated content produced for a single document.
Image description

Three column-like rectangles linked left-to-middle and middle-to-right with respective arrows, with a text flowing from one rectangle to the next one. The text is sectioned with headers figuring 'Chapter 1', '2', and '3'. The leftmost rectangle is enclosed in a schematic view of a tablet.

Figure 7 Rendering of an EPUB publication with multiple spine items, and with the rendition:flow set to paginated.
The continuous progression of paginated content produced for each document with transitions to new pages between documents.
Image description

Three column-like rectangles linked left-to-middle and middle-to-right with respective arrows, with a text flowing from one rectangle to the next one. The text is sectioned with headers figuring 'Chapter 1', '2'. The section with 'Chapter 2' starts at the top of the rightmost rectangle, leaving an empty space at the bottom of the middle rectangle. The leftmost rectangle is enclosed in a schematic view of a tablet.

Figure 8 Rendering of an EPUB publication with a single spine item, and with the rendition:flow set to scrolled-continuous.
The progression of a continuous scroll of content extends vertically off the user's screen, with new documents added to the bottom as encountered.
Image description

A single, column-like strip (i.e., a rectangle without a bottom edge) with a text flowing down the strip. The text is sectioned with headers figuring 'Chapter 1', '2'. The top part of the strip is enclosed in a schematic view of a tablet.

Figure 9 Rendering of an EPUB publication with multiple spine items, and with the rendition:flow set to scrolled-doc.
The progression of scrollable documents depicting how only the content within each document is scrollable.
Image description

Three column-like strips (i.e., rectangles without bottom edges) linked left-to-middle and middle-to-right with respective arrows, each containing a text flowing down the strip. The text is sectioned with headers figuring 'Chapter 1', '2' and '3'. Each strip starts with a chapter header and flows down the strip. The top part of the leftmost strip is enclosed in a schematic view of a tablet.

8.3.1.1 Spine overrides

EPUB creators MAY specify the following properties locally on spine itemref elements to override the global value for the given spine item:

rendition:flow-auto
Indicates no preference for overflow content handling by the EPUB creator.
rendition:flow-paginated
Indicates the EPUB creator preference is to dynamically paginate content overflow.
rendition:flow-scrolled-continuous
Indicates the EPUB creator preference is to provide a scrolled view for overflow content, and that consecutive spine items with this property are to be rendered as a continuous scroll.
rendition:flow-scrolled-doc
Indicates the EPUB creator preference is to provide a scrolled view for overflow content, and each spine item with this property is to be rendered as a separate scrollable document.

EPUB creators MUST NOT use more than one of these overrides on any given spine item.
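A sketch of a flow preference with a spine override, using hypothetical idref values and omitting the required metadata and the manifest. The publication prefers one continuous scroll, while a single item asks for dynamic pagination.

```xml
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata>
    <!-- required metadata omitted -->
    <!-- Global preference: one continuous scroll across spine items -->
    <meta property="rendition:flow">scrolled-continuous</meta>
  </metadata>
  <!-- manifest omitted -->
  <spine>
    <itemref idref="chap1"/>
    <itemref idref="chap2"/>
    <!-- Local override: paginate this item instead of scrolling it -->
    <itemref idref="appendix" properties="rendition:flow-paginated"/>
  </spine>
</package>
```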

8.3.2 The rendition:align-x-center property

The rendition:align-x-center property specifies that the given spine item should be centered horizontally in the viewport or spread.

The property MUST NOT be set globally for all EPUB content documents (i.e., in a meta element without a refines attribute). It is only available as a spine override for individual EPUB content documents via the itemref element's properties attribute.

Note

This property was developed primarily to handle "Naka-Tobira (中扉)" (sectional title pages), in the absence of reliable centering control within the content rendering. As support for paged media evolves in CSS, however, this property is expected to be deprecated. EPUB creators are encouraged to use CSS solutions when effective.
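Because the property is only available as a spine override, a declaration might look like the sketch below. The idref values are hypothetical.

```xml
<spine xmlns="http://www.idpf.org/2007/opf">
  <itemref idref="chap1"/>
  <!-- Sectional title page centered horizontally in the viewport or spread -->
  <itemref idref="part2-title" properties="rendition:align-x-center"/>
  <itemref idref="chap2"/>
</spine>
```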

9. Media overlays

9.1 Introduction

This section is non-normative.

Mainstream ebooks, educational tools and ebooks formatted for persons with print disabilities are some examples of works that contain synchronized audio narration. In EPUB 3, EPUB creators can create these types of books using media overlay documents to describe the timing for the pre-recorded audio narration and how it relates to the EPUB content document markup. The specification defines the file format for media overlays as a subset of [smil3], a W3C recommendation for representing synchronized multimedia information in XML.

The text and audio synchronization enabled by media overlays provides enhanced accessibility for any user who has difficulty following the text of a traditional book. Media overlays also provide a continuous listening experience for readers who are unable to read the text for any reason, something that traditional audio embedding techniques cannot offer. They are even useful for purposes not traditionally considered accessibility concerns (e.g., for language learning).

The media overlays feature is transparent to EPUB reading systems that do not support the feature. The inclusion of media overlays in an EPUB publication has no impact on the ability of media overlay-unaware reading systems to render the EPUB publication as though the media overlays are not present.

Media overlays in EPUB are not an equivalent to audiobooks, as audiobooks are primarily audio-based with text occasionally provided as an alternate format. The W3C [audiobooks] recommendation is for building audio publications.

Although future versions of this specification might incorporate support for video media (e.g., synchronized text/sign-language books), this version supports only synchronizing audio media with the EPUB content document.

9.2 Media overlay documents

9.2.1 Media overlay document requirements

A media overlay document:

9.2.2 Media overlay document definition

All elements [xml] defined in this section are in the http://www.w3.org/ns/SMIL namespace [xml-names] unless otherwise specified.

9.2.2.1 The smil element

The smil element encapsulates all the information in a media overlay document.

Element Name:

smil

Usage:

REQUIRED root element [xml] of the media overlay document.

Attributes:
version [required]

Specifies the version number of the [smil3] specification to which the media overlay document adheres.

This attribute MUST have the value "3.0".

id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

epub:prefix [optional]

Declares additional metadata vocabulary prefixes.

Refer to 9.3.3 Structural semantics in overlays for more information.

Content Model:

In this order:

9.2.2.2 The head element

The head element is the container for metadata in the media overlay document.

Element Name:

head

Usage:

The head element is the OPTIONAL first child of the smil element.

Attributes:

None

Content Model:

metadata [0 or 1]

As this specification does not define any metadata properties that must occur in the media overlay document, the head element is OPTIONAL.

9.2.2.3 The metadata element

The metadata element represents metadata for the media overlay document. The metadata element is an extension point that allows the inclusion of metadata from any metainformation structuring language.

Element Name:

metadata

Usage:

As a child of the head element.

Attributes:

None

Content Model:

[0 or more] elements from any namespace

This specification does not require any metadata properties in the media overlay document; the metadata element is provided for custom metadata requirements.

9.2.2.4 The body element

The body element is the starting point for the presentation contained in the media overlay document. It contains the main sequence of par and seq elements.

Element Name:

body

Usage:

The body element is a REQUIRED child of the smil element. It follows the head element, when that element is present.

Attributes:
epub:type [optional]

An expression of the structural semantics of the corresponding element in the EPUB content document.

The value is a white space separated list of property types. Refer to 9.3.3 Structural semantics in overlays for more information.

id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

epub:textref [optional]

Refers to the associated EPUB content document and, optionally, identifies a specific part of it.

The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string.

Content Model:

In any order:

  • seq [0 or more]

  • par [0 or more]

MUST include at least one par or seq.

9.2.2.5 The seq element

The seq element is a sequential time container for media objects and/or child time containers.

Element Name:

seq

Usage:

One or more seq elements MAY occur as children of the body element and of the seq element.

Attributes:
epub:type [optional]

An expression of the structural semantics of the corresponding element in the EPUB content document.

The value is a white space separated list of property types. Refer to 9.3.3 Structural semantics in overlays for more information.

id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

epub:textref [required]

Refers to the associated EPUB content document and, optionally, identifies a specific part of it.

The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string.

Refer to 9.3.2.1 Overlay structure for more information.

Content Model:

In any order:

  • seq [0 or more]

  • par [0 or more]

MUST include at least one par or seq.

9.2.2.6 The par element

The par element is a parallel time container for media objects.

Element Name:

par

Usage:

One or more par elements MAY occur as children of the body and seq elements.

Attributes:
epub:type [optional]

An expression of the structural semantics of the corresponding element in the EPUB content document.

The value is a white space separated list of property types. Refer to 9.3.3 Structural semantics in overlays for more information.

id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

Content Model:

In any order:

  • text [exactly 1]

  • audio [0 or 1]

9.2.2.7 The text element

The text element references an element in an EPUB content document. A text element typically refers to a textual element but can also refer to other EPUB content document media elements. In the absence of a sibling audio element, textual content referred to by this element may be rendered via text-to-speech.

Element Name:

text

Usage:

As a REQUIRED child of the par element.

Attributes:
src [required]

Refers to the associated EPUB content document and, optionally, identifies a specific part of it.

The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#) and a URL-fragment string.

id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

Content Model:

Empty

Note

This specification places no restriction on the src attribute of a text element. EPUB creators should, however, refer to content that can be styled with CSS to make the association with style information effective (i.e., palpable content for XHTML or paths, basic shapes, or text elements in SVG).

Note

[epub-rs-33] no longer provides guidance for reading systems on the playback of timed media (i.e., the automatic starting of the referenced media). Although the src attribute of a text element may refer to embedded timed media (e.g., via an [html] video element), referencing such media may have unpredictable results.

9.2.2.8 The audio element

The audio element represents a clip of audio media.

Element Name:

audio

Usage:

An OPTIONAL child of the par element.

Attributes:
id [optional]

The ID [xml] of the element, which MUST be unique within the document scope.

src [required]

The relative- or absolute-URL string [url] reference to an audio file. The audio file MUST be one of the audio formats listed in the core media type resources table.

clipBegin [optional]

A clock value that specifies the offset into the physical media corresponding to the start point of an audio clip.

MUST be a [smil3] clock value.

See H.4 Clock values.

clipEnd [optional]

A clock value that specifies the offset into the physical media corresponding to the end point of an audio clip.

MUST be a [smil3] clock value.

See H.4 Clock values.

The chronological offset of the terminating position MUST be after the starting offset specified in the clipBegin attribute.

Content Model:

Empty
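Example (non-normative): the following Python sketch shows one way to convert clipBegin and clipEnd values to seconds and to check that the end of a clip falls after its start. The regular expressions are a simplified reading of the [smil3] clock value forms (full clock, partial clock, and timecount), and the function names are hypothetical; neither is defined by this specification.

```python
import re

# Simplified patterns for the three SMIL clock value forms (an assumption:
# edge cases of the full SMIL grammar are not covered here).
_FULL = re.compile(r"^(\d+):([0-5]\d):([0-5]\d(?:\.\d+)?)$")      # h:mm:ss(.fff)
_PARTIAL = re.compile(r"^([0-5]\d):([0-5]\d(?:\.\d+)?)$")         # mm:ss(.fff)
_TIMECOUNT = re.compile(r"^(\d+(?:\.\d+)?)(h|min|s|ms)?$")        # 12.5, 30s, 2min
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001, None: 1.0}


def parse_clock_value(value):
    """Return the clock value as a number of seconds."""
    text = value.strip()
    match = _FULL.match(text)
    if match:
        hours, minutes, seconds = match.groups()
        return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    match = _PARTIAL.match(text)
    if match:
        minutes, seconds = match.groups()
        return int(minutes) * 60 + float(seconds)
    match = _TIMECOUNT.match(text)
    if match:
        count, unit = match.groups()
        return float(count) * _UNIT_SECONDS[unit]  # no unit means seconds
    raise ValueError(f"not a recognized clock value: {value!r}")


def clip_is_valid(clip_begin, clip_end):
    """True when clipEnd falls strictly after clipBegin."""
    return parse_clock_value(clip_end) > parse_clock_value(clip_begin)
```

A caller might use clip_is_valid("0:00:05.250", "0:00:09.800") before accepting an audio element; values the sketch does not recognize raise ValueError rather than being silently treated as zero.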

9.3 Creating media overlays

9.3.1 Introduction

This section is non-normative.

EPUB creators can represent a pre-recorded narration of a publication as a series of audio clips, each corresponding to part of an EPUB content document. A single audio clip, for example, typically represents a single phrase or paragraph, but implies no order relative to the other clips or to the text of a document. Media overlays solve this problem of synchronization by tying the structured audio narration to its corresponding text (or other media) in the EPUB content document using [smil3] markup. Media overlays are, in fact, a simplified subset of SMIL 3.0 that defines the playback sequence of these clips.

The SMIL elements primarily used for structuring media overlays are body (used for the main sequence), seq (sequence) and par (parallel). (Refer to 9.2.2 Media overlay document definition for more information on these and other SMIL elements.)

The par element is the basic building block of a media overlay and corresponds to a phrase in the EPUB content document. The element provides two key pieces of information for synchronizing content: 1) the audio clip containing the narration for the phrase; and 2) a pointer to the associated EPUB content document fragment. The par element uses two media element children to represent this information: an audio element and a text element. Because par elements' media object children are timed in parallel, reading systems render the audio clip and EPUB content document fragment at the same time, resulting in a synchronized presentation.

The text element src attribute references the associated phrase, sentence, or other segment of the EPUB content document by its URL [url] reference. The audio element src attribute similarly references the location of the corresponding audio clip and adds the OPTIONAL clipBegin and clipEnd attributes to indicate a specific offset within the clip.

EPUB creators place par elements together sequentially to form a series of phrases or sentences. Not every element of the EPUB content document will have a corresponding par element in a media overlay document, only those relevant to the audio narration.

EPUB creators can also add par elements to seq elements to define more complex structures such as parts and chapters (see 9.3.2.1 Overlay structure).
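Example (non-normative): the sketch below embeds a small media overlay document as a string and lists the text and audio references of each par element in playback order. The file names, IDs, and clip times are invented for illustration, and the helper list_phrases is hypothetical. The namespace URI is the one defined by SMIL 3.0.

```python
import xml.etree.ElementTree as ET

SMIL = "{http://www.w3.org/ns/SMIL}"

# Hypothetical overlay: file names, IDs, and clip times are invented.
OVERLAY = """\
<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
  <body>
    <par id="par1">
      <text src="chapter1.xhtml#sentence1"/>
      <audio src="chapter1_audio.mp3" clipBegin="0:00:00.000" clipEnd="0:00:05.250"/>
    </par>
    <par id="par2">
      <text src="chapter1.xhtml#sentence2"/>
      <audio src="chapter1_audio.mp3" clipBegin="0:00:05.250" clipEnd="0:00:09.800"/>
    </par>
    <par id="par3">
      <text src="chapter1.xhtml#sentence3"/>
    </par>
  </body>
</smil>
"""


def list_phrases(smil_source):
    """Return (text src, audio src or None) for each par, in document order."""
    root = ET.fromstring(smil_source)
    phrases = []
    for par in root.iter(SMIL + "par"):
        text = par.find(SMIL + "text")    # required child of par
        audio = par.find(SMIL + "audio")  # optional child of par
        phrases.append((text.get("src"), audio.get("src") if audio is not None else None))
    return phrases
```

The third par has no audio child, so its entry carries None in place of an audio reference.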

9.3.2 Relationship to the EPUB content document

Note

In this section, the EPUB content document is assumed to be an XHTML content document. While EPUB creators may use media overlays with SVG content documents, playback behavior might not be consistent and therefore interoperability is not guaranteed.

9.3.2.1 Overlay structure

The body of a media overlay document consists of two elements: the par element and the seq element. The ordering of these elements represents how reading systems render the content in the corresponding EPUB content documents during playback.

The par element represents a segment of content, such as a word, phrase, sentence, table cell, list item, image, or other identifiable piece of content in the markup. Each element identifies both the content to display (in the text element) and audio to synchronize (in the audio element) during playback.

The seq element represents sequences — sets of seq and/or par elements that together represent a logical component of the content. EPUB creators can use it to represent nested containers such as sections, asides, headers, tables, lists, and footnotes. It allows EPUB creators to retain the structure inherent in these containers in the media overlay document.

The seq element MUST contain an epub:textref attribute. As seq elements do not provide synchronization instructions, this attribute allows a reading system to match the fragment to a location in the text.

Note

The reason for grouping structures like sections, figures, tables, and footnotes in a seq element is so that reading systems can identify their start and end positions during playback. Reading systems can then offer playback options tailored to the layout of the content, such as jumping past a long figure, turning off rendering of page break announcements (see 9.4 Skippability and escapability), or customizing the reading mode to suit structures such as tables.
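Example (non-normative): a minimal sketch of checking the two seq requirements stated in this specification, namely that each seq carries an epub:textref attribute and contains at least one par or seq. The sample overlay, its IDs, and the function name check_seq_elements are hypothetical.

```python
import xml.etree.ElementTree as ET

SMIL = "{http://www.w3.org/ns/SMIL}"
EPUB = "{http://www.idpf.org/2007/ops}"

# Hypothetical overlay; the second seq deliberately omits epub:textref.
OVERLAY = """\
<smil xmlns="http://www.w3.org/ns/SMIL"
      xmlns:epub="http://www.idpf.org/2007/ops" version="3.0">
  <body>
    <seq id="seq1" epub:textref="chapter1.xhtml#figure1" epub:type="figure">
      <par id="par1">
        <text src="chapter1.xhtml#figure1-caption"/>
        <audio src="chapter1_audio.mp3" clipBegin="0:00:10" clipEnd="0:00:14"/>
      </par>
    </seq>
    <seq id="seq2">
      <par id="par2">
        <text src="chapter1.xhtml#para2"/>
      </par>
    </seq>
  </body>
</smil>
"""


def check_seq_elements(smil_source):
    """Return a list of human-readable problems found on seq elements."""
    problems = []
    root = ET.fromstring(smil_source)
    for seq in root.iter(SMIL + "seq"):
        label = seq.get("id", "(no id)")
        if seq.get(EPUB + "textref") is None:
            problems.append(f"seq {label}: missing epub:textref")
        if not any(child.tag in (SMIL + "par", SMIL + "seq") for child in seq):
            problems.append(f"seq {label}: needs at least one par or seq")
    return problems
```

For the sample above, the only reported problem is the missing epub:textref on seq2.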

9.3.2.2 Referencing document fragments

Both the epub:textref attribute and the text element's src attribute may contain a URL-fragment string that references a specific part (e.g., an element via its ID) of the associated EPUB content document.

For XHTML and SVG content documents, the URL-fragment string SHOULD be a reference to a specific element via its ID, or an SVG Fragment Identifier [svg], respectively.

EPUB creators MAY use other fragment identifier schemes, but reading systems may not support such identifiers.
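Example (non-normative): the sketch below separates a src or epub:textref value into its document path and its URL-fragment string using the Python standard library, then groups fragments by document. The function names and sample references are hypothetical, and the sketch assumes element-ID fragments only.

```python
from urllib.parse import urldefrag


def split_reference(reference):
    """Split a src or epub:textref value into (document path, fragment or None)."""
    path, fragment = urldefrag(reference)
    return path, (fragment or None)


def fragments_by_document(references):
    """Group fragment identifiers by the content document they point into."""
    grouped = {}
    for reference in references:
        path, fragment = split_reference(reference)
        grouped.setdefault(path, [])
        if fragment is not None:
            grouped[path].append(fragment)
    return grouped
```

A checker could then confirm that every collected fragment matches an ID in the corresponding content document.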

9.3.2.3 Overlay granularity

This section is non-normative.

The granularity level of the media overlay depends on how EPUB creators mark up the EPUB content document and the type of fragment identifier they use in the text elements' src attributes and the seq elements' epub:textref attributes. For example, when referencing [html] elements, if the finest level of markup is at the paragraph level, then that is the finest possible level for media overlay synchronization. Likewise, if sub-paragraph markup is available, such as [html] span elements representing phrases or sentences, then finer granularity is possible in the media overlay. Finer granularity gives users more precise results for synchronized playback when navigating by word or phrase and when searching the text but increases the file size of the media overlay documents. Fragment identifier schemes that do not rely on the presence of elements could provide even finer granularity, where supported.

9.3.2.4 Text-to-speech rendering

This specification allows the use of text-to-speech (TTS) — the rendering of the textual content of an EPUB publication as artificial human speech using a synthesized voice — in addition to pre-recorded audio clips.

When a media overlay par element omits its audio element, its text element may be rendered in reading systems via TTS. If the text fragment is not appropriate for TTS rendering (e.g., is not a text element and/or has no text fallback), this may produce unexpected results.

Note

See EPUB 3 Text-to-Speech Support [epub-tts-10] for more information about using TTS technologies in EPUB publications.

9.3.3 Structural semantics in overlays

To express structural semantics in media overlay documents, EPUB creators MAY specify the epub:type attribute on par, seq, and body elements.

The epub:type attribute facilitates reading system behavior appropriate for the semantic type(s) indicated. Examples of these behaviors are skippability and escapability and table reading mode [epub-rs-33].

Media overlay documents MAY use the applicable vocabulary association mechanisms for the epub:type attribute to define additional semantics.

9.3.4 Associating style information

EPUB creators MAY express visual rendering information for the currently playing EPUB content document element in a CSS Style Sheet using author-defined classes.

When used, EPUB creators MUST declare the class names in the package document using the active-class and playback-active-class properties.

EPUB creators MUST define exactly one CSS class name in each property they define. Each property MUST define a valid CSS class name not including any selectors [css2]. This specification does not reserve names for use with these properties.

EPUB creators MAY define any CSS properties for the specified CSS classes but must ensure that each EPUB content document with an associated media overlay document includes a CSS stylesheet (either embedded or linked) containing the class definitions. In the absence of such definitions reading systems might provide their own styling, or no styling at all.

EPUB creators MUST NOT use the active-class and playback-active-class properties in conjunction with a refines attribute as they always apply to the entire EPUB publication.
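Example (non-normative): a sketch of checking the active-class and playback-active-class declarations in a package document. The class names in the sample are invented, the function name is hypothetical, and the regular expression is a simplified, ASCII-only approximation of a CSS class name rather than the full [css2] grammar.

```python
import re
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"
CLASS_PROPERTIES = {"media:active-class", "media:playback-active-class"}
# Simplified, ASCII-only approximation of a CSS identifier (an assumption;
# the full CSS grammar also allows escapes and non-ASCII characters).
CSS_IDENT = re.compile(r"-?[_a-zA-Z][_a-zA-Z0-9-]*")

# Hypothetical package metadata; the class names are invented.
PACKAGE = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="media:active-class">-epub-media-overlay-active</meta>
    <meta property="media:playback-active-class">-epub-media-overlay-playing</meta>
  </metadata>
</package>
"""


def check_class_properties(opf_source):
    """Return problems found in the two class-name properties."""
    problems = []
    root = ET.fromstring(opf_source)
    for meta in root.iter(OPF + "meta"):
        prop = meta.get("property")
        if prop not in CLASS_PROPERTIES:
            continue
        value = (meta.text or "").strip()
        # Exactly one class name: no spaces, no leading "." or other selector syntax.
        if not CSS_IDENT.fullmatch(value):
            problems.append(f"{prop}: {value!r} is not a single class name")
        if meta.get("refines") is not None:
            problems.append(f"{prop}: must not carry a refines attribute")
    return problems
```

The sample package passes both checks; a value such as ".active" or "one two" would be reported.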

9.3.5 Media overlays packaging

9.3.5.1 Including media overlays

If an EPUB content document is wholly or partially referenced by a media overlay document, then its manifest item element MUST specify a media-overlay attribute. The attribute MUST reference the ID [xml] of the manifest item for the corresponding media overlay document.

EPUB creators MUST only specify the media-overlay attribute on manifest item elements that reference EPUB content documents.

Manifest items for media overlay documents MUST have the media type application/smil+xml.
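Example (non-normative): the sketch below checks media-overlay attributes in a manifest. It assumes that EPUB content documents are identified by the application/xhtml+xml and image/svg+xml media types; the manifest entries and the function name are hypothetical. It does not check the converse requirement (that every content document referenced by an overlay declares the attribute), which would require reading the overlay documents themselves.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"
CONTENT_DOCUMENT_TYPES = {"application/xhtml+xml", "image/svg+xml"}
SMIL_TYPE = "application/smil+xml"

# Hypothetical manifest; the "cover" item is deliberately wrong.
PACKAGE = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"
          media-overlay="ch1_overlay"/>
    <item id="ch1_overlay" href="chapter1.smil" media-type="application/smil+xml"/>
    <item id="cover" href="cover.jpg" media-type="image/jpeg" media-overlay="missing"/>
  </manifest>
</package>
"""


def check_media_overlay_attributes(opf_source):
    """Return problems found with media-overlay attributes in the manifest."""
    root = ET.fromstring(opf_source)
    items = {item.get("id"): item for item in root.iter(OPF + "item")}
    problems = []
    for item_id, item in items.items():
        overlay_id = item.get("media-overlay")
        if overlay_id is None:
            continue
        if item.get("media-type") not in CONTENT_DOCUMENT_TYPES:
            problems.append(f"{item_id}: media-overlay is only allowed on EPUB content documents")
        target = items.get(overlay_id)
        if target is None:
            problems.append(f"{item_id}: no manifest item with id {overlay_id!r}")
        elif target.get("media-type") != SMIL_TYPE:
            problems.append(f"{item_id}: item {overlay_id!r} is not {SMIL_TYPE}")
    return problems
```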

9.3.5.2 Overlays package metadata

EPUB creators MUST specify the duration of the entire EPUB publication in the package document using a meta element with the duration property.

In addition, EPUB creators MUST provide the duration of each media overlay document. EPUB creators MUST use the refines attribute to associate each duration declaration to the corresponding manifest item.

The sum of the durations for each media overlay document SHOULD equal the total duration plus or minus one second.

Note

Although the sum of individual durations may not exactly match the total due to rounding the times to the nearest fraction of a second, a difference of greater than one second indicates a mismatch arising from other issues.
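Example (non-normative): a sketch of comparing the sum of the per-overlay durations with the total duration. It assumes the properties are written as media:duration, using the reserved media: prefix, and that values use the colon-separated clock forms (h:mm:ss or mm:ss); the durations, IDs, and function names are invented.

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical metadata: two overlay durations and the publication total.
PACKAGE = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="media:duration" refines="#ch1_overlay">0:32:29</meta>
    <meta property="media:duration" refines="#ch2_overlay">0:34:02</meta>
    <meta property="media:duration">1:06:31</meta>
  </metadata>
</package>
"""


def to_seconds(clock):
    """Convert h:mm:ss(.fff) or mm:ss(.fff) to seconds (unit suffixes not handled)."""
    seconds = 0.0
    for part in clock.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def duration_mismatch(opf_source):
    """Return |sum of per-overlay durations - total duration| in seconds."""
    root = ET.fromstring(opf_source)
    total = None
    per_overlay = 0.0
    for meta in root.iter(OPF + "meta"):
        if meta.get("property") != "media:duration":
            continue
        seconds = to_seconds(meta.text)
        if meta.get("refines") is None:
            total = seconds          # applies to the whole publication
        else:
            per_overlay += seconds   # applies to one media overlay document
    if total is None:
        raise ValueError("no publication-level media:duration found")
    return abs(per_overlay - total)
```

A result greater than 1.0 would indicate that the declared durations do not agree within the one-second tolerance.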

EPUB creators MAY also specify narrator information in the package document, as well as author-defined CSS class names to apply to the currently playing EPUB content document element.

Note

The media: prefix is reserved for inclusion of these properties in package metadata.

9.4 Skippability and escapability

9.4.1 Skippability

While reading, users may want to turn on or off certain features of the content, such as footnotes, page numbers, or other types of secondary content. This feature is called skippability. Reading systems use the semantic information provided by media overlay elements' epub:type attribute to determine when to offer users the option of skippable features.

EPUB creators MAY use the following semantics to enable skippability:

This list is non-exhaustive, however. It represents terms from the Structural Semantics Vocabulary [epub-ssv-11] for which reading systems are most likely to offer the option of skippability.
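Example (non-normative): a sketch of how a processor might apply skippability when building a playback list. The terms footnote and pagebreak are used here as illustrative epub:type values, on the assumption that they are among the terms a reading system offers to skip; the overlay, the IDs, and the function name are hypothetical. Audio children are left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

SMIL = "{http://www.w3.org/ns/SMIL}"
EPUB = "{http://www.idpf.org/2007/ops}"

# Hypothetical overlay with a page break and a footnote.
OVERLAY = """\
<smil xmlns="http://www.w3.org/ns/SMIL"
      xmlns:epub="http://www.idpf.org/2007/ops" version="3.0">
  <body>
    <par id="p1"><text src="c1.xhtml#s1"/></par>
    <par id="p2" epub:type="pagebreak"><text src="c1.xhtml#page2"/></par>
    <seq id="fn1" epub:type="footnote" epub:textref="c1.xhtml#fn1">
      <par id="p3"><text src="c1.xhtml#fn1-text"/></par>
    </seq>
    <par id="p4"><text src="c1.xhtml#s2"/></par>
  </body>
</smil>
"""


def playback_order(smil_source, skipped_types):
    """Return the text src of each par to play, omitting skipped structures."""
    root = ET.fromstring(smil_source)
    result = []

    def visit(element):
        types = set((element.get(EPUB + "type") or "").split())
        if types & skipped_types:
            return  # the user turned this kind of content off
        if element.tag == SMIL + "par":
            result.append(element.find(SMIL + "text").get("src"))
        else:
            for child in element:
                visit(child)

    visit(root.find(SMIL + "body"))
    return result
```

Skipping a seq also skips everything nested inside it, which is why the footnote's par disappears when footnote is in the skipped set.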

9.4.2 Escapability

Escapable items are nested structures, such as tables and lists, that users might wish to skip over, continuing to read from the point immediately after the nested structure. The escapability feature differs from the skippability feature in that it does not enable or disable entire types of items, but provides an exit from them (e.g., a user can listen to some of the content before choosing to escape).

EPUB creators MAY use the following semantics to enable escapability:

This list is non-exhaustive, however. It represents terms from the Structural Semantics Vocabulary [epub-ssv-11] for which reading systems are most likely to offer the option of escapability.

Note

Sometimes escapable structures may contain escapable structures. For example, tables are composed of many rows and cells that users may want to separately escape from. Reading system support for escaping from such structures is complex and not well supported at this time. EPUB creators should avoid identifying nested escapable structures until better support is available.
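Example (non-normative): a sketch of locating the point at which playback would resume after a user escapes from a structure. It uses table as an illustrative escapable epub:type value; the overlay, the IDs, and the function name are hypothetical, and actual reading system behavior may differ. Audio children are left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

SMIL = "{http://www.w3.org/ns/SMIL}"
EPUB = "{http://www.idpf.org/2007/ops}"

# Hypothetical overlay: a table sits between two ordinary phrases.
OVERLAY = """\
<smil xmlns="http://www.w3.org/ns/SMIL"
      xmlns:epub="http://www.idpf.org/2007/ops" version="3.0">
  <body>
    <par id="p1"><text src="c1.xhtml#s1"/></par>
    <seq id="t1" epub:type="table" epub:textref="c1.xhtml#table1">
      <par id="c1"><text src="c1.xhtml#cell1"/></par>
      <par id="c2"><text src="c1.xhtml#cell2"/></par>
    </seq>
    <par id="p2"><text src="c1.xhtml#s2"/></par>
  </body>
</smil>
"""


def escape_target(smil_source, current_par_id, escapable_types):
    """Return the id of the first par after the innermost escapable container
    of the current par, or None when there is nothing to escape from or to."""
    root = ET.fromstring(smil_source)
    pars = []  # (par element, list of ancestors from body inward)

    def visit(element, ancestors):
        if element.tag == SMIL + "par":
            pars.append((element, ancestors))
        else:
            for child in element:
                visit(child, ancestors + [element])

    visit(root.find(SMIL + "body"), [])
    index = next((i for i, (par, _) in enumerate(pars)
                  if par.get("id") == current_par_id), None)
    if index is None:
        return None
    container = None
    for ancestor in reversed(pars[index][1]):  # innermost ancestor first
        if set((ancestor.get(EPUB + "type") or "").split()) & escapable_types:
            container = ancestor
            break
    if container is None:
        return None
    for par, ancestors in pars[index + 1:]:
        if not any(a is container for a in ancestors):
            return par.get("id")
    return None
```

Escaping from either table cell resumes at p2, while p1 has no escapable ancestor and so yields None.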

9.5 Navigation document overlays

This section is non-normative.

As the EPUB navigation document is an XHTML content document, EPUB creators may associate a media overlay document with it. Unlike traditional XHTML content documents, however, reading systems must present the EPUB navigation document to users even when it is not included in the spine (see Navigation document processing [epub-rs-33]). As a result, the behavior of an associated media overlay can change depending on the context:

Note

Specific implementation details are beyond the scope of this specification. The DAISY Media Overlays Playback Requirements document describes best practices for EPUB creators and provides recommendations for reading system developers.

10. Accessibility

This section is non-normative.

EPUB 3 builds upon the Open Web Platform expressly so that it can leverage the structure, semantics and, by extension, accessibility built into its underlying technologies.

The requirements and practices for creating accessible web content have already been documented in the W3C's Web Content Accessibility Guidelines (WCAG) [wcag2]. These guidelines also form the basis for defining accessibility in EPUB publications.

As the current WCAG guidelines (version 2) are heavily focused on web pages, a separate specification, EPUB Accessibility [epub-a11y-11], defines how to apply the standard to EPUB publications. It also adds EPUB-specific requirements and recommendations for metadata, pagination, and media overlays.

This specification recommends that EPUB publications conform to the accessibility requirements defined in [epub-a11y-11]. A benefit of following this recommendation is that it helps to ensure that EPUB publications meet the accessibility requirements legislated in jurisdictions around the world.

EPUB creators, however, should look beyond legal imperatives and treat accessibility as a requirement for all their content. The more accessible that EPUB publications are, the greater the potential audience for them.

Note

This specification does not integrate the accessibility requirements to allow them to adapt and evolve independent of the EPUB specification — accessibility practices often need more frequent updating. The accessibility specification is also intended for use with past, present, and future versions of EPUB. The approach of a separate specification ensures that the evolution of EPUB does not lock accessibility in time (i.e., it allows producers of older versions of EPUB to reference the latest accessibility requirements).

11. Security and privacy

11.1 Overview

This section is non-normative.

The particularity of an EPUB publication is its structure. The EPUB format provides a means of representing, packaging, and encoding structured and semantically enhanced web content — including HTML, CSS, SVG, JavaScript, and other resources — for distribution in a single-file container.

This means that EPUB 3's security and privacy issues are primarily linked to the features of those formats, and closely mirror the threats presented by web content.

Although content risks are often equated with deliberately malicious authoring intent, EPUB creators need to be aware that many practices followed with the best of intentions may expose users to privacy and security issues. The rest of this section explores the risk model of EPUB 3 with the aim of helping EPUB creators recognize and mitigate these risks.

Note

For the risks associated with reading systems, refer to the security and privacy section of [epub-rs-33].

11.2 Threat model

This section is non-normative.

EPUB publications pose a variety of privacy and security threats to unsuspecting users. Many of these threats intersect with web content, but EPUB also introduces its own unique methods of attack that can be used to trick users into accessing malicious content or into providing sensitive information. Some of the more important attack vectors that EPUB creators and users need to be aware of include:

Embedding of remote resources

EPUB 3 allows some publication resources to be remotely hosted, specifically resources whose sizes can negatively affect the downloading and opening of the EPUB publication (e.g., audio, video, and fonts). Although helpful for users when used as intended, these exemptions can also be used to inject malicious content into a publication.

This threat is not limited to accessing content created by a bad actor. If EPUB creators embed content from untrustworthy sources (e.g., third party audio and video), there is always the possibility that users may receive compromised resources.

Checking for malware and exploits at distribution time is not always reliable, either, as the malicious content can be swapped in any time after publication, unlike resources that come embedded in the EPUB container.

The origin of an EPUB is both unknown to the EPUB creator and specific to each reading system implementation. Consequently, if the EPUB creator hosts remote resources on a web server they control, the server effectively cannot use security features that require specifying allowable origins, such as headers for CORS, Content-Security-Policy, or X-Frame-Options.

Linking to external resources

Whether intentional or not, links to external web sites and resources expose users to potential exploits that can compromise their reading system or operating system. Although external links will typically open in a web browser, and be subject to the browser security model, this does not protect users from all exploits.

Even if the intentions of the EPUB creator are not malicious, adding tracking information to external links is problematic for user privacy as it can allow a user's activity to be tracked without their consent.

Broken-link hijacking — when a domain expires and is bought by another party to exploit the links to it — can also lead to users being taken to resources the EPUB creator did not intend.

Including malicious content

Resources embedded in the EPUB container are not immune to malicious actors, especially when EPUB publications are obtained from untrusted sources. Resources may contain exploits or forms that may submit sensitive information to unintended parties. Such actors may also try to gain access to remote resources using file indirection techniques, such as symbolic links or file aliases.

The use of third-party content, such as games and quizzes, may also lead to security and privacy issues if the EPUB creator is not able to fully vet the content.

Allowing scripts network access

When scripts can access a device's network, it provides a variety of channels to exploit the user:

  • collecting information about the user and their activities, whether malicious or not;
  • attempting to access the file system and local storage to harvest information;
  • phishing attempts (e.g., making an EPUB content document appear like a trusted web site to get the user to submit login information); and
  • injecting malicious content from external sites into the EPUB publication.

Network access may allow third-party content to exploit the user even if it was not the EPUB creator's intent.

Securing content with digital rights management

The encryption and decryption of EPUB publications using digital rights management schemes may allow personally identifiable information about the user, what vendors they use, and their reading choices to be relayed to third parties.

The effectiveness of these attacks also often depends on tricking users into believing that the publication they are interacting with is from a trustworthy source. These deceptions can take the following forms:

Falsified publication information

The EPUB publication may include false information about itself to trick users into believing that it comes from a legitimate source. A malicious EPUB creator might, for example, fake the title, authors, identifiers, and publisher for the work.

Although this misinformation itself does not present an immediate harm, it could lead users to trust malicious forms, links, and other content within the EPUB publication believing it comes from a reliable source.

Spoofed platforms

Malicious EPUB creators may also design their content to imitate or replicate a platform's experience to trick users into trusting their content.

11.2.1 EPUB-specific features

EPUB 3 tries to avoid extending the underlying technologies it builds on, but it has introduced some new features. The restricted scope of these features limits the threats they might pose, however:

The one potential exception is the epubReadingSystem object [epub-rs-33] that allows EPUB creators to query information about the current reading system. EPUB creators need to be mindful that they only use the information exposed by this object to improve the rendering of their content (i.e., avoid using the information to profile the user and their environment).

11.3 Recommendations

Although EPUB creators cannot prevent every method of exploiting users, they are ultimately responsible for the secure construction of their content. That means that they need to take precautions to limit the exposure of their EPUB publications to the types of malicious exploits described in the previous section.

Some practical steps include:

EPUB creators also need to consider the privacy rights of users and avoid situations where they are intentionally collecting data. Ideally, EPUB creators SHOULD NOT track their users, but this is not realistic for all types of publishing.

When EPUB creators have to track users, they SHOULD obtain the approval of the user to collect information prior to opening the EPUB publication (e.g., in educational course work). If this is not possible, they SHOULD obtain permission when users access the EPUB publication for the first time. EPUB creators SHOULD also allow users to opt out of tracking, and provide users the ability to manage and delete any data that is collected about them.

EPUB creators also need to consider the inadvertent collection of information about users. Linking to content on a publisher's web site, or remotely hosting resources on their servers, can lead to profiling users, especially if unique tracking identifiers are added to the URLs.

When collecting and storing user information within an EPUB publication (e.g., through the use of cookies and web storage [html]), EPUB creators need to consider the potential for data theft by other EPUB publications on a reading system. Although [epub-rs-33] introduces a unique origin requirement for EPUB publications, which limits the potential for attacks, there is still a risk that reading systems will allow EPUB publications access to shared persistent storage (e.g., older reading systems that have not been updated and non-conforming newer reading systems). Consequently, EPUB creators SHOULD NOT store sensitive user data in persistent storage. If EPUB creators must store sensitive data, they SHOULD encrypt the data to prevent trivial access to it in the case of an exploit.

When publishers and vendors must use digital rights management schemes, they should prefer schemes that do not utilize or transmit information about the user or their content to external parties to perform encryption or decryption.

To maximally reduce security and privacy risks, EPUB creators SHOULD produce their EPUB publications with the goal of long-term preservation. EPUB publications created this way are normally self-contained, not dependent on network access, and not encrypted with digital rights management, removing many of the possible attack vectors. [iso22424] is an example of such a preservation format for EPUB publications. While it is understood that not all EPUB creators can achieve these levels of self-containment, following as many of these practices as possible will still benefit overall user privacy and security.

A. Unsupported features

This specification contains certain features that are not yet fully supported in reading systems, that the Working Group no longer recommends for use, or that are only retained for interoperability with EPUB 2 reading systems. This section defines the meanings of the designations attached to these features and their support expectations.

A.1 Under-implemented features

An under-implemented feature is a feature introduced prior to EPUB 3.3 for which the Working Group has not been able to establish enough implementation experience.

These features are considered important to retain despite this limitation because they are known to be implemented by EPUB creators (i.e., their deprecation would invalidate existing content) and/or they are integral to the content model on which EPUB is built.

If this specification designates a feature as under-implemented, EPUB creators MAY use the feature as described.

Note

EPUB conformance checkers should alert EPUB creators to the presence of under-implemented features when encountered in EPUB publications but must not treat their inclusion as a violation of the standard (i.e., not emit errors or warnings).

Caution

Whether under-implemented labels are removed or replaced by deprecation in a future version of the standard cannot be determined at this time. EPUB creators should strongly consider the interoperability problems that may arise both now and in the future when using these features.

Note

The marking of features as under-implemented is a one-time event to account for the different process under which EPUB was developed prior to being brought into W3C. This label will not be used for new features developed under W3C processes.

A.2 Deprecated features

A deprecated feature is one the Working Group no longer recommends for use in this version of the specification. Deprecated features typically have limited or no support in reading systems and/or usage in EPUB publications.

If this specification designates a feature as deprecated, EPUB creators SHOULD NOT use the feature in their EPUB publications.

Note

EPUB conformance checkers should alert EPUB creators to the presence of deprecated features when encountered in EPUB publications.

B. Allowed external identifiers

The following table lists the public and system identifiers [xml] allowed in document type declarations [xml].

EPUB creators MAY use these external identifiers only in publication resources with the listed media types [rfc2046] specified in their manifest declarations. (Refer to 3.9 XML conformance for more information.)

| Media Type(s) | Public Identifier | System Identifier |
|---|---|---|
| application/mathml+xml, application/mathml-presentation+xml, application/mathml-content+xml | -//W3C//DTD MathML 3.0//EN | http://www.w3.org/Math/DTD/mathml3/mathml3.dtd |
| application/x-dtbncx+xml | -//NISO//DTD ncx 2005-1//EN | http://www.daisy.org/z3986/2005/ncx-2005-1.dtd |
| image/svg+xml | -//W3C//DTD SVG 1.1//EN | http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd |

C. Expressing structural semantics

C.1 Introduction

This section is non-normative.

Structural semantics add meaning about the specific structural purpose an element serves. The epub:type attribute is used to express domain-specific semantics in EPUB content documents and media overlay documents, with the structural information it carries complementing the underlying vocabulary.

The applied semantics refine the meaning of their containing elements without changing their nature for assistive technologies, as happens when using the similar role attribute [html]. The attribute does not enhance the accessibility of the content, in other words; it only provides hints about the purpose.

Semantic metadata enriches content for use in publishing workflows and for author-defined purposes. It also allows reading systems to learn more about the structure and content of a document (e.g., to enable skippability and escapability in media overlays).

This specification defines a method for adding structural semantics using the attribute axis: instead of adding new elements, EPUB creators can append the epub:type attribute to existing elements to add the desired semantics.

C.2 The epub:type attribute

Attribute Name:

epub:type

Namespace:

http://www.idpf.org/2007/ops

Usage:

Refer to the requirements for XHTML, SVG, and media overlays.

Value:

A white space-separated list of property values, with restrictions as defined in D.1 Vocabulary association mechanisms.

White space is the set of characters as defined in [xml].

Caution

Although the epub:type attribute is similar in nature to the role attribute [html], the attributes serve different purposes. The values of the epub:type attribute do not enhance access through assistive technologies like screen readers as they do not map to the accessibility APIs used by these technologies. This means that adding epub:type values to semantically neutral elements like [html] div and span does not make them any more accessible to assistive technologies. Only ARIA roles influence how assistive technologies understand such elements.

The epub:type attribute is consequently only intended for publishing semantics and reading system enhancements. Reading systems may use epub:type values to provide accessibility enhancements like built-in read aloud or media overlays functionality where interaction with assistive technologies is not essential.

Refer to Digital Publishing WAI-ARIA Module [dpub-aria] for more information about accessible publishing roles.

The epub:type attribute inflects semantics on the element on which it appears. Its value is one or more white space-separated terms stemming from external vocabularies associated with the document instance.

The default vocabulary for the epub:type attribute is the EPUB 3 Structural Semantics Vocabulary [epub-ssv-11]. EPUB creators MAY include unprefixed terms that are not part of this vocabulary, but the preferred method for adding custom semantics is to use prefixes for them. Refer to D.1 Vocabulary association mechanisms for more information.
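
The following sketch shows one way a tool might collect epub:type terms from an XHTML content document. The function name `epub_types` and the sample markup are illustrative assumptions. The terms used (bodymatter, chapter, footnote) are unprefixed and therefore taken from the default vocabulary.

```python
import re
import xml.etree.ElementTree as ET

EPUB_NS = "http://www.idpf.org/2007/ops"

# Illustrative sample document (not taken from the specification).
SAMPLE = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <section epub:type="bodymatter  chapter">
      <aside epub:type="footnote">A note.</aside>
    </section>
  </body>
</html>"""


def epub_types(xhtml: str):
    """Return (local element name, list of terms) for each epub:type found."""
    found = []
    for element in ET.fromstring(xhtml).iter():
        value = element.get("{%s}type" % EPUB_NS)
        if value is None:
            continue
        # Split on XML white space only: space, tab, CR, LF.
        terms = [t for t in re.split(r"[ \t\r\n]+", value) if t]
        local_name = element.tag.rsplit("}", 1)[-1]
        found.append((local_name, terms))
    return found
```

Calling `epub_types(SAMPLE)` yields one entry for the section element with two terms and one entry for the aside element with a single term.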

D. Vocabularies

This appendix defines a general set of mechanisms by which attributes in this specification can reference terms from vocabularies. It also defines EPUB-specific vocabularies for use with the attributes.

D.1 Vocabulary association mechanisms

D.1.1 Introduction

This section is non-normative.

EPUB defines a formal method of referencing terms and properties defined in metadata and semantic vocabularies using the property data type. The epub:type attribute uses this data type in EPUB content documents and media overlay documents to add structural semantics, for example, while the property and rel attributes use the data type to define properties and relationships in the package document.

A property value is like a CURIE [rdfa-core]: it represents a URL [url] in compact form. The expression consists of a prefix and a reference, where the prefix, whether literal or implied, is a shorthand mapping of a URL that typically resolves to a term vocabulary. When a reading system converts the prefix to its URL representation and combines it with the reference, the resulting URL normally resolves to a fragment within that vocabulary that contains human- and/or machine-readable information about the term.

To reduce the complexity for authoring, each attribute that takes a property data type also defines a default vocabulary. Terms and properties referenced from the default vocabularies do not include a prefix because the URL mapping that reading systems use for them is predefined.

The power of the property data type lies in its easy extensibility. To incorporate new terms and properties, EPUB creators only need to declare a prefix. In another authoring convenience, this specification also reserves prefixes for many commonly used publishing vocabularies (i.e., their declaration is optional).

The following sections provide additional details on the property data type and vocabulary association mechanism.

D.1.2 The property data type

The property data type is a compact means of expressing a URL [url] and consists of an OPTIONAL prefix separated from a reference by a colon.

(EBNF productions ISO/IEC 14977)
All terminal symbols are in the Unicode Block 'Basic Latin' (U+0000 to U+007F).
property = [ prefix , ":" ] , reference;  
prefix = ? xsd:NCName ? ;  
reference = ? path-relative-scheme-less-URL string [url] ? ; /* as defined in [url] */

This specification derives the property data type from the CURIE data type defined in [rdfa-core]. A property represents a subset of CURIEs.

There are two key differences from CURIEs, however:

  • an empty reference does not represent a valid property value even though it is valid to the definition above (i.e., a property value that only consists of a prefix and colon is invalid).

  • an empty string does not represent a valid property even though it is valid to the definition above.

When an EPUB creator omits a prefix from a property value, the expressed reference represents a term from the default vocabulary for that attribute.
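
A minimal sketch of the property data type rules above follows. It makes several assumptions: the function name `parse_property` is hypothetical, the NCName check is an ASCII-only approximation of xsd:NCName, the first colon is always treated as the prefix separator, and the reference is not validated against the URL grammar.

```python
import re
from typing import Optional, Tuple

# ASCII-only approximation of xsd:NCName (assumption: the real production
# also admits many non-ASCII characters).
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def parse_property(value: str) -> Tuple[Optional[str], str]:
    """Split a property value into (prefix, reference).

    A prefix of None means the reference belongs to the default vocabulary
    of the attribute that carries the value.
    """
    if value == "":
        raise ValueError("an empty string is not a valid property")
    prefix, colon, reference = value.partition(":")
    if not colon:
        return None, value
    if not _NCNAME.match(prefix):
        raise ValueError("prefix is not an NCName: %r" % prefix)
    if reference == "":
        raise ValueError("a property must not have an empty reference")
    return prefix, reference
```

For example, `parse_property("dcterms:modified")` separates the prefix from the reference, while `parse_property("media:")` is rejected because the reference is empty.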

D.1.3 Default vocabularies

A default vocabulary is one that EPUB creators do not have to declare a prefix for in order to use its terms and properties where a property value is expected. EPUB creators MUST NOT add a prefix to terms and properties from a default vocabulary.

EPUB creators MUST NOT assign a prefix to the URLs associated with these vocabularies using the prefix attribute.

Note

Refer to the definition of each attribute that takes a property data type for more information about its default vocabulary.

D.1.4 The prefix attribute

The prefix attribute defines prefix mappings for use in property values.

The value of the prefix attribute is a white space-separated list of one or more prefix-to-URL mappings of the form:

(EBNF productions ISO/IEC 14977)
All terminal symbols are in the Unicode Block 'Basic Latin' (U+0000 to U+007F).
prefixes = mapping , { whitespace, { whitespace } , mapping } ;  
mapping = prefix , ":" , space , { space } , ? xsd:anyURI ? ;  
prefix = ? xsd:NCName ? ;  
space = #x20 ;  
whitespace = (#x20 | #x9 | #xD | #xA) ;  

With the exception of reserved prefixes, EPUB creators MUST declare all prefixes used in a document. EPUB creators MUST only specify the prefix attribute on the root element [xml] of the respective format.

The attribute is not namespaced when used in the package document.

EPUB creators MUST declare the attribute in the namespace http://www.idpf.org/2007/ops in EPUB content documents and media overlay documents.

Note

Although the prefix attribute is modeled on the identically named prefix attribute in [rdfa-core], EPUB creators cannot use the attributes interchangeably. The prefix attribute without a namespace in EPUB content documents is the RDFa attribute.

It is common for both attributes to appear in EPUB content documents that also specify RDFa expressions.

<html prefix="…"
        xmlns:epub="http://www.idpf.org/2007/ops"
        epub:prefix="…"></html>

Note that for embedded SVG, prefixes MUST be declared on the [html] root html element.

To avoid conflicts, EPUB creators MUST NOT use the prefix attribute to declare a prefix that maps to a default vocabulary.

EPUB creators MUST NOT declare the prefix '_' as this specification reserves this prefix for future compatibility with RDFa [rdfa-core] processing.

For future compatibility with alternative serializations of the package document, EPUB creators MUST NOT declare a prefix for the Dublin Core /elements/1.1/ namespace [dcterms]. EPUB creators MUST use only the [dcterms] elements allowed in the package document metadata.
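
The grammar and restrictions in this section can be sketched as a small parser. This is an illustration under stated assumptions rather than a conformance check: `parse_prefix_attribute` is a hypothetical name, the NCName pattern is an ASCII-only approximation, URLs are taken to be any run of non-white-space characters, and the grammar is applied strictly (no leading or trailing white space). The check against default vocabulary URLs is omitted because those depend on the attribute in use.

```python
import re
from typing import Dict

# mapping = prefix , ":" , space , { space } , URL
_MAPPING = re.compile(r"([A-Za-z_][A-Za-z0-9._-]*): +([^ \t\r\n]+)")
_WHITESPACE = re.compile(r"[ \t\r\n]+")

# The Dublin Core /elements/1.1/ namespace, which must not be given a prefix.
DC_ELEMENTS_NS = "http://purl.org/dc/elements/1.1/"


def parse_prefix_attribute(value: str) -> Dict[str, str]:
    """Parse a prefix attribute value into a {prefix: URL} dictionary."""
    mappings = {}
    position = 0
    while True:
        match = _MAPPING.match(value, position)
        if match is None:
            raise ValueError("expected a prefix mapping at offset %d" % position)
        prefix, url = match.groups()
        if prefix == "_":
            raise ValueError("the '_' prefix is reserved")
        if url == DC_ELEMENTS_NS:
            raise ValueError("the Dublin Core elements namespace must not be mapped")
        mappings[prefix] = url
        position = match.end()
        if position == len(value):
            return mappings
        # At least one white space character separates mappings.
        position = _WHITESPACE.match(value, position).end()
```

With a value such as `foaf: http://xmlns.com/foaf/spec/ dbp: http://dbpedia.org/ontology/`, the sketch returns two mappings. A value that omits the space after the colon fails to parse.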

D.1.5 Reserved prefixes

Caution

Although reserved prefixes are an authoring convenience, EPUB creators should avoid relying on them as they may cause interoperability issues. EPUB conformance checkers will often reject new prefixes until their developers update the tools to the latest version of the specification, for example. EPUB creators should declare all prefixes they use to avoid such issues.

EPUB creators MAY use reserved prefixes in attributes that expect a property value without declaring them in a prefix attribute.

EPUB creators SHOULD NOT override reserved prefixes in the prefix attribute.

The reserved prefixes an EPUB creator can use depend on the context:

Package document

EPUB creators MAY use the following prefixes in package document attributes without having to declare them.

Prefix URL
a11y http://www.idpf.org/epub/vocab/package/a11y/#
dcterms http://purl.org/dc/terms/
marc http://id.loc.gov/vocabulary/
media http://www.idpf.org/epub/vocab/overlays/#
onix http://www.editeur.org/ONIX/book/codelists/current.html#
rendition http://www.idpf.org/vocab/rendition/#
schema http://schema.org/
xsd http://www.w3.org/2001/XMLSchema#

Structural Semantics

EPUB creators MAY use the following reserved prefixes in the epub:type attribute without having to declare them.

Prefix URL
msv http://www.idpf.org/epub/vocab/structure/magazine/#
prism http://www.prismstandard.org/specifications/3.0/PRISM_CV_Spec_3.0.htm#
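
To illustrate how the reserved prefix tables above might be used, the following sketch expands a property value to a URL. The function name `expand_property` and its parameters are hypothetical. The sketch also assumes that a mapping declared in a prefix attribute takes precedence over a reserved one and that the caller supplies the default vocabulary URL for the attribute in question.

```python
from typing import Dict, Optional

# Reserved prefixes, transcribed from the tables above.
PACKAGE_RESERVED_PREFIXES = {
    "a11y": "http://www.idpf.org/epub/vocab/package/a11y/#",
    "dcterms": "http://purl.org/dc/terms/",
    "marc": "http://id.loc.gov/vocabulary/",
    "media": "http://www.idpf.org/epub/vocab/overlays/#",
    "onix": "http://www.editeur.org/ONIX/book/codelists/current.html#",
    "rendition": "http://www.idpf.org/vocab/rendition/#",
    "schema": "http://schema.org/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
EPUB_TYPE_RESERVED_PREFIXES = {
    "msv": "http://www.idpf.org/epub/vocab/structure/magazine/#",
    "prism": "http://www.prismstandard.org/specifications/3.0/PRISM_CV_Spec_3.0.htm#",
}


def expand_property(
    value: str,
    declared: Optional[Dict[str, str]] = None,
    reserved: Dict[str, str] = PACKAGE_RESERVED_PREFIXES,
    default_vocab: str = "",
) -> str:
    """Expand a property value to a URL string."""
    declared = declared or {}
    prefix, colon, reference = value.partition(":")
    if not colon:
        # Unprefixed values belong to the attribute's default vocabulary.
        return default_vocab + value
    if prefix in declared:  # assumption: declarations win over reserved prefixes
        return declared[prefix] + reference
    if prefix in reserved:
        return reserved[prefix] + reference
    raise KeyError("undeclared prefix: %r" % prefix)
```

For epub:type values, a caller would pass `reserved=EPUB_TYPE_RESERVED_PREFIXES` instead of the package document table.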

D.2 Property field definitions

The fields in the vocabulary definition tables have the following implicit requirements:

Allowed Values

Specifies the REQUIRED type of value using [xmlschema-2] datatypes.

Applies To

Specifies which publication resource type(s) EPUB creators MAY specify the property on.

This field appears for properties used in the properties attribute.

Cardinality

Specifies the number of times EPUB creators MAY specify the property, whether globally or attached to another element or property.

Properties with a minimum cardinality of one MUST be specified.

Description

Describes the purpose of the property and specifies any additional usage requirements that EPUB creators must follow.

Example

Provides non-normative usage examples.

Extends

Identifies what EPUB creators MAY associate the property with.

This field appears for properties that define primary expressions and subexpressions and relationships.

Name

Specifies the name of the property as it MUST appear in the metadata.

D.3 Meta properties vocabulary

The properties in this vocabulary are usable in the meta element's property attribute.

Unless indicated otherwise in its "Extends" field, the properties defined in this section are used to define subexpressions: in other words, a meta element carrying a property defined in this section MUST have a refines attribute referencing a resource or expression being augmented.

The prefix URL for referencing these properties is http://idpf.org/epub/vocab/package/meta/#.

D.3.1 alternate-script

Name: alternate-script
Description:

The alternate-script property provides an alternate expression of the associated property value in a different language and/or script. The language tags of the alternate-script property and its associated property — as expressed by their respective in-scope xml:lang attributes — MUST NOT be the same.

This property is typically attached to creator and title properties for internationalization purposes.

Allowed value(s): xsd:string
Cardinality: zero or more
Extends: All properties.

D.3.2 authority

Name: authority
Description:

The authority property identifies the system or scheme the referenced element's value is drawn from.

Allowed value(s): xsd:string
Note

The former IDPF EPUB 3 Working Group maintained a registry of subject authorities for use with this property. This Working Group no longer maintains the registry.

Cardinality: zero or one
Extends: dc:subject

D.3.3 belongs-to-collection

Name: belongs-to-collection
Description:

The belongs-to-collection property identifies the name of a collection to which the EPUB publication belongs. An EPUB publication MAY belong to one or more collections.

It is also possible to chain these properties using the refines attribute to indicate that one collection is itself a member of another collection.

To allow reading systems to organize collections and avoid naming collisions (e.g., unrelated collections might share a similar name, or different editions of a collection could be released), EPUB creators SHOULD provide an identifier that uniquely identifies the instance of the collection. The dcterms:identifier property must carry this identifier.

The collection MAY more precisely define its nature by attaching a collection-type property.

The position of the EPUB publication within the collection MAY be provided by attaching a group-position property.

Allowed value(s): xsd:string
Cardinality: zero or more
Extends: Applies to the EPUB publication and can refine other instances of itself.
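
The sketch below illustrates how refinements might be gathered for each collection. The sample metadata, the collection name, the identifier value, and the function name `collections` are all illustrative assumptions. Nested collections (a belongs-to-collection property that refines another) are ignored here for brevity.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Illustrative package metadata (values are made up for this example).
SAMPLE = """\
<metadata xmlns="http://www.idpf.org/2007/opf">
  <meta property="belongs-to-collection" id="c01">Example Trilogy</meta>
  <meta refines="#c01" property="collection-type">set</meta>
  <meta refines="#c01" property="group-position">2</meta>
  <meta refines="#c01" property="dcterms:identifier">urn:example:collection-1</meta>
</metadata>"""


def collections(metadata_xml: str):
    """Map each collection's id to its name and the properties refining it."""
    metas = ET.fromstring(metadata_xml).findall("{%s}meta" % OPF_NS)
    result = {}
    for meta in metas:
        if meta.get("property") == "belongs-to-collection":
            result[meta.get("id")] = {"name": (meta.text or "").strip()}
    for meta in metas:
        target = (meta.get("refines") or "").lstrip("#")
        name = meta.get("property")
        if target in result and name != "belongs-to-collection":
            result[target][name] = (meta.text or "").strip()
    return result
```

Each refining meta element is attached to the collection whose id matches the fragment in its refines attribute.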

D.3.4 collection-type

Name: collection-type
Description:

The collection-type property indicates the form or nature of a collection.

When the collection-type value is drawn from a code list or other formal enumeration, EPUB creators SHOULD attach a scheme attribute to identify its source.

This specification also defines the following collection types when no scheme is specified:

series

A sequence of related works that are formally identified as a group, typically open-ended with works issued individually over time.

set

A finite collection of works that together constitute a single intellectual unit, typically issued together and able to be sold as a unit.

Note

Although reading systems are not required to support these values, specifying them provides the option to group related EPUB publications in more meaningful ways.

Allowed value(s): xsd:string
Cardinality: zero or one
Extends: belongs-to-collection

D.3.5 display-seq

Name: display-seq
Description:

The display-seq property indicates the numeric position in which to display the current property relative to identical metadata properties.

This property only applies where precedence rules have not already been defined (e.g., precedence is given to creators based on their appearance in document order).

Allowed value(s): xsd:unsignedInt
Cardinality: zero or one
Extends: All properties.

D.3.6 file-as

Name: file-as
Description: The file-as property provides the normalized form of the associated property for sorting.
Allowed value(s): xsd:string
Cardinality: zero or one
Extends: All properties.

D.3.7 group-position

Name: group-position
Description:

The group-position property indicates the numeric position in which the EPUB publication is ordered relative to other works belonging to the same group (whether all EPUB publications or not).

EPUB creators can attach the group-position property to any metadata property that establishes the group but it is typically associated with the belongs-to-collection property.

An EPUB publication can belong to more than one group.

Allowed value(s): A single xsd:unsignedInt or series of decimal-separated numbers (e.g., 1 or 2.2.1).
Cardinality: zero or one
Extends: All properties.
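
Because group-position values such as 2.2.1 do not sort correctly as strings, a reading system might convert them to tuples of integers before ordering. The following is a sketch only: `group_position_key` is a hypothetical name, and the upper bound of xsd:unsignedInt is not enforced.

```python
import re

_GROUP_POSITION = re.compile(r"[0-9]+(?:\.[0-9]+)*\Z")


def group_position_key(value: str):
    """Convert a group-position value such as '2.2.1' to a sortable tuple."""
    text = value.strip()
    if not _GROUP_POSITION.match(text):
        raise ValueError("not a valid group-position value: %r" % value)
    return tuple(int(part) for part in text.split("."))
```

Sorting with `sorted(values, key=group_position_key)` then places 2 before 2.1 and both before 10.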

D.3.8 identifier-type

Name: identifier-type
Description:

The identifier-type property indicates the form or nature of an identifier.

When the identifier-type value is drawn from a code list or other formal enumeration, EPUB creators SHOULD attach a scheme attribute to identify its source.

Allowed value(s): xsd:string
Cardinality: zero or one
Extends: dc:identifier, dc:source

D.3.9 meta-auth (deprecated)

Use of this property is deprecated.

Refer to the meta-auth property definition in [epubpublications-30] for more information.

D.3.10 role

Name: role
Description:

The role property describes the role of a creator, contributor or publisher in the creation of an EPUB publication.

When the role value is drawn from a code list or other formal enumeration, EPUB creators SHOULD attach a scheme attribute to identify its source.

When attaching multiple roles to an individual or organization, the importance of the roles should match the document order of their containing meta elements (i.e., the first meta element encountered should contain the most important role).

Allowed value(s): xsd:string
Cardinality: zero or more
Extends: dc:contributor, dc:creator, dc:publisher

D.3.11 source-of

Name: source-of
Description:

The source-of property indicates a unique aspect of an adapted source resource that has been retained in the EPUB publication.

This specification defines the pagination value to indicate that the referenced dc:source element is the source of the pagebreak properties defined in the content.

Allowed value(s): pagination
Cardinality: zero or one
Extends: dc:source
Note

See [epub-a11y-tech-11] for information on how to provide accessible page navigation.

D.3.12 term

Name: term
Description:

The term property provides a subject code.

Allowed value(s): xsd:string
Cardinality: zero or one
Extends: dc:subject

D.3.13 title-type

Name: title-type
Description:

The title-type property indicates the form or nature of a title.

When the title-type value is drawn from a code list or other formal enumeration, EPUB creators SHOULD attach a scheme attribute to identify its source. When a scheme is not specified, reading systems SHOULD recognize the following title type values: main, subtitle, short, collection, edition and expanded.

Allowed value(s): xsd:string
Cardinality: zero or one
Extends: dc:title

D.3.14 Examples

D.5 Package rendering vocabulary

The prefix URL for referencing these properties is http://www.idpf.org/vocab/rendition/#.

The "rendition:" prefix is reserved for use with the package rendering properties and does not have to be declared in the package document.

Note

Unlike the other vocabularies in this appendix, the properties in the Package Rendering Vocabulary consist of a mix of properties (expressed in meta elements) and spine overrides (expressed on itemref elements).

The usage requirements are also defined in 8. Layout rendering control, not in this appendix. The following table provides a map to the properties, overrides, and where they are defined.

Property Overrides Defined in
rendition:layout
  • rendition:layout-pre-paginated
  • rendition:layout-reflowable
8.2.2.1 Layout
rendition:orientation
  • rendition:orientation-auto
  • rendition:orientation-landscape
  • rendition:orientation-portrait
8.2.2.2 Orientation
rendition:spread
  • rendition:spread-auto
  • rendition:spread-both
  • rendition:spread-landscape
  • rendition:spread-none
  • rendition:spread-portrait (Deprecated)
8.2.2.3 Synthetic spreads
  • rendition:page-spread-center
  • rendition:page-spread-left
  • rendition:page-spread-right
8.2.2.4 Spread placement
rendition:viewport (Deprecated) 8.2.2.5 Viewport dimensions (deprecated)
rendition:flow
  • rendition:flow-paginated
  • rendition:flow-scrolled-continuous
  • rendition:flow-scrolled-doc
  • rendition:flow-auto
8.3.1 The rendition:flow property
  • rendition:align-x-center
8.3.2 The rendition:align-x-center property

D.5.1 Custom rendering properties

Reading system developers may introduce functionality not defined in this specification to address reading system-specific issues rendering EPUB content documents.

To facilitate this experimentation, EPUB creators MAY include custom properties and spine overrides for use in the package document provided they do not use the rendition: prefix.

Note

Custom properties should only address rendering issues specific to a particular reading system. This specification should be extended to provide extensions that multiple independent reading systems can use.

D.6 Manifest properties vocabulary

The properties in this vocabulary are usable in the manifest item element's properties attribute.

The prefix URL for referencing these properties is http://idpf.org/epub/vocab/package/item/#.

D.6.1 cover-image

Name: cover-image
Description: The cover-image property identifies the described publication resource as the cover image for the EPUB publication.
Applies to: All raster and vector image types
Cardinality: Zero or one

D.6.2 mathml

Name: mathml
Description: The mathml property indicates that the described publication resource contains one or more instances of MathML markup.
Applies to: EPUB content documents
Cardinality: Zero or more

D.6.3 nav

D.6.4 remote-resources

Name: remote-resources
Description:

The remote-resources property indicates that the described publication resource contains one or more internal references to other publication resources that are located outside of the EPUB container.

Refer to 3.6 Resource locations for more information.

Applies to: All publication resources with the capability of internal referencing (e.g., XHTML content documents, SVG content documents, CSS style sheets and media overlay documents).
Cardinality: Zero or more

D.6.5 scripted

Name: scripted
Description: The scripted property indicates that the described publication resource is a scripted content document (i.e., contains scripting and/or [html] form elements).
Applies to: EPUB content documents
Cardinality: Zero or more

D.6.6 svg

Name: svg
Description:

The svg property indicates that the described publication resource embeds one or more instances of SVG markup.

This property MUST be set when SVG markup is included directly in the resource and MAY be set when the SVG is referenced from the resource (e.g., from an [html] img, object or iframe element).

Applies to: XHTML content documents; the value is implied for SVG content documents.
Cardinality: Zero or more

D.6.7 switch

Name: switch
Description:

The switch property indicates that the described publication resource contains one or more instances of the deprecated epub:switch element.

Applies to: XHTML content documents.
Cardinality: Zero or more

D.7 Spine properties vocabulary

The properties in this vocabulary are usable in the spine itemref element's properties attribute.

The prefix URL for referencing these properties is http://idpf.org/epub/vocab/package/itemref/#.

D.7.1 page-spread-left

Name: page-spread-left
Description:

The page-spread-left property indicates that the first page of the associated item element's EPUB content document represents the left-hand side of a two-page spread.

The rendition:page-spread-left property is an alias for this property. Refer to 8.2.2.4 Spread placement for more information about their use.

D.7.2 page-spread-right

Name: page-spread-right
Description:

The page-spread-right property indicates that the first page of the associated item element's EPUB content document represents the right-hand side of a two-page spread.

The rendition:page-spread-right property is an alias for this property. Refer to 8.2.2.4 Spread placement for more information about their use.

D.7.3 Examples

D.8 Media overlays vocabulary

The properties in this vocabulary are usable in the meta element's property attribute.

The prefix URL for referencing these properties is http://www.idpf.org/epub/vocab/overlays/#.

The prefix "media:" is reserved for use with properties in this vocabulary and does not have to be declared in the package document.

D.8.1 active-class

Name: active-class
Description: EPUB creator-defined CSS class name to apply to the currently playing EPUB content document element.
Allowed value(s): xsd:string
Cardinality: Zero or one
Example: <meta property="media:active-class">-epub-media-overlay-active</meta>

D.8.2 duration

Name: duration
Description: The duration of the entire presentation or of a specific media overlay document. The specified durations account for the audio clips known at authoring time, and so exclude live streaming from external resources and speech synthesis.
Allowed value(s):

MUST be a [smil3] clock value.

Cardinality: Exactly one for the EPUB publication and for each media overlay document.
Example: <meta property="media:duration">1:36:20</meta>
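
To show how a duration such as the one in the example above could be interpreted, the following sketch converts a clock value to seconds. It is based on the general shape of SMIL clock values (full clock, partial clock, and timecount forms with an optional h, min, s, or ms metric). The function name `clock_value_to_seconds` is hypothetical, and the sketch should be treated as an approximation of the [smil3] grammar rather than a complete implementation.

```python
import re

# Hours:Minutes:Seconds[.fraction], Minutes:Seconds[.fraction], or a
# timecount with an optional metric (seconds are assumed when it is absent).
_FULL = re.compile(r"([0-9]+):([0-5][0-9]):([0-5][0-9](?:\.[0-9]+)?)\Z")
_PARTIAL = re.compile(r"([0-5][0-9]):([0-5][0-9](?:\.[0-9]+)?)\Z")
_TIMECOUNT = re.compile(r"([0-9]+(?:\.[0-9]+)?)(h|min|s|ms)?\Z")
_METRIC_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001, None: 1.0}


def clock_value_to_seconds(value: str) -> float:
    """Convert a clock value string to a number of seconds."""
    text = value.strip()
    match = _FULL.match(text)
    if match:
        hours, minutes, seconds = match.groups()
        return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    match = _PARTIAL.match(text)
    if match:
        minutes, seconds = match.groups()
        return int(minutes) * 60 + float(seconds)
    match = _TIMECOUNT.match(text)
    if match:
        count, metric = match.groups()
        return float(count) * _METRIC_SECONDS[metric]
    raise ValueError("not a recognized clock value: %r" % value)
```

Under these assumptions, the value 1:36:20 from the example corresponds to 5780 seconds.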

D.8.3 narrator

Name: narrator
Description: Name of the narrator.
Allowed value(s): xsd:string
Cardinality: Zero or more
Example: <meta property="media:narrator">Joe Speaker</meta>

D.8.4 playback-active-class

Name: playback-active-class
Description: Author-defined CSS class name to apply to the EPUB content document's document element when playback is active.
Allowed value(s): xsd:string
Cardinality: Zero or one
Example: <meta property="media:playback-active-class">-epub-media-overlay-playing</meta>

E. Prefixed CSS properties

This appendix describes the prefixed CSS properties supported by EPUB.

Note

The prefix definitions are no longer being synchronized with their CSS counterparts. In some cases, the unprefixed versions of these properties now support additional values. Reading systems may not support the new syntax with the prefixed properties, so EPUB creators are advised to use the unprefixed versions for newer features.

E.1 CSS writing modes

This section describes the -epub- prefixed properties for [css-writing-modes-3].

E.1.1 The -epub-text-orientation property

This property is a prefixed version of the text-orientation property [css-writing-modes-3].

Name: -epub-text-orientation
Value: mixed | upright | sideways | sideways-right

For compatibility with existing content, the -epub-text-orientation property also supports the deprecated vertical-right, rotate-right, and rotate-normal keywords. The following table specifies the effect these have when specified.

Deprecated value Value to be used
vertical-right mixed
rotate-right sideways
rotate-normal sideways

E.1.2 The -epub-writing-mode property

This property is a prefixed version of the writing-mode property [css-writing-modes-3], with the same syntax and behavior.

Name: -epub-writing-mode
Value: horizontal-tb | vertical-rl | vertical-lr

E.1.3 The -epub-text-combine-horizontal and -epub-text-combine properties

These properties are prefixed versions of the text-combine-upright property [css-writing-modes-3], although -epub-text-combine is deprecated.

Name: -epub-text-combine-horizontal
Value: none | all
Name: -epub-text-combine (deprecated)
Value: none | horizontal | horizontal <number>

For compatibility with existing content, the -epub-text-combine-horizontal and -epub-text-combine properties also support a number of deprecated keywords. The following table specifies the effect these have when specified.

Prefixed version CSS equivalent
-epub-text-combine-horizontal: none text-combine-upright: none
-epub-text-combine-horizontal: all text-combine-upright: all
-epub-text-combine: none text-combine-upright: none
-epub-text-combine: horizontal text-combine-upright: all
-epub-text-combine: horizontal <number> no equivalent
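
The mapping in the table above can be expressed as a small lookup, for instance in a tool that rewrites legacy style sheets. The function name `to_unprefixed` is hypothetical, and the sketch covers only the two text-combine properties listed in the table.

```python
from typing import Optional


def to_unprefixed(prop: str, value: str) -> Optional[str]:
    """Return the unprefixed CSS declaration, or None when no equivalent exists."""
    tokens = value.split()
    if prop == "-epub-text-combine-horizontal" and tokens in (["none"], ["all"]):
        return "text-combine-upright: " + tokens[0]
    if prop == "-epub-text-combine":
        if tokens == ["none"]:
            return "text-combine-upright: none"
        if tokens == ["horizontal"]:
            return "text-combine-upright: all"
    # Includes "-epub-text-combine: horizontal <number>", which has no equivalent.
    return None
```

A caller can leave the original declaration untouched whenever the function returns None.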

E.2 CSS text level 3

This section describes the -epub- prefixed properties (and one prefixed value) for [css-text-3].

E.2.1 The -epub-hyphens property

This property is a prefixed version of the hyphens property [css-text-3].

Name: -epub-hyphens
Value: none | manual | auto | all

For compatibility with existing content, the -epub-hyphens property also supports the deprecated all keyword. The value is no longer supported in CSS and there is no equivalent to use in its place.

E.2.2 The -epub-line-break property

This property is a prefixed version of the line-break property [css-text-3].

Name: -epub-line-break
Value: auto | loose | normal | strict

E.2.3 The -epub-text-align-last property

This property is a prefixed version of the text-align-last property [css-text-3].

Name: -epub-text-align-last
Value: auto | start | end | left | right | center | justify

E.2.4 The -epub-word-break property

This property is a prefixed version of the word-break property [css-text-3].

Name: -epub-word-break
Value: normal | keep-all | break-all

E.2.5 The text-transform property

This section defines a prefixed value for the text-transform property [css-text-3].

Name: text-transform
Value: -epub-fullwidth

For compatibility with existing content, the text-transform property also supports the deprecated -epub-fullwidth keyword. When specified, this has the same effect as text-transform: full-width.

E.3 CSS text decoration level 3

This section describes the -epub- prefixed properties for [css-text-decor-3].

E.3.1 The -epub-text-emphasis-color property

This property is a prefixed version of the text-emphasis-color property [css-text-decor-3].

Name: -epub-text-emphasis-color
Value: <color>

E.3.2 The -epub-text-emphasis-position property

This property is a prefixed version of the text-emphasis-position property [css-text-decor-3].

Name: -epub-text-emphasis-position
Value: [ over | under ] && [ right | left ]

E.3.3 The -epub-text-emphasis-style property

This property is a prefixed version of the text-emphasis-style property [css-text-decor-3].

Name: -epub-text-emphasis-style
Value: none | [ [ filled | open ] || [ dot | circle | double-circle | triangle | sesame ] ] | <string>

E.3.4 The -epub-text-underline-position property

This property is a prefixed version of the text-underline-position property [css-text-decor-3].

Name: -epub-text-underline-position
Value: auto | [ under || [ left | right ] ] | alphabetic

For compatibility with existing content, the -epub-text-underline-position property also supports the deprecated alphabetic keyword. When specified, this has the same effect as text-underline-position: auto.

F. The viewport meta tag

F.1 Introduction

This section is non-normative.

Because the Safari HTML definition of the viewport meta tag, which earlier versions of EPUB 3 used, is not an officially recognized standard, this specification defines a basic syntax that allows EPUB creators to express width and height dimensions for rendering fixed-layout documents.

The syntax of this grammar is also influenced by the parsing algorithm for the viewport meta tag, as defined in [css-viewport-1].

The syntax is intentionally left as generic as possible as it is not in this specification's scope to define all the possible properties and values. It only defines the basic requirements for defining a property and value pair as well as the possible separators between expressions.

F.2 Syntax

For fixed-layout documents, a viewport meta tag [html] MUST have name and content attributes that conform to the following definition:

name
The value of the name attribute [html] after whitespace normalization [xml] MUST be viewport.
content

The value of the content attribute [html] after whitespace normalization [xml] MUST be of the following form:

(EBNF productions ISO/IEC 14977)
All terminal symbols are in the Unicode Block 'Basic Latin' (U+0000 to U+007F).
viewport = property, { sep, property } ;
property = name, [ assign, value ] ;
name = ? character data ? ;
value = ? character data ? ;
sep = sep-char, { sep-char } ;
sep-char = ( ";" | "," | space ) ;
assign = [ space ], "=", [ space ] ;
space = #x20 ;

The only restriction on property names and values is that they MUST NOT contain separator characters or the assignment character.

The authoring requirements in this section apply after whitespace normalization [xml] (i.e., after a reading system strips leading and trailing whitespace and compacts all instances of multiple whitespace within the attribute to single spaces). EPUB creators MAY include any valid ASCII whitespace [infra] in the authored tag so long as the result is valid to this definition.

There are no restrictions on any other attributes allowed on the meta element by the [html] grammar.

Note

For more information about specifying the required height and width properties, and their required values, refer to 8.2.2.6 Content document dimensions.

Although the viewport meta tag allows EPUB creators to use properties other than height and width, and to not include values for these properties, such use is strongly discouraged. Setting other properties may have unintended consequences on the rendering of fixed-layout documents.

G. Schemas

This section is non-normative.

G.1 Package document schema

A schema for package documents is available at https://github.com/w3c/epubcheck/tree/master/src/main/resources/com/adobe/epubcheck/schema/30/package-30.nvdl.

Validation using this schema requires a processor that supports [nvdl], [relaxng-schema], [isoschematron] and [xmlschema-2].

Note

The NVDL schema layer can be substituted by a multi-pass validation using the embedded RELAX NG and ISO Schematron schemas alone.

Note

These schemas may be updated and corrected outside of formal revisions of this specification. As a result, they are subject to change at any time.

G.2 OCF schemas

G.2.1 Schema for container.xml

A schema for container.xml files is available at https://github.com/w3c/epubcheck/tree/master/src/main/resources/com/adobe/epubcheck/schema/30/ocf-container-30.rnc.

Validation using this schema requires a processor that supports [relaxng-schema] and [xmlschema-2].

G.2.2 Schema for encryption.xml

The schema for encryption.xml files is included in [xmlsec-rngschema-20130411].

G.2.3 Schema for signatures.xml

The schema for signatures.xml files is included in [xmlsec-rngschema-20130411].

G.3 Media overlays schema

A schema for media overlay documents is available at https://github.com/w3c/epubcheck/tree/master/src/main/resources/com/adobe/epubcheck/schema/30/media-overlay-30.nvdl.

Validation using this schema requires a processor that supports [nvdl], [relaxng-schema], [isoschematron] and [xmlschema-2].

Note

The NVDL schema layer can be substituted by a multi-pass validation using the embedded RELAX NG and ISO Schematron schemas alone.

H. Detailed examples

This section is non-normative.

H.1 Resources

Consider the following extracts of a package document and an XHTML content document:

<package …>
    <metadata …><link rel="record" 
            href="meta/data.xml" 
            media-type="application/marc"/>
        <link rel="record" 
            href="https://www.example.org/meta/data2.xml" 
            media-type="application/marc"/></metadata>
    <manifest><item id="page"
            href="page.xhtml"
            media-type="application/xhtml+xml"/>
        <item id="nav"
            href="nav.xhtml"
            media-type="application/xhtml+xml"
            properties="nav"/>
        <item id="style"        
            href="style.css"
            media-type="text/css"/>
        <item id="font_otf"
            href="fonts/font-file.otf"
            media-type="font/otf"/>
        <item id="font_otf_remote"
            href="https://www.example.org/fonts/font-file2.otf"
            media-type="font/otf"/>
        <item id="font_cff"
            href="fonts/font-file.cff"
            media-type="font/sfnt"/>
        <item id="pls"
            href="speech/cmn.pls"
            media-type="application/pls+xml"/>
        <item id="image_1"
            href="media/image_1.png"
            media-type="image/png"/>
        <item id="image_2"
            href="media/image_2.png"
            media-type="image/png"
            fallback="image_desc"/>
        <item id="image_desc"
            href="image_desc.xhtml"
            media-type="application/xhtml+xml"/>
        <item id="image_3_heic"
            href="media/image_3.heic"
            media-type="image/heic"/>
        <item id="image_3_png"
            href="media/image_3.png"
            media-type="image/png"/>
        <item id="widget"
            href="widget.xhtml"
            media-type="application/xhtml+xml"/></manifest>
    <spine><itemref idref="page"/>
        <itemref idref="image_2"/></spine>
</package>
	
<html …>
    <head …><link rel="stylesheet" type="text/css" href="style.css"/>
        <link rel="pronunciation" type="application/pls+xml" href="speech/cmn.pls"/></head>
    <body>
        <img src="media/image_1.png"/><a href="media/image_2.png"></a><picture>
            <source srcset="media/image_3.heic" type="image/heic"/>
            <img src="media/image_3.png"/>
        </picture><iframe src="widget.xhtml"></iframe><a href="https://www.example.org/some_content"></a>
    </body>
</html>

The various resources in the EPUB publication can be categorized as follows. (Refer to 3. Publication resources for more information about these categories.)

meta/data.xml

The resource is a metadata record, stored in the EPUB container. It is linked via a link element in the package document metadata. It is therefore a linked resource on the manifest plane (i.e., it is not listed in the manifest). It is not present on any other planes.

https://www.example.org/meta/data2.xml

The resource is a metadata record, stored remotely. It is linked via a link element in the package document metadata. It is therefore a linked resource on the manifest plane (i.e., it is not listed in the manifest). It is not present on any other planes.

page.xhtml

The resource is an XHTML document. It is listed in the spine. It is a publication resource on the manifest plane, a container resource, an EPUB content document on the spine plane, and is not present on the content plane. No fallback is necessary.

nav.xhtml

The resource is the EPUB navigation document. It is not listed in the spine. It is a publication resource on the manifest plane, a container resource, and is not present on either the spine plane or the content plane. No fallback is necessary.

style.css

The resource is a CSS file. It is not listed in the spine but is referenced from an [html] link element. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and is a core media type resource on the content plane. No fallback is necessary.

fonts/font-file.otf

The resource is an OpenType font file. It is not listed in the spine but is referenced from a CSS file. It is a publication resource on the manifest plane, is a container resource, is not present on the spine plane, and is a core media type resource on the content plane. No fallback is necessary.

https://www.example.org/fonts/font-file2.otf

The resource is an OpenType font file. It is not listed in the spine but is referenced from a CSS file. It is a publication resource on the manifest plane, is a remote resource, is not present on the spine plane, and is a core media type resource on the content plane. No fallback is necessary.

fonts/font-file.cff

The resource is a font file in Compact Font Format. It is not listed in the spine but is referenced from a CSS file. Its media type is not listed as a core media type. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and is an exempt resource on the content plane. No fallback is necessary.

speech/cmn.pls

The resource is a Pronunciation Lexicon file. It is not listed in the spine but is referenced from an [html] link element. It is a publication resource on the manifest plane, a container resource, not present on the spine plane, and is an exempt resource on the content plane. No fallback is necessary.

media/image_1.png

The resource is a PNG image file. It is not listed in the spine but is referenced from an [html] img element. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and is a core media type resource on the content plane. No fallback is necessary.

media/image_2.png

The resource is a PNG image file. It is referenced via an [html] a element. Because it is referenced from a hyperlink, it must be listed in the spine. It is a publication resource on the manifest plane, a container resource, a foreign content document on the spine plane, and a core media type resource on the content plane. As a foreign content document, a fallback is required, which is provided via a manifest fallback.

image_desc.xhtml

The resource is an XHTML document. It is the "target" of a manifest fallback, so it is not explicitly listed in the spine (but it "replaces" the existing spine item when needed). It is a publication resource on the manifest plane, a container resource, an EPUB content document on the spine plane, and, because it is not "used" when rendering another EPUB content document, it is not present on the content plane. No fallback is necessary.

media/image_3.heic

The resource is a High Efficiency (HEIC) image file. It is not listed in the spine but is referenced from an [html] source element. Its media type is not listed as a core media type. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and is a foreign resource on the content plane. As a foreign resource, a fallback is required, which is provided via the sibling [html] img element in an [html] picture element.

media/image_3.png

The resource is a PNG image file. It is not listed in the spine but is referenced from an [html] img element that is used as an intrinsic fallback of the [html] picture element. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and is a core media type resource on the content plane. No fallback is necessary.

widget.xhtml

The resource is an XHTML document. It is not listed in the spine but is referenced from an [html] iframe element. It is a publication resource on the manifest plane, a container resource, is not present on the spine plane, and, because it is "used" when rendering another EPUB content document, is a core media type resource on the content plane. No fallback is necessary.

https://www.example.org/some_content

The resource is referenced via an [html] a element and is not stored in the EPUB container. Reading systems will normally open this link via a separate browser instance. It is not on any planes defined by this specification.

Additional examples on the usage of different types of resources can be found in 5.6.2.2 Examples.
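The spine plane membership described above, including the case where image_desc.xhtml joins the spine plane only as a manifest fallback, can be sketched in Python. This is an informal illustration of the categorization in this section, not a normative algorithm; the function name spine_plane_ids and the abbreviated SAMPLE package document are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_plane_ids(package_xml):
    """Return ids of manifest items reachable from the spine, directly or
    through a manifest fallback chain."""
    root = ET.fromstring(package_xml)
    # Map each manifest item id to its fallback id (None when absent).
    fallback = {
        item.get("id"): item.get("fallback")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    on_spine = set()
    for ref in root.findall("opf:spine/opf:itemref", OPF_NS):
        current = ref.get("idref")
        # Follow the fallback chain; stop on unknown ids, on the end of
        # the chain, or when an id was already visited (cycle guard).
        while current in fallback and current not in on_spine:
            on_spine.add(current)
            current = fallback[current]
    return on_spine


SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf">
  <manifest>
    <item id="page" href="page.xhtml" media-type="application/xhtml+xml"/>
    <item id="image_2" href="media/image_2.png" media-type="image/png"
          fallback="image_desc"/>
    <item id="image_desc" href="image_desc.xhtml"
          media-type="application/xhtml+xml"/>
    <item id="widget" href="widget.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="page"/>
    <itemref idref="image_2"/>
  </spine>
</package>"""

print(sorted(spine_plane_ids(SAMPLE)))
```

In this sample, widget is in the manifest but is never reached from the spine, which matches the description of widget.xhtml as not present on the spine plane.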

H.2 Scripting contexts

Consider the following example package document:

<package …><manifest><item id="chap01" 
            href="scripted01.xhtml" 
            media-type="application/xhtml+xml"
            properties="scripted"/>
        <item id="inset01" 
            href="scripted02.xhtml" 
            media-type="application/xhtml+xml"
            properties="scripted"/>
        <item id="slideshowjs" 
            href="slideshow.js" 
            media-type="text/javascript"/>
    </manifest>
    
    <spine …>
        <itemref idref="chap01"/></spine></package>

and the following file scripted01.xhtml:

<html …>
    <head><script type="text/javascript">
            alert("Reading system name: " + navigator.epubReadingSystem.name);
        </script>
    </head>
    <body><iframe src="scripted02.xhtml" … /></body>
</html>

and the following file scripted02.xhtml:

<html …>
    <head><script type="text/javascript" src="slideshow.js"></script>
    </head>
    <body></body>
</html>

From these examples, it is true that:

H.3 Packaged EPUB

This example demonstrates the use of the OCF format to contain a signed and encrypted EPUB publication within an OCF ZIP container.

Ordered list of files in the OCF ZIP container:

mimetype
META-INF/container.xml
META-INF/signatures.xml
META-INF/encryption.xml
EPUB/As_You_Like_It.opf
EPUB/book.html
EPUB/nav.html
EPUB/images/cover.png

The contents of the mimetype file

application/epub+zip
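As an illustration of the file ordering above, the following Python sketch writes a ZIP archive with an uncompressed mimetype entry first and then checks for that entry. The helper names build_container and has_leading_mimetype are hypothetical, and requiring the entry to be stored uncompressed at offset 0 reflects the OCF rules as understood here, so treat the check as a sketch rather than a complete conformance test.

```python
import io
import zipfile

MIMETYPE = b"application/epub+zip"


def build_container(files):
    """Write a ZIP archive whose first entry is an uncompressed mimetype file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr(zipfile.ZipInfo("mimetype"), MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def has_leading_mimetype(container):
    """Check that the archive starts with a stored mimetype entry."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        entries = archive.infolist()
        if not entries:
            return False
        first = entries[0]
        return (
            first.filename == "mimetype"
            and first.header_offset == 0  # physically first in the file
            and first.compress_type == zipfile.ZIP_STORED
            and archive.read(first) == MIMETYPE
        )


container = build_container({"META-INF/container.xml": b"<container/>"})
print(has_leading_mimetype(container))
```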

The contents of the META-INF/container.xml file

<?xml version="1.0"?>
<container
    version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile
          full-path="EPUB/As_You_Like_It.opf"
          media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>
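A reading system locates the package document through the rootfile element shown above. A minimal Python sketch of that lookup follows; the function name package_document_paths is hypothetical and error handling is omitted.

```python
import xml.etree.ElementTree as ET

OCF_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}
PACKAGE_MEDIA_TYPE = "application/oebps-package+xml"


def package_document_paths(container_xml):
    """Return the full-path of every rootfile that is a package document."""
    root = ET.fromstring(container_xml)
    return [
        rootfile.get("full-path")
        for rootfile in root.findall("ocf:rootfiles/ocf:rootfile", OCF_NS)
        if rootfile.get("media-type") == PACKAGE_MEDIA_TYPE
    ]


CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="EPUB/As_You_Like_It.opf"
          media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>"""

print(package_document_paths(CONTAINER_XML))
```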

The contents of the META-INF/signatures.xml file

<signatures
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <Signature
       Id="AsYouLikeItSignature"
       xmlns="http://www.w3.org/2000/09/xmldsig#">
        
      <!--
           SignedInfo is the information that is actually signed.
           In this case, the SHA-1 algorithm is used to sign the 
           canonical form of the XML documents enumerated in the
           Object element below.
      -->
      
      <SignedInfo>
         <CanonicalizationMethod
             Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
         <SignatureMethod
             Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
         <Reference
             URI="#AsYouLikeIt">
            <DigestMethod
                Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
            <DigestValue></DigestValue>
         </Reference>
      </SignedInfo>
      
      <!--
           The signed value of the digest above, using the DSA 
           algorithm
      -->
      <SignatureValue></SignatureValue>
      
      <!--
           The key used to validate the signature
      -->
      <KeyInfo>
         <KeyValue>
            <DSAKeyValue>
               <P></P>
               <Q></Q>
               <G></G>
               <Y></Y>
            </DSAKeyValue>
         </KeyValue>
      </KeyInfo>
      
      <!--
           The list of resources to sign (note that the canonical
           form of XML documents is signed, while the binary form
           of all other resources is used)
      -->
      <Object>
         <Manifest
             Id="AsYouLikeIt">
            <Reference
                URI="EPUB/As_You_Like_It.opf">
               <Transforms>
                  <Transform
                      Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
               </Transforms>
               <DigestMethod
                   Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
               <DigestValue>
               </DigestValue>
            </Reference>
            
            <Reference URI="EPUB/book.html">
               <Transforms>
                  <Transform
                      Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
               </Transforms>
               <DigestMethod
                   Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
               <DigestValue>
               </DigestValue>
            </Reference>
            
            <Reference
                URI="EPUB/images/cover.png">
               <DigestMethod
                   Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
               <DigestValue>
               </DigestValue>
            </Reference>
         </Manifest>
      </Object>
   </Signature>
</signatures>
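The empty DigestValue elements above would hold base64-encoded digests. The following Python sketch covers only the binary case, such as EPUB/images/cover.png, whose Reference has no Transforms. XML resources would first need the canonicalization named in their Transform, which this sketch deliberately omits. The function name sha1_digest_value is hypothetical.

```python
import base64
import hashlib


def sha1_digest_value(resource_bytes):
    """Return the base64-encoded SHA-1 digest of a resource's raw bytes."""
    digest = hashlib.sha1(resource_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


# Example with placeholder bytes standing in for a binary resource.
print(sha1_digest_value(b"placeholder image bytes"))
```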

The contents of the META-INF/encryption.xml file

<?xml version="1.0"?>
<encryption
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
    xmlns:enc="http://www.w3.org/2001/04/xmlenc#"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#">

   <!--
        The RSA-encrypted AES-128 symmetric key used to encrypt
        data enumerated in EncryptedData blocks below
   -->
   <enc:EncryptedKey
       Id="EK">
      <enc:EncryptionMethod
          Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-1_5"/>
      <ds:KeyInfo>
         <ds:KeyName>
            John Smith
         </ds:KeyName>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherValue>
            xyzabc…
         </enc:CipherValue>
      </enc:CipherData>
   </enc:EncryptedKey>

   <!--
        Each EncryptedData block identifies a single resource
        that has been encrypted using the AES-128 algorithm.
        The data remains stored, in its encrypted form, in the
        original file within the container.
   -->
   <enc:EncryptedData Id="ED1">
      <enc:EncryptionMethod
          Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod
             URI="#EK"
             Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference
             URI="EPUB/book.html"/>
      </enc:CipherData>
   </enc:EncryptedData>

   <enc:EncryptedData Id="ED2">
      <enc:EncryptionMethod
          Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod
             URI="#EK" Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference
             URI="EPUB/images/cover.png"/>
      </enc:CipherData>
   </enc:EncryptedData>
</encryption>
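To show how the EncryptedData blocks identify encrypted resources, the following Python sketch collects the CipherReference URIs from an encryption.xml document. The function name encrypted_resource_uris and the abbreviated ENCRYPTION_XML sample are hypothetical; no decryption is attempted.

```python
import xml.etree.ElementTree as ET

ENC_NS = {"enc": "http://www.w3.org/2001/04/xmlenc#"}


def encrypted_resource_uris(encryption_xml):
    """Return the container paths referenced by EncryptedData blocks."""
    root = ET.fromstring(encryption_xml)
    path = "enc:EncryptedData/enc:CipherData/enc:CipherReference"
    return [reference.get("URI") for reference in root.findall(path, ENC_NS)]


ENCRYPTION_XML = """<?xml version="1.0"?>
<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
    xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
   <enc:EncryptedData Id="ED1">
      <enc:CipherData>
         <enc:CipherReference URI="EPUB/book.html"/>
      </enc:CipherData>
   </enc:EncryptedData>
   <enc:EncryptedData Id="ED2">
      <enc:CipherData>
         <enc:CipherReference URI="EPUB/images/cover.png"/>
      </enc:CipherData>
   </enc:EncryptedData>
</encryption>"""

print(encrypted_resource_uris(ENCRYPTION_XML))
```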

The contents of the EPUB/As_You_Like_It.opf file

<?xml version="1.0"?>
<package
    version="3.0"
    xml:lang="en"
    xmlns="http://www.idpf.org/2007/opf"
    unique-identifier="pub-id">
    
   <metadata
       xmlns:dc="http://purl.org/dc/elements/1.1/">
      
      <dc:identifier
          id="pub-id">
         urn:uuid:B9B412F2-CAAD-4A44-B91F-A375068478A0
      </dc:identifier>
      
      <dc:language>
         en
      </dc:language>
      
      <dc:title>
         As You Like It
      </dc:title>
       
      <dc:creator
          id="creator">
         William Shakespeare
      </dc:creator>
      
      <meta
          property="dcterms:modified">
         2000-03-24T00:00:00Z
      </meta>
      
      <dc:publisher>
         Project Gutenberg
      </dc:publisher>
      
      <dc:date>
         2000-03-24
      </dc:date>
      
      <meta
          property="dcterms:dateCopyrighted">
         9999-01-01
      </meta>
      
      <dc:identifier
          id="isbn13">
         urn:isbn:9780741014559
      </dc:identifier>
      
      <dc:identifier
          id="isbn10">
         0-7410-1455-6
      </dc:identifier>
      
      <link
          rel="xml-signature"
          href="../META-INF/signatures.xml#AsYouLikeItSignature"/>
   </metadata>
    
   <manifest>
      <item id="r4915" 
          href="book.html" 
          media-type="application/xhtml+xml"/>
      <item id="r7184" 
          href="images/cover.png" 
          media-type="image/png"/>
      <item id="nav" 
          href="nav.html" 
          media-type="application/xhtml+xml" 
          properties="nav"/>
   </manifest>
   
   <spine>
      <itemref
          idref="r4915"/>
   </spine>
</package>
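The unique-identifier attribute on the package element above points at one of several dc:identifier elements. A minimal Python sketch of resolving it follows. The function name unique_identifier is hypothetical, and stripping the surrounding whitespace from the element text is an assumption made so that the indented value in the example compares cleanly.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def unique_identifier(package_xml):
    """Return the text of the dc:identifier named by unique-identifier."""
    root = ET.fromstring(package_xml)
    target_id = root.get("unique-identifier")
    if target_id is None:
        return None
    for identifier in root.findall("opf:metadata/dc:identifier", NS):
        if identifier.get("id") == target_id:
            # Assumption: leading and trailing whitespace is insignificant.
            return (identifier.text or "").strip()
    return None


PACKAGE_XML = """<?xml version="1.0"?>
<package version="3.0" xml:lang="en"
    xmlns="http://www.idpf.org/2007/opf" unique-identifier="pub-id">
   <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:identifier id="pub-id">
         urn:uuid:B9B412F2-CAAD-4A44-B91F-A375068478A0
      </dc:identifier>
      <dc:identifier id="isbn13">
         urn:isbn:9780741014559
      </dc:identifier>
   </metadata>
</package>"""

print(unique_identifier(PACKAGE_XML))
```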

H.4 Clock values

The following are examples of allowed clock values:

I. Media type registrations

I.1 The application/oebps-package+xml media type

This appendix registers the media type application/oebps-package+xml for the EPUB package document. This registration supersedes RFC4839 (see https://www.rfc-editor.org/rfc/rfc4839).

The package document is an XML file that describes an EPUB publication. It identifies the resources in the EPUB publication and provides metadata information. The package document and its related specifications are maintained and defined by the World Wide Web Consortium (W3C).

MIME media type name:

application

MIME subtype name:

oebps-package+xml

Required parameters:

None.

Optional parameters:

None.

Encoding considerations:

8bit if UTF-8; binary if UTF-16.

Package documents are in XML, represented either in UTF-8 or UTF-16. When the package document is written in UTF-8, the file is 8bit compatible. When it is written in UTF-16, the binary content-transfer-encoding must be used. For further details, see [rfc7303].

Security considerations:

Package documents contain well-formed XML conforming to the XML 1.0 specification.

Clearly, it is possible to author malicious files which, for example, contain malformed data. Most XML parsers protect themselves from such attacks by rigorously enforcing conformance.

All processors that read package documents should rigorously check the size and validity of data retrieved.

There is no current provision in the EPUB 3 specification for encryption, signing, or authentication within the package document format.

Interoperability considerations:

None.

Published specification:

This media type registration is for the EPUB package document, as described by the EPUB 3 specification located at https://www.w3.org/TR/epub-33/.

The EPUB 3 specification supersedes the Open Packaging Format 2.0.1 specification, which is located at https://idpf.org/epub/20/spec/OPF_2.0.1_draft.htm and which also uses the application/oebps-package+xml media type.

Applications which use this media type:

This media type is in wide use for the distribution of ebooks in the EPUB format.

Additional information:
Magic number(s):

none

File extension(s):

.opf

Macintosh File Type Code(s):

TEXT

Fragment identifiers:

EPUB Canonical Fragment Identifiers are custom fragment identifiers defined for EPUB Publications. They may be used to refer to arbitrary content within any publication resource defined for the publication. These identifiers are defined at https://idpf.org/epub/linking/cfi/.

Person & email address to contact for further information:

EPUB 3 Working Group (public-epub-wg@w3.org)

Intended usage:

COMMON

Author/Change controller:

World Wide Web Consortium (W3C)

I.2 The application/epub+zip media type

This appendix registers the media type application/epub+zip for the EPUB Open Container Format (OCF).

An OCF ZIP container file, also called an EPUB container, is a container technology based on the ZIP archive format (see https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT). It is used to encapsulate the EPUB publication. OCF and its related standards are maintained and defined by the World Wide Web Consortium (W3C).

MIME media type name:

application

MIME subtype name:

epub+zip

Required parameters:

None.

Optional parameters:

None.

Encoding considerations:

OCF ZIP container files are binary files encoded in the application/zip media type.

Security considerations:

All processors that read OCF ZIP container files should rigorously check the size and validity of data retrieved.

In addition, because of the various content types that can be embedded in OCF ZIP container files, application/epub+zip may describe content that poses security implications beyond those noted here. However, only in cases where the processor recognizes and processes the additional content, or where further processing of that content is dispatched to other processors, would security issues potentially arise. In such cases, matters of security would fall outside the domain of this registration document.

Security considerations that apply to application/zip also apply to OCF ZIP container files.

Interoperability considerations:

None.

Published specification:

This media type registration is for the EPUB Open Container Format (OCF), as described by the EPUB 3 specification located at https://www.w3.org/TR/epub-33/.

The EPUB 3 specification supersedes both RFC 4839 and the Open Container Format 2.0.1 specification, which is located at https://idpf.org/epub/20/spec/OCF_2.0.1_draft.doc, and which also uses the application/epub+zip media type.

Applications that use this media type:

This media type is in wide use for the distribution of ebooks in the EPUB format.

Additional information:
Magic number(s):

0: PK 0x03 0x04, 30: mimetype, 38: