OCF Functionality

From Digital Publishing Interest Group

This page provides an overview of all functionality built on top of ZIP in the OCF container -- the packaging mechanism for EPUB. It does not address restrictions that OCF places on ZIP archives.

See EPUB Open Container Format (OCF) 3.0.1 for more details.

Magic Numbers

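The required mimetype file, stored uncompressed as the first entry in the root of the zip container, places two fixed strings at known byte offsets, which serve as magic numbers: the file name "mimetype" at offset 30 and the media type "application/epub+zip" at offset 38.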

These numbers allow the otherwise generic zip container to be identified as containing an epub, without resorting to file extensions.
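As a minimal sketch of how a tool might use these offsets, the following Python checks the raw bytes of an archive. The function names looks_like_epub and build_minimal_container are hypothetical, and the check assumes the mimetype entry is the first one in the archive, stored without compression and without an extra field in its local header.

```python
import io
import zipfile

MIMETYPE_NAME = b"mimetype"
EPUB_MEDIA_TYPE = b"application/epub+zip"


def looks_like_epub(data: bytes) -> bool:
    """Check the fixed strings at byte offsets 30 and 38 (hypothetical helper)."""
    return (
        data[:4] == b"PK\x03\x04"           # ZIP local file header signature
        and data[30:38] == MIMETYPE_NAME    # file name of the first entry
        and data[38:58] == EPUB_MEDIA_TYPE  # uncompressed content of that entry
    )


def build_minimal_container() -> bytes:
    """Build an in-memory archive whose first entry is the stored mimetype file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            zipfile.ZipInfo("mimetype"),
            EPUB_MEDIA_TYPE,
            compress_type=zipfile.ZIP_STORED,
        )
    return buffer.getvalue()
```

The offsets only line up because the local file header is 30 bytes long and the 8-byte file name follows it directly; a compressed mimetype entry, or one that is not first, would fail this check.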

Abstract Container

OCF defines the "abstract container" (virtual file system) within the zip archive. The root of the abstract container is the root of the zip container, and must contain the mimetype file and META-INF directory.
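The requirement on the container root can be checked with a few lines of Python. This is only an illustrative sketch: missing_required_entries and build_archive are hypothetical names, and the check looks at entry names only, not at the content of the files.

```python
import io
import zipfile


def missing_required_entries(archive: zipfile.ZipFile) -> list[str]:
    """Return the required root entries that the archive lacks."""
    names = archive.namelist()
    missing = []
    if "mimetype" not in names:
        missing.append("mimetype")
    # Directories may be implicit in ZIP, so accept any entry under META-INF/.
    if not any(name.startswith("META-INF/") for name in names):
        missing.append("META-INF/")
    return missing


def build_archive(entries: dict[str, str]) -> zipfile.ZipFile:
    """Build an in-memory archive from a name-to-content mapping (for demonstration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    buffer.seek(0)
    return zipfile.ZipFile(buffer)
```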

Relative referencing is used in the abstract container, except for files in the META-INF directory, which use the container root as their base IRI.

The abstract container also places requirements on file and path naming: utf-8 encoding, file names of no more than 255 bytes, path names of no more than 65535 bytes, a restricted character set, plus character normalization and uniqueness requirements.
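A rough sketch of these naming checks in Python follows. The helper names are hypothetical. The limits are counted in UTF-8 bytes, which is how I read the OCF specification, and FORBIDDEN_CHARS is only an assumed subset of the restricted character set (the specification lists further Unicode ranges). The uniqueness check approximates the normalization rules with NFC plus case folding.

```python
import unicodedata

MAX_FILE_NAME_BYTES = 255
MAX_PATH_NAME_BYTES = 65535
# Assumed subset of the restricted characters; see the OCF spec for the full list.
FORBIDDEN_CHARS = set('/"*:<>?\\')


def file_name_problems(name: str) -> list[str]:
    """Return a list of problems found in a single file name."""
    problems = []
    if len(name.encode("utf-8")) > MAX_FILE_NAME_BYTES:
        problems.append("file name longer than 255 bytes")
    # Category "Cc" covers the C0 and C1 control ranges and DEL.
    if any(ch in FORBIDDEN_CHARS or unicodedata.category(ch) == "Cc" for ch in name):
        problems.append("file name contains a restricted character")
    if name.endswith("."):
        problems.append("file name ends with a full stop")
    return problems


def path_problems(path: str) -> list[str]:
    """Check the whole path length, then each segment as a file name."""
    problems = []
    if len(path.encode("utf-8")) > MAX_PATH_NAME_BYTES:
        problems.append("path name longer than 65535 bytes")
    for segment in path.split("/"):
        problems.extend(file_name_problems(segment))
    return problems


def names_are_unique(names_in_directory: list[str]) -> bool:
    """Approximate the uniqueness rule: compare names after NFC and case folding."""
    folded = [unicodedata.normalize("NFC", n).casefold() for n in names_in_directory]
    return len(folded) == len(set(folded))
```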

META-INF

The META-INF directory contains one or more special files that contain meta information about the publication: where to locate the package files for each rendition, how resources are encrypted, etc. The following sections provide a quick synopsis of these files.

Rendition Identification

The container.xml file provides the means of locating each package document in the container, where each package document defines one rendition of the content.
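To illustrate, the sketch below extracts the package document paths from a container.xml file using Python's standard library. The sample document and the function name package_document_paths are made up for the example; the namespace and the full-path and media-type attributes are those used by OCF.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}
PACKAGE_MEDIA_TYPE = "application/oebps-package+xml"

# Hypothetical container.xml listing two renditions.
SAMPLE_CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
    <rootfile full-path="EPUB/fixed.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_document_paths(container_xml: str) -> list[str]:
    """Return the full-path of every rootfile that points to a package document."""
    root = ET.fromstring(container_xml)
    return [
        rootfile.get("full-path")
        for rootfile in root.findall("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
        if rootfile.get("media-type") == PACKAGE_MEDIA_TYPE
    ]
```

The paths are returned in document order, and each one is relative to the container root.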

It also allows for selection of the most appropriate rendition using a set of selection attributes on rootfile elements, as defined in the Multiple Renditions specification.

An extensible linking mechanism allows publication-wide resources to be included (e.g., a mapping document to retain the user's location when switching from one rendition to another).

Resource Encryption

The encryption.xml file provides encryption information for each encrypted resource in the container (the resource that has been encrypted, the encryption scheme, and public key). OCF places no restrictions on the encryption method used.

The epub as a whole cannot be encrypted, since that would require a default encryption method. To ensure interoperability, the OCF specification therefore prohibits encryption of a minimal set of files: the mimetype file, the package documents, and the META-INF files defined in this section.

The syntax of this file is defined in XML Encryption Syntax and Processing Version 1.1.
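As an illustration, the following sketch lists the encrypted resources declared in an encryption.xml file and flags any that target a file that is not allowed to be encrypted. The sample document, the algorithm chosen in it, and the function names are assumptions made for the example. Checking package documents would also require reading container.xml, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

NS = {
    "ocf": "urn:oasis:names:tc:opendocument:xmlns:container",
    "enc": "http://www.w3.org/2001/04/xmlenc#",
}

# Hypothetical encryption.xml declaring one encrypted image.
SAMPLE_ENCRYPTION_XML = """<encryption
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
    xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#aes128-cbc"/>
    <enc:CipherData>
      <enc:CipherReference URI="EPUB/images/cover.jpg"/>
    </enc:CipherData>
  </enc:EncryptedData>
</encryption>
"""


def encrypted_resources(encryption_xml: str) -> dict[str, str]:
    """Map each encrypted resource path to its declared algorithm identifier."""
    root = ET.fromstring(encryption_xml)
    resources = {}
    for data in root.findall("enc:EncryptedData", NS):
        method = data.find("enc:EncryptionMethod", NS)
        reference = data.find("enc:CipherData/enc:CipherReference", NS)
        if reference is not None:
            algorithm = method.get("Algorithm", "") if method is not None else ""
            resources[reference.get("URI")] = algorithm
    return resources


def disallowed_targets(resources: dict[str, str]) -> list[str]:
    """Return paths that point at the mimetype file or into META-INF."""
    return [
        path for path in resources
        if path == "mimetype" or path.startswith("META-INF/")
    ]
```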

Publication Manifest

The manifest.xml file is currently unused in epub, but allows for a listing of all resources for all renditions.

The resources needed for each rendition are specified in the rendition's package document, so epubs generally lack a universal resource list.

Publication Metadata

The metadata.xml file is an optional file that allows publication metadata to be expressed (i.e., metadata common to the publication, not individual renditions of it).

Currently it is only required when multiple renditions are present, and only a release identifier is required (dc:identifier + dcterms:modified).

Rights Management

The rights.xml file is an optional file that contains rights management information.

This file is currently not standardized for use in epub.

Signatures

The signatures.xml file is an optional file that holds digital signatures for the container and its contents.

The syntax of this file is defined in XML-Signature Syntax and Processing Version 1.1.

Resource Obfuscation

OCF defines an algorithm for obfuscating resources in the container (typically fonts). Each resource that is obfuscated is identified in the encryption.xml file.

The algorithm is trivial to undo and is not intended as a replacement for DRM; it is meant only for cases where a modest measure of protection is sufficient.
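The sketch below shows the algorithm as I understand it from the OCF specification, for a single-rendition publication: the key is the SHA-1 digest of the publication's unique identifier with whitespace removed, and the first 1040 bytes of the resource are XORed with that key, repeated as needed. The function names are hypothetical, and the details should be checked against the specification before relying on them.

```python
import hashlib

OBFUSCATED_PREFIX_LENGTH = 1040  # only the start of the resource is altered
WHITESPACE = " \t\r\n"           # characters stripped from the identifier


def obfuscation_key(unique_identifier: str) -> bytes:
    """Derive the 20-byte key from the publication's unique identifier."""
    stripped = "".join(ch for ch in unique_identifier if ch not in WHITESPACE)
    return hashlib.sha1(stripped.encode("utf-8")).digest()


def toggle_obfuscation(resource: bytes, key: bytes) -> bytes:
    """XOR the first 1040 bytes with the repeating key; the rest is unchanged."""
    prefix = bytes(
        byte ^ key[index % len(key)]
        for index, byte in enumerate(resource[:OBFUSCATED_PREFIX_LENGTH])
    )
    return prefix + resource[OBFUSCATED_PREFIX_LENGTH:]
```

Because XOR is its own inverse, applying toggle_obfuscation a second time with the same key restores the original bytes, which is why the scheme offers so little protection: the key is derived entirely from an identifier stored in the publication itself.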