W3C

Compound Document by Reference Use Cases and Requirements Version 1.0

W3C Working Draft 9 August 2005

This version:
http://www.w3.org/TR/2005/WD-CDRReqs-20050809/
Latest version:
http://www.w3.org/TR/CDRReqs/
Previous version:
http://www.w3.org/TR/2005/WD-CDRReqs-20050404/
Editors:
Daniel Appelquist, Vodafone Group Services Limited
Timur Mehrvarz, Vodafone Group Services Limited
Antoine Quint, Fuchsia Design (Invited Expert)

Abstract

This document describes the use cases for a framework that combines documents by reference and the set of requirements for such a framework.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the third Working Draft of the Compound Document by Reference Use Cases and Requirements document. It has been produced by the Compound Document Formats Working Group, which is part of the CDF Activity.

This is a Working Draft and is expected to change. The CDF Working Group does not expect this document to become a Recommendation. This document, after review and refinement, will be published and maintained as a Working Group Note.

In parallel, the CDF Working Group is producing a specification that combines document formats by reference, and meets the requirements listed here.

Comments and discussion on this document should be sent to public-cdf@w3.org (public archives).

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Table of Contents

1 Introduction
    1.1 Approach
        1.1.1 Phases of Work
        1.1.2 By Reference
        1.1.3 By Inclusion
    1.2 Definition of Rich Multimedia Content
    1.3 Relationships With Other Technologies
        1.3.1 HTML and XHTML
        1.3.2 SVG
        1.3.3 SMIL
        1.3.4 XLink
        1.3.5 XInclude
        1.3.6 SOAP Attachments and Optimization Technologies
        1.3.7 OASIS's Open Document Format
        1.3.8 XForms
        1.3.9 XHTML + Voice
    1.4 Definition of Terms
    1.5 Who/what will implement the specification
        1.5.1 Authoring tools and content production systems
        1.5.2 User agents, such as browsers and plug-ins
        1.5.3 Intermediary Processors, content renderers
2 Use Cases
    2.1 Web Publishing and Broadcasting
        2.1.1 Interactive News Service
        2.1.2 Web Portal
        2.1.3 News or stock ticker
        2.1.4 Document viewing
        2.1.5 Infotainment
    2.2 Web Applications
        2.2.1 Reservation system
        2.2.2 Order entry system
        2.2.3 On-line shopping
        2.2.4 Survey
        2.2.5 Games
        2.2.6 Interactive maps
    2.3 Resident Applications
        2.3.1 Personal Information Management
        2.3.2 Communication
        2.3.3 Dynamic graphics as background or screen saver in mobile phone
        2.3.4 Document viewing
        2.3.5 Interactive Icon bar
        2.3.6 Interactive Maps
    2.4 Content Authoring, Aggregation and Deployment
        2.4.1 Personal content creation
        2.4.2 Personal content management
        2.4.3 Web logs
        2.4.4 Content aggregation
        2.4.5 Professional content management
    2.5 Navigation
        2.5.1 SVG used as logo or advertisement (no links in SVG).
        2.5.2 SVG embedded interactive map (links in SVG).
        2.5.3 Menu with SVG images that animate and expand when focused (no links in SVG)
        2.5.4 SVG image is embedded interactive part of the Web page (SVG has links)
    2.6 Specific Use Cases
        2.6.1 Synchronization of HTML content with audio visual content:
        2.6.2 Animated SVG icons in HTML menu
        2.6.3 Visualization of polling results in HTML page
3 Requirements
    3.1 High-Level Requirements
        3.1.1 CDR MUST exploit existing specifications, favoring W3C specifications wherever possible and limit the definition of new markup unless absolutely required for integration purposes
        3.1.2 CDR MUST provide the ability for content developers to describe or author rich media content define Rich Multimedia Content
        3.1.3 CDR MUST specify a base set of formats, corresponding profiles and versions
        3.1.4 Each CDR profile and version MUST specify, which formats can be referenced
        3.1.5 CDR MUST specify, for each format, the element used to reference other formats, if any.
        3.1.6 CDR MUST specify generic integration techniques
        3.1.7 CDR MUST support temporal synchronization of dynamic content coming from multiple references, possibly with multiple references to the same source.
        3.1.8 CDR MUST support event mechanisms that cross namespace boundaries
        3.1.9 CDR MUST support scriptability
        3.1.10 CDR MUST say the allowed nesting level of referencing
        3.1.11 CDR MUST explain how scripting interacts between components and the parent document
        3.1.12 CDR profiles MUST specify how event propagation works across namespace boundaries.
        3.1.13 CDR profiles MUST specify how focus traversal works with referenced documents.
        3.1.14 CDR profiles MUST specify how link activation work with referenced document.
        3.1.15 CDR profiles MUST specify triggering of animations across namespaces.
        3.1.16 CDR MUST support fragment identifiers in cross-namespace interaction
        3.1.17 CDR profiles SHOULD provide a method for adding event handlers using declarative markup for the formats it uses
        3.1.18 CDR documents MUST cater for accessibility requirements
        3.1.19 CDR documents MUST support dynamic updating
        3.1.20 CDR must define its integration into the Web Architecture. It must include delivery over HTTP and should also strive to be transport independent
        3.1.21 CDR MUST NOT prevent compression of the data
        3.1.22 CDR MUST define the way soft-keys and accesskeys are handled across document components
        3.1.23 CDR User Agents MUST provide a default font for use by all components
        3.1.24 CDR MUST NOT prevent server-side adaptation
        3.1.25 CDR MUST support limited bandwidth networks and limited capability devices
        3.1.26 CDR Profiles MUST define clear document conformance criteria
        3.1.27 CDR Profiles MUST define clear user agent conformance criteria
        3.1.28 CDR Profiles SHOULD provide a way to know the current loading status of a referenced component
        3.1.29 CDR MUST provide a solution for packaging of self-contained, static content
        3.1.30 CDR MAY provide a solution for packaging of streamed content
    3.2 CDR Profile 1 Requirements (Rich Multimedia Content)
        3.2.1 CDR Profile 1 MUST specify a user interaction model
        3.2.2 CDR Profile 1 MUST explain how a User Agent is able to identify a CDR Profile 1 document
        3.2.3 CDR Profile 1 MUST support 2D scalable vector graphics
        3.2.4 CDR Profile 1 MUST support audio
        3.2.5 CDR Profile 1 SHOULD support video
        3.2.6 CDR Profile 1 MUST support grid, flow, overlapping layouts
        3.2.7 CDR Profile 1 MUST support SVG backgrounds
        3.2.8 CDR Profile 1 MAY support XHTML backgrounds
        3.2.9 CDR Profile 1 MUST support identification of markup and versions in CDF documents
        3.2.10 CDR Profile 1 MUST support scalable diagrams that can be animated and can cause link traversal
        3.2.11 CDR Profile 1 MUST define how to reference SVGT graphics and resources from an XHTML document
        3.2.12 CDR Profile 1 MUST support advertising the specific supported versions of formats and capabilities in headers
        3.2.13 CDR Profile 1 MUST support XHTML as a root/host language
        3.2.14 The XHTML <object> element MUST be used for referring to other formats from XHTML
        3.2.15 CDR Profile 1 MUST define the interaction model for an SVG document referenced by an XHTML document
        3.2.16 CDR Profile 1 MUST define for animated SVG icons to act like HTML images (no need for interactivity, links, zoom and pan)
        3.2.17 CDR Profile 1 MUST define a way for events to trigger SVG animation
        3.2.18 CDR Profile 1 MUST define the process for real-estate negotiation between an XHTML document and a referenced SVG document
        3.2.19 CDR Profile 1 MUST define handling of leftover SVG area
        3.2.20 CDR Profile 1 MUST define system font support in SVG
        3.2.21 CDR Profile 1 SHOULD provide temporal synchronization with dynamic media
        3.2.22 CDR Profile 1 MAY provide functionality to stop and start media objects
        3.2.23 CDR Profile 1 MUST support a unified rendering and processing model
        3.2.24 CDR Profile 1 SHOULD provide a way to play an animation while some referenced components of the Combined Document are loading
        3.2.25 CDR Profile 1 MUST specify the behavior of audio mixing

Appendices

A References (Non-Normative)
B Acknowledgements (Non-Normative)
C Changes Log (Non-Normative)


1 Introduction

1.1 Approach

The Compound Document Formats Working Group is producing recommendations on combining separate component languages (e.g. XML-based languages, elements and attributes from separate vocabularies), such as XHTML, SVG, XForms, MathML, and SMIL, with a focus on user interface markups. Combining user interface markups raises specific problems that the individual markup specifications do not address, such as the propagation of events across markups, the combination of rendering, or the user interaction model for a combined document. The Compound Document Formats Working Group will address problems of this type. The work is divided into phases and into two technical solutions: combining by reference and combining by inclusion.

The group is addressing the semantics of combining markups, which go beyond the mechanics and syntactical elements used to combine them. These semantics are, to a large extent, specific to any two markups being combined. For example, SVG markup can be included in an XHTML document in various ways, and there is a need to define how the combination is done and what it means, especially with regard to the issues mentioned above (such as event propagation, user interaction or rendering). Because defining the combined semantics is complex, because the group needs to deliver a test suite and specific conformance requirements (see charter), and because there is growing demand for support of specific versions of specific markups, the group will initially produce a specification that mandates specific versions of specific markup profiles. However, as explained in the following section, the group will structure its specification work so that generic combination concepts and techniques are part of a framework specification and can easily be referenced and reused in other contexts, or in future specifications this group produces.

1.1.1 Phases of Work

Each phase of work lists a set of languages (with their profiles) that are combined. Because languages differ in their features and properties, different issues arise when different languages are combined. Each phase therefore constrains the set of issues that is solved at that stage.

Standardization starts with a limited set of languages. Once the issues for these languages are solved, work starts on a larger set. Later phases cover more languages, so the corresponding recommendations solve a wider variety of combination issues.

Each phase results in:

  • a new version of the Compound Document Framework specification

  • (at least) one specific Profile specification that uses the corresponding Compound Document Framework. Each Profile specification will include a list of specific supported languages (and profiles thereof) and also include any supplemental integration notes as necessary beyond what is described in the Framework.

1.1.2 By Reference

The first technical solution for combining document formats is combination by reference. This means that documents using different languages (XML vocabularies) are linked by a reference, such as an XLink reference, the XHTML <img> and <object> elements, or the src attribute of an XForms instance. This allows separate languages to work together while allowing the implementations of those languages to remain separate.

The issues to be standardized include, for example, how events flow in a multi-document environment, how scripts access the different documents, and how different languages should cooperate in drawing to the screen (or other media).
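As an illustration of combination by reference, the following minimal Python sketch parses a small XHTML root document and lists the child documents it references through <object> and <img>. The markup, the file names (forecast.svg, forecast.png) and the helper find_references are hypothetical and exist only for this example; nothing here is mandated by this document.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical root document: XHTML that references an SVG child document
# through <object>, with a bitmap <img> as fallback. File names are made up.
ROOT_DOCUMENT = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Weather</title></head>
  <body>
    <p>Forecast for today:</p>
    <object data="forecast.svg" type="image/svg+xml" width="200" height="100">
      <img src="forecast.png" alt="Sunny, 24 degrees"/>
    </object>
  </body>
</html>
"""


def find_references(xhtml_source):
    """Return (element name, referenced URI) pairs in document order."""
    root = ET.fromstring(xhtml_source)
    references = []
    for element in root.iter():
        # ElementTree spells qualified names as "{namespace}local".
        if element.tag == f"{{{XHTML_NS}}}object" and element.get("data"):
            references.append(("object", element.get("data")))
        elif element.tag == f"{{{XHTML_NS}}}img" and element.get("src"):
            references.append(("img", element.get("src")))
    return references


if __name__ == "__main__":
    for name, uri in find_references(ROOT_DOCUMENT):
        print(f"<{name}> references {uri}")
```

The sketch only discovers the references. How events, scripting and rendering behave across the referenced documents is exactly what the framework is expected to define.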

1.1.3 By Inclusion

A subsequent phase will address compound documents by inclusion, that is, combining more than one namespace vocabulary within the same document. An example is a compound document with XHTML as the enclosing root schema that contains nested XForms, SVG and XML Events namespace elements, all in the same document. A compound document by inclusion may also contain references to other separate documents, and may therefore be a combination of inclusion and reference.

The standardization issues that will be addressed in the Compound Documents by Inclusion Recommendation are how different namespace vocabulary implementations will interact within the same document, and how compound documents by inclusion should be loaded and processed by browser user agents.
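To make the idea of several namespace vocabularies in one document concrete, the following Python sketch counts the elements of a small compound document by namespace. The sample markup and the helper count_elements_by_namespace are assumptions made for illustration; a real user agent would dispatch each namespace to the corresponding implementation rather than merely count elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical compound document by inclusion: XHTML is the enclosing root,
# with XForms and SVG elements nested in the same document.
COMPOUND_DOCUMENT = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xf="http://www.w3.org/2002/xforms"
      xmlns:svg="http://www.w3.org/2000/svg">
  <head>
    <title>Order form</title>
    <xf:model>
      <xf:instance><order xmlns=""><quantity>1</quantity></order></xf:instance>
    </xf:model>
  </head>
  <body>
    <xf:input ref="quantity"><xf:label>Quantity</xf:label></xf:input>
    <svg:svg width="20" height="20"><svg:circle cx="10" cy="10" r="8"/></svg:svg>
  </body>
</html>
"""


def count_elements_by_namespace(source):
    """Map each namespace URI ("" for no namespace) to its element count."""
    root = ET.fromstring(source)
    counts = Counter()
    for element in root.iter():
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}", 1)[0]
        else:
            namespace = ""
        counts[namespace] += 1
    return counts


if __name__ == "__main__":
    for namespace, count in count_elements_by_namespace(COMPOUND_DOCUMENT).items():
        print(f"{namespace or '(no namespace)'}: {count} element(s)")
```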

1.2 Definition of Rich Multimedia Content

This specification addresses UI languages in order to facilitate rich multimedia content, which can include the following characteristics.

  • Graphically rich content, possibly including animated background image.

  • Content author/provider has exact control of the presentation, including fonts, layout, color, etc.

  • Layout adaptation: layout can be based upon device characteristics - screen size, color depth, resolution, orientation.

  • Navigation (forward/backwards tab, up/down/left/right, accesskey, pointer, voice), graphical menus, interactivity and animations.

  • Graphical menu systems where items have an animated action when focused on.

  • Portable user interface.

  • Presentation can be customized to reflect a brand or user's personality.

  • Skinnable user interface: the ability to use animations and interactivity as a user interface "skin" or template around other content.

  • Rich content for different contexts.

  • Dynamic content: documents that are generated on the fly or which change over time

  • Interaction with mixed content, for example interacting with all the parts (graphics, text, images, audio, video, voice and multimodal interaction) of the mixed document.

  • Content adaptation, such as when a portal delivers to a user a mixed document with varying capabilities (textual content and interactive animated content, for example) that has been aggregated from multiple rich content sources.

The following table shows existing W3C formats that are relevant to rich multimedia content. Each combination for which there is a requirement is annotated with a comment. The table shows the relationship between current W3C recommendations and the characteristics of rich multimedia content listed above.

| Characteristic | XHTML + CSS | SVG | SMIL | XForms | VoiceXML | XBL | DOM/Scripting |
|---|---|---|---|---|---|---|---|
| Graphically rich content, possibly including animated background image | Need CSS's 'background-*' properties | Need animated graphics facility of SVG | SMIL supports z-order. Hence, animated or non-animated background images are easy to make in SMIL. | (not directly applicable) | (not directly applicable) | (not directly applicable) | (not directly applicable) |
| Content author/provider has exact control of the presentation, including fonts, layout, color, etc. | For when author/provider requires exact control in flowable scenarios | For when author/provider requires exact control in fixed canvas scenarios | SMIL provides exact control of the main document layout and uses other media types such as XHTML or PNG for the content. | (not directly applicable) | Author has exact control of audio presentation when media=aural or for multimodal presentation | (not directly applicable) | (not directly applicable) |
| Layout adaptation: layout can be based upon device characteristics - screen size, color depth, resolution, orientation | For some scenarios, CSS box model layout and @media will be part of the solution | For some scenarios, scalable graphics will be part of the solution | SMIL BasicContentControl (including <smil:switch>) is the module that provides this functionality. | In XForms, the content is separated from the presentation, and thus it supports layout adaptation of forms. | Allows author to provide an aural presentation when media=aural. | (not directly applicable) | (not directly applicable) |
| Navigation (forward/backwards tab, up/down/left/right, accesskey, pointer, voice), graphical menus, interactivity and animations | Need XHTML <form> elements to define navigation points | For graphical menus, rich interactivity and animations | SMIL Linking module provides this functionality | For device-independent representation of human-computer interaction | For voice | (not directly applicable) | For when the author requires advanced behavior |
| Menu system, e.g. items get animated when focused | For some scenarios, XHTML+CSS is sufficient for menus | For rich UI and animation | Menus can be done in SMIL, but it can be cumbersome because each menu item needs to be in a separate file. Scripting could also be used in conjunction with referenced XHTML content. | Menus could be presented as forms with the help of XBL | (not directly applicable) | Allows menus to be abstracted as reusable components | Often, menus require scripting facilities |
| Portable user interface | For some scenarios, XHTML+CSS is sufficient for portable UIs | For rich UI and animation | Menus can be done in SMIL, but it can be cumbersome because each menu item needs to be in a separate file. Scripting could also be used in conjunction with referenced XHTML content. | For device-independent representation of human-computer interaction | For voice-based user interfaces | Allows menus to be abstracted and reskinned to adapt to different platforms | Often, XBL components require client-side scripting |
| Presentation can be customized to reflect brand or user's personality | For some scenarios, XHTML+bitmaps is sufficient | For some scenarios, SVG's richness is a requirement | SMIL CustomTestAttributes module provides this functionality. | (not directly applicable) | (not directly applicable) | Allows presentation elements to be abstracted and reskinned | Client-side scripting is often used for presentation customization |
| Skinnable user interface, e.g. ability to skin a document with animations and interactivity | XHTML+CSS alone are not sufficient for skinnable UI | For rich graphics, animations and rich interactivity | SMIL controls the main document layout and all content is customizable. Test module provides ability to select appropriate content. | XForms provides the definition of the abstract UI components before skinning is applied | If you want to skin for voice | Allows UI elements to be abstracted and reskinned | Often, XBL components require client-side scripting |

1.3 Relationships With Other Technologies

There are many technologies that allow different languages to be combined, and most extensible languages provide this feature. The combining technologies and their issues fall into two categories: how the combination is described, and how the semantics of the combination are understood. Each language specifies the first. The combining recommendations address the second.

1.3.1 HTML and XHTML

HTML has an <object> element. This element can be used to combine by reference. The <object> element provides a way to link to external documents.

1.3.2 SVG

SVG has a <foreignObject> element. This element can be used to combine by reference. The <foreignObject> element provides a way to link to external documents.

1.3.3 SMIL

SMIL technologies are closely related to combining by reference. SMIL provides mechanisms (e.g., <animation>, <audio> and <video> elements) to combine content and synchronize them in time.

1.3.4 XLink

XLink specifies a generic reusable vocabulary for links in XML documents. An XLink may be specified on an arbitrary element in such a way that an XLink-enabled processor will be able to understand the linking semantics of that element. A variety of linking behaviors (embedding, hyperlinking, etc.) may be further described using additional linking metadata.
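The following Python sketch shows, under stated assumptions, how a processor might recognize XLink attributes on arbitrary elements. The element names (catalog, photo, details, caption) and the helper find_xlinks are invented for this example; only the XLink namespace URI and the type, href and show attributes come from XLink itself.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical vocabulary: none of these element names are defined anywhere;
# the xlink:* attributes alone carry the linking semantics.
SOURCE = """\
<catalog xmlns:xlink="http://www.w3.org/1999/xlink">
  <photo xlink:type="simple" xlink:href="beach.jpg" xlink:show="embed"/>
  <details xlink:type="simple" xlink:href="beach.html" xlink:show="replace">More</details>
  <caption>No link here</caption>
</catalog>
"""


def find_xlinks(source):
    """Return (element tag, href, show behavior) for every linking element."""
    root = ET.fromstring(source)
    links = []
    for element in root.iter():
        href = element.get(f"{{{XLINK_NS}}}href")
        if href is not None:
            show = element.get(f"{{{XLINK_NS}}}show", "unspecified")
            links.append((element.tag, href, show))
    return links


if __name__ == "__main__":
    for tag, href, show in find_xlinks(SOURCE):
        print(f"<{tag}> links to {href} (show={show})")
```

Here xlink:show="embed" corresponds to the embedding behavior mentioned above, and xlink:show="replace" to ordinary hyperlinking.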

1.3.5 XInclude

XInclude specifies how to combine different infosets together. This approach may be used for combination by inclusion.

The recommendations produced by the WG will specify how an application should interpret the compound document that results once the XInclude markup has been processed.
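The merging step performed by XInclude can be sketched with the ElementInclude module of the Python standard library. The two in-memory documents, their names (root.xml, chart.xml) and the helpers load_from_memory and merge_infosets are assumptions made for this example. The sketch shows only the infoset merge, not the combination semantics that the recommendations are meant to define.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files"; the names are illustrative only.
DOCUMENTS = {
    "root.xml": """\
<report xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Quarterly report</title>
  <xi:include href="chart.xml"/>
</report>
""",
    "chart.xml": """\
<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">
  <rect width="10" height="10"/>
</svg>
""",
}


def load_from_memory(href, parse, encoding=None):
    """Loader callback for ElementInclude: resolve href against DOCUMENTS."""
    if parse == "xml":
        return ET.fromstring(DOCUMENTS[href])
    return DOCUMENTS[href]


def merge_infosets(name):
    """Parse the named document and expand its xi:include elements in place."""
    root = ET.fromstring(DOCUMENTS[name])
    ElementInclude.include(root, loader=load_from_memory)
    return root


if __name__ == "__main__":
    merged = merge_infosets("root.xml")
    print(ET.tostring(merged, encoding="unicode"))
```

After processing, the xi:include element has been replaced by the SVG content, so the result is a single document that mixes two vocabularies.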

1.3.6 SOAP Attachments and Optimization Technologies

SOAP Attachments provide a way to divide documents into several pieces for more efficient serialization and subsequent recombination. For the recommendations produced by the WG, SOAP Attachments are similar in scope to XInclude: the combining technology operates, and provides semantic information, on top of these serialization issues.

1.3.7 OASIS's Open Document Format

The OASIS effort is targeted at office productivity applications, such as word processors, spreadsheets or presentation authoring tools. The work is focused on representing the full information contained in documents that are created and edited with such tools in an interoperable manner. On the other hand, the CDF effort is focused on combining formats for web publication and has a lot of focus on user agent environment, and runtime behavior. For example, the CDF Working Group efforts will focus on issues such as the runtime interaction model for documents including components from different XML formats. This is not addressed by the Open Document Format specification. The Open Document Format, on the other hand, specifies an office application compatible style model, page layouts, index generations, text fields, table formulas which the CDF specifications will not address.

1.3.8 XForms

XForms is a schema designed to split documents into three parts: the XForms model, the instance data, and the user interface. It provides a mechanism for expressing the data and intent of a form and allowing browser/renderers the flexibility to present the information most suitable for the device currently rendering the form. XForms supports compound document by inclusion by design since it is not a free-standing document type, and was always meant to be enclosed by a root document.

1.3.9 XHTML + Voice

The XHTML+Voice profile is designed for Web clients that support visual and spoken interaction. The XHTML+Voice profile brings spoken interaction to standard web content by integrating a set of mature web technologies such as XHTML and XML Events with XML vocabularies developed as part of the W3C Speech Interface Framework. The profile includes voice modules that support speech synthesis, speech dialogs, command and control, speech grammars, and the ability to attach Voice handlers for responding to specific DOM events, thereby re-using the event model familiar to web developers. Voice interaction features are integrated directly with XHTML and CSS, and can consequently be used directly within XHTML content. The XHTML+Voice profile is itself a compound document profile.

1.4 Definition of Terms

The working group has decided to use the following terms in its work and in this document.

Compound document

The compound document is a document that combines separate component languages either by reference or by inclusion.

Root document

In the case of combining by reference, one compound document may be a collection of several separate documents.

The top-most compound document (i.e. the document that no other document references) is called the Root document.

Parent document

In the case of combining by reference, one compound document may be a collection of several separate documents.

The document that references another document is called the Parent document. The top-most parent document is also called the Root document.

Child document

In the case of combining by reference, one compound document may be a collection of several separate documents.

The document that is referenced is called the Child document. If the Child document references other documents, it is also called a Parent document.

User Agent

See definition in Device Independence Glossary document.

Component Language

Component language refers to an XML-based language (like XHTML and SVG) with its own elements and attributes.
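The Root, Parent and Child roles defined above can be sketched as a classification over a reference graph. The document names and the helper classify below are hypothetical; the sketch assumes that every document in the compound document appears as a key in the mapping.

```python
# Hypothetical reference graph: each document maps to the documents it references.
REFERENCES = {
    "page.xhtml": ["map.svg", "intro.smil"],
    "map.svg": ["legend.svg"],
    "intro.smil": [],
    "legend.svg": [],
}


def classify(references):
    """Return the set of roles (root, parent, child) held by each document."""
    referenced = {child for children in references.values() for child in children}
    roles = {}
    for document, children in references.items():
        role = set()
        if document not in referenced:
            role.add("root")    # no other document references it
        if children:
            role.add("parent")  # it references at least one document
        if document in referenced:
            role.add("child")   # at least one document references it
        roles[document] = role
    return roles


if __name__ == "__main__":
    for document, role in classify(REFERENCES).items():
        print(f"{document}: {', '.join(sorted(role))}")
```

In this example map.svg is both a Child document (of page.xhtml) and a Parent document (of legend.svg), which matches the definitions given above.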

1.5 Who/what will implement the specification

1.5.1 Authoring tools and content production systems

Compound documents can be authored by a variety of means including:

  • Hand editing with a text editor

  • Dynamic generation via Web servers or other back-end systems

  • Via authoring tools which focus on the component languages (for example, an XHTML authoring tool)

  • Via authoring tools which focus on compound documents (a "CDF authoring tool")

It is expected that there will be multiple categories of CDF authoring tools. Here are some examples:

  • Document-centric authoring tools to create content which mixes static text and graphics for the purpose of publishing on the Web

  • Multimedia-centric authoring tools for creating time-based, interactive content (a movie or game, for example)

  • Application-centric authoring tools to create user interfaces

  • Forms-centric authoring tools to create data templates

  • Device-independent authoring tools for creating content that can be adapted to different user requirements and across different media

1.5.2 User agents, such as browsers and plug-ins

Depending on the system, CDF content will be viewed in a variety of ways:

  • Via native support within a browser which is conformant with CDF specification(s)

  • Via a browser plug-in which registers itself as a handler for CDF content

  • Via a dedicated CDF viewing application

  • Via browser fallback compatibility (e.g., the browser treats CDF files as XHTML content)

  • Via accessibility tools such as screen readers

1.5.3 Intermediary Processors, content renderers

Systems that process, combine, re-order, re-format or otherwise render content based on context. An example is a content rendering engine which takes content in a device neutral form and renders it appropriately according to particular device or bearer characteristics (such as screen size or bandwidth).

2 Use Cases

The W3C and other bodies have created multiple XML syntaxes for various purposes, such as XHTML for on-line document viewing, SVG for 2D vector graphics or SMIL for multimedia synchronization. While there has been a lot of interest in these individual solutions, there has been a growing demand for a solution that would let end-users use them together to define a single piece of content. Individual markups such as those just mentioned have great features, but these features become even more compelling when combined. For example, being able to display scalable 2D images in XHTML pages makes it possible to define pages which can be printed with high quality. Similarly, using SVG images in an XHTML table provides an easy way to lay out SVG images in a table.

There is a demand for letting content creators combine markups so as to create richer documents, containing multi-media information: text, graphics, audio and video. For that reason, the main use case for the CDF activity is the definition of rich multimedia content, as defined in 1.2 Definition of Rich Multimedia Content.

2.1 Web Publishing and Broadcasting

Services that mix text, audio, video and interactive elements to deliver information.

2.1.1 Interactive News Service

An interactive news service may provide the user with multimedia news content comprising traditional text and images, but also audio and video reports as well as diagrams which react to user behavior.

Note:

Example: newspaper Web sites that offer interactive features which go beyond textual content offered in a print edition.

2.1.2 Web Portal

A "portal" application combines content and services from multiple back-end sources across multiple integration paths to create a cohesive user experience. Portals often are customizable to user preferences and other factors and therefore often produce dynamic content that is specific to a particular user-request.

2.1.3 News or stock ticker

An animated "ticker" which displays dynamic data, such as stock prices or current news headlines, can be displayed as part of a larger page of more static information, such as a news article.

2.1.4 Document viewing

The capability to view documents with preserved formatting, layout, images and graphics and interactive features such as zooming in and out and multi-page handling.

2.1.5 Infotainment

Infotainment services combine information and entertainment from different sources into a single interactive service with compelling presentation. The user experience for such services is often a single "page" which displays multiple pieces of content, each potentially of a different kind. These services often require dynamic user interfaces that use animation and other graphic effects to produce a pleasing user interface.

Note:

Example: mobile portals deployed by mobile network operators.

2.2 Web Applications

Web applications typically have some form of programmatic control, either on the client, on the server or a combination of both. This document addresses client-side Web applications only. They may run within the user agent, or within another host application. A Web application is typically downloaded on demand each time it is "executed", allowing a developer to update the application for all users when needed. Web applications are usually smaller than regular desktop applications, and can have rich graphical interactive interfaces.

2.2.1 Reservation system

Interactive services which allow a user to book travel or make other kinds of reservations by using a graphically rich user interface. For instance, a user may use a calendar to pick a departure or check-in date and a map to locate a destination, with results represented as textual data for confirmation.

Note:

Example: online travel sites or airline sites.

2.2.2 Order entry system

An application which might be used in the field to process orders for goods or services.

Note:

Examples: Parcel tracking application used by delivery agents; order entry system used by a mobile sales force.

2.2.3 On-line shopping

E-Commerce services which use interactive elements to display goods and enable a shopping experience.

2.2.4 Survey

A survey application which provides a multi-part form which can be stepped forward or backward and submitted in one action.

2.2.5 Games

Interactive games that use animated content embedded within a page and use elements within the page to actuate functions within the game. For example, a simple painting game where the painting occurs in the animated area but a palette of colors is placed outside this area, in the non-animated area.

2.2.6 Interactive maps

In a mapping application, the map itself may appear in an animated inset, while controls that show or hide various aspects of the map or zoom the view in or out may be located outside the animated area.

2.3 Resident Applications

These are applications which are resident or partially resident on a device.

2.3.1 Personal Information Management

To-do list, calendar entries and Calculator user interfaces.

2.3.2 Communication

Email, MMS, instant messaging, access to phone functions, such as dialing a number, sending a text message, or manipulation of the built-in address book (subject to security and privacy considerations).

2.3.3 Dynamic graphics as background or screen saver in mobile phone

In a mobile phone, a Web application is set as background or screen saver, and is dynamically updated depending on, for example, data from the network or the time of day. This allows the user to have immediate access to 'glanceable' information, such as current weather conditions, stock quotes, or more application-oriented information such as flight departure data.

2.3.4 Document viewing

The capability to view documents with preserved formatting, layout, images and graphics and interactive features such as zooming in and out and multi-page handling.

2.3.5 Interactive Icon bar

A set of icons, representing destinations in a portal or applications, which animate when focused on.

2.3.6 Interactive Maps

Display of a map with interactive features such as panning, zooming, level of detail, points of interest.

2.4 Content Authoring, Aggregation and Deployment

This class includes tools and applications for both personal and professional content creation, management, and distribution.

2.4.1 Personal content creation

Personal content is created by users themselves and, often, shared with other users. In many cases, users want to combine different content formats. A typical example is a user who takes a picture with a mobile phone, writes a short message about the excellent meal that he or she had in a restaurant, and attaches a map which shows the location of the restaurant.

2.4.2 Personal content management

Personal content management refers to storing your personal media content including images, audio, video, graphics, text, etc. It should be possible to link different content items to each other and view them simultaneously. In addition, the content management system should support adaptation of the content for different kinds of devices and sharing the content online with other users.

2.4.3 Web logs

Blogs are personal diaries that are published online. Many bloggers try to create visually attractive content by combining different content formats.

2.4.4 Content aggregation

In content aggregation, content coming from different sources is combined. Web portals are typical examples. Often, content aggregators have different versions of the content offering for different delivery channels and user groups. Therefore, content filtering and transformation are often required. In addition, content aggregators want to maintain a certain look and feel of the service across different devices and versions. Also, there might be a need for Digital Rights Management (DRM).

2.4.5 Professional content management

Most professional web sites use content management systems, which provide means for version management, access control, content versioning, etc. The content itself is usually in several different formats.

2.5 Navigation

2.5.1 SVG used as logo or advertisement (no links in SVG).

SVG images are used as embedded logos or advertisements (with animations) in an XHTML document. There are no links in the SVG images. The user cannot focus the images; they behave like bitmaps embedded by <img>.

2.5.2 SVG embedded interactive map (links in SVG).

An SVG image, which is an interactive city map, is embedded inside an XHTML page. There are links inside the SVG image. The user can navigate to the SVG map, which then gets focus. When the map has focus the user can either navigate further into the first link inside the map or skip the map and navigate further to the next link in the XHTML document.

2.5.3 Menu with SVG images that animate and expand when focused (no links in SVG)

SVG images are used as menu icons, wrapped inside <a> elements, in an XHTML document. There are no links inside the SVG images. When the user focuses a menu item, the icon expands and starts to animate. When the user moves to the next item, the animating icon returns to the state it had before it had focus, and the next icon now gets focus and starts to animate.

2.5.4 SVG image is embedded interactive part of the Web page (SVG has links)

An SVG image is used as 2D-graphics in a Web page. The SVG image has links. The user can navigate directly from an XHTML link into the first link in the SVG image; the SVG image is seamlessly integrated with the XHTML page. The user navigates directly from the last link inside the SVG image to the next link, after the SVG image, in XHTML.

This is an alternative scenario to 2.5.2 SVG embedded interactive map (links in SVG).

2.6 Specific Use Cases

Whereas the use cases outlined above are general in nature, the following use cases go into more detail and specifically describe some of the implementation issues that require the use of compound document technology.

2.6.1 Synchronization of HTML content with audio-visual content

The description of an auction item on an auction service (a Web page) includes a short video clip that describes the item. Let's assume the user needs to watch the video clip in full for legal reasons before being allowed to place a bid for this item by typing the amount in an entry field and pressing the "Place Bid" button on the same Web page.

The user starts playback of the video by selecting a button on the page. This raises an event that starts playback of the embedded video object. Once the video has ended, the embedded video object raises another event. This event causes the "Place Bid" button to become selectable in the HTML page.

This use case includes several functionalities which are not possible in HTML and the <object> element today:

  • starting and stopping playback of an embedded video (or audio) in response to interaction with another HTML element

  • an embedded object generating an event that causes an update to the including HTML page (possibly with the help of JavaScript) without reloading this page
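
The event flow of this use case can be sketched as follows. This is only an illustration, written in Python rather than markup; the names EmbeddedVideo, BidPage, on_ended and finish are hypothetical and do not describe any existing <object> API.

```python
class EmbeddedVideo:
    """Hypothetical stand-in for an embedded video object."""

    def __init__(self):
        self.playing = False
        self._ended_listeners = []

    def on_ended(self, listener):
        # Register a callback to run when playback reaches the end.
        self._ended_listeners.append(listener)

    def play(self):
        self.playing = True

    def finish(self):
        # Simulates the clip reaching its end and raising the "ended" event.
        self.playing = False
        for listener in self._ended_listeners:
            listener()


class BidPage:
    """Hypothetical stand-in for the including HTML page."""

    def __init__(self, video):
        self.video = video
        self.place_bid_enabled = False
        video.on_ended(self._enable_bidding)

    def press_play_button(self):
        # Interaction with an HTML element starts the embedded video.
        self.video.play()

    def _enable_bidding(self):
        # The page is updated in place, without being reloaded.
        self.place_bid_enabled = True


page = BidPage(EmbeddedVideo())
page.press_play_button()
bidding_before_end = page.place_bid_enabled
page.video.finish()
bidding_after_end = page.place_bid_enabled
```

In this sketch the "Place Bid" state only changes once the embedded object reports the end of playback, which is the cross-document notification that the use case requires.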

2.6.2 Animated SVG icons in HTML menu

An HTML page includes a menu for the user to select between different Web applications. Each menu item is a single <a> element. Each anchor value consists of an SVG graphic and a rich text label. Each SVG graphic must scale to fit the same size rectangular bounding box (position and size defined by the containing HTML+CSS document), and the aspect ratio must be maintained when scaling. Example:

Animated SVG Icons Example
<a href="Web-application-1.html">
     <object type="image/svg+xml" width="100%" data="menu-item-1.svg" />
     Web Application Choice <em>1</em>
</a>

The background image of the page (defined by background-image CSS property) must be visible through the 'holes' in the SVG graphics.

While the user designates an anchor without selecting it (see the :hover pseudo-class in CSS), the rendering of the anchor value should change as follows:

  • background of the text part of the anchor changes to yellow

  • the SVG graphic plays an animation

While the user activates an anchor (see the :active pseudo-class in CSS), the rendering of the anchor value should change as follows:

  • background of the text and SVG parts of the anchor changes to blue

  • the SVG graphic does not animate; any running animation stops

The following features cannot be done with HTML + CSS alone, or cause significant overhead in the content:

  • background image visible through the 'holes' of content rendered by a typical browser plug-in (or at least this is implementation dependent)

  • SVG content animating for the exact duration of an anchor being designated

  • reliable definition of dimension and aspect ratio of SVG content

2.6.3 Visualization of polling results in HTML page

An HTML page allows the user to enter polling results for the three candidates of an election. As soon as (partial) results of the poll have been entered, a pie chart on the same HTML page should be updated to visualize the entered data.

The HTML page consists of three entry fields for the number of votes for each candidate. A script (JavaScript) calculates the percentage of votes for each candidate and updates a pie chart in SVG immediately.

The key functionality that cannot be achieved with the current integration of HTML and SVG via the <object> element is access from JavaScript to both documents.
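
The calculation performed by the script in this use case can be sketched as follows. The sketch is in Python for illustration only; the function name pie_slices and the sample vote counts are assumptions, and the use case itself would perform the equivalent arithmetic in JavaScript before updating the SVG document.

```python
def pie_slices(votes):
    """Return one (percentage, start_angle, end_angle) tuple per candidate.

    Angles are in degrees, measured from the start of the first slice.
    """
    total = sum(votes)
    if total == 0:
        # No votes entered yet: every slice is empty.
        return [(0.0, 0.0, 0.0) for _ in votes]
    slices = []
    start = 0.0
    for count in votes:
        share = count / total
        end = start + share * 360.0
        slices.append((share * 100.0, start, end))
        start = end
    return slices


# Sample (made-up) vote counts for the three candidates.
slices = pie_slices([50, 30, 20])
```

Each tuple gives the share of one candidate and the angular range its slice would occupy in the pie chart.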

3 Requirements

All the requirements collected by this WG are listed in the following sections.

3.1 High-Level Requirements

These are general requirements for the Compound Documents by Reference (CDR) specification.

3.1.1 CDR MUST exploit existing specifications, favoring W3C specifications wherever possible and limit the definition of new markup unless absolutely required for integration purposes

It is not the role of the Working Group to invent new markup, but rather to specify rules by which existing markup formats are combined.

3.1.2 CDR MUST provide the ability for content developers to describe or author Rich Multimedia Content

Rich Multimedia Content is defined in 1.2 Definition of Rich Multimedia Content.

3.1.3 CDR MUST specify a base set of formats, corresponding profiles and versions

Conformant implementations MUST support this base set.

3.1.4 Each CDR profile and version MUST specify which formats can be referenced

Each CDR profile has a root document and one or more child documents. The CDR profile has to specify both the root document and the possible child documents.

3.1.5 CDR MUST specify, for each format, the element used to reference other formats, if any.

For example, the CDR specification will specify that the XHTML <object> element is to be used for formats included in XHTML, and will specify the usage of this element to include other formats.

It may be necessary to transfer some parameters declaratively to a child document. Each CDR profile MAY specify a list of pre-defined parameters/values for the <object> element. For instance, for CDR WP1, there could be these kinds of parameters/values:

      <param name="Volume" value="50" />
      <param name="Mute" value="false" />
      <param name="animation" value="onfocusevent" />
      

3.1.6 CDR MUST specify generic integration techniques

These are techniques that will be applicable beyond the set of formats, versions and profiles required by the CDR spec.

3.1.7 CDR MUST support temporal synchronization of dynamic content coming from multiple references, possibly with multiple references to the same source.

Support the synchronization of multiple instances of dynamic content, such as animations, audio and video, within a single compound document.

3.1.8 CDR MUST support event mechanisms that cross namespace boundaries

Define consistent rules that specify how events pass to components. In particular, provide rules that define the behavior when events are passed across namespace boundaries.

3.1.9 CDR MUST support scriptability

Compound documents have to support scripting languages. The scripts have to be able to access and modify the elements of both root and child documents.

3.1.10 CDR MUST specify the allowed nesting level of referencing

Each CDR profile must define whether there are restrictions on the number of nesting levels, for example, whether a referenced SVG document can itself reference further XHTML content.

3.1.11 CDR MUST explain how scripting interacts between components and the parent document

CDR profile documents must explain what kind of access, if any, scripts have between the parent and child documents and vice-versa. Can a script in a child document access its parent's DOM, and if so, how? Can a script in a parent document access a child's DOM? If there is access, the mechanism must be described.

3.1.12 CDR profiles MUST specify how event propagation works across namespace boundaries.

CDR profile documents must explain how, if at all, events get propagated from parent to child documents and vice-versa. Do events from a parent document get propagated to the child documents? Do events from a child document get propagated to the parent document?

For example, if an XHTML document references an SVG document, what is the event propagation model when the user activates an SVG element? Do events bubble from a child document to the referencing parent document? Does event capture happen from the root of the referenced document or does it happen from the root of the referencing document? This event propagation model MUST be consistent with the DOM Level 3 event model.
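
One possible answer to these questions can be sketched as follows. The sketch assumes, purely for illustration, that the root of the referenced document is treated as a child of the referencing <object> element, so that capture starts at the root of the referencing document and bubbling continues across the boundary. This is an assumption of the sketch, not a decision recorded in this document; the Node class and propagation_path function are hypothetical.

```python
class Node:
    """Minimal tree node; 'parent' may cross a document boundary."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def propagation_path(target):
    """List the nodes visited during capture, at the target, then bubbling."""
    chain = []
    node = target
    while node is not None:
        chain.append(node.name)
        node = node.parent
    ancestors = chain[1:]                 # nearest ancestor first
    capture = list(reversed(ancestors))   # outermost root first
    return capture + [chain[0]] + ancestors


# Referencing XHTML document.
html = Node("html")
body = Node("body", html)
obj = Node("object", body)
# Referenced SVG document, attached under the <object> element (assumption).
svg = Node("svg", obj)
rect = Node("rect", svg)

path = propagation_path(rect)
```

Under this assumption an activation of the SVG element is first seen by the XHTML ancestors during capture and is seen by them again while bubbling; a profile could equally decide to stop propagation at the document boundary.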

3.1.13 CDR profiles MUST specify how focus traversal works with referenced documents.

For example, when an XHTML document references an SVG document, what is the relation between the XHTML document's focus traversal and the SVG document's focus traversal? How does the SVG document get focus, and how does the focus traversal start in the SVG document? How does it end?
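
A seamless traversal of the kind described in use case 2.5.4 can be sketched as follows. The representation is an assumption made for illustration: a document is modeled as a list of focusable items in document order, and a referenced document is modeled as a nested list at the position of the referencing element.

```python
def focus_order(document):
    """Flatten parent and child focusable items into one traversal order."""
    order = []
    for item in document:
        if isinstance(item, list):
            # A referenced document: its links are visited in place.
            order.extend(focus_order(item))
        else:
            order.append(item)
    return order


xhtml_page = [
    "xhtml-link-1",
    ["svg-link-1", "svg-link-2"],  # links inside the referenced SVG map
    "xhtml-link-2",
]
order = focus_order(xhtml_page)
```

The alternative described in use case 2.5.2, where the map itself first receives focus and can be skipped, would instead insert a single entry for the referenced document and only descend into it on request.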

3.1.14 CDR profiles MUST specify how link activation works with referenced documents.

A referenced object may be wrapped with an XHTML anchor. A referenced object may contain its own embedded links. A referenced object may be wrapped with an XHTML anchor and contain its own embedded links.

In addition, a referenced object may draw its own visual feedback for focus activation and/or mouse-over.

CDR must specify all these cases, since they may require different handling, depending on the type of user interaction: pointing device or joystick operation.

3.1.15 CDR profiles MUST specify triggering of animations across namespaces.

For example, if an XHTML document references an SVG document, the profile MUST specify how an SVG animation can be triggered by an XHTML event target.

3.1.16 CDR MUST support fragment identifiers in cross-namespace interaction

In CDR, child documents within a parent document are referenced with URI references. These URI references must also support fragment identifiers. Each fragment identifier is passed to the component or plug-in that displays the child document when the document is loaded. The handling of the fragment identifier depends on the MIME type of the child document.
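
The separation of a URI reference into the resource to fetch and the fragment to hand to the rendering component can be sketched as follows, using the Python standard library. The function name split_reference and the sample reference city-map.svg#harbour are hypothetical.

```python
from urllib.parse import urldefrag


def split_reference(uri_reference):
    """Split a URI reference into the resource to fetch and its fragment."""
    resource, fragment = urldefrag(uri_reference)
    return resource, fragment


resource, fragment = split_reference("city-map.svg#harbour")
```

Only the resource part is used to retrieve the child document; what the fragment means (for example a view or an element identifier) is left to the format identified by the MIME type.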

3.1.17 CDR profiles SHOULD provide a method for adding event handlers using declarative markup for the formats it uses

Event handlers contain instructions on how to react to a certain event. These can be written either using a scripting language or a declarative (i.e., markup) language. The CDR specification should provide a method to add event handlers both to parent and child documents. If a CDR profile contains a markup language that supports declarative event handlers then the CDR profile should also support event handlers defined using markup.

3.1.18 CDR documents MUST cater for accessibility requirements

All CDR profiles must take accessibility requirements into consideration. The Web Accessibility Initiative has defined Web Accessibility Guidelines, which must be used when CDR profiles are developed. In addition, markup-language-specific accessibility guidelines have been defined in the Techniques for Web Content Accessibility Guidelines document. A CDR profile must support all the techniques that are supported by the markup language modules included in it. The same applies to the authoring tool accessibility guidelines and the user agent accessibility guidelines.

3.1.19 CDR documents MUST support dynamic updating

Dynamic documents can be manipulated by scripts, etc. The user interaction can fire events, which are caught by event handlers. The event handlers can then manipulate the elements via the DOM interface. These changes are then displayed to the user. CDR documents must support dynamic updating across different parent and child documents.

3.1.20 CDR must define its integration into the Web Architecture. It must include delivery over HTTP and should also strive to be transport independent

CDR profile documents must describe how they integrate into the Web Architecture, particularly in the areas of media types and transfer/transport protocols. It must be described how this works with HTTP as well as other protocols typically used to transfer Web content (e.g., RTP).

3.1.21 CDR MUST NOT prevent compression of the data

For efficiency, user agents and other renderers must support served content that is compressed in whole or part.
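
As an illustration of compressed delivery, the following sketch compresses a small SVG document with gzip and restores it. The choice of gzip and the sample markup are assumptions of the sketch; this document does not mandate a particular compression scheme.

```python
import gzip

# A deliberately repetitive sample document (made up for this sketch).
svg_document = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    + '<rect x="0" y="0" width="10" height="10"/>' * 50
    + "</svg>"
).encode("utf-8")

compressed = gzip.compress(svg_document)   # what a server could send
restored = gzip.decompress(compressed)     # what a user agent would render
```

The restored bytes are identical to the original document, while the transmitted form is smaller for markup with this much repetition.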

3.1.22 CDR MUST define the way soft-keys and accesskeys are handled across document components

For instance, what happens when there is a conflict between two accesskeys which have the same value but are supposed to trigger different actions in the parent and in the child documents?
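
Detecting such a conflict can be sketched as follows. The mapping of access keys to action names is a hypothetical representation chosen for this sketch; how a conflict is then resolved is exactly what the CDR specification has to define.

```python
def accesskey_conflicts(parent_keys, child_keys):
    """Return the access keys bound to different actions in parent and child."""
    return sorted(
        key
        for key in parent_keys.keys() & child_keys.keys()
        if parent_keys[key] != child_keys[key]
    )


# Hypothetical key bindings of a parent XHTML page and a child SVG document.
parent = {"1": "open-menu", "2": "search", "9": "help"}
child = {"1": "zoom-in", "9": "help", "0": "reset-view"}
conflicts = accesskey_conflicts(parent, child)
```

Here only key "1" is reported, because key "9" triggers the same action in both documents.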

3.1.23 CDR User Agents MUST provide a default font for use by all components

Standalone components or renderers, such as SVGT engines, do not always provide a default system font.

A CDR User Agent, however, must at least provide one default system font to all components, such as browsers, SVG engines and other renderers.

The CDR specification cannot mandate any particular font or font technology. It does, however, mandate the availability of a default font, so that content providers can render text in any component or renderer as simply as this can be done in XHTML.

The default system font may be bitmap based, antialiased or a vector font. Ideally, the same default system font should be used for all components.

The following SVG sample markup MUST generate visible text.
<svg ...>
     <text x="20" y="20" font-size="12" fill="#000">Sample Text</text>
</svg>

3.1.24 CDR MUST NOT prevent server-side adaptation

Some animations could be too heavy for some terminals - or could be too 'unsophisticated' for others.

Server-side adaptation must therefore be possible in order to support less capable devices. For example, it should be possible for a server to determine the SVG capabilities of the device (such as the rendering and animation capabilities).

3.1.25 CDR MUST support limited bandwidth networks and limited capability devices

Small footprint and mobile devices often are connected to mobile networks, characterized by comparatively low bandwidth and/or high connection latency. Additionally, mobile networks may charge users for data downloaded. It is therefore desirable to ensure that this bandwidth is used optimally.

Authoring tools and/or systems serving CDR documents must be able to use techniques to reduce the size of the materials transmitted and the latency involved.

3.1.26 CDR Profiles MUST define clear document conformance criteria

CDR Profile specifications must define what kinds of documents are conformant and non-conformant.

3.1.27 CDR Profiles MUST define clear user agent conformance criteria

CDR Profile specifications must define conformance rules for user agents.

3.1.28 CDR Profiles SHOULD provide a way to know the current loading status of a referenced component

Sometimes, one or several media resources are needed before the end-user can interact with a rich media application. For instance, a list of small audio clips must be downloaded in an interactive game before the end-user can start playing, or some small audio clips must be downloaded before the application can provide speech instructions in a navigation system ("turn left", "turn right", ...). As a result, CDR should define a way to know whether a referenced component is currently loading, has been loaded successfully or has failed to load, and to launch appropriate actions accordingly.
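
The three states mentioned here can be sketched as follows. The LoadState enumeration, the overall_state function and the clip file names are hypothetical; they only illustrate how an application could decide when interaction may start.

```python
from enum import Enum


class LoadState(Enum):
    LOADING = "loading"
    LOADED = "loaded"
    FAILED = "failed"


def overall_state(component_states):
    """Summarise the states of all referenced components."""
    states = list(component_states.values())
    if any(state is LoadState.FAILED for state in states):
        return LoadState.FAILED
    if all(state is LoadState.LOADED for state in states):
        return LoadState.LOADED
    return LoadState.LOADING


# Hypothetical audio clips of the navigation example.
clips = {
    "turn-left.wav": LoadState.LOADED,
    "turn-right.wav": LoadState.LOADING,
}
state_while_fetching = overall_state(clips)
clips["turn-right.wav"] = LoadState.LOADED
state_when_done = overall_state(clips)
```

An application would enable speech instructions only once the summary reaches the loaded state, and could show an error or retry when it reaches the failed state.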

3.1.29 CDR MUST provide a solution for packaging of self-contained, static content

Packaging is needed in order to support optimization of the number of individual fetch operations that are required to construct the final user experience on the device.

3.1.30 CDR MAY provide a solution for packaging of streamed content

In addition to 3.1.29 CDR MUST provide a solution for packaging of self-contained, static content, packaging of streams may be studied in order to accommodate audio and video content.

3.2 CDR Profile 1 Requirements (Rich Multimedia Content)

The language profile combining XHTML, CSS and SVG according to the CDR rules is referred to in this document as CDR Profile 1. This section identifies the requirements specific to the use of XHTML as the parent CDR Profile 1 document and SVG Tiny as the child document. This profile supports presentation of rich multimedia content, as defined in 1.2 Definition of Rich Multimedia Content, and many of the requirements herein are related to this definition. XHTML and SVG have been chosen as the first profile for CDR according to analysis of existing specifications and their applicability to rich multimedia content: see 1.3 Relationships With Other Technologies.

These requirements should be considered as additional requirements to the applicable requirements specified in other sections of the document.

3.2.1 CDR Profile 1 MUST specify a user interaction model

CDR Profile 1 must support a clear user interaction model between components (event/focus management). Interactivity between components will help to provide a seamless and compelling user experience of documents which combine XHTML and media elements.

3.2.2 CDR Profile 1 MUST explain how a User Agent is able to identify a CDR Profile 1 document

CDR Profile 1 content (that is, content which conforms to a language profile built using the CDR Profile 1 specification) will be unambiguously identified as conforming to that language profile.

Content creators and viewer implementations will both be better able to meet the conformance requirements of the CDR specifications if such content can be unambiguously identified. Use of a specific MIME type will furthermore (through the use of Accept headers) allow servers to perform a simple check on whether a suitable client is available.

3.2.3 CDR Profile 1 MUST support 2D scalable vector graphics

2D Scalable vector graphics are required in order to easily deploy the same user interface onto multiple devices with different screen sizes / shapes and resolutions.

3.2.4 CDR Profile 1 MUST support audio

User agents must support the ability to play audio while rendering content. For example, a rendered document may choose to play music while the document is displayed.

3.2.5 CDR Profile 1 SHOULD support video

User agents should support the ability to play video while rendering content. For example, a rendered document may choose to play a video clip while the document is displayed.

3.2.6 CDR Profile 1 MUST support grid, flow, overlapping layouts

In order to create rich content documents, content authors need the ability to position and lay out information items or user interface components. This is a crucial authoring feature which also allows content to adapt to different screen sizes and aspect ratios. Layout is a complement to the scalable nature of formats such as SVG. In CDR Profile 1, grid and flow layouts MUST be supported. Overlapping layout, i.e., the ability to lay out components that overlap completely or partially, MUST also be supported.

3.2.7 CDR Profile 1 MUST support SVG backgrounds

A CDR user agent MUST support static SVG background images and MAY support animated or scripted SVG background images. Authors however should not assume that the latter functionality will be available.

3.2.8 CDR Profile 1 MAY support XHTML backgrounds

A CDR user agent MAY support compositing above an XHTML document defined as a background image.

3.2.9 CDR Profile 1 MUST support identification of markup and versions in CDF documents

CDR Profile 1 must provide a mechanism for uniquely identifying the markups and versions used in a particular CDF document. It must ensure that this information can be used in content negotiation.

3.2.10 CDR Profile 1 MUST support scalable diagrams that can be animated and can cause link traversal

An example of usage is the implementation of buttons which render their own visual feedback (animated buttons for navigation). These will provide a scalable alternative to the use of images as the source of links that can be traversed. These should be allowed to contain animation, but not rich interaction.

3.2.11 CDR Profile 1 MUST define how to reference SVGT graphics and resources from an XHTML document

A compound document by reference is defined as one root or parent document that makes a reference to separate child documents. Compound Document profiles that include SVG Tiny as referenced documents from XHTML must define how to reference the separate SVG Tiny graphics and resources.

3.2.12 CDR Profile 1 MUST support advertising the specific supported versions of formats and capabilities in headers

When a user agent makes a request for content, it must identify the content types it can support. This is done by specifying the MIME type, version, and profile information, for each supported type of content, in the request. Using HTTP, this can be done with either the HTTP Accept header or UAProf.
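
How a server might read such an advertisement can be sketched as follows. The parser is a simplified illustration that handles only media types and the q parameter; the function name parse_accept and the sample header value are assumptions, and version or profile parameters are not modeled.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip() == "q":
                quality = float(value)
        entries.append((media_type, quality))
    # sorted() is stable, so equally weighted types keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


ranked = parse_accept(
    "text/html;q=0.5, application/xhtml+xml, image/svg+xml;q=0.8"
)
supports_svg = any(media_type == "image/svg+xml" for media_type, _ in ranked)
```

A server could use the ranked list to decide whether to return the compound document as authored or an adapted version for a less capable client.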

3.2.13 CDR Profile 1 MUST support XHTML as a root/host language

However, it is not necessarily the only root hosting language.

3.2.14 The XHTML <object> element MUST be used for referring to other formats from XHTML

It is desirable to achieve the goal of embedding media and other objects into XHTML documents by using the existing <object> element rather than extending XHTML. The XHTML <object> element will be the method by which XML document types will be referenced from XHTML documents.

3.2.15 CDR Profile 1 MUST define the interaction model for an SVG document referenced by an XHTML document

To ensure interoperability, it is important that the CDR Profile 1 specification defines the interaction model for SVG documents referenced by XHTML documents. For example, the interaction model must define if and how interaction with an SVG document requires activation, or if activation is optional, how activation is controlled. By the same token, the interaction model should define the precedence rules between the SVG content and the referencing XHTML document, for example regarding hyperlinking: if an XHTML <object> is enclosed in an <a> element and the <object> references an SVG document which itself displays anchors, what is the behavior when clicking over one of the anchors in the SVG element?

3.2.16 CDR Profile 1 MUST define a way for animated SVG icons to act like HTML images (no need for interactivity, links, zoom and pan)

When used as background images, it is complex and most often irrelevant to provide interaction, such as zooming, panning, linking and mouse event processing, with the SVG graphics. However, animation makes sense in that it can provide decorative value.

3.2.17 CDR Profile 1 MUST define a way for events to trigger SVG animation

This is required in order to support pleasing button animation on select and activate events, such as focus events, beyond the current "on/off" capability in CSS.

3.2.18 CDR Profile 1 MUST define the process for real-estate negotiation between an XHTML document and a referenced SVG document

CDR Profile 1 will provide comprehensive, mandatory rules that specify how document components from different XML vocabularies are scaled when they occur within other components (right-sizing). Support the ability for relative size measures to be used by authors. Ensure that rules define the behavior when fixed and relative sized components are used together within a single document. Ensure that the rules define the behavior associated with areas of the containing component not filled by the contained component. Thus, it will be possible to create truly screen size-independent content.

3.2.19 CDR Profile 1 MUST define handling of leftover SVG area

Rendering SVG content in a fixed-size area often results in visible areas outside of the SVG viewBox. How these areas are filled must be defined.
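
The origin of the leftover area can be illustrated with the following sketch, which scales content uniformly so that it fits entirely inside a box and centers it, corresponding to the behavior of preserveAspectRatio="xMidYMid meet" in SVG. The function name fit_meet and the sample dimensions are hypothetical.

```python
def fit_meet(content_w, content_h, box_w, box_h):
    """Scale content uniformly to fit inside a box and center it.

    Returns (scale, x, y, width, height) of the placed content.
    """
    scale = min(box_w / content_w, box_h / content_h)
    width = content_w * scale
    height = content_h * scale
    x = (box_w - width) / 2
    y = (box_h - height) / 2
    return scale, x, y, width, height


# A 100x50 viewBox rendered into a 200x200 area allocated by the XHTML page.
scale, x, y, width, height = fit_meet(100, 50, 200, 200)
leftover_height = 200 - height
```

In this example the content fills the full width, and the remaining height is split into two equal bands above and below the graphic; these bands are the leftover area whose filling the profile has to define.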

3.2.20 CDR Profile 1 MUST define system font support in SVG

SVGT does not mandate system fonts, but on many devices, when SVG is executed in the context of a user agent, highly optimized system fonts are available.

It is therefore necessary to make highly optimized terminal fonts (used by the user agent) available to SVG as system fonts and to support the ability for common font sets to be used throughout the combined document. In particular, platform fonts should be able to be used if supported, with font fall-back mechanisms to provide defaults in the event that chosen fonts are unavailable. The SVG <text> element should always display something as long as it is within the viewport.

Related to CDR requirement: 3.1.23 CDR User Agents MUST provide a default font for use by all components.

3.2.21 CDR Profile 1 SHOULD provide temporal synchronization with dynamic media

The Profile should be able to synchronize events with the start and end of playback of dynamic media objects, such as a video or audio stream.

3.2.22 CDR Profile 1 MAY provide functionality to stop and start media objects

The profile may include mechanisms to start playback of dynamic content (from the start of a stream) and to stop playback (with automatic rewind).

3.2.23 CDR Profile 1 MUST support a unified rendering and processing model

Ensure that rendering is consistent across all components of a CDR Profile 1 document. In particular, ensure that there are suitable rules that define how z-ordering applies between different components which may overlay one another. Also ensure that there are rules that define how transparency is handled when components overlay one another. CSS z-ordering may provide a mechanism for controlling this.
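
The combination of z-ordering and transparency can be sketched as follows. The sketch assumes the common "source over" compositing rule with non-premultiplied RGBA values between 0.0 and 1.0; this rule and the function names over and composite are assumptions of the sketch, not requirements stated in this document.

```python
def over(top, bottom):
    """Composite one RGBA color over another (non-premultiplied)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def composite(layers):
    """Paint (z_index, rgba) layers from the lowest to the highest z-index."""
    result = (0.0, 0.0, 0.0, 0.0)  # start from a fully transparent canvas
    for _, color in sorted(layers, key=lambda layer: layer[0]):
        result = over(color, result)
    return result


pixel = composite([
    (2, (1.0, 1.0, 1.0, 0.5)),  # half-transparent white component on top
    (1, (0.0, 0.0, 0.0, 1.0)),  # opaque black component underneath
])
```

Because the layers are sorted by z-index before painting, the result does not depend on the order in which the components appear in the list, which is the kind of consistency a unified rendering model is meant to guarantee.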

3.2.24 CDR Profile 1 SHOULD provide a way to play an animation while some referenced components of the Combined Document are loading

The profile should define mechanisms that allow an animation to play while one or several parts of the combined document are loading. For instance, this can be a progress bar informing the end-user about the loading progress or an advertisement which plays while some audio/video content is being fetched.

3.2.25 CDR Profile 1 MUST specify the behavior of audio mixing

CDR Profile 1 must specify how user agents behave when multiple audio streams are played simultaneously within a compound document.

A References (Non-Normative)

B Acknowledgements (Non-Normative)

The editors would like to thank the contributors:

Jon Ferraiolo, Adobe
Vincent Hardy (Working Group Chair), Sun Microsystems Inc.
Scott Hayman, RIM
Dean Jackson (Working Group Team Contact), W3C
Kevin Kelly, IBM
Lasse Pajunen, Nokia
Peter Stark, Sony Ericsson
Petri Vuorimaa, Helsinki University of Technology

C Changes Log (Non-Normative)