Synchronized Multimedia Integration Language (SMIL) Boston Specification

W3C Working Draft 25 February 2000

This version: http://www.w3.org/TR/2000/WD-smil-boston-20000225
(Other formats: single PostScript file, single PDF file, zip archive)
Latest version: http://www.w3.org/TR/smil-boston
Previous version: http://www.w3.org/TR/1999/WD-smil-boston-19991115
Editors: Jeff Ayars (RealNetworks), Dick Bulterman (Oratrix), Aaron Cohen (Intel), Erik Hodge (RealNetworks), Philipp Hoschka (W3C), Eric Hyche (RealNetworks), Ken Day (Macromedia), Kenichi Kubota (Panasonic), Rob Lanphier (RealNetworks), Nabil Layaïda (INRIA), Philippe Le Hégaret (W3C), Thierry Michel (W3C), Jacco van Ossenbruggen (CWI), Lloyd Rutledge (CWI), Bridie Saccocio (RealNetworks), Patrick Schmitz (Microsoft), Warner ten Kate (Philips), Ted Wugofski (Gateway).

Abstract

This document specifies the "Boston" version of the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"). SMIL Boston has the following two design goals:

Define a simple XML-based language that allows authors to write interactive multimedia presentations. Using SMIL Boston, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen.
Allow reuse of SMIL syntax and semantics in other XML-based languages, in particular those that need to represent timing and synchronization. For example, SMIL Boston components should be used for integrating timing into XHTML [XHTML10].

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.

This document is the third Working Draft of the specification for the next version of SMIL code-named "Boston". It has been produced as part of the W3C Synchronized Multimedia Activity. The document has been written by the SYMM Working Group (members only). The goals of this group are discussed in the SYMM Working Group charter (members only).

Many parts of the document are still preliminary, and do not constitute full consensus within the Working Group. Also, some of the functionality planned for SMIL Boston is not contained in this draft. Many parts are not yet detailed enough for implementation, and other parts are only suitable for highly experimental implementation work.

At this point, the W3C SYMM WG seeks input from the public on the concepts and directions described in this specification. Please send your comments to www-smil@w3.org. Since it is difficult to anticipate the number of comments that will come in, the WG cannot guarantee an individual response to all comments. However, we will study each comment carefully, and try to be as responsive as time permits.

This working draft may be updated, replaced or rendered obsolete by other W3C documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This document is work in progress and does not imply endorsement by the W3C membership.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

1. About SMIL Boston

Editors: Philipp Hoschka (ph@w3.org), W3C; Aaron Cohen (aaron.m.cohen@intel.com), Intel

1.1 Introduction

This document specifies the "Boston" version of the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"). SMIL Boston has the following two design goals:

Define a simple XML-based language that allows authors to write interactive multimedia presentations. Using SMIL Boston, an author can describe the temporal behavior of a multimedia presentation, associate hyperlinks with media objects and describe the layout of the presentation on a screen.
Allow reuse of SMIL syntax and semantics in other XML-based languages, in particular those that need to represent timing and synchronization. For example, SMIL Boston components should be used for integrating timing into XHTML.

SMIL Boston is defined as a set of markup modules, which define the semantics and an XML syntax for certain areas of SMIL functionality. All modules have an associated Document Object Model (DOM).

SMIL Boston deprecates a small amount of SMIL 1.0 syntax in favor of more DOM-friendly syntax. Most notable is the change from hyphenated attribute names to mixed case (camel case) attribute names, e.g., clipBegin is introduced in favor of clip-begin. The SMIL Boston modules do not require support for these SMIL 1.0 attributes, so that integration applications are not burdened with them. SMIL document players, that is, those applications that support playback of "application/smil" documents (or however we denote SMIL documents vs. integration documents), must support the deprecated SMIL 1.0 attribute names as well as the new SMIL Boston names.
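
The renaming convention can be illustrated with a minimal Python sketch. Only the pair clip-begin / clipBegin is taken from this specification; the helper name to_camel_case is hypothetical, and the mechanical rule shown here is an assumption that may not cover every deprecated attribute name.

```python
def to_camel_case(hyphenated: str) -> str:
    """Convert a hyphenated attribute name (e.g. 'clip-begin') to mixed case."""
    first, *rest = hyphenated.split("-")
    # Keep the first segment unchanged and capitalize the first letter
    # of every following segment.
    return first + "".join(part[:1].upper() + part[1:] for part in rest)


print(to_camel_case("clip-begin"))  # clipBegin
```

A player that must accept both spellings could apply such a mapping when parsing, so that the rest of its code only deals with the mixed-case names.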

This specification is structured as a set of sections, each defining a module:

Section 2 presents an overview of the individual modules, and gives example profiles.
Section 3 defines the declarative animation module.
Section 4 presents the content control module, such as the switch and preload elements.
Section 5 describes the SMIL Boston basic layout module.
Section 6 defines the linking module.
Section 7 presents the media object module.
Section 8 defines the metadata module.
Section 9 defines the SMIL Boston structure module, including the head and body elements.
Section 10 defines the SMIL timing and synchronization module.
Section 11 describes the means of integrating SMIL timing into other XML-based languages.
Section 12 presents the transition effects module.
Section 13 defines the SMIL DOM interfaces for all of the above modules.

This specification also defines three profiles that are built using the above SMIL modules:

Section 14 defines the SMIL Boston Language Profile.
Section 15 defines the HTML + SMIL Language Profile.
Section 16 describes the SMIL Basic Language Profile requirements.

Finally, this specification defines a number of baseline media formats to be widely supported by SMIL players:

Section 17 presents a list of baseline media formats.

1.2 Acknowledgements

This document has been prepared by the Synchronized Multimedia Working Group (SYMM-WG) of the World Wide Web Consortium. The WG includes the following individuals:

Jin Yu, Compaq
Pietro Marchisio, CSELT
Lynda Hardman, CWI
Jacco van Ossenbruggen, CWI
Lloyd Rutledge, CWI
Ted Wugofski, Gateway (Invited Expert)
Masayuki Hiyama, Glocomm
Keisuke Kamimura, Glocomm
Michelle Y. Kim, IBM
Steve Wood, IBM
Nabil Layaïda, INRIA
Muriel Jourdan, INRIA
Aaron Cohen, Intel
Wayne Carr, Intel
Ken Day, Macromedia
Daniel Weber, Matsushita
Patrick Schmitz, Microsoft
Debbie Newman, Microsoft
Pablo Fernicola, Microsoft
Wo Chang, NIST
Didier Chanut, Nokia
Jack Jansen, Oratrix
Sjoerd Mullender, Oratrix
Dick Bulterman, Oratrix
Kenichi Kubota, Panasonic
Warner ten Kate, Philips
Ramon Clout, Philips
Jeff Ayars, RealNetworks
Erik Hodge, RealNetworks
Rob Lanphier, RealNetworks
Bridie Saccocio, RealNetworks
Eric Hyche, RealNetworks
Geoff Freed, WGBH
Philipp Hoschka, W3C
Philippe Le Hégaret, W3C
Thierry Michel, W3C.

2. Synchronized Multimedia Integration Language (SMIL) Modules

Editors: Warner ten Kate <warner.ten.kate@philips.com>,
Ted Wugofski <wugofted@gateway.com>,
Patrick Schmitz <pschmitz@microsoft.com>.

2.1 Introduction

Since the publication of SMIL 1.0 [SMIL10], interest in the integration of SMIL concepts with HTML, the Hypertext Markup Language [HTML40], and other XML languages has grown. Likewise, the W3C HTML Working Group is specifying how XHTML, the Extensible Hypertext Markup Language [XHTML10], can be integrated with other languages. The strategy considered for integrating respective functionality with other XML languages is based on the concepts of modularization and profiling [MODMOD], [SMIL-MOD], [XMOD], [XPROF].

Modularization is a solution in which a language's functionality is partitioned into sets of semantically-related elements. Profiling is the combination of these feature sets to solve a particular problem. For the purposes of this specification we define:

element: An element is a representation of a semantic feature. An element has one representation in any given syntax.
module: A module is a collection of semantically-related elements.
module family: A module family is a collection of semantically-related modules. Each element is in one and only one module family. Modules in a module family are generally ordered by increasing functionality (each module is generally inclusive of the previous module in the module family).
profile: A profile is a collection of modules particular to an application domain or language. For example, the SMIL profile corresponds to the collection of modules that make up the SMIL language. Likewise, an enhanced television profile would correspond to the collection of modules for media-enhancement of broadcast television. In general, a profile would include only one module from a particular module family.
profile family: A profile family is a collection of profiles which all share a common set of modules. Those modules are defined as mandatory to a profile which wishes to be part of that profile family. Examples are the XHTML family and the SMIL family.

SMIL functionality is partitioned into modules based on the following design requirements:

Ensure that a profile may be defined that is completely backward compatible with SMIL 1.0.
Ensure that a module's semantics maintain compatibility with SMIL semantics (this includes content and timing).
Specify modules that are isomorphic with other modules based on W3C recommendations.
Specify modules that can complement XHTML modules.
Adopt new W3C recommendations when appropriate and not in conflict with other requirements.
Specify how the modules support the document object model.

The first requirement is that modules are specified such that a collection of modules can be "recombined" in such a way as to be backward compatible with SMIL (it will properly play SMIL conforming content).

The second requirement is that the semantics of SMIL must not change when they are embodied in a module. Fundamentally, this ensures the integrity of the SMIL content and timing models. This is particularly relevant when a different syntax is required to integrate SMIL functionality with other languages.

The third requirement is that modules be isomorphic with other modules from other W3C recommendations. This will assist designers when sharing modules across profiles.

The fourth requirement is that specific attention be paid to providing multimedia functionality to the XHTML language. XHTML is the reformulation of HTML in XML.

The fifth requirement is that the modules should adopt new W3C recommendations when they are appropriate and when they do not conflict with other requirements (such as complementing the XHTML language).

The sixth requirement is to ensure that modules have integrated support for the document object model. This facilitates additional control through scripting and user agents.

These requirements, and the ongoing work by the SYMM Working Group, led to a partitioning of SMIL functionality into nine modules.

2.2 SMIL Modules

SMIL functionality is partitioned into nine (9) modules:

Animation Module
Content Control Module
Layout Module
Linking Module
Media Object Module
Metainformation Module
Structure Module
Timing and Synchronization Module
Transition Effects Module

Each of these modules introduces a set of semantically-related elements, properties, and attributes.

Further, there are the DOM modules [DOM1], [DOM2], [SMIL-DOM]. A profile may include DOM support. The part of the DOM that is supported corresponds to the modules selected in the profile.

2.2.1 Animation Module

Elements	Attributes	Minimal Content Model
animate	TBD	TBD
set	TBD	TBD
animateMotion	TBD	TBD
animateColor	TBD	TBD

When this module is used, it adds the animate, set, animateMotion, and animateColor elements to the content model of the par, seq, and excl elements of the Timing and Synchronization Module. It also adds these elements to the content model of the body element of the Structure Module.

2.2.2 Content Control Module

The Content Control Module provides a framework for selecting content based on a set of test attributes. The Content Control Module defines semantics for the switch element.

Elements	Attributes	Minimal Content Model
switch	TBD	TBD
-	test attributes	N/A

When this module is used, it adds the switch element to the content model of the par, seq, and excl elements of the Timing and Synchronization Module. It also adds this element to the content models of the body and head elements of the Structure Module and of the a element of the Linking Module.

Further, when this module is used, the test attributes are added to the attribute lists of all the elements in the Layout Module, the Media Object Module, the Timing and Synchronization Module, and the Transition Effects Module.

These additions take effect only when the modules mentioned are part of the profile at hand.
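
The selection idea behind the switch element can be sketched in Python. This draft still lists the switch attributes and content model as TBD, so the sketch assumes the SMIL 1.0 behavior, in which the first acceptable child is chosen. The names select_child and min_bitrate, the child labels, and the representation of tests as predicates over an environment dictionary are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A test is any predicate over the playback environment (hypothetical model).
Test = Callable[[dict], bool]


def select_child(children: Sequence[Tuple[str, Sequence[Test]]],
                 env: dict) -> Optional[str]:
    """Return the first child whose tests all pass, or None if none does."""
    for name, tests in children:
        if all(test(env) for test in tests):
            return name
    return None


def min_bitrate(required: int) -> Test:
    """Build a test that passes when the available bitrate is high enough."""
    return lambda env: env["bitrate"] >= required


# Children are listed in document order; a child without tests always passes.
children = [
    ("high-quality-video", [min_bitrate(56000)]),
    ("low-quality-video", [min_bitrate(28800)]),
    ("fallback-image", []),
]

print(select_child(children, {"bitrate": 33600}))  # low-quality-video
```

Because evaluation stops at the first acceptable child, authors would order alternatives from most to least demanding.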

2.2.3 Layout Module

The Layout Module provides a framework for spatial layout of visual components. The Layout Module defines semantics for the layout, root-layout, and region elements.

Elements	Attributes	Minimal Content Model
layout	TBD	TBD
root-layout	TBD	TBD
region	TBD	TBD

When this module is used, it adds the layout element to the content model of the head element of the Structure Module. It also adds this element to the content model of the switch element of the Content Control Module.

2.2.4 Linking Module

The Linking Module provides a framework for relating documents to content, documents and document fragments. The Linking Module defines semantics for the a and area elements.

Elements	Attributes	Minimal Content Model
a	TBD	TBD
area	TBD	TBD

When this module is used, it adds the area and a elements to the content model of the par, seq, and excl elements of the Timing and Synchronization Module. It also adds these elements to the content model of the body element of the Structure Module.

2.2.5 Media Object Module

When this module is used, it adds the ref, animation, audio, img, video, text, and textstream elements to the content model of the par, seq, and excl elements of the Timing and Synchronization Module. It also adds these elements to the content model of the body element of the Structure Module. It also adds these elements to the content model of the a element of the Linking Module.

2.2.6 Metainformation Module

The Metainformation Module provides a framework for describing a document, either to inform the human user or to assist in automation. The Metainformation Module defines semantics for the meta element.

Elements	Attributes	Minimal Content Model
meta	TBD	TBD

When this module is used, it adds the meta element to the content model of the head element of the Structure Module.

2.2.7 Structure Module

The Structure Module provides a framework for structuring a SMIL document. The Structure Module defines semantics for the smil, head, and body elements.

Elements	Attributes	Minimal Content Model
smil	Core, Accessibility, xmlns	head?, body?
head	Core, Accessibility, profile	meta*, ( switch | layout )?
body	Core, Accessibility	( Schedule | MediaContent | MediaControl | LinkAnchor )*
-	skipContent	N/A

This module is a mandatory part in any profile family labeled "SMIL".

When this module is used, the id, title, and skipContent attributes are added to all other modules used, including modules of other, non-SMIL, origin.

2.2.8 Timing and Synchronization Module

The Timing and Synchronization Module provides a framework for describing timing structure, timing control properties, and temporal relationships between elements. The Timing and Synchronization Module defines semantics for par, seq, and excl elements. In addition, this module defines semantics for attributes including begin, dur, end, repeatCount, repeatDur, and others.

@@ Make "and others" explicit.

@@ These enumerations need check on completeness and correctness.

Elements	Attributes	Minimal Content Model
par, seq, excl	TBD	TBD
-	begin, end, dur, repeatCount, repeatDur, TBD	TBD

This module is mandatory in any profile incorporating SMIL modules. It is therefore a mandatory module in any profile in the SMIL family. Note that when a profile integrates SMIL timing with other, non-SMIL, modules, the elements from this Timing and Synchronization Module may appear as attributes on the elements of the other XML language, rather than as elements themselves.

The timing attributes are used by all the elements in the Media Object Module, the Linking Module, the Content Control Module, and the Timing and Synchronization Module. This applies only when those modules are part of the profile. Because the elements from this module may appear as attributes instead of elements when integrated with non-SMIL modules, the referenced timing attributes are also used by those non-SMIL elements.

2.2.9 Transition Effects Module

The Transition Effects Module defines a taxonomy of transition effects as well as semantics and syntax for integrating these effects into XML documents.

Elements	Attributes	Minimal Content Model
TBD	TBD	TBD

When this module is used, it adds the TBD element to the content model of the layout element of the Layout Module.

2.3 Isomorphism

A requirement for SMIL modularization is that the modules be isomorphic with other modules from other W3C recommendations. Isomorphism will assist designers when sharing modules across profiles. The Table below lists the isomorphism between SMIL and XHTML modules.

Table -- Isomorphism between SMIL modules and their corresponding XHTML modules.
SMIL modules		XHTML modules
Module Name	Elements	Module Name	Elements
Animation	animate, set, animateMotion, animateColor	-	-
Content Control	switch	-	-
Layout	layout, region, root-layout	Stylesheet	style
Linking	a, area	Hypertext	a
Linking	a, area	Client-side Image Map	map, area
Media Object	ref, audio, video, text, img, animation, textstream	Object	object, param
		Image	img
		Applet	applet, param
Metainformation	meta	Metainformation	meta
		Link	link
		Base	base
Structure	smil, head, body	Structure	html, head, body, title, span, div
Timing and Synchronization	par, seq, excl	-	-
Transition Effects	transition	-	-

As can be seen in the table, the Metainformation module appears in both SMIL and HTML. Work is underway to define a single module that can be shared by both SMIL and HTML. In SMIL Boston the Linking Module has been adapted towards isomorphism with the corresponding modules in XHTML.

2.4 Multimedia Profiles

There is a range of possible profiles that may be built using SMIL modules. Five profiles are defined to inform the reader of how profiles may be constructed to solve particular problems:

Lightweight Presentations Profile
SMIL-Boston Profile
SMIL-Basic Profile
HTML+SMIL Profile
Web Enhanced Media Profile

These example profiles are non-normative.

2.4.1 Lightweight Presentations Profile

The Lightweight Presentations Profile handles simple presentations, supporting timing of text content. The simplest version of this could be used to sequence stock quotes or headlines on constrained devices such as a palmtop device or a smart phone. This example profile might include the following SMIL modules:

Timing and Synchronization Module
Transition Effects Module
Animation Module

This profile may be based on XHTML modules [XMOD] with the addition of Timing and Synchronization Module.

2.4.2 SMIL-Boston Profile

The SMIL-Boston Profile supports the timeline-centric multimedia features found in languages of the SMIL family. This profile is specified in the SMIL Boston Profile and includes the following SMIL modules:

Structure Module
Metainformation Module
Timing and Synchronization Module
Transition Effects Module
Animation Module
Content Control Module
Media Object Module
Layout Module
Linking Module

2.4.3 SMIL-Basic Profile

The SMIL-Basic Profile supports a lightweight version of the SMIL-Boston profile and is intended for use with resource-constrained devices such as mobile phones. This profile is part of the SMIL family and might include the following SMIL modules:

@@ Keep aligned with the requirements document.

Structure Module
Timing and Synchronization Module
Layout Module
Media Object Module
Linking Module

2.4.4 HTML+SMIL Profile

The HTML+SMIL Profile integrates SMIL timing into HTML. This profile is specified in the HTML+SMIL Profile and includes the following SMIL modules:

Timing and Synchronization Module
Transition Effects Module
Animation Module
Content Control Module
Media Object Module
Linking Module

This profile uses XHTML modules for structure and layout and SMIL modules for multimedia and timing. Since the Linking modules from the XHTML modules [XMOD] and the SMIL modules are isomorphic, basically the Linking Module may come from either module set. However, the SMIL Linking Module adds some additional attributes and semantics.

@@ Aren't these attributes and semantics already added through the Timing & Synchronization Module?

2.4.5 Web Enhanced Media Profile

The Web Enhanced Media Profile supports the integration of multimedia presentations with broadcast or on-demand streaming media. The primary media will often define the main timeline. This profile might include the following SMIL modules:

Timing and Synchronization Module
Transition Effects Module
Media Object Module
Linking Module

This profile is a lightweight version of the HTML+SMIL Profile in that it supports a smaller subset of functionality taken from the XHTML and SMIL modules. It differs from the SMIL-Basic Profile through its integration with XHTML.
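
As an illustration of how profiles are assembled from modules, the following Python sketch records the module lists of the five example profiles and checks each against the modules described as mandatory for the SMIL family. Treating the Structure Module (section 2.2.7) and the Timing and Synchronization Module (section 2.2.8) together as the mandatory set is an interpretation of those sections, and the names SMIL_FAMILY_MANDATORY, EXAMPLE_PROFILES, and in_smil_family are hypothetical.

```python
# Modules that sections 2.2.7 and 2.2.8 describe as mandatory for the
# SMIL family (an interpretation, not a normative definition).
SMIL_FAMILY_MANDATORY = frozenset({"Structure", "Timing and Synchronization"})

# Module lists copied from the example profiles in section 2.4.
EXAMPLE_PROFILES = {
    "Lightweight Presentations": {
        "Timing and Synchronization", "Transition Effects", "Animation",
    },
    "SMIL-Boston": {
        "Structure", "Metainformation", "Timing and Synchronization",
        "Transition Effects", "Animation", "Content Control",
        "Media Object", "Layout", "Linking",
    },
    "SMIL-Basic": {
        "Structure", "Timing and Synchronization", "Layout",
        "Media Object", "Linking",
    },
    "HTML+SMIL": {
        "Timing and Synchronization", "Transition Effects", "Animation",
        "Content Control", "Media Object", "Linking",
    },
    "Web Enhanced Media": {
        "Timing and Synchronization", "Transition Effects",
        "Media Object", "Linking",
    },
}


def in_smil_family(modules):
    """Return True if every mandatory SMIL-family module is present."""
    return SMIL_FAMILY_MANDATORY <= set(modules)


for name, modules in EXAMPLE_PROFILES.items():
    print(name, in_smil_family(modules))
```

Under this reading, only the SMIL-Boston and SMIL-Basic example profiles qualify, since the others take their document structure from XHTML or leave it unspecified.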

3. The SMIL Animation Module

Editors: Patrick Schmitz (pschmitz@microsoft.com), (Microsoft); Aaron Cohen (aaron.m.cohen@intel.com), (Intel); Ken Day (kday@macromedia.com), (Macromedia)

3.1 Introduction

@@ "SMIL Boston" is used here for clarity -- need to distinguish SMIL 1.0, the (standalone) SMIL Animation module now in "last call", and this module. This will be corrected prior to going to Last Call.

This section defines the SMIL Boston Animation module. SMIL animation is a framework for incorporating animation onto a time line and a mechanism for composing the effects of multiple animations. It includes a set of basic animation elements that can be applied to any XML-based language. Since these elements and attributes are defined in a module, designers of other markup languages can reuse the functionality in the SMIL animation module when they need to include animation in their language.

This module is built upon the functionality of the first version of the SMIL Animation [SMIL-ANIMATION] module, currently in last call. The timing model included in the first version is in turn based upon SMIL 1.0 [SMIL10], with some changes and extensions to support interactive (event-based) timing. The extensions in that version of Animation are compatible with a core subset of the functionality expected to be included in the SMIL Timing module.

This two-version approach has been used in order to facilitate release of a first version of SMIL Animation well before SMIL will be ready to go to Recommendation status.

In this version, the SMIL animation module has been reworked to directly use the SMIL timing module. It does not redefine timing markup specifically for the purpose of animation. It has also been extended to include time containers like <par> and <seq>, which were not supported in the first version.

The reader is presumed to have read and be familiar with the SMIL Timing module, on which this module depends.

While this document defines a base set of animation capabilities, it is assumed that host languages may build upon the support to define additional and/or more specialized animation elements. Animation only manipulates attributes and properties of the target elements, and so does not require any knowledge of the target element semantics beyond basic type information.

The examples in this document that include syntax for a host language use SMIL, SVG, XHTML and CSS. These are provided as an indication of possible integrations with various host languages. @@ May be changed to SMIL-only examples prior to going to Recommendation.

Unresolved intra-SMIL references

@@@ Refs to other SMIL modules, to be fixed:

[wd-timing-repeatAttrs]: Definition of repeatCount & repeatDur attrs.
[wd-timing-TimingAndRealWorldClockTime]: This was a section in the "standalone" draft. Is there a counterpart in the Timing module?
[wd-timing-Restart]: Definition of restart
[wd-timing-TimingAttrsEntity]: DTD for timing attributes
[wd-timing-PropagatingTimes]: This was a section in the "standalone" draft. Is there a counterpart in the Timing module?
[wd-some-IDAttribute]: Definition of the ID attribute.

3.2 Overview and terminology

3.2.1 Basics of animation

Animation is inherently time-based. SMIL animation is defined in terms of the SMIL timing model. The animation capabilities are described by new elements with associated attributes and semantics, as well as the SMIL timing attributes. Animation is modeled as a function that changes the presented value of a specific attribute over time.

Animation is defined as a time-based manipulation of a target element (or more specifically of some attribute of the target element, the target attribute). The animation defines a mapping of time to values for the target attribute. This mapping takes into account all aspects of timing, as well as animation-specific semantics. It is based on an animation function that produces a value for the target attribute for any time within the simple duration.

The target attribute is the name of a feature of a target element as defined in a host language document. This may be (e.g.) an XML attribute contained in the element or a CSS property that applies to the element. By default, the target element of an animation will be the parent of the animation element (an animation element is typically a child of the target element). However, the target may be any element in the document, identified either by an ID reference or via an XLink [XLINK] locator reference.

When an animation is running, it does not actually change the attribute values in the DOM [DOM2]. The animation runtime must maintain a presentation value for each animated attribute, separate from the DOM or CSS Object Model (OM). If an implementation does not support an object model, it must maintain the original value as defined by the document as well as the presentation value. The presentation value is reflected in the display form of the document. Animations thus manipulate the presentation value, and do not affect the base value exposed by DOM or CSS OM.
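
The separation between base value and presentation value can be sketched as follows. This is a minimal Python illustration, not an implementation requirement; the class AnimatedAttribute and its method names are hypothetical.

```python
class AnimatedAttribute:
    """Keeps the base (DOM) value separate from the presentation value."""

    def __init__(self, base_value):
        self.base_value = base_value   # value exposed by the DOM or CSS OM
        self._animated = False         # True while an animation effect applies
        self._animation_value = None   # value produced by the animation

    def apply_animation_value(self, value):
        """Record the value produced by a running animation."""
        self._animation_value = value
        self._animated = True

    def clear_animation(self):
        """Stop applying the animation effect."""
        self._animation_value = None
        self._animated = False

    @property
    def presentation_value(self):
        """Value used for display: the animated value if any, else the base."""
        return self._animation_value if self._animated else self.base_value


width = AnimatedAttribute(10)
width.apply_animation_value(55)
print(width.base_value, width.presentation_value)  # 10 55
```

Scripts that read the attribute through the object model would see base_value, while the renderer would use presentation_value.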

The animation function is evaluated as needed over time by the implementation, and the resulting values are applied to the presentation value for the target attribute. Animation functions are continuous in time and can be sampled at whatever frame rate is appropriate for the rendering system. The syntactic representation of the animation function is independent of this model, and may be described in a variety of ways. The animation elements in this specification support syntax for a set of discrete or interpolated values, a path syntax for motion based upon SVG paths, key-frame based timing, evenly paced interpolation, and variants on these features. Animation functions could be defined that were purely or partially algorithmic (e.g. a random value function or a motion animation that tracks the mouse position). In all cases, the animation exposes this as a function of time.

The presentation value reflects the effect of the animation upon the base value. The effect is the change to the value of the target attribute at any given time. When an animation completes, the effect of the animation is no longer applied, and the presentation value reverts to the base value by default. The animation effect can also be extended to freeze the last value for the length of time determined by the semantics of the fill attribute.

Animations can be defined to either override or add to the base value of an attribute. In this context, the base value may be the DOM value, or the result of other animations that also target the same attribute. This more general concept of a base value is termed the underlying value. Animations that add to the underlying value are described as additive animations. Animations that override the underlying value are referred to as non-additive animations.
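
The difference between additive and non-additive animations can be sketched in Python for numeric attributes. The function name compose is hypothetical, and the order in which several animations on the same attribute are applied is an assumption of this sketch (the list is simply processed front to back).

```python
def compose(base_value, effects):
    """Combine animation effects on one attribute, starting from the base value.

    `effects` is a list of (value, additive) pairs. Each animation sees the
    result of the previous ones as its underlying value.
    """
    underlying = base_value
    for value, additive in effects:
        # Additive animations add to the underlying value;
        # non-additive animations override it.
        underlying = underlying + value if additive else value
    return underlying


print(compose(20, [(10, True)]))              # 30
print(compose(20, [(10, False)]))             # 10
print(compose(20, [(10, False), (5, True)]))  # 15
```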

As a simple example, the following defines an animation of an SVG rectangle shape. The rectangle will change from being tall and thin to being short and wide.

<rect ...>
   <animate attributeName="width"  from="10px"  to="100px" 
            begin="0s" dur="10s" />
   <animate attributeName="height" from="100px" to="10px"
            begin="0s" dur="10s" />
</rect>

The rectangle begins with a width of 10 pixels and increases to a width of 100 pixels over the course of 10 seconds. Over the same ten seconds, the height of the rectangle changes from 100 pixels to 10 pixels.

3.2.2 Animation function values

Many animations specify the animation function f(t) as a sequence of values to be applied over time. For some types of attributes (e.g. numbers), it is also possible to describe an interpolation function between values.

As a simple form of describing the values, animation elements can specify a from value and a to value. If the attribute takes values that support interpolation (e.g. a number), the animation function can interpolate values in the range defined by from and to, over the course of the simple duration. A variant on this uses a by value in place of the to value, to indicate an additive change to the attribute.

More complex forms specify a list of values, or even a path description for motion. Authors can also control the timing of the values, to describe "key-frame" animations, and even more complex functions.
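
The from/to and from/by forms described in this section can be sketched in Python for numeric attributes. Linear interpolation is assumed here, and the sketch treats a by value as an offset from the from value; the function name make_simple_function and its parameters are hypothetical.

```python
from typing import Callable, Optional


def make_simple_function(d: float, from_value: float,
                         to: Optional[float] = None,
                         by: Optional[float] = None) -> Callable[[float], float]:
    """Build f(t) for 0 <= t <= d from a from value and a to or by value."""
    if (to is None) == (by is None):
        raise ValueError("specify exactly one of 'to' or 'by'")
    # A by value is interpreted relative to the from value.
    end_value = to if to is not None else from_value + by

    def f(t: float) -> float:
        if not 0 <= t <= d:
            raise ValueError("t is outside the simple duration")
        # Linear interpolation between the start and end values.
        return from_value + (end_value - from_value) * (t / d)

    return f


# Values taken from the rectangle example in section 3.2.1.
width = make_simple_function(10, 10, to=100)
height = make_simple_function(10, 100, to=10)
print(width(5), height(5))  # 55.0 55.0
```

Halfway through the ten seconds both attributes reach 55 pixels, which matches the description of the rectangle changing from tall and thin to short and wide.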

3.2.3 Symbols used in the semantic descriptions

f(t): The simple animation function that maps times within the simple duration to values for the target attribute (0 <= t <= simple duration). Note that while F(t) defines the mapping for the entire animation, f(t) has a simplified model that just handles the simple duration.
F(t): The effect of an animation for any point in the animation. This maps any non-negative time to a value for the target attribute. A time value of 0 corresponds to the time at which the animation begins. Note that F(t) combines the animation function f(t) with all the other aspects of animation and timing controls.
B: The begin of an animation.
d: The simple duration of an animation.
AD: The active duration of an animation. This is the period during which time is actively advancing for the animation. This includes any effect of repeating the simple duration, but does not include the time during which the animation may be frozen.
AE: The active end. This is the end of the active duration of an animation.

3.3 Animation model

This section describes the attribute syntax and semantics for describing animations. The specific elements are not described here, but rather the common concepts and syntax that comprise the model for animation. Document issues are described, as well as the means to target an element for animation. The animation model is then defined by building up from the simplest to the most complex concepts: first the simple duration and animation function f(t), and then the overall behavior F(t). Finally, the model for combining animations is presented, and additional details of implications of the timing model on animation are described.

3.3.1 Specifying the animation target

The animation target is defined as a specific attribute of a particular element. The means of specifying the target attribute and the target element are detailed in this section.

The Target attribute

The target attribute to be animated is specified with attributeName. The value of this attribute is a string that specifies the name of the target attribute, as defined in the host language.

The attributes of an element that can be animated are often defined by different languages, and/or in different namespaces. For example, in many XML applications, the position of an element (which is a typical target attribute) is defined as a CSS property rather than as XML attributes. In some cases, the same attribute name is associated with attributes or properties in more than one language, or namespace. To allow the author to disambiguate the name mapping, an additional attribute attributeType is provided that specifies the intended namespace.

The attributeType attribute is optional. By default, the animation runtime will resolve the names according to the following rule: If there is a name conflict and attributeType is not specified, the CSS namespace is matched first (if CSS is supported in the host language), followed by the default namespace for the target element.

If a target attribute is defined in an XML Namespace other than the default namespace for the target element, the author must specify the namespace of the target attribute using the associated namespace prefix as defined in the scope of the target element. The prefix is prepended to the value for attributeName.

For more information on XML namespaces, see [XML-NS].

attributeName = <attributeName>

Specifies the name of the target attribute. An XMLNS prefix may be used to indicate the XML namespace for the attribute. The prefix will be interpreted in the scope of the target element.

attributeType = "CSS" | "XML" | "auto"

Specifies the namespace in which the target attribute and its associated values are defined. The attribute value is one of the following (values are case-sensitive):

"CSS": This specifies that the value of "attributeName" is the name of a CSS property, as defined for the host document. This argument value is only meaningful in host language environments that support CSS.
"XML": This specifies that the value of "attributeName" is the name of an XML attribute defined in the default XML namespace for the target element. If the value for attributeName has an XMLNS prefix, the implementation must use the associated namespace as defined in the scope of the target element.
"auto": The implementation should match the attributeName to an attribute for the target element. The implementation must first search through the CSS namespace for a matching property name, and if none is found, search the XML namespace.
This is the default.
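
The resolution order described above can be sketched as follows. This is a minimal illustration in Python and not part of the specification: the function name resolve_attribute, its parameters, and the use of plain sets to stand in for the CSS properties and XML attributes known to the host language are all assumptions made for the example. Raising an error for an unrecognized attributeType value is likewise an assumption.

```python
def resolve_attribute(attribute_name, attribute_type="auto",
                      css_properties=frozenset(), xml_attributes=frozenset(),
                      css_supported=True):
    """Return "CSS" or "XML" depending on where the name resolves, or None.

    css_properties and xml_attributes are hypothetical stand-ins for what
    the host language defines for the target element.
    """
    in_css = css_supported and attribute_name in css_properties
    in_xml = attribute_name in xml_attributes

    if attribute_type == "CSS":
        return "CSS" if in_css else None
    if attribute_type == "XML":
        return "XML" if in_xml else None
    if attribute_type == "auto":
        # CSS is searched first; the XML namespace is the fallback.
        if in_css:
            return "CSS"
        return "XML" if in_xml else None
    # Values are case-sensitive, so "css" or "Auto" end up here (assumption).
    raise ValueError("unknown attributeType: %r" % attribute_type)
```

With a name defined in both places, "auto" picks the CSS property; if the host language does not support CSS, the same call falls back to the XML attribute.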

The Target element

An animation element can define the target element of the animation either explicitly or implicitly. An explicit definition uses an attribute to specify the target element. The syntax for this is described below.

If no explicit target is specified, the implicit target element is the parent element of the animation element in the document tree. It is expected that the common case will be that an animation element is declared as a child of the element to be animated. In this case, no explicit target need be specified.

If an explicit target element reference cannot be resolved (e.g. no such element can be found), the animation has no effect. In addition, if the target element (either implicit or explicit) does not support the specified target attribute, the animation has no effect. See also Handling syntax errors.

The following two attributes can be used to identify the target element explicitly:

targetElement = "<IDREF>": This attribute specifies the target element to be animated. The attribute value must be the value of an XML identifier attribute of an element within the host document. For a formal definition of "IDREF", refer to XML 1.0 [XML10].
href = uri-reference: This attribute specifies an XLink locator, referring to the target element to be animated.

When integrating animation elements into the host language, the language designer should avoid including both of these attributes. If, however, both attributes must be included in the host language, and they both occur in an animation element, the XLink "href" attribute takes precedence over the "targetElement" attribute.
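
The target lookup rules above (explicit "href" first, then "targetElement", otherwise the parent element) can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under stated assumptions, not a normative algorithm: the helper name resolve_target is hypothetical, the identifier attribute is assumed to be named "id", the "href" attribute is read without a namespace prefix as in the examples of this section, and only same-document references of the form "#id" are handled.

```python
import xml.etree.ElementTree as ET


def resolve_target(root, anim):
    """Return the target element for an animation element, or None."""
    # ElementTree has no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

    href = anim.get("href")
    if href is not None:
        # "href" takes precedence over "targetElement" when both occur.
        return by_id.get(href[1:]) if href.startswith("#") else None

    target_id = anim.get("targetElement")
    if target_id is not None:
        # An unresolved reference yields None: the animation has no effect.
        return by_id.get(target_id)

    # No explicit target: the implicit target is the parent element.
    return parent_of.get(anim)
```

A result of None corresponds to the case in which the animation has no effect.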

The advantage of using a "targetElement" attribute is the simpler syntax of the attribute value compared to the "href" attribute. The advantage of using the XLink "href" attribute is that it is extensible to a full linking mechanism in future versions of SMIL Animation, and the animation element can be processed by generic XLink processors. The XLink form is also provided for host languages that are designed to use XLink for all such references. The following two examples illustrate the two approaches.

This example uses the simpler targetElement syntax:

<animate targetElement="foo" attributeName="bar" .../>

This example uses the more flexible XLink locator syntax, with the equivalent target:

<animate href="#foo" attributeName="bar" .../>

When using an XLink "href" attribute on an animation element, the following additional XLink attributes need to be defined in the host language. These may be defined in a DTD, or the host language may require these in the document syntax to support generic XLink processors. For more information, refer to the "XML Linking Language (XLink)" [XLINK].

The following XLink attributes are required by the XLink specification. The values are fixed, and so may be specified as such in a DTD. All other XLink attributes are optional, and do not affect SMIL Animation semantics.

type = 'simple': Identifies the type of XLink being used. To link to the target element, a simple link is used, and thus the attribute value is fixed to "simple".
actuate = 'onLoad': Indicates that the link to the target element is followed automatically (i.e., without user action).
@@ This may be in conflict with the Linking module. OTOH, for our purposes it means basically the same thing. Need to be consistent, of course.
show = 'embed': Indicates that the reference does not include additional content in the file.

Additional details on the target element specification as relates to the host document and language are described in Required definitions and constraints on animation targets.

3.3.2 Specifying the animation function f(t)

Every animation function defines the value of the attribute at a particular moment in time. The time range for which the animation function is defined is the simple duration. The animation function does not produce defined results for times outside the range of 0 to the simple duration.

The animation is described either as a list of values, or in a simplified form that describes the from, to and by values.

from = "<value>": Specifies the starting value of the animation.
to = "<value>": Specifies the ending value of the animation.
by = "<value>": Specifies a relative offset value for the animation.
values = "<list>": A semicolon-separated list of one or more values. Vector-valued attributes are supported using the vector syntax of the attributeType domain.

The animation values specified in the animation element must be legal values for the specified attribute. See also Animation function value details.

Leading and trailing white space, and white space before and after semi-colon separators, will be ignored.

If any values are not legal, the animation will have no effect (see also Handling Syntax Errors).

If a list of values is used, the animation will apply the values in order over the course of the animation (pacing and interpolation between these values is described in "Animation function calculation modes", below). If a list of values is specified, any from, to and by attribute values are ignored.

The simpler from/to/by syntax provides for several variants. Note that from is optional, but that one of by or to must be used (unless of course a list of values is provided). It is not legal to specify both by and to attributes - if both are specified, only the to attribute will be used (the by will be ignored). The combinations of attributes yield the following classes of animation:

from-to animation: Specifying a from value and a to value defines a simple animation, equivalent to a values list with 2 values. The animation function is defined to start with the from value, and to finish with the to value.
from-by animation: Specifying a from value and a by value defines a simple animation in which the animation function is defined to start with the from value, and to change this over the course of the simple duration d by a delta specified with the by attribute. This may only be used with attributes that support addition (e.g. most numeric attributes).
by animation: Specifying only a by value defines a simple animation in which the animation function is defined to offset the underlying value for the attribute, using a delta that varies over the course of the simple duration d, starting from a delta of 0 and ending with the delta specified with the by attribute. This may only be used with attributes that support addition.
to animation: This describes an animation in which the animation function is defined to start with the underlying value for the attribute, and finish with the value specified with the to attribute. Using this form, an author can describe an animation that will start with whatever value the attribute has originally, and will end up at the desired to value.

The last two forms "by animation" and "to animation" have additional semantic constraints when combined with other animations. The details of this are described below in the section How from, to and by attributes affect additive behavior.
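
The four classes of animation above can be sketched for a numeric attribute with linear interpolation. This is a simplified illustration, not the normative definition: the function name animation_value and its parameters are hypothetical, and the underlying value is assumed to be constant with no other animations running, so the additive constraints on "by animation" and "to animation" are not modeled.

```python
def animation_value(t, d, underlying, from_=None, to=None, by=None):
    """Value of f(t) for from-to, from-by, by and to animations.

    t and d are in seconds; underlying is the attribute's own value.
    """
    if not 0 <= t <= d:
        raise ValueError("f(t) is only defined for 0 <= t <= d")
    if to is None and by is None:
        raise ValueError("one of 'to' or 'by' must be specified")

    # When 'from' is omitted the animation starts at the underlying value.
    start = underlying if from_ is None else from_
    # If both 'to' and 'by' are given, only 'to' is used.
    end = to if to is not None else start + by
    return start + (end - start) * (t / d)
```

For instance, a from-by animation with from="50" and by="25" ends at 75, and a by animation with by="30" on an underlying width of 40 ends at 70.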

Interpolation and indefinite simple durations

If the simple duration of an animation is indefinite (e.g. if no dur value is specified), interpolation is not generally meaningful. While it is possible to define an animation function that is not based upon a defined simple duration (e.g. some random number algorithm), most animations define the function in terms of the simple duration. If an animation function is defined in terms of the simple duration and the simple duration is indefinite, the first value of the animation function (i.e. f(0)) should be used (effectively as a constant) for the animation function.

Examples

The following example using the values syntax animates the width of an SVG shape over the course of 10 seconds, interpolating from a width of 40 to a width of 100 and back to 40.

<rect ...>
   <animate attributeName="width" values="40;100;40" dur="10s"/>
</rect>

The following "from-to animation" example animates the width of an SVG shape over the course of 10 seconds from a width of 50 to a width of 100.

<rect ...>
   <animate attributeName="width" from="50" to="100" dur="10s"/>
</rect>

The following "from-by animation" example animates the width of an SVG shape over the course of 10 seconds from a width of 50 to a width of 75.

<rect ...>
   <animate attributeName="width" from="50" by="25" dur="10s"/>
</rect>

The following "by animation" example animates the width of an SVG shape over the course of 10 seconds from the original width of 40 to a width of 70.

<rect width="40"...>
   <animate attributeName="width" by="30" dur="10s"/>
</rect>

The following "to animation" example animates the width of an SVG shape over the course of 10 seconds from the original width of 40 to a width of 100.

<rect width="40"...>
   <animate attributeName="width" to="100" dur="10s"/>
</rect>

Animation function calculation modes

By default, a simple linear interpolation is performed over the values, evenly spaced over the duration of the animation. Additional attributes can be used for finer control over the interpolation and timing of the values. The calcMode attribute defines the basic method of applying values to the attribute. The keyTimes attribute provides additional control over the timing of the animation function, associating a time with each value in the values list. Finally, the keySplines attribute provides a means of controlling the pacing of interpolation between the values in the values list.

calcMode = "discrete" | "linear" | "paced" | "spline"

Specifies the interpolation mode for the animation. This can take any of the following values. The default mode is "linear"; however, if the attribute does not support linear interpolation (e.g. for strings), the calcMode attribute is ignored and discrete interpolation is always used.

"discrete": This specifies that the animation function will jump from one value to the next without any interpolation.
"linear": Simple linear interpolation between values is used to calculate the animation function.
This is the default calcMode.
"paced": Defines interpolation to produce an even pace of change across the animation. This is only supported for values that define a linear numeric range, and for which some notion of "distance" between points can be calculated (e.g. position, width, height, etc.). If "paced" is specified, any keyTimes or keySplines will be ignored.
"spline": Interpolates from one value in the values list to the next according to a time function defined by a cubic Bezier spline. The points of the spline are defined in the keyTimes attribute, and the control points for each interval are defined in the keySplines attribute.

keyTimes = "<list>"

A semicolon-separated list of time values used to control the pacing of the animation. Each time in the list corresponds to a value in the values attribute list, and defines when the value should be used in the animation function. Each time value in the keyTimes list is specified as a floating point value between 0 and 1 (inclusive), representing a proportional offset into the simple duration of the animation element.
If a list of keyTimes is specified, there must be exactly as many values in the keyTimes list as in the values list.
Each successive time value must be greater than or equal to the preceding time value.
The keyTimes list semantics depend upon the interpolation mode:

For linear and spline animation, the first time value in the list must be 0, and the last time value in the list must be 1. The keyTime associated with each value defines when the value is set; values are interpolated between the keyTimes.
For discrete animation, the first time value in the list must be 0. The keyTime associated with each value defines when the value is set; the animation function uses each value until the next keyTime defined.

If there are any errors in the keyTimes specification (bad values, too many or too few values), the animation will have no effect.
If the simple duration is indefinite, any keyTimes specification will be ignored.
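
The keyTimes constraints listed above can be collected into a small checker. The following Python sketch is illustrative only: the names parse_number_list and key_times_errors are hypothetical, and returning a list of messages (rather than, say, raising an exception) is a choice made for the example. An empty result means the list is usable; any message means the animation would have no effect.

```python
def parse_number_list(text):
    """Split a semicolon-separated list; surrounding white space is ignored."""
    return [float(item) for item in text.split(";")]


def key_times_errors(key_times, value_count, calc_mode="linear"):
    """Return a list of problems found in a keyTimes list."""
    if calc_mode == "paced":
        return []  # keyTimes is ignored for paced animation
    errors = []
    if len(key_times) != value_count:
        errors.append("keyTimes and values must have the same length")
    if any(not 0 <= k <= 1 for k in key_times):
        errors.append("each time must be between 0 and 1 inclusive")
    if any(b < a for a, b in zip(key_times, key_times[1:])):
        errors.append("times must not decrease")
    if key_times and key_times[0] != 0:
        errors.append("the first time must be 0")
    if calc_mode in ("linear", "spline") and key_times and key_times[-1] != 1:
        errors.append("the last time must be 1 for linear and spline")
    return errors
```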

keySplines = "<list>"

A set of Bezier control points associated with the keyTimes list, defining a cubic Bezier function that controls interval pacing. The attribute value is a semi-colon separated list of control point descriptions. Each control point description is a set of four floating point values: x1 y1 x2 y2, describing the Bezier control points for one time segment. The keyTimes values that define the associated segment are the Bezier "anchor points", and the keySplines values are the control points.
Thus, there must be one fewer sets of control points than there are keyTimes.
The values must all be in the range 0 to 1.
This attribute is ignored unless the calcMode is set to "spline".
If there are any errors in the keySplines specification (bad values, too many or too few values), the animation will have no effect.

If the keyTimes attribute is not specified, the values in the values attribute are assumed to be equally spaced through the animation duration, according to the calcMode:

For discrete animation, the duration is divided into equal time periods, one per value. The animation function takes on the values in order, one value for each time period.
For linear and spline animation, the duration is divided into n-1 even periods, and the animation function is a linear interpolation between the values at the associated times. Note that a linear animation will be a nicely closed loop if the first value is repeated as the last.

Note that for the shorthand forms to animation and from-to animation, there are only 1 and 2 values respectively. Thus a discrete to animation will simply set the "to" value for the simple duration. A discrete from-to animation will set the "from" value for the first half of the simple duration and the "to" value for the second half of the simple duration.
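
The default spacing rules for the case without keyTimes can be sketched as follows. This Python fragment is an illustration, not a reference implementation: the function name sample_evenly_spaced is hypothetical, only the "discrete" and "linear" modes are covered, and linear mode assumes numeric values.

```python
def sample_evenly_spaced(values, t, d, calc_mode="linear"):
    """Evaluate f(t) for a values list with no keyTimes (0 <= t <= d)."""
    n = len(values)
    progress = t / d
    if calc_mode == "discrete":
        # n equal time periods, one per value; t == d stays on the last value.
        return values[min(int(progress * n), n - 1)]
    if n == 1:
        return values[0]
    # n - 1 equal periods, with linear interpolation inside each one.
    position = progress * (n - 1)
    i = min(int(position), n - 2)
    fraction = position - i
    return values[i] + (values[i + 1] - values[i]) * fraction
```

With values="40;100;40" over 10 seconds this gives 70 at 2.5 seconds and 100 at 5 seconds. A discrete from-to animation corresponds to a two-item list, which switches to the second value at the halfway point.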

Note that if the calcMode is set to "paced", the keyTimes attribute is ignored, and the values in the values attribute are spaced to produce a constant rate of change as the target attribute value is interpolated.

If the argument values for keyTimes or keySplines are not legal (including too few or too many values for either attribute), the animation will have no effect (see also Handling syntax errors).

In the calcMode, keyTimes and keySplines attribute values, leading and trailing white space and white space before and after semi-colon separators will be ignored.

Examples

This example describes a somewhat unusual usage: "from-to animation" with discrete animation. The "stroke-linecap" attribute of SVG elements takes a string, and so implies a calcMode of discrete. The animation will set the stroke-linecap property to "round" for 5 seconds (half the simple duration) and then set the stroke-linecap to "square" for 5 seconds.

<rect stroke-linecap="butt"...>
   <animate attributeName="stroke-linecap" 
      from="round" to="square" dur="10s"/>
</rect>

This example illustrates the use of keyTimes:

<animate attributeName="x" dur="10s" values="0; 50; 100" 
     keyTimes="0; .8; 1" calcMode="linear"/>

The keyTimes values cause the "x" attribute to have a value of "0" at the start of the animation, "50" after 8 seconds (at 80% into the simple duration) and "100" at the end of the animation. The value will change more slowly over the first interval (from "0" to "50" in 8 seconds), and more quickly over the second interval (from "50" to "100" in 2 seconds).
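
The keyTimes example above can be checked numerically with a short Python sketch. The function name sample_with_key_times is hypothetical, the keyTimes list is assumed to be valid already, and only the "linear" and "discrete" modes are handled.

```python
from bisect import bisect_right


def sample_with_key_times(values, key_times, t, d, calc_mode="linear"):
    """Evaluate f(t) using keyTimes given as fractions of d (0 <= t <= d)."""
    progress = t / d
    # Index of the last keyTime that is <= progress.
    i = bisect_right(key_times, progress) - 1
    if calc_mode == "discrete" or i >= len(values) - 1:
        return values[min(i, len(values) - 1)]
    span = key_times[i + 1] - key_times[i]
    fraction = (progress - key_times[i]) / span
    return values[i] + (values[i + 1] - values[i]) * fraction
```

For values="0; 50; 100" and keyTimes="0; .8; 1" over 10 seconds, this yields 25 at 4 seconds, 50 at 8 seconds and 75 at 9 seconds, that is, 6.25 units per second in the first interval and 25 units per second in the second.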

Extending this example to use keySplines:

<animate attributeName="x" dur="10s" values="0; 50; 100" 
     keyTimes="0; .8; 1" calcMode="spline" 
     keySplines=".5 0 .5 1; 0 0 1 1" />

The keyTimes values still cause the "x" attribute to have a value of "0" at the start of the animation, "50" after 8 seconds and "100" at the end of the animation. However, the keySplines values define a curve for pacing the interpolation between values. In the example above, the spline causes an ease-in and ease-out effect between time 0 and 8 seconds (i.e. between keyTimes 0 and .8, and values "0" and "50"), but a strict linear interpolation between 8 seconds and the end (i.e. between keyTimes .8 and 1, and values "50" and "100"). See Figure 1 below for an illustration of the curves that these keySplines values define.

For some attributes, the pace of change may not be easily discernible by viewers. However, for animations like motion, the ability to make the speed of the motion change gradually, and not in abrupt steps, can be important. The keySplines attribute provides this control.

The following figure illustrates the interpretation of the keySplines attribute. Each diagram illustrates the effect of keySplines settings for a single interval (i.e. between the associated pairs of values in the keyTimes and values lists). The horizontal axis can be thought of as the input value for the unit progress of interpolation within the interval - i.e. the pace with which interpolation proceeds along the given interval. The vertical axis is the resulting value for the unit progress, yielded by the keySplines function. Another way of describing this is that the horizontal axis is the input unit time for the interval, and the vertical axis is the output unit time. See also the section Timing and real-world clock times.

Example keySplines01 - keySplines="0 0 1 1" (the default)
Example keySplines02 - keySplines=".5 0 .5 1"
Example keySplines03 - keySplines="0 .75 .25 1"
Example keySplines04 - keySplines="1 0 .25 .25"

Figure 1 - Illustration of keySplines effect.

To illustrate the calculations, consider the simple example:

<animate dur="4s" values="10; 20" keyTimes="0; 1"
     calcMode="spline" keySplines={as in table} />

Using the keySplines values for each of the four cases above, the approximate interpolated values as the animation proceeds are:

keySplines values   Initial value   After 1s   After 2s   After 3s   Final value
0 0 1 1             10.0            12.5       15.0       17.5       20.0
.5 0 .5 1           10.0            11.0       15.0       19.0       20.0
0 .75 .25 1         10.0            18.0       19.3       19.8       20.0
1 0 .25 .25         10.0            10.1       10.6       16.9       20.0

For a formal definition of Bezier spline calculation, see [Foley] pp. 488-491.
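
The keySplines calculation for a single interval can be approximated with the following Python sketch. It is an illustration under stated assumptions rather than a prescribed algorithm: the function names are hypothetical, and solving for the curve parameter by bisection is a choice made for this example (it relies on the horizontal coordinate being non-decreasing, which holds when the x control values lie between 0 and 1).

```python
def bezier_coord(s, c1, c2):
    """One coordinate of a cubic Bezier with anchors 0 and 1 at parameter s."""
    return 3 * (1 - s) ** 2 * s * c1 + 3 * (1 - s) * s ** 2 * c2 + s ** 3


def spline_progress(x, x1, y1, x2, y2, iterations=50):
    """Map input unit time x to output unit time for one keySplines entry."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if bezier_coord(mid, x1, x2) < x:
            lo = mid
        else:
            hi = mid
    # The curve parameter whose x coordinate matches the input time.
    return bezier_coord((lo + hi) / 2, y1, y2)


def spline_value(t, d, v0, v1, key_spline):
    """Interpolated value at time t for a single interval from v0 to v1."""
    return v0 + (v1 - v0) * spline_progress(t / d, *key_spline)
```

Evaluating spline_value(t, 4, 10, 20, key_spline) at whole seconds for the four keySplines settings gives results close to the approximate values in the table above.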

3.3.3 Specifying the animation effect F(t)

As described above, the animation function f(t) defines the animation for the simple duration. However, SMIL Animation allows the author to repeat this, and to specify whether the animation should simply end when the active duration completes, or whether it should be frozen at the last value. In addition, the author can specify how each animation should be combined with other animations and the underlying DOM value.

This section describes the syntax and associated semantics for the additional functionality. A detailed model for combining animations is described, along with additional details of implications of the timing model.

Repeated animations

Repeating an animation causes the animation function f(t) to be "played" several times in sequence. The author can specify either how many times to repeat, using the timing attribute repeatCount, or how long to repeat, using the timing attribute repeatDur. Each repeat iteration is one instance of "playing" the animation function f(t). If the simple duration d is indefinite, the animation cannot repeat.

The repeatCount and repeatDur attributes are described in detail in [wd-timing-repeatAttrs].

Examples

In the following example, the 2.5 second animation function will be repeated twice; the active duration AD will be 5 seconds.

<animate attributeName="top" from="0" to="10" dur="2.5s" repeatCount="2" />

In the following example, the animation function will be repeated two full times and then the first half is repeated once more; the active duration AD will be 7.5 seconds.

<animate attributeName="top" from="0" to="10" dur="3s" repeatCount="2.5" />

In the following example, the animation function will repeat for a total of 7 seconds. It will play fully two times, followed by a fractional part of 2 seconds. This is equivalent to a repeatCount of 2.8. The last (partial) iteration will apply values in the range "0" to "8".

<animate attributeName="top" from="0" to="10" dur="2.5s" repeatDur="7s" />

In the following example, the simple duration is longer than the duration specified by repeatDur, and so the active duration will effectively cut short the simple duration. However, the animation function still uses the specified simple duration. The effect of the animation is to interpolate the value of "top" from 10 to 15, over the course of 5 seconds.

<animate attributeName="top" from="10" to="20" dur="10s" repeatDur="5s" />
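
The arithmetic behind the four repeat examples can be sketched in Python. The function names active_duration and repeated_value are hypothetical, the simple duration is assumed to be defined (not indefinite), and taking the smaller limit when both repeatCount and repeatDur are given is an assumption, since this section does not cover that case.

```python
def active_duration(d, repeat_count=None, repeat_dur=None):
    """Active duration AD in seconds for a simple duration d."""
    limits = []
    if repeat_count is not None:
        limits.append(d * repeat_count)
    if repeat_dur is not None:
        limits.append(repeat_dur)
    # Assumption: with both attributes present, the shorter limit applies.
    return min(limits) if limits else d


def repeated_value(f, t, d, ad):
    """Value at time t (0 <= t < ad) of an animation that repeats f."""
    if not 0 <= t < ad:
        raise ValueError("t must lie within the active duration")
    # Each iteration replays the simple animation function from its start.
    return f(t % d)
```

For dur="2.5s" and repeatDur="7s" the active duration is 7 seconds, which is 2.8 iterations; just before the end, the time within the last iteration approaches 2 seconds and the from="0" to="10" animation approaches the value 8.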

Controlling behavior of repeating animation - Cumulative animation

The author may also select whether a repeating animation should repeat the original behavior for each iteration, or whether it should build upon the previous results, accumulating with each iteration. For example, a motion path that describes an arc can repeat by moving along the same arc over and over again, or it can begin each repeat iteration where the last left off, making the animated element bounce across the window. This is called cumulative animation.

Using the path notation for a simple arc, we describe this example as:

<img ...>
   <animateMotion path="c( 3 5 8 5 10 0)" dur="10s"
      accumulate="sum" repeatCount="10" />
</img>

@@ Pictures would help here

The image moves from the original position along the arc over the course of 10 seconds. As the animation repeats, it builds upon the previous value and begins the second arc where the first one ended. In this way, the image "bounces" across the screen. This could be described as a complete path, but the path description would get quite large, and would be more cumbersome to edit.

Note that cumulative animation only controls how a single animation accumulates the results of the animation function as it repeats. It specifically does not control how one animation interacts with other animations to produce a presentation value. This latter behavior is described in the section Additive animation.

Any numeric attribute that supports addition can support cumulative animation. For example, we can grow the "width" of an SVG "rect" element by 100 pixels in 100 seconds.

<rect width="20px"...>
   <animate attributeName="width" by="10px" dur="10s"
      accumulate="sum" repeatCount="10" />
</rect>

After 10 seconds, the rectangle is 30 pixels wide. The animation repeats, and builds upon the previous values, growing to 40 pixels after 20 seconds, and up to 120 pixels wide after all ten repeats.

The behavior of repeating animations is controlled with the accumulate attribute:

accumulate = "none" | "sum"

Controls whether or not the animation is cumulative.

"sum": Each repeat iteration after the first builds upon the last value of the previous iteration.
"none": Repeat iterations are not cumulative, and simply repeat the animation function f(t). This is the default.

This attribute is ignored if the target attribute value does not support addition, or if the animation element does not repeat.
Cumulative animation is not defined for "to animation". This attribute will be ignored if the animation function is specified with only the to attribute. See also Specifying function values.

To produce the cumulative animation behavior, the animation function f(t) must be modified slightly. Each iteration after the first must add in the last value of the previous iteration - this is expressed as a multiple of the last value specified for the animation function. Note that cumulative animation is defined in terms of the values specified for the animation behavior, and not in terms of sampled or rendered animation values. The latter would vary from machine to machine, and could even vary between document views on the same machine.

Let f_i(t) represent the cumulative animation function for a given iteration i.

The first iteration f₀(t) is unaffected by accumulate, and so is the same as the original animation function definition.

f₀(t) = f(t)

Let ve be the last value specified for the animation function (e.g. the "to" value or the last value in a "values" list). Each iteration after the first (i.e. f_i(t) where i >= 1 ) adds in the computed offset:

f_i(t) = (ve * i) + f(t)
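
As an illustration of the formula above, the following Python sketch evaluates the cumulative function for the growing rectangle example (by="10px", dur="10s", repeatCount="10", starting width 20). The function name cumulative_value is hypothetical, and the animation is modeled as a delta that is added to a constant underlying width, which is a simplification.

```python
def cumulative_value(f, ve, t, d):
    """f_i(t') = (ve * i) + f(t'), where i is the iteration containing t."""
    i = int(t // d)          # zero-based iteration index
    local_t = t - i * d      # time within the current iteration
    return ve * i + f(local_t)


def width_delta(t):
    """by="10px" over dur="10s": the delta grows linearly from 0 to 10."""
    return 10 * t / 10


# Rendered width = underlying width + accumulated delta.
width_at_15s = 20 + cumulative_value(width_delta, 10, 15, 10)
```

Here ve is 10, so the delta at 15 seconds is 10 (from the first iteration) plus 5 (halfway through the second), giving a width of 35.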

Freezing animations

@@ Rewrite to make reference to (and use) Timing module's definition. (Say what it means to freeze an animation, rather than define the fill attribute.)

By default when an animation element ends, its effect is no longer applied to the presentation value for the target attribute. For example, if an animation moves an image and the animation element ends, the image will "jump back" to its original position.

<img top="3" ...>
   <animate begin="5s" dur="10s" attributeName="top" by="100"/>
</img>

The image will appear stationary at the top value of "3" for 5 seconds, then move 100 pixels down in 10 seconds. 15 seconds after the image begin, the animation ends, the effect is no longer applied, and the image jumps back from 103 to 3 where it started (i.e. to the underlying value of the top attribute).

The fill attribute can be used to maintain the value of the animation after the active duration of the animation element ends:

<img top="3" ...>
   <animate begin="5s" dur="10s" attributeName="top" by="100"
          fill="freeze" />
</img>

The animation ends 15 seconds after the image begin, but the image remains at the top value of 103. The attribute "freezes" the last value of the animation @@ "for the period of time defined by the fill attribute" will make sense here once this section is rewritten.

The freeze behavior of an animation is controlled using the "fill" attribute:

fill = "freeze" | "remove"

This attribute can have the following values:

freeze: The animation effect F(t) is defined to freeze the effect value at the last value of the active duration. The animation effect is "frozen" @@ "for the period of time defined by the fill attribute", as above.
remove: The animation effect is removed (no longer applied) when the active duration of the animation is over. After the active end AE of the animation, the animation no longer affects the target (unless the animation is restarted - see Restarting animations).
This is the default value.
@@ Need to deal with the other values for fill included in the timing module 'hold' and maybe 'transition'

This functionality is also useful when a series of motions are defined that should build upon one another, as in this example:

<img ...>
   <animateMotion begin="0" dur="5s" path="[some path]"
           additive="sum" fill="freeze" />
   <animateMotion begin="5s" dur="5s" path="[some path]"
           additive="sum" fill="freeze" />
   <animateMotion begin="10s" dur="5s" path="[some path]"
           additive="sum" fill="freeze" />
</img>

The image moves along the first path, and then starts the second path from the end of the first, then follows the third path from the end of the second, and stays at the final point. The semantics of the additive attribute are defined in the next section.

Note that if the active duration cuts short the simple duration (including the case of partial repeats), then the freeze value is defined by the shortened simple duration. In the following example, the animation function repeats two full times and then again for one-half of the simple duration. In this case, the freeze value will be 15:

<animate from="10" to="20" dur="4s" 
         repeatCount="2.5" fill="freeze" .../>

In the following example, the dur attribute is missing, and so the simple duration is indefinite. The active duration is constrained by end to be 10 seconds. Since interpolation is not defined, the freeze value will be 10:

<animate from="10" to="20" end="10s" fill="freeze" .../>
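
The freeze values in the last two examples can be reproduced with a short Python sketch. The function name freeze_value is hypothetical, a numeric from-to animation with linear interpolation is assumed, and an indefinite simple duration is represented by None.

```python
def freeze_value(from_, to, d, ad):
    """Value held by fill="freeze" for a from-to animation.

    d is the simple duration (None if indefinite); ad is the active duration.
    """
    if d is None:
        # No interpolation is defined, so f(0) is used as a constant.
        return from_
    remainder = ad % d
    if remainder == 0:
        # The active duration ends on a whole iteration: hold the final value.
        return to
    # Partial last iteration: hold the value reached when it was cut short.
    return from_ + (to - from_) * remainder / d
```

With dur="4s" and repeatCount="2.5" the active duration is 10 seconds, the last iteration stops halfway, and the frozen value is 15. With no dur and end="10s" the frozen value is the from value, 10.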

Additive animation

It is frequently useful to define animation using offsets or deltas from an attribute's value, rather than absolute values. A simple "grow" animation can increase the width of an object by 10 pixels:

<rect width="20px" ...>
   <animate attributeName="width" from="0px" to="10px" dur="10s"
      additive="sum"/>
</rect>

The width begins at 20 pixels, and increases to 30 pixels over the course of 10 seconds. If the animation were declared to be non-additive, the same from and to values would make the width go from 0 to 10 pixels over 10 seconds.

In addition, many complex animations are best expressed as combinations of simpler animations. A "vibrating" path, for example, can be described as a repeating up and down motion added to any other motion:

<img ...>
   <animateMotion from="0,0" to="100,0" dur="10s" />
   <animateMotion values="0,0; 0,5; 0,0" dur="1s"
                  repeatDur="10s" additive="sum"/>
</img>

When there are multiple animations defined for a given attribute that overlap at any moment, the two either add together or one overrides the other. Animations overlap when they are both either active or frozen at the same moment. The ordering of animations (e.g. which animation overrides which) is determined by a priority associated with each animation. The animations are prioritized according to when each begins. The animation first begun has lowest priority and the most recently begun animation has highest priority.

Higher priority animations that are not additive will override all earlier animations, and simply set the attribute value. Animations that are additive apply to (i.e. add to) the result of the earlier-activated animations. For details on how animations are combined, see The animation sandwich model.

The additive behavior of an animation is controlled by the additive attribute:

additive = "replace" | "sum"

Controls whether or not the animation is additive.

"sum": Specifies that the animation will add to the underlying value of the attribute and other lower priority animations.
"replace": Specifies that the animation will override the underlying value of the attribute and other lower priority animations. This is the default, however the behavior is also affected by the animation value attributes by and to, as described in "How from, to and by attributes affect additive behavior", below.

This attribute is ignored if the target attribute does not support additive animation.

The host language must specify which attributes support additive animation. It may be defined for numeric attributes and other data types for which an addition function is defined. This may include numeric attributes for concepts such as position, widths and heights, sizes, etc. It also may include color (refer to The animateColor element) and other data types as specified by the host language. Some numeric attributes (e.g. a telephone number attribute) may not sensibly support addition.

Attribute types such as strings and Booleans, for which addition is not defined, cannot support additive animation.

While many animations of numerical attributes will be additive, this is not always desired. As an example of an animation that is defined to be non-additive, consider a hypothetical extension animation "mouseFollow" that causes an object to track the mouse.

<img ...>
   <animateMotion dur="10s" repeatDur="indefinite"
           path="[some nice path]" />
   <mouseFollow begin="mouseover" dur="5s"
           additive="replace" fill="remove" />
</img>

The mouse-tracking animation runs for 5 seconds every time the user mouses over the image. It cannot be additive, or it will just offset the motion path in some odd way. The mouseFollow needs to override the animateMotion while it is active. When the mouseFollow completes, its effect is no longer applied and the animateMotion again controls the presentation value for position.

How from, to and by attributes affect additive behavior

The attribute values to and by, used to describe the animation function, can override the additive attribute in certain cases:

If by is used without from, the animation is defined to be additive (i.e. the equivalent of additive="sum").
If to is used without from (i.e. a "to-animation"), and if the attribute supports addition, the animation is defined to be a kind of mix of additive and non-additive. The underlying value is used as a starting point, as with additive animation; however, the ending value specified by the to attribute overrides the underlying value as though the animation were non-additive.

For the hybrid case of a "to-animation", the animation function f(t) is defined in terms of the underlying value, the specified to value, and the current value of t (i.e. time) relative to the simple duration d.

v_cur is the current base value (at time t)
v_to is the defined "to" value
f(t) = v_cur + ((v_to - v_cur) * (t/d))

Note that if no other (lower priority) animations are active or frozen, this defines simple interpolation. However if another animation is manipulating the base value, the "to-animation" will add to the effect of the lower priority, but will dominate it as it nears the end of the simple duration, eventually overriding it completely. The value for F(t) when a "to-animation" is frozen (at the end of the simple duration) is just the to value. If a "to-animation" is frozen anywhere within the simple duration (e.g. using a repeatCount of "2.5"), the value for F(t) when the animation is frozen is the value computed for the end of the active duration. Even if other, lower priority animations are active while a "to-animation" is frozen, the value for F(t) does not change.

For an example of additive "to-animation", consider the following two additive animations. The first, a "by-animation" applies a delta to attribute "x" from 0 to -10. The second, a "to-animation" animates to a final value of 10.

 <foo x="0" ...>
    <animate id="A1" attributeName="x" 
        by="-10" dur="10s" fill="freeze" />
    <animate id="A2" attributeName="x" 
        to="10"  dur="10s" fill="freeze" />
 </foo>

The presentation value for "x" in the example above, over the course of the 10 seconds is presented in Figure 2 below. These values are simply computed using the formula described above. Note that the value for F(t) for A2 is the presentation value for "x".

Time   F(t) for A1   F(t) for A2
0      0             0
1      -1            0.1
2      -2            0.4
3      -3            0.9
4      -4            1.6
5      -5            2.5
6      -6            3.6
7      -7            4.9
8      -8            6.4
9      -9            8.1
10     -10           10

Figure 2 - Effect of Additive to-animation example
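
The values in Figure 2 can be reproduced with a minimal sketch of the to-animation formula. The function names by_animation and to_animation are hypothetical; the sketch assumes that A1, being lower priority, supplies the underlying value v_cur seen by A2.

```python
def by_animation(base, by, t, d):
    """Additive by-animation: the base value plus a delta growing linearly to `by`."""
    return base + by * (t / d)

def to_animation(v_cur, v_to, t, d):
    """Hybrid to-animation: f(t) = v_cur + ((v_to - v_cur) * (t/d))."""
    return v_cur + (v_to - v_cur) * (t / d)

d = 10.0
rows = []
for t in range(11):
    a1 = by_animation(0.0, -10.0, t, d)   # result of A1, the underlying value for A2
    a2 = to_animation(a1, 10.0, t, d)     # result of A2, the presentation value
    rows.append((t, a1, a2))
    print(t, round(a1, 6), round(a2, 6))
```

With these inputs the formula reduces to t*t/10 for A2, which is why the to-animation starts slowly and then dominates the by-animation as t approaches d.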

Additive and Cumulative animation

The "accumulate" attribute should not be confused with the "additive" attribute. The "additive" attribute defines how an animation is combined with other animations and the base value of the attribute. The "accumulate" attribute defines only how the animation function interacts with itself, across repeat iterations.

Typically, authors expect cumulative animations to be additive (as in the examples described for accumulate above), but this is not required. The following example is not additive.

<img ...>
   <animate dur="10s" repeatDur="indefinite"
            attributeName="top" from="20" by="10"
            additive="replace" accumulate="sum" />
</img>

The animation overrides whatever original value was set for "top", and begins at the value 20. It moves down by 10 pixels to 30, then repeats. It is cumulative, so the second iteration starts at 30 and moves down by another 10 to 40. Etc.

When a cumulative animation is also defined to be additive, both features function normally. The accumulated effect for F(t) is used as the value for the animation, and is added to the underlying value for the target attribute. Refer also to The animation sandwich model.

Restarting animations

Animation elements follow the definition of restart in the SMIL Timing module. This section is descriptive.

When an animation restarts, the defining semantic is that it behaves as though this were the first time the animation had begun, independent of any earlier behavior. The animation effect F(t) is defined independent of the restart behavior. Any effect of an animation playing earlier is no longer applied, and only the current animation effect F(t) is applied.

If an additive animation is restarted while it is active or frozen, the previous effect of the animation (i.e. before the restart) is no longer applied to the attribute. Note in particular that cumulative animation is defined only within the active duration of an animation. When an animation restarts, all accumulated context is discarded, and the animation effect F(t) begins accumulating again from the first iteration of the restarted active duration.

3.3.4 Handling syntax errors

The specific error handling mechanisms for each attribute are described with the individual syntax descriptions. However, some of these specifications describe the behavior of an animation with syntax errors as "having no effect". This means that the animation will continue to behave normally with respect to timing, but will not manipulate any presentation value, and so will have no visible impact upon the presentation.

In particular, this means that if other animation elements are defined to begin or end relative to an animation that "has no effect", the other animation elements will begin and end as though there were no syntax errors. The presentation runtime may indicate an error, but need not halt presentation or animation of the document. Some host languages and/or runtimes may choose to impose stricter error handling (see also Error handling semantics for a discussion of host language issues with error handling). Authoring environments may also choose to be more intrusive when errors are detected.

3.3.5 The animation sandwich model

When an animation is running, it does not actually change the attribute values in the DOM. The animation runtime must maintain a presentation value for any target attribute, separate from the DOM, CSS, or other object model (OM) in which the target attribute is defined. The presentation value is reflected in the display form of the document. The effect of animations is to manipulate this presentation value, and not to affect the underlying DOM or CSS OM values.

The remainder of this discussion uses the generic term OM for both the XML DOM [DOM2] and the CSS-OM. If an implementation does not support an object model, it must maintain the original value as defined by the document as well as the presentation value; for the purposes of this section, we will consider this original value to be equivalent to the value in the OM.

The model accounting for the OM and concurrently active or frozen animations for a given attribute is described as a "sandwich", an analogy to the layers of meat and cheeses in a "submarine sandwich". On the bottom of the sandwich is the base value taken from the OM. Each active (or frozen) animation is a layer above this. The layers (i.e. the animations) are placed on the sandwich in order according to priority, with higher priority animations placed above lower priority animations. Note that animations manipulate the presentation value coming out of the OM in which the attribute is defined, and pass the resulting value on to the next layer of document processing. This does not replace or override any of the normal document OM processing cascade.

Specifically, animating an attribute defined in XML will modify the presentation value before it is passed through the style sheet cascade, using the XML DOM value as its base. Animating an attribute defined in a style sheet language will modify the presentation value passed through the remainder of the cascade.

In both the DOM 2 CSS-OM and in CSS2, the terms "specified", "computed" and "actual" are used to describe the results of evaluating the syntax, the cascade and the presentation rendering. When animation is applied to CSS properties of a particular element, the base value to be animated is read using the (readonly) getComputedStyle() method on that element. The values produced by the animation are written into an override stylesheet for that element, which may be obtained using its getOverrideStyle() method. These new values then affect the cascade and are reflected in a new computed value (and thus, modified presentation). This means that the effect of animation overrides all style sheet rules, except for user rules with the !important property. This enables !important user style settings to have priority over animations, an important requirement for accessibility. Note that the animation may have side-effects upon the document layout. See also the [CSS2] specification (the terms are defined in section 6.1).

Within an OM, animations are prioritized according to when each begins. The animation first begun has lowest priority and the most recently begun animation has highest priority. When two animations start at the same moment in time, the activation order is resolved as follows:

If one animation is a time dependent of another (e.g. it is specified to begin when another begins), then the time dependent is considered to activate after the syncbase element, and so has higher priority. Time dependency is further discussed in Propagating changes to times. This rule applies independent of the timing described for the syncbase element - i.e. it does not matter whether the syncbase element begins on an offset, relative to another syncbase, relative to an event-base, or via hyperlinking. In all cases, the syncbase is begun before any time dependents are begun, and so the syncbase has lower priority than the time dependent.
If two animations share no time dependency relationship (e.g. neither is defined relative to the other, even indirectly) the element that appears first in the document has lower priority. This includes the cases in which two animation elements are defined relative to the same syncbase or event-base.
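
The two tie-breaking rules can be sketched as a pairwise comparison. This is only an illustration: the Anim record, its fields and the function names are assumptions, and the sketch compares two animations at a time rather than defining a complete sort order.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Anim:
    name: str
    begin: float                        # resolved (most recent) begin time, in seconds
    doc_index: int                      # position in document order
    syncbase: Optional["Anim"] = None   # element this one is timed relative to

def depends_on(a, b):
    """True if `a` is a direct or indirect time dependent of `b`."""
    cur = a.syncbase
    while cur is not None:
        if cur is b:
            return True
        cur = cur.syncbase
    return False

def has_higher_priority(a, b):
    """True if `a` sits above `b` in the priority order."""
    if a.begin != b.begin:
        return a.begin > b.begin        # most recently begun wins
    if depends_on(a, b):
        return True                     # a time dependent activates after its syncbase
    if depends_on(b, a):
        return False
    return a.doc_index > b.doc_index    # otherwise later in the document wins

base = Anim("base", 5.0, doc_index=2)
dep = Anim("dep", 5.0, doc_index=1, syncbase=base)
other = Anim("other", 5.0, doc_index=3)
print(has_higher_priority(dep, base))    # True
print(has_higher_priority(other, base))  # True
```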

Note that if an animation is restarted (see also Restarting animations), it will always move to the top of the priority list, as it becomes the most recently activated animation. That is, when an animation restarts, its layer is pulled out of the sandwich, and added back on the very top. Note also that when an element repeats, the priority is not affected (repeat behavior is not defined as restarting).

Each additive animation adds its effect to the result of all sandwich layers below. A non-additive animation simply overrides the result of all lower sandwich layers. The end result at the top of the sandwich is the presentation value that must be reflected in the document view.
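
A minimal sketch of this composition follows. It assumes that the layers are already sorted from lowest to highest priority and that each layer's effect has been evaluated to a number for the current time; the function name presentation_value is hypothetical. A "to-animation", whose effect depends on the underlying value, is not modeled here.

```python
def presentation_value(om_value, layers):
    """Compose active or frozen animations over the OM base value.

    `layers` is a list of (additive, effect) pairs ordered from lowest to
    highest priority; `additive` is "sum" or "replace".
    """
    value = om_value                 # bottom of the sandwich
    for additive, effect in layers:
        if additive == "sum":
            value = value + effect   # additive layer adds to everything below
        else:
            value = effect           # non-additive layer overrides everything below
    return value

print(presentation_value(20, [("sum", 5), ("replace", 100), ("sum", -3)]))  # 97
print(presentation_value(20, [("sum", 5)]))  # 25
print(presentation_value(30, [("sum", 5)]))  # 35
```

The last two calls show that, when only additive layers are present, a change to the OM value shows through in the result, whereas any "replace" layer masks it.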

Some attributes that support additive animation have a defined legal range for values (e.g. an opacity attribute may allow values between 0 and 1). In some cases, an animation function may yield out of range values. It is up to the implementation to clamp the results at the top of the animation stack to the legal range before applying them to the presentation value. However, the effect of all the animations in the stack should be combined, before any clamping is performed. Although individual animation functions may yield out of range values, the combination of additive animations in the animation stack may still be legal. Clamping only the final result and not the effect of the individual animation functions provides support for these cases. The host language must define the clamping semantics for each attribute that can be animated. As an example, this is defined for The animateColor element.
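
The difference between clamping only the final result and clamping each layer can be seen in a small sketch. The opacity-like range of 0 to 1 comes from the example above; the function names and the sample effects are assumptions chosen for illustration.

```python
def clamp(value, low, high):
    return max(low, min(high, value))

def combined_then_clamped(base, effects, low=0.0, high=1.0):
    """Sum all additive effects first, then clamp once at the top of the stack."""
    return clamp(base + sum(effects), low, high)

def clamped_per_layer(base, effects, low=0.0, high=1.0):
    """Contrasting approach (not the one described above): clamp after every layer."""
    value = base
    for effect in effects:
        value = clamp(value + effect, low, high)
    return value

print(combined_then_clamped(0.5, [0.75, -0.5]))  # 0.75
print(clamped_per_layer(0.5, [0.75, -0.5]))      # 0.5
```

The first layer alone would push the value to 1.25, which is out of range, but the combined result of both layers is legal; clamping per layer loses that information.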

Initially, before any animations for a given attribute are active, the presentation value will be identical to the original value specified in the document (the OM value).

When all animations for a given attribute have completed and the associated animation effects are no longer applied, the presentation value will again be equal to the OM value. Note that if any animation is defined with fill="freeze", the effect of the animation will be applied as long as the document is displayed, and so the presentation value will reflect the animation effect until the document end. Refer also to the section "Freezing animations".

Some animations (e.g. animateMotion) will implicitly target an attribute, or possibly several attributes (e.g. the "posX" and "posY" attributes of some layout model). These animations must be placed in the respective animation stack for each attribute that is affected. Thus, e.g. an animateMotion animation may be in more than one animation stack (depending upon the layout model of the host language). For animation elements that implicitly target attributes, the host language designer must specify what attributes are implicitly targeted, and the runtime must maintain the animation stacks accordingly.

Note that any queries (via DOM interfaces) on the target attribute will reflect the OM value, and will not reflect the effect of animations. Note also that the OM value may still be changed via the OM interfaces (e.g. using script). While it may be useful or desired to provide access to the final presentation value after all animation effects have been applied, such an interface is not provided as part of SMIL Animation. A future version may address this.

Although animation does not manipulate the OM values, the document display must reflect changes to the OM values. Host languages can support script languages that can manipulate attribute values directly in the OM. If an animation is active or frozen while a change to the OM value is made, the behavior depends upon whether the animation is defined to be additive or not, as follows (see also the section Additive animation):

If only additive animations are active or frozen (i.e. no non-additive animations are active or frozen for the given attribute) when the OM value is changed, the presentation value must reflect the changed OM value as well as the effect of the additive animations. When the animations complete and the effect of each is no longer applied, the presentation value will be equal to the changed OM value.
If any non-additive animation is running when the OM value is changed, the presentation value will not reflect the changed OM value, but will only reflect the effect of the highest priority non-additive animation, and any still higher priority additive animations. When all non-additive animations complete and the effect of each is no longer applied, the presentation value will reflect the changed OM value and the effect of any additive animations that are active or frozen.

3.3.6 Implications of Timing Model for animation

The model of timing defined in the Timing module has several important results for animation: the definition of repeat, and the value sampled during the "frozen" state.

When repeating an animation, the arithmetic follows the end-point exclusive model. Consider the example:

<animation dur="4s" repeatCount="4" .../>

At time 0, the simple duration is sampled at 0, and the first value is applied. This is the inclusive begin of the interval. The simple duration is sampled normally up to 4 seconds. However, the appropriate way to map time on the active duration to time on the simple duration is to use the remainder of division by the simple duration:

simpleTime = REMAINDER( activeTime, d )

F(t) = f( REMAINDER( t, d ) ) where t is within the active duration

Note: REMAINDER( t, d ) is defined as t - d*floor(t/d)

Using this, a time of 4 (or 8 or 12) maps to the time of 0 on the simple duration. The endpoint of the simple duration is excluded from (i.e. not actually sampled on) the simple duration.
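
The mapping can be tried out with a few sample times. This sketch simply transcribes the REMAINDER definition above; the function name remainder and the sample times are chosen for illustration.

```python
import math

def remainder(t, d):
    """REMAINDER(t, d) = t - d*floor(t/d), as defined above."""
    return t - d * math.floor(t / d)

d = 4.0  # simple duration of the example, in seconds
for active_time in (0.0, 1.0, 3.5, 4.0, 6.0, 8.0, 12.0):
    print(active_time, "->", remainder(active_time, d))
# 0.0 -> 0.0
# 1.0 -> 1.0
# 3.5 -> 3.5
# 4.0 -> 0.0
# 6.0 -> 2.0
# 8.0 -> 0.0
# 12.0 -> 0.0
```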

This implies that the last value of an animation function f(t) may never actually be applied (e.g. for a linear interpolation). In the case of an animation that does not repeat and does not specify fill="freeze", this may in fact be the case. However, in the following example, the appropriate value for the frozen state is clearly the "to" value:

<animation from="0" to="5" dur="4s" fill="freeze" .../>

This does not break the interval timing model, but does require an additional qualification for the animation function F(t) while in the frozen state:

If the active duration is an integer multiple of the simple duration, the value to apply in the frozen state is the last value defined for the animation function f(t).
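
One way to express this qualification is sketched below. It assumes that f is defined on the closed interval from 0 to d, so that f(d) is the last defined value; the helper name frozen_value is hypothetical, and the exact floating point comparison is a simplification.

```python
import math

def frozen_value(f, d, active_dur):
    """Value applied while frozen, for a simple animation function f on [0, d]."""
    repeats = active_dur / d
    if active_dur > 0 and repeats == math.floor(repeats):
        # Whole number of iterations: use the last value of f rather than
        # wrapping around to f(0).
        return f(d)
    # Partial final iteration: use the point reached within that iteration.
    return f(active_dur - d * math.floor(repeats))

def f(t):
    return 0 + (5 - 0) * (t / 4.0)   # from="0" to="5" dur="4s"

print(frozen_value(f, 4.0, 4.0))    # 5.0
print(frozen_value(f, 4.0, 10.0))   # 2.5
```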

The definition of accumulate also aligns to this model. The arithmetic is effectively inverted and values accumulate by adding in a multiple of the last value defined for the animation function f(t).

3.3.7 Animation function value details

Animation function values must be legal values for the specified attribute. Three classes of values are described:

Unitless scalar values. These are simple scalar values that can be parsed and set without semantic constraints. This class includes integers (base 10) and floating point (format specified by the host language).
String values. These are simple strings.
Language abstract values. These are values like CSS-length and CSS-angle values that have more complex parsing, but that can yield numbers that may be interpolated.

The animate element can interpolate unitless scalar values, and both animate and set elements can handle String values without any semantic knowledge of the target element or attribute. The animate and set elements must support unitless scalar values and string values. The host language must define which language abstract values should be handled by these elements. Note that the animateColor element implicitly handles the abstract values for color values, and that the animateMotion element implicitly handles position and path values.

In order to support interpolation on attributes that define numeric values with some sort of units or qualifiers (e.g. "10px", "2.3feet", "$2.99"), some additional support is required to parse and interpolate these values. One possibility is to require that the animation framework have built-in knowledge of the unit-qualified value types. However, this violates the principle of encapsulation and does not scale beyond CSS to XML languages that define new attribute value types of this form.

The recommended approach is for the animation implementation for a given host environment to support two interfaces that abstract the handling of the language abstract values. These interfaces are not formally specified, but are simply described as follows:

The first interface converts a string (the animation function value) to a unitless, canonical number (either an integer or a floating point value). This allows animation elements to interpolate between values without requiring specific knowledge of data types like CSS-length. The interface will likely require a reference to the target attribute, to determine the legal abstract values. If the passed string cannot be converted to a unitless scalar, the animation element will treat the animation function values as strings, and the calcMode will default to "discrete".
The second interface converts a unitless canonical number to a legal string value for the target attribute. This may, for example, simply convert the number to a string and append a suffix for the canonical units. The animation element uses the result of this to actually set the presentation value.
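
Since these interfaces are not formally specified, the following is only a sketch of what a host environment might provide. The unit table (with pixels as the canonical unit and 96 pixels per inch), the function names and the fallback threshold are all assumptions made for this illustration.

```python
import re

# Hypothetical unit table: factor converting each unit to canonical pixels.
UNIT_TO_PX = {"px": 1.0, "pt": 96.0 / 72.0, "in": 96.0}

def to_canonical(text):
    """First interface: string -> unitless canonical number, or None on failure."""
    m = re.fullmatch(r"\s*([+-]?\d+(?:\.\d+)?)\s*([a-z]*)\s*", text)
    if m is None:
        return None
    number, unit = float(m.group(1)), m.group(2) or "px"
    if unit not in UNIT_TO_PX:
        return None
    return number * UNIT_TO_PX[unit]

def from_canonical(value):
    """Second interface: canonical number -> legal string for the attribute."""
    return f"{value:g}px"

def interpolate(from_text, to_text, fraction):
    """Interpolate if both values convert; otherwise treat them as discrete strings."""
    a, b = to_canonical(from_text), to_canonical(to_text)
    if a is None or b is None:
        # Two discrete values: the first applies for the first half.
        return from_text if fraction < 0.5 else to_text
    return from_canonical(a + (b - a) * fraction)

print(interpolate("10px", "1in", 0.5))   # 53px
print(interpolate("red", "blue", 0.25))  # red
```

The animation code in interpolate never inspects units itself; everything unit-specific stays behind the two conversion functions.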

Support for these two interfaces ensures that an animation engine need not replicate the parser and any additional semantic logic associated with language abstract values.

This is not an attempt to specify how an implementation provides this support, but rather a requirement for how values are interpreted. Animation behaviors should not have to understand and be able to convert among all the CSS-length units, for example. In addition, this mechanism allows for application of animation to new XML languages, if the implementation for a language can provide parsing and conversion support for attribute values.

3.4 Animation elements

This section defines the syntax and semantics of animation elements. @@ DTD definitions are used in this working draft. The Working Group expects to replace them with schema-based definitions prior to Recommendation.

3.4.1 Common syntax DTD definitions

Timing attributes are defined in the SMIL Timing module.

Animation attributes

<!ENTITY % animAttrs "
  attributeName  CDATA  #REQUIRED
  attributeType  CDATA  #IMPLIED
  additive       (replace | sum) 'replace'
  accumulate     (none | sum) 'none'
">

<!ENTITY % animTargetAttr "
  targetElement  IDREF  #IMPLIED
">

<!ENTITY % animLinkAttrs "
  type     (simple | extended | locator | arc) #FIXED 'simple'
  show     (new | embed | replace) #FIXED 'embed'
  actuate  (user | auto) #FIXED 'auto'
  href     CDATA  #IMPLIED
">

3.4.2 The animate element

The <animate> element introduces a generic attribute animation that requires little or no semantic understanding of the attribute being animated. It can animate numeric scalars as well as numeric vectors. It can also animate discrete sets of non-numeric attributes. The <animate> element is an empty element - it cannot have child elements.

This element supports from/to/by and values descriptions for the animation function, as well as all of the calculation modes. It supports all the described timing attributes. These are all described in respective sections above.

<!ELEMENT animate EMPTY>
<!ATTLIST animate
  %timingAttrs;
  %animAttrs;
  id             ID     #IMPLIED 
  calcMode       (discrete | linear | paced | spline ) "linear"
  values         CDATA  #IMPLIED
  keyTimes       CDATA  #IMPLIED
  keySplines     CDATA  #IMPLIED
  from           CDATA  #IMPLIED
  to             CDATA  #IMPLIED
  by             CDATA  #IMPLIED
>

Numerous examples are provided above.

3.4.3 The set element

The <set> element provides a simple means of just setting the value of an attribute for a specified duration. As with all animation elements, this only manipulates the presentation value, and when the animation completes, the effect is no longer applied. That is, <set> does not permanently set the value of the attribute.

The <set> element supports all attribute types, including those that cannot reasonably be interpolated and that more sensibly support semantics of simply setting a value (e.g. strings and Boolean values). The set element is non-additive. The additive and accumulate attributes are not allowed.

The <set> element supports all the timing attributes to specify the simple and active durations. However, the repeatCount and repeatDur attributes will just affect the active duration of the <set>, extending the effect of the <set> (since it is not really meaningful to "repeat" a static operation). Note that using fill="freeze" with <set> will have the same effect as defining the timing so that the active duration is "indefinite".

The <set> element supports a more restricted set of attributes than the <animate> element (in particular, only one value is specified, and no interpolation control is supported):

<!ELEMENT set EMPTY>
<!ATTLIST set
  %timingAttrs;
  id             ID     #IMPLIED 
  attributeName  CDATA  #REQUIRED
  attributeType  CDATA  #IMPLIED
  to             CDATA  #IMPLIED
>

to = "<value>": Specifies the value for the attribute during the duration of the <set> element. The argument value must match the attribute type.

Examples

The following changes the stroke-width of an SVG rectangle from the original value to 5 pixels wide. The effect begins at 5 seconds and lasts for 10 seconds, after which the original value is again used.

<rect ...>
   <set attributeName="stroke-width" to="5px" 
            begin="5s" dur="10s" fill="remove" />
</rect>

The following example sets the class attribute of the text element to the string "highlight" when the mouse moves over the element, and removes the effect when the mouse moves off the element.

<text>This will highlight if you mouse over it...
   <set attributeName="class" to="highlight" 
            begin="mouseover" end="mouseout" />
</text>

3.4.4 The animateMotion element

In order to abstract the notion of motion paths across a variety of layout mechanisms, we introduce the <animateMotion> element. This describes motion in the abstract - the host language defines the layout model and must specify the precise semantics of motion.

All values must be x, y value pairs. Each x and y value may specify any units supported for element positioning by the host language. The host language defines the default units. In addition, the host language defines the reference point for positioning an element. This is the point within the element that is aligned to the position described by the motion animation. The reference point defaults in some languages to the upper left corner of the element bounding box; in other languages (such as SVG) the reference point may be specified for the element.

The attributeName and attributeType attributes are not used with animateMotion, as the manipulated position attribute(s) are defined by the host language. If the position is exposed as an attribute or attributes that can also be animated (e.g. as "top" and "left", or "posX" and "posY"), implementations must combine <animateMotion> animations into the respective stacks with other animations that manipulate individual position attributes. See also the section The animation sandwich model.

The <animateMotion> element adds an additional syntax alternative for specifying the animation, the "path" attribute. This allows the description of a path using a subset of the SVG path syntax. Note that if a path is specified, it will override any specified values for values or from/to/by attributes.

The default calculation mode (calcMode) for animateMotion is "paced". This will produce constant velocity motion along the specified path. While animateMotion elements can be additive, authors should note that the addition of two or more "paced" (constant velocity) animations may not result in a combined motion animation with constant velocity.

<!ELEMENT animateMotion EMPTY>
<!ATTLIST animateMotion
  %timingAttrs;
  id             ID     #IMPLIED 
  additive       (replace | sum) "replace"
  accumulate     (none | sum) "none"
  calcMode       (discrete | linear | paced | spline) "paced"
  values         CDATA  #IMPLIED
  from           CDATA  #IMPLIED
  to             CDATA  #IMPLIED
  by             CDATA  #IMPLIED
  keyTimes       CDATA  #IMPLIED
  keySplines     CDATA  #IMPLIED
  path           CDATA  #IMPLIED
  origin         (default) "default"
>

path = "<path-description>"

Specifies the curve that describes the attribute value as a function of time. The supported syntax is a subset of the SVG path syntax. Support includes commands to describe lines ("MmLlHhVvZz") and Bezier curves ("Cc"). For details refer to the path specification in SVG [SVG].
Note that SVG provides two forms of path commands - "absolute" and "relative". These terms may appear to be related to the definition of additive animation and the "origin" attribute, but they should not be confused with either. The terms "absolute" and "relative" apply only to the definition of the path itself, and not to the operation of the animation. The "relative" commands define a path point relative to the previously specified point.

For the "absolute" commands ("MLHVZC"), the host language must specify the coordinate system of the path values.
If the "relative" commands ("mlhvzc") are used, they simply define the point as an offset from the previous point on the path. This does not affect the definition of "additive" or "origin" for the animateMotion element.

Move To commands - "M <x> <y>" or "m <dx> <dy>": Start a new sub-path at the given (x,y) coordinate. If a moveto is followed by multiple pairs of coordinates, the subsequent pairs are treated as implicit lineto commands.
Line To commands - "L <x> <y>" or "l <dx> <dy>": Draw a line from the current point to the given (x,y) coordinate which becomes the new current point. A number of coordinate pairs may be specified to draw a polyline.
Horizontal Line To commands - "H <x>" or "h <dx>": Draws a horizontal line from the current point (cpx, cpy) to (x, cpy). Multiple x values can be provided (although this generally only makes sense for the relative form).
Vertical Line To commands - "V <y>" or "v <dy>": Draws a vertical line from the current point (cpx, cpy) to (cpx, y). Multiple y values can be provided (although this generally only makes sense for the relative form).
Closepath commands - "Z" or "z": The "closepath" causes an automatic straight line to be drawn from the current point to the initial point of the current subpath.
Cubic Bezier Curve To commands - "C <x1> <y1> <x2> <y2> <x> <y>" or "c <dx1> <dy1> <dx2> <dy2> <dx> <dy>": Draws a cubic Bezier curve from the current point to (x,y) using (x1,y1) as the control point at the beginning of the curve and (x2,y2) as the control point at the end of the curve. Multiple sets of coordinates may be specified to draw a polybezier.

When a path is combined with "linear" or "spline" calcMode settings, the number of values is defined to be the number of points defined by the path, unless there are "move to" commands within the path. A "move to" command does not count as an additional point for the purpose of keyTimes and spline, and should not define an additional "segment" for the purposes of timing or interpolation. When a path is combined with a "paced" calcMode setting, all "move to" commands are considered to have 0 length (i.e. they always happen instantaneously), and should not be considered in computing the pacing.
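The treatment of "move to" commands under "paced" can be illustrated with a small sketch. It is not a conforming parser: it handles only the line commands ("MmLlHhVvZz"), rejects curve commands, and the names `path_points` and `paced_length` are hypothetical. Each point is flagged as drawn or not, so that jumps caused by "move to" contribute zero length to the pacing computation.

```python
import math
import re

# A command letter, or a number (optional sign, decimals, exponent).
_TOKEN = re.compile(r"[A-Za-z]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")

def path_points(path):
    """Return (points, drawn) for the line subset of the path syntax.

    points holds absolute (x, y) tuples. drawn[i] is False when point i
    was reached by a "move to" (a jump rather than a drawn segment).
    """
    tokens = _TOKEN.findall(path)
    points, drawn = [], []
    x = y = 0.0
    start = (0.0, 0.0)
    cmd = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.isalpha():
            if tok not in "MmLlHhVvZz":
                raise NotImplementedError("command not handled in this sketch: " + tok)
            cmd = tok
            i += 1
            if cmd in "Zz":  # straight line back to the start of the sub-path
                x, y = start
                points.append((x, y))
                drawn.append(True)
            continue
        if cmd is None or cmd in "Zz":
            raise ValueError("coordinates without a command")
        rel = cmd.islower()
        if cmd in "Hh":
            v = float(tok)
            i += 1
            x = x + v if rel else v
        elif cmd in "Vv":
            v = float(tok)
            i += 1
            y = y + v if rel else v
        else:  # M, m, L and l take coordinate pairs
            dx, dy = float(tok), float(tokens[i + 1])
            i += 2
            x, y = (x + dx, y + dy) if rel else (dx, dy)
        is_move = cmd in "Mm"
        points.append((x, y))
        drawn.append(not is_move)
        if is_move:
            start = (x, y)
            cmd = "l" if rel else "L"  # later pairs are implicit "line to"
    return points, drawn

def paced_length(points, drawn):
    """Total path length for pacing; "move to" jumps count as zero length."""
    total = 0.0
    for k in range(1, len(points)):
        if drawn[k]:
            (x0, y0), (x1, y1) = points[k - 1], points[k]
            total += math.hypot(x1 - x0, y1 - y0)
    return total
```

For a path such as "M 0 0 L 30 40 M 100 100 l 0 10", only the two drawn segments (lengths 50 and 10) would contribute to the pacing; the jump to (100, 100) takes no time.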

calcMode

Defined as above in Animation function calculation modes, but note that the default calcMode for animateMotion is "paced". This will produce constant velocity motion across the path.
The use of "discrete" for the calcMode together with a "path" specification is allowed, but is generally not useful (it will simply jump the target element from point to point).
The use of "linear" for the calcMode with more than 2 points described in "values", "path" or "keyTimes" may result in motion with varying velocity. The "linear" calcMode specifies that time is evenly divided among the segments defined by the "values" or "path" (note: any "keyTimes" list defines the same number of segments). The use of "linear" does not specify that time is divided evenly according to the distance described by each segment.
For motion with constant velocity, calcMode should be set to "paced".
For complete velocity control, calcMode can be set to "spline" and the author can specify a velocity control spline with "keyTimes" and "keySplines".
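The difference between "linear" and "paced" can be made concrete with a short sketch. It assumes straight-line segments between 2D points and no author-supplied keyTimes; the function name `key_times` is hypothetical. It returns the normalized time at which each point is reached.

```python
import math

def key_times(points, calc_mode):
    """Return the normalized time (0..1) at which each point is reached."""
    n = len(points) - 1
    if n < 1:
        raise ValueError("at least two points are required")
    linear = [k / n for k in range(n + 1)]
    if calc_mode == "linear":
        # Time is divided evenly among segments, whatever their length.
        return linear
    if calc_mode == "paced":
        # Time is divided in proportion to segment length (constant velocity).
        lengths = [math.hypot(x1 - x0, y1 - y0)
                   for (x0, y0), (x1, y1) in zip(points, points[1:])]
        total = sum(lengths)
        if total == 0:
            return linear  # degenerate path: fall back to even division
        times, run = [0.0], 0.0
        for length in lengths:
            run += length
            times.append(run / total)
        return times
    raise ValueError("unsupported calcMode in this sketch: " + calc_mode)
```

With the points (0,0), (10,0) and (10,30), "linear" reaches the middle point at half the duration, so the second segment is traversed three times as fast as the first. "paced" reaches it at one quarter of the duration, which keeps the velocity constant.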

origin= "default"

Specifies the origin of motion for the animation. The values and semantics of this attribute are dependent upon the layout and positioning model of the host language. In some languages, there may be only one option (i.e. "default"). However, in CSS positioning for example, it is possible to specify a motion path relative to the container block, or to the layout position of the element. It is often useful to describe motion relative to the position of the element as it is laid out (e.g. from off screen left to the layout position, specified as from="(-100, 0)" and to="(0, 0)"). Authors must be able to describe motion both in this manner, as well as relative to the container block. The origin attribute supports this distinction. Nevertheless, because the host language defines the layout model, the host language must also specify the "default" behavior, as well as any additional attribute values that are supported.
Note that the definition of the layout model in the host language specifies whether containers have bounds, and the behavior when an element is moved outside the bounds of the layout container. In CSS2 [CSS2], for example, this can be controlled with the "clip" property.
Note that for additive animation, the "origin" distinction is not meaningful. This attribute only applies when additive is set to "replace".

@@Should add an example, although some are included above.

3.4.5 The animateColor element

The <animateColor> element specifies an animation of a color attribute. The host language must specify those attributes that describe color values, and that can support color animation.

All values must represent sRGB color values. Legal value syntax for attribute values is defined by the host language.

Interpolation is defined on a per-color-channel basis.

<!ELEMENT animateColor EMPTY>
<!ATTLIST animateColor
  %animAttrs;
  %timingAttrs;
  id             ID     #IMPLIED 
  calcMode       (discrete | linear
                  | paced | spline ) "linear"
  values         CDATA  #IMPLIED
  from           CDATA  #IMPLIED
  to             CDATA  #IMPLIED
  by             CDATA  #IMPLIED
  keyTimes       CDATA  #IMPLIED
  keySplines     CDATA  #IMPLIED
>

The values in the from/to/by and values attributes may specify negative and out of gamut values for colors. The function defined by an individual animateColor may yield negative or out of gamut values. The implementation must correct the resulting presentation value, to be legal for the destination (display) colorspace. However, as described in The animation stack model, the implementation should only correct the final result of all animations for a given attribute, and should not correct the effect of individual animations.

Values are corrected by "clamping" the values to the correct range. Values less than the minimum allowed value are clamped to the minimum value (commonly 0, but not necessarily so for some color profiles). Values greater than the defined maximum are clamped to the maximum value (defined by the attributeType domain).

Note that color values are corrected by clamping them to the gamut of the destination (display) colorspace. Some implementations may be unable to process values which are outside the source (sRGB) colorspace and must thus perform clamping to the source colorspace, then convert to the destination colorspace and clamp to its gamut. The point is to distinguish between the source and destination gamuts; to clamp as late as possible, and to realize that some devices, such as inkjet printers which appear to be RGB devices, have non-cubical gamuts.
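The rule of clamping as late as possible can be sketched as follows. This is a simplified illustration rather than the animation stack model itself: it assumes purely additive contributions, a channel range of 0 to 255, and hypothetical function names.

```python
def clamp(value, lo=0, hi=255):
    """Clamp one channel value to the legal range [lo, hi]."""
    return max(lo, min(hi, value))

def presentation_color(base, contributions, lo=0, hi=255):
    """Sum all additive contributions per channel, then clamp once.

    Intermediate sums may be negative or out of gamut; only the final
    result for the attribute is corrected.
    """
    total = list(base)
    for delta in contributions:
        total = [t + d for t, d in zip(total, delta)]
    return tuple(clamp(c, lo, hi) for c in total)
```

For a base of (200, 100, 0) with contributions (100, 0, -50) and (-80, 0, 80), this yields (220, 100, 30). Clamping after each individual animation would instead give (175, 100, 80), which is why the correction is deferred to the final result.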

Note to implementers: When animateColor is specified as a "to animation", the animation function should assume Euclidean RGB-cube distance where deltas must be computed. See also Specifying function values and How from, to and by attributes affect additive behavior. Similarly, when the calcMode attribute for animateColor is set to "paced", the animation function should assume Euclidean RGB-cube distance to compute the distance and pacing.
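A minimal sketch of paced color interpolation using Euclidean RGB-cube distance is given below. The function names are hypothetical, colors are assumed to be (r, g, b) tuples of numbers, and the result is deliberately left unclamped.

```python
import math

def rgb_distance(a, b):
    """Euclidean distance between two colors in the RGB cube."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def paced_color(values, t):
    """Color at normalized time t (0..1), moving at constant RGB-cube speed."""
    lengths = [rgb_distance(a, b) for a, b in zip(values, values[1:])]
    total = sum(lengths)
    if total == 0:
        return tuple(values[0])
    target = t * total  # distance to travel along the list of values
    for a, b, seg in zip(values, values[1:], lengths):
        if seg > 0 and target <= seg:
            f = target / seg
            return tuple(x + (y - x) * f for x, y in zip(a, b))
        target -= seg
    return tuple(values[-1])
```

With the values (0,0,0), (30,40,0) and (30,40,150), the segment distances are 50 and 150, so the second value is reached at one quarter of the simple duration rather than at the midpoint.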

3.5 Integrating SMIL Animation into a host language

This section describes what a language designer must actually do to specify the integration of SMIL Animation into a host language. This includes basic definitions and constraints upon animation.

3.5.1 Required host language definitions

The host language designer must provide the basis for animation semantics in the context of the particular host language.

The host language designer must integrate the SMIL Timing module into the host language.

3.5.2 Required definitions and constraints on animation targets

Specifying the target element

The host language designer must choose whether to support the targetElement attribute, or the XLink attributes for specifying the target element. Note that if the XLink syntax is used, the host language designer must decide how to denote the XLink namespace for the associated attributes. The namespace can be fixed in a DTD, or the language designer can require colonized attribute names to denote the XLink namespace for the attributes. The required XLink attributes have fixed values, and so may also be specified in a DTD, or can be required on the animation elements. Host language designers may require that the optional XLink attributes be specified. These decisions are left to the host language designer - the syntax details for XLink attributes do not affect the semantics of SMIL Animation.

In general, target elements may be any element in the document. Host language designers must specify any exceptions to this. Host language designers are discouraged from allowing animation elements to target elements outside of the document in which the animation element is defined (the XLink syntax for the target element could allow this, but the SMIL timing and animation semantics of this are not defined in this version of SMIL Animation).

Target attribute issues

The definitions in this module can be used to animate any attribute of any element in a host document. However, it is expected that host language designers integrating SMIL Animation may choose to constrain which elements and attributes can support animation. For example, a host language may not support animation of the language attribute of a script element. A host language which included a specification for DOM functionality might limit animation to the attributes which may legally be modified through the DOM.

Any attribute of any element not specifically excluded from animation by the host language may be animated, as long as the underlying data type (as defined by the host language for the attribute) supports discrete values (for discrete animation) and/or addition (for interpolated and additive animation).

Additive and cumulative animation is supported for any attribute for which animation is supported and for which addition is defined by the host language for the underlying data type, unless the attribute is specifically excluded from cumulative and additive animation.

All constraints upon animation must be described in the host language specification, as the DTD cannot reasonably express this.

The host language must define which language abstract values should be handled for animated attributes. For example, a host language that incorporates CSS may require that CSS length values be supported. This is further detailed in Animation function value details.

The host language must specify the interpretation of relative values. For example, if a value is specified as a percentage of the size of a container, the host language must specify whether this value will be dynamically interpreted as the container size is animated.

The host language must specify the semantics of clamping values for attributes. The language must specify any defined ranges for values, and how out of range values will be handled.

The host language must specify the formats supported for numeric attribute values. This includes integer values and especially floating point values for attributes such as keyTimes and keySplines. As a reasonable minimum, host language designers are encouraged to support the format described in [CSS2]. The specific reference within the CSS specification for these data types is section 4.3.1 Integers and real numbers of [CSS2].

Integrating animateMotion functionality

The host language specification must define which elements, if any, can be the target of animateMotion. In addition, the host language specification must describe the positioning model for elements, and must describe the model for animateMotion in this context (i.e. the semantics of the "default" value for the origin attribute must be defined). If there are different ways to describe position, additional attribute values for the origin attribute should be defined to allow authors control over the positioning model.

Example: SVG

As an example, SVG [SVG] integrates SMIL Animation. It specifies which of the elements, attributes and CSS properties may be animated. Some attributes (e.g. "viewBox" and "fill-rule") support only discrete animation, and others (e.g. "width", "opacity" and "stroke") support interpolated and additive animation. An example of an attribute that does not support any animation is the "xlink:actuate" attribute on the <use> element (the value of this attribute is fixed to "auto" in the DTD).

@@ The XLink syntax used here may be out of date (actuate=auto is now actuate=onLoad?). Once SVG/XLink settles on values for actuate, this section must be updated.

SVG details the format of numeric values, describing the legal ranges and allowing "scientific" (exponential) notation for floating point values.

3.5.3 Constraints on manipulating animation elements

Language designers integrating SMIL Animation are encouraged to disallow manipulation of attributes of the animation elements, after the document has begun. This includes both the attributes specifying targets and values, as well as the timing attributes. In particular, the id attribute (of type ID) on all animation elements must not be mutable (i.e. should be read-only). Requiring animation runtimes to track changes to id values introduces considerable complexity, for what is at best a questionable feature.

It is recommended that language specifications disallow manipulation of animation element attributes through DOM interfaces after the document has begun. It is also recommended that language specifications disallow the use of animation elements to target other animation elements.

Dynamically changing the attribute values of animation elements introduces semantic complications to the model that are not yet sufficiently resolved. This constraint may be lifted in a future version of SMIL Animation.

3.5.4 Extending animation

Language designers integrating SMIL Animation are encouraged to define new animation elements where such additions will be of convenience to authors. The new elements must be based on SMIL Animation and SMIL Timing, and must stay within the framework provided by SMIL Timing and Animation.

Language designers are also encouraged to define support for additive and cumulative animation for non-numeric data types where addition can sensibly be defined.

3.5.5 Error handling semantics

The host language designer may impose stricter constraints upon the error handling semantics. That is, in the case of syntax errors, the host language may specify additional or stricter mechanisms to be used to indicate an error. An example would be to stop all processing of the document, or to halt all animation.

Host language designers may not relax the error handling specifications, or the error handling response (as described in Handling syntax errors). For example, host language designers may not define error recovery semantics for missing or erroneous values in the values or keyTimes attribute values.

3.5.6 SMIL Animation namespace

Language designers can choose to integrate SMIL Animation as an independent namespace, or can integrate SMIL Animation names into a new namespace defined as part of the host language. Language designers that wish to put the SMIL Animation functionality in an isolated namespace should use the following namespace:

@@ URI to be confirmed by W3C webmaster. Differs from [SMIL-ANIMATION].

4. SMIL Content Control

Editors: Jeffrey Ayars (jeffa@real.com), RealNetworks; Dick Bulterman (Dick.Bulterman@oratrix.com), Oratrix

4.1 Introduction

This section defines the SMIL content control module. This module contains elements and attributes which provide for runtime content choices and optimized content delivery. Since these elements and attributes are defined in a module, designers of other markup languages can reuse the functionality in the SMIL content control module when they need to include media content control in their language. Conversely, language designers incorporating other SMIL modules do not need to include the content control module if other content control functionality is already present.

Proposed Extensions to SMIL 1.0 content control functionality include:

Allow definition of priorities for different media objects. This allows, for example, dropping certain objects from the presentation or dropping layers in a layered encoding when there are insufficient resources (e.g. bandwidth, CPU).
Allow additional test-attributes (e.g. CPU-type, ...).
Allow author-defined test-attributes.
Allow user to see media objects that are important to him/her even though author excluded them at the current bitrate (accessibility requirement).
Allow display of time dependent links in a static list (accessibility requirement).
Allow declaration of media objects to be preloaded, as bandwidth allows, to improve presentation quality.

4.2 Content Selection

SMIL 1.0 provides a "test-attribute" mechanism to process an element only when certain conditions are true, e.g. when the client has a certain screen-size. SMIL 1.0 also provides the "switch" element for expressing that a set of document parts are alternatives, and that the first one fulfilling certain conditions should be chosen. This is useful to express that different language versions of an audio file are available, and to have the client select one of them. SMIL Boston includes these features and extends them by supporting new system test-attributes, as well as the ability to customize a presentation to an individual viewer by providing author defined, user selected test-attributes.

4.2.1 The `<switch>` Element

The switch element allows an author to specify a set of alternative elements from which only one acceptable element should be chosen. In SMIL Boston, an element is acceptable if the element is a SMIL Boston element, the media-type can be decoded (if the element declares media), and all of the test-attributes of the element evaluate to "true". When integrating content control into other languages, the language designer must specify what constitutes an "acceptable element."

An element is selected as follows: the player evaluates the elements in the order in which they occur in the switch element. The first acceptable element is selected to the exclusion of all other elements within the switch.

Thus, authors should order the alternatives from the most desirable to the least desirable. Furthermore, authors should place a relatively fail-safe alternative as the last item in the <switch> so that at least one item within the switch is chosen (unless this is explicitly not desired). Implementations should NOT arbitrarily pick an object within a <switch> when test-attributes for all child elements fail.
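The selection rule can be sketched as a first-match search. The function name and the representation of child elements as dictionaries are assumptions made for illustration; the acceptability test itself is supplied by the caller.

```python
def select_switch_child(children, is_acceptable):
    """Return the first acceptable child in document order, or None.

    When no child is acceptable, nothing is selected; the sketch does
    not fall back to an arbitrary child.
    """
    for child in children:
        if is_acceptable(child):
            return child
    return None

# Alternatives ordered from most to least desirable (hypothetical data).
alternatives = [
    {"src": "high", "systemBitrate": 40000},
    {"src": "medium", "systemBitrate": 24000},
    {"src": "low", "systemBitrate": 10000},
]
```

With 28800 bit/s available and an acceptability test of "available bitrate is at least systemBitrate", the "medium" alternative would be selected, because "high" fails and evaluation stops at the first match. With only 5000 bit/s available nothing is selected, which is why a fail-safe last alternative is advisable.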

Note that some network protocols, e.g. HTTP and RTSP, support content-negotiation, which may be an alternative to using the "switch" element in some cases.

Attributes

The switch element can have the following attributes:

id: An XML identifier
title: This attribute offers advisory information about the element for which it is set. Values of the title attribute may be rendered by user agents in a variety of ways. For instance, visual browsers frequently display the title as a "tool tip" (a short message that appears when the pointing device pauses over an object).

4.2.2 Predefined Test Attributes

This specification defines a list of test attributes that can be added to language elements, as allowed by the language designer. In SMIL 1.0, these elements are synchronization and media elements. Conceptually, these attributes represent Boolean tests. When one of the test attributes specified for an element evaluates to "false", the element carrying this attribute is ignored.

Within the list below, the concept of "user preference" may show up. User preferences are usually set by the playback engine using a preferences dialog box, but this specification does not place any restrictions on how such preferences are communicated from the user to the SMIL player.

This version of SMIL defines the following test attributes. Note that some hyphenated test attribute names from SMIL 1.0 have been deprecated in favor of names using the current SMIL camelCase convention. For these, the deprecated SMIL 1.0 name is shown in parentheses after the preferred name.

systemBitrate (system-bitrate)

This attribute specifies the approximate bandwidth, in bits-per-second, available to the system. The measurement of bandwidth is application specific, meaning that applications may use sophisticated measurement of end-to-end connectivity, or a simple static setting controlled by the user. In the latter case, this could for instance be used to make a choice based on the user's connection to the network. Typical values for modem users would be 14400, 28800, 56000 bit/s etc. Evaluates to "true" if the available system bitrate is equal to or greater than the given value. Evaluates to "false" if the available system bitrate is less than the given value.
The attribute can assume any integer value greater than 0. If the value exceeds an implementation-defined maximum bandwidth value, the attribute always evaluates to "false".
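The evaluation rule for systemBitrate might be sketched as follows. The function and parameter names are hypothetical, and how the available bandwidth is measured is left to the application, as described above.

```python
def system_bitrate_test(required, available, max_supported=None):
    """Evaluate a systemBitrate value against the available bandwidth.

    required:      the attribute value, in bits per second (integer > 0)
    available:     the bandwidth the application believes is available
    max_supported: optional implementation-defined maximum bandwidth value
    """
    if required <= 0:
        raise ValueError("systemBitrate must be an integer greater than 0")
    if max_supported is not None and required > max_supported:
        return False  # beyond the implementation-defined maximum
    return available >= required
```

For example, a value of 28800 evaluates to "true" for a user on a 56000 bit/s connection, and a value of 56000 evaluates to "false" for a user on a 28800 bit/s connection.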

systemCaptions (system-captions)

This attribute allows authors to distinguish between a redundant text equivalent of the audio portion of the presentation (intended for audiences such as those with hearing disabilities or those learning to read who want or need this information) and text intended for a wide audience. The attribute has the value "on" if the user has indicated a desire to see closed-captioning information, and it has the value "off" if the user has indicated that they don't wish to see such information. Evaluates to "true" if the value is "on", and evaluates to "false" if the value is "off".

systemLanguage (system-language)

The attribute value is a comma-separated list of language names as defined in [RFC1766].

Evaluates to "true" if one of the languages indicated by user preferences exactly equals one of the languages given in the value of this parameter, or if one of the languages indicated by user preferences exactly equals a prefix of one of the languages given in the value of this parameter such that the first tag character following the prefix is "-".

Evaluates to "false" otherwise.

Note: This use of a prefix matching rule does not imply that language tags are assigned to languages in such a way that it is always true that if a user understands a language with a certain tag, then this user will also understand all languages with tags for which this tag is a prefix.

The prefix rule simply allows the use of prefix tags if this is the case.

Implementation note: When making the choice of linguistic preference available to the user, implementers should take into account the fact that users are not familiar with the details of language matching as described above, and should provide appropriate guidance. As an example, users may assume that on selecting "en-gb", they will be served any kind of English document if British English is not available. The user interface for setting user preferences should guide the user to add "en" to get the best matching behavior.

Multiple languages MAY be listed for content that is intended for multiple audiences. For example, a rendition of the "Treaty of Waitangi", presented simultaneously in the original Maori and English versions, would call for:

<audio src="foo.rm" systemLanguage="mi, en"/>

However, just because multiple languages are present within the object on which the systemLanguage test attribute is placed, this does not mean that it is intended for multiple linguistic audiences. An example would be a beginner's language primer, such as "A First Lesson in Latin," which is clearly intended to be used by an English-literate audience. In this case, the systemLanguage test attribute should only include "en".

Authoring note: Authors should realize that if several alternative language objects are enclosed in a "switch", and none of them matches, this may lead to situations such as a video being shown without any audio track. It is thus recommended to include a "catch-all" choice at the end of such a switch which is acceptable in all cases.
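The matching rule for systemLanguage can be sketched as below. The function name is hypothetical, user preferences are assumed to be available as a list of language tags, and tags are compared case-insensitively on the assumption that this is how [RFC1766] tags are to be treated.

```python
def system_language_test(attr_value, user_languages):
    """Evaluate a systemLanguage value against the user's language preferences.

    A preference matches a listed language if the two are equal, or if the
    preference is a prefix of the listed language and is followed by "-".
    """
    listed = [tag.strip().lower() for tag in attr_value.split(",") if tag.strip()]
    preferences = [tag.strip().lower() for tag in user_languages]
    for preference in preferences:
        for tag in listed:
            if tag == preference or tag.startswith(preference + "-"):
                return True
    return False
```

Note the asymmetry described in the implementation note: a preference of "en" matches content marked "en-gb", but a preference of only "en-gb" does not match content marked "en". A user who wants any English content would need "en" in the preference list.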

systemOverdubOrCaption (system-overdub-or-caption)

This attribute is a setting which determines if users prefer overdubbing or captioning when the option is available. The attribute can have the values "caption" and "overdub". Evaluates to "true" if the user preference matches this attribute value. Evaluates to "false" if they do not match. This test attribute has been deprecated in favor of using systemOverdubOrSubtitle and systemCaptions.

systemRequired (system-required)

This attribute specifies the name of an extension. The extension may be a newly adopted language element or attribute, or may be the namespace prefix or URI for a namespace extension. Evaluates to "true" if the extension is supported by the implementation, otherwise, this evaluates to "false". [NAMESPACES]

systemScreenSize (system-screen-size)

Attribute values have the following syntax:
screen-size-val ::= screen-height"X"screen-width
Each of these is a pixel value, and must be an integer value greater than 0. Evaluates to "true" if the SMIL playback engine is capable of displaying a presentation of the given size. Evaluates to "false" if the SMIL playback engine is only capable of displaying smaller presentations.
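A possible reading of this rule is sketched below. It follows the grammar exactly as written (height first, then width), and it assumes that "capable of displaying a presentation of the given size" means that both dimensions fit the display. The function names are hypothetical.

```python
def parse_screen_size(value):
    """Parse screen-height "X" screen-width into a (height, width) tuple."""
    height_text, separator, width_text = value.partition("X")
    if not separator:
        raise ValueError("expected <height>X<width>: " + value)
    height, width = int(height_text), int(width_text)
    if height <= 0 or width <= 0:
        raise ValueError("pixel values must be greater than 0")
    return height, width

def system_screen_size_test(value, display_height, display_width):
    """True if a presentation of the given size fits on the display."""
    height, width = parse_screen_size(value)
    return display_height >= height and display_width >= width
```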

systemScreenDepth (system-screen-depth)

This attribute specifies the depth of the screen color palette in bits required for displaying the element. The value must be greater than 0. Typical values are 1, 8, 24, 32 .... Evaluates to "true" if the SMIL playback engine is capable of displaying images or video with the given color depth. Evaluates to "false" if the SMIL playback engine is only capable of displaying images or video with a smaller color depth.

systemOverdubOrSubtitle

This attribute specifies whether subtitles or overdub is rendered for people who are watching a presentation where the audio may be in a language in which they are not fluent. This attribute can have two values: "overdub", which selects for substitution of one voice track for another, and "subtitle", which means that the user prefers the display of subtitles.

systemAudioDesc

This test attribute specifies whether or not closed audio descriptions should be rendered. This is intended to provide authors with the ability to support audio descriptions for blind users in the same way that systemCaptions provides text captions for deaf users. The attribute has the value "on" if the user has indicated a desire to hear audio descriptions, and it has the value "off" if the user has indicated that they don't wish to hear audio descriptions. Evaluates to "true" if the value is "on", and evaluates to "false" if the value is "off".

systemOperatingSystem

TBD

systemCPU

TBD

systemContentLocation

TBD (i.e. Streaming/Stored)

system???

TBD (i.e. Selecting embedded information (element in aggregate))

system????

TBD (i.e. Costs of accessing a stream, free or Pay-Per-View)

systemComponent

CDATA that describes a component of the playback system, e.g. user-agent component/feature, number of audio channels, codec, HW mpeg decoder, etc.

Examples

1) Choosing between content with different total bitrates

In a common scenario, implementations may wish to allow for selection via a systemBitrate attribute on elements. The media player evaluates each of the "choices" (elements within the switch) one at a time, looking for an acceptable bitrate given the known characteristics of the link between the media player and media server.

<par>
  <text .../>
  <switch>
    <par systemBitrate="40000">
    ...
    </par>
    <par systemBitrate="24000">
    ...
    </par>
    <par systemBitrate="10000">
    ........
    </par>
  </switch>
</par>
...

2) Choosing between audio resources with different bitrates

The elements within the switch may be any combination of elements. For instance, one could merely be specifying an alternate audio track:

...
<switch>
   <audio src="joe-audio-better-quality" systemBitrate="16000" />
   <audio src="joe-audio" systemBitrate="8000" />
</switch>
...

3) Choosing between audio resources in different languages

In the following example, an audio resource is available both in French and in English. Based on the user's preferred language, the player can choose one of these audio resources.

...
<switch>
   <audio src="joe-audio-french" systemLanguage="fr"/>
   <audio src="joe-audio-english" systemLanguage="en"/>
</switch>
...

4) Choosing between content written for different screens

In the following example, the presentation contains alternative parts designed for screens with different resolutions and bit-depths. Depending on the particular characteristics of the screen, the player can choose one of the alternatives.

...
<par>
  <text .../>
  <switch>
    <par systemScreenSize="1280X1024" systemScreenDepth="16">
    ........
    </par>
    <par systemScreenSize="640X480" systemScreenDepth="32">
    ...
    </par>
    <par systemScreenSize="640X480" systemScreenDepth="16">
    ...
    </par>
  </switch>
</par>
...

5) Distinguishing caption tracks from stock tickers

In the following example, captions are shown only if the user wants captions on.

...
<seq>
  <par>
    <audio      src="audio.rm"/>
    <video      src="video.rm"/>
    <textstream src="stockticker.rtx"/>
    <textstream src="closed-caps.rtx" systemCaptions="on"/>
  </par>
</seq>
...

6) Choosing the language of overdub and subtitle tracks

In the following example, a French-language movie is available with English, German, and Dutch overdub and subtitle tracks. The following SMIL segment expresses this, and switches on the alternatives that the user prefers.

...
<par>
  <switch>
    <audio src="movie-aud-en.rm" systemLanguage="en" 
      systemOverdubOrSubtitle="overdub"/>
    <audio src="movie-aud-de.rm" systemLanguage="de" 
      systemOverdubOrSubtitle="overdub"/>
    <audio src="movie-aud-nl.rm" systemLanguage="nl" 
      systemOverdubOrSubtitle="overdub"/>
    <!-- French for everyone else -->
    <audio src="movie-aud-fr.rm"/>
  </switch>
  <video src="movie-vid.rm"/>
  <switch>
    <textstream src="movie-sub-en.rt" systemLanguage="en"
      systemOverdubOrSubtitle="subtitle"/>
    <textstream src="movie-sub-de.rt" systemLanguage="de"
      systemOverdubOrSubtitle="subtitle"/>
    <textstream src="movie-sub-nl.rt" systemLanguage="nl"
      systemOverdubOrSubtitle="subtitle"/>
    <!-- French captions for those that really want them -->
    <textstream src="movie-caps-fr.rt" systemCaptions="on"/>
  </switch>
</par>
...

4.2.3 System Test Attribute In-Line Use

During the development of SMIL 1.0, the issue of content selectability within a presentation received a great deal of attention. Early on, it was decided that a <switch> construct would form the basic selection primitive in the language. A <switch> allows a series of alternatives to be specified for a particular piece of content, one of which is selected by the runtime environment for presentation. An example of how a <switch> might be used to control the alternatives that could accompany a piece of video in a presentation would be:

...
<par>
  <video src="anchor.mpg" ... />
  <switch>
    <audio src="dutch.aiff"   systemLanguage="DU" systemCaptions="overdub" ... />
    <audio src="english.aiff" systemLanguage="EN" systemCaptions="overdub"... />
    <text  src="dutch.html"   systemLanguage="DU" systemCaptions="captions"... />
    <text  src="english.html" systemLanguage="EN" systemCaptions="captions"... />
  </switch>
</par> 
...

This fragment (which is pseudo-SMIL for clarity) says that a video is played in parallel with one of: Dutch audio, English audio, Dutch text, or English text. SMIL does not specify the selection mechanism, only a way of specifying the alternatives. While <switch>-based content control is a powerful mechanism, it comes with two problems.

First, it restricts the resolution of a <switch> to a single alternative. (If you want Dutch audio and Dutch text, you need to specify a compound <switch> statement, but in so doing, you always get the compound result.)

Second, and more restrictively, it requires the author to explicitly state all of the possible combinations of input streams during authoring. If the user wanted Dutch audio and English text, this possibility must have been considered at authoring time.

A solution to both problems is to allow in-line use of System Test Attributes, as given in the following document fragment:

...
<par>
  <video src="anchor.mpg" ... />
  <audio src="dutch.aiff"   systemLanguage="DU" systemCaptions="overdub" ... />
  <audio src="english.aiff" systemLanguage="EN" systemCaptions="overdub"... />
  <text  src="dutch.html"   systemLanguage="DU" systemCaptions="captions"... />
  <text  src="english.html" systemLanguage="EN" systemCaptions="captions"... />
</par> 
...

This example says: a video is accompanied by four other data objects, all of which are (logically) shown in parallel. This is, of course, exactly what happens: all five do run in parallel, but it could be that only the video and one audio stream are actually selected by the user (or a user agent) to be rendered during the presentation. At author time you know which logical streams are available, but it is only at runtime that you know which combination of the potentially available streams actually meets the user's needs. Logically, the alternatives indicated by the in-line construct could be represented as a set of <switch> statements, although the resulting <switch> could become explosive in size. Use of an in-line test mechanism significantly simplifies the specification of adaptive content in the case that many independent alternatives exist.
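The difference between the two forms can be sketched informally in Python. This is an illustration only, not part of the specification: the data model, the function names, and the `acceptable` predicate (standing in for whatever selection mechanism the runtime uses) are all assumptions. The first-acceptable-child rule for <switch> follows SMIL 1.0.

```python
# Hypothetical model: each alternative is (src, test-attribute values),
# using the pseudo-SMIL values from the fragments above.
ALTERNATIVES = [
    ("dutch.aiff",   {"systemLanguage": "DU", "systemCaptions": "overdub"}),
    ("english.aiff", {"systemLanguage": "EN", "systemCaptions": "overdub"}),
    ("dutch.html",   {"systemLanguage": "DU", "systemCaptions": "captions"}),
    ("english.html", {"systemLanguage": "EN", "systemCaptions": "captions"}),
]


def resolve_switch(alternatives, acceptable):
    """A <switch> yields at most one child: the first acceptable one."""
    for src, attrs in alternatives:
        if acceptable(attrs):
            return src
    return None


def resolve_inline(alternatives, acceptable):
    """In-line test attributes are evaluated independently on each element."""
    return [src for src, attrs in alternatives if acceptable(attrs)]


def wants_dutch_audio_and_english_text(attrs):
    """Assumed user preference: Dutch overdub together with English captions."""
    pair = (attrs["systemLanguage"], attrs["systemCaptions"])
    return pair in {("DU", "overdub"), ("EN", "captions")}


print(resolve_switch(ALTERNATIVES, wants_dutch_audio_and_english_text))
print(resolve_inline(ALTERNATIVES, wants_dutch_audio_and_english_text))
```

With this preference the <switch> form can only deliver the Dutch audio, while the in-line form delivers both the Dutch audio and the English text without the author having enumerated that combination.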

4.2.4 User Groups

The provision of <switch>-based and in-line system test attributes provides a selection mechanism based on general system attributes. This version of SMIL extends this notion with the definition of user test attributes. User test attributes allow presentation authors to define their own test attributes for use in a specific document.

The elements used to provide user group functionality are:

The `<user_attributes>` element

A section within the SMIL head that contains definitions of each of the user groups. The elements within the section define a collection of author-specified test attributes that can be used in the document.

The `<u_group>` element

An author-defined grouping of related media objects. These are defined within the section delineated by the <user_attributes> elements that make up part of the document header, and they are referenced within a media object definition.

The <u_group> element supports the following attributes:

id: the internal name of the attribute.
title: a string that can be used by a user interface to provide a selection mechanism.
u_state: the evaluated state of the <u_group>. The initial state for the <u_group> is given in the value of this attribute; if unspecified, it defaults to RENDERED. The run-time state is defined by the user or the user agent via the SMIL DOM. If a particular playback environment does not (or cannot) support user selection, the u_state attribute controls the author-specified default presentation.
override: the author is given the ability to block overrides to the initial state by explicitly prohibiting this in the <u_group> definition. It is up to the runtime environment to enforce this attribute. The attribute can also be used to influence adaptive behavior at a lower level in the transport hierarchy.
It would be good to have more explanation of this last use.

In addition to the <user_attributes> and <u_group> elements, this module provides a u_group attribute that can be applied to content requiring selection.

The `u_group` attribute

The u_group attribute is evaluated as a test attribute. If the u_group attribute evaluates to true, the associated element is evaluated; otherwise it and its content are skipped.

The following example shows how user groups can be applied within a SMIL document:

  1 <smil>
  2    <head>
  3       <layout>
  4          <!-- define projection regions -->
  5       </layout>
  6       <user_attributes>
  7          <u_group id="nl_aud" u_state="RENDERED" title="Dutch Audio Cap" override="allowed" />
  8          <u_group id="uk_aud" u_state="NOT_RENDERED" title="English Audio Cap" override="allowed" />
  9          <u_group id="nl_txt" u_state="NOT_RENDERED" title="Dutch Text Cap" override="allowed" />
 10          <u_group id="uk_txt" u_state="NOT_RENDERED" title="English Text Cap" override="allowed" />
 11       </user_attributes>
 12    </head>
 13    <body>
 14       ...
 15       <par>
 16          <video src="announcer.rm" region="a"/>
 17          <text src="news_headline.html" region="b"/>
 18          <audio src="story_1_nl.rm" u_group="nl_aud"/>
 19          <audio src="story_1_uk.rm" u_group="uk_aud"/>
 20          <text src="story_1_nl.html" u_group="nl_txt" region="c"/>
 21          <text src="story_1_uk.html" u_group="uk_txt" region="d"/>
 22       </par>
 23       ...
 24    </body>
 25 </smil>

Lines 6 through 11 define the available groups. Each group contains an identifier and a title (which can be used by the user interface agent to label the group), as well as the (optional) initial state definition and override flag.

In line 7, a <u_group> named "nl_aud" is defined for Dutch audio captions that is initially set to RENDERED. The other groups in this (very simple) example are set to NOT_RENDERED.

In lines 15 through 22, a SMIL <par> construct is used to identify a portion of a presentation. In this <par>, a single video (line 16) is accompanied by two audio streams (18,19) and two text streams (20,21), one each for English and Dutch. The <par> also contains a text title that contains a headline.

The interaction of the user interface and the initial state determines which objects are rendered. Note that the same attributes are used across the entire document, meaning that the user only needs to select his/her content preferences once to control related groups of information. In the example, the user is free to have the video and headline text accompanied by any combination of English and Dutch captions. (Note that if two audio captions are selected, the player will need to determine how these are processed for delivery.)

While this example shows in-line use of user groups, the groups could also be applied as test attributes in a <switch>. Similarly, the system test attributes typically found in a <switch> could also be used in-line as a control attribute on an element along with the u_group attribute.
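The evaluation of user groups described above can be sketched as follows. This is a non-normative illustration: the class and function names are hypothetical, the default for `override` and any value other than "allowed" are assumptions (the text above does not define them), and a reference to an undefined group is simply treated as an error.

```python
from dataclasses import dataclass


@dataclass
class UGroup:
    id: str
    title: str
    u_state: str = "RENDERED"   # documented default when u_state is unspecified
    override: str = "allowed"   # assumed default; not defined in the text above


def set_state(group, new_state):
    """Apply a user (or user agent) selection, honouring the override flag."""
    if group.override != "allowed":
        return False            # the author has blocked changes to the initial state
    group.u_state = new_state
    return True


def is_rendered(element_u_group, groups):
    """Evaluate a u_group test attribute; elements without one always pass."""
    if element_u_group is None:
        return True
    return groups[element_u_group].u_state == "RENDERED"


# The four groups from the example document.
groups = {
    g.id: g
    for g in [
        UGroup("nl_aud", "Dutch Audio Cap", "RENDERED"),
        UGroup("uk_aud", "English Audio Cap", "NOT_RENDERED"),
        UGroup("nl_txt", "Dutch Text Cap", "NOT_RENDERED"),
        UGroup("uk_txt", "English Text Cap", "NOT_RENDERED"),
    ]
}

set_state(groups["uk_txt"], "RENDERED")   # the user switches on English text
print([gid for gid in groups if is_rendered(gid, groups)])
```

Because the state lives on the group rather than on each media object, one selection affects every element in the document that names that group.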

A previous version of this specification used camelCase for the user group elements and attributes instead of the underscore convention used here. We need to standardize this across the SMIL modules.

4.3 Presentation Priority/Grouping

The following is still under development by the SYMM Working Group. The working group is interested in considering this functionality but the syntax and semantics described here are only preliminary thinking.

Define a means to group collections of objects that share a common policy. A Channel defines a partitioning of elements into groups. Each group has a common set of access policies that control the use of quasi-physical resources:

priority
common server
common access rights / charging model
local resource use (layout, devices, etc.)

4.4 User-Centered Adaptation

Focus on presentation as collection of content: each of the components may have a different user-level representation, encoding:

(natural) language
level of semantic detail
ability / rights to access particular type of content

At author-time, you know alternatives; at use-time, you select

4.5 Presentation Optimization

4.5.1 The `<prefetch>` element

This element gives a suggestion or hint to a user-agent that a media resource will be used in the future and that the author would like part or all of the resource fetched ahead of time to make the document play back more smoothly. User-agents can ignore <prefetch> elements, though doing so may cause an interruption in the document playback when the resource is needed. It gives authoring tools or savvy authors the ability to schedule retrieval of resources when they think that there is available bandwidth or time to do it. A <prefetch> element is contained within the body of an XML document, and its scheduling is based on its lexical order unless explicit timing is present.

The <prefetch> element, like media object elements, can have id and src attributes. If SMIL Boston Timing is integrated into the document, the begin, end, dur, clipBegin, and clipEnd attributes are also available. The id and src attributes are the same as for other media objects: id names the element for reference in the document and src names the resource to be prefetched. When a media object with the same src URL is encountered, the user-agent can use any data it prefetched to begin playback without rebuffering or other interruption. The timing attributes begin, end, and dur constrain the presentation time period for prefetching the element. At the end of the presentation time specified by end or dur, the prefetch operation should stop. The clipBegin and clipEnd attributes identify the part of the src clip to prefetch: if only the last 30 seconds of the clip are being played, we don't want to prefetch it from the beginning. Likewise, if only the middle 30 seconds of the clip are being played, we don't want to prefetch more data than will be played.

The `mediaSize`, `mediaTime`, and `bandwidth` Attributes

In addition to the attributes allowed on Media Object Elements, the following attributes are allowed:

mediaSize : bytes-value | percent-value: Defines how much of the resource to fetch as a function of the file size of the resource. To fetch the entire resource without knowing its size, specify 100%. The default is 100%.
mediaTime : clock-value | percent-value: Defines how much of the resource to fetch as a function of the duration of the resource. To fetch the entire resource without knowing its duration, specify 100%. The default is 100%.
bandwidth : bitrate-value | percent-value: Defines how much network bandwidth the user-agent should use when doing the prefetch. To use all that is available, specify 100%. The default is 100%.

If both mediaSize and mediaTime are specified, mediaSize is used and mediaTime is ignored.

For discrete media (non-time-based media like text/html or image/png), using the mediaTime attribute causes the entire resource to be fetched.
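The precedence rules for mediaSize and mediaTime can be summarised in a short Python sketch. It is illustrative only: the function name, the returned tuples, and the `discrete` flag are assumptions, and clock-values are passed through unparsed.

```python
def prefetch_amount(media_size=None, media_time=None, discrete=False):
    """Return (unit, value) describing how much of the resource to fetch.

    media_size and media_time are the attribute strings, or None when the
    attribute is absent.
    """
    if media_size is not None:
        # mediaSize is used and mediaTime is ignored when both are present.
        if media_size.endswith("%"):
            return ("percent-of-size", int(media_size[:-1]))
        return ("bytes", int(media_size))
    if media_time is not None:
        if discrete:
            # Discrete media have no duration: fetch the entire resource.
            return ("percent-of-size", 100)
        if media_time.endswith("%"):
            return ("percent-of-duration", int(media_time[:-1]))
        return ("clock-value", media_time)
    # Neither attribute given: both default to 100%, i.e. the whole resource.
    return ("percent-of-size", 100)


print(prefetch_amount(media_size="50000", media_time="30s"))
print(prefetch_amount(media_time="25%"))
print(prefetch_amount(media_time="10s", discrete=True))
```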

Documents must still play back even when the prefetch elements are ignored, although rebuffering or pauses in presentation of the document may occur.

If a prefetch element is repeated, due to restart or repeat on a parent element, the prefetch operation should occur again. This ensures that appropriately "fresh" data is displayed if, for example, the prefetch is for a banner ad to a URL whose content changes with each request. Note that prefetching data from a URL that changes its content dynamically is dangerous if the entire resource isn't prefetched, as the subsequent request for the remaining data may yield data from a newer resource. A user-agent should respect any appropriate caching directives applied to the content, e.g. no-cache RFC 822-style headers in HTTP. More specifically, content marked as non-cacheable would have to be refetched each time it was played, whereas content that is cacheable could be prefetched once, with the results of the prefetch cached for future use.

If the clipBegin or clipEnd on the media object differs from that on the prefetch element, an implementation can use whatever fetched data applies, but the result may not be optimal.

Attribute value syntax

bytes-value

The bytes-value value has the following syntax:

bytes-value ::= Digit+; any positive number

percent-value

The percent-value value has the following syntax:

percent-value ::= Digit+ "%"; any positive number in the range 0 to 100

clock-value

The clock-value value has the following syntax:

Clock-val         ::= ( Hms-val | Smpte-val )
Smpte-val         ::= ( Smpte-type )? Hours ":" Minutes ":" Seconds 
                      ( ":" Frames ( "." Subframes )? )?
Smpte-type        ::= "smpte" | "smpte-30-drop" | "smpte-25"
Hms-val           ::= ( "npt=" )? (Full-clock-val | Partial-clock-val 
                      | Timecount-val)
Full-clock-val    ::= Hours ":" Minutes ":" Seconds ("." Fraction)?
Partial-clock-val ::= Minutes ":" Seconds ("." Fraction)?
Timecount-val     ::= Timecount ("." Fraction)? (Metric)?
Metric            ::= "h" | "min" | "s" | "ms"
Hours             ::= DIGIT+; any positive number
Minutes           ::= 2DIGIT; range from 00 to 59
Seconds           ::= 2DIGIT; range from 00 to 59
Frames            ::= 2DIGIT; @@ range?
Subframes         ::= 2DIGIT; @@ range?
Fraction          ::= DIGIT+
Timecount         ::= DIGIT+
2DIGIT            ::= DIGIT DIGIT
DIGIT             ::= [0-9]

For Timecount values, the default metric suffix is "s" (for seconds).
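As an informal aid to reading the grammar, the following Python sketch converts an Hms-val to seconds. It is not normative: the function name is hypothetical, and the Smpte-val branch is deliberately left out because the Frames and Subframes ranges are still open above.

```python
import re

# One pattern per production of Hms-val (the optional "npt=" prefix is stripped first).
_FULL = re.compile(r"(\d+):([0-5]\d):([0-5]\d)(?:\.(\d+))?")
_PARTIAL = re.compile(r"([0-5]\d):([0-5]\d)(?:\.(\d+))?")
_TIMECOUNT = re.compile(r"(\d+)(?:\.(\d+))?(h|min|s|ms)?")
_METRIC_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}


def hms_to_seconds(text):
    """Convert an Hms-val string to a number of seconds (float)."""
    if text.startswith("npt="):
        text = text[len("npt="):]
    m = _FULL.fullmatch(text)
    if m:
        hours, minutes, seconds, fraction = m.groups()
        return int(hours) * 3600 + int(minutes) * 60 + float(f"{seconds}.{fraction or 0}")
    m = _PARTIAL.fullmatch(text)
    if m:
        minutes, seconds, fraction = m.groups()
        return int(minutes) * 60 + float(f"{seconds}.{fraction or 0}")
    m = _TIMECOUNT.fullmatch(text)
    if m:
        count, fraction, metric = m.groups()
        # A Timecount without a metric suffix is in seconds.
        return float(f"{count}.{fraction or 0}") * _METRIC_SECONDS[metric or "s"]
    raise ValueError(f"not an Hms-val: {text!r}")


for sample in ("02:30:03", "npt=00:10.5", "30", "45min"):
    print(sample, hms_to_seconds(sample))
```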

bitrate-value

The bitrate-value value specifies a number of bits per second. It has the following syntax:

bitrate-value ::= Digit+; any positive number
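A user agent might resolve the bandwidth attribute along the following lines. This is a sketch under stated assumptions: the function name and the `available_bps` parameter are hypothetical, a percent-value is read as a share of the currently available bandwidth, and capping an absolute bitrate-value at the available bandwidth is an assumption rather than something this section requires.

```python
def prefetch_bitrate(bandwidth, available_bps):
    """Return the bit rate (bits per second) to use for a prefetch.

    bandwidth is the attribute string (bitrate-value or percent-value),
    or None when the attribute is absent.
    """
    if bandwidth is None:
        bandwidth = "100%"  # documented default
    if bandwidth.endswith("%"):
        return available_bps * int(bandwidth[:-1]) // 100
    # Assumption: an absolute request cannot exceed what is available.
    return min(int(bandwidth), available_bps)


print(prefetch_bitrate("50%", 56000))
print(prefetch_bitrate("20000", 56000))
```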

Examples

1) Prefetch the image so it can be displayed immediately after the video ends:

<smil>
  <body>
    <seq>
      <par>
        <prefetch id="endimage" src="http://www.w3c.org/logo.gif"/>
        <text id="interlude" src="http://www.w3c.org/pleasewait.html" fill="freeze"/>
      </par>
      <video id="main-event" src="rtsp://www.w3c.org/video.mpg"/>
      <img src="http://www.w3c.org/logo.gif" fill="freeze"/>
    </seq>
  </body>
</smil>

No timing is specified, so default timing applies in the above example. The text is discrete media, so it ends immediately. The prefetch defaults to fetching the entire image at full available bandwidth, and the prefetch element ends when the image is downloaded. That ends the <par>, and the video begins playing. When the video ends, the image is shown.

2) Prefetch the images for a button so that rollover occurs quickly for the end user:

<html>
  <body>
    <prefetch id="upimage" src="http://www.w3c.org/up.gif"/>
    <prefetch id="downimage" src="http://www.w3c.org/down.gif"/>
    ....
    <img src="http://www.w3c.org/up.gif"/>
  </body>
</html>

4.6 Open Issues

Can prefetch elements be used as timebases for sync? This could be a useful capability to support. We should be able to start a prefetch and not play the content until it completes. This means that prefetch has to have an effective begin and end, depending upon how long it actually takes to get the data. Of course, if prefetching is optional, we need to decide when the begin and end events fire. However, this introduces the problem of how to handle errors. Even though the prefetch may not be allowed or may fail, there may be other things dependent upon the timing of the prefetch element. In this case it is appropriate for the element's timing to continue and fire begin/end events as if the prefetch element ran to completion. Since this is all very complicated, and prefetch is intended to be transparent, one idea is to explicitly prohibit prefetch from being a syncbase. This is not as simple as it sounds: consider, for example, a prefetch element in the middle of a <seq>. Maybe the simplest solution is to allow prefetch as a syncbase, and to say that for sync purposes, all prefetch elements always have duration zero, and fire begin/end events even if the prefetch itself fails or is not allowed.

5. SMIL Layout Module

Editors: Aaron Cohen (aaron.m.cohen@intel.com), Intel; Dick Bulterman (Dick.Bulterman@oratrix.com), Oratrix

5.1 Introduction

This section defines the SMIL layout module. This module contains elements and attributes allowing for positioning of media elements on the rendering surface (either visual or acoustic). Since these elements and attributes are defined in a module, designers of other markup languages can choose whether or not to include this functionality in their languages. Therefore, language designers incorporating other SMIL modules do not need to include the layout module if sufficient layout functionality is already present.

The major change with respect to the layout elements and attributes in SMIL 1.0 [SMIL10] is the addition of support for:

multiple top-level layout windows,
hierarchical region definition within a layout window

Other changes are minor. SMIL 1.0 already provides for using alternative layout models, for example CSS [SMIL-CSS2], [CSS2], and these can provide much of the additional functionality desired over SMIL basic layout.

It is the intention of this version of the Layout Module to align SMIL Boston Layout with current CSS2 functionality. There are some conflicts in mapping CSS2 layout to a language, such as SMIL, where the layout hierarchy is not reflected in the XML structure of the SMIL document. This necessitated dropping desirable features that could not be directly supported by mapping CSS to SMIL such as: multiple z-ordering within hierarchical regions, the alignment of objects within regions, and object-specific placement offsets within regions. It is desired that a future version of W3C layout technology will add support for these features to the SMIL language. A future version of the Layout module may include proof-of-concept support for these features.

5.2 Brief overview of SMIL basic layout

SMIL 1.0 includes a basic layout model for organizing media elements into regions on the visual rendering surface. The <layout> element is used in the document <head> to declare a set of regions on which media elements are rendered. Media elements declare which region they are to be rendered into with the region attribute.

Each region has a set of CSS2 compatible properties such as top, left, height, width, and background-color. These properties can be declared using a syntax defined by the type attribute of the layout element. In this way, media layout can be described using the SMIL 1.0 basic layout syntax, CSS2 syntax, or some other syntax.

For example, to describe a region with the id "r" at location 15,20 that is 100 pixels wide by 50 pixels tall using the SMIL basic layout model:


    <layout>
      <region id="r" top="15" left="20px" width="100px" height="50px"/>
    </layout>

To create the same region using CSS2 syntax:


    <layout type="text/css">
      [region="r"] { top: 15px; left: 20px; width: 100px; height: 50px; }
    </layout>

To display a media element in the region declared above, specify the region's id as the region attribute of the media element:


    <ref region="r" src="http://..." />

Additionally, implementations may choose to allow using the CSS syntax to set the media layout directly. This can be done by using the selector syntax to set layout properties on the media elements. For example, to display all video and image elements in a rectangle at the same size and position as the examples above:


    <layout type="text/css">
      video, img { top: 15px; left: 20px; width: 100px; height: 50px; }
    </layout>

Note that multiple layout models can be specified within a <switch> element, each with a different type. The first layout with a type supported by the implementation will be the one used.
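The selection rule for alternative layouts can be sketched as below. The function name and the list-based representation of the <layout> children of a <switch> are hypothetical; the fallback to "text/smil-basic-layout" for an absent type attribute follows the default given for that attribute in the <layout> element definition.

```python
DEFAULT_LAYOUT_TYPE = "text/smil-basic-layout"


def choose_layout(layout_types, supported):
    """Return the index of the first layout whose type the player supports.

    layout_types lists the type attribute of each <layout> inside the
    <switch>, in document order (None when the attribute is absent).
    """
    for index, layout_type in enumerate(layout_types):
        if (layout_type or DEFAULT_LAYOUT_TYPE) in supported:
            return index
    return None  # no usable layout; positioning is then up to the implementation


# A player that only understands SMIL basic layout skips the CSS alternative.
print(choose_layout(["text/css", None], {"text/smil-basic-layout"}))
```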

5.3 Extensions to SMIL 1.0 Basic Layout

The extensions proposed for SMIL/Boston fall into two groups:

multiple top-level layout windows,
hierarchical region definition within a layout window

The characteristics of each extension group will be presented in this section. The full syntax will be described in later sections.

5.3.1 Multiple Top-Level Window Support

In SMIL 1.0, each presentation was rendered into a root window of a specific size/shape. The root window contained regions to manage the rendering of specific media objects.

This specification supports the concept of multiple top-level windows. Since there is no longer a single root window, we use the term top-level instead. The assignment of the regions to individual top level windows allows independent placement and resizing of each top-level window.

A top level window is declared in a manner similar to the SMIL 1.0 root layout window, except that multiple instances of the top level may occur:

    <layout>
      <top-layout id="WinV" title=" Video  " width="320" height="240">
        <region id="pictures" title="pictures" height="100%" fit="meet"/>
      </top-layout>

      <top-layout id="WinC" title=" Captions  " width="320" height="60">
        <region id="captions" title="caption text" top="90%" fit="meet"/>
      </top-layout>
    </layout>

In this example, two top-level windows are defined ("WinV" and "WinC"), and two regions are defined with one region assigned to WinV and the other to WinC. The definitions of the top-level windows and the contained regions use the new hierarchical layout functionality, as discussed in the next section.

The top-level windows function as rendering containers only, that is, they do not carry temporal significance. In other words, each window does not define a separate timeline or any other time-container properties. There is still a single master timeline for the SMIL presentation, no matter how many top-level windows have been created. This is important to allow synchronization between media displayed in separate top-level windows.

All top level windows are opened as soon as the presentation is started. If a window is closed (by the user) while any of the elements displayed in that window are active, there is no effect on the timeline of those elements. However, a player may choose not to decode content as a performance improvement.

For SMIL 1.0 compatibility, the <root-layout> element will continue to support SMIL 1.0 layout semantics. The new <top-layout> element will support the extension semantics and an improved, nested syntax.

Note also that any one region may belong to at most one top-level (or root-level) window. Regions not declared as children of a <top-layout> element belong to the <root-layout> window. If no <root-layout> element has been declared, the region is assigned to a default window according to SMIL 1.0 layout semantics.

5.3.2 Hierarchical Region Layout

A new feature in this layout module is support for hierarchical layout. This allows for the declaration of regions nested inside other regions, much like regions are laid out inside the top level window declared by the <top-layout> element. For example, the following declares a top level window of 640 by 480 pixels, regions "left" and "right" which cover the left and right sides of the window respectively, and a subregion "inset" that is centered within "right".

<layout>
	<top-layout width="640px" height="480px">
		<region id="left" top="0px" left="0px" width="320px" height="480px" />
		<region id="right" top="0px" left="320px" width="320px" height="480px">
			<region id="inset" top="140px" left="80px" width="160px" height="200px" />
		</region>
	</top-layout>
</layout>

The resulting layout looks like this:

5.4 SMIL basic layout syntax and semantics

5.4.1 Elements and attributes

This section defines the elements and attributes that make up the SMIL basic layout module.

The `<layout>` element

The <layout> element determines how the elements in the document's body are positioned on an abstract rendering surface (either visual or acoustic).

The <layout> element must appear before any of the declared layout is used in the document. If present, the <layout> element must appear in the <head> section of the document. If a document contains no <layout> element, the positioning of the body elements is implementation-dependent.

It is recommended that profiles including the SMIL layout module also support the SMIL Content Control module. A document can then support multiple alternative layouts by enclosing several <layout> elements within the SMIL <switch> element. This could also be used to describe the document's layout using different layout languages. Support for the system test attributes in the SMIL Content Control module also enables greater author flexibility as well as user accessibility.

Default layout values can be assigned to all renderable elements by specifying the empty layout element <layout></layout>. If the document does not include a <layout> element, then the positioning of media elements is implementation-dependent.

Element attributes

id: This value uniquely identifies the layout element within a document. Its value is an XML identifier.
type: This attribute specifies which layout language is used in the layout element. If the player does not understand this language, it must skip the element and all of its content up until the next </layout> tag. The default value of the type attribute is "text/smil-basic-layout". This identifier value supports SMIL 1.0 layout semantics. To enable the multiple top-level window and hierarchical layout extensions in this specification, declare the value of this attribute to be "text/smil-extended-layout".

Element content

If the type attribute of the layout element has the value "text/smil-basic-layout", it can contain the following elements:

region
root-layout

If the type attribute of the layout element has the value "text/smil-extended-layout", it can contain the following elements:

region
root-layout
top-layout

If the type attribute of the <layout> element has another value, the element contains character data.

The `<region>` element

The region element controls the position, size and scaling of media object elements.

In the following example fragment, the position of a text element is set to a 5 pixel distance from the top border of the rendering window:


<smil>
  <head>
    <layout>
      <root-layout width="320" height="480" />
      <region id="a" top="5" />
    </layout>
  </head>
  <body>
    <text region="a" src="text.html" dur="10s" />
  </body>
</smil>

The position of a region, as specified by its "top" and "left" attributes, is always relative to the parent geometry, which is defined by the parent element. For <region> elements whose immediate parent is a layout element, the region position is defined relative to the root window declared in the sibling <root-layout> element. For <region> elements that are children of a <top-layout> element the region position is defined relative to the top level window declared in the parent <top-layout> element.

For <region> elements whose immediate parent is another <region> element, the sub-region position is defined relative to the position of the region defined by the parent element. Note that this is only allowed for regions that are descendants of a <top-layout> element.

When region sizes, as specified by "width" and "height" attributes are declared relative with the "%" notation, the size of a region is relative to the size of the parent geometry. Sizes declared as absolute pixel values maintain those absolute values, even when used on attributes in a sub-region.

Note that a sub-region may be defined in such a way as to extend beyond the limits of its parent. In this case the sub-region should be clipped to the parent boundaries.
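The positioning rules above can be illustrated with a small Python sketch that resolves a region against its parent geometry. It is a simplification made for illustration: the names are hypothetical, only "top", "left", "width", and "height" are handled (no "auto", "right", or "bottom"), and percentages on "top" and "left" are assumed to refer to the parent's height and width respectively.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    width: float
    height: float


def to_pixels(value, parent_extent):
    """Convert "80", "80px" or "25%" to pixels relative to the parent extent."""
    value = value.strip()
    if value.endswith("%"):
        return parent_extent * float(value[:-1]) / 100.0
    if value.endswith("px"):
        value = value[:-2]          # the "px" qualifier is optional
    return float(value)


def resolve(parent, left, top, width, height):
    """Return the region rectangle in window coordinates, clipped to its parent."""
    x = parent.left + to_pixels(left, parent.width)
    y = parent.top + to_pixels(top, parent.height)
    w = to_pixels(width, parent.width)
    h = to_pixels(height, parent.height)
    # A sub-region that extends beyond its parent is clipped to the parent.
    x0, y0 = max(x, parent.left), max(y, parent.top)
    x1 = min(x + w, parent.left + parent.width)
    y1 = min(y + h, parent.top + parent.height)
    return Rect(x0, y0, max(0.0, x1 - x0), max(0.0, y1 - y0))


# Geometry of the "right" and "inset" regions from the example in section 5.3.2.
window = Rect(0, 0, 640, 480)
right = resolve(window, "320px", "0px", "320px", "480px")
inset = resolve(right, "80px", "140px", "160px", "200px")
print(inset)
```

In window coordinates the "inset" region ends up at left 400, top 140, which is 80 pixels from either side of "right" and 140 pixels from its top and bottom, i.e. centered.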

Element attributes

The <region> element can have the following visual attributes:

backgroundColor

The use and definition of this attribute are identical to the "background-color" property in the CSS2 specification, except that SMIL basic layout does not require support for "system colors".

background-color

Deprecated. Equivalent to "backgroundColor", which replaces this attribute. The language profile must define whether or not the "background-color" attribute is supported. If both the "backgroundColor" and "background-color" attributes are absent, then the background is transparent.

bottom

The use and definition of this attribute are identical to the "bottom" property in the CSS2 specification. Attribute values can be "percentage" values, and a variation of the "length" values defined in CSS2. For "length" values, SMIL basic layout only supports pixel units as defined in CSS2. It allows the author to leave out the "px" unit qualifier in pixel values (the "px" qualifier is required in CSS2). Conflicts between the region size attributes "bottom", "left", "right", "top", "width", and "height" are resolved according to the rules for absolutely positioned, replaced elements in [CSS2]. The default value of this attribute is 'auto'.

fit

This attribute specifies the behavior if the intrinsic height and width of a visual media object differ from the values specified by the height and width attributes in the <region> element. This attribute does not have a 1-1 mapping onto a CSS2 property, but can be simulated in CSS2.
This attribute can have the following values:

fill

Scale the object's height and width independently so that the content just touches all edges of the box.

hidden

Has the following effect:

If the intrinsic height (width) of the media object element is smaller than the height (width) defined in the "region" element, render the object starting from the top (left) edge and fill up the remaining height (width) with the background color.
If the intrinsic height (width) of the media object element is greater than the height (width) defined in the "region" element, render the object starting from the top (left) edge until the height (width) defined in the "region" element is reached, and clip the parts of the object below (right of) the height (width).

meet

Scale the visual media object while preserving its aspect ratio until its height or width is equal to the value specified by the height or width attributes, while none of the content is clipped. The object's top-left corner is positioned at the top-left coordinates of the box, and empty space at the right or bottom is filled up with the background color.

scroll

A scrolling mechanism should be invoked when the element's rendered contents exceed its bounds.

slice

Scale the visual media object while preserving its aspect ratio so that its height or width is equal to the value specified by the height or width attribute, while some of the content may get clipped. Depending on the exact situation, either a horizontal or a vertical slice of the visual media object is displayed. Overflow width is clipped from the right of the media object. Overflow height is clipped from the bottom of the media object.

The default value of "fit" is "hidden".
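The scaling behavior of the individual values can be compared with a short Python sketch. It is illustrative only: the function name is hypothetical, and it returns just the rendered size of the media object before any clipping to the region; clipping, background fill, and the scrolling mechanism are not modelled.

```python
def fit_size(fit, media_w, media_h, region_w, region_h):
    """Return (width, height) at which the media object is rendered."""
    if fit == "fill":
        # Height and width are scaled independently to touch all edges.
        return (region_w, region_h)
    if fit in ("hidden", "scroll"):
        # No scaling: the object keeps its intrinsic size.
        return (media_w, media_h)
    scale_w = region_w / media_w
    scale_h = region_h / media_h
    if fit == "meet":
        scale = min(scale_w, scale_h)   # largest scale with nothing clipped
    elif fit == "slice":
        scale = max(scale_w, scale_h)   # smallest scale that covers the region
    else:
        raise ValueError(f"unknown fit value: {fit!r}")
    return (media_w * scale, media_h * scale)


# A 200x100 image rendered into a 100x100 region.
for value in ("fill", "hidden", "meet", "slice"):
    print(value, fit_size(value, 200, 100, 100, 100))
```

For this input, "meet" yields 100 by 50 (empty space below the image), while "slice" yields 200 by 100, of which the right half is clipped.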

height

The use and definition of this attribute are identical to the "height" property in the CSS2 specification. Attribute values follow the same restrictions and rules as the values of the "bottom" attribute.