Copyright © 1999 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document specifies the "Boston" version of the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"). SMIL Boston has the following two design goals:
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This document is the first working draft of the specification for the next version of SMIL code-named "Boston". It has been produced as part of the W3C Synchronized Multimedia Activity. The document has been written by the SYMM Working Group (members only). The goals of this group are discussed in the SYMM Working Group charter (members only).
Many parts of the document are still preliminary, and do not constitute full consensus within the Working Group. Also, some of the functionality planned for SMIL Boston is not contained in this draft. Many parts are not yet detailed enough for implementation, and other parts are only suitable for highly experimental implementation work.
At this point, the W3C SYMM WG seeks input by the public on the concepts and directions described in this specification. Please send your comments to www-smil@w3.org. Since it is difficult to anticipate the number of comments that come in, the WG cannot guarantee an individual response to all comments. However, we will study each comment carefully, and try to be as responsive as time permits.
The only difference between this working draft and the version from August 3, 1999 is that the draft is also provided as a single HTML document.
This working draft may be updated, replaced or rendered obsolete by other W3C documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". This document is work in progress and does not imply endorsement by the W3C membership.
B. Synchronized Multimedia Integration Language (SMIL) Modules
D. Content Control Module (detailed specification not yet available)
E. Event Module (detailed specification not yet available)
F. Integration Module (detailed specification not yet available)
G. Layout Module (detailed specification not yet available)
ref, animation, audio, img, video, text and textstream elements
rtpmap element
J. Metainformation Module (detailed specification not yet available)
K. Structure Module (detailed specification not yet available)
L. SMIL Timing and Synchronization
M. Integrating SMIL Timing into other XML-Based Languages
O. Synchronized Multimedia Integration Language (SMIL) Document Object Model
This document specifies the "Boston" version of the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"). SMIL Boston has the following two design goals:
SMIL Boston is defined as a set of markup modules, which define the semantics and an XML syntax for certain areas of SMIL functionality. All modules have an associated Document Object Model (DOM).
SMIL Boston deprecates some SMIL 1.0 syntax in favor of more DOM friendly syntax. Most notable is the change from hyphenated attribute names to mixed case (camel case) attribute names, e.g., clipBegin is introduced in favor of clip-begin. The SMIL Boston modules do not contain these SMIL 1.0 attributes so that integration applications are not burdened with supporting them. SMIL document players, those applications that support playback of "application/smil" documents (or <smil></smil> documents (or however we denote SMIL documents vs. integration documents)) must support the SMIL 1.0 attribute names.
This specification is structured as follows: Section B presents the individual modules in more detail, and gives example profiles. Section C defines the animation module. Section D defines control elements such as the switch element. Section E defines the SMIL event model. Section F defines syntax that is only used when SMIL modules are integrated into other XML-based languages. Section G defines the elements that can be used to define the layout of a SMIL presentation. Section H provides for XML linking into SMIL documents. Section I defines elements and attributes for describing media objects. Section J defines the meta element functionality. Section K defines the elements that form the skeleton of a SMIL document (head, body etc.). Section L defines the Timing and Synchronization elements. In particular, this section defines the time model used in SMIL. Section M explains how SMIL timing can be integrated into other XML-based languages.
This document has been prepared by the Synchronized Multimedia Working Group (WG) of the World Wide Web Consortium. The WG includes the following individuals:
In addition to the working group members, the following people contributed to the SMIL effort: Dan Austin (CNET), Rob Glidden (Web3D), Mark Hakkinen (The Productivity Works), Jonathan Hui (Canon), Rob Lanphier (RealNetworks), Tony Parisi (Web3D), Dave Raggett (W3C).
This is a working draft of a specification of synchronized multimedia integration language (SMIL) modules. These modules may be used to provide multimedia features to other XML based languages, such as the Extensible Hypertext Markup Language (XHTML). To demonstrate how these modules may be used, this specification outlines a set of sample profiles based on common use cases.
The first W3C Working Group on Synchronized Multimedia (SYMM) developed SMIL, the Synchronized Multimedia Integration Language [SMIL]. This XML-based language [XML] is used to express timing relationships among media elements such as audio and video files. SMIL 1.0 documents describe multimedia presentations that can be played in a SMIL-conformant viewer.
Since the publication of SMIL 1.0, interest in the integration of SMIL concepts with HTML, the Hypertext Markup Language [HTML], and other XML languages, has grown. Likewise, the W3C HTML Working Group is exploring how XHTML, the Extensible Hypertext Markup Language [XHTML], can be integrated with other languages. Both Working Groups are considering modularization as a strategy for integrating their respective functionality with each other and other XML languages.
Modularization is a solution in which a language's functionality is partitioned into sets of semantically-related elements. Profiling is the combination of these feature sets to solve a particular problem. For the purposes of this specification we define:
SMIL functionality is partitioned into modules based on the following design requirements:
The first requirement is that modules are specified such that a collection of modules can be "recombined" in such a way as to be backward compatible with SMIL (it will properly play SMIL conforming content).
The second requirement is that the semantics of SMIL must not change when they are embodied in a module. Fundamentally, this ensures the integrity of the SMIL content and timing models. This is particularly relevant when a different syntax is required to integrate SMIL functionality with other languages.
The third requirement is that modules be isomorphic with other modules from other W3C recommendations. This will assist designers when sharing modules across profiles.
The fourth requirement is that specific attention be paid to providing multimedia functionality to the XHTML language. XHTML is the reformulation of HTML in XML.
The fifth requirement is that the modules should adopt new W3C recommendations when they are appropriate and when they do not conflict with other requirements (such as complementing the XHTML language).
The sixth requirement is to ensure that modules have integrated support for the document object model. This facilitates additional control through scripting and user agents.
These requirements, and the ongoing work by the SYMM Working Group, led to a partitioning of SMIL functionality into nine modules.
SMIL functionality is partitioned into nine (9) modules:
Each of these modules introduces a set of semantically-related elements, properties, and attributes.
The Animation Module provides a framework for incorporating animation onto a timeline (a timing model) and a mechanism for composing the effects of multiple animations (a composition model). The Animation Module defines semantics for the animate, set, move, and colorAnim elements.
The Content Control Module provides a framework for selecting content based on a set of test attributes. The Content Control Module defines semantics for the switch element.
The Event Module provides a framework for realizing the event model specified in the W3C Document Object Model Level 2. The Event Module defines semantics for the eventhandler and event elements.
The Layout Module provides a framework for spatial layout of visual components. The Layout Module defines semantics for the layout, root-layout, and region elements.
The Linking Module provides a framework for relating documents to content, documents and document fragments. The Linking Module defines semantics for the a and anchor elements.
The Media Object Module provides a framework for declaring media. The Media Object Module defines semantics for the ref, animation, audio, img, video, text, textstream, xref, xanimation, xaudio, ximg, xvideo, xtext, and xtextstream elements.
The Metainformation Module provides a framework for describing a document, either to inform the human user or to assist in automation. The Metainformation Module defines semantics for the meta element.
The Structure Module provides a framework for structuring a SMIL document. The Structure Module defines semantics for the smil, head, and body elements.
The Timing and Synchronization Module provides a framework for describing timing structure, timing control properties, and temporal relationships between elements. The Timing and Synchronization Module defines semantics for par, seq, excl, and choice elements. In addition, this module defines semantics for properties such as begin, beginAfter, beginWith, beginEvent, dur, end, endEvent, endWith, eventRestart, repeat, repeatDur, timeAction, and timeline. These elements and attributes are subject to change.
A requirement for SMIL modularization is that the modules be isomorphic with other modules from other W3C recommendations. Isomorphism will assist designers when sharing modules across profiles.
| SMIL module | SMIL elements | HTML module | HTML elements |
| Animation | animate | - | - |
| Content Control | switch | - | - |
| Event | event, eventhandler | Intrinsic Events | onevent |
| | | Event | event, eventhandler |
| Layout | layout, region, root-layout | Stylesheet | style |
| Linking | a, anchor | Hypertext | a |
| | | Link | link |
| | | Base | base |
| | | Image Map | map, area |
| Media Object | ref, audio, video, text, img, animation, textstream | Object | object, param |
| | | Image | img |
| | | Applet | applet, param |
| Metainformation | meta | Metainformation | meta |
| Structure | smil, head, body | Structure | html, head, body, title |
| | | ??? | div and span |
| Timing and Synchronization | par, seq | - | - |
As can be seen in the table, there are two modules that appear in both SMIL and HTML: Event and Metainformation. Work is underway to define a single module that can be shared by both SMIL and HTML.
There is a range of possible profiles that may be built using SMIL modules. Four profiles are defined to inform the reader of how profiles may be constructed to solve particular problems:
These example profiles are non-normative.
The Lightweight Presentations Profile handles simple presentations, supporting timing of text content. The simplest version of this could be used to sequence stock quotes or headlines on constrained devices such as a palmtop device or a smart phone. This example profile might include the following SMIL modules:
This profile may be based on XHTML modules [XMOD] with the addition of the Timing and Synchronization Module. Transitions might be accomplished using the Animation Module.
The SMIL-Boston Profile supports the timeline-centric multimedia features found in the SMIL language. This profile might include the following SMIL modules:
The XHTML Presentations Profile integrates multimedia, XHTML layout, and CSS positioning. This profile might include the following SMIL modules:
This profile would use XHTML modules for structure and layout and SMIL modules for multimedia and timing. The linking functionality may come from the XHTML modules [XMOD] or from the SMIL modules.
The Web Enhanced Media Profile supports the integration of multimedia presentations with broadcast or on-demand streaming media. The primary media will often define the main timeline. This profile might include the following SMIL modules:
This profile is similar to the XHTML Presentations Profile with additional support to manage stream events and synchronization of the document's clock to the primary media.
[SMIL] "Synchronized Multimedia Integration Language (SMIL) 1.0 Specification", P. Hoschka, 15 Jun 98. This is available at http://www.w3.org/TR/REC-smil.
[XML] "Extensible Markup Language (XML) 1.0", T. Bray, J. Paoli, C. M. Sperberg-McQueen, 10 Feb 98. This is available at http://www.w3.org/TR/REC-xml.
[HTML] "HTML 4.0 Specification", D. Raggett, A. Le Hors, I. Jacobs, 24 Apr 98. This is available at http://www.w3.org/TR/REC-html40.
[XHTML] "Extensible Hypertext Markup Language (XHTML) 1.0 Specification"
[XMOD] "Modularization of XHTML Working Draft"
We will probably want to follow the HTML WG's lead on architecting module DTDs and the drivers for combining these DTDs automatically. We might want to consider our schedule in light of the XML Schema schedule.
The modules defined in this WD need to clearly align with the interfaces defined in the SMIL DOM WD and the existing DOM Level 1 and DOM Level 2 interfaces.
This is a working draft of a specification of animation functionality for XML documents. It is part of work in the Synchronized Multimedia Working Group (SYMM) towards a next version of the SMIL language and modules. It describes an animation framework as well as a set of base XML animation elements, included in SMIL and suitable for integration with other XML documents.
The first W3C Working Group on Synchronized Multimedia (SYMM) developed SMIL - Synchronized Multimedia Integration Language. This XML-based language is used to express synchronization relationships among media elements. SMIL 1.0 documents describe multimedia presentations that can be played in SMIL-conformant viewers.
SMIL 1.0 was focused primarily on linear presentations, and did not include support for animation. Other working groups (especially Graphics) are exploring animation support for things like vector graphics languages. As the timing model is at the heart of animation support, it is appropriate for the SYMM working group to define a framework for animation support, and to define a base set of widely applicable animation structures. This document describes that support.
Where SMIL 1.0 defined a document type and the associated semantics, the next version modularizes the functionality. The modularization facilitates integration with other languages, and the development of profiles suited to a wider variety of playback environments. See also "Synchronized Multimedia Modules based upon SMIL 1.0" (W3C members only). The Animation Module described herein is designed with the same goals in mind, and in particular to satisfy requirements such as those of the Graphics Working Group.
This document describes a framework for incorporating animation onto a time line and a mechanism for composing the effects of multiple animations. A set of basic animation elements is also described that can be applied to any XML-based language that supports a Document Object Model. A language in which this module is embedded is referred to as a host language.
Animation is inherently time-based. SMIL animation is defined in terms of the SMIL timing model, and is dependent upon the support described in the SMIL Timing and Synchronization Module. The capabilities are described by new elements with associated attributes and associated semantics, as well as the SMIL timing attributes. Animation is modeled as a local time line. An animation element is typically a child of the target element, the element that is to be animated.
While this document defines a base set of animation capabilities, it is assumed that host languages will build upon the support to define additional and/or more specialized animation elements. In order to ensure a consistent model for document authors and runtime implementors, we introduce a framework for integrating animation with the SMIL timing model. Animation only manipulates attributes of the target elements, and so does not require any specific knowledge of the target element semantics.
An overview of the fundamentals of SMIL animation is given in Animation Framework. The syntax of the animation elements and attributes is specified in Animation Syntax. The semantics of animation is specified in Animation Semantics. The normative definition of syntax is entirely contained in Animation Syntax, and the normative definition of precise semantics is entirely contained in Animation Semantics. All other text in this specification is informative. In cases of conflicts, the normative form sections take precedence. Anyone having a detailed question should refer to the Animation Syntax and Animation Semantics, as appropriate.
This section is informative. Readers who need to resolve detailed questions of syntax or semantics should refer to Animation Syntax and Animation Semantics, respectively, which are the only normative forms.
Animation is inherently time-based, changing the values of element attributes over time. The SYMM Working Group defines a generalized model for timing and synchronization that applies to SMIL documents, and is intended to be included in other XML-based host languages. While this document defines a base set of animation elements, it is assumed that other host languages will build upon the support to define additional and/or more specialized elements. In order to ensure a consistent model for both document authors and runtime implementors, we introduce a framework for integrating animation with the SMIL timing model.
[@@Ed: We intend that this section be a useful discussion of our central animation concepts, using simple examples. Feedback on its usefulness and clarity will be appreciated. In particular, the syntax elements used are not introduced prior to their use. It is hoped that the examples are sufficiently simple that an intuitive understanding of from/to/by will be sufficient. Details are in Section 3, Animation Syntax.]
@@@[Issue] This draft is written in terms of XML attribute animation. However, there is a need to animate DOM attributes which are not exposed as XML attributes. This applies in particular to structured attributes. A mechanism for naming these attributes is needed.
Animation is defined as a time-based manipulation of a target element (or more specifically of some attribute of the target element, the target attribute). The definition expresses a function, the animation function, of time from 0 to the simple duration of the animation element. The definition is evaluated as needed over time by the runtime engine, and the resulting values are applied to the target attribute. The functional representation of the animation's definition is independent of this model, and may be expressed as a sequence of discrete values, a keyframe based function, a spline function, etc. In all cases, the animation exposes this as a function of time.
For example, the following defines the linear animation of a bitmap. The bitmap appears at the top of the region, moves 100 pixels down over 10 seconds, and disappears.
<par>
  <img dur="10s" ...>
    <animate attribute="top" from="0" to="100" dur="10s"/>
  </img>
</par>
Animation has a very simple model for time. It just uses the animation element's local time line, with time varying from 0 to the duration. All other timing functions, including synchronization and time manipulations such as repeat, time scaling, etc. are provided (transparently) by the timing model. This makes it very simple to define animations, and properly modularizes the respective functionality.
Other features of the SMIL Timing and Synchronization module may be used to create more complex animations. For example, an accelerated straight-line motion can be created by applying an acceleration time filter to a straight-line, constant-velocity motion. There are many other examples.
It is frequently useful to define animation as a change in an attribute's value. Motion, for example, is often best expressed as an increment, such as moving an image from its initial position to a point 100 pixels down:
<par>
  <img dur="10s" ...>
    <animate attribute="top" by="100"
             dur="10s" additive="true"/>
  </img>
</par>
Many complex animations are best expressed as combinations of simpler animations. A corkscrew path, for example, can be described as a circular motion added to a straight-line motion. Or, as a simpler example, the example immediately above can be slowed to move only 40 pixels over the same period of time by inserting a second additive <animate> which by itself would animate 60 pixels in the other direction:
<par>
  <img dur="10s" ...>
    <animate attribute="top" by="100"
             dur="10s" additive="true"/>
    <animate attribute="top" by="-60"
             dur="10s" additive="true"/>
  </img>
</par>
When there are multiple animations active for an element at a given moment, they are said to be composed, and the resulting animation is composite. The active animations are applied to the current underlying value of the target attribute in activation order (first begun is first applied), with later additive animations being applied to the result of the earlier-activated animations. When two animations start at the same moment, the first in lexical order is applied first.
A non-additive animation masks all animations which began before it, until the non-additive animation ends.
Numeric attributes generally can have additive animations applied, though it may not make sense for some. Types such as strings and booleans, for which addition is not defined, cannot.
As long as the host language defines addition of the target attribute type and the value of the animation function, additive animation is possible. For example, if the language defines date arithmetic, date attributes can have additive animations applied, perhaps as a number of days to be added to the date. Such attributes are said to support composite animation.
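The composition rules described above (activation order, lexical order as a tie-break, and masking by non-additive animations) can be sketched in a few lines of Python. This is only an illustrative model, not part of the specification: the function name `compose` and the tuple layout are assumptions made for the example, and each animation is represented simply by the value its animation function yields at the moment of interest.

```python
def compose(underlying, active):
    """Compose the active animations of one target attribute.

    `active` is a list of hypothetical tuples
    (begin_time, lexical_index, additive, value), where `value` is the
    current result of that animation's function.
    """
    result = underlying
    # First begun is first applied; ties are broken by lexical order.
    for _begin, _index, additive, value in sorted(active, key=lambda a: (a[0], a[1])):
        if additive:
            # Additive animations build on the result so far.
            result = result + value
        else:
            # A non-additive animation masks everything that began before it.
            result = value
    return result


# Two additive animations at the end of their run: +100 and -60 give +40.
print(compose(0, [(0, 0, True, 100), (0, 1, True, -60)]))
```

In this model a later additive animation still adds to the value produced by an earlier non-additive one, which matches the statement that later additive animations are applied to the result of the earlier-activated animations.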
The author may also select whether a repeating animation should repeat the original behavior for each iteration, or whether it should build upon the previous results, accumulating with each iteration. For example, a motion path that describes an arc can repeat by drawing the same arc over and over again, or it can begin each repeat iteration where the last left off, making the animated element bounce across the window. This is called cumulative animation.
Repeating our 100-pixel-move-down example, we can move 1,000 pixels in 100 seconds.
<par>
  <img dur="100s" ...>
    <animate dur="10s" repeat="indefinite" accumulate="true"
             attribute="top" by="100"/>
  </img>
</par>
This example can, of course, be coded as a single 100-second, 1000-pixel motion. With more complex paths, additive animation is much more valuable. For example, if one created a motion path for a single sine wave, a repeated sine wave animation could easily be created by cumulatively repeating the single wave.
Typically, authors expect cumulative animations to be additive (as in the example directly above), but this is not required. The following example is not additive. It starts at the absolute position given, 20. It moves down by 10 pixels to 30, then repeats. It is cumulative, so the second iteration starts at 30 and moves down by another 10 to 40, and so on.
<par>
  <img dur="100s" ...>
    <animate dur="10s" repeat="indefinite" attribute="top" from="20" by="10"
             additive="false" accumulate="true"/>
  </img>
</par>
Cumulative animations are possible for any attribute which supports animation composition. When the animation is also additive, as composite animations typically are, they compose just as straight additive animations do (using the cumulative value).
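As a rough numerical check of the cumulative examples above, the following Python sketch evaluates a repeating from/by animation at a given local time. It assumes linear interpolation within each iteration; the function name `cumulative_value` and its parameters are hypothetical and chosen only for this illustration. For an additive animation the result is an offset from the underlying value; for a non-additive one it is the absolute value starting at `start`.

```python
def cumulative_value(t, dur, by, start=0.0, accumulate=True):
    """Value of a repeating from/by animation at local time t (t >= 0)."""
    iteration = int(t // dur)          # completed iterations so far
    local = t - iteration * dur        # time within the current iteration
    within = by * (local / dur)        # linear progress inside the iteration
    # With accumulation, each completed iteration contributes a full `by`.
    carried = by * iteration if accumulate else 0.0
    return start + carried + within


# The 100-pixel move, repeated cumulatively, after 25 seconds.
print(cumulative_value(25, dur=10, by=100))
# The non-additive example starting at 20: second iteration begins at 30.
print(cumulative_value(10, dur=10, by=10, start=20))
```

With `accumulate=False` the same call simply repeats the original behavior in each iteration.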
When an animation element ends, its effect is normally removed from the target. If an animation moves an image and the animation element ends, the image will jump back to its original position. For example:
<par>
  <img dur="20s" ...>
    <animate begin="5s" dur="10s" attribute="top" by="100"/>
  </img>
</par>
The image will appear stationary for 5 seconds (begin="5s" in the <animate>), then move 100 pixels down in 10 seconds (dur="10s", by="100"). At the end of the movement the animation element ends, so its effect ends and the image jumps back to where it started (to the underlying value of the top attribute). The image lasts for 20 seconds, so it will remain back at the original position for 5 seconds then disappear.
The standard timing attribute fill can be used to maintain the value of the animation after the simple duration of the animation element ends:
<par>
  <img dur="20s" ...>
    <animate begin="5s" dur="10s" fill="freeze"
             attribute="top" by="100"/>
  </img>
</par>
The <animate> ends after 10 seconds, but fill="freeze" keeps its final effect active until it is ended by the ending of its parent element, the image.
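The difference between the default behavior and fill="freeze" in the two examples above can be traced with a small Python sketch. It hard-codes the timings of those examples as default arguments; the function name `top_offset` is hypothetical, and the default fill behavior is modelled here under the name "remove", which is an assumption of this sketch.

```python
def top_offset(t, begin=5.0, dur=10.0, by=100.0, fill="remove", parent_end=20.0):
    """Offset applied to `top` at time t on the image's timeline.

    Returns None when the image itself is not active.
    """
    if t < 0 or t >= parent_end:
        return None                      # image not displayed
    if t < begin:
        return 0.0                       # animation has not begun yet
    if t < begin + dur:
        return by * (t - begin) / dur    # animation in progress
    # After the simple duration: keep the final value only when frozen.
    return by if fill == "freeze" else 0.0


print(top_offset(10))                  # halfway through the movement
print(top_offset(17))                  # effect removed, image jumps back
print(top_offset(17, fill="freeze"))   # final effect kept until the image ends
```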
However, it is frequently useful to define an animation as a sequence of additive steps, one building on the other. For example, the author might wish to move an image rapidly for 2 seconds, slowly for another 2, then rapidly for 1, ending 100 pixels down. It is natural to express this as a <seq>, but each element of a <seq> ends before the next begins.
The hold attribute keeps the final effect applied until the target element itself, here the image, ends:
<par>
  <img dur="100s" ...>
    <seq>
      <animate dur="2s" attribute="top" by="50" hold="true"/>
      <animate dur="2s" attribute="top" by="10" hold="true"/>
      <animate dur="1s" attribute="top" by="40" hold="true"/>
    </seq>
  </img>
</par>
The effects of the held animations are essentially attached to the target to achieve the desired result. In this example, it will have moved 50 pixels after 2 seconds and 60 after 4. At 5 seconds it will reach 100 pixels and stay there. Note that not only does each <animate> end before the image, but the <seq> containing the animation elements also ends (when the last <animate> ends). The effect of the held animations is retained until the image ends.
The difference between hold="true" and fill="freeze" is that hold causes the animation to "stick" to the target element until the target element ends, while the duration of the fill is determined by the parent of the animation element.
The above example is equivalent to both of the following examples, but easier to visualize and maintain:
<!-- Equivalent animation using a <seq> -->
<img dur="100s" ...>
  <seq>
    <animate dur="2s" attribute="top" by="50"/>
    <animate dur="2s" attribute="top" values="50 60" additive="true"/>
    <animate dur="1s" attribute="top" values="60 100" additive="true" fill="freeze"/>
  </seq>
</img>
<!-- Equivalent animation using a <par> -->
<img dur="100s" ...>
  <par>
    <animate dur="2s" attribute="top" by="50" fill="freeze"/>
    <animate begin="prev.end" dur="2s" attribute="top" by="10" fill="freeze"/>
    <animate begin="prev.end" dur="1s" attribute="top" by="40" fill="freeze"/>
  </par>
</img>
The trick here is that fill="freeze" causes the animation elements to last until the end of the <seq> or <par>, respectively, which in turn lasts until the image ends. With more complex paths, the arithmetic would be impractical and difficult to maintain.
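The arithmetic behind the held <seq> example in this section (50 pixels after 2 seconds, 60 after 4, 100 after 5) can be checked with the following Python sketch. It models only the sequence of additive steps with hold="true" and assumes linear motion within each step; it does not model the end of the target element. The name `held_seq_offset` and the (dur, by) tuple layout are hypothetical.

```python
def held_seq_offset(t, steps):
    """Total offset at time t for a <seq> of held additive steps.

    `steps` is a list of (dur, by) pairs played one after another.
    """
    offset = 0.0
    start = 0.0
    for dur, by in steps:
        if t >= start + dur:
            offset += by                       # finished step: final effect is held
        elif t > start:
            offset += by * (t - start) / dur   # step currently in progress
        start += dur
    return offset


steps = [(2, 50), (2, 10), (1, 40)]   # rapid, slow, rapid
for t in (2, 4, 5, 50):
    print(t, held_seq_offset(t, steps))
```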
@@@Issue If animation elements were allowed to animate the parameters of other animation elements, certain use cases become very easy. For example, a dying oscillation could be created by placing an undamped oscillation animation, then animating the length of the oscillation's path (decreasing it over time). The SYMM WG is uncertain whether the complexity of this feature is worth its benefit.
@@@ Issue We need to define what it means to animate an attribute that has been changed by scripting or by another DOM client while the <animate> is active. This involves some implementation issues. Some alternatives: changing an attribute with script cancels the animation; or changing an attribute simply changes the "initial state" of that attribute and the animation proceeds as if the attribute started out with that value.
By default, the target of an animation element will be the closest ancestor for which the manipulated attribute is defined. However, the target may be any (@@@??) element in the <body> of the document, identified by its element id. [@@@Should be limited to elements which are known when the animation begins, or perhaps to those known when the animation is encountered in the text -- should be similar to other limitations on idrefs. Probably no forward references past the point in document loading at which playback starts.]
An animation element affects its target only if both are active at the same time. The calculation of the target attribute at a given moment in time uses the animation element's timeline (current position on its timeline and simple duration) to compute the new value of the animated attribute of the target.
For example, in the following animation the image repeatedly moves 100 pixels down, from 0 to 100, and jumps back to the top. The 10 second animation begins 5 seconds before the target element. So, the target appears at 50, moves down for 5 seconds to 100, jumps back to the top, and goes into a series of 10-second motions from 0 to 100.
<par>
  <img id="a" begin="5s" .../>
  <animate target="a" begin="0s" dur="10s" repeat="indefinite"
           attribute="top" from="0" to="100"/>
</par>
Note that in this example, the animation is running before the target exists, so it cannot be a child of the target. It must explicitly identify the target.
This is very useful for starting part of the way into spline-based paths, as splines are hard to split.
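The interaction between the animation's own timeline and the later-starting target in the example above can be illustrated with a short Python sketch. The defaults mirror that example (target begins at 5s, a 10-second animation from 0 to 100 begins at 0s and repeats indefinitely); the function name `top_at` and the `underlying` parameter are assumptions made for this illustration, and linear interpolation is assumed.

```python
def top_at(t, target_begin=5.0, anim_begin=0.0, dur=10.0,
           from_=0.0, to=100.0, underlying=0.0):
    """Value of `top` at parent time t, or None while the target is inactive."""
    if t < target_begin:
        return None                     # target not yet active: nothing to animate
    if t < anim_begin:
        return underlying               # animation has not started yet
    local = (t - anim_begin) % dur      # position on the repeating local timeline
    return from_ + (to - from_) * local / dur


print(top_at(3))      # None: the image does not exist yet
print(top_at(5))      # the image appears partway through the first iteration
print(top_at(10))     # the animation wraps and the image jumps back to the top
```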
The definitions in this module could be used to animate any attribute. However, it is expected that host languages will constrain what elements and attributes animation may be applied to. For example, we do not expect that most host languages will support animation of the src attribute of a media element. A host language which included a DOM might limit animation to the attributes which may be modified through the DOM.
Any attribute of any element not specifically excluded from animation by the host language may be animated, as long as the underlying datatype supports discrete values (for discrete animation) or addition (for additive animation).
This section defines the XML animation elements and attributes. It is the normative form for syntax questions. See Animation Semantics for semantic definitions; all discussion of semantics in this section is informative.
All animation elements use the common timing markup described in the SMIL Timing and Synchronization module. In addition, animation elements share attributes to control composition, and to describe the calculation mechanism.
The <animate> element introduces a generic attribute animation that requires no semantic understanding of the attribute being animated. It can animate numeric scalars as well as numeric vectors. It can also animate discrete sets of non-numeric attributes.
The basic form is to provide a list of values:
The values array and calcMode together define the animation function. For discrete animation, the duration is divided into even time periods, one per value. The animation function takes on the values in order, one value for each time period. For linear animation, the duration is divided into n-1 even periods, and the animation function is a linear interpolation between the values at the associated times. Note that a linear animation will be a nicely closed loop if the first value is repeated as the last.
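The way values and calcMode define the animation function can be sketched in Python as follows. This is a minimal illustration for numeric scalars under the description above, not a normative algorithm: the function name `animation_value` is hypothetical, times outside the duration are clamped, and at t equal to the duration the last value is returned.

```python
def animation_value(values, calc_mode, t, dur):
    """Evaluate the animation function at local time t, 0 <= t <= dur."""
    if not values:
        raise ValueError("values must not be empty")
    if dur <= 0:
        raise ValueError("dur must be positive")
    t = min(max(t, 0.0), dur)           # clamp to the simple duration
    n = len(values)
    if calc_mode == "discrete":
        # n even periods, one per value; t == dur maps to the last value.
        index = min(int(t / dur * n), n - 1)
        return values[index]
    if calc_mode == "linear":
        if n == 1:
            return values[0]
        # n - 1 even periods; interpolate between the neighbouring values.
        pos = t / dur * (n - 1)
        i = min(int(pos), n - 2)
        frac = pos - i
        return values[i] + (values[i + 1] - values[i]) * frac
    raise ValueError("unsupported calcMode: %r" % (calc_mode,))


print(animation_value([0, 100], "linear", 5, 10))
print(animation_value([0, 50, 100], "discrete", 5, 12))
```

Repeating the first value as the last, for example `[0, 10, 0]`, gives the closed loop mentioned above: the linear function returns to its starting value at the end of the duration.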
from/to/by specification of animation function
For convenience, the values for a simple discrete or linear animation may be specified using a from/to/by notation, replacing the values and additive attributes. From is optional in all cases. To or by (but not both) must be specified. If a values attribute or an additive attribute is specified, none of these three attributes may be specified. [@@@Issue] Need to specify behavior in error cases.
Animations expressed using from/to/by are equivalent to the same animation with from and to or by replaced by values. Examples of equivalent <animate> elements:
from/to/by form | values form
<animate ... by="10"/> | <animate ... values="0 10" additive="true"/>
<animate ... from="5" by="10"/> | <animate ... values="5 15" additive="true"/>
<animate ... from="10" to="20"/> | <animate ... values="10 20" additive="false"/>
<animate ... to="10"/> | <animate ... values="b 10" additive="false"/>, where b is the base value for the animation.
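The equivalences listed above can be expressed as a small conversion routine. The following Python sketch simply reproduces the four rows of the table; the function name to_values_form and the placeholder "b" for the base value are illustrative assumptions, and error cases other than "both or neither of to and by" are not covered.

```python
def to_values_form(from_=None, to=None, by=None, base="b"):
    """Map from/to/by onto an equivalent (values, additive) pair.

    'base' stands in for the base value of the animated attribute,
    which is only known to the implementation at run time.
    """
    if (to is None) == (by is None):
        raise ValueError("exactly one of 'to' or 'by' must be specified")
    if by is not None:
        # by-animations are additive; a missing 'from' counts as 0.
        start = 0 if from_ is None else from_
        return [start, start + by], True
    # to-animations are non-additive; a missing 'from' is the base value.
    start = base if from_ is None else from_
    return [start, to], False
```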
The <set> element is a convenience form of the <animate> element. It supports all attribute types, including those that cannot reasonably be interpolated, and that more sensibly support semantics of setting a value over the specified duration (e.g. strings and boolean values). The <set> element is non-additive. While this supports the general set of timing attributes, the effect of the "repeat" attribute is just to extend the defined duration. In addition, using "fill=freeze" will have largely the same effect as an indefinite duration.
<set> takes the "attribute" and "target" attributes from the generic attribute list described above, as well as the following:
Formally, <set ... to=z .../> is defined as <animate ... calcMode="discrete" additive="false" values=z .../>.
[@@@ Issue] The WG does not agree on the inclusion of this element in SMIL. This would be a very reasonable extension in other host languages, and there is value in a standardized motion animation element. We are interested in feedback from others who are defining potential host languages.
In order to abstract the notion of motion paths across a variety of layout mechanisms, we introduce the <move> element. This takes all the attributes of <animate> described above, as well as two additional attributes:
The following is one such possible definition:
In order to abstract the notion of color animation, we introduce the <colorAnim> element. This takes all the generic attributes described above, supporting string values as well as RGB values for the individual argument values. The animation of the color is defined to be in HSL space. [@@@ need to explain why & interaction with RGB values -- examples. Might want rgb-space animation for improved performance when it's "good enough" for the author]. This element takes one additional attribute as well:
- direction
- This specifies the direction to run through the colors, relative to the standard color wheel. If the from and to values are the same and clockwise or cclockwise is specified, the animation will cycle full circle through the color wheel.
- Legal values are:
- clockwise
- Animate colors between the from and to values in the clockwise direction on the color wheel. This is the default.
- cclockwise
- Animate colors between the from and to values in the counter-clockwise direction on the color wheel.
- nohue
- Do not animate the hue, but only the saturation and level. This allows for simple saturation animations, ignoring the hue and ensuring that it does not cycle.
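The effect of the three direction values on the hue component can be sketched as follows. This is an informal Python illustration, not a normative definition: it assumes hue is expressed in degrees, that "clockwise" corresponds to increasing hue angle (the draft does not fix this orientation), and that t is the animation progress between 0 and 1. The function name interpolate_hue is hypothetical.

```python
def interpolate_hue(h_from, h_to, direction, t):
    """Return the hue in degrees at progress t (0 <= t <= 1)."""
    if direction == "nohue":
        # Only saturation and level are animated; the hue stays fixed.
        return h_from % 360
    if direction == "clockwise":
        sweep = (h_to - h_from) % 360
        sign = 1
    elif direction == "cclockwise":
        sweep = (h_from - h_to) % 360
        sign = -1
    else:
        raise ValueError("unknown direction: " + direction)
    if sweep == 0:
        # Identical from and to values cycle once around the full wheel.
        sweep = 360
    return (h_from + sign * sweep * t) % 360
```

Under these assumptions, animating from hue 0 to hue 120 passes through 60 at the midpoint when running clockwise, and through 240 when running counter-clockwise.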
We may need to support extensions to the path specification to allow the direction to be specified between each pair of color values in a path specification. This would allow for more complex color animations specified as a path.
@@@ Need a section with precise mathematical definitions of animation semantics
Need to mention and point to DOM Core and SMIL DOM specs. May want to discuss issues which host languages must specify: Interaction between animation and DOM manipulations, the mechanism for determining property type, definition of addition. Animation of DOM attributes not exposed as XML attributes discussion may belong here.
Related section: Interaction with DOM Manipulations
In no particular order:
The SMIL linking module defines the user-initiated hyperlink elements that can be used in a SMIL document. It describes
XPointer [XPTR] allows components of XML documents to be addressed in terms of their placement in the XML structure rather than on their unique identifiers. This allows referencing of any portion of an XML document without having to modify that document. Without XPointer, pointing within a document may require adding unique identifiers to it, or inserting specific elements into the document, such as a named anchor in HTML. XPointers are put within the fragment identifier part of a URI.
XLink (XML Linking Language) [XLINK] defines a set of generic attributes that can be used when defining linking elements in an XML-encoded language. Using these generic XLink attributes has the advantage that users find the same syntactic constructs with the same semantics in many XML-based languages, resulting in a faster learning curve. It also enables generic link processors to process the hyperlinking semantics in XLink documents without understanding the details of the DTD. For example, it allows users of a generic XML browser to follow SMIL links.
Both XLink and XPointer are subject to change. At the time of this document's writing, neither is a full W3C recommendation. This document is based on the public Working Drafts ([XLINK], [XPTR]). It will change when these two formats change.
SMIL 1.0 allowed authors to start playback of a SMIL presentation at a particular element rather than at the beginning by using a URI with a fragment identifier, e.g. "doc#test", where "test" was the value of an element identifier in the SMIL document "doc". This meant that only elements with an "id" attribute could be the target of a link.
The SMIL Linking module defined in this specification allows any element in a SMIL document to be used as the target of a link. SMIL software must fully support the use of XPointers for fragment identifiers in URIs pointing into SMIL documents.
Example:
The following URI selects the 4th par element of an element called "bar":
http://www.w3.org/foo.smil#id("bar").child(4,par)
Note that XPointer only allows navigating in the XML document tree, i.e. it does not actually understand the time structure of a SMIL document.
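As an informal illustration of how the fragment in the example above selects an element purely by document structure, the following Python sketch resolves the single form id("...").child(n,type) against a parsed document. It is not an XPointer implementation: the regular expression, the function name resolve_id_child, and the restriction to this one form are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Matches only the form used in the example: id("bar").child(4,par)
XPOINTER = re.compile(r'^id\("([^"]+)"\)\.child\((\d+),(\w+)\)$')

def resolve_id_child(root, fragment):
    """Return the n-th child of the given type below the element with the given id."""
    match = XPOINTER.match(fragment)
    if match is None:
        raise ValueError("unsupported fragment: " + fragment)
    element_id = match.group(1)
    position = int(match.group(2))
    tag = match.group(3)
    for element in root.iter():
        if element.get("id") == element_id:
            # Count only children of the requested element type.
            candidates = [child for child in element if child.tag == tag]
            if 1 <= position <= len(candidates):
                return candidates[position - 1]
            return None
    # No element carries the requested id.
    return None
```

The lookup walks the XML tree only; nothing in it depends on when the selected element would be active in the presentation.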
Error handling
When a link into a SMIL document contains an unresolvable XPointer ("dangling link") because it identifies an element that is not actually part of the document, SMIL software should ignore the XPointer, and start playback from the beginning of the document.
When a link into a SMIL document contains an XPointer which identifies an element that is the content of a "switch" element, SMIL software should interpret this link as going to the parent "switch" element instead. The result of the link traversal is thus to play the "switch" element child that passes the usual switch child selection process.
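The two error-handling rules above can be summarized in a short sketch. The Python below is illustrative only: it reduces the XPointer to a plain id lookup, represents "the beginning of the document" by the root element, and uses a hypothetical function name playback_target.

```python
import xml.etree.ElementTree as ET

def playback_target(root, element_id):
    """Return the element at which playback starts for a link to element_id."""
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for element in root.iter():
        if element.get("id") == element_id:
            parent = parent_of.get(element)
            if parent is not None and parent.tag == "switch":
                # Links to the content of a switch go to the switch itself.
                return parent
            return element
    # Dangling link: ignore the pointer and start from the beginning.
    return root

DOC = ET.fromstring(
    '<smil><body>'
    '<switch id="sw"><audio id="en" src="en.wav"/><audio id="fr" src="fr.wav"/></switch>'
    '<video id="clip" src="video"/>'
    '</body></smil>'
)
```

With this sample document, a link to "fr" resolves to the enclosing "switch" element, a link to "clip" resolves to the video itself, and a link to an unknown identifier falls back to the document root.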
The use of XPointer is not restricted to XLink attributes. Any attribute specifying a URI can use an XPointer (unless, of course, prohibited for that attribute's document set).
XPointer can be used in various SMIL attributes which refer to XML components in the same SMIL document or in external XML documents. These include
a Element
The "a" element has the same syntax and semantics as the SMIL 1.0 "a" element. All SMIL 1.0 attributes can still be used. The following lists attributes that are newly introduced by this specification, and attributes that are extended with respect to SMIL 1.0:
All XLink attributes not mentioned in the list above are not allowed in SMIL.
Element Content
No changes to SMIL 1.0.
area Element
This element extends the syntax and semantics of the HTML 4.0 "area" element with constructs required for timing. The SMIL 1.0 "anchor" element is deprecated in favor of "area".
The "area" element can have the attributes listed below, with the same syntax and semantics as in HTML 4.0:
The following lists attributes that are newly introduced by this specification, and attributes that are extended with respect to HTML 4.0:
Element Content
An "area" element can contain "seq" and "par" elements for scheduling other "area" elements over time.
Examples
1) Decomposing a video into temporal segments
In the following example, the temporal structure of an interview in a newscast (camera shot on interviewer asking a question followed by shot on interviewed person answering) is exposed by fragmentation:
<smil>
<body>
<video src="video" title="Tom Cruise interview 1995" >
<seq>
<area dur="20s" title="first question" />
<area dur="50s" title="first answer" />
</seq>
</video>
</body>
</smil>
2) Associating links with spatial segments
In the following example, the screen space taken up by a video clip is split into two sections. A different link is associated with each of these sections.
<smil>
<body>
<video src="video" title="Tom Cruise interview 1995" >
<area shape="rect" coords="5,5,50,50"
title="Journalist" href="http://www.cnn.com" xml:link="simple" />
<area shape="rect" coords="5,60,50,50"
title="Tom Cruise" href="http://www.brando.com" xml:link="simple" />
</video>
</body>
</smil>
3) Associating links with temporal segments
In the following example, the duration of a video clip is split into two sub-intervals. A different link is associated with each of these sub-intervals.
<smil>
<body>
<video src="video" title="Tom Cruise interview 1995" >
<seq>
<area dur="20s" title="first question"
href="http://www.cnn.com" xml:link="simple" />
<area dur="50s" title="first answer"
href="http://www.brando.com" xml:link="simple" />
</seq>
</video>
</body>
</smil>
ref, animation, audio, img, video, text
and textstream elements
rtpmap element
This section defines the SMIL media object module. This module contains elements and attributes for describing media objects. Since these elements and attributes are defined in a module, designers of other markup languages can reuse the SMIL media module when they need to include media objects in their language.
Changes with respect to the media object elements in SMIL 1.0 include changes required by basing SMIL on XLink [XLINK], and changes that provide additional functionality that was brought up as Requirements in the Working Group.
ref, animation, audio, img, video, text and textstream elements
These elements can contain all attributes defined for media object elements in SMIL 1.0 with the changes described below, and the additional attributes described below.
clipBegin, clipEnd, clip-begin, clip-end
Using attribute names with hyphens such as "clip-begin" and "clip-end" is problematic when using a scripting language and the DOM to manipulate these attributes. Therefore, this specification adds the attribute names "clipBegin" and "clipEnd" as an equivalent alternative to the SMIL 1.0 "clip-begin" and "clip-end" attributes. The attribute names with hyphens are deprecated. Software supporting SMIL Boston must be able to handle all four attribute names, whereas software supporting only the SMIL media object module does not have to support the attribute names with hyphens. If an element contains both the old and the new version of a clipping attribute, the attribute that occurs later in the text is ignored.
Example:
<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />
The clip begins at second 5 of the audio, and not at second 10, since the "clipBegin" attribute is ignored.
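The precedence rule above depends on the order in which the attributes appear in the text, so an implementation needs to see them in document order. The following Python sketch illustrates the rule under that assumption; the attribute list representation and the function name effective_clip are hypothetical.

```python
CLIP_NAMES = {
    "begin": ("clip-begin", "clipBegin"),
    "end": ("clip-end", "clipEnd"),
}

def effective_clip(attributes, which):
    """Return the clip value that takes effect, or None if there is none.

    attributes: list of (name, value) pairs in the order they occur in the text.
    which: "begin" or "end".
    """
    for name, value in attributes:
        if name in CLIP_NAMES[which]:
            # The first spelling encountered wins; a later one is ignored.
            return value
    return None
```

Applied to the example above, effective_clip([("src", "radio.wav"), ("clip-begin", "5s"), ("clipBegin", "10s")], "begin") returns "5s".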
The syntax of legal values for these attributes is defined by the following BNF:
Clip-value ::= [ Metric "=" ] ( Clock-val | Smpte-val ) |
"name" "=" name-val
Metric ::= Smpte-type | "npt"
Smpte-type ::= "smpte" | "smpte-30-drop" | "smpte-25"
Smpte-val ::= Hours ":" Minutes ":" Seconds
[ ":" Frames [ "." Subframes ]]
Hours ::= Digit Digit
/* see XML 1.0 for a definition of ´Digit´*/
Minutes ::= Digit Digit
Seconds ::= Digit Digit
Frames ::= Digit Digit
Subframes ::= Digit Digit
name-val ::= ([^<&"] | [^<&´])*
/* Derived from BNF rule [10] in [XML]
Whether single or double quotes are
allowed in a name value depends on which
type of quotes is used to quote the
clip attribute value */
This implies the following changes to the syntax defined in SMIL 1.0:
<audio clipBegin="name=song1" clipEnd="name=dj1" />
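To make the structure of clip values more concrete, here is an informal Python sketch of a parser for them. It is not a conforming implementation: Clock-val is defined elsewhere, so such values are passed through as opaque strings, and the sketch accepts a bare value with no metric and no "=" (as in clipBegin="10s"), which is an assumption about the intended default-metric behavior. The function name parse_clip_value and the shape of the returned dictionaries are hypothetical.

```python
import re

SMPTE_TYPES = ("smpte", "smpte-30-drop", "smpte-25")
# Hours ":" Minutes ":" Seconds [ ":" Frames [ "." Subframes ]]
SMPTE_VAL = re.compile(
    r"^([0-9]{2}):([0-9]{2}):([0-9]{2})(?::([0-9]{2})(?:\.([0-9]{2}))?)?$"
)

def parse_clip_value(text):
    """Split a clip attribute value into its kind, metric and value parts."""
    prefix, separator, rest = text.partition("=")
    if not separator:
        # Assumption: no "=" means no metric was given.
        return {"kind": "clock", "metric": None, "value": text}
    if prefix == "name":
        return {"kind": "name", "value": rest}
    if prefix in SMPTE_TYPES:
        match = SMPTE_VAL.match(rest)
        if match is None:
            raise ValueError("malformed SMPTE value: " + rest)
        hours, minutes, seconds, frames, subframes = match.groups()
        return {
            "kind": "smpte",
            "metric": prefix,
            "hours": int(hours),
            "minutes": int(minutes),
            "seconds": int(seconds),
            "frames": int(frames or 0),
            "subframes": int(subframes or 0),
        }
    if prefix in ("npt", ""):
        # The value is left unparsed because Clock-val is defined elsewhere.
        return {"kind": "clock", "metric": prefix or None, "value": rest}
    raise ValueError("unknown metric: " + prefix)
```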
Handling of new syntax in SMIL 1.0 software
Authors can use two approaches for writing SMIL Boston presentations that use the new clipping syntax and functionality ("name", default metric) defined in this specification, but can still be handled by SMIL 1.0 software.
First, authors can use non-hyphenated versions of the new attributes that use the new functionality, and add SMIL 1.0 conformant clipping attributes later in the text.
Example:
<audio src="radio.wav" clipBegin="name=song1" clipEnd="name=moderator1"
clip-begin="0s" clip-end="3:50" />
SMIL 1.0 players implementing the recommended extensibility rules of SMIL 1.0 [SMIL] will ignore the clip attributes using the new functionality, since they are not part of SMIL 1.0. SMIL Boston players, in contrast, will ignore the clip attributes using SMIL 1.0 syntax, since they occur later in the text.
The second approach is to use the following steps:
Example:
<switch>
<audio src="radio.wav" clipBegin="name=song1" clipEnd="name=moderator1"
system-required=
"@@http://www.w3.org/AudioVideo/Group/Media/extended-media-object19990707" />
<audio src="radio.wav" clip-begin="0s" clip-end="3:50" />
</switch>
alt, longdesc
If the content of these attributes is read by a screen-reader, the presentation should be paused while the text is read out, and resumed afterwards.
New Accessibility Attributes
longdesc and alt text are read out by a screen reader for the current document. This value must be a number between 0 and 32767. User agents should ignore leading zeros. The default value is 0.
alt or longdesc attributes are read by a screen reader according to the following rules:
To make SMIL 1.0 media object elements XLink-conformant, the attributes defined in the XLink specification are added as described below.
Note: Due to a limitation in the current XLink draft, only the "src" attribute is treated as an XLink locator. The "longdesc" attribute is treated as a non-XLink linking mechanism (as allowed in Section 8 of the XLink draft). See Appendix for an XLink-conformant equivalent of SMIL 1.0 elements that contain a "longdesc" attribute.
<smil>
<body>
<audio src="audio.wav" xml:attributes="href src" />
</body>
</smil>
When using SMIL in conjunction with the Real Time Transport Protocol (RTP, [RFC1889]), which is designed for real-time delivery of media streams, a media client is required to have initialization parameters in order to interpret the RTP data. These are typically described in the Session Description Protocol (SDP, [RFC2327]). This can be delivered in the DESCRIBE portion of the Real Time Streaming Protocol (RTSP, [RFC2326]), or can be delivered as a file via HTTP.
Since SMIL provides a media description language which often references SDP via RTSP and can also reference SDP files via HTTP, a very useful optimization can be realized by merging parameters typically delivered via SDP into the SMIL document. Since retrieving a SMIL document constitutes one round trip, and retrieving the SDP descriptions referenced in the SMIL document constitutes another round trip, merging the media description into the SMIL document itself can save a round trip in a typical media exchange. This round-trip savings can result in a noticeably faster start-up over a slow network link.
This applies particularly well to two primary usage scenarios:
(see also "The rtpmap element" below)
SDP-related Attributes
Example
<audio src="rtsp://www.w3.org/test.rtp" port="49170-49171"
transport="RTP/AVP" fmt-list="96,97,98" />
Element Content
Media object elements can contain the following elements:
rtpmap element
If the media object is transferred using the RTP protocol, and uses a dynamic payload type, SDP requires the use of the "rtpmap" attribute field. In this specification, this is mapped onto the "rtpmap" element, which is contained in the content of the media object element. If the media object is not transferred using RTP, this element is ignored.
Attributes
encoding-val ::= encoding-name "/" clock-rate "/" encoding-params
encoding-name ::= name-val
clock-rate ::= Digit+
encoding-params ::= ??
Element Content
"rtpmap" is an empty element
Example
<audio src="rtsp://www.w3.org/foo.rtp" port="49170"
transport="RTP/AVP" fmt-list="96,97,98">
<rtpmap payload="96" encoding="L8/8000" />
<rtpmap payload="97" encoding="L16/8000" />
<rtpmap payload="98" encoding="L16/11025/2" />
</audio>
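The correspondence between the attributes in the example above and the SDP fields they stand in for can be sketched as follows. This Python fragment is an informal illustration: it assumes the element name supplies the SDP media type and that the encoding parameters after the second "/" are optional (as in "L8/8000"); the function names sdp_lines and parse_encoding are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_encoding(encoding):
    """Split an rtpmap encoding value into (name, clock rate, parameters)."""
    parts = encoding.split("/")
    if len(parts) not in (2, 3):
        raise ValueError("malformed encoding: " + encoding)
    params = parts[2] if len(parts) == 3 else None
    return parts[0], int(parts[1]), params

def sdp_lines(media_element):
    """Build SDP-style "m=" and "a=rtpmap" lines from a media object element."""
    payloads = media_element.get("fmt-list", "").split(",")
    lines = ["m={} {} {} {}".format(
        media_element.tag,
        media_element.get("port"),
        media_element.get("transport"),
        " ".join(payloads),
    )]
    for rtpmap in media_element.findall("rtpmap"):
        lines.append("a=rtpmap:{} {}".format(
            rtpmap.get("payload"), rtpmap.get("encoding")))
    return lines

EXAMPLE = ET.fromstring(
    '<audio src="rtsp://www.w3.org/foo.rtp" port="49170" '
    'transport="RTP/AVP" fmt-list="96,97,98">'
    '<rtpmap payload="96" encoding="L8/8000" />'
    '<rtpmap payload="97" encoding="L16/8000" />'
    '<rtpmap payload="98" encoding="L16/11025/2" />'
    '</audio>'
)
```

For the example element, sdp_lines(EXAMPLE) yields the media line "m=audio 49170 RTP/AVP 96 97 98" followed by one "a=rtpmap" line per "rtpmap" child.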
A media object referenced by a media object element is often rendered by software modules referred to as media players. These are separate from the software module that provides the synchronization between different media objects in a presentation (referred to as the synchronization engine).
Media players generally support varying levels of control, depending on the constraints of the underlying renderer as well as media delivery, streaming etc. This specification defines 4 levels of support, allowing for increasingly tight integration, and broader functionality. The details of the interface will be presented in a separate document.