
7. SMIL 3.0 Media Object

Editor for SMIL 3.0
Dick Bulterman, CWI
Eric Hyche, RealNetworks.
Editor for SMIL 2.0
Dick Bulterman, CWI
Rob Lanphier, RealNetworks.

7.1 Changes for SMIL 3.0

This section is informative.

There are three major changes to the Media Object modules for SMIL 3.0. First, the SMIL 2.1 MediaParam module is split into two modules, the MediaParam and MediaRenderAttributes modules. Second, the MediaOpacity module is introduced, containing new rendering attributes for chroma key and opacity control. Third, the MediaPanZoom module is introduced. The rationale for these changes is:

  1. The splitting of the SMIL 2.1 MediaParam module provides a better differentiation of functionality, which can help user agent profile designers be more selective in the features they need to support.
  2. The MediaOpacity module is added to define control over various aspects of media opacity using the mediaOpacity, mediaBackgroundOpacity, chromaKey, chromaKeyOpacity, and chromaKeyTolerance attributes.
  3. The MediaPanZoom module defines the viewBox attribute to provide a framework for panning and zooming over media content. (The viewBox attribute is based largely on equivalent functionality in SVG.)

The MediaParam module also includes new text that explicitly discusses the behavior of adding the various media control attributes defined in that section to a SMIL layout region definition as a means of providing a global mechanism for applying default attribute settings to all content rendered within that region.

A number of editorial changes have also been integrated into the various Media Object module descriptions; these do not impact the functionality defined in earlier versions of SMIL.

7.2 Introduction

This section is informative.

This section defines the SMIL media object modules, which are composed of the BasicMedia module and nine modules with additional functionality that build on top of the BasicMedia module: the BrushMedia, MediaClipping, MediaClipMarkers, MediaParam, MediaRenderAttributes, MediaOpacity, MediaAccessibility, MediaDescription, and MediaPanZoom modules. These modules contain elements and attributes used to reference external media objects or control media object rendering behavior. Since these elements and attributes are defined in a series of modules, designers of other markup languages can reuse the SMIL media object modules when they need to include media objects in their language.

The differences between current media object functionality and that provided by the SMIL 1.0 specification are explained in Appendix A.

7.3 Definitions

This section is normative.

SMIL provides a number of timing-related concepts that are used to determine activation, duration and termination of media objects in a presentation. The temporal semantics of these concepts are discussed in the SMIL 3.0 Timing and Synchronization module.

Intrinsic Duration
The duration of a referenced media item based on the temporal properties of that item (defined next), without any explicit SMIL timing markup. Some media objects have a well-defined notion of implicit duration (such as a 7 second audio clip), while other objects do not have well-defined durations (such as a string of plain text). In SMIL, the implicit duration for any media object that does not have a well-defined duration is set to be zero seconds. The implicit duration is used to calculate scheduling information; it is sometimes independent of the actual duration of a media object (such as with a live media stream or with an image with multiple internal frames when no particular duration can be derived by the SMIL scheduler). From a scheduling perspective, an object's intrinsic duration forms the basis for the simple duration of the object during presentation. This duration can be shortened or extended using SMIL timing markup.
Continuous Media
Media objects, such as stored audio or video files, for which there is a measurable and well-understood duration. For example, a five second audio clip is continuous media, because it has a well-understood duration of five seconds. Opposite of "discrete media". See also the definition of continuous media in the Timing module.
Discrete Media
Media objects, such as images or non-timed text data, that have no obvious duration. For example, a JPEG image is generally considered discrete media, because there's nothing in the file indicating how long the JPEG should be displayed. Opposite of "continuous media". See also the definition of discrete media in the Timing module.

The distinction between continuous and discrete media is sometimes arbitrary and may be SMIL renderer dependent. For example, animated images that do not have a well-defined duration (simply a repeating collection of frames) are classified for SMIL scheduling purposes as being discrete media; such objects have an intrinsic scheduling duration of zero seconds.

7.4 SMIL BasicMedia Module

This section is normative.

This module defines the baseline media functionality of a SMIL player.

7.4.1 Media Object Elements - ref, and its synonyms animation, audio, img, text, textstream and video

SMIL defines a single generic media object element that allows the inclusion of external media objects into a SMIL presentation. Media objects are included by reference (using a URI).

ref
Generic media reference

In addition to the ref element, SMIL allows the use of the following set of synonyms:

animation
Animated vector graphics or other animated format
audio
Audio clip
img
Still image, such as PNG or JPEG
text
External text reference
textstream
A text document that includes timing information for the purpose of time-dependent rendering of portions of the text document.
video
Video clip

All of these media elements are semantically identical. When playing back an external media object, the player must not derive the exact type of the media object from the name of the media object element. Instead, it must rely solely on other sources about the type, such as the type information communicated by a server or the operating system, or by using type information contained in the type attribute.

This section is informative.

Authors are encouraged to use meaningful synonyms (animation, audio, img, video, text or textstream) when referencing external media objects, in order to increase the readability of the SMIL document. Some SMIL implementations may require the use of an element type that matches the information type of the object. When in doubt about the group to which a media object belongs, authors should use the generic "ref" element.

The animation element defined here should not be confused with the elements defined in the SMIL 3.0 Animation Module. The animation element defined in this module is used to include an external animation object file (such as a vector graphics animation) by reference. This is in contrast to the elements defined in the Animation module, which provide an in-line syntax for the animation of attributes and properties of other elements.

SMIL 3.0 also supports the smilText element for defining in-line timed text content. This functionality is described in the smilText Modules specification.

Anchors and links can be attached to visual media objects, i.e. media objects rendered on a visual abstract rendering surface.

Attributes Definitions

Languages implementing the SMIL BasicMedia Module must define which attributes may be attached to media object elements. In all languages implementing the SMIL BasicMedia module, media object elements can have the following attributes:

src
The value of the src attribute is the [URI] of the media element, used for locating and fetching the associated media.

The attribute supports fragment identifiers and the '#' connector in the URI value. The fragment part is an id value that identifies one of the elements within the referenced media item. With this construct, SMIL 3.0 supports locators as currently used in HTML (that is, locators of the form http://www.example.org/some/path#anchor1), with the difference that the fragment values are unique identifiers and not the values of "name" attributes. Generally speaking, this type of addressing implies that the target media is of a structured type that supports the concept of id, such as HTML or XML-based languages.

Note that this attribute is not required. A media object with no src attribute has an intrinsic duration of zero, and participates in timing just as any other media element. No media will be fetched by the SMIL implementation for a media element without a src attribute.
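
As a rough illustration of the fragment addressing described above, the following Python sketch splits a src value into the resource URI and the fragment id using the standard library function urllib.parse.urldefrag. The helper name split_src is hypothetical and not part of SMIL; what a user agent does with the fragment is left to the integrating profile.

```python
from urllib.parse import urldefrag


def split_src(src):
    """Split a src value into (resource URI, fragment id or None)."""
    resource, fragment = urldefrag(src)
    # urldefrag returns an empty string when there is no fragment.
    return resource, (fragment or None)


print(split_src("http://www.example.org/some/path#anchor1"))
```

A src value without a '#' part yields None for the fragment, in which case the whole media item is referenced.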

type
Content type of the media object referenced by the src attribute. The usage of this attribute depends on the protocol of the src attribute.
RTSP [RTSP]
The type attribute is used for purposes of content selection and when the type of the referenced media is not otherwise available. It may be overridden by the contents of the RTSP DESCRIBE response or by the static RTP payload number.
HTTP [HTTP]
The type attribute is used for an alternative method of content selection and when the type of the referenced media is not otherwise available. It may override the contents of the "Content-type" field in an HTTP exchange only if a user has allowed such overrides, as specified in the TAG Finding Authoritative Metadata [AM]. The nominal precedence order for type resolution is: via the HTTP content-type field, via the type attribute, and then by using other clues (such as file inspection or use of the file extension).
FTP [FTP] and local file playback URL [URI]
The type attribute value takes precedence over other possible sources of the media type (for instance, the file extension).

When the content represented by a URL is available in many data formats, implementations MAY use the type value to influence which of the multiple formats is used. For instance, on a server implementing HTTP content negotiation, the client may use the type attribute to order the preferences in the negotiation. The type attribute is not intended for use in media sub-stream selection.

For protocols not enumerated in this specification, implementations should use the following rules: When the media is encapsulated in a media file and delivered intact to the SMIL user agent via a protocol designed for delivery as a complete file, the media type as provided by this protocol should take precedence over the type attribute value. For protocols which deliver the media in a media-aware fashion, such as those delivering media in a manner using or dependent upon the specific type of media, the application of the type attribute is not defined by this specification.
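
The precedence rules for the enumerated protocols can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name resolve_media_type and its parameters are hypothetical, the RTSP DESCRIBE response and the static RTP payload number are folded into a single protocol_type argument, and protocols not enumerated above are rejected rather than modeled.

```python
def resolve_media_type(scheme, protocol_type=None, type_attr=None,
                       other_clue=None, user_allows_override=False):
    """Pick a media type using the nominal precedence for each protocol.

    scheme        -- URI scheme of the src attribute ("http", "rtsp", "ftp", "file")
    protocol_type -- type reported by the protocol (HTTP Content-type, RTSP DESCRIBE)
    type_attr     -- value of the SMIL type attribute
    other_clue    -- type guessed by file inspection or from the file extension
    """
    if scheme == "http":
        # The type attribute may override Content-type only with user consent.
        if user_allows_override and type_attr:
            return type_attr
        candidates = [protocol_type, type_attr, other_clue]
    elif scheme == "rtsp":
        # The type attribute may be overridden by what the protocol reports.
        candidates = [protocol_type, type_attr, other_clue]
    elif scheme in ("ftp", "file"):
        # The type attribute takes precedence over other sources.
        candidates = [type_attr, other_clue]
    else:
        raise ValueError("protocol not covered by this sketch: " + scheme)
    # Return the first source that actually supplied a type, if any.
    return next((c for c in candidates if c), None)
```

For example, resolve_media_type("file", None, "audio/wav", "audio/basic") selects "audio/wav", because for local files the type attribute outranks the file extension.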

Element Content

Languages utilizing the SMIL BasicMedia module must define the complete set of elements which may act as children of media object elements. There are currently no required children of a media object defined in the BasicMedia Module, but languages utilizing the BasicMedia module may impose requirements beyond this specification.

7.4.2 Integration Requirements

If the including profile supports the XMLBase functionality [XMLBase], the values of the src and longdesc attributes on the media object elements must be interpreted in the context of the relevant XMLBase URI prefix.
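
A minimal sketch of this interpretation, assuming that standard relative reference resolution is what the profile intends: the Python standard library function urllib.parse.urljoin resolves a src value against a base URI. The helper name resolve_src and the example URIs are illustrative only.

```python
from urllib.parse import urljoin


def resolve_src(xml_base, src):
    """Resolve a src (or longdesc) value against the in-scope XML Base URI."""
    return urljoin(xml_base, src)


# A relative src is resolved against the base URI.
print(resolve_src("http://www.example.org/media/", "clips/intro.mpg"))
# An absolute src is left unchanged.
print(resolve_src("http://www.example.org/media/", "http://www.example.com/logo.png"))
```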

User-agent implementations are responsible for defining the rendering behavior when fragment addressing is used in the src attribute. Such definitions should be added to language profiles that wish to include specific media addressing features. For example:
- User-agents should define the default behavior when the src attribute references an id that does not exist in the target media document.
- User-agents should define the rendering method for the selected media fragment: in context, with or without highlighting and scrolling, or stand-alone (selective rendering only).
- User-agents should describe the timing implications of addressing timed content.

SMIL 3.0 allows but does not require user agents to be able to process XPointer values in the URI value of the src attribute. The SMIL 3.0 Linking Module provides additional information related to XPointer.

7.5 SMIL MediaParam Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaParam Module definition. The MediaParam module is intended to provide a uniform mechanism for media object initialization. Languages implementing elements and attributes found in the MediaParam module must implement all elements and attributes defined below, as well as BasicMedia.

7.5.1 The param element

The param element allows a general parameter value to be sent to a media object renderer as a name/value pair. This parameter is sent to the renderer at the time that the media object is processed by the scheduler. It is up to the media renderer to associate an action with the given param. The media renderer may choose to ignore any unknown or inappropriate param values (such as sending a font size to an audio object).

Any number of param elements may appear (in any order) in the content of a media object element or in a paramGroup element. If a given parameter is defined multiple times, the lexically last version of that parameter value should be used.

The syntax of names and values is assumed to be understood by the object's implementation. The SMIL specification does not specify how user agents should retrieve name/value pairs.

Attribute definitions
name
(CDATA) This attribute defines the name of a run-time parameter, assumed to be known by the inserted object. Whether the property name is case-sensitive depends on the specific object implementation.
value
(CDATA) This attribute specifies the value of a run-time parameter specified by name. Property values have no meaning to SMIL; their meaning is determined by the object in question.
valuetype
["data"|"ref"|"object"] This attribute specifies the type of the value attribute. Possible values:
  • data: This is the default value for the attribute. It means that the value specified by value will be evaluated and passed to the object's implementation as a string.
  • ref: The value specified by value is a URI [URI] that designates a resource where run-time values are stored. This allows support tools to identify URIs given as parameters. The URI must be passed to the object as is, i.e., unresolved.
  • object: The value specified by value is an identifier that refers to a media object declaration in the same document. The identifier must be the value of the id attribute set for the declared media object element.
type
This attribute specifies the content type of the resource designated by the value attribute only in the case where valuetype is set to "ref". This attribute thus specifies for the user agent, the type of values that will be found at the URI designated by value. See 6.7 Content Type in [HTML4] for more information.

Example

This section is informative.

To illustrate the use of param, suppose that we have a facial animation plug-in that is able to accept different moods and accessories associated with characters. These could be defined in the following way:
<ref src="http://www.example.com/herbert.face">
  <param name="mood" value="surly" valuetype="data"/>
  <param name="accessories" value="baseball-cap,nose-ring" valuetype="data"/>
</ref>

7.5.2 The paramGroup element

The paramGroup element provides a convenience mechanism for defining a collection of media parameters that may be reused with several different media objects. If present, the paramGroup element must appear in the head section of the document. The content of the paramGroup element consists of zero or more param elements. The paramGroup element may not contain nested paramGroup element definitions.

Element attributes

id
This attribute specifies the ID by which the param group is referenced in a media object reference.

Examples

This section is informative.

This section contains several fragments that illustrate uses of the paramGroup element.

In the following fragment, a paramGroup is created to define parameters that are passed to several different media objects:

<smil ... >
  <head>
    ...
    <paramGroup id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown"/>
    ...
    <ref src="http://www.example.com/sally.face" paramGroup="clown"/>
    ...
  </body>
</smil>

In the following example, a media object provides an additional param value:

<smil ... >
  <head>
    ...
    <paramGroup id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown">
      <param name="gender" value="male"/>
    </ref>
    ...
  </body>
</smil>

In this final example, a media object provides a duplicate param value. The behavior in this case depends on the media renderer; all param values are passed to the renderer in the lexical order of the SMIL source file. It is expected that the lexically last value for any parameter sent to the renderer be used, if possible.

<smil ... >
  <head>
    ...
    <paramGroup id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown">
      <param name="gender" value="male"/>
      <param name="mood" value="depressed" valuetype="data"/>
    </ref>
    ...
  </body>
</smil>
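
The expected "lexically last value wins" outcome for this final example can be sketched in Python with the standard xml.etree.ElementTree module. This models one possible renderer policy, not required behavior; the function name effective_params is hypothetical, and the elided parts ("...") of the example are simply left out so that the fragment parses as XML.

```python
import xml.etree.ElementTree as ET

DOC = """
<smil>
  <head>
    <paramGroup id="clown">
      <param name="mood" value="upBeat" valuetype="data"/>
      <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
  </head>
  <body>
    <ref src="http://www.example.com/andy.face" paramGroup="clown">
      <param name="gender" value="male"/>
      <param name="mood" value="depressed" valuetype="data"/>
    </ref>
  </body>
</smil>
"""


def effective_params(root, media):
    """Collect name/value pairs in lexical order; later values replace earlier ones."""
    params = {}
    group_id = media.get("paramGroup")
    if group_id is not None:
        # A reference to a paramGroup that does not exist is ignored.
        for group in root.iter("paramGroup"):
            if group.get("id") == group_id:
                for p in group.findall("param"):
                    params[p.get("name")] = p.get("value")
    # The media object's own param children come later in the source file.
    for p in media.findall("param"):
        params[p.get("name")] = p.get("value")
    return params


root = ET.fromstring(DOC)
media = root.find("./body/ref")
print(effective_params(root, media))
```

Under this policy the mood parameter resolves to "depressed", while accessories and gender are passed through unchanged.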

7.5.3 Element Attributes for Media Object Initialization

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may add the media initialization attribute defined below.

paramGroup
Used to specify the name of a paramGroup that was defined in the document head. The value is a single IDREF that refers to the ID of a paramGroup element. If the named paramGroup does not exist, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

7.5.4 Integration Requirements

Any profile that integrates the functionality of this module is strongly encouraged to define a set of common parameter names that may be used to initialize common media object types for that profile. This can significantly increase interoperability of user agents and media rendering libraries.

The supported uses of the type and valuetype attributes on the param element must be specified by the integrating profile. If a profile does not specify this, the type and valuetype attributes will be ignored in that profile.

7.6 SMIL MediaRenderAttributes Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaRenderAttributes Module definition. Languages implementing elements and attributes found in the MediaRenderAttributes module must implement all elements and attributes defined below, as well as BasicMedia.

7.6.1 Elements

This module does not define any elements.

7.6.2 Element Rendering Attributes for All Media Objects

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may have the attributes and attribute extensions defined below.

erase
Controls the behavior of the media object after the effects of any timing are complete. For example, when SMIL Timing is applied to a media element, erase controls the display of the media once the active duration of the element and the freeze period defined by the fill attribute are complete (see SMIL Timing and Synchronization module). If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

whenDone (default)
When this is specified (or implied) the media removal occurs at the end of any applied timing.
never
When this value is specified, the last state of the media is kept displayed until the display area is reused (or if the display area is already being used by another media object). Any profile that integrates this element must define what is meant by "display area" and further define the interaction. Intrinsic hyperlinks (e.g., Flash, HTML) and explicit hyperlinks (e.g., area, a) stay active as long as the hyperlink is displayed. If timing is re-applied to an element, the effect of the erase=never is cleared. For example, when an element is restarted according to the SMIL Timing and Synchronization module, the element is cleared immediately before it restarts.

Example:

This section is informative.

<par>
  <seq>
    <par>
      <img src="image1.jpg" region="foo1" fill="freeze" erase="never" .../>
      <audio src="audio1.au"/>        
    </par>

    <par>
      <img src="image2.jpg" region="foo2" fill="freeze" erase="never" .../>
      <audio src="audio2.au"/>        
    </par>
     ...
    <par>
      <img src="imageN.jpg" region="fooN" fill="freeze" erase="never" .../>
      <audio src="audioN.au"/>        
    </par>
  </seq>
</par>

In this example, each image is successively displayed and remains displayed until the end of the presentation.

mediaRepeat
Used to strip the intrinsic repeat value of the underlying media object. The interpretation of this attribute is specific to the media type of the media object, and is only applicable to those media types for which there is a definition of a repeat value found in the media type format specification. Media type viewers used in SMIL implementations will need to expose an interface for controlling the repeat value of the media for this attribute to be applied. For all media types where there is an expectation of interoperability between SMIL implementations, there should be a formal specification of the exact repeat value to which the mediaRepeat attribute applies. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

strip
Strip the intrinsic repeat value of the media object.
preserve (default)
Leave the intrinsic repeat value of the media object intact.

As an example of how this would be used, many animated GIFs intrinsically repeat indefinitely. The application of mediaRepeat="strip" allows an author to remove the intrinsic repeat behavior of an animated GIF on a per-reference basis, causing the animation to display only once, regardless of the repeat value embedded in the GIF.

When mediaRepeat is used in conjunction with SMIL Timing Module attributes, this attribute is applied first, so that the repeat behavior can then be controlled with the SMIL Timing Module attributes such as repeatCount and repeatDur.

sensitivity
Used to provide author control over the sensitivity of media to user interface selection events, such as the SMIL 2.1 activateEvent, and hyperlink activation. If the media is sensitive at the event location, it captures the event, and will not pass the event through to underlying media objects.  If not, it allows the event to be passed through to any media objects lower in the display hierarchy. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

opaque
The media is sensitive to user interface selection events over the entire area of the media.  This is the default.
transparent
The media is not sensitive to user interface selection events over the entire area of the media. Any user interface selection events will be "passed through" to any underlying media.
percentage-value
The media sensitivity to user interface selection events is dependent upon the opacity of the media at the location of the event (the alpha channel value). If rendered media supports an alpha channel and the opacity of the media is less than the given percentage value at the event location, the behavior will be transparent as specified above. Otherwise the behavior will be as opaque. Valid values are non-negative CSS2 percentage values.
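
The three sensitivity values can be summarized with a small Python sketch. The function name captures_event and the representation of the alpha value as a number from 0 to 100 (or None when the rendered media has no alpha channel) are assumptions made for illustration; validation of the percentage syntax is omitted.

```python
def captures_event(sensitivity, alpha_percent=None):
    """Return True if the media captures a selection event at a given point.

    sensitivity   -- "opaque", "transparent", or a percentage such as "50%"
    alpha_percent -- opacity of the media at the event location (0-100),
                     or None if the rendered media has no alpha channel
    """
    if sensitivity == "opaque":
        return True
    if sensitivity == "transparent":
        return False
    threshold = float(sensitivity.rstrip("%"))
    if alpha_percent is not None and alpha_percent < threshold:
        return False  # behaves as "transparent": the event passes through
    return True  # behaves as "opaque": the event is captured
```

With sensitivity="50%", an event over a pixel that is 20% opaque is passed to the underlying media, while an event over a pixel that is 80% opaque is captured.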

7.6.3 Integration Requirements

Any profile that supports the erase attribute must define what is meant by "display area" and further define the interaction. See the definition of erase for more details.

7.7 SMIL MediaOpacity Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaOpacity Module definition. Languages implementing elements and attributes found in the MediaOpacity module must implement all elements and attributes defined below, as well as BasicMedia.

7.7.1 Elements

This module does not define any elements.

7.7.2 Element Attributes for All Media Objects

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may have the attributes and attribute extensions defined below.

chromaKey
This attribute defines the color to be used for chroma key opacity manipulation. It accepts a single CSS2 color value. If media objects or implementations cannot support manipulation of the chroma key value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
chromaKeyOpacity
This attribute defines the opacity of the chroma key value defined with the chromaKey attribute. It accepts a percentage value in the range 0-100%, with 100% meaning fully opaque. If a chroma key color is defined, the default value is 0% (fully transparent). If no chroma key color is defined or if implementations cannot support manipulation of the media opacity value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
chromaKeyTolerance
This attribute defines a color value that specifies the tolerance that is added to and subtracted from the effective chroma key. If a chroma key color was defined, the default value of this attribute is #000000. If no chroma key color was defined or if implementations cannot support manipulation of the chroma key value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
mediaOpacity
This attribute defines the opacity of the media object. It accepts a percentage value in the range 0-100%, with 100% meaning fully opaque. If implementations cannot support manipulation of the media opacity value, this attribute is ignored. The default value of this attribute is 100%. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
mediaBackgroundOpacity
This attribute defines the background color opacity of the media object for media objects that explicitly define a media background color. It accepts a percentage value in the range 0-100%, with 100% meaning fully opaque. If either media objects or implementations cannot support manipulation of the media background color opacity, this attribute is ignored. The default value of this attribute is 100%. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all media background opacity displayed within that region.

This section is informative.

The attributes in this module allow the opacity (that is, the degree to which a media object is transparent) to be defined. Opacity can be controlled in several ways, depending on the type of media being used. For unstructured media (that is, media that does not contain an explicitly-defined background color), the chromaKey attribute can be used to identify a particular color that will serve as the background color for purposes of opacity manipulation. If a chromaKey is used, the chromaKeyOpacity attribute can specify the degree of transparency desired. Since the color used to define a background may not be exactly preserved within a media object, the chromaKeyTolerance attribute allows a tolerance range to be defined for the chroma key color.
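
One way to picture the chroma key tolerance is as a per-channel range around the key color. The Python sketch below makes that assumption explicit: this module does not spell out how the tolerance color is applied, so the per-channel comparison, the restriction to #rrggbb notation, and the function names are all illustrative choices rather than defined behavior.

```python
def parse_hex_color(value):
    """Convert '#rrggbb' into an (r, g, b) tuple of integers."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def matches_chroma_key(pixel, chroma_key, tolerance="#000000"):
    """Return True if each channel of pixel is within tolerance of the key.

    Assumption: the tolerance color is added to and subtracted from the
    key channel by channel, giving an inclusive range per channel.
    """
    key = parse_hex_color(chroma_key)
    tol = parse_hex_color(tolerance)
    return all(abs(p - k) <= t
               for p, k, t in zip(parse_hex_color(pixel), key, tol))
```

With the default tolerance of #000000 only an exact color match is treated as the chroma key; a tolerance such as #101010 also matches colors that drifted slightly, for instance through lossy compression.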

Some media objects, such as RealText, smilText, GIF, PNG, and Flash, define an explicit background color. In these cases, the specification of the opacity of that color can be done using the mediaBackgroundOpacity attribute. In these cases, only the defined color is manipulated.

In addition to specifying the transparency level of a particular background color, SMIL also allows the specification of the transparency level of a total media object. This is accomplished using the mediaOpacity attribute.

Note that SMIL layout also defines the backgroundOpacity attribute to control the transparency of a layout region.

7.7.3 Integration Requirements

This module does not introduce any special integration constraints.

7.8 SMIL MediaClipping Module

This section is normative.

This section defines the attributes that make up the SMIL MediaClipping Module definition. Languages implementing the attributes found in the MediaClipping module must implement the attributes defined below, as well as BasicMedia.

7.8.1 MediaClipping Attributes

clipBegin (clip-begin)
The clipBegin attribute specifies the beginning of a sub-clip of a continuous media object as offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media.
Values in the clipBegin attribute have the following syntax:
Clip-value-MediaClipping ::= [ Metric "=" ] ( Clock-val | Smpte-val )
Metric            ::= Smpte-type | "npt"
Smpte-type        ::= "smpte" | "smpte-30-drop" | "smpte-25"
Smpte-val         ::= Hours ":" Minutes ":" Seconds
                      [ ":" Frames [ "." Subframes ]]
Hours             ::= Digit+
                  /* see XML 1.0 for a definition of 'Digit' */
Minutes           ::= Digit Digit; range from 00 to 59
Seconds           ::= Digit Digit; range from 00 to 59
Frames            ::= Digit Digit; smpte range = 00-29, smpte-30-drop range = 00-29, smpte-25 range = 00-24
Subframes         ::= Digit Digit; smpte range = 00-01, smpte-30-drop range = 00-01, smpte-25 range = 00-01

The value of this attribute consists of a metric specifier, followed by a time value whose syntax and semantics depend on the metric specifier. The following formats are allowed:

SMPTE Timestamp
SMPTE time codes [SMPTE] can be used for frame-level access accuracy. The metric specifier can have the following values:
smpte
smpte-30-drop
These values indicate the use of the "SMPTE 30 drop" format (approximately 29.97 frames per second), as defined in the SMPTE specification (also referred to as "NTSC drop frame"). The "frames" field in the time value can assume the values 0 through 29. The difference between 30 and 29.97 frames per second is handled by dropping the first two frame indices (values 00 and 01) of every minute, except every tenth minute.
smpte-25
The "frames" field in the time specification can assume the values 0 through 24. This corresponds to the PAL standard as noted in [SMPTE].

The time value has the format hours:minutes:seconds:frames.subframes. If the subframe value is zero, it may be omitted. Subframes are measured in one-hundredths of a frame.
Examples:
clipBegin="smpte=10:12:33"

This section is informative.

The introduction of subframe notation in SMIL 2.1 introduced an inconsistency with SMIL 1.0. As of this draft, SMIL 3.0 has deprecated the subframe notation.

Normal Play Time
Normal Play Time expresses time in terms of SMIL clock values. The metric specifier is "npt", and the syntax of the time value is identical to the syntax of SMIL clock values.
Examples:
clipBegin="npt=123.45s"
clipBegin="npt=12:05:35.3"
Marker
Not defined in this module. See clipBegin Media Marker attribute extension in the MediaClipMarkers module.

If no metric specifier is given, then a default of "npt=" is presumed.
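
As an informal illustration of the value syntax above, the following Python sketch splits a clip value into its metric specifier and time value, applies the "npt" default, and checks SMPTE fields against the ranges listed in the grammar. The names split_clip_value and parse_smpte are hypothetical and not part of SMIL. Subframes are parsed but not range-checked, since the notation is deprecated.

```python
import re

# Highest legal frame index per SMPTE metric, as listed in the grammar above.
MAX_FRAME = {"smpte": 29, "smpte-30-drop": 29, "smpte-25": 24}

# Hours ":" Minutes ":" Seconds [ ":" Frames [ "." Subframes ]]
SMPTE_VALUE = re.compile(
    r"([0-9]+):([0-9]{2}):([0-9]{2})(?::([0-9]{2})(?:\.([0-9]{2}))?)?"
)


def split_clip_value(value):
    """Return (metric, time_value); a missing metric defaults to "npt"."""
    metric, sep, rest = value.partition("=")
    if not sep:
        return "npt", value.strip()
    return metric.strip(), rest.strip()


def parse_smpte(metric, time_value):
    """Return (hours, minutes, seconds, frames, subframes) or raise ValueError."""
    if metric not in MAX_FRAME:
        raise ValueError("not a SMPTE metric: " + metric)
    match = SMPTE_VALUE.fullmatch(time_value)
    if match is None:
        raise ValueError("malformed SMPTE value: " + time_value)
    # Omitted frames and subframes are treated as zero.
    hours, minutes, seconds, frames, subframes = (
        int(field) if field is not None else 0 for field in match.groups()
    )
    if minutes > 59 or seconds > 59:
        raise ValueError("minutes and seconds must be in the range 00-59")
    if frames > MAX_FRAME[metric]:
        raise ValueError("frame index out of range for " + metric)
    return hours, minutes, seconds, frames, subframes
```

With these helpers, "smpte=10:12:33" splits into the metric "smpte" and the time value "10:12:33", while a bare "123.45s" is treated as an "npt" value.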

When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

clipBegin may also be expressed as clip-begin for compatibility with SMIL 1.0. Software supporting the SMIL 2.1 Language Profile must be able to handle both clipBegin and clip-begin, whereas software supporting only the SMIL MediaClipping module only needs to support clipBegin. If an element contains both a clipBegin and a clip-begin attribute, then clipBegin takes precedence over clip-begin.

Example:

<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />

The clip begins at second 10 of the audio, and not at second 5, since the clip-begin attribute is ignored. A strict SMIL 1.0 implementation will start the clip at second 5 of the audio, since the clipBegin attribute will not be recognized by that implementation. See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.
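
The precedence rule can be sketched in a few lines of Python. This is only an illustration: the helper name effective_clip and the dictionary representation of an element's attributes are assumptions made for the example.

```python
def effective_clip(attributes, new_name, old_name):
    """Prefer the camelCase attribute; fall back to the SMIL 1.0 spelling."""
    if new_name in attributes:
        return attributes[new_name]
    return attributes.get(old_name)


# Attributes of the audio element from the example above.
audio = {"src": "radio.wav", "clip-begin": "5s", "clipBegin": "10s"}
print(effective_clip(audio, "clipBegin", "clip-begin"))  # prints 10s
```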

clipEnd (clip-end)
The clipEnd attribute specifies the end of a sub-clip of a continuous media object as an offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media. It uses the same attribute value syntax as the clipBegin attribute.
If the value of the clipEnd attribute exceeds the duration of the media object, the value is ignored, and the clip end is set equal to the effective end of the media object. clipEnd may also be expressed as clip-end for compatibility with SMIL 1.0. Software supporting the SMIL 2.1 Language Profile must be able to handle both clipEnd and clip-end, whereas software supporting only the SMIL MediaClipping module only needs to support clipEnd. If an element contains both a clipEnd and a clip-end attribute, then clipEnd takes precedence over clip-end. When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.
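
The handling of a clipEnd value that exceeds the media duration can be sketched as follows. The function name effective_clip_end is hypothetical, and the sketch assumes that both values have already been converted to seconds and that the effective end of the media equals its duration.

```python
def effective_clip_end(clip_end, media_duration):
    """Return the clip end in seconds from the start of the media.

    clip_end is None when the element carries no clipEnd attribute.
    """
    if clip_end is None or clip_end > media_duration:
        # An absent or oversized clipEnd is ignored: play to the media's end.
        return media_duration
    return clip_end
```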

7.9 SMIL MediaClipMarkers Module

This section is normative.

This section defines the attribute extensions that make up the SMIL MediaClipMarkers Module definition. Languages implementing elements and attributes found in the MediaClipMarkers module must implement all elements and attributes defined below, as well as BasicMedia and MediaClipping.

7.9.1 MediaClipMarkers Attribute Extensions

clipBegin Media Marker attribute extension
Used to define a clip using named time points in a media object, rather than using clock values or SMPTE values. The metric specifier is "marker", and the marker value is a URI (see [URI]). The URI is relative to the src attribute, rather than to the document root or the XML base of the SMIL document.

Clip-value-MediaClipMarkers ::= Clip-value-MediaClipping |
                      "marker" "=" URI-reference
   /* "URI-reference" is defined in  [URI]  */

Example: Assume that a recorded radio transmission consists of a sequence of songs, which are separated by announcements by a disk jockey. The audio format supports marked time points, and the beginning of each song or announcement with number X is marked as songX or djX respectively. To extract the first song using the "marker" metric, the following audio media element can be used:

<audio clipBegin="marker=#song1" clipEnd="marker=#dj1" />
clipEnd Media Marker attribute extension
clipEnd media markers use the same attribute value syntax as the clipBegin Media Marker attribute extension. For the complete description, see clipBegin Media Marker attribute extension.
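
Because a marker URI is resolved against the src attribute, a user agent might combine the two along the following lines. This is a sketch only: resolve_marker is a hypothetical helper, and the src value "radio.wav" is an assumption borrowed from the other examples in this chapter.

```python
from urllib.parse import urljoin


def resolve_marker(src, clip_value):
    """Resolve the URI in a "marker=..." clip value against the media src."""
    metric, sep, reference = clip_value.partition("=")
    if not sep or metric != "marker":
        raise ValueError("not a marker clip value: " + clip_value)
    return urljoin(src, reference)


print(resolve_marker("radio.wav", "marker=#song1"))  # prints radio.wav#song1
```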

7.10 SMIL BrushMedia Module

This section is normative.

This section defines the elements and attributes that make up the SMIL BrushMedia Module definition. Languages implementing elements and attributes found in the BrushMedia module must implement all elements and attributes defined below.

7.10.1 The brush element

The brush element is a lightweight media object element which allows an author to paint a solid color in place of a media object. Attributes associated with media objects may also be applied to the brush element. (A specific profile will determine the attribute set applied to this element.)

Attribute definitions
color
The use and definition of this attribute are identical to the "background-color" property in the CSS2 specification.

7.10.2 Integration Requirements

Profiles including the BrushMedia module must provide semantics for using a color attribute value of inherit on the brush element. Because inherit doesn't make sense in all contexts, the value of inherit is prohibited on the color attribute of the brush element for profiles that do not otherwise define these semantics.

7.11 SMIL MediaAccessibility Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaAccessibility Module definition. Languages implementing elements and attributes found in the MediaAccessibility module must implement all elements and attributes defined below, as well as MediaDescription.

7.11.1 MediaAccessibility Attributes

Attribute definitions
alt
For user agents that cannot display a particular media object, this attribute specifies alternate text. alt may be displayed in addition to the media, or instead of media when the user has configured the user agent to not display the given media type.

It is strongly recommended that all media object elements have an "alt" attribute with a brief, meaningful description. Authoring tools should ensure that no element can be introduced into a SMIL document without this attribute.

The value of this attribute is a CDATA text string.

longdesc
This attribute specifies a link ([URI]) to a long description of a media object. This description should supplement the short description provided using the alt attribute or the abstract attribute. When the media object has associated hyperlinked content, this attribute should provide information about the hyperlinked content.

readIndex
This attribute specifies the position of the current element in the order in which longdesc, title and alt text are read aloud by assistive devices (such as screen readers) for the current document. User agents should ignore leading zeros. The default value is 0.

Elements that contain alt, title or longdesc attributes are read by the assistive technology according to the following rules:

  • Those elements that assign a positive value to the readIndex attribute are read out first. Navigation proceeds from the element with the lowest readIndex value to the element with the highest value. Values need not be sequential, nor must they begin with any particular value. Elements that have identical readIndex values should be read out in the order they appear in the character stream of the document.
  • Those elements that assign it a value of "0" are read out in the order they appear in the character stream of the document.
  • Elements in a switch element that have test attributes which evaluate to "false" are not read out.

Example

<par>
  <video id="carvideo" src="car.rm" region="videoregion" title="Car video"
         alt="Illustration of relativistic time dilation and length 
              contraction." 
         longdesc="carvideodesc.html" readIndex="3"/>
  <audio id="caraudio" src="caraudio.rm" region="videoregion" 
         title="Car presentation voiceover" begin="bar.begin"/>
  <animation id="cardiagram" src="car.svg" region="animregion" 
         title="Diagram of the car" readIndex="2"/>
  <img id="scvad" src="scv.png" region="videoregion" 
         title="Advertisement for Sugar Coated Vegetables"
         readIndex="1"/>
</par>

In this example, an assistive device that is presenting titles should present the "scvad" element title first (having the lowest readIndex value of "1"), followed by the "cardiagram" title, followed by the "carvideo" element title, and finally present the "caraudio" element title (having an implicit readIndex value of "0").
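
The ordering rules can be expressed as a short Python sketch. The function name reading_order and the (id, readIndex) pair representation are assumptions made for illustration; elements excluded by a switch are assumed to have been filtered out beforehand.

```python
def reading_order(elements):
    """elements: list of (id, readIndex) pairs in document order."""
    positive = [e for e in elements if e[1] > 0]
    zero = [e for e in elements if e[1] == 0]
    # list.sort() is stable, so equal readIndex values keep document order.
    positive.sort(key=lambda e: e[1])
    return [e[0] for e in positive + zero]


# The four media objects of the example, in document order.
example = [("carvideo", 3), ("caraudio", 0), ("cardiagram", 2), ("scvad", 1)]
print(reading_order(example))
# prints ['scvad', 'cardiagram', 'carvideo', 'caraudio']
```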

7.12 SMIL MediaDescription Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaDescription Module definition. Languages implementing elements and attributes found in the MediaDescription module must implement all elements and attributes defined below.

7.12.1 MediaDescription Attributes

Attribute definitions
abstract
A brief description of the content contained in the element. Unlike alt, this attribute is generally not displayed as alternate content to the media object. It is typically used as a description when table of contents information is generated from a SMIL presentation, and typically contains more information than would be advisable to put in an alt attribute.

This attribute is deprecated in favor of using appropriate SMIL metadata markup in RDF. For example, this attribute maps well to the "description" attribute as defined by the Dublin Core Metadata Initiative [DC].

author
The name of the author of the content contained in the element.

The value of this attribute is a CDATA text string.

copyright
The copyright notice of the content contained in the element.

The value of this attribute is a CDATA text string.

title
The title attribute as defined in the SMIL Structure module. It is strongly recommended that all media object elements have a title attribute with a brief, meaningful description. Authoring tools should ensure that no element can be introduced into a SMIL document without this attribute.
xml:lang
Used to identify the natural or formal language for the element. For a complete description, see section 2.12 Language Identification of [XML11].

xml:lang differs from the systemLanguage test attribute in one important respect. xml:lang provides information about the content's language independent of what implementations do with the information, whereas systemLanguage is a test attribute with specific associated behavior (see systemLanguage in the SMIL Content Control Module for details).

This section is informative.

SMIL 3.0 also supports the use of elements from the MetaInformation Module to supply additional or alternative forms of metainformation for any media object.

7.13 MediaPanZoom Module

This section is normative.

7.13.1 Overview

This section is informative.

The SMIL MediaPanZoom module integrates the functionality of the SVG viewBox attribute and adapts it for use within the SMIL media framework. The SMIL viewBox attribute allows a SMIL author to define a two-dimensional extent over the visible surface of a media object and to subsequently project the contents within the viewBox into a SMIL presentation.

Most of SMIL's layout elements and attributes provide the ability to define and manage a two-dimensional rendering space. This space is defined relative to a root-layout (or topLayout) specification. All of the coordinate and size specifications are in terms of the coordinate space defined for the layout root. In contrast, the viewBox attribute allows users to define an area in terms of the coordinate space used by the media object that is associated with the viewBox. The viewBox may define an area that is smaller, equal to, or larger than the related media object.

The following illustration shows three views of a 300x200 pixel image. In the left view, a viewBox is shown that is the same size as the media object; in the middle view, a viewBox is defined that covers the middle part of the image only; in the right view, a viewBox is illustrated that is positioned (in both dimensions) partially outside the media object. Note that while this illustration shows the viewBox projected onto an image, similar illustrations could be defined for videos or text objects, or any other object that can be mapped to a particular media bounding box.

Picture showing a base image and three viewBox examples

Once a portion of a media object's visible area is defined with a viewBox, the portion within the viewBox is processed further as if it defined the full native view of the media object. The area within the viewBox is projected into a region in a manner that is dependent on the region element associated with that object, including any scaling dictated by the fit attribute or (if appropriate) sub-region positioning and alignment directives.

If the region and the viewBox have the same aspect ratios, then the viewBox will, by default, fill the entire region. If the effective pixel dimensions of the region are larger than those of the viewBox, the effect will be an enlargement of the media content. If the effective pixel dimensions of the region are smaller than those of the viewBox, the effect will be a reduction in size of the media object. Other effects can be obtained by manipulating the fit attribute of the region.

If supported by the profile implementing this module, a dynamic pan-and-zoom effect can be obtained by applying standard SMIL animation primitives to the dimensions of the viewBox. A pan effect may be obtained by varying the X and Y positioning values, and a zoom effect can be obtained by changing the size dimensions of the viewBox. Examples of these effects are given later in this section.

If a viewBox extends past the viewable extents of a media object (such as in the rightmost illustration, above), then the effective contents of these extended areas will be transparent.
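
To make the treatment of out-of-bounds areas concrete, the following Python sketch computes which part of a viewBox is actually covered by media content; everything else inside the viewBox would be transparent. The function name visible_part and the tuple layout (min-x, min-y, width, height) in pixels are assumptions made for this illustration.

```python
def visible_part(view_box, media_width, media_height):
    """Return the (x, y, width, height) of the viewBox area covered by media.

    Returns None when the viewBox lies entirely outside the media object.
    """
    x, y, width, height = view_box
    left = max(x, 0)
    top = max(y, 0)
    right = min(x + width, media_width)
    bottom = min(y + height, media_height)
    if right <= left or bottom <= top:
        return None
    return (left, top, right - left, bottom - top)


# A viewBox hanging over the right and bottom edges of a 300x200 image.
print(visible_part((240, 120, 85, 110), 300, 200))  # prints (240, 120, 60, 80)
```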

7.13.2 Elements and Attributes for the MediaPanZoom Module

This module does not define any new elements. It provides extensions to the ref element (and its synonyms), and to the region element.

The ref Element

The viewBox attribute is added to media object references.

Element attributes
viewBox
This attribute specifies a rectangular area in media coordinates that defines the portion of a media object that is to be used within a SMIL presentation. The value of the viewBox attribute is an ordered list of four numbers, separated by whitespace and/or a comma:
min-x
A value that defines the minimum X coordinate of a rectangle in media space that serves as the X origin of the viewBox. A value of '0' represents the left edge of the media object.
min-y
A value that defines the minimum Y coordinate of a rectangle in media space that serves as the Y origin of the viewBox. A value of '0' represents the top edge of the media object.
width
A non-negative length value (using CSS2 pixel or non-negative percentage values) that defines the horizontal dimension of the viewBox. If pixel notation is used, the 'px' suffix may be omitted. A negative value is an error. The default value of width is auto.
height
A non-negative length value (using CSS2 pixel or non-negative percentage values) that defines the vertical dimension of the viewBox. If pixel notation is used, the 'px' suffix may be omitted. A negative value is an error. The default value of height is auto.
The default viewBox behavior is to select the entire visual space of the media object; this is equivalent to viewBox="0, 0, 100%, 100%".

The viewBox is processed on the media object before any other SMIL layout processing occurs. The actual visual rendering of the content resulting from the processed viewBox will be determined by, among other factors: the size of the target region, the application of sub-region positioning in that region (if supported by the profile), the value of the fit attribute on the region, and the effect of SMIL alignment attributes (if supported by the profile).
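
A hedged sketch of how a viewBox value might be resolved against the intrinsic size of a media object is given below. The names resolve_view_box and _length are hypothetical. The sketch assumes that min-x and min-y are plain numbers, that percentages refer to the media object's own width and height, and that an absent attribute selects the entire media object; it does not handle the auto keyword.

```python
import re


def _length(token, media_extent):
    """Resolve a width or height token (pixels or percentage) to pixels."""
    if token.endswith("%"):
        value = float(token[:-1]) * media_extent / 100.0
    elif token.endswith("px"):
        value = float(token[:-2])
    else:
        value = float(token)  # the 'px' suffix may be omitted
    if value < 0:
        raise ValueError("negative viewBox dimension: " + token)
    return value


def resolve_view_box(value, media_width, media_height):
    """Return (min_x, min_y, width, height) in media pixels."""
    if value is None:
        # Default behavior: select the entire visual space of the media object.
        return (0.0, 0.0, float(media_width), float(media_height))
    # Values are separated by whitespace and/or a comma.
    tokens = [t for t in re.split(r"[\s,]+", value.strip()) if t]
    if len(tokens) != 4:
        raise ValueError("viewBox needs four values: " + value)
    min_x, min_y, width, height = tokens
    return (
        float(min_x),
        float(min_y),
        _length(width, media_width),
        _length(height, media_height),
    )
```

For a 300x200 image, resolve_view_box("0, 0, 100%, 100%", 300, 200) and resolve_view_box(None, 300, 200) both select the full image.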

Element content

The SMIL MediaPanZoom module does not extend the content model for the ref element integrating these attributes.

The region Element

The viewBox attribute is added to region definitions.

Element attributes
viewBox
This attribute is identical in definition to the viewBox attribute defined for the ref element in this section, with the exception that it defines a default viewBox that is applied to all media rendered in the associated region. All other aspects of viewBox processing are the same as with the ref element, except that the values defined for the viewBox on a region may be overridden by a viewBox specification on the ref element.
Element content

The SMIL MediaPanZoom module does not extend the content model for the region element integrating these attributes.

Attribute Examples

This section is informative.

Assume the following SMIL example:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region id="I" top="0" left="0" height="200" width="300"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref id="R1" src="table.jpg" viewBox="0,0,300,200" dur="5s" region="I" />
      <ref id="R2" src="table.jpg" viewBox="50,195,160,125" dur="5s" region="I" fit="meet"/>
      <ref id="R3" src="table.jpg" viewBox="50,195,160,125" dur="5s" region="I" fit="meetBest"/>
      <ref id="R4" src="table.jpg" viewBox="240,120,85,110" dur="5s" region="I" fit="meet"/>
    </seq>
  </body>
</smil>

In this example, a single region is defined that is used to display four instances of the same image. Each media reference within the seq element carries its own viewBox definition, resulting in the following behavior:

  1. The media reference R1 defines a viewBox that encompasses the entire media object space; the full image will be shown in region I, as is shown in the following image:
    A viewBox projection that is the same size as the target region.
    Note that the origin of the image is aligned with the origin of the media object, at the top-left of the region.
  2. The media reference R2 defines a viewBox that encompasses the center portion of the media object space. The projection of the media into region I will result in a zoom into the source image, as is shown in the following image:
    A viewBox projection that is smaller than the target region, resulting in a zoom effect.

    Note that the origin of the sub-image defined by the viewBox is placed at the origin of the top-left of the region. Note also that the value of the fit attribute determines that the image is scaled (while maintaining the aspect ratio), resulting in the zoom effect.

  3. The media reference R3 defines a viewBox that is the same as in reference R2; the difference in this example is that the value of the fit attribute does not permit enlargement of the source image into the region. As a result, the image is placed at top-left in an unscaled rendering:
    A viewBox projection that is smaller than the target region, but with a fit="meetBest" value.
  4. The media reference R4 defines a viewBox that extends beyond the boundaries of the media object. When it is projected into the region I with a fit value that scales the image with preserved aspect ratio, the entire extent of the viewBox is scaled: the areas that extend beyond the image content are rendered as (scaled) transparent content:
    A viewBox projection that extends beyond the right/bottom edge of the image -- the extended part of the box will be transparent.
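
The scaling applied in cases R2 through R4 can be approximated with the following Python sketch. It assumes the usual SMIL meaning of the fit values (meet scales uniformly until the content touches the region in one dimension; meetBest does the same but never enlarges beyond 100%); the function name projection_scale is hypothetical.

```python
def projection_scale(view_box_size, region_size, fit):
    """Return the uniform scale factor for fit="meet" or fit="meetBest"."""
    vb_width, vb_height = view_box_size
    region_width, region_height = region_size
    if vb_width <= 0 or vb_height <= 0:
        raise ValueError("viewBox must have a positive width and height")
    if fit not in ("meet", "meetBest"):
        raise ValueError("this sketch only covers meet and meetBest")
    # Scale until the viewBox touches the region in one dimension.
    scale = min(region_width / vb_width, region_height / vb_height)
    if fit == "meetBest":
        scale = min(scale, 1.0)  # never enlarge the content
    return scale


region = (300, 200)
print(projection_scale((160, 125), region, "meet"))      # R2: prints 1.6
print(projection_scale((160, 125), region, "meetBest"))  # R3: prints 1.0
```

For R4, the 85x110 viewBox is limited by the region height, giving a scale factor of 200/110 (roughly 1.82) that is applied to the image content and the transparent overhang alike.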

All of the previous examples illustrate how a viewBox operates on a media object that contains a media-defined viewable extent. The viewBox attribute may also be applied to visual objects that do not have predefined extents. Consider the following example, in which an unstructured text object is placed in a region:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region id="T" top="0" left="0" height="50" width="300"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref id="R0" src="short_story.txt" viewBox="0,10,50,200" dur="10s" region="T" />
    </seq>
  </body>
</smil>

In this example, a single region is defined that is used to display an undimensioned text object. In SMIL 3.0, the text object would first be rendered to an off-screen bitmap based on the default settings for the media object (font, font size, font color) and then a viewBox of the defined size would be overlaid on this text representation. This facility is especially useful when combined with SMIL Animation, as discussed in the next example.

The ability to define a viewBox, when combined with SMIL animation primitives, provides a simple mechanism for doing pan/zoom animations over a visual object. (These pan/zoom animations are often called 'Ken Burns' animations.) The following example illustrates how a pan window can be positioned and moved over an image area:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region id="B" top="0" left="0" height="50" width="75"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref id="R0" src="table_233x150.jpg" viewBox="0,0,50,75" dur="20s" region="B" fit="meet" >
         <animate attributeName="viewBox" 
                     values="(25,20,50,75); (45,55,50,75);(140,40,50,75);(35,0,100,150); (0,0,100,150);" 
                     dur="20s" />
      </ref>
      ...
    </seq>
  </body>
</smil>

In this example, an image with intrinsic size of 233x150 pixels is rendered into a region of size 50x75. An initial viewBox is defined that displays a 50x75 portion of that image, positioned in its top-left corner. During the following 20 seconds, the viewBox is moved across the image according to the behavior of the animate element; the viewBox changes are scheduled at equal points across the animation timeline (in this case, every 5 seconds). During the final animation, the viewBox is extended to implement a zoom-out across the entire image. An illustration of the rendering results is shown below:


A viewBox projection and a set of animations that move the viewBox across the source image.
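
The timing of the viewBox changes can be sketched in Python as follows. The helper names are hypothetical, and the linear interpolation between successive values is an assumption (it presumes the animate element's default linear calculation mode applies to viewBox); the even spacing of the values over the duration follows the description above.

```python
def parse_values(values):
    """Split an animate "values" list into tuples of four numbers."""
    frames = []
    for item in values.split(";"):
        item = item.strip().strip("()")
        if item:  # skip the empty entry after a trailing semicolon
            frames.append(tuple(float(n) for n in item.split(",")))
    return frames


def key_times(count, duration):
    """Spread count values evenly over duration seconds."""
    step = duration / (count - 1)
    return [i * step for i in range(count)]


def view_box_at(frames, duration, t):
    """Linearly interpolate the viewBox at time t (0 <= t <= duration)."""
    step = duration / (len(frames) - 1)
    index = min(int(t // step), len(frames) - 2)
    fraction = (t - index * step) / step
    start, end = frames[index], frames[index + 1]
    return tuple(a + (b - a) * fraction for a, b in zip(start, end))


frames = parse_values(
    "(25,20,50,75); (45,55,50,75);(140,40,50,75);(35,0,100,150); (0,0,100,150);"
)
print(key_times(len(frames), 20))  # prints [0.0, 5.0, 10.0, 15.0, 20.0]
print(view_box_at(frames, 20, 2.5))  # prints (35.0, 37.5, 50.0, 75.0)
```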

7.13.3 MediaPanZoom Module Events

This module does not define any SMIL events.

7.13.4 SMIL MediaPanZoom Implementation and Integration

Implementation Details

The MediaPanZoom module allows individual media object references to override the default values for certain attributes. In all cases, the attributes will apply only to the (sub-)region referenced by the media object. Changes will not propagate to child sub-regions or to parent regions.

Integration Requirements

The functionality in this module builds on top of the functionality in the Media module, which is a required prerequisite for inclusion of the MediaPanZoom module.

Differences with the SVG viewBox Attribute

The functionality in this module builds on the viewBox definition of SVG. Unlike SVG, the SMIL viewBox attribute defines a logical sub-image that contains only content within the area defined by the viewBox; SVG uses the viewBox to define a minimum viewing dimension for content, but allows content outside the viewBox to be displayed in the region.

The MediaPanZoom module does not define a preserveAspectRatio attribute, since this functionality is already provided by the SMIL fit and registration/alignment attributes.

7.13.5 Document Type Definition (DTD) for the MediaPanZoom Module

See the full DTD for the SMIL Layout modules.

7.14 Appendices

This section is informative.

7.14.1 Appendix A: Changes to SMIL 1.0 Media Object Attributes

clipBegin, clipEnd, clip-begin, clip-end

With regard to the clipBegin/clip-begin and clipEnd/clip-end attributes, SMIL 2.1 defines the following changes to the syntax defined in SMIL 1.0:

Handling of new clipBegin/clipEnd syntax in SMIL 1.0 software

Using attribute names with hyphens such as clip-begin and clip-end is problematic when using a scripting language and the DOM to manipulate these attributes. Therefore, this specification adds the attribute names clipBegin and clipEnd as an equivalent alternative to the SMIL 1.0 clip-begin and clip-end attributes. The attribute names with hyphens are deprecated.

Authors can use two approaches for writing SMIL 2.1 presentations that use the new clipping syntax and functionality ("marker", default metric) defined in this specification, but can still be handled by SMIL 1.0 software. First, authors can use non-hyphenated versions of the new attributes that use the new functionality, and add SMIL 1.0 conformant clipping attributes later in the text.

Example:

<audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1" 
       clip-begin="npt=0s" clip-end="npt=3:50" />

SMIL 1.0 players implementing the recommended extensibility rules of SMIL 1.0 [SMIL10] will ignore the clip attributes using the new functionality, since they are not part of SMIL 1.0. SMIL 2.1 players, in contrast, will ignore the clip attributes using SMIL 1.0 syntax, because the SMIL 2.1 syntax takes precedence over the SMIL 1.0 syntax.

The second approach is to use the following steps:

  1. Add a "system-required" test attribute to media object elements using the new functionality. The value of the "system-required" attribute would correspond to a namespace prefix whose namespace URI ([URI]) points to a SMIL specification which integrates the new functionality.
  2. Add an alternative version of the media object element that conforms to SMIL 1.0.
  3. Include these two elements in a "switch" element.

Example:

<smil xmlns="http://www.w3.org/2005/SMIL21/Language">
...
<switch>
  <audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1"  
   system-required="smil2" />
  <audio src="radio.wav" clip-begin="npt=0s" clip-end="npt=3:50" />
</switch>
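
How the two kinds of player would treat this switch can be sketched in Python. This is an illustration only: select_switch_child is a hypothetical helper, each child is represented as a plain attribute dictionary, and the sketch assumes that a switch selects its first child whose test attributes all evaluate to true.

```python
def select_switch_child(children, supported_extensions):
    """Return the first child whose system-required test passes, or None."""
    for attributes in children:
        required = attributes.get("system-required")
        if required is None or required in supported_extensions:
            return attributes
    return None


children = [
    {"src": "radio.wav", "clipBegin": "marker=song1",
     "clipEnd": "marker=moderator1", "system-required": "smil2"},
    {"src": "radio.wav", "clip-begin": "npt=0s", "clip-end": "npt=3:50"},
]

# A SMIL 1.0 player does not support the "smil2" extension: second child.
print(select_switch_child(children, set()) is children[1])      # prints True
# A player that supports it selects the first child.
print(select_switch_child(children, {"smil2"}) is children[0])  # prints True
```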

New Accessibility Attributes

readIndex
Allows explicit ordering for controlling assistive technology.

New Advanced Media Attributes

mediaRepeat
The mediaRepeat attribute was added to provide better timing control over media with intrinsic repeat behavior (such as animated GIFs).
erase
Provides a way for visual media to remain visible throughout the duration of a presentation by overriding the default erase behavior.
