WD-smil-971109

Synchronized Multimedia Integration Language

W3C Working Draft 09-November-97

Latest version: http://www.w3.org/AudioVideo/Group/WD-smil
This version: http://www.w3.org/TR/WD-smil-971109

Editor: Philipp Hoschka, W3C (hoschka@w3.org)

Authors: Stephan Bugaj, Lucent/Bell Labs
Dick Bulterman, CWI
Lynda Hardman, CWI
Jack Jansen, CWI
Rob Lanphier, RealNetworks
Nabil Layaida, INRIA
Jonathan Marsh, Microsoft
Anup Rao, Netscape
Warner ten Kate, Philips
Jacco van Ossenbruggen, CWI
Michael Vernick, Lucent/Bell Labs
Jin Yu, DEC

1 Status of this Document
2 Introduction
3 Relation to XML
4 SMIL Document
5 The Document Head
- 5.1 Layout Element
6 The Document Body
7 Appendix

1 Status of this document

This document is a W3C Working Draft produced by the W3C Working Group on Synchronized Multimedia (SYMM). It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C technical reports can be found at http://www.w3.org/TR.

This document is updated very frequently, and major changes are expected within the next few months. Thus, this is a draft document which may be updated, replaced or obsoleted by other documents at any time. Please check back regularly to get the latest version of the draft. You are encouraged to implement a prototype based on this draft, but you should realize that you may have to change your implementation when the draft changes.

2 Introduction

This document specifies the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"). SMIL allows integrating a set of independent multimedia objects into a synchronized multimedia presentation. Using SMIL, presentations such as a slide show synchronized with audio comments or a video synchronized with a text stream can be described.

A typical SMIL presentation has the following characteristics:

The presentation is composed of several components that are accessible via a URL, e.g. files stored on a Web server.
The components have different media types, such as audio, video, image or text.
The begin and end times of different components have to be synchronized with events in other components. For example, in a slide show, a particular slide is displayed when the narrator in the audio starts talking about it.
The user can control the presentation by using control buttons known from video-recorders, such as stop, fast-forward and rewind. Additional functions are "random access", i.e. the presentation can be started anywhere, and "slow motion", i.e. the presentation is played slower than at its original speed.
The user can follow hyper-links embedded in the presentation.

SMIL has been designed so that it is easy to author simple presentations with a text editor. The key to success for HTML was that attractive hypertext content could be created without requiring a sophisticated authoring tool. SMIL achieves the same for synchronized hypermedia.

To get a quick idea of how to use SMIL, study the example in section 7.4.

3 Relation to XML

SMIL documents are well-formed XML documents in the sense of the XML 1.0 draft. For describing the syntax of SMIL documents, this specification uses two notations:

an augmented Backus-Naur form (BNF) similar to the one defined for HTTP 1.1 (see Appendix 7.2 for a description of the BNF notation used in this document)
an XML Document Type Definition (DTD) (see Appendix 7.3)

The BNF is included since it is easier to read for a large part of the intended audience of this specification. However, reflecting XML syntax in the BNF leads to a number of unusual constructs:

The current XML allows the use of two different characters for quoting attribute values. Consequently, the BNF contains the following production for each attribute A:
A = "A" "=" (<">A-value<"> | "'"A-value"'")
Following common practice in programming languages, all SMIL elements may be empty, i.e. contain no content. In the current XML draft, there are two alternative ways to express this. Consequently, the BNF contains one of the following productions for each element E:

If E is always empty:
E = "<E" *att-list ("/>" | "></E" ">")
If E is not always empty:
E = "<E" *att-list ("> content </E" ">" | "/>")
(the syntax that is used more frequently in practice comes first in the "or" part of these rules)

If a SMIL implementation uses a parser implemented directly from the BNF, an additional check must be added to verify that each attribute occurs only once within a particular element. The BNF given in this specification does not reflect this requirement.
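
As a non-normative illustration, the additional check could be sketched in Python as follows. The function name duplicate_attributes and the regular expression are assumptions made for this sketch. The pattern accepts a simplified form of attribute names and is not a full XML tokenizer.

```python
import re

# Matches name="value" or name='value' inside a start tag.
# The name pattern is a simplification of the XML Name production.
ATTRIBUTE = re.compile(r"""([A-Za-z_:][\w:.-]*)\s*=\s*(?:"[^"]*"|'[^']*')""")


def duplicate_attributes(start_tag):
    """Return the sorted names of attributes that occur more than once."""
    seen = set()
    duplicates = set()
    for match in ATTRIBUTE.finditer(start_tag):
        name = match.group(1)
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return sorted(duplicates)
```

A parser built from the BNF could call such a function on every start tag and reject the document if the returned list is not empty.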

4 SMIL Document

Syntax

smil-doc            = "<smil" *smil-attribute (">" [head] [body] "</smil>" 
                      | "/>")
smil-attribute      = id | lipsync
id                  = "id" "=" (<">id-value<"> | "'"id-value"'")
id-value            = (any legal XML name symbol)
lipsync             = "lipsync" "=" (<">lipsync-value<"> | "'"lipsync-value"'")
lipsync-value       = "true" |  "false"
head                = "<head" *head-attribute (">" *head-element "</head>" 
                      | "/>")
head-attribute      = id
head-element        = comment | layout-section; more to be defined later
body                = "<body" *body-attribute (">" [body-content] "</body" ">" 
                      | "/>")
body-attribute      = id
body-content        = *comment-or-ool-link schedule-or-switch 
                      *comment-or-ool-link
comment-or-ool-link = comment | out-of-line
schedule-or-switch  = schedule | switch
comment             = "<!--" *TEXT "-->" 
                    ; the string "--" must not occur within comments

General Semantics

An SMIL document may contain a head part and a body part. Both parts may contain comments.

Attributes

id: This attribute specifies a value that uniquely identifies an element within a document.
lipsync: The value of this attribute determines the default value of the lipsync attribute in parallel elements within the document (see 6.2). If this attribute is not specified, the default value of lipsync is implementation dependent.

5 The Document Head

(additional elements will be added to the document head at a later point. Candidates are: meta, title)

5.1 Layout Element

Syntax

layout-section       = layout 
                       | "<switch" *switch-attribute (">" *layout "</switch" ">" 
                         | "/>")
layout               = "<layout" *layout-attribute (">" layout-element 
                        "</layout" ">" | "/>")
layout-attribute     = id | layout-type
layout-type          = "type" "=" (<">layout-language<"> | 
                       <'>layout-language<'>)
layout-language      = "text/smil-basic" | external-language
layout-element       = layout-element-basic | external-layout-element
layout-element-basic = *tuner
external-language    = *TEXT

General Semantics

An SMIL document may contain a layout section that determines the placement of presentation components in non-temporal dimensions. If the layout section is missing, the placement of elements is implementation dependent. The layout section may contain several alternative layout elements embedded within a switch element. The player chooses one of these alternatives. This can be used for example to describe the document layout using different layout languages. This specification defines a basic layout language for SMIL (see Appendix 7.1).

Attributes

type: The type attribute specifies which layout language is used in the layout element. If the player does not understand this language, it must skip all text up until the next "</layout>" tag.

6 The Document Body

6.1 Schedule Elements

Syntax

schedule = parallel | sequential | media-object

General Semantics

Schedule elements determine the temporal behavior of an SMIL document. They can be composite (parallel, sequential) or atomic (media-object).

Schedule elements have a begin and an end time. Begin and end time of an element E can be determined in one of two ways: either they are derived from the composite element which contains E, or they are determined by synchronization attributes contained within the start tag of E.

The player keeps track of a presentation clock that advances at the speed of the presentation.

6.2 Parallel Element

Syntax

parallel      = "<par" *par-attribute (">" *par-content "</par" ">" | "/>")
par-content   = comment | schedule | switch | link 
par-attribute = id | endsync | lipsync | dur | repeat | *sync-attribute | 
                *new-attribute
endsync       = "endsync" "=" (<">endsync-value<"> | <'>endsync-value<'>)
endsync-value = "first" | "last" |  id-ref
id-ref        = "id(" id-value ")"
dur           = "dur" "=" (<">clock-val<"> | <'>clock-val<'>)
repeat        = "repeat" "=" (<">*DIGIT<"> | <'>*DIGIT<'>)
new-attribute = attribute-name "=" 
               (<">attribute-value<"> | <'>attribute-value<'>)
               ; attribute-name is an XML name, 
               ; attribute-values must conform to XML syntax

General Semantics

End Time

By default, the end time of a parallel element is equal to the maximum end-time of all children in the parallel element. If none of the children has a known end time, the end time and the duration of the parallel element are also unknown. In this case, the parallel element is terminated by an external event, for example when the user hits a "stop" button.

The default end time of a parallel element can be overridden by using the "endsync", the "dur" or the "end" attribute (see below).

Begin/End Time of Children

If the begin time of a child in a parallel element is unknown, it is set to the begin time of the parallel element. If the end time of a child in a parallel element is unknown, it is set to the end time of the parallel element.

Attributes

lipsync (optional)

This attribute specifies how accurately the children in a parallel group are synchronized in case of playback delays. Most importantly, it determines what to do if the parallel group contains two or more continuous media types such as audio or video, and one of them experiences a delay. The attribute can have the following values:

"true": The player must synchronize the children in the parallel group to a common clock (see Figure 6.1 a)).
"false": Each child of the parallel element has its own clock, which runs independently of the clocks of other children in the parallel element (see Figure 6.1 b)).

The default value for lipsync is specified as an attribute of the "smil" element. If no default value is specified, the default value is implementation-dependent.

   audio
|----....------|
   video
|----....------|

   audio
|----------|
   video
|----....--|

a) lipsync = "true": The exact behavior is implementation-dependent

   audio
|----------|
   video
|----....------|

b) lipsync = "false"

Figure 6.1: Effect of a delay on playout schedule for different settings of the lipsync attribute

endsync (optional)

This attribute specifies that the parallel element depends on the end time of one of its children. The attribute can have the following values:

"last" (default): The parallel element ends at the same time as the child with the maximum end time of all children in the parallel group (see Figure 6.2 a)).
"first": The parallel element ends at the same time as the child with the minimum end time of all children in the group (see Figure 6.2 b)).
id-ref: The parallel group ends at the same time as the child identified by the id.

If a parallel element contains both an "endsync" attribute and an "end" attribute (see 6.5), the element ends at the minimum of the end times specified by these two attributes.

<par endsync="last">
  <audio .../>
  <video .../>
  <image .../>
</par>


   audio
|---------------|
                |
                |
  video         |
|--------|      |
                |
                |
  image        \ /
|---------------|

a) endsync="last" (default behavior)

<par endsync="first">
  <audio .../>
  <video .../>
  <image .../>
</par>


   audio
|---------......|
        / \
         |
  video  |
|--------|
         |
         |
  image \ /
|--------|

b) endsync="first"

Figure 6.2: Effect of endsync attribute
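
The interaction of "endsync" with an "end" attribute can be illustrated by the following non-normative Python sketch. The function name par_end_time, the use of seconds as the time unit and the representation of children as a mapping from id to end time are assumptions of this sketch. For brevity, the child id is passed directly rather than in the "id(...)" form, so a child whose id is "first" or "last" cannot be selected.

```python
def par_end_time(child_ends, endsync="last", end=None):
    """Compute the end time (in seconds) of a parallel element.

    child_ends maps child ids to their known end times in seconds.
    endsync is "first", "last" or the id of one child.
    end is the time given by an explicit "end" attribute, if any.
    Returns None when the end time is unknown.
    """
    if endsync == "last":
        synced = max(child_ends.values(), default=None)
    elif endsync == "first":
        synced = min(child_ends.values(), default=None)
    else:
        synced = child_ends.get(endsync)
    if synced is None:
        return end
    if end is None:
        return synced
    # Both attributes are present: the earlier end time wins.
    return min(synced, end)
```

With children ending at 15 and 8 seconds, the sketch yields 15 for endsync="last" and 8 for endsync="first".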

dur (optional): The value of this attribute specifies the difference between the begin time and the end time of the parallel element.
repeat (optional): This attribute specifies the number of times an element should be repeated. A value of "0" indicates that the element should be repeated an infinite number of times. The default value is 1.
new-attribute (optional): This attribute allows adding implementation-specific attributes to an element. For example, it may be used to specify attributes that are used by the switch element to select one of several alternatives. A list of standard attributes that can be used for selecting an alternative (e.g. bitrate, language) will be added to a future version of this specification.

6.3 Sequential Element

Syntax

sequential    = "<seq" *seq-attribute (">" *seq-content "</seq" ">" | "/>")
seq-content   = comment | schedule | switch | link 
seq-attribute = id | dur | repeat | *sync-attribute | *new-attribute

General Semantics

Begin and end time of a child in a sequential element have the following default values:

The begin time of the first child is set to the begin time of the sequential element
The begin times of all other children are set to the end times of their lexical predecessors

The default value for the end time of a sequential element is the end time of the last element contained in the sequence.
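
The default timing rules for a sequential element can be sketched in Python as follows (non-normative). The function name schedule_sequence and the assumption that every child has a known duration in seconds are introduced for this illustration only.

```python
def schedule_sequence(seq_begin, durations):
    """Return (begin, end) pairs for the children of a sequential element.

    seq_begin is the begin time of the sequential element in seconds.
    durations lists the duration of each child in lexical order.
    """
    schedule = []
    begin = seq_begin
    for duration in durations:
        end = begin + duration
        schedule.append((begin, end))
        begin = end  # the next child begins when its predecessor ends
    return schedule
```

The end time of the sequential element is then the end time in the last pair of the returned list.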

6.4 Media Object Element

Syntax

media-object    = ref | audio | img | video | text
ref             = "<ref" *mo-attribute ("/>" | "></ref" ">")
audio           = "<audio" *mo-attribute ("/>" | "></audio" ">")
img             = "<img" *mo-attribute ("/>" | "></img" ">")
video           = "<video" *mo-attribute ("/>" | "></video" ">")
text            = "<text" *mo-attribute ("/>" | "></text" ">")
mo-attribute    = id | href | type | loc | dur | repeat | *sync-attribute
                   | *new-attribute | mo-xml-link-def | mo-show-def 
                   | mo-actuate-def | mo-inline-def
href            = "href" "=" (<">URL<"> | <'>URL<'>) 
                  ; URL syntax defined in RFC 1808
type            = "type" "=" (<">MIME-type<"> | <'>MIME-type<'>)
                  ; MIME-type syntax defined in RFC 2045
loc             = "loc" "=" (<">id-value<"> | "'"id-value"'") 
; the following rules are added to handle normalized XML documents containing 
; the default values for these attributes
mo-xml-link-def = "xml-link" "=" (<">"simple"<"> | "'""simple""'")
mo-show-def     = "show" "=" (<">"embed"<"> | "'""embed""'")
mo-actuate-def  = "actuate" "=" (<">"auto"<"> | "'""auto""'")
mo-inline-def   = "inline" "=" (<">"true"<"> | "'""true""'")

General Semantics

The media object elements allow the inclusion of external components into an SMIL presentation.

The names "audio", "video", "text" and "img" are synonyms for "ref". They serve to improve the readability of the document. The player must not derive the type of the media object from the name of the media object element. This is important when the URL points to the description of a media object file rather than to the file itself.

Editor's note: The semantics of the "audio", "video", "text" and "img" elements may change in a later version of the document, if it is decided that media type specific attributes should be added to these elements, for example specific attributes that apply only to audio files.

Attributes

href (mandatory): The value of the href attribute is the URL of the media object.
type (optional): MIME type of the media object referenced by the href attribute.
loc (optional): This attribute specifies the identifier of an abstract rendering surface (either visual or audio) defined within the layout section of the document.
dur (optional): This attribute specifies the difference between the begin time and the end time of the media object element. For continuous media objects (e.g. audio or video objects), the default value is the inherent duration of the media object.

6.5 Synchronization Attributes

Syntax

sync-attribute    = begin | end 
begin             = "begin" "=" event-val
end               = "end" "=" event-val
event-val         = <">event-spec<"> | <'>event-spec<'>
event-spec        = qualified-event | offset
qualified-event   = "id(" id-value ")" "(" event ")" ["+" delay]
event             = clock-val | "begin" |  "end" |  "ready"
delay             = clock-val
clock-val         = full-clock-val | partial-clock-val | timecount-val
full-clock-val    = hours ":" minutes ":" seconds ["." units]
partial-clock-val = minutes ":" seconds ["." units]
timecount-val     = timecount ["." fraction] ["h" | "min" | "s" | 
                    "ms" ] ; default is "s"
hours             = 2DIGIT ; range from 00 to 23
minutes           = 2DIGIT ; range from 00 to 59
seconds           = 2DIGIT ; range from 00 to 59
units             = 1*DIGIT ; range from 0 to ups-1
timecount         = 1*DIGIT 
fraction          = 1*DIGIT
offset            = clock-val
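
The clock-val productions above can be illustrated by the following non-normative Python sketch, which converts a clock-val string into seconds. The function name clock_val_to_seconds is an assumption of this sketch. Because the meaning of "units" depends on a value that this draft does not define, the sketch simply reads it as a decimal fraction of a second; this is an assumption, not a rule of this specification.

```python
import re

UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}

# full-clock-val and partial-clock-val: [hh:]mm:ss[.units]
CLOCK = re.compile(r"(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d+))?")
# timecount-val: digits[.fraction][h|min|s|ms]
TIMECOUNT = re.compile(r"(\d+)(?:\.(\d+))?(h|min|s|ms)?")


def clock_val_to_seconds(text):
    """Convert a clock-val string to seconds, or raise ValueError."""
    match = CLOCK.fullmatch(text)
    if match:
        hours, minutes, seconds, units = match.groups()
        hours = int(hours or 0)
        minutes = int(minutes)
        seconds = int(seconds)
        if hours > 23 or minutes > 59 or seconds > 59:
            raise ValueError("clock field out of range: " + text)
        # Assumption: "units" is read as a decimal fraction of a second.
        fraction = float("0." + units) if units else 0.0
        return hours * 3600 + minutes * 60 + seconds + fraction
    match = TIMECOUNT.fullmatch(text)
    if match:
        count, fraction, unit = match.groups()
        value = float(count + "." + (fraction or "0"))
        return value * UNIT_SECONDS[unit or "s"]  # default unit is "s"
    raise ValueError("not a clock-val: " + text)
```

Under these assumptions, "01:30", "00:01:30" and "90s" all denote the same value of 90 seconds.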

General Semantics

The synchronization attributes "begin" and "end" can be added to any schedule element. These attributes change the default begin and end times of the element. A synchronization attribute can be an offset value or a qualified event.

If the value of a synchronization attribute is an offset value, its semantics depends on the parent of the element containing the synchronization attribute:

If the parent is a parallel element, the value defines a time-offset from the beginning of the parallel element (see Figure 6.3).
If the parent is a sequential element, the value defines a time-offset from the end of the predecessor element (see Figure 6.4).

<par>
  <audio id="a" begin="6s" ... />
  ...
</par>

      par
|------------------|
  
   6s      a
<----->|-----------|

Figure 6.3: Synchronization attribute with offset value within a parallel group

<seq>
  <audio .../>
  <audio begin="5s" .../>
</seq>

   audio     5s     audio
|---------|<---->|---------|

Figure 6.4: Synchronization attribute with offset value within a sequential group
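
The two interpretations of an offset value can be summarized by the following non-normative Python sketch. The function name resolve_offset_begin and its parameters are assumptions of this sketch, and all times are in seconds. For the first child of a sequential element, which has no predecessor, the sketch falls back to the begin time of the sequential element; this fallback is an assumption and is not stated in this draft.

```python
def resolve_offset_begin(parent_kind, offset, parent_begin, predecessor_end=None):
    """Return the begin time of a child whose "begin" attribute is an offset.

    parent_kind is "par" or "seq".
    predecessor_end is the end time of the preceding sibling, if any.
    """
    if parent_kind == "par":
        # Offset is measured from the beginning of the parallel element.
        return parent_begin + offset
    if parent_kind == "seq":
        # Offset is measured from the end of the predecessor element.
        base = parent_begin if predecessor_end is None else predecessor_end
        return base + offset
    raise ValueError("unknown parent kind: " + parent_kind)
```

For Figure 6.3, a parallel element beginning at 0 gives the audio element a begin time of 6. For Figure 6.4, a first audio element ending at 10 gives the second one a begin time of 15.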

If the value of a synchronization attribute is a qualified event, the attribute specifies that an element should begin or end when a particular event occurs in another element (see Figure 6.5). This element must be a sibling of the element with the synchronization attribute.

The following events are defined for all schedule elements:

begin: The element becomes active.
Example use: begin="id(x)(begin)"
end: The element becomes inactive.
Example use: begin="id(x)(end)"
clock-val: The clock associated with an element reaches a particular value.
Example use: begin="id(x)(45s)"
ready: The element is ready to be activated (e.g. the display process has started and the file is downloaded). An element containing a synchronization attribute that refers to a "ready" event is called an interlude element. In an interlude element, the minimal duration of the interlude is specified using the "dur" attribute (see Figure 6.6).
Example use: end="id(x)(ready)"

<par>
  <audio id="a" begin="6s" ... />
  <img  begin="id(a)(4s)" ... />
</par>

      par
|-----------------|
  
   6s      a
<---->|-----------|
        4s
       <-->
            img
          |-------|

Figure 6.5: Synchronization attribute with qualified event value

<seq>
  <img dur="5s" end="id(a)(ready)" .../>
  <audio id="a" .../>
</seq>

  img   audio
|-----|-------|
  
|-----|-----|-----|-----> Presentation clock (in s)
   |
0  |  5    10    15
   |
   Audio ready

a) audio ready before end of minimal interlude duration

   img      audio
|---------|-------|
  
|-----|-----|-----|-----> Presentation clock (in s)
          | 
0     5   | 10    15
          |
          Audio ready

b) audio ready after end of minimal interlude duration

Figure 6.6: Interlude element
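
The behavior shown in Figure 6.6 can be expressed by the following non-normative Python sketch. The function name interlude_end and the use of seconds on the presentation clock are assumptions of this sketch.

```python
def interlude_end(begin, minimal_duration, ready_time):
    """Return the end time of an interlude element.

    The interlude lasts at least minimal_duration (its "dur" attribute)
    and ends no earlier than the "ready" event of the referenced element.
    """
    return max(begin + minimal_duration, ready_time)
```

With a minimal duration of 5 seconds, an interlude that begins at 0 ends at 5 if the audio is ready at 2 (case a), and at 8 if the audio is only ready at 8 (case b). The ready times 2 and 8 are illustrative values chosen for this sketch.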

Error Handling

If an element contains begin, end and dur attributes, the dur attribute is ignored.
If a continuous media object has either a dur attribute, or a begin and an end attribute, the element's duration is the minimum of its inherent duration, and the duration defined by the attributes.
All documents containing errors caused by synchronization attributes are invalid. Possible errors include loops in the graph specifying begin or end times, begin times that lie after end times, events generated by non-sibling elements, sequential elements in which elements overlap, a reference to a clock value that exceeds the duration of an element, and conflicts between begin/end times and the duration of an object.
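
The first two rules can be illustrated by the following non-normative Python sketch for a continuous media object. The function name effective_duration and the use of seconds are assumptions of this sketch, and begin and end are taken to be already resolved to times on a common clock.

```python
def effective_duration(inherent, dur=None, begin=None, end=None):
    """Return the duration of a continuous media object in seconds.

    inherent is the inherent duration of the media object.
    dur, begin and end are the attribute values, or None if absent.
    """
    if begin is not None and end is not None:
        if begin > end:
            raise ValueError("begin time lies after end time")
        specified = end - begin  # dur is ignored in this case
    else:
        specified = dur
    if specified is None:
        return inherent
    return min(inherent, specified)
```

For a 30 second clip, dur="10s" yields 10 seconds, whereas dur="60s" yields the inherent 30 seconds.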

Attributes

begin (optional): The value of the begin attribute determines when the element containing this attribute gets activated, for example displayed on a screen or reproduced by a loudspeaker.

end (optional): The value of the end attribute determines when the element containing this attribute is deactivated, for example removed from the screen or stopped being reproduced on a loudspeaker.

6.6 Switch Element

Syntax

switch           = "<switch" *switch-attribute (">" *switch-content "</switch>"
                   | "/>")
switch-content   = comment | schedule | switch | link 
switch-attribute = id | new-attribute

General Semantics

The switch element allows an author to specify a set of alternative media objects from which only one media object should be chosen.

The switch element can be used, for example, to express that the audio track of a video is available in different languages. More generally, the elements within a switch differ with respect to one or more parameter values (e.g. language, bitrate). These parameters and their values are added to the elements using XML attribute-value pairs.

A list of standard attributes that can be used for selecting an alternative will be added to a future version of this specification. Implementations are free to add new attributes to this list.

The exact selection process of the list of elements is implementation specific. However, the recommended method of implementation is to evaluate the topmost element for acceptability, and if that element has acceptable properties, to select that element to the exclusion of the other elements within the switch. Only if the first element does not have acceptable properties should the media player move down the list of alternatives. Thus, authors should order the alternatives from the most desirable to the least desirable.
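
The recommended selection method can be sketched in Python as follows (non-normative). The function name select_alternative and the representation of alternatives as dictionaries of attribute values are assumptions of this sketch. The acceptability test is supplied by the caller, since it is implementation specific.

```python
def select_alternative(alternatives, is_acceptable):
    """Return the first acceptable alternative in document order, or None."""
    for alternative in alternatives:
        if is_acceptable(alternative):
            return alternative
    return None


# Usage: pick the first alternative whose bitrate fits the link.
# The "bitrate" attribute and the limit of 10000 are illustrative only.
choices = [
    {"href": "joe-audio-better-quality", "bitrate": 16000},
    {"href": "joe-audio", "bitrate": 8000},
]
chosen = select_alternative(choices, lambda a: a["bitrate"] <= 10000)
```

Because the first acceptable alternative is chosen, the result depends on the order in which the author lists the alternatives.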

Examples

In a common scenario, implementations may wish to allow for selection via a "bitrate" parameter on elements. The media player evaluates each of the "choices" (elements within the switch) one at a time, looking for an acceptable bitrate given the known characteristics of the link between the media player and media server.

<switch>
  <par bitrate="40000">
    ........
  </par>
  <par bitrate="24000">
    ........
  </par>
  <par bitrate="10000">
    ........
  </par>
</switch>

The elements within the switch may be any combination of elements. For instance, one could merely be specifying an alternate audio track:

<switch>
   <audio href="joe-audio-better-quality" bitrate="16000" />
   <audio href="joe-audio" bitrate="8000" />
</switch>

It would also be possible to use the switches inside the <par> and <seq> elements. The following would be valid:

<par>
  <text ...../>
  <switch>
    <audio href="joe-audio-better-quality" bitrate="16000" />
    <audio href="joe-audio" bitrate="8000" />
  </switch>
  <video ..../>
</par>

6.7 Link Elements

Syntax

link = inline | out-of-line

Relation to XML linking

The link element allows the description of navigational links between objects. SMIL linking is based upon the linking concepts described in the XML Linking draft (XLL).

This specification uses the terms resource, linking element, locator, in-line link and out-of-line link as defined in XLL. It aims to provide minimal link functionality, which can be extended by future versions of SMIL or by individual applications. SMIL provides for both in-line as well as out-of-line link elements. Links are limited to uni-directional single-headed links (i.e. all links have exactly one source and one destination resource). All links in SMIL are actuated by the user (i.e. all links have an implicit attribute actuate="user").

Handling of Links in Embedded Documents

Due to its integrating nature, the presentation of an SMIL document may involve other (non-SMIL) applications or plug-ins. For example, an SMIL browser may use an HTML plug-in to display an embedded HTML page. Vice versa, an HTML browser may use an SMIL plug-in to display an SMIL document embedded in an HTML page.

In such presentations, links may be defined by documents at different levels and conflicts may arise. In this case, the link defined by the containing document should take precedence over the link defined by the embedded object. Note that since this might require communication between the browser and the plug-in, SMIL implementations may choose not to comply with this recommendation.

If a link is defined in an embedded SMIL document, traversal of the link affects only the embedded SMIL document.

If a link is defined in a non-SMIL document which is embedded in an SMIL document, link traversal can only affect the presentation of the embedded document and not the presentation of the containing SMIL document. This restriction may be relaxed in future versions of SMIL.

Addressing

SMIL uses the locator syntax defined in XLL. Support for the name fragment identifier and the '#' connector is required; support for XPointers and the "|" connector is optional. In practice, this means that SMIL only requires support for locators as currently used in HTML (e.g. it uses locators of the form "http://foo.com/some/path#anchor1").

Note that an SMIL document may use an anchor in another document not only as the destination of a link (as in HTML), but also as the source of a link (by using an out-of-line link).

Linking to SMIL Fragments

A locator that points to an SMIL document may contain a fragment part (e.g. http://www.w3.org/test.smi#par1). The fragment part is an id attribute that identifies one of the elements within the referenced SMIL document. If a link containing a fragment part is followed, the presentation should start as if the user had fast-forwarded to the beginning of the designated fragment in the destination document.
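
As a non-normative illustration, a player might separate the document part of a locator from its fragment part as in the following Python sketch. The function name split_locator is an assumption of this sketch; urldefrag is part of the Python standard library.

```python
from urllib.parse import urldefrag


def split_locator(locator):
    """Split a locator into (document URL, fragment id or None)."""
    document, fragment = urldefrag(locator)
    return document, (fragment or None)
```

For the locator http://www.w3.org/test.smi#par1, the sketch returns the document URL http://www.w3.org/test.smi and the fragment par1, which is then looked up as an id attribute in the referenced document.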

6.8 Inline Link Element

Syntax

inline              = "<a" *inline-attribute (">" *src-element "</a" ">" | "/>")
src-element         = comment | schedule | switch
                     ; link element not in this list, 
                     ; since inline links cannot be nested
inline-attribute    = link-attribute |  show | inline-xml-link-def 
                      | inline-inline-def
link-attribute      = id | href 
show                = "show" "=" (<">show-value<"> | "'"show-value"'")
show-value          = "replace" | "new" | "pause"
; the following rules are added to handle normalized XML documents containing 
; the default values for these attributes
inline-xml-link-def = "xml-link" "=" (<">"simple"<"> | "'""simple""'")
inline-inline-def   = "inline" "=" (<">"true"<"> | "'""true""'")

General Semantics

An inline link element has implicit attributes xml-link="simple" and inline="true". It has an attribute "show" that controls the temporal behavior of the source when the link is followed. Otherwise, its functionality is identical to the functionality of the a-element in HTML. Inline links may not be nested. The inline link element must have an href attribute.

Attributes

show

This attribute controls the behavior of the source document containing the link when the link is followed. It can have one of the following values:

"replace" (default): The presentation of the destination resource replaces the current presentation.
"new": The presentation of the destination resource starts in a new context, not affecting the source resource.
"pause": The source presentation is paused, and the presentation of the destination resource starts in a new context. Since there should be a way to restart a paused presentation, each SMIL browser window is required to provide a method to at least restart a presentation (e.g. a play or pause toggle button). Note that the PAUSE value is not defined by XLL.

Support for the EMBED value as defined by XLL is not required.

Examples

Example 1

The link starts up the new presentation replacing the presentation that was playing.

<a href="http://www.cwi.nl/somewhereelse.smi">
     <video href="rtsp://foo.com/graph.imf" loc="l_window"/>
</a>

The first line defines the destination of the link. The second line is the next video item in the SMIL presentation (inline). The third line is the end of the link.

In the example, the second line can be replaced by a reference to any valid subtree of an SMIL presentation.

Example 2

The link starts up the new presentation in addition to the presentation that was playing.

<a href="http://www.cwi.nl/somewhereelse.html" show="new">
     <video href="rtsp://foo.com/graph.imf" loc="l_window"/>
</a>

This allows SMIL to spawn off an HTML browser, for example.

Example 3

The link starts up the new presentation and pauses the presentation that was playing.

<a href="http://www.cwi.nl/somewhereelse.smi" show="pause">
     <video href="rtsp://foo.com/graph.imf" loc="l_window"/>
</a>

6.9 Out-of-line Link Element

Syntax


out-of-line = "<hlink" *out-of-line-attribute (">" 2anchor "</hlink" ">" | "/>")
              ; limit to single headed links means only two anchors allowed
out-of-line-attribute = link-attribute | out-of-line-xml-link-def 
                        | out-of-line-inline-def
; the following rules are added to handle normalized XML documents containing 
; the default values for these attributes
out-of-line-xml-link-def = "xml-link" "=" (<">"extended"<"> | "'""extended""'")
out-of-line-inline-def   = "inline" "=" (<">"false"<"> | "'""false""'")

General Semantics

Out-of-line links serve to associate a single source element with a single destination element. In contrast to an in-line link, the source element is identified indirectly by an anchor element in the content of the out-of-line element. The content of an out-of-line link must contain exactly one source anchor and one destination anchor. Source and destination are discriminated by the value of the anchor's "role" attribute.

In terms of the XML linking specification, the <hlink> element is an extended out-of-line link, i.e. it has implicit attributes xml-link="extended" and inline="false".

The show attribute is optional. Its default value is "replace".
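
The requirement that an out-of-line link contains exactly one source anchor and one destination anchor can be checked as in the following non-normative Python sketch. The function name link_endpoints is an assumption of this sketch. It relies on the role values "src" and "dst" defined in 6.9.1 and uses the XML parser of the Python standard library.

```python
import xml.etree.ElementTree as ET


def link_endpoints(hlink_xml):
    """Return (source href, destination href) of an hlink element.

    Raises ValueError unless the element contains exactly one anchor
    with role "src" and exactly one anchor with role "dst".
    """
    hlink = ET.fromstring(hlink_xml)
    anchors = hlink.findall("anchor")
    roles = sorted(anchor.get("role", "") for anchor in anchors)
    if roles != ["dst", "src"]:
        raise ValueError("expected one src and one dst anchor")
    by_role = {anchor.get("role"): anchor.get("href") for anchor in anchors}
    return by_role["src"], by_role["dst"]
```

The order of the two anchors within the hlink element does not matter to this check, since source and destination are told apart by the role attribute alone.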

6.9.1 Anchor Element

Syntax

anchor           = "<anchor" *anchor-attribute ("/>" | "></anchor>")
anchor-attribute = id | href | role | anchor-locator-def
role             = "role" "=" (<">role-value<"> | "'"role-value"'")
role-value       = "src" | "dst"
; the following rule is added to handle normalized XML documents containing 
; the default values for this attribute
anchor-locator-def = "xml-link" "=" (<">"locator"<"> | <'>"locator"<'>)

General Semantics

The anchor element is used to locate the source and the destination resources in an out-of-line link element. It has an implicit attribute xml-link="locator".

Attributes

role

This attribute specifies whether the respective anchor element is the source or the destination of the out-of-line link. It can have one of the following values:

"src": The element is the source of the link.
"dst": The element is the destination of the link.

Example Use

Example 1

The following example achieves the same functionality as Example 1 in the previous section, but uses an out-of-line link instead of an inline link:

     <video id="graph" href="rtsp://foo.com/graph.imf" loc="l_window"/>
     .
     .
     .
     <hlink>
	<anchor role="src" href="#graph"/>
	<anchor role="dst" href="http://www.cwi.nl/somewhereelse.smi"/>
     </hlink>

Example 2

The following example contains an out-of-line link from an element in presentation A to the middle of another presentation B. Following the link plays presentation B starting from the point where the designated fragment begins (i.e. the presentation starts as if the user had fast-forwarded to the beginning of the designated fragment in the destination document).

Presentation A:

     <video id="graph" href="rtsp://foo.com/graph.imf" loc="l_window"/>
     ...
     ...
     <hlink>
         <anchor role="dst" href="http://www.cwi.nl/mm/presentationB#next"/>
                                                                     ^^^^^
         <anchor role="src" href="#graph"/>
     </hlink>

Presentation B:

      ...
      <seq>
        <par>
          <video href="rtsp://foo.com/graph.imf" loc="l_window"/>
          <video href="rtsp://foo.com/anchor.rm" loc="r_window"/>
          <text href="rtsp://foo.com/caption1.html" loc="l_1_title"/>
          <text href="rtsp://foo.com/caption2.rtx" loc="r_1_title"/>
        </par>
        <par>
          <video href="rtsp://foo.com/timbl.rm" loc="l_window"/>
          <video id="next" href="rtsp://foo.com/v1.rm" loc="r_window"/>
                 ^^^^^^^^^
          <text href="rtsp://foo.com/caption1.html" loc="l_2_title"/>
          <text href="rtsp://foo.com/caption2.rtx" loc="r_2_title"/>
        </par>
      </seq>
      ...
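
As an informal illustration of how the two anchors in Example 2 might be interpreted, the following Python sketch splits each href into a document part and a fragment identifier using urllib.parse.urldefrag from the standard library. The function name split_anchor is hypothetical; how a player actually seeks to the fragment is outside the scope of this sketch.

```python
from urllib.parse import urldefrag

def split_anchor(href):
    """Split an anchor href into (document, fragment).

    An empty document part means that the anchor points into the
    presentation that contains the link.
    """
    document, fragment = urldefrag(href)
    return document, fragment

src = split_anchor("#graph")
dst = split_anchor("http://www.cwi.nl/mm/presentationB#next")
```

Here the source resolves to the element with id "graph" in presentation A, and the destination to the element with id "next" in presentation B.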

7 Appendix

7.1 SMIL Basic Layout

This is a normative appendix of this specification.

Tuner Element

Syntax

tuner           = "<tuner" *tuner-attribute ("/>" | "></tuner" ">")
tuner-attribute = id | left | top | z | width | height
left            = "left" "=" (<">*DIGIT[%]<"> | <'>*DIGIT[%]<'>)
top             = "top" "=" (<">*DIGIT[%]<"> | <'>*DIGIT[%]<'>)
z               = "z" "=" (<">*DIGIT<"> | <'>*DIGIT<'>)
height          = "height" "=" (<">*DIGIT[%]<"> | <'>*DIGIT[%]<'>)
width           = "width" "=" (<">*DIGIT[%]<"> | <'>*DIGIT[%]<'>)

General Semantics

The tuner element controls the position and size of a visual media object (for example text, image, video) within a rendering window. If a media object overlaps with the borders of the rendering window, the parts that are outside of the rendering window are clipped.

The tuner element and its attributes define the SMIL basic layout language. The type identifier for this language is "text/smil-basic". Note that this is not a registered MIME type, and probably will not become one in the future; a MIME-like notation was chosen to be consistent with the type specification for other layout languages.

All tuner elements must have an id attribute. Media objects that use a particular tuner element reference its id in their loc attribute.
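
The relation between loc attributes and tuner ids can be checked mechanically. The following Python sketch is an informal illustration only: the function name undefined_locs is hypothetical, and the document in the sketch is a made-up fragment that deliberately contains one dangling reference.

```python
import xml.etree.ElementTree as ET

MEDIA_OBJECTS = ("audio", "video", "text", "img", "ref")

def undefined_locs(smil_root):
    """List the loc values of media objects that match no tuner id."""
    tuner_ids = {tuner.get("id") for tuner in smil_root.iter("tuner")}
    missing = []
    for elem in smil_root.iter():
        loc = elem.get("loc")
        if elem.tag in MEDIA_OBJECTS and loc is not None and loc not in tuner_ids:
            missing.append(loc)
    return missing

DOCUMENT = """<smil>
  <head>
    <layout type="text/smil-basic">
      <tuner id="left-video" left="20" top="50" z="2"/>
    </layout>
  </head>
  <body>
    <par>
      <video href="tim-video" loc="left-video"/>
      <text href="tim-text" loc="left-text"/>
    </par>
  </body>
</smil>"""

dangling = undefined_locs(ET.fromstring(DOCUMENT))
```

In this fragment only "left-text" is reported, because no tuner element with that id exists.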

In order to select the default layout values for all elements in a document, the document must contain an empty layout section, for example:

"<layout type="text/smil-basic"></layout>"

Relation to CSS

CSS is one of the alternative layout languages that can be supported by SMIL implementations.

The working group has seriously studied the alternative of using CSS positioning directly for laying out SMIL media objects, instead of introducing SMIL basic layout. This was preliminarily rejected for several reasons:

It was unclear whether the hierarchically nested coordinate spaces of CSS positioning are useful in the temporal domain.
It was felt that the likelihood of getting initial implementations of SMIL would be higher if it did not require support for CSS positioning.
It is unclear whether introducing a new syntax for setting the value of only five attributes (left, top, z, height, width) is worthwhile.

These issues require further study, and it was felt that a detailed evaluation of the suitability of CSS at this point would delay work on the core functionality of SMIL, namely its synchronization features. However, the issue is still under consideration, and the decision for "smil-basic" is not final, like any other decision reflected in this document.

Differences between SMIL basic layout and CSS positioning include:

In SMIL basic, positioning is relative to a single coordinate space, which is defined by the rendering window.
In SMIL basic, only media objects can be positioned, i.e. leaf elements of the document tree.

SMIL basic layout is identical to CSS positioning in the following points:

Both use the same names for the values that can be set (left, top, z, width, height)
Positioning is relative to the top left corner

Attributes

left: This attribute specifies the offset of the element relative to the left border of the rendering window (see Figure 5.1). Its value is given in terms of pixels or as a percentage value of the rendering window's width. The default offset is 0 pixels.
top: This attribute specifies the offset of the element relative to the top border of the rendering window (see Figure 5.1). Its value is given in terms of pixels or as a percentage value of the rendering window's height. The default offset is 0 pixels.

 

                     Rendering Window
                          |
 0,0                     \ /
   +-----------------------------------+  / \
   |       / \                         |   |    
   |        |                          |   |
   |        |   top                    |   |            
   |        |                          |   |
   |       \ /                         |   |
   | left |-------------|  / \         |   | Rendering Height
   |<---->|             |   |          |   |
   |      |             |   |  height  |   |
   |      |             |   |          |   |
   |      |-------------|  \ /         |   |
   |                                   |   |
   |      <------------->              |   |
   |           width                   |   |
   +-----------------------------------+  \ /

   <----------------------------------->
          Rendering Width

Figure 5.1: Semantics of left, top, width and height attributes

width: This attribute specifies the width of the space in which the object is rendered (see Figure 5.1). Its value can be specified either in pixels or as a percentage of the total width of the rendering window in which the object appears. For determining a default value, two cases can be distinguished. First, the default value can be the natural width of the object, i.e. the width stored in the object's data. Second, if no natural width is available, the default value is the difference between the "left" coordinate of the object and the right border of the window.
height: This attribute specifies the height of the space in which the object is rendered (see Figure 5.1). Its value can be specified either in pixels or as a percentage of the total height of the rendering window in which the object appears. For determining a default value, two cases can be distinguished. First, the default value can be the natural height of the object, i.e. the height stored in the object's data. Second, if no natural height is available, the default value is the difference between the "top" coordinate of the object and the bottom border of the window.
z: This attribute specifies the stacking order of elements in the case that their rendering spaces overlap. Its value is a positive integer. Elements are stacked in order of increasing z value. The default value is 1. A document containing overlapping elements with identical z values is invalid.
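
The default rules for left, top, width and height can be summarized in the following informal Python sketch. It is not normative. The function names, the representation of the rendering window and of the natural size as (width, height) tuples, and the rounding of percentages by integer division are all assumptions of this sketch; the text above does not prescribe a rounding rule.

```python
def resolve_length(value, total):
    """Convert a tuner length ("20" pixels or "50%") into pixels.

    Percentages are rounded down (an assumption of this sketch).
    """
    if value.endswith("%"):
        return int(value[:-1]) * total // 100
    return int(value)

def resolve_tuner(attrs, window, natural=None):
    """Return (left, top, width, height) in pixels for one tuner.

    attrs   -- dict of the attributes present on the tuner element
    window  -- (width, height) of the rendering window
    natural -- (width, height) stored in the object's data, or None
    """
    win_w, win_h = window
    nat_w, nat_h = natural if natural is not None else (None, None)
    left = resolve_length(attrs.get("left", "0"), win_w)   # default: 0 pixels
    top = resolve_length(attrs.get("top", "0"), win_h)     # default: 0 pixels
    if "width" in attrs:
        width = resolve_length(attrs["width"], win_w)
    elif nat_w is not None:
        width = nat_w                                       # natural width
    else:
        width = win_w - left                                # up to right border
    if "height" in attrs:
        height = resolve_length(attrs["height"], win_h)
    elif nat_h is not None:
        height = nat_h                                      # natural height
    else:
        height = win_h - top                                # up to bottom border
    return left, top, width, height

no_natural_size = resolve_tuner({"left": "20", "top": "50"}, (320, 240))
with_natural_size = resolve_tuner({"left": "50%", "width": "25%"},
                                  (320, 240), natural=(100, 80))
```

In the second call the explicit width takes precedence over the natural width, while the height falls back to the natural height of the object.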

7.2 Augmented BNF Notation

This is an informative part of this specification.

All of the mechanisms specified in this document are described in both prose and an augmented Backus-Naur Form (BNF) similar to that used by RFC 2068 (HTTP/1.1) and RFC 822. Implementers will need to be familiar with the notation in order to understand this specification. The augmented BNF includes the following constructs:

name = definition: The name of a rule is simply the name itself (without any enclosing "<" and ">") and is separated from its definition by the equal "=" character. Whitespace is only significant in that indentation of continuation lines is used to indicate a rule definition that spans more than one line. Certain basic rules are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle brackets are used within definitions whenever their presence will facilitate discerning the use of rule names.
"literal": Quotation marks surround literal text. Unless stated otherwise, the text is case-insensitive.
rule1 | rule2: Elements separated by a bar ("|") are alternatives, for example, "yes | no" will accept yes or no.
(rule1 rule2): Elements enclosed in parentheses are treated as a single element. Thus, "(elem (foo | bar) elem)" allows the token sequences "elem foo elem" and "elem bar elem".
*rule: The character "*" preceding an element indicates repetition. The full form is "<n>*<m>element" indicating at least <n> and at most <m> occurrences of element. Default values are 0 and infinity so that "*(element)" allows any number, including zero; "1*element" requires at least one; and "1*2element" allows one or two.
[rule]: Square brackets enclose optional elements; "[foo bar]" is equivalent to "*1(foo bar)".
N rule: Specific repetition: "<n>(element)" is equivalent to "<n>*<n>(element)"; that is, exactly <n> occurrences of (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three alphabetic characters.
#rule: A construct "#" is defined, similar to "*", for defining lists of elements. The full form is "<n>#<m>element" indicating at least <n> and at most <m> elements, each separated by one or more commas (",") and optional linear whitespace (LWS). This makes the usual form of lists very easy; a rule such as "( *LWS element *( *LWS "," *LWS element ))" can be shown as "1#element". Wherever this construct is used, null elements are allowed, but do not contribute to the count of elements present. That is, "(element), , (element)" is permitted, but counts as only two elements. Therefore, where at least one element is required, at least one non-null element must be present. Default values are 0 and infinity so that "#element" allows any number, including zero; "1#element" requires at least one; and "1#2element" allows one or two.
; comment: A semi-colon, set off some distance to the right of rule text, starts a comment that continues to the end of line. This is a simple way of including useful notes in parallel with the specifications.
implied *LWS: The grammar described by this specification is token-based. Except where noted otherwise, linear whitespace (LWS) can be included between any two adjacent tokens, and between adjacent tokens and delimiters (tspecials), without changing the interpretation of a field. At least one delimiter (tspecials) must exist between any two tokens, since they would otherwise be interpreted as a single token.
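
For readers more familiar with regular expressions than with this notation, the repetition and list constructs can be approximated as in the following informal Python sketch. The helper names repetition and count_list_elements are hypothetical, and the translation to regular expressions is only an approximation that ignores implied LWS.

```python
import re

DIGIT = "[0-9]"  # the DIGIT basic rule written as a regular expression

def repetition(element, minimum=0, maximum=None):
    """Compile "<n>*<m>element" into a regular expression.

    Omitted bounds default to 0 and infinity, as in the notation.
    """
    upper = "" if maximum is None else str(maximum)
    return re.compile(f"(?:{element}){{{minimum},{upper}}}")

def count_list_elements(text):
    """Count the elements of a "#rule" list, ignoring null elements."""
    return sum(1 for item in text.split(",") if item.strip())

one_or_two_digits = repetition(DIGIT, 1, 2)  # 1*2DIGIT
any_digits = repetition(DIGIT)               # *DIGIT
two_digits = repetition(DIGIT, 2, 2)         # 2DIGIT
```

With these definitions, "1*2DIGIT" accepts "42" but not "123", and the list "a, , b" counts as two elements because the null element between the commas is ignored.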

Basic Rules

(section must be updated to comply with XML)

The following rules are used throughout this specification to describe basic parsing constructs. The US-ASCII coded character set is defined by ANSI X3.4-1986.

           OCTET          = <any 8-bit sequence of data>
           CHAR           = <any US-ASCII character (octets 0 - 127)>
           UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
           LOALPHA        = <any US-ASCII lowercase letter "a".."z">
           ALPHA          = UPALPHA | LOALPHA
           DIGIT          = <any US-ASCII digit "0".."9">
           CTL            = <any US-ASCII control character
                            (octets 0 - 31) and DEL (127)>
           CR             = <US-ASCII CR, carriage return (13)>
           LF             = <US-ASCII LF, linefeed (10)>
           SP             = <US-ASCII SP, space (32)>
           HT             = <US-ASCII HT, horizontal-tab (9)>
           <">            = <US-ASCII double-quote mark (34)>
           CRLF           = CR LF
           LWS            = [CRLF] 1*( SP | HT ) 
           TEXT           = <any OCTET except CTLs,
                            but including LWS>
           token          = 1*<any CHAR except CTLs or tspecials>
 
           tspecials      = "(" | ")" | "<" | ">"
                          | "[" | "]" | "?" | "="
                          | SP | HT

7.3 SMIL DTD

This is a normative appendix of this specification.

<!--

    This is a draft and experimental XML document type 
    definition for SMIL 1.0.  
  
        Draft:  $Date: 1997/11/03 17:22:30 $ ($Revision: 1.1 $)

        Author: Jacco van Ossenbruggen <jrvosse@cwi.nl>
  
    This is work in progress, subject to change at any time.
    Further information about SMIL 1.0 is available at:

          http://www.w3.org/AudioVideo/

-->

<!--=================== SMIL Document =====================================-->
<!--
     The root element SMIL contains all other elements.
     The lipsync attribute here provides a global default 
     for the lipsync attribute of the par elements.
     A default for this global attribute is implementation dependent:
-->
<!ELEMENT smil (head?,body?)>
<!ATTLIST smil
        id      ID              #IMPLIED
        lipsync (true|false)    #IMPLIED
>

<!--=================== The Document Head =================================-->
<!ENTITY % layout-section "layout|switch">
<!ENTITY % head-element "%layout-section;">

<!ELEMENT head ((%head-element;)*)>
<!ATTLIST head id ID #IMPLIED>


<!--=================== Layout Element ====================================-->
<!ELEMENT layout ANY>
<!ATTLIST layout
        id   ID         #IMPLIED
        type CDATA      "text/smil-basic"
>


<!--=================== Tuner Element =====================================-->
<!ELEMENT tuner EMPTY>
<!ATTLIST tuner
        id      ID      #REQUIRED
        left    CDATA   "0"
        top     CDATA   "0"
        z       CDATA   "1"
        height  CDATA   #IMPLIED
        width   CDATA   #IMPLIED
>



<!--=================== The Document Body =================================-->
<!ENTITY % media-object "audio|video|text|img|ref">
<!ENTITY % schedule "par|seq|(%media-object;)">
<!ENTITY % inline-link "a">
<!ENTITY % out-of-line-link "hlink">
<!ENTITY % link "%inline-link;|%out-of-line-link;">
<!ENTITY % container-content "(%schedule;)|switch|(%link;)">
<!ENTITY % body-content
"(%out-of-line-link;)*,(%schedule;|switch),(%out-of-line-link;)*">

<!ELEMENT body (%body-content;)>
<!ATTLIST body id ID #IMPLIED>


<!--=================== Synchronization Attributes ========================-->
<!ENTITY % sync-attributes "
        begin   CDATA   #IMPLIED
        end     CDATA   #IMPLIED
">

<!--=================== The Parallel Element ==============================-->
<!-- The default for par's lipsync attribute is the value 
     of the lipsync attribute of the SMIL root element.
-->
<!ENTITY % par-content "%container-content;">
<!ELEMENT par    (%par-content;)*>
<!ATTLIST par    
        id      ID              #IMPLIED
        lipsync (true|false)    #IMPLIED
        endsync CDATA           "last"
        dur     CDATA           #IMPLIED
        repeat  CDATA           "1"
        %sync-attributes;
>

<!--=================== The Sequential Element ============================-->
<!ENTITY % seq-content "%container-content;">
<!ELEMENT seq    (%seq-content;)*>
<!ATTLIST seq    
        id      ID      #IMPLIED
        dur     CDATA   #IMPLIED
        repeat  CDATA   "1"
        %sync-attributes;
>

<!--=================== The Switch Element ================================-->
<!-- In the head, a switch may contain only layout elements,
     in the body, only container elements. However, this
     constraint cannot be expressed in the DTD (?), so
     we allow both:
-->
<!ENTITY % switch-content "layout|(%container-content;)">
<!ELEMENT switch (%switch-content;)*>
<!ATTLIST switch id ID #IMPLIED>

<!--=================== Media Object Elements =============================-->
<!-- SMIL only defines the structure. The real media data is 
     referenced by the href attribute of the media objects.
     The media objects have the following link attributes 
     for XML linking compatibility:
-->
<!ENTITY % mo-XML-link-atts '
        xml-link        (simple)        #FIXED "simple"
        show            (embed)         #FIXED "embed"
        actuate         (auto)          #FIXED "auto"
        inline          (true)          #FIXED "true"
'>

<!-- Furthermore, they have the following attributes as defined
     in the SMIL draft
-->
<!ENTITY % mo-attributes "
        href    CDATA   #REQUIRED
        type    CDATA   #IMPLIED
        loc     CDATA   #IMPLIED
        id      ID      #IMPLIED
        dur     CDATA   #IMPLIED
        repeat  CDATA   '1'
        %sync-attributes;
        %mo-XML-link-atts;
">

<!-- All info is in the attributes, media objects are empty elements: -->
<!ELEMENT audio EMPTY>
<!ELEMENT video EMPTY>
<!ELEMENT text  EMPTY>
<!ELEMENT img   EMPTY>
<!ELEMENT ref   EMPTY>

<!ATTLIST audio %mo-attributes;>
<!ATTLIST video %mo-attributes;>
<!ATTLIST text  %mo-attributes;>
<!ATTLIST img   %mo-attributes;>
<!ATTLIST ref   %mo-attributes;>


<!--=================== Link Elements =====================================-->
<!-- These should all conform to the XML linking elements -->


<!--=================== Inline Link Element ===============================-->
<!ELEMENT a (%media-object;|seq|par|switch)*>
<!ATTLIST a
        href            CDATA                   #REQUIRED
        id              ID                      #IMPLIED
        role            CDATA                   #IMPLIED
        content-role    CDATA                   #IMPLIED
        content-title   CDATA                   #IMPLIED
        behavior        CDATA                   #IMPLIED
        show            (replace|new|pause)     "replace"
        xml-link        (simple)                #FIXED "simple"
        actuate         (user)                  #FIXED "user"
        inline          (true)                  #FIXED "true"
>

<!--=================== Out-of-line Link Element ==========================-->
<!ELEMENT hlink (anchor,anchor)>
<!ATTLIST hlink
        id ID #IMPLIED
        show            (replace|new|pause)     "replace"
        role            CDATA                   #IMPLIED
        content-role    CDATA                   #IMPLIED
        content-title   CDATA                   #IMPLIED
        behavior        CDATA                   #IMPLIED
        xml-link        (extended)              #FIXED "extended"
        actuate         (user)                  #FIXED "user"
        inline          (false)                 #FIXED "false"
>


<!--=================== Anchor Element ====================================-->
<!ELEMENT anchor EMPTY>
<!ATTLIST anchor 
        id              ID                      #IMPLIED
        href            CDATA                   #REQUIRED
        role            (src|dst)               #REQUIRED
        behavior        CDATA                   #IMPLIED
        xml-link        (locator)               #FIXED "locator"
        actuate         (user)                  #FIXED "user"
>

7.4 Example: Interactive Newscast on Growth of the Web

This is an informative appendix of this specification.

Imagine a news broadcast on the growth of the Web. In the first scene (see left hand side of Figure 7.1), a graph on the left hand side of the screen displays the growth of the Web. The right hand side of the screen is taken up by a video of an anchor person commenting on the graph. The graph and the commentator's video are set up on a background.

In the second scene (see right hand side of Figure 7.1), the graph is replaced by a video showing Tim Berners-Lee, and the anchor person starts to interview him. During the interview, the user can click on Tim's video, and Tim's homepage will be brought up (via a hyperlink).

Figure 7.1: Interactive newscast screenshots

Figure 7.2 shows the time line and some of the media components used in this presentation:

Figure 7.2: Schedule for interactive newscast scenario

Some components do not appear in the figure:

the image "Web Growth" is shown from time 0:00 to time 1:00
the text "Web Growth" is shown once the image has reached its final position
the text "Joe Doe" is shown from time 0:00 until the end of the Joe Doe Video
the text "Tim B.-Lee" is shown while the video of Tim is being displayed
the user can follow a hyperlink connected to Tim's video and text while they are shown

This scenario can be implemented using the following SMIL document:

<smil>
  <head>
    <layout type="text/smil-basic">
      <tuner id="left-video" left="20" top="50" z="2"/>
      <tuner id="left-text" left="20" top="120" z="2"/>
      <tuner id="right-video" left="150" top="50" z="2"/>
      <tuner id="right-text" left="150" top="120" z="2"/>
    </layout>
  </head>
  <body>
    <par>
      <img   href="bg"/>
      <seq>
        <par>
          <img  href="graph" loc="left-video" dur="45s"/>
          <text href="graph-text" loc="left-text"/>
        </par>
        <par>
          <a href="http://www.w3.org/People/Berners-Lee">
            <video href="tim-video" loc="left-video"/>
          </a>
          <text href="tim-text" loc="left-text"/>
        </par>
      </seq>
      <seq>
        <audio href="joe-audio"/>
        <audio href="tim-audio"/>
      </seq>
      <video id="jv" href="joe-video" loc="right-video"/>
      <text  href="joe-text" loc="right-text"/>
    </par>
  </body>
</smil>

Copyright © 1997 W3C (MIT, INRIA, Keio ), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.
