Subject : URL Embedding into Realtime Video and Audio Streams











A Distributed Network Approach and Evolution to HTML for New Media




Authors and Contributors

Rob Martell
Lingbo Cao
Sofia Mihaylova
Paul Zagrodney

@ Digital Renaissance Inc

190 Liberty Street, 2nd Floor

Toronto, Canada




Abstract

The standards for embedding URLs into audio and video media should be designed to take advantage of a distributed network system. This will allow for the natural evolution of simple URL anchors into HTML for new media. It is proposed that the approach to anchors be expanded to form a distributed model, with the anchors residing in a data source in the network, the media ( streamed or downloaded ) in its own file elsewhere in the network, and intelligent players handling media/anchor synchronization.

By separating the URL anchors from the media and placing them in the network, the anchors can be manipulated, changed, made dynamic, or even personalized based on personal profile or geographic location. This allows simple URLs to become part of a set of anchor commands that can offer significantly more flexibility than embedded URL anchors.

Key Words

HTML for video, HTML for audio, Hypervideo, Embedded URLs, video, audio.

1.0 Introduction

World Wide Web development has taken several new and exciting directions in the past two years. One of the most exciting developments is the field of linear or time-based media on the Web. The focus of most of the research has been to provide the best quality video/audio streams and files within limited bandwidth, given the network characteristics of the Web. Without continued advancements in this area, linear media would be restricted in how it could be used.

At present, two methods are used for deploying linear ( video/audio ) media throughout the World Wide Web. The choice of method on a particular web site is usually based on what the site owners want to accomplish with the media. The two categories of media presentation are:

1: Download the content and play it locally on the user's client machine.

2: Stream the content to the user's client and play it as soon as possible given appropriate buffering ( broadcast or uni-cast approach ).

However, with significant focus already on streaming media architectures, improved synchronization, delivery of broadcast streams, and compression techniques, there is now the opportunity to begin focusing on making the media interactive.

It is proposed that video and audio ( and other forms of linear media ) be given the opportunity to evolve into flexible and interactive media. This would be accomplished by encouraging networked architectures and establishing standards that enable video and audio media to act upon markup languages in a distributed network environment. This is the same direction that text and images took as the web began to evolve. It is video and audio media's time to follow the same evolutionary path.

2.0 Anchor Requirements

International communities of interest, and their desire to have multiple types of systems work within the Web, have led users, developers, and content/information providers to rush to add Web functionality. The idea that information and content can be accessible in many different forms on many types of systems has been a very powerful development force. These requirements have driven the need for standards and network designs that are based on an open network.

Open network connectivity is the founding principle behind this proposal that URL anchors be networked rather than limited to just being embedded in a media file. With a distributed network approach, embedding the URLs into the media can still be supported as one choice of deployment, usable in certain niche environments.

The standard approach would be to access a networked data source to obtain the URLs. It is contended that the evolution of linear media should be tied to a model that promotes network connectivity and distribution of information, thereby building on the principles of the World Wide Web.

Basic requirements for anchors in media files would be seen as the following:

1: The ability to play a media file and have a URL reference embedded into the media.

2: The ability to have the URL launched by the player or browser at a specific point in the playing of the media.

If URLs were only to be embedded in the media type, the critical issue attached to the above features would be the format and location of that information. However, it is proposed that a flexible anchor standard with a distributed network design could pave the way for additional, highly interactive anchor commands.

To encourage a mindset of application flexibility, it is suggested that embedded URL commands be termed one of a set of anchor commands. The potential for anchor command functionality goes well beyond basic URL embedding. Additional anchor commands could include features such as:

1: Download a media file, play it, and have an unlimited ( although impractical ) number of anchors referenced from the media stream

  1. nested anchors
  2. overlapping anchors

2: Stream a media file and have an unlimited number of anchors referenced from the media stream

  1. nested anchors
  2. overlapping anchors
  3. realtime anchors

3: Anchors should be able to overlap ( several anchors active at the same time, with different or identical start and end points )

4: Anchors should be defined as more than URLs. Anchors should take several forms:

  1. URL
  2. Executable File type
  3. Application specific API
  4. Media manipulation commands ( within the media presented )
  5. other

5: Anchors should be separate from the media file to allow for

  1. change in anchors based on specific user criteria
  2. change in anchors based on service provider triggers
  3. URL validation and site movement.
  4. change in anchors when new information via URLs is needed

6: The anchor system should be architected to allow for two-way distribution of elements

  1. should support anchors to be distributed to players
  2. should support anchors to be collected from web users
  3. should support anchors to be collected from players
  4. should support statistics to be collected from players ( anonymous )

7: Anchors can be applicable to all types of linear media - video, audio, animation, motion graphics, etc.

8: Anchor commands should support a flexible notification system to allow for custom graphics or sounds based on application-specific requirements

  1. sound notification
  2. graphical notification
  3. button notification
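
The overlap and multiple-form requirements above can be illustrated with a minimal Python sketch. It is not part of the proposal: the names AnchorKind, Anchor, and active_anchors are hypothetical, and it assumes anchor positions are expressed in seconds from the start of the media.

```python
from dataclasses import dataclass
from enum import Enum


class AnchorKind(Enum):
    """Anchor forms listed in requirement 4 (names are illustrative)."""
    URL = "url"
    EXECUTABLE = "executable"
    APPLICATION_API = "application_api"
    MEDIA_COMMAND = "media_command"


@dataclass(frozen=True)
class Anchor:
    kind: AnchorKind
    target: str    # URL, file name, API call, or media command
    start: float   # assumed unit: seconds from the start of the media
    end: float


def active_anchors(anchors, position):
    """Return every anchor whose interval covers the playback position.

    Because each anchor carries its own start and end, nested and
    overlapping anchors need no special handling.
    """
    return [a for a in anchors if a.start <= position < a.end]
```

A player would call active_anchors with the current playback position and present a notification for each anchor returned.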



3.0 Architectural Proposal

An open architecture to support the above principles ( a distributed data source, media separation, and an open anchor command language ) requires at least three issues to be addressed:

1: Player and communication structure

2: Data and Media Synchronization

3: Data Format

3.1 Player and Communication Structure

The communication paths for the three potential servers involved in playing the media do not have to rely on the same approach.

3.1.1 Web Server

No protocol or HTML changes are necessary for implementation into any web page. The file is referenced with normal HTML links ( embedded object or plug-in ). Activating the HTML link would cause the browser to open the player. The file extension type ( or embedded object ) would be recognized by the browser.

3.1.2 Media Server

The player's communication with the Media server ( a logical entity ) is based on the need for receiving the media. Current media players use a variety of communication methods ( UDP, TCP, VDP, etc. ). The protocol chosen is not affected by the networked data source approach; it depends solely on the type and method of media streaming ( or downloading ). Whichever protocol is chosen, no changes would be necessary to support the networked data source.

3.1.3 Data Source and Server

The player's communication with the Data Source server ( a logical entity ) is based on the need for receiving and sending data. As with the media communication, this path is established between the player ( initiator ) and the data source ( recipient ). The choice of communication protocol would depend on the performance required by the player and the applications. Methods of communication could be network-aware ODBC, ORPC, HTTP, RPC, or other network protocols for data source communication.

3.2 Data and Media Synchronization

The temptation with networked data is to attempt to synchronize the media and the anchors in realtime over the network. The World Wide Web's inherent architecture would not necessarily allow this approach to work effectively. Synchronization of the media with the anchors can instead be resolved through the use of data source snapshots. At launch ( or at other critical times ), the player would receive a snapshot of the data source and synchronize the media with the data locally. Network delay would then not be a factor in synchronization.
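
The snapshot approach can be sketched as follows. This is an illustration under stated assumptions, not a defined player interface: the class LocalAnchorScheduler and its methods are hypothetical, and the snapshot is assumed to be a list of ( start time in seconds, target ) pairs already fetched from the data source.

```python
import bisect


class LocalAnchorScheduler:
    """Fires anchors against the local playback clock only.

    The snapshot is fetched once, so no network round trip takes
    place while the media is playing.
    """

    def __init__(self, snapshot):
        # snapshot: iterable of (start_seconds, target) pairs
        self._entries = sorted(snapshot)
        self._starts = [start for start, _ in self._entries]
        self._next = 0  # index of the first anchor not yet fired

    def advance(self, position):
        """Return targets whose start time was reached since the last call."""
        upto = bisect.bisect_right(self._starts, position)
        fired = [target for _, target in self._entries[self._next:upto]]
        self._next = max(self._next, upto)
        return fired

    def seek(self, position):
        """Reposition after the user jumps to another point in the media."""
        self._next = bisect.bisect_left(self._starts, position)
```

The player would call advance on each tick of its own clock and launch whatever is returned; a fresh snapshot can replace the scheduler at any of the critical times mentioned above.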

3.3 File Type Formats

In the case where the player is to be launched as a browser plug-in, the anchor system would use executable text files to allow for command flexibility. Below is an example of a possible syntax. The server downloads a file ( file extension .tgg ) based on a user's request. This anchor command file is intended to look like HTML, and the player's plug-ins would act on these files much like browsers act on HTML files. The anchor file would contain at least the following elements:

1: Location of the Anchor database

<A DREF="URL_Format"></A>

HREF could be used, but another syntax element would then be needed to indicate that the HREF is for an anchor database. The URL would point to the database.

2: Location of the Media file

<A MREF="URL_Format"></A>

HREF could be used, but another syntax element would then be needed to indicate that the HREF is for a media file. The URL would point to the media file for download or streaming.

3: Frame Management

There are many ways of handling the navigation of launched URLs from the player. For flexibility, it is suggested that any options be specified by adding a syntax line to the .tgg file. The option would specify whether the URL is sent to an existing frame, a new page, or a new window.

<FRAME OPTION=1 TARGET="framename">

TARGET holds the name of the frame within the browser page. Options 2 and 3 may not need a frame or window name.

4: Notification

Notification that a URL anchor is present could be done through a variety of methods - visual, audio, automated, etc. To accommodate the notification options, the file syntax might need a number of parameters.

<NOTIFY><GRAPHIC>
<IMG SRC="URL_Format/firstgraphic.gif" BORDER="0">
</GRAPHIC></NOTIFY>

<NOTIFY><GRAPHIC>
<IMG SRC="URL_Format/secondgraphic.gif" BORDER="0">
</GRAPHIC></NOTIFY>

The notification system might have two fixed positions for graphics. The first <NOTIFY> places its graphic in the first location, and the second <NOTIFY> places its graphic in the second location. This command would not allow the locations themselves to be moved.

For audio notification, the syntax might be:

<NOTIFY><SOUND>
<A HREF = "URL_Format/firstsound.wav"></A>
</SOUND></NOTIFY>

<NOTIFY><SOUND>
<A HREF = "URL_Format/secondsound.wav"></A>
</SOUND></NOTIFY>
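
To show how a player plug-in might read such a file, the following Python sketch extracts the four elements with regular expressions. It is a sketch only: the function parse_tgg and the dictionary keys are hypothetical, the .tgg syntax is the illustrative one given above rather than a finished grammar, and the pattern for DREF and MREF accepts the attribute with or without an equals sign.

```python
import re

# Patterns for the illustrative .tgg syntax; all matching is case-insensitive.
_REF = re.compile(r'<A\s+(DREF|MREF)\s*=?\s*"([^"]*)"', re.IGNORECASE)
_FRAME = re.compile(
    r'<FRAME\s+OPTION\s*=\s*(\d+)(?:\s+TARGET\s*=\s*"([^"]*)")?',
    re.IGNORECASE,
)
_GRAPHIC = re.compile(r'<IMG\s+SRC\s*=\s*"([^"]*)"', re.IGNORECASE)
_SOUND = re.compile(r'<SOUND>\s*<A\s+HREF\s*=\s*"([^"]*)"', re.IGNORECASE)


def parse_tgg(text):
    """Collect the data source, media, frame, and notification settings."""
    result = {
        "data_source": None,
        "media": None,
        "frame_option": None,
        "frame_target": None,
        "graphics": [],
        "sounds": [],
    }
    for attribute, url in _REF.findall(text):
        key = "data_source" if attribute.upper() == "DREF" else "media"
        result[key] = url.strip()
    frame = _FRAME.search(text)
    if frame:
        result["frame_option"] = int(frame.group(1))
        result["frame_target"] = frame.group(2)  # None when TARGET is absent
    result["graphics"] = [url.strip() for url in _GRAPHIC.findall(text)]
    result["sounds"] = [url.strip() for url in _SOUND.findall(text)]
    return result
```

The order of the graphics and sounds lists follows the order of the NOTIFY entries in the file, which matches the first-location and second-location convention described above.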

In the case that the player is an embedded object, the reference can be made from within the HTML page itself, and there is no need for the separate file ( extension .tgg ):

<HTML>

<BODY>

<OBJECT ID="Xplayer1" WIDTH=x HEIGHT=x

CLASSID="CLSID:55D97B44-E559-11CF-8900-00AA000A3ED0">
<PARAM NAME="_Version" VALUE="65536">
<PARAM NAME="_ExtentX" VALUE="16157">
<PARAM NAME="_ExtentY" VALUE="8996">
<PARAM NAME="_StockProps" VALUE="128">
<PARAM NAME="MediaName" VALUE="">
<PARAM NAME="DataSource" VALUE="URL_Format/db.mdb">
<PARAM NAME="Frame" VALUE="framename">
<PARAM NAME="NotifyGraphic1" VALUE="URL_Format/firstgraphic.gif">
<PARAM NAME="NotifyGraphic2" VALUE="URL_Format/secondgraphic.gif">
<PARAM NAME="NotifySound1" VALUE="URL_Format/firstsound.wav">
<PARAM NAME="NotifySound2" VALUE="URL_Format/secondsound.wav">
</OBJECT>

</BODY>

</HTML>

The media file name, data source name and the frame reference information are parameters in the object syntax.

The use of HTML-like syntax is an example of a very flexible method of offering anchor commands to linear media. This format is not intended to create a new community of browsers but to reuse an open standard command syntax, and HTML sets a good standard. With the appropriate plug-ins or downloadable objects, current browsers should be able to support all the functionality discussed.

3.4 Growth to HTML for new Media

By establishing several principles in the proposed anchor architecture, there is the ability to support the natural evolution of linear media to a command structure that allows for the manipulation of new media ( video, audio, animation, etc. ). Anchor players could evolve to become interpreters of new media HTML.

The argument for HTML support for other media is that media such as video and audio are flat and have only very simple controls, like those of VCRs or tape recorders. The imagination of the development and service provider community would be offered tools to go far beyond embedded URLs. By supporting an open standard where the data is separate from the media, the system is open to new and innovative directions.

Presently, URL anchors represent the first entry into hyperlink references ( like the command HREF ). The future of interactive new media in a web environment will be determined in part by the desire to provide an open distributed network system that considers users and service providers as potential creators of interactive content.

4.0 Architectural Advantages

The advantages of a distributed data source network approach, with open command structure are many:

1: The anchor data sources can be rotated and changed for many reasons or purposes without the need for several copies of the media to be stored, streamed, or downloaded.

2: A single data source reference can be used for the same content, regardless of media format. Example: Entries are created for a movie trailer. The trailer is encoded as AVI for some PCs, QuickTime for Macs, and MPEG for a broadband network. If the player and database are structured properly, only a single database is needed, and it can be referenced by all media players.

3: The system is inherently bi-directional. In the spirit of the web, the system encourages collaboration, personalization, and personal participation in content creation.

4: The data source and the media do not have to exist on the same machine or even in the same country. The architecture encourages data and network distribution.

5: The architecture provides maximum growth potential. Simple URL embedding can be extended to bring a variation of HTML for video and HTML for audio into our tools for Internets and Intranets.

6: The system can be downward compatible and support URL embedding. There are some cases where embedding the URL may be the correct choice ( say, in the case of CD-ROM publishing ), where network connectivity cannot be assumed and all the data is local. A flexible publishing system would be able to offer a choice of collapsing the file into an embedded format.

7: The system will also allow for realtime annotation. Through the use of data source systems capable of collision detection, anchor data can be updated in realtime. Data source snapshots can provide up-to-date information to the players.

8: The media files are not modified. They can be played using standard players or with anchor-reading players.
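
One possible reading of the collision detection mentioned in advantage 7 is a versioned snapshot with a compare-before-write check. The Python sketch below assumes that interpretation; the class AnchorStore and its methods are hypothetical and stand in for whatever data source is actually used.

```python
class AnchorStore:
    """In-memory stand-in for a networked anchor data source."""

    def __init__(self):
        self._version = 0
        self._anchors = []

    def snapshot(self):
        """Return the current version together with a copy of the anchors."""
        return self._version, list(self._anchors)

    def update(self, expected_version, anchors):
        """Apply an update only if nobody else has written in the meantime.

        Returns False on a collision, in which case the caller should
        take a fresh snapshot and retry.
        """
        if expected_version != self._version:
            return False
        self._anchors = list(anchors)
        self._version += 1
        return True
```

Two annotators working from the same snapshot cannot silently overwrite each other: the second update is rejected, and players that take a new snapshot see only the accepted change.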

5.0 Proposal Summary & Recommendations

It is recommended that the subject of embedding URL anchors in streaming media be considered the starting point for the evolution of interactive linear media. The evolution of new media into web-based interactive models requires a small but significant set of recommendations.

The following set are proposed.

  1. A distributed network model be encouraged. The media and the anchors should be separated.
  2. A variation on HTML ( or supplement to ) be considered for the creation of an anchor command syntax. Standards are necessary for open development and connectivity.
  3. Embedded URLs in the media become one option among a stronger palette of development and service tools for turning video and audio content into interactive content.

This is the same direction that text and images took as the web began to evolve. It is video and audio media's time to follow the same evolutionary path.