Media Fragments WG Pointers
  
  
    Duration: September 2008 - January 2011
  
  
  Pointers:
  
  15 (active) Participants:
     
  
   - from 10 organizations: Apple, DERI Galway, CWI, Opera, ETRI, IBBT, Meraka Institute, Samsung, Institut Telecom, W3C/ERCIM
- +2 Invited Experts: Silvia Pfeiffer and Conrad Parker
 
 Media Fragments WG Goal
 
Provide URI-based mechanisms for uniquely identifying fragments of media objects on the Web, such as video, audio, and images.
 
 
 
 
 User Stories (1/2)
 
   Silvia is a big fan of Tim's research keynotes. She watches numerous videos starring Tim to follow his research activities, and she often 
   would like to share the highlight announcements with her collaborators.
 
 
   Silvia is interested in TweeTube, which will allow her to share video directly on Twitter, but she would 
   like to point to and reference only small temporal sequences of these longer videos. She would like a simple interface, similar to 
   VideoSurf's, to edit the start and end time points delimiting a particular sequence, and to get back in return 
   the media fragment URI to share with the rest of the world.
 
 
   She would also like to embed this portion of video on her blog together with comments and (semantic) annotations.
 
 
 User Stories (2/2)
 
   Lena would like to browse the descriptive audio tracks of a video as she does with Daisy audio books, by following the logical structure of 
   the media.
 
 
   Audio descriptions and captions generally come in blocks, either timed or separated by silences. Chapter by chapter and then section by section, 
   she eventually jumps to a specific paragraph and down to the sentence level by using the "tab" control, as she would normally do in audio books.
 
 
   The descriptive audio track is an extra spoken track that describes the scenes happening in a video. When the descriptive audio 
   track is not present, Lena can similarly browse through captions and descriptive text tracks, which are rendered either through her braille 
   reading device or through her text-to-speech engine.
 
 
 Use Cases & Requirements
 Working Draft:
 
   http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/
 
 Requirements:
 
   - r01: Temporal fragments: a clipping along the time dimension, from a start time to an end time that both lie within the duration of the media resource
- r02: Spatial fragments: a clipping of an image region, restricted to rectangular regions
- r03: Track fragments: a track as exposed by a container format of the media resource
- r04: Named fragments: a media fragment - either a track, a time section, or a spatial region - that has been given a name through 
   some sort of annotation mechanism
Side-conditions:
 
   - Restrict to what the container format (encapsulating the compressed media content) can express (and expose), thus no transcoding
- Protocols covered: HTTP(S), FILE, RTSP, RTMP?
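
A small Python sketch of how requirement r01 could be interpreted, clipping out-of-range times to the resource's duration. The clipping policy here is an illustrative assumption, not the spec's normative behaviour:

```python
def clamp_temporal_fragment(start, end, duration):
    """Clamp a requested #t=start,end clip to the media duration (r01).

    Illustrative helper, not from the spec: times falling outside the
    resource's duration are clipped rather than rejected. `end=None`
    means "until the end of the resource".
    """
    start = max(0.0, min(start, duration))
    end = duration if end is None else max(start, min(end, duration))
    return start, end

print(clamp_temporal_fragment(10.0, 25.0, 15.0))
# → (10.0, 15.0)
```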
 
 Media Fragment URI Syntax
 
  - Time: npt, smpte, smpte-25, smpte-30, smpte-30-drop, clock
   http://www.example.com/video.ogv#t=10,20 
- Space: pixel, percent
   http://www.example.com/video.ogv#xywh=160,120,320,240 
- Track: See also http://www.w3.org/WAI/PF/HTML/wiki/Media_MultitrackAPI
   http://www.example.com/video.ogv#track=audio 
- Name:
   http://www.example.com/video.ogv#id=chapter-1 
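
The four dimensions above can be pulled out of a URI fragment with a few lines of code. A minimal Python sketch covering only the simple forms shown here (plain NPT seconds, pixel coordinates), not the full grammar of the spec:

```python
from urllib.parse import urlparse

def parse_media_fragment(uri):
    """Parse the fragment of a media fragment URI into a dict.

    Illustrative sketch: handles the t=, xywh=, track= and id=
    dimensions in their simplest forms only (no smpte/clock time
    formats, no percent units).
    """
    result = {}
    for part in urlparse(uri).fragment.split("&"):
        if not part:
            continue
        key, _, value = part.partition("=")
        if key == "t":            # temporal: start,end in NPT seconds
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif key == "xywh":       # spatial: x,y,w,h in pixels
            result["xywh"] = tuple(int(v) for v in value.split(","))
        else:                     # track / id: plain strings
            result[key] = value
    return result

print(parse_media_fragment("http://www.example.com/video.ogv#t=10,20"))
# → {'t': (10.0, 20.0)}
```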
 
 Media Fragment Processing
 
   The general principle is that a smart user agent strips out the fragment definition and encodes it into custom HTTP headers.
   Fragment-aware (media) servers handle the request, slice the media content and serve just the fragment, while legacy servers serve the whole resource.
 
 
  - Recipe 1: the User Agent knows how to map a custom unit into bytes and sends a normal Range request expressed in bytes
- Recipe 2: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server answers directly with a 
      206 Partial Content and indicates the mapping between bytes and the custom unit
- Recipe 3: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server first answers with just the headers 
      and the mapping between the custom unit and bytes, so that the User Agent can issue another normal Range request, expressed this time in bytes, 
      making the answer cacheable
- Recipe 4: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server provides a multipart message body reply 
      (multipart/byteranges) containing not only the bytes corresponding to the requested media fragment but also the media header data, making the 
      resource playable
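
Recipe 1 can be sketched as follows, assuming the User Agent already holds a time-to-byte seek index for the resource; the index values below are made up for illustration, not taken from the spec:

```python
# Hypothetical seek index for one resource: (media time in seconds, byte offset).
# In practice the UA would obtain this from the container's index data.
SEEK_INDEX = [(0.0, 0), (10.0, 1113724), (20.0, 2082711), (30.0, 3000000)]

def time_to_byte(t):
    """Return the byte offset of the last index point at or before time t."""
    offset = 0
    for time, byte in SEEK_INDEX:
        if time <= t:
            offset = byte
        else:
            break
    return offset

def range_header_for(start, end):
    """Build an ordinary HTTP byte Range header for a #t=start,end fragment,
    as a Recipe 1 User Agent would (no custom units on the wire)."""
    return f"Range: bytes={time_to_byte(start)}-{time_to_byte(end) - 1}"

print(range_header_for(10, 20))
# → Range: bytes=1113724-2082710
```

Because the request that goes out is a plain byte-range request, any HTTP/1.1 server and any intermediate cache can handle it unchanged.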
  
 
 Recipe 1: UA mapped byte ranges 1/2 (see spec)
  
 
 Recipe 1: UA mapped byte ranges 2/2
  
 
 Recipe 2: Server mapped byte ranges 1/2 (see spec)
  
 
 Recipe 2: Server mapped byte ranges 2/2
  
 
 Recipe 3: Proxy cacheable Server mapped byte ranges 1/2 (see spec)
  
 
 Recipe 3: Proxy cacheable Server mapped byte ranges 2/2
  
 
 Recipe 4: Serving playable resources
  
 
 Implementation Report
 Media Fragment server:
 
 Media Fragment user agents:
 
 
 Thank you
  
 Video on the Web is not just what you see
 — it's what you can search, discover, create, distribute and manage.