Media Fragments WG Pointers
Duration: September 2008 - January 2011
Pointers:
15 (active) Participants:
- from 10 organizations: Apple, DERI Galway, CWI, Opera, ETRI, IBBT, Meraka Institute, Samsung, Institut Telecom, W3C/ERCIM
- plus 2 Invited Experts: Silvia Pfeiffer and Conrad Parker
Media Fragments WG Goal
Provide URI-based mechanisms for uniquely identifying fragments of media objects on the Web, such as video, audio, and images.

User Stories (1/2)
Silvia is a big fan of Tim's research keynotes. She watches numerous videos starring Tim to follow his research activities, and she often
wants to share the highlight announcements with her collaborators.
Silvia is interested in TweeTube, which will allow her to share video directly on Twitter, but she would
like to point to and reference only small temporal sequences of these longer videos. She would like a simple interface, similar to
VideoSurf's, to edit the start and end time points delimiting a particular sequence, and to get back in return
the media fragment URI to share with the rest of the world.
She would also like to embed this portion of the video on her blog together with comments and (semantic) annotations.
User Stories (2/2)
Lena would like to browse the descriptive audio tracks of a video as she does with DAISY audio books, by following the logical structure of
the media.
Audio descriptions and captions generally come in blocks, either timed or separated by silences. Chapter by chapter and then section by section,
she eventually jumps to a specific paragraph, and down to the sentence level, by using the "tab" control as she would normally do in audio books.
The descriptive audio track is an extra spoken track that describes the scenes happening in a video. When the descriptive audio
track is not present, Lena can similarly browse through captions and descriptive text tracks, which are rendered either through her braille
reading device or through her text-to-speech engine.
Use Cases & Requirements
Working Draft:
http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/
Requirements:
- r01: Temporal fragments: a clipping along the time dimension, from a start time to an end time that both fall within the duration of the media resource
- r02: Spatial fragments: a clipping of an image region; only rectangular regions are considered
- r03: Track fragments: a track as exposed by a container format of the media resource
- r04: Named fragments: a media fragment - either a track, a time section, or a spatial region - that has been given a name through
some sort of annotation mechanism
Side-conditions:
- Restrict to what the container format (encapsulating the compressed media content) can express (and expose); thus no transcoding
- Protocols covered: HTTP(S), FILE, RTSP, RTMP?
Media Fragment URI Syntax
- Time: npt, smpte, smpte-25, smpte-30, smpte-30-drop, clock
http://www.example.com/video.ogv#t=10,20
- Space: pixel, percent
http://www.example.com/video.ogv#xywh=160,120,320,240
- Track: See also http://www.w3.org/WAI/PF/HTML/wiki/Media_MultitrackAPI
http://www.example.com/video.ogv#track=audio
- Name:
http://www.example.com/video.ogv#id=chapter-1
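The fragment dimensions above can be pulled apart with ordinary URI machinery. The sketch below, using only the Python standard library, handles the plain npt time form (`t=10,20`), pixel `xywh`, `track`, and `id`; it is illustrative only and ignores the smpte/clock time formats and percent coordinates that the full syntax also allows.

```python
from urllib.parse import urlparse, parse_qsl

def parse_media_fragment(uri):
    """Parse the fragment part of a Media Fragment URI into a dict.

    Simplified sketch: only plain npt times ("t=10,20"), pixel
    "xywh", "track", and "id" are handled.
    """
    dims = dict(parse_qsl(urlparse(uri).fragment))
    result = {}
    if "t" in dims:
        start, _, end = dims["t"].partition(",")
        # an omitted start means 0; an omitted end means "until the end"
        result["t"] = (float(start) if start else 0.0,
                       float(end) if end else None)
    if "xywh" in dims:
        x, y, w, h = (int(v) for v in dims["xywh"].split(","))
        result["xywh"] = (x, y, w, h)
    for key in ("track", "id"):
        if key in dims:
            result[key] = dims[key]
    return result

print(parse_media_fragment("http://www.example.com/video.ogv#t=10,20"))
# → {'t': (10.0, 20.0)}
```

Because the dimensions live in the fragment (after `#`), they never reach the server in the request URI; resolving them is the User Agent's job, which is what the processing recipes on the next slides address.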
Media Fragment Processing
The general principle is that a smart User Agent strips out the fragment definition and encodes it into custom HTTP headers.
Media-fragment-aware servers handle the request, slice the media content, and serve just the fragment, while legacy servers serve the whole resource.
- Recipe 1: the User Agent knows how to map a custom unit into bytes and sends a normal Range request expressed in bytes
- Recipe 2: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server answers directly with a
206 Partial Content response and indicates the mapping between bytes and the custom unit
- Recipe 3: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server first answers with only headers,
including the mapping between the custom unit and bytes, so that the User Agent can issue a second, normal Range request, this time in bytes,
making the answer cacheable
- Recipe 4: the User Agent sends a Range request expressed in a custom unit (e.g. seconds); the server replies with a multipart message body
(multipart/byteranges) containing not only the bytes corresponding to the requested media fragment but also the media header data, making the
resource playable
Recipe 1: UA mapped byte ranges 1/2 (see spec)
Recipe 1: UA mapped byte ranges 2/2
Recipe 2: Server mapped byte ranges 1/2 (see spec)
Recipe 2: Server mapped byte ranges 2/2
Recipe 3: Proxy cacheable Server mapped byte ranges 1/2 (see spec)
Recipe 3: Proxy cacheable Server mapped byte ranges 2/2
Recipe 4: Serving playable resources
Implementation Report
Media Fragment servers:
Media Fragment user agents:
Thank you
Video on the Web is not just what you see
— it's what you can search, discover, create, distribute and manage.