Media Fragments Working Group Teleconference

20 Oct 2008


See also: IRC log


Erik, Raphael




<trackbot> Date: 20 October 2008

<nessy> Meeting opened 9:08

1. Round of introductions

<nessy> Raphael

<nessy> Erik

<raphael> scribenick: raphael

Davy: also in Multimedia Lab, IBBT, Ghent (BE)

Silvia: involved in MPEG-7, MPEG-21, developed Annodex (annotation format for Ogg media files)
... started my own start-up for measuring the audience of video on the web + consultant for Mozilla
... developed the TemporalURI specification, 6 years ago

Guillaume Olivrin, South Africa, focus on accessibility, how do you attach specific semantics to parts of media

Daniel Park, Samsung, co-chair of the Media Annotation, focus on IPTV (background in wireless networking)

Andy Heath, Open University, UK, background in e-learning, but develops far more general technologies, focus on accessibility
... experience in standards such as LOM, DC, SCORM

Colm Doyle: Blinkx

Larry Masinter: Adobe, experience in co-chairing HTTP group, focus on acquisition of metadata

Khang Cham, Samsung, focus on IPTV

Yves: W3C team contact, expertise in protocols, web services

<nessy> http://www.w3.org/2008/01/media-fragments-wg.html

<nessy> ... working group charter

Larry: important to first define the requirements for what these URIs will be used for
... it might happen that you cannot satisfy all the requirements with a URI; don't put that out of scope now

2. Use Cases Discussion (Part 1)

Photo Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Photobook_UC

Slides at: http://www.w3.org/2008/WebVideo/Fragments/meetings/2008-10-20-f2f_cannes/photobook_UC.pdf

Erik goes through the slides

Erik: take parts of images ... and assemble them together in a slideshow

Guillaume: unclear the value of the fragments here
... I understand fragment as taking a part of a large thing

Larry: is it worth looking at spatial URIs at all? Is it for doing partial retrieval?

Raphael: mentions map applications

Larry: but they are interactive!

Raphael: mentions multi-resolution images; the imaging industry has a huge need and desire to expose high-resolution versions of images

Larry: they do have JPEG2000 and protocols

Silvia: SMIL has elaborated on the need for spatial fragments

Jack: important needs in the SMIL community and SVG ... image maps, pan zoom, cropping

Erik: continues the presentation: after temporally assembling parts of images into a slideshow, assemble two parts of an image into a new one (stitch)
... Existing technologies: RSS and Atom for the playlist generation
... W3C SMIL: XML-based markup language, requires a SMIL player
... MPEG-21: Part 17 for fragment identification of MPEG Resources, client-side processing ... pseudo playlist
... MPEG-A: MAF (Media Application Format) that combines MPEG technologies
... XSPF (spiff): XML Shareable Playlist Format: Xiph Community
... Discussion: is it out of scope or not? specific use cases around? other technologies around?

Silvia: we are mainly looking at audio and videos files, but a video is a sequence of images

Larry: there are different servers and clients

Silvia: one possible criterion is: is it a pure client-side issue, or a server-side + client-side problem?

Larry: even if it is only a client-side issue, it might be worth doing some standardisation
... the main point of still-image fragments is interactivity

Raphael: is interactivity the key interest in spatial fragments?

Larry: there is a lot of work in this area, would recommend to focus on the temporal issue
... it is also a good exercise to look at the out-of-scope use cases; it helps to shape the scope

Jack: URI is good because it is the web, the client is not necessarily aware of the time dimension
... HTML already has a notion of Area, so don't encode it in a URI

Larry: need to be careful about URIs, resources, representations
... example of an image: need to decode it, take the parts, re-encode it
... JPEG2000 might have a direct way to do that

Guillaume: create mosaic, collage of parts of media

Yves: it depends if the transformation needs to be on the client or not

Jack: be careful not to put SVG in a URI :-)
... good balance between which processing can be on the client side and what is worth putting in a URL
... is it better to have the processing in the URL?

Erik: we question again the interest of the spatial fragment

Silvia: is it a question of the size of the media? Large: worth to have fragment, Small: not worth

Larry: define what you mean by media
... it is reasonable to limit yourself to videos

Silvia: SMIL and Flash are interactive media, not necessarily one timeline
... we focus on a resource with one timeline
... there is a whole suite of codec issues

Larry: define markers in videos

<Yves> time... what is the reference of time for a video, embedded time code? 0 for the start?

Coffee break

Map Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements/Map_Application_UC

<scribe> scribenick: erik

Raphael: Map UC Description

<nessy> http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements

Raphael: Annotation is key

<Kangchan> Question: What is the relation between the Geolocation Working Group (http://www.w3.org/2008/geolocation/) and Web Map Services?

Raphael: UC examples using Yahoo, Google & Microsoft

Jack: what we see here are URIs for the applications, not images

Raphael will look deeper into different specs over the next couple of weeks for this Map UC

Davy & Jack: is this a valid UC? will our spatial URL addressing scheme be used by map applications?

Raphael: as Larry said this morning, out-of-scope UCs are useful for arriving at the WG's final scope

<guillaume> Must document the out of scope UC to explain why it is out of scope.

Silvia: there might be a UC when we are talking about really large images (cf. medical images in really high resolutions)
... having a way to get a subpart of such a big image is nice to have, but implementation is something different ... a lot of complications, certainly on some server-side implementations

Guillaume: codec issues not to be underestimated, have a nice addressing scheme vs. server-side complexity

Silvia: should look further than just server-side complexity, solutions for certain codecs will come around eventually if needed

Jack: practical issues vs. fundamental issues have to be taken into account within this group
... media fragments are needed because some things can not be expressed today

Raphael: is it worth having an overview of the TimedText WG?

<nessy> Guillaume: URI fragment identifier for text/plain: http://www.ietf.org/rfc/rfc5147.txt

<Yves> (multi-resolution formats, http://en.wikipedia.org/wiki/FlashPix is a good example of a single file containing multiple resolutions, maybe better than the map application)

Raphael: Zoomify is a good example of a UC for very big images (life sciences) using fragments
... is it the task of this group to ensure interoperability between different standards? (e.g. MPEG-21 URI to SVG)

Silvia: defining the mappings should be out-of-scope for this WG

Jack: it is worthwhile to test our scheme against the others out there

Silvia: last thing to do & should be straightforward by then if we did a good job

Raphael: what about spatial dimension?

Silvia: temporal addressing need is biggest, but spatial addressing need is also valid

3. Use Case Discussion (Part 2)

Silvia presenting the Media Annotation UC

<raphael> http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Media_Annotation_UC

<raphael> Silvia: Annotation can be attached to the full media resource or to fragments of media resources

<raphael> scribenick: raphael

Silvia: annotating fragments is relevant for this group

Guillaume: can the structure of the video be represented in the URI?

Silvia: difference between the representation of the fragment and its semantics

<spark3> if necessary, what about adding a new UC (naming use case for fragment) into the Media Annotation WG UC ?

Silvia: drawing on the board

<erik> Jack: there's only 1 timeline for timed media

<erik> Jack: there's only 1 coordinate system for spatial media

<erik> Jack: Annotation UC is important because we're reasoning on a higher abstraction level

Jack: loves that use case since it is purely about fundamental description and indexing of a media

Silvia: goes through the advantages of a possible URI scheme for media fragments
... actually motivating the need for media fragments
... shows the picture at https://wiki.mozilla.org/Image:Video_Fragment_Linking.jpg
... jumps into the track problems
... there are actually 3 dimensions: space, time and track
... TemporalURI just deals with cropping, no track awareness

Jack: rename this use case into 'Anchoring'
... annotation = RDF community
... structuring = SMIL community

Silvia: agree to rename it into Media Anchor Definition

Lunch break

Media Delivery Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Media_Delivery_UC

<scribe> scribenick: Jack

<Yves> Scribe: Jack

<jackjansen> scribenick: jackjansen

<raphael> scribenick: jackjansen

Media Delivery use case

<raphael> Davy going through the slide at: http://www.w3.org/2008/WebVideo/Fragments/meetings/2008-10-20-f2f_cannes/media_delivery_UC.pdf

Various: (discussing slide 3, # vs. ? or ,): Can we use # as the only user-visible marker and use http-ranges or something similar?

<raphael> Silvia drawing a communication channel between UA and servers

<raphael> Discussion about the use of the "hash" character

<raphael> Yves: if the use case is to extract a frame of a video and create a new image (so a new resource), use a '?'

<raphael> ... if the use case is to keep the context, use a '#'

<raphael> Summary: there are use cases for both, should be further discussed tomorrow morning

summary: there are use cases for both. We will get back to the subject tomorrow.
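
The practical difference between the two markers can be sketched as follows: a user agent strips the fragment before it issues the request, so anything after '#' never reaches the server, whereas a query after '?' is part of the request and identifies a different resource. This is a minimal sketch in Python; the function name request_target and the example URIs are illustrative assumptions, not something proposed at the meeting.

```python
from urllib.parse import urlsplit


def request_target(uri):
    """Return the path (plus query, if any) that an HTTP client would
    put on the request line. The fragment is never included."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


# Hypothetical URIs, used only to contrast the two markers.
print(request_target("http://www.example.com/video.ogv#t=20-30"))
print(request_target("http://www.example.com/video.ogv?t=20-30"))
```

With '#', the server sees only the full resource and the user agent keeps the context; with '?', the server is asked for what is effectively a new resource.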

<guillaume> déjà vu

<raphael> Davy: explains the MPEG-21 Fragment identification

<raphael> ... use of the '#', but no delivery protocol

<raphael> ... also mentions the proposal of Dave Singer: the UA first gets the first N bytes, representing the headers with timing and byte-offset information of the media resource

<raphael> ... goes through an explanation of MPEG-21: http://www.w3.org/2008/WebVideo/Fragments/wiki/State_of_the_Art#MPEG-21_Part_17:_Fragment_Identification_of_MPEG_Resources_.28Davy_.2F_Silvia.29

<raphael> ... 4 schemes

<raphael> ... ffp for the track

<raphael> ... offset for byte ranges

all: discussing #mp() scheme

<raphael> ... mp for specifying the temporal or spatial fragment (only for MPEG mime-type resources)

Silvia: whoever controls the mimetype also controls what is after the # in a URL

Jack: is surprised, but pleasantly so.

<raphael> Davy: the 4th scheme is 'mask' (only for MPEG resources)

<raphael> Jack: seems they structure the video resource and point towards this structure

<raphael> Raphael: how many user agents can understand this syntax?

all: none, that we know of

<raphael> Davy: I'm not aware of any ... although there is a reference implementation

<raphael> Larry: http is not necessarily the best protocol to transport video

<Yves> in video, it depends if you want exact timing, control of the lag, and in that case HTTP is not the best choice

<raphael> Silvia: I would say that most video is transported over HTTP

<raphael> ... RTP and RTSP have their own fragments, we should learn from them

<raphael> ... if they do not satisfy all our requirements, we can give them feedback so that they extend the use of fragments in these protocols

<raphael> Davy: goes through TemporalURI

<raphael> ... this is the only one that specifies a delivery protocol over HTTP

Silvia: Real used to allow something similar to temporal URLs

Jack: thinks it may be part of the .ram files

Guillaume: Flash allows doc author to export subparts by name, these can then be accessed with url#name

Davy: continues with slide 6, http media delivery

<guillaume> Guillaume: Flash could also embed internal links in movie attached to certain frames. Once compiled with specific option, fragment of the Flash movie could be accessed using #

<raphael> Silvia: draws the four-way handshake

<raphael> ... 1st exchange: User requests http://www.example.com/resource.ogv#t=20-30

<raphael> ... UA does a GET <uri stripped of hash>, Range: time 20-30

<raphael> ... Server sends back a Response 200, with the content-range: time 20-30 + content-type + ogg header + time-range bytes 5000-20000

<raphael> ... (needs to create a new http header, 'time-range')

<raphael> Raphael: can we use content-range: bytes ... ?

<raphael> ... UA does a GET <URI stripped of the hash>, Range x bytes: 5000-20000

<raphael> ... Server sends back a Response 200, with the content-range bytes + the cropped data

<raphael> Silvia: it is not implemented yet as far as I know

<raphael> ... discussion based on a lot of discussions with proxy vendors
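
The user-agent side of the first exchange in the handshake above could be sketched like this. It is only an illustration under stated assumptions: the "Range: time 20-30" header follows the notation used on the board and is not an existing HTTP range unit, the function names are hypothetical, only the "t=start-end" form is handled, and no request is actually sent.

```python
from urllib.parse import urldefrag


def split_temporal_fragment(uri):
    """Split 'http://...#t=start-end' into the URI without the hash
    and a (start, end) interval in seconds, or None if there is none."""
    base, fragment = urldefrag(uri)
    if not fragment.startswith("t="):
        return base, None
    start_text, _, end_text = fragment[2:].partition("-")
    return base, (float(start_text), float(end_text))


def first_request_headers(interval):
    """Build the time-based Range header sketched in the discussion.
    The 'time' range unit is an assumption taken from the minutes."""
    start, end = interval
    return {"Range": f"time {start:g}-{end:g}"}


base, interval = split_temporal_fragment(
    "http://www.example.com/resource.ogv#t=20-30"
)
print("GET", base, first_request_headers(interval))
```

The second request would then reuse the same stripped URI with the byte range the server reported for that time interval.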

<raphael> Davy: could we apply the same four-way handshake with RTSP?

<raphael> ... RTSP specifies a Range Header, similar to the HTTP byte range mechanism

<raphael> ... RTSP could support temporal fragments by a two-way handshake (using Range header)

<raphael> ... Problem: spatial fragments are not supported!

<raphael> Jack: the spatial problem is kind of orthogonal

<raphael> ... the spatial fragment will not be about byte ranges

<raphael> Davy: cropping is more complex in images

<raphael> Jack: you're right, I can create a non-continuous QuickTime movie

<raphael> ... problem is it is not necessarily possible to generate a byte range from a time range

<raphael> Silvia: a single byte range

all: the non-contiguous ranges may occur more often than we like. But maybe
... we can get away with ignoring them (because all relevant formats also have a contiguous form).
... need to discuss after the break.

raphael: suggest coffee break

<guillaume> or need to coalesce

Larry: please decouple the representation of how you refer to fragments from the implementations
... Also think about embedded metadata: if the original has a copyright statement, do you get it with every fragment?

Silvia: (on prev subject): wonders whether HTTP can do multiple byte ranges

Larry: yes, I think so, with multipart
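
For reference, HTTP does allow several byte ranges in a single Range header, and a server that honours them answers with a multipart/byteranges body. A minimal sketch of building such a header, with adjacent or overlapping ranges coalesced first as Guillaume suggests; the helper names and the byte offsets are illustrative assumptions.

```python
def coalesce(ranges):
    """Merge overlapping or directly adjacent (first, last) byte ranges.
    Both ends are inclusive, as in the HTTP Range header."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Extends or touches the previous range: grow it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def byte_range_header(ranges):
    """Format one Range header value covering all given byte ranges."""
    return "bytes=" + ",".join(f"{a}-{b}" for a, b in coalesce(ranges))


# Hypothetical offsets for a fragment stored in non-contiguous pieces.
print(byte_range_header([(1000, 1499), (0, 499), (500, 799)]))
```

If everything coalesces into one range, the request degenerates into the single byte range case discussed before the break.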


<davy> scribenick: davy

Media Linking UC

raphael discusses the description written by Michael on the wiki
... 3 things: bookmarking, playlists, and interlinking multimedia

silvia: definition of playlists is out of scope

guillaume: playlist is about presentation

raphael: regarding interlinked: temporal URIs can be described in RDF (RDF doc describing an audio file)
... difference between URI and RDF (or SMIL, or ...): you need to parse the metadata
... RDF description of time segment could be replaced by a temporal URI

silvia: interlinking multimedia is already covered in other UCs

Video Browser UC

silvia: large media files introduce special challenges
... requirement for server-side processing
... dynamic creation of thumbnails through URI mechanism

guillaume: link to PNG or GIF
... provide a preview function of the resource
... trivial: get all the I-frames of a video resource
... use them as thumbs
... thumbnail extraction is quite easy

silvia, jack: not so trivial, might be processing-intensive

silvia: it should be possible to point to one single frame with the URI scheme

jack: URI scheme should not know that frame is 'the' thumbnail

guillaume: you can have multiple thumbs per resource

raphael: URI scheme can point to a frame, but does not have knowledge about thumbs
... should we be able to address in terms of frames?

guillaume: no, too coding-specific

silvia: previews of images?
... preview is then a lower resolution image

guillaume: that is processing
... mostly, previews are already part of the media resource
... hence lower image resolutions are out of scope

jack: not too far?
... is a preview embedded in a resource still a fragment?

guillaume: compare it with tracks
... preview is just another track

raphael: we put this in mind and make a decision later

silvia: previews are another sort of tracks

raphael: should we also be able to address metadata within the headers?

silvia: it is not a common property of all the formats to have previews, therefore, it is not a candidate to be standardized

raphael: after the first phase of the WG: report the current limitations
... and wait for feedback

Moving Point Of Interest UC

raphael: complex UC
... should be for the second phase

jack: is this ever going to be used at the server side?
... if not, it is out of scope

raphael: you can share the link of the moving region

erik: delivery to mobile devices is a use case introduced by the Flemish public broadcaster

jack: there is no reason to use URIs for that purpose, use metadata

raphael: it is like concatenating spatial fragments over time

guillaume: we are addressing points over space or time

raphael: refer to HTML image maps
... region, interval can be defined by a combination of points
... you need more than one point


raphael: we will discuss this tomorrow

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.133 (CVS log)
$Date: 2008/10/26 10:50:37 $