W3C

Media Fragments Working Group Teleconference

20 Oct 2008

Agenda

See also: IRC log

Attendees

Present
Iles_C
Regrets
Chair
Erik, Raphael
Scribe
Jack

Contents

<trackbot> Date: 20 October 2008

<nessy> Meeting opened 9:08

1. Round of introductions

<nessy> Raphael

<nessy> Erik

<raphael> scribenick: raphael

Davy: also in Multimedia Lab, IBBT, Ghent (BE)

Silvia: involved in MPEG-7, MPEG-21, developed Annodex (annotation format for Ogg media files)
... started my own start-up for measuring the audience of video on the web + consultant for Mozilla
... developed the TemporalURI specification, 6 years ago

Guillaume Olivrin, South Africa, focus on accessibility, how do you attach specific semantics to parts of media

Daniel Park, Samsung, co-chair of the Media Annotation, focus on IPTV (background in wireless networking)

Andy Heath, Open University, UK, background on e-learning, but develop far more general technologies, focus on accessibility

scribe: experience in standards such as LOM, DC, SCORM

Colm Doyle: Blinkx

Larry Masinter: Adobe, experience in co-chairing HTTP group, focus on acquisition of metadata

Khang Cham, Samsung, focus on IPTV

Yves: W3C team contact, expertise in protocols, web services

<nessy> http://www.w3.org/2008/01/media-fragments-wg.html

<nessy> ... working group charter

Larry: important to first define the requirements for what these URIs will be used for
... it might happen that you cannot satisfy all the requirements with a URI, don't put that out of scope now

2. Use Cases Discussion (Part 1)

Photo Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Photobook_UC

Slides at: http://www.w3.org/2008/WebVideo/Fragments/meetings/2008-10-20-f2f_cannes/photobook_UC.pdf

Erik goes through the slides

Erik: take parts of images ... and assemble them together in a slideshow

Guillaume: unclear the value of the fragments here
... I understand fragment as taking a part of a large thing

Larry: is it worth at all to look at Spatial URIs? Is it for doing partial retrieval?

Raphael: mention maps applications

Larry: but they are interactive!

Raphael: mention multi-resolution images, the image industry has a huge need and will to expose high-resolution versions of images

Larry: they do have JPEG2000 and protocols

Silvia: SMIL has elaborated on the need for spatial fragments

Jack: important needs in the SMIL community and SVG ... image maps, pan zoom, cropping

Erik: continues the presentation, after temporally assemble parts of images into a slideshow, assemble two parts of an image into a new one (stitch)
... Existing technologies: RSS and Atom for the playlist generation
... W3C SMIL: XML-based markup language, requires a SMIL player
... MPEG-21: Part 17 for fragment identification of MPEG Resources, client-side processing ... pseudo playlist
... MPEG-A: MAF (Media Application Format) that combines MPEG technologies
... XSPF (spiff): XML Shareable Playlist Format: Xiph Community
... Discussion: is it out of scope or not? specific use cases around? other technologies around?

Silvia: we are mainly looking at audio and videos files, but a video is a sequence of images

Larry: there are different servers and clients

Silvia: one way to look at a criteria is: is it a pure client-side issue or server-side + client-side problems?

Larry: even if it is only a client-side issue, it might be worth to do some standardisation
... the main point of still images fragment is the interactivity

Raphael: is interactivity the key interest in spatial fragments?

Larry: there is a lot of work in this area, would recommend to focus on the temporal issue
... it is also a good exercise to look at the out-of-scope use case, help to shape the scope

Jack: URI is good because it is the web, the client is not necessarily aware of the time dimension
... HTML has already a notion of Area, so don't encode it in a URI

Larry: need to be careful on URIs, resources, representations
... example of an image: need to decode it, take the parts, re-encode it
... JPEG2000 might have a direct way to do that

Guillaume: create mosaic, collage of parts of media

Yves: it depends if the transformation needs to be on the client or not

Jack: be careful not to put SVG in a URI :-)
... good balance on which processing can be done client-side, and what is worth putting in a URL
... is it better to have the processing in the URL?

Erik: we question again the interest of the spatial fragment

Silvia: is it a question of the size of the media? Large: worth to have fragment, Small: not worth

Larry: define what do you mean by media
... it is reasonable to limit yourself to videos

Silvia: SMIL and Flash are interactive media, not necessarily one timeline
... we focus on a resource with one timeline
... there is a whole suite of codec issues

Larry: define markers in videos

<Yves> time... what is the reference of time for a video, embedded time code? 0 for the start?

Coffee break

Map Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements/Map_Application_UC

<scribe> scribenick: erik

Raphael: Map UC Description

<nessy> http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements

Raphael: Annotation is key

<Kangchan> Question: What is the relation between the Geolocation Working Group (http://www.w3.org/2008/geolocation/) and Web Map Services?

Raphael: UC examples using Yahoo, Google & Microsoft

Jack: what we see here are URIs for the applications, not images

Raphael will look deeper into different specs over the next couple of weeks for this Map UC

Davy & Jack: is this a valid UC? will our spatial URL addressing scheme be used by Maps Applications?

Raphael: as Larry said this morning, out-of-scope UC's are valid to come up with our final WG's scope

<guillaume> Must document the out of scope UC to explain why it is out of scope.

Silvia: there might be a UC when we are talking about really large images (cf. medical images in really high resolutions)
... having a way to get a subpart of such a big image is nice to have, but implementation is something different ... a lot of complications, certainly in some server-side implementations

Guillaume: codec issues not to be underestimated, having a nice addressing scheme vs. server-side complexity

Silvia: should look further than just server-side complexity, solutions for certain codecs will come around eventually if needed

Jack: practical issues vs. fundamental issues have to be taken into account within this group
... media fragments are needed because some things can not be expressed today

Raphael: is it worth having an overview of the TimedText WG?

<nessy> Guillaume: URI fragment identifier for text/plain: http://www.ietf.org/rfc/rfc5147.txt
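
For comparison, RFC 5147 (cited above) defines fragment identifiers for text/plain using `char` and `line` schemes. A minimal sketch of parsing such fragments (integrity-check parameters from the RFC are ignored here):

```python
def parse_text_fragment(fragment):
    """Parse an RFC 5147 text/plain fragment such as 'char=96,108'
    or 'line=10,20' into (scheme, start, end)."""
    scheme, _, value = fragment.partition("=")
    start, _, end = value.partition(",")
    # A bare position like 'char=100' has no end component
    return scheme, int(start), (int(end) if end else None)

print(parse_text_fragment("char=96,108"))  # a character range
print(parse_text_fragment("line=10,20"))   # a line range
```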

<Yves> (multi-resolution formats, http://en.wikipedia.org/wiki/FlashPix is a good example of a single file containing multiple resolutions, maybe better than the map application)

Raphael: Zoomify is a good example of a UC with very big images (life sciences) using fragments
... is it the task of this group to ensure interoperability of different standards? (e.g. MPEG-21 URI to SVG)

Silvia: defining the mappings should be out-of-scope for this WG

Jack: worthwhile is testing our scheme against the others out there

Silvia: last thing to do & should be straightforward by then if we did a good job

Raphael: what about spatial dimension?

Silvia: the temporal addressing need is biggest, but the spatial addressing need is also valid

3. Use Case Discussion (Part 2)

Silvia presenting the Media Annotation UC

<raphael> http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Media_Annotation_UC

<raphael> Silvia: Annotation can be attached to the full media resource or to fragments of media resources

<raphael> scribenick: raphael

Silvia: annotations to fragments are relevant for this group

Guillaume: can the structure of the video be represented in the URI

Silvia: difference between the representation of the fragment and its semantics

<spark3> if necessary, what about adding a new UC (naming use case for fragment) into the Media Annotation WG UC ?

Silvia: drawing on the board

<erik> Jack: there's only 1 timeline for timed media

<erik> Jack: there's only 1 coordinate system for spatial media

<erik> Jack: Annotation UC is important because we're reasoning on a higher abstraction level

Jack: loves that use case since it is purely about fundamental description and indexing of a media

Silvia: goes through the advantages of a possible URI scheme for media fragments
... actually motivating the need for media fragments
... shows the picture at https://wiki.mozilla.org/Image:Video_Fragment_Linking.jpg
... jumps into the track problems
... there are actually 3 dimensions: space, time and track
... temporalURI just deals with cropping, no track awareness

Jack: rename this use case into 'Anchoring'
... annotation = RDF community
... structuring = SMIL community

Silvia: agree to rename it into Media Anchor Definition

Lunch break

Media Delivery Use Case: http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements#Media_Delivery_UC

<scribe> scribenick: Jack

<Yves> Scribe: Jack

<jackjansen> scribenick: jackjansen

<raphael> scribenick: jackjansen

Media Delivery use case

<raphael> Davy going through the slide at: http://www.w3.org/2008/WebVideo/Fragments/meetings/2008-10-20-f2f_cannes/media_delivery_UC.pdf

Various: (discussing slide 3, # vs. ? or ,): Can we use # as the only user-visible marker and use http-ranges or something similar?

<raphael> Silvia drawing a communication channel between UA and servers

<raphael> Discussion about the use of the "hash" character

<raphael> Yves: use case is to extract a frame of a video, and creates a new image (so a new resource), use a '?'

<raphael> ... use case is to keep the context, use a '#'

<raphael> Summary: there are use cases for both, should be further discussed tomorrow morning

summary: there are use cases for both. We will get back to the subject tomorrow.
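
The client-side difference between the two characters can be sketched as follows (this is an illustration, not from the meeting; the `t=20-30` temporal syntax is the TemporalURI-style example used on the slides): a `#` fragment is stripped by the user agent before the request and interpreted locally, keeping the context of the full resource, while a `?` query is sent to the server, which may mint a new resource.

```python
from urllib.parse import urlsplit

def request_target(uri):
    """Return (URL sent to the server, part interpreted client-side)."""
    parts = urlsplit(uri)
    # The fragment never goes on the wire; the query always does.
    sent = parts._replace(fragment="").geturl()
    return sent, parts.fragment

# '#' keeps the context: the whole resource is requested, the UA seeks locally
print(request_target("http://www.example.com/resource.ogv#t=20-30"))
# '?' asks the server for a (potentially new) resource
print(request_target("http://www.example.com/resource.ogv?t=20-30"))
```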

<guillaume> déjà vu

<raphael> Davy: explains the MPEG-21 Fragment identification

<raphael> ... use of the '#', but no delivery protocol

<raphael> ... mention also the proposal of Dave Singer: UA gets the first N bytes representing the headers with timing and byte-offset information of the media resource

<raphael> ... goes through an explanation of MPEG-21: http://www.w3.org/2008/WebVideo/Fragments/wiki/State_of_the_Art#MPEG-21_Part_17:_Fragment_Identification_of_MPEG_Resources_.28Davy_.2F_Silvia.29

<raphael> ... 4 schemes

<raphael> ... ffp for the track

<raphael> ... offset for bytes range

all: discussing #mp() scheme

<raphael> ... mp for specifying the temporal or spatial fragment (only for MPEG mime-type resources)

Silvia: whoever controls the mimetype also controls what is after the # in a URL

Jack: is surprised, but pleasantly so.

<raphael> Davy: the 4th scheme is 'mask' (only for MPEG resources)

<raphael> Jack: seems they structure the video resource and point towards this structure

<raphael> Raphael: how many user agents can understand this syntax?

all: none, that we know of

<raphael> Davy: I'm not aware of ... although there is a reference implementation

<raphael> Larry: http is not necessarily the best protocol to transport video

<Yves> in video, it depends if you want exact timing, control of the lag, and in that case HTTP is not the best choice

<raphael> Silvia: I would say that most video is transported over http

<raphael> ... RTP and RTSP have their own fragments, we should learn from them

<raphael> ... if they do not satisfy all our requirements, we can feed them so they extend the use of fragments in these protocols

<raphael> Davy: goes through TemporalURI

<raphael> ... this is the only one that specifies a delivery protocol over http

Silvia: Real used to allow something similar to temporal URLs

Jack: thinks it may be part of the .ram files

Guillaume: Flash allows doc author to export subparts by name, these can then be accessed with url#name

Davy: continues with slide 6, http media delivery

<guillaume> Guillaume: Flash could also embed internal links in a movie attached to certain frames. Once compiled with a specific option, fragments of the Flash movie could be accessed using #

<raphael> Silvia: draw the four-way handshake

<raphael> ... 1st exchange: User requests http://www.example.com/resource.ogv#t=20-30

<raphael> ... UA does a GET <uri stripped of hash>, Range: time 20-30

<raphael> ... Server sends back a Response 200, with the content-range: time 20-30 + content-type + ogg header + time-range bytes 5000-20000

<raphael> ... (needs to create a new http header, 'time-range')

<raphael> Raphael: can we use content-range: bytes ... ?

<raphael> ... UA does a GET <URI stripped of the hash>, Range: bytes 5000-20000

<raphael> ... Server sends back a Response 200, with the content-range bytes + the cropped data

<raphael> Silvia: it is not implemented yet as far as I know

<raphael> ... discussion based on a lot of discussions with proxies vendors
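
The proposed exchange above can be sketched in code. This is speculative: the `Range: time` and `Time-Range` headers are the new headers discussed in the meeting, not implemented HTTP, and `time_to_bytes` stands in for the server's knowledge of the media index.

```python
def parse_temporal_fragment(uri):
    """Split 'http://.../resource.ogv#t=20-30' into (request URI, start, end)."""
    base, _, frag = uri.partition("#")
    start, end = frag[len("t="):].split("-")
    return base, float(start), float(end)

def handshake(uri, time_to_bytes):
    """Sketch of the proposed four-way exchange (hypothetical headers)."""
    base, start, end = parse_temporal_fragment(uri)
    # 1st round trip: UA asks in the time dimension
    request_1 = {"GET": base, "Range": f"time {start:g}-{end:g}"}
    first, last = time_to_bytes(start, end)
    # Server answers with the time-to-byte mapping
    response_1 = {"Content-Range": f"time {start:g}-{end:g}",
                  "Time-Range": f"bytes {first}-{last}"}
    # 2nd round trip: UA retrieves the data with a standard byte Range
    request_2 = {"GET": base, "Range": f"bytes={first}-{last}"}
    return request_1, response_1, request_2

# Toy index: pretend every second of video costs 1000 bytes
r1, resp1, r2 = handshake("http://www.example.com/resource.ogv#t=20-30",
                          lambda s, e: (int(s * 1000), int(e * 1000)))
```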

<raphael> Davy: could we apply the same four-way handshake with RTSP?

<raphael> ... RTSP specifies a Range Header, similar to the HTTP byte range mechanism

<raphael> ... RTSP could support temporal fragments by a two-way handshake (using Range header)

<raphael> ... Problem: spatial fragments are not supported!

<raphael> Jack: the spatial problem is kind of orthogonal

<raphael> ... the spatial fragment will not be about bytes range

<raphael> Davy: cropping is more complex in images

<raphael> Jack: you're right, I can create a non-contiguous QuickTime movie

<raphael> ... problem is it is not necessarily possible to generate a byte range from a time range

<raphael> Silvia: a single byte range

all: the non-contiguous ranges may occur more often than we like. But maybe
... we can get away with ignoring them (because all relevant formats also have a contiguous form).
... need to discuss after the break.

raphael: suggest coffee break

<guillaume> or need to coalesce

Larry: please decouple the representation of how you refer to fragments from the implementations
... Also think about embedded metadata: if the original has a copyright statement, do you get it with every fragment?

Silvia: (on prev subject): wonders whether http can do multiple byte ranges

Larry: yes, I think so, with multipart
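
To illustrate Larry's point: HTTP (RFC 2616 at the time) does allow several byte ranges in one Range header, with the server answering 206 and a multipart/byteranges body. A small sketch of building such a request for a time range that maps to non-contiguous bytes:

```python
def multi_range_header(ranges):
    """Build an HTTP Range header value for several (first, last)
    byte ranges, e.g. for non-contiguous pieces of a media file."""
    return "bytes=" + ",".join(f"{first}-{last}" for first, last in ranges)

# Two non-contiguous pieces of the resource in a single request;
# a compliant server replies 206 with a multipart/byteranges body.
header = multi_range_header([(0, 4999), (50000, 99999)])
print(header)
```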

<erik> rrsagent, draft minutes

<davy> scribenick: davy

Media Linking UC

raphael discusses the description written by Michael on the wiki

scribe: 3 things: bookmarking, playlists, and interlinking multimedia

silvia: definition of playlists is out of scope

guillaume: playlist is about presentation

raphael: regarding interlinked: temporal URIs can be described in RDF (RDF doc describing an audio file)
... difference between URI and RDF (or SMIL, or ...): you need to parse the metadata
... RDF description of time segment could be replaced by a temporal URI

silvia: interlinking multimedia is already covered in other UCs

Video Browser UC

silvia: large media files introduce special challenges
... requirement for server-side processing
... dynamic creation of thumbnails through URI mechanism

guillaume: link to PNG or GIF
... provide a preview function of the resource
... trivial: get all the I-frames of a video resource
... use them as thumbs
... thumbnail extraction is quite easy

silvia, jack: not so trivial, might be processing-intensive

silvia: it should be possible to point to one single frame with the URI scheme

jack: URI scheme should not know that frame is 'the' thumbnail

guillaume: you can have multiple thumbs per resource

raphael: URI scheme can point to a frame, but does not have knowledge about thumbs
... should we be able to address in terms of frames?

guillaume: no, too coding-specific

silvia: previews of images?
... preview is then a lower resolution image

guillaume: that is processing
... mostly, previews are already part of the media resource
... hence lower image resolutions are out of scope

jack: not too far?
... is a preview embedded in a resource still a fragment?

guillaume: compare it with tracks
... preview is just another track

raphael: we keep this in mind and make a decision later

silvia: previews are another sort of tracks

raphael: should we also be able to address metadata within the headers?

silvia: it is not a common property of all the formats to have previews, therefore, it is not a candidate to be standardized

raphael: after the first phase of the WG: report the current limitations
... and wait for feedback

Moving Point Of Interest UC

raphael: complex UC
... should be for the second phase

jack: is this ever going to be used at server-side?
... if not, it is out of scope

raphael: you can share the link of the moving region

erik: delivery to mobile devices is a use case introduced by the public flemish broadcaster

jack: there is no reason to use URIs for that purpose, use metadata

raphael: it is like concatenating spatial fragments over time

guillaume: we are addressing points over space or time

raphael: refer to HTML image maps
... region, interval can be defined by a combination of points
... you need more than one point

Issues

raphael: we will discuss this tomorrow

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.133 (CVS log)
$Date: 2008/10/26 10:50:37 $