Media Fragments Working Group Teleconference -- 17 Sep 2009

<trackbot> Date: 17 September 2009

<scribe> Meeting: Media Fragments Working Group 4th F2F Meeting (Virtual)

1. Specification discussion

Erik: let's first discuss the aspect ratio issue

<scribe> Scribe: raphael

<scribe> Scribenick: raphael

Davy: in my opinion, the aspect ratio is just another representation of the video; it is not a part of the video

Yves: I barely agree with the fact that aspect is a different thing, but from the processing point of view, this is also something that requires transcoding
... I'm happy to remove this use case if people are not comfortable with it

Silvia: responding to Yves: it is the server that does the clipping, but I think that in the case of the aspect ratio, the server should do nothing ... it is up to the client to add the black bars
... so I see no reason for a use case, this is a presentation issue and not a fragment issue

<franck> hi all, will try to make the call (room issue!)

PROPOSED RESOLUTION: take the aspect ratio feature out of the spec and out of our requirements

<davy> +1

<erik> +1

<Gui> +1

<jackjansen> +1

<conrad> +1

<mhausenblas> +1 (and explain why in the doc)

<silvia> aspect ratio changes between what the server can provide and what the client wants to present are a presentation issue; one could either clip the video or add black bars; this should be up to the client to decide, not the server

<silvia> +1

RESOLUTION: we agree that aspect ratio is not a fragment and will not be something that we can address with a Media Fragment URI

<scribe> ACTION: Erik and Davy to write a paragraph in the documents to explain why we don't include this feature in the spec (rationale) based on the group analysis [recorded in http://www.w3.org/2009/09/17-mediafrag-minutes.html#action01]

<trackbot> Created ACTION-109 - And Davy to write a paragraph in the documents to explain why we don't include this feature in the spec (rationale) based on the group analysis [on Erik Mannens - due 2009-09-24].

Now, let's discuss the role of the ? vs #

Silvia summary: http://blog.gingertech.net/2009/09/08/uri-fragments-vs-uri-queries-for-media-fragment-addressing/

Silvia: I looked at what Yves suggested: query and fragment differ depending on whether transcoding is needed or not
... a fragment, as it is currently defined, is something that needs to be resolved locally by the UA
... any comments ?

Michael: if we have transcoding, then URI queries should be used?

<Gui> I agree on differentiation # for client nav ? for server transcoding

Silvia: yes to Michael's question

Michael: my only concern is the extra complexity introduced for the implementors

Silvia: we are looking at various dimensions; we are pretty sure that for the temporal dimension, there will often be no transcoding required for most formats
... so no problem
... the problem will happen for the spatial dimension
... where we are not sure yet when transcoding will be required: always?
... I think therefore it is necessary to have a solution for both cases: when we need transcoding and when we don't

Conrad: with /query/ we can always go back to the server, with fragments the UA has to do something

Yves: Conrad, yes and no: yes if it receives the whole thing back, no if the server just sends what's needed

Silvia: look at the example I posted today: every time you click, it refreshes the page, which is very painful ...
... this is what we want to change

<Gui> saw that. The clickthrough in the youtube example uses 2 separate videos that interlink with each other

No Gui, see that http://lists.w3.org/Archives/Public/public-media-fragment/2009Sep/0087.html

<mhausenblas> Michael: I follow silvia's argumentation, though, for the sake of a simple standard I'd opt for # only

<Gui> Raphael, yes, that's the one I'm referring to. According to what I read, there were two similar looking videos involved to provide the linking effect for time offsetting.

Raphael: my concern is then what will happen with the spatial dimension ... since it requires transcoding most of the time

Yves: the whole document will be served
... since the server cannot satisfy the range request

Silvia: I agree

<davy> what about JPEG2000?

Raphael: I have the feeling then that we are specifying a feature, #xywh=100,100,400,400, that will never be satisfiable!

Davy argues that JPEG2000 might do it?

Silvia: in the future, some codecs can do it ...

<Yves> normal jpg can do this at block level as well, no?

Silvia: I think this is not a new issue that comes up, we have discussed that a long time ago ...

Raphael: we will need to document that, and particularly add many test cases, when the server needs to transcode to satisfy the range request

<silvia> I think what we have to do in our standard is to provide means for any kind of resource to allow creation of a media fragment URI that can request access to a fragment; some will be able to satisfy it from the local resource, others only with transcoding; thus we need to specify our addressing scheme for both possibilities: ? and #

Yves: this is not an out of range case
... this is something that is forbidden

<davy> +1 to Silvia

Yves: 416 is only used when it is possible to do a range request, but you have an out-of-bounds case

<Gui> I agree with Silvia : "? AND #"

Yves: here, we have something that cannot be applied
... in the case of transcoding, the server must then serve the whole content with a 200

<conrad> I also agree that we need to define both ? and #

Yves: so the server needs to identify whether transcoding is necessary or not, and then rely on the default HTTP rules
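
The rule Yves describes (200 when the range mechanism cannot be applied, 416 only when a range request is possible but out of bounds) can be sketched roughly as below. This is only an illustration: the function name `choose_status`, the `can_cut_without_transcoding` flag and the use of plain seconds are assumptions made for the example, not something agreed in the meeting.

```python
def choose_status(can_cut_without_transcoding: bool,
                  start: float,
                  duration: float) -> int:
    """Pick an HTTP status for a temporal range request (hypothetical helper)."""
    if not can_cut_without_transcoding:
        # The range cannot be applied at all: rely on the default HTTP
        # rules and serve the whole resource.
        return 200
    if start >= duration:
        # A range request is possible here, but this one is out of bounds.
        return 416
    # The server can satisfy the range from the stored resource.
    return 206
```

For instance, `choose_status(False, 10, 3600)` gives 200, while `choose_status(True, 4000, 3600)` gives 416.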

<erik> +1 to Silvia too

<jackjansen> +1 to Silvia

<Gui> ? and # 1. Gives more control to the user 2. Makes our syntax specification usable by current server side implementations

+1 to silvia

Michael: I hesitate to make a big +1, because of the complexity
... but I can live with it ... I would prefer to care about # now, and work on ? later on

Davy: from an implementation point of view, I think the server does not have a lot of extra work
... the query thing comes almost for free when you implement the hash

Michael: I'm thinking on both the UA and the server sides
... in principle, the more options you have, the more test cases and things to think about
... but I understand the need and the arguments from the others

Silvia: I think you're totally right, we don't want to add complexity we don't need
... focus is on the hash, I see the ? as an optimization ... the communication aspect is already here anyway
... for query, we need to specify almost nothing, this is already handled
... in fact, we only specify the communication for the #, so to some extent you're right
... we are saying that the URI syntax can also be used for ?

Michael: are we requiring that both are normative? Is this a MUST or a SHOULD?

Silvia: the specification of a URL does not say anything about the implementation
... we open the possibility to create URLs in a standard format
... correct me if I'm wrong, but I think we specify just the syntax of the URL and the way we should parse it
... the hash resolution part must be normative

<conrad> an implementation that claims to support # must do it this way

<conrad> an implementation that claims to support ? must do it that way

<Gui> ? and # - 3. Also specifies whether the user wants data in or out of context ?

Silvia: for the query, we could suggest to use the same way

Michael: if this is not normative, does that have an impact on interoperability?
... are we after a MUST or a SHOULD?

<conrad> i suggest that an implementation must state what it claims to support (eg. through http headers, uri parameters or whatever) -- and if it makes that claim, it must do it as described

<silvia> +1

Michael: in the case of ?, I think the MUST is a strong constraint

<conrad> whereas if it doesn't make that claim (eg. all existing urls) then this standard does not apply

Raphael: I'm well aware of the terminology

<conrad> we specify a MUST on the method of advertising that this standard applies to this URL

<silvia> as conrad says: if I claim conformance, I must follow the protocol - otherwise the communication cannot be resolved

<conrad> we need to specify how conformance is claimed

Michael: am I the only one who doesn't really get what will be specified normatively in the case of the ?

Silvia: are you talking about the communication between the UA and the server ?

<FD> when using query '?' you may have to specify the communication to get context info (Link header for example)

Raphael: oops, you're right Frank, thanks for the heads up :-)

Frank: I just wanted to recall the point of using the Link header in the response of the server ... so the UA gets the context of the parent resource

<mhausenblas> Frank, can you please be more specific? Are you referring to LRDD?

Silvia: are you suggesting we write a MAY use instead of a MUST use?

<Gui> Does the difference between ? and # introduce a difference between secondary resource and a derived resource?

Frank: I have no precise idea of how the link header semantics should be used

<mhausenblas> I agree with Frank. Context should be done via LRDD (http://tools.ietf.org/html/draft-hammer-discovery-03)

Frank: but the communication between the server and the UA must be altered because of the addition of the Link header

Yves: for me, the link header should always be used in the query case
... we could mandate that
... I don't think there is a current property / value for that already, we might look for it
... invent a part-of ?

<conrad> i think the link header is useful, but should not be MUST

Yves: perhaps there is something already

Michael: LRDD specifies the semantics for relating a resource and its description via describedBy for three cases (link element, Link: header, and well-known location)

Yves: we may have a separate RFC for that ...

Michael: problem of timeline: will it be ready on time?

<silvia> I agree with conrad - if the resource has all the information about the offsetting etc inside it, it doesn't need to be accompanied by parent information

Yves: I don't know exactly when the RFC will be ready ... but I believe the time frame is correct

<Yves> http://tools.ietf.org/id/draft-nottingham-http-link-header-06.txt

<Yves> http://www.mnot.net/drafts/draft-nottingham-http-link-header-07.txt

Conrad: I think the header is useful, but why mandate sending the whole context?

<silvia> +1 with conrad

<FD> Agree with Conrad, required mainly for display/clipping

Raphael: time to summarize ... who wants to give it a try?

[silence]

<conrad> isn't the summary that both are useful in different situations?

[dead silence]

<silvia> resolution draft: we agree that there is a need for allowing both a ? and a # specification for media fragments

<silvia> we further agree that our main focus is #, but that the communication that we define between client and server will be adapted also to the ? case

Michael: I still have a question
... what happens when there are both? there is a hierarchy, query first and then fragment

<silvia> such that for media fragment URIs that cannot be resolved with # because it needs transcoding, ? can be used

Michael: what happens with: ?t=10,30#15

<conrad> the ? defines a URL, the # is a relative offset

Silvia: I think it is obvious what happens ...
... the query generates a new resource, and the fragment is a new relative offset to this new resource
... in raphael's case, it will start at 25s

<Gui> +1 that makes sense

<silvia> since a URI with a ? part creates a new resource, we have to do the fragment offset on the new resource, which in this case means it will start at 25s

<mhausenblas> +1 to the proposal. I'm fine with silvia's explanation
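
Silvia's reading of ?t=10,30#15 (the query creates a new resource, and the fragment is an offset relative to it) can be illustrated with a small sketch. It assumes plain seconds only, accepts both the shorthand "#15" used in the discussion and a "#t=15" form, and uses a made-up example.com URI; the function name `effective_start` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def effective_start(uri: str) -> float:
    """Start time, in seconds of the ORIGINAL media, for a URI that may
    carry both a ?t= query and a # fragment (illustrative only)."""
    parts = urlsplit(uri)

    # The query part creates a new resource starting at its first value.
    query_t = parse_qs(parts.query).get("t", ["0"])[0]
    query_start = float(query_t.split(",")[0] or 0)

    # The fragment is an offset relative to that new resource.
    frag = parts.fragment
    if frag.startswith("t="):
        frag = frag[2:]
    frag_start = float(frag.split(",")[0] or 0) if frag else 0.0

    return query_start + frag_start
```

With `http://example.com/video.ogv?t=10,30#15` the sketch returns 25.0, matching the 25s given in the discussion.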

Raphael: our starting point is http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#fragment-query

Silvia: does someone have any issue in my blog post?
... if not, then we can give an action to someone to draft a good explanation based on my post and this discussion on irc

Michael: wondering if there is a typo in silvia's example at http://blog.gingertech.net/2009/09/08/uri-fragments-vs-uri-queries-for-media-fragment-addressing/
... Range: seconds=20- and then Content-Range: seconds 11.85-21.16/3600
... shouldn't this be seconds=12- ...?

Silvia: no, the server works on a best-effort basis
... I am trying to explain that content may not be resolvable to the requested resolution, depending on the codec
... this is an example of what a server might only be able to do ...
... speaking as the server: you ask for a time range, but I can only serve you that, and then the UA needs to throw away the extra
... no way of doing that differently, since the UA will not be able to decode it otherwise
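
The client-side trimming Silvia describes can be sketched as follows, using the header values quoted above. The helper name `seconds_to_discard` is hypothetical, and the regular expression only covers the simple "seconds start-end/total" form shown in the example.

```python
import re

# Matches e.g. "seconds 11.85-21.16/3600" (the form quoted in the discussion).
_SECONDS_RANGE = re.compile(
    r"seconds (\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)"
)

def seconds_to_discard(requested_start: float, content_range: str) -> float:
    """How much leading media the UA has to throw away when the server
    could only start serving earlier than requested (illustrative only)."""
    match = _SECONDS_RANGE.fullmatch(content_range.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range value: {content_range!r}")
    served_start = float(match.group(1))
    # Never negative: if the server started at or after the request,
    # there is nothing to discard.
    return max(0.0, requested_start - served_start)
```

For a request starting at 20 seconds and a response of `seconds 11.85-21.16/3600`, the UA would decode from 11.85 and discard roughly the first 8.15 seconds.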

<conrad> the client requests t=20, the previous keyframe is t=12 so the server sends from there

Raphael: this 8s is odd :-) there are more I-Frames in the middle :-)

Michael: I'm just suggesting there is a typo

<conrad> ignore my keyframe explanation here :)

<davy> check apple movie trailers: very few I Frames ...

Silvia: the remainder of the post talks about the headers, but this is for the next half of the meeting

<erik> one thing at a time

Michael: my guess is that Silvia is the best volunteer to draft the summary

<silvia> ok :)

<conrad> zakim unmute me

<scribe> ACTION: Silvia to draft a summary starting from her blog post and these IRC minutes in the document [recorded in http://www.w3.org/2009/09/17-mediafrag-minutes.html#action02]

<trackbot> Created ACTION-110 - Draft a summary starting from her blog post and these IRC minutes in the document [on Silvia Pfeiffer - due 2009-09-24].

2. Protocol discussion

<mhausenblas> ACTION-69?

<trackbot> ACTION-69 -- Conrad Parker to draw a representation of the general structure of a media resource, for streamable formats (H/H' + K + D1 + D2 + D3) -- due 2009-04-24 -- OPEN

<trackbot> http://www.w3.org/2008/WebVideo/Fragments/tracker/actions/69

Conrad: I wanted to describe how ogg files are structured
... and if one is a sub part of another, then which parts changed or not

<conrad> if you have an original file H D1 D2 D3

Raphael: is this drawing available somewhere ?

<conrad> and you make a subview that goes H' D2 D3

<mhausenblas> raphael - no, it is not available, hence my comment

<mhausenblas> Michael: I think conrad needs to draw it (even just with pencil and scan it in - and we postpone it to tomorrow ...)

<conrad> with keyframes, you might end up with something like H' D2' D3

<silvia> I'd also like to point out that different containers/codecs work differently and have different challenges

Conrad: I will draw it tonight, postpone the visualization to tomorrow morning

Raphael: two more things to discuss, headers and range syntax
... should we start with the Range syntax ?

[silence]

Yves's proposal: http://lists.w3.org/Archives/Public/public-media-fragment/2009Sep/0035.html

<silvia> let's not mix formats

Yves made 2 suggestions

scribe: 1: unit and then values
... 2: unit can be mixed

Michael: second option seems more complex

Raphael: we don't need to mix units, anyway, the URI syntax does not permit it

Yves's proposal just concerns the time dimension ... more issues with other dimensions

Frank: what will be the duration of the track dimension?

<mhausenblas> Michael: track and ID do not have dimensions

Michael: track is identified by a name, full stop

Raphael: is track and id a Range request?
... if yes, then what is the Content Range ?
... if this is: Content-Range: track 'video' / what is behind the '/' ?

<silvia> you could talk about the number of labels

<conrad> track does not belong in Range

<erik> rssagent, draft minutes

Silvia: yes, good point from Conrad, why does he think track and id are not range request?

Conrad: I think a track is not something one can see as a range

<Gui> Track is not a time range; at best, Track is a byte range which corresponds to the piece of the media holding only the track; at worst, it's muxed and interleaved and has no range

Conrad: i admit, tricky issue

<Zakim> mhausenblas, you wanted to talk about orthogonal addressing concept continuous (time/spatial) and discrete (track/ID)

<conrad> range should be for continuous addressing

Raphael: Guillaume, we are not talking about time range ... but range request, expressed in bytes or other units

Michael: Orthogonal addressing concepts: time / spatial (continuous) vs track / id (discrete)

Raphael: I would say "id" is even different, since this is a combination of the others
... I would put id aside

<Gui> raphael: I was answering to michael mentioning that track COULD be a time range, and I think it just can't

<conrad> you can't define a distance metric over track ;-)

<silvia> so do we need different mechanisms to resolve id and track?

<conrad> so this is why i was suggesting a Fragment header

<Gui> Michael & Raphael : Ok

<silvia> maybe track can only ever be used with ?

<conrad> http://www.w3.org/2008/WebVideo/Fragments/wiki/Server-parsed_Fragments

<conrad> http://www.w3.org/2008/WebVideo/Fragments/wiki/HTTP_Examples#Track.2BTime_Fragment_URI

<Gui> There is a case where what's behind the track video / 1 could be an index in case many audio or video tracks

Silvia: when we started to talk about fragments, we were talking about a continuous set of bytes
... for temporal, it was a reasonable assumption
... for spatial, it starts to be a problem in most coding formats
... for track, as Conrad said, it is difficult
... is it a case that can be resolved only with transcoding?

<davy> if an adaptation can be expressed in terms of byte ranges, it is not transcoding

Silvia: I'm not sure about that either ... I'm very uncertain so far, I need to make up my mind

Michael: I'm afraid we are introducing something too complex with the ID concept; track might be sufficient

Raphael: accessibility is the main use case for tracks and it is very important

<conrad> a track request without transcoding may result in thousands of byte ranges for concatenation

Silvia: our focus is on time ... this needs more thought

<Gui> Gui agrees with Conrad

<davy> what's wrong with that if the server joins these byte ranges?

Raphael: expectation is to have a 2nd WD ready by the end of the month to be published

Silvia: would suggest to focus on temporal domain for this next version of the WD, such that people can start using it - the browser vendors and HTML5 are keen to get into it

Davy: I don't think it is an issue for the server to join the byte ranges and serve the joined part

<conrad> davy, yes, that case is not an issue
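
The server-side join Davy describes can be pictured with a small sketch. It assumes a hypothetical packet index of (byte offset, length, track name) tuples for an interleaved file; real containers expose this information differently, and the function name `track_byte_ranges` is made up for the example.

```python
from typing import List, Tuple

# (byte offset, length in bytes, track name): a hypothetical packet index.
Packet = Tuple[int, int, str]

def track_byte_ranges(packets: List[Packet], track: str) -> List[Tuple[int, int]]:
    """Return merged (first_byte, last_byte) ranges that hold `track`."""
    ranges: List[Tuple[int, int]] = []
    for offset, length, name in sorted(packets):
        if name != track or length <= 0:
            continue
        first, last = offset, offset + length - 1
        if ranges and ranges[-1][1] + 1 == first:
            # Directly adjacent to the previous range: extend it.
            ranges[-1] = (ranges[-1][0], last)
        else:
            ranges.append((first, last))
    return ranges
```

In a heavily interleaved file almost no packets of one track are adjacent, so the list stays long, which is Conrad's point about thousands of byte ranges; the server can still concatenate them before serving.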

<conrad> for this resolution we would need to specify an exact set of range names

Raphael: time for resolution
... I see 2 proposals on the table
... Proposal 1: Content-Range <timeformat> ' ' <real start time> '-' <real end time> '/' <total duration>
... actually, <timeformat> is <unit>
... Proposal 2: Content-Range <dimension> ':' <unit> ' ' <real start time> '-' <real end time> '/' <total duration>

<silvia> not quite - this is correct for proposal 2: Content-Range: time:<timeformat> ' ' <real start time> '-' <real end time> '/' <total duration>

Raphael: in the second proposal, <dimension> will be 'time', 'xywh', etc.

<silvia> yeah, but <total duration> may change depending on the <unit>

Raphael: what is the added value of having the dimension ?
... smpte unit means we are in the time dimension, no confusion possible

<conrad> the value of specifying dimension is to simplify the standardization

Silvia: it is more readable, and the total duration can be unit dependent and NOT unit dependent

<davy> +1 to Silvia

<conrad> ie. "the advantage of being more flexible, but less robust to the introduction of new units"

Erik: the proposal 2 here is NOT the proposal 2 of Yves
... Proposal 2 is an amendment by Silvia to Proposal 1 from Yves

<silvia> advantages as I see them: (1) can use default unit per dimension with only dimension (2) can be more easily extended with new units since <total duration> won't change (3) is more like url specification

<conrad> if the duration includes a frameno it is timeformat dependent

correct conrad

Silvia: <total duration> is not correct
... we need to be more generic

<FD> Rename <total duration> into <total-dimension> ?

other suggestions?

<silvia> Proposal 2: Content-Range <dimension> [':' <unit>] ' ' <real start time> '-' <real end time> '/' <total dimension>

<erik> +1 to Silvia's "Proposal 2" here

PROPOSED RESOLUTION: Content-Range <dimension> [':' <unit>] ' ' <real start time> '-' <real end time> '/' <total dimension>

<conrad> +1

<davy> +1

<silvia> +1

<mhausenblas> +1

<erik> +1

<Gui> looking

<Gui> +1

RESOLUTION: Content-Range <dimension> [':' <unit>] ' ' <real start time> '-' <real end time> '/' <total dimension> is the syntax to be used for a Range Request for the temporal dimension

<silvia> yay

<Gui> great!

<erik> :)

<FD> not only temporal!

<silvia> well, we have to see if it works for all dimensions

<silvia> right now we're sure it works for time
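
As a rough illustration of the resolved syntax, the header value could be parsed as sketched below. The values are kept as strings because their format depends on the unit; the class and function names are hypothetical, and the unit name "npt" in the usage note is an assumption, since the resolution does not list the allowed units.

```python
import re
from typing import NamedTuple, Optional

class FragmentContentRange(NamedTuple):
    dimension: str
    unit: Optional[str]   # None when the default unit of the dimension is used
    start: str
    end: str
    total: str

# <dimension> [':' <unit>] ' ' <start> '-' <end> '/' <total>
_PATTERN = re.compile(
    r"(?P<dimension>[A-Za-z]+)(?::(?P<unit>[A-Za-z0-9-]+))?"
    r" (?P<start>[^-/\s]+)-(?P<end>[^-/\s]+)/(?P<total>\S+)"
)

def parse_content_range(value: str) -> FragmentContentRange:
    """Parse the value of a Content-Range header in the resolved syntax."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not in the resolved syntax: {value!r}")
    return FragmentContentRange(
        match.group("dimension"), match.group("unit"),
        match.group("start"), match.group("end"), match.group("total"),
    )
```

Under these assumptions, `parse_content_range("time:npt 11.85-21.16/3600")` yields the dimension "time" with unit "npt", while `parse_content_range("time 10-30/3600")` leaves the unit unset so that a per-dimension default can apply.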

3. AOB

<FD> I think <real start time> and <real end time> should also be generalized

<silvia> FD, so right!

<silvia> bye

Thanks all for the engagement

<FD> quit

<davy> bye

<erik> rssagent, draft minutes

scribe: I wish we have the same productivity tomorrow morning
