Re: Processing requirements from Silvia Pfeiffer on 2009-12-30 (public-media-fragment@w3.org from December 2009)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 30 Dec 2009 14:33:36 +1100
To: Philip Jägenstedt <philipj@opera.com>
Cc: Jack Jansen <Jack.Jansen@cwi.nl>, Media Fragment <public-media-fragment@w3.org>
Message-ID: <2c0e02830912291933p185c237aq1679e5958f733f9d@mail.gmail.com>

On Wed, Dec 30, 2009 at 3:20 AM, Philip Jägenstedt <philipj@opera.com> wrote:
> On Tue, 29 Dec 2009 15:03:50 +0100, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com> wrote:
>>
>>
>> Now, I'd say that we're probably safe using "&" as a separator for URI
>> queries, since that has been specified in the CGI "standard" and has
>> continuously been applied, even if never formally specified. It is a
>> de-facto standard.
>
> I agree that it's safe, but we must formally specify it, either by
> referencing an existing spec (which I have failed to find) or by specifying
> it ourselves.

A proper spec doesn't exist. All we have is the CGI spec. It's been my
greatest problem with the temporal URI spec for years from a
"completeness" point of view, but it has never actually been a practical
problem, since people have just assumed the de-facto standard.


>> As for URI fragments, the idea is to keep it in sync with URI queries
>> and thus we also used the "&".
>
> I certainly agree with keeping them in sync, but the fragment component
> syntax is the one we can specify ourselves and it will work on many existing
> server configurations as a bonus.

Actually: no, we cannot define the fragment component syntax for any
video or audio mime type. In fact, the URI specification says that the
fragment syntax is specified by the owner of the mime type - i.e. the
owner of video/ogg or video/mpeg4 (and audio) in the HTML5 case. All
that we can realistically do is provide a recommendation for mime type
owners to adopt our specification. We cannot really make an
enforceable standard. OTOH, people have been waiting for such a spec, so
they will gladly adopt it rather than create their own.


>> Now, both approaches (URI fragment and query) may conflict with some
>> already created specifications (as analysed and listed in
>>
>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/#ExistingSchemes).
>> This is unavoidable when standardising the use of something that has
>> been in the wild so far.
>>
>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#processing-overview-standardisation
>> talks about this problem and makes clear that harmonisation is
>> necessary and that it is not possible to "prescribe" this format.
>> Which probably means that media fragments will always be a
>> recommendation rather than a standard.
>
> Yes, we will conflict with e.g. your Temporal URI spec and MPEG-21, which is
> to be expected as MF is supposed to supersede both.

Well, I'm not actually sure MPEG-21 will adopt it. But the thing is:
even if the mime type owners don't accept it, what actually counts is
what the browser vendors implement. :-)


> However, existing query component schemes aren't really specs as such, they
> are actually defined by their (usually single) implementation. However, if
> we agree that MF should only normatively define the syntax and processing
> rules for URI *fragments*, then we don't need to discuss the query component
> issue any further.

Some past discussions have found that we need to do both. The URI
queries approach has its use cases where you want to create a shorter
form from a longer resource - e.g. a playlist mashed up from segments
from multiple videos. We have embraced such use cases in the
requirements specification and they would require the use of URI
queries.

To be complete, it is also possible to not use URI queries, but to use
some kind of REST interface, as you have mentioned before, e.g.
http://www.example.com/video/track=video1/track=audio2/t=20,80 . But
this resource has nothing at all to do with the original resource,
which may be http://www.example.com/video/, so caching is impossible.
Using URI queries at least provides a means to enable caching and to
continue having the link back to the original resource.
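
A rough sketch of that last point, in Python purely for illustration.
The query-form URL and the helper name strip_query are assumptions made
up for this example, not anything from the spec: dropping the query
recovers the original resource URI, whereas the path-style URI carries
no such link.

```python
from urllib.parse import urlsplit

def strip_query(uri):
    """Return the URI with its query and fragment components removed."""
    parts = urlsplit(uri)
    return parts._replace(query="", fragment="").geturl()

original = "http://www.example.com/video/"
# Hypothetical query form of the same request.
query_form = "http://www.example.com/video/?track=video1&t=20,80"
# Path-style form mentioned above.
path_form = "http://www.example.com/video/track=video1/track=audio2/t=20,80"

# The query form maps back to the original resource ...
print(strip_query(query_form) == original)
# ... while the path form stays a different, unrelated URI.
print(strip_query(path_form) == original)
```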

OTOH there are a lot of issues to deal with when using queries. We can
only address a small part of the URI query possibilities in the MF
spec, namely the one that overlaps with the spec we're creating for
URI fragments. That has been the basis of our decisions so far.

Why do you think URI queries are so much more of a problem? I wasn't
able to read that out of the irc discussion either. Standardisation of
how to create URI queries is useful, since then there are compatible
naming conventions across servers and clients and applications can
rely on things working the way they expect. From an HTML5 POV, URI
queries don't matter, since they don't concern the browser. But when
specifying URIs, one has to think far beyond just the browser, IMHO.


>> We could do one thing though: maybe we should add the link to the CGI
>> specification to the spec to explain where the formatting comes from.
>
> The CGI documentation only provides a rough description and isn't suitable
> for a normative reference. For example, it says "you should URL decode the
> name" but not how to do that. It is quite important to know how to interpret
> #t=npt%3a10s (%3A is ':', but is %3a also tolerated?) and #id=100% ('%'
> should be encoded as %25, but what to do with a stray %?).
>
> Specifying this is very simple:
>
> 1. split the string on &
> 2. split the resulting string on the first occurrence of '=' and let name be
> the first part and value be the second part. if there is no = in the string
> let value be ''
> 3. decode name and value according to [some very fine spec we can reuse I
> hope]
>
> Simple but necessary as the spec can't make any normative requirements at
> all about fragment dimensions if it doesn't define how to get from a
> fragment component to a list of fragment dimensions.

Agreed, that is somewhat implicit in the specification right now.
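
To make the three steps quoted above concrete, here is a minimal
sketch in Python. It is only one possible reading, not a proposal for
the spec text: the function name parse_dimensions is made up, skipping
empty segments is an assumption, and Python's urllib.parse.unquote is
merely standing in for the "very fine spec we can reuse" for decoding.

```python
from urllib.parse import unquote

def parse_dimensions(component):
    """Turn a fragment or query component into a list of (name, value) pairs."""
    pairs = []
    # Step 1: split the string on '&'.
    for part in component.split("&"):
        if not part:
            continue  # assumption: empty segments (e.g. "a=1&&b=2") are ignored
        # Step 2: split on the first '='; value is '' if there is no '='.
        name, _, value = part.partition("=")
        # Step 3: percent-decode name and value (stand-in decoder, see above).
        pairs.append((unquote(name), unquote(value)))
    return pairs

print(parse_dimensions("t=npt%3a10s&id=100%"))
```

With this particular decoder, lower-case %3a is accepted the same as
%3A, and a stray '%' is passed through unchanged. Whether MF should
behave that way is exactly the kind of thing that needs to be written
down.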


>> Philip, note that the specification only defines a syntax for the URI
>> fragment case, but leaves out the URI query case and just alludes to
>> the fact that it is done in the same way. I think that is already what
>> you are suggesting, no?
>
> The spec treats the query and fragment component equally as far as I can
> see, so any normative requirements on URI fragments are also being made on
> URI queries. For example:
>
> "The syntax is based on the specification of particular field-value pairs
> that can be used in URI fragment and URI query requests to restrict a media
> resource to a certain fragment."
>
> "There are therefore two possibilities for representing the media fragment
> addressing in URIs: the URI query part or the URI fragment part."
>
> "The composition of a URI fragment or query string for a media resource
> relies on a series of field-value pairs to be added behind the URI fragment
> ('#') or query ('?') identifier."
>
> "In this section we present the ABNF syntax for the field-value pairs that
> relate to a media fragment URI. The names for the non-terminals more-or-less
> follow the names used in the previous subsections, with one clear
> difference: the start symbol is called mediasegment, because we want to
> allow application of it to both URI fragment and URI query strings."

Yes, I think you're right. It does apply to both URI fragment and URI
query. But that was intentional, as discussed above.


> If the intention is that the ABNF syntax be normative only for URI
> fragments, this should be clarified by removing the 'segment' ABNF and
> instead require that mediasegment be a valid production of the ifragment
> syntax from the IRI spec. This might have implications for the use of '+' in
> datetime, I haven't checked.

I do wonder about this last detail. Might be worth checking.


> There are several places in the spec that talk about Media Fragments, URI
> fragments and URI queries as if URI fragments and URI queries are a subset
> of Media Fragments rather than Media Fragments being a subset of URI
> fragments. I'm quite confused by this terminology, could someone clarify? I
> would like to see Media Fragment added to the terminology section.

So far, what we have specified is the following (see
http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#terminology):
In this document, when the term 'media fragment URIs' is used, it
actually means 'media fragment URI references'.

This means that a media fragment URI is just generally a URI that
deals with a section of a media resource. It does not say how.

URI fragment and URI query quite plainly specify how to deal with the
media fragment URI: namely either through use of a URI fragment or a
URI query.

I thought we used these quite consistently and made sure they didn't
get mixed up. So, what, in your opinion, is missing?


> [pause]
>
> My primary concern is that the processing of fragment component is still
> undefined as it is my intention to support MF in Opera at some point. In the
> bad old days when a spec left something undefined one browser would just
> make something up and the others would reverse-engineer it, but I am still
> young and naive to think that things are different now. I am willing to edit
> the spec myself to show clearly what it is I'm suggesting.

I'm more than happy for you to make such changes - in particular to
separate out the structure of parameters in a URI fragment and URI
query from the actual specification of the name-value pairs in use. As
mentioned in the email to Jack, I do think it makes sense to separate
that into a section that specifies the foundations that we build upon.
If you want to go ahead and do that, I wouldn't have a problem. But I
don't speak for the others, so maybe wait until we get their input.
:-)


Cheers,
Silvia.
Received on Wednesday, 30 December 2009 03:34:29 UTC